
ottily

Ottily executes a JavaScript snippet on each line of an input file in parallel.

Installation

$ go get github.com/miku/ottily/cmd/ottily

Usage

$ ottily -h
Usage of ottily:
  -cpuprofile="": write cpu profile to file
  -e="": execute argument on each line of input
  -p="": include this js file before processing
  -s="": script to execute on each line of input
  -v=false: prints current program version
  -w=4: number of workers

Examples

No-op: without a script, ottily is just an expensive cat.

$ ottily datasets/simple.ldj
{"name": "ottily", "language": "Golang"}

Inline script with -e:

$ ottily -e 'output=input.length' datasets/simple.ldj
40

$ ottily -e 'o=JSON.parse(input); o["language"] = "Go"; output=JSON.stringify(o);' \
            datasets/simple.ldj

{"language":"Go","name":"ottily"}

Pass a script file with -s:

$ cat scripts/classified.js
output = "CLASSIFIED"

$ ottily -s scripts/classified.js datasets/simple.ldj
CLASSIFIED

Each line is passed to the JavaScript snippet as the string variable input. To produce output, set the variable output to the desired value. If output is null or undefined, nothing is printed for that line. That's it.
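Because a null output prints nothing, a script can also act as a line filter. The following is a minimal sketch rather than part of ottily itself: the function name keepGolang and the fallback sample line are hypothetical, and it is written in plain ES5 on the assumption that the snippet runs in otto. The fallback only exists so the file can also be run standalone; when run through ottily, input is supplied by the tool.

```javascript
// Hypothetical filter script, intended for use with: ottily -s <this file> <input file>
// Keeps only JSON documents whose "language" field equals "Golang".
function keepGolang(line) {
  var doc;
  try {
    doc = JSON.parse(line);
  } catch (e) {
    // Not valid JSON: drop the line instead of aborting.
    return null;
  }
  return doc.language === "Golang" ? JSON.stringify(doc) : null;
}

// ottily provides `input`; the sample below is a stand-in for standalone runs.
var line = (typeof input === "string")
  ? input
  : '{"name": "ottily", "language": "Golang"}';

// A null value means nothing is printed for this line.
var output = keepGolang(line);
```

The script assigns output on every line, either a string or null, so it does not depend on whether the variable is reset between lines.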

By default, ottily runs as many workers as there are CPU cores; use -w to set a different number.
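The -p flag listed under Usage has no example above. The sketch below shows one way it might be used, assuming that functions defined in the included file are visible to the per-line snippet; that behavior is inferred from the flag description, not confirmed here. The file name helpers.js and the function setField are hypothetical.

```javascript
// helpers.js (hypothetical file name), meant to be included with -p.
// Sets one key on a JSON document given as a string and returns the
// re-serialized document.
function setField(line, key, value) {
  var doc = JSON.parse(line);
  doc[key] = value;
  return JSON.stringify(doc);
}

// Assumed usage, if -p makes setField available to the -e snippet:
//   ottily -p helpers.js -e 'output = setField(input, "about", "ot")' datasets/simple.ldj
```

Keeping shared helpers in an included file keeps the inline -e snippet short, which helps once the per-line logic grows beyond a single expression.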

Performance

Ottily is just a 140 LOC Go program. In many cases, there will be faster alternatives.

Given a file with 1 million lines, calculate the length of each line.

$ time awk '{ print length }' datasets/1M.ldj  > /dev/null

real    0m5.836s
user    0m5.514s
sys     0m0.314s

$ time ottily -e 'output=input.length' datasets/1M.ldj > /dev/null

real    0m11.758s
user    0m38.204s
sys     0m4.713s

Given a file with 1 million lines, one JSON document per line, add a new key to each JSON document.

$ time jq -c '."about" = "jq"' datasets/1M.ldj > /dev/null

real    2m31.063s
user    2m29.517s
sys     0m1.500s

$ time ottily -e 'o=JSON.parse(input); o["about"] = "ot"; output=JSON.stringify(o);' \
              datasets/1M.ldj > /dev/null

real    5m55.630s
user    20m25.530s
sys     0m46.750s

The tests above were run on a 4-core machine.

Credits

Ottily uses otto to run JavaScript.
