Unix utilities that process ‘abstracted tables’, or streams of records. They follow the spirit of the classic Unix utilities (head, tail, cut, grep, sort) and work with multiple encodings (tab, csv, etc.).


go get
cd $GOPATH/src/


Record streams are identified using a URI structure. To convert a tab file in the current working directory to a csv file:

abcat -i tab:// -o csv://foo.csv

Tab-separated values is the core file format. By default, files are expected to have a header line describing their columns, though a header can instead be specified as part of the URL.


‘Memo’ fields and embedded delimiters:
Embedded newlines are represented as ‘\n’
Embedded tabs are represented as ‘\t’
Embedded CRs are represented as ‘\r’
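
The escaping rules above can be sketched in Go. This is a minimal illustration, not the project's actual code: the function names escapeField and unescapeField are hypothetical, and because the text does not say how a literal backslash is handled, the sketch only round-trips fields that contain no literal backslash sequences.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeField (hypothetical name) replaces embedded newlines, tabs and
// carriage returns with two-character escapes so that a field always
// fits on a single tab-delimited line.
func escapeField(s string) string {
	r := strings.NewReplacer("\n", `\n`, "\t", `\t`, "\r", `\r`)
	return r.Replace(s)
}

// unescapeField (hypothetical name) reverses escapeField.
// Assumption: the input contains no literal backslash that was
// present in the original field, since the format does not say
// how those would be encoded.
func unescapeField(s string) string {
	r := strings.NewReplacer(`\n`, "\n", `\t`, "\t", `\r`, "\r")
	return r.Replace(s)
}

func main() {
	memo := "line one\nline two\twith a tab"
	enc := escapeField(memo)
	fmt.Println(enc)
	fmt.Println(unescapeField(enc) == memo)
}
```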


‘cat’ a tab delimited file to the terminal:

  abcat -i

All utilities support using a URL to specify the encoding:

  abcat -i tab://

Convert from CSV to TAB:

  abcat -i csv://foo.csv -o tab://

View a ‘portrait’ mode of the records in a file:

  abview -i csv://foo.csv

Cut columns out of a file by name:

  # Id Created_At Updated_At Email_Address
  abcut -i csv://foo.csv -f Id,First_Name,Last_Name
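
Cutting by name amounts to mapping column names to positions in the header and then picking those positions from every record. The Go sketch below shows that idea under stated assumptions: cutColumns is a hypothetical helper, the sample rows are made up, and the behaviour on an unknown column name (an error) is a guess rather than something abcut is documented to do.

```go
package main

import (
	"fmt"
	"strings"
)

// cutColumns (hypothetical helper) keeps only the named columns, in the
// order requested. The first row of the result is the new header.
func cutColumns(header []string, rows [][]string, names []string) ([][]string, error) {
	index := make(map[string]int, len(header))
	for i, h := range header {
		index[h] = i
	}
	positions := make([]int, 0, len(names))
	for _, n := range names {
		p, ok := index[n]
		if !ok {
			// Assumption: an unknown column is treated as an error.
			return nil, fmt.Errorf("unknown column %q", n)
		}
		positions = append(positions, p)
	}
	out := make([][]string, 0, len(rows)+1)
	out = append(out, names)
	for _, row := range rows {
		picked := make([]string, len(positions))
		for i, p := range positions {
			if p < len(row) { // short rows yield empty fields
				picked[i] = row[p]
			}
		}
		out = append(out, picked)
	}
	return out, nil
}

func main() {
	// Made-up sample data for illustration only.
	header := []string{"Id", "First_Name", "Last_Name", "Email_Address"}
	rows := [][]string{
		{"1", "Bob", "Smith", "bob@example.com"},
		{"2", "Alice", "Jones", "alice@example.com"},
	}
	out, err := cutColumns(header, rows, []string{"Id", "First_Name", "Last_Name"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range out {
		fmt.Println(strings.Join(r, "\t"))
	}
}
```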

Grep for records that match an expression:

  abgrep -i csv://foo.csv -e 'First_Name == "Bob"'
  abgrep -i csv://foo.csv -e '__LINE__ >= 300 && __LINE__ < 400'

Take a random sample by flipping a coin for each record:

  abgrep -e 'RandFloat()>0.5' -i tab://

The expression is evaluated using a restricted subset of the Go language. A predefined variable is available for each field in the row, as well as __LINE__, which contains the current record number.

Take a sample: records 300 to 399:

  abgrep -i csv://foo.csv -e '__LINE__ >= 300 && __LINE__ < 400'

Take a sample: Emit every Nth record:

  abgrep -e '(__LINE__%2)==0' -i tab://
  abgrep -e '(__LINE__%5)==0' -i tab://
  abgrep -e '(__LINE__%10)==0' -i tab://
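
To make the effect of the __LINE__ expressions concrete, the Go sketch below applies the same predicates to a range of record numbers and reports which ones pass. It is only an illustration: keep is a hypothetical helper, and since the text does not state whether __LINE__ starts at 0 or 1, the sketch takes the first and last record numbers as explicit arguments.

```go
package main

import "fmt"

// keep (hypothetical helper) returns the record numbers in [first, last]
// for which pred is true, mirroring how a filter expression selects records.
func keep(first, last int, pred func(line int) bool) []int {
	var out []int
	for line := first; line <= last; line++ {
		if pred(line) {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Same shape as: __LINE__ >= 300 && __LINE__ < 400
	window := keep(1, 1000, func(line int) bool {
		return line >= 300 && line < 400
	})
	fmt.Println(len(window), window[0], window[len(window)-1])

	// Same shape as: (__LINE__%5)==0
	everyFifth := keep(1, 20, func(line int) bool {
		return line%5 == 0
	})
	fmt.Println(everyFifth)
}
```

The window predicate is half-open, so it selects 100 records (300 through 399), and the modulo predicate keeps one record in every N.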

Head and Tail

Header plus the first 100 records:

  abhead -i csv://foo.csv -n 100

Header plus the last 100 records:

  abtail -i csv://foo.csv -n 100

Header, then skip the first 100 records and emit the rest:

  abtail -i csv://foo.csv -n +100


Sort

  absort -i csv://foo.csv -f Last_Name,First_Name
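
A multi-column sort like the one above can be sketched in Go with a stable sort that compares the key columns in order. This is an assumption-laden illustration: sortByColumns is a hypothetical helper, the sample rows are made up, and ascending byte-wise string comparison is a guess, since the text does not say how absort orders values.

```go
package main

import (
	"fmt"
	"sort"
)

// sortByColumns (hypothetical helper) sorts rows in place by the named
// key columns, comparing the first key first and later keys as tie-breakers.
func sortByColumns(header []string, rows [][]string, keys []string) error {
	pos := make([]int, 0, len(keys))
	for _, k := range keys {
		found := -1
		for i, h := range header {
			if h == k {
				found = i
				break
			}
		}
		if found < 0 {
			return fmt.Errorf("unknown column %q", k)
		}
		pos = append(pos, found)
	}
	// Assumption: ascending, byte-wise string comparison.
	sort.SliceStable(rows, func(a, b int) bool {
		for _, p := range pos {
			if rows[a][p] != rows[b][p] {
				return rows[a][p] < rows[b][p]
			}
		}
		return false
	})
	return nil
}

func main() {
	// Made-up sample data for illustration only.
	header := []string{"First_Name", "Last_Name"}
	rows := [][]string{
		{"Bob", "Smith"},
		{"Alice", "Smith"},
		{"Carol", "Jones"},
	}
	if err := sortByColumns(header, rows, []string{"Last_Name", "First_Name"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range rows {
		fmt.Println(r[1] + ", " + r[0])
	}
}
```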



Expressions have access to the record number in the stream (lnum), the array of field values (rec) and each of the declared fields.

Modify a source: add columns, apply an expression to records.

Common Command Line Options

  • -tmp /path/to/tmp

Specify an alternative temp directory to use.

URL Encoding


The scheme corresponds to the driver / file encoding.


The path corresponds to the file path to read from or write to. You can interact with stdin/stdout on Unix by specifying the file paths /dev/stdin and /dev/stdout respectively. Drivers may interpret the path differently; for example, database drivers would interpret the first parts of the path as the database host, schema and table name.

Query String

The query string carries both common options and driver-specific options.

Common Options
  • header=f1,f2,f3

Specify the header for a file if it does not have one of its own.

  • -header

Indicates the source has no header and that one should be fabricated (F1, F2, F3, …). This is assumed if a header is supplied, and can be used to strip the header off of a destination file.

  • skipLines=N

Skip the first N lines of the source.
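
As an illustration of how a record-stream URL with these options might be taken apart, here is a Go sketch built on the standard net/url package. It is not the project's parser: the source type, parseSource and fabricateHeader are hypothetical names, and joining the URL host and path into one file path is an assumption made so that both csv://foo.csv and csv:///dev/stdout yield a usable path.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// source (hypothetical type) holds the parts of a record-stream URL.
type source struct {
	Driver    string   // URL scheme, e.g. "csv" or "tab"
	Path      string   // file path to read from or write to
	Header    []string // from header=f1,f2,f3, if given
	SkipLines int      // from skipLines=N, if given
}

// parseSource (hypothetical helper) extracts the driver, path and the
// common options from a record-stream URL.
func parseSource(raw string) (source, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return source{}, err
	}
	// Assumption: for "csv://foo.csv" net/url reports "foo.csv" as the
	// host, so host and path are joined to recover the file path.
	s := source{Driver: u.Scheme, Path: u.Host + u.Path}
	q := u.Query()
	if h := q.Get("header"); h != "" {
		s.Header = strings.Split(h, ",")
	}
	if n := q.Get("skipLines"); n != "" {
		s.SkipLines, err = strconv.Atoi(n)
		if err != nil {
			return source{}, fmt.Errorf("skipLines: %w", err)
		}
	}
	return s, nil
}

// fabricateHeader (hypothetical helper) builds F1, F2, ... Fn for a
// source that has no header of its own.
func fabricateHeader(n int) []string {
	h := make([]string, n)
	for i := range h {
		h[i] = "F" + strconv.Itoa(i+1)
	}
	return h
}

func main() {
	s, err := parseSource("csv://foo.csv?header=f1,f2,f3&skipLines=2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", s)
	fmt.Println(fabricateHeader(3))
}
```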

Supported Encoding Formats


There are no custom options supported for the tab driver.


There are no custom options supported for the csv driver.


The fixed width driver can be used to read or write fixed width file formats. It supports the following options:




NB: The pg driver supports read operations but does not support write operations.

abcat -i "pg://localhost:5432/database_name/table_name?order=somecol&limit=10&offset=0&user=pg_username&password=pg_password" -o csv://table.csv
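
The pg URL above packs the host, database, table and query options into one string. The Go sketch below shows one way to split it with the standard net/url package. It is an illustration under assumptions: pgSource and parsePG are hypothetical names, and requiring exactly two path segments (database and table) follows the example URL rather than any documented rule.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// pgSource (hypothetical type) holds the parts of a pg:// URL.
type pgSource struct {
	Host     string
	Port     string
	Database string
	Table    string
	Options  url.Values // order, limit, offset, user, password, ...
}

// parsePG (hypothetical helper) splits a pg:// URL into its parts.
func parsePG(raw string) (pgSource, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return pgSource{}, err
	}
	if u.Scheme != "pg" {
		return pgSource{}, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	// Assumption: the path is /database_name/table_name, as in the example.
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return pgSource{}, fmt.Errorf("expected /database/table, got %q", u.Path)
	}
	return pgSource{
		Host:     u.Hostname(),
		Port:     u.Port(),
		Database: parts[0],
		Table:    parts[1],
		Options:  u.Query(),
	}, nil
}

func main() {
	src, err := parsePG("pg://localhost:5432/database_name/table_name?order=somecol&limit=10&offset=0&user=pg_username&password=pg_password")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(src.Host, src.Port, src.Database, src.Table)
	fmt.Println(src.Options.Get("order"), src.Options.Get("limit"), src.Options.Get("offset"))
}
```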




  abcat -i "tab://fixtures/,Count"
  abcat -i "tab://fixtures/,Count" -o csv:///dev/stdout
  abfillrates -i "tab://fixtures/,Count"

  abgrep -e 'Substr(Name,-1,0) == "y"' -i "tab://fixtures/,Count" 2>&1 | less


