Termite is a generic distributed compilation system.

The master distributes the compilation to workers.  Workers run
arbitrary binaries chrooted in a FUSE mirror of the master's file
system, and then ship the results back to the master.


CAVEATS

Work in progress.


COMPILE

Use the following recipe:

  git clone git://github.com/hanwen/go-fuse.git
  git clone git://github.com/hanwen/termite.git
  (cd go-fuse && sh all.bash)
  (cd termite && sh all.bash)

I do not goinstall go-fuse, since I develop go-fuse in tandem with
termite, but with some tweaks to the Makefile it should work.

Some other tools need patching/upgrading too.

  # Add MAKE_SHELL variable to make.
  wget http://ftp.gnu.org/gnu/make/make-3.82.tar.bz2
  tar xjf make-3.82.tar.bz2
  cd make-3.82 && patch -p1 < ../termite/patches/make-*patch
  ./configure && make && make install

  # Upgrade coreutils to 8.0 or later to have working "rm -rf".

OVERVIEW

There are 4 binaries:

* Coordinator: a simple server that administers a list of live
workers.  Workers periodically contact the coordinator.

* Worker: should run as root, and typically runs on multiple machines.

* Master: the daemon that runs on the machine where the build is
started.  It contacts the coordinator to get a list of workers, and
reserves job slots on the workers.  Run it in the root of the writable
directory for the compile.  It creates a .termite-socket that the
shell wrapper below uses.

* Shell-wrapper: a wrapper to use with make's SHELL variable.

Whether a command runs remotely or locally is configured through the
file .termite-localrc in the same directory as .termite-socket.  The
file is in JSON format, and you can find examples in the patches/
subdirectory.  The default

  [{
    "Regexp": ".*termite-make",
    "Local": true,
    "Recurse": true,
    "SkipRefresh": true
  }, {
    "Regexp": ".*",
    "Local": false
  }]

(i.e., only recursive make calls are run locally) should work for most
projects, but for performance reasons, you might want to run more
commands locally.

Typically, build-system commands (e.g. make, cmake) should run locally.

Commands that modify build artefacts should not run locally: local
commands do not run inside a FUSE sandbox, so termite can't tell which
files they modify or how to update the filesystem caches on the
workers.  By default, after executing a local command, the termite
master scans for changed files.  If you know that a command does not
change any files, you can skip this scan with SkipRefresh: true.
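
To make the rule format concrete, here is a small sketch that decodes
the default rules and looks up the rule for a command line.  The field
names come from the JSON above; the Rule type, the ParseRules and Match
helpers, and the first-match-wins, unanchored matching are assumptions
of this sketch and may differ from what the shell wrapper really does.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// Rule mirrors the fields shown in the .termite-localrc example.
type Rule struct {
	Regexp      string
	Local       bool
	Recurse     bool
	SkipRefresh bool
}

// defaultRules is the default configuration quoted in this README.
const defaultRules = `[{
  "Regexp": ".*termite-make",
  "Local": true,
  "Recurse": true,
  "SkipRefresh": true
}, {
  "Regexp": ".*",
  "Local": false
}]`

// ParseRules decodes a JSON list of rules.
func ParseRules(data []byte) ([]Rule, error) {
	var rules []Rule
	err := json.Unmarshal(data, &rules)
	return rules, err
}

// Match returns the first rule whose regexp matches the command.
// First-match-wins is an assumption of this sketch.
func Match(rules []Rule, command string) (Rule, bool, error) {
	for _, r := range rules {
		ok, err := regexp.MatchString(r.Regexp, command)
		if err != nil {
			return Rule{}, false, err
		}
		if ok {
			return r, true, nil
		}
	}
	return Rule{}, false, nil
}

func main() {
	rules, err := ParseRules([]byte(defaultRules))
	if err != nil {
		panic(err)
	}
	for _, cmd := range []string{"termite-make -C lib all", "gcc -c foo.c -o foo.o"} {
		r, _, err := Match(rules, cmd)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q local=%v skiprefresh=%v\n", cmd, r.Local, r.SkipRefresh)
	}
}
```

Under these assumptions the recursive make call is kept local without
a refresh scan, and the compiler invocation falls through to the
catch-all rule and is sent to a worker.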



RUNNING

  dd if=/dev/random of=secret.txt bs=1 count=20 && chmod 0600 secret.txt
  ${TERMITE_DIR}/bin/coordinator/coordinator &
  sudo ${TERMITE_DIR}/bin/worker/worker -jobs 4

  # Run distributed build
  ln -s ${TERMITE_DIR}/termite-make ~/bin/
  ln -s ${TERMITE_DIR}/bin/shell-wrapper ~/bin/

  cd ${PROJECT}
  ${TERMITE_DIR}/bin/master/master -jobs 4 \
    -secret ${TERMITE_DIR}/secret.txt &
  termite-make -j20


PERFORMANCE

Lenovo T60, local build with a local worker, building LLVM+Clang
lib/Support using CMake with a separate build dir at -j1: 2.0-2.3x slower

The worker can typically run 2 jobs per available core.
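
As a rough illustration of that rule of thumb, the value to pass to
the worker's -jobs flag could be derived from the core count as below.
SuggestedJobs is a hypothetical helper for this example, not part of
Termite.

```go
package main

import (
	"fmt"
	"runtime"
)

// SuggestedJobs applies the "2 jobs per available core" rule of thumb.
// Hypothetical helper; Termite does not compute this for you.
func SuggestedJobs(cores int) int {
	if cores < 1 {
		cores = 1
	}
	return 2 * cores
}

func main() {
	fmt.Printf("worker -jobs %d\n", SuggestedJobs(runtime.NumCPU()))
}
```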

TODO - stats on how many workers saturate a master-core/master-harddisk.


SECURITY

* The worker runs binaries as user 'nobody' in a chroot of a FUSE file
  system.  It needs root permission for the following actions:

  - mount a FUSE fs as allow_other
  - chroot to FUSE fs
  - change uid to user 'nobody'

* Worker and master use plaintext TCP/IP, and use a shared secret with
  HMAC-SHA1 to authenticate the connection.  See
  https://github.com/hanwen/termite/blob/master/termite/connection.go
  for details.

  If this connection scheme gets broken,

  - a malicious user may request all files from the master that the
    UID running the master has access to.

  - a malicious user may try to run a binary on the worker that tries
    to break out of chroot.

* Wrapper and master run as the same user and use IPC unix domain
  sockets to communicate.  The socket mode is 0700.


CAVEATS

* Not all file operations are supported in the workers, due to
  limitations of Go-FUSE's UnionFs.  Some are simply not yet
  implemented, but POSIX-correct hardlinks cannot be supported without
  major changes.


TODO

* Packages to compile:
  - linux kernel

* Security:
  - More sophisticated exclusion for exporting file systems.
  - Security review for the connection scheme.

* Features
  - Collect worker log information and expose on HTTP.

* Data collection:
  - compute average parallelism.
  - number of jobs scheduled
  - collect data on cpu used/time used in worker.

* Speed
  - Worker:
    * Do full updates for OpenDir.
  - Master:
    * add in-memory cache for file contents.
    * worker <-> worker file content fetching
  - Master/Worker:
    * streaming connections for file transport.
    * expand directory deletion in worker.
    * for src/writableRoot, always sync worker and master,
      so there is never a need to RPC for a negative getattr.
  - Worker:
    * implement in-memory FileSystem for backing store


SUCCESSFUL COMPILES

Timings were taken with the master and a single worker running on the
same machine.

* Make 3.82 (1.6x slower, Lenovo T60, 2-core, make -j2)

* LLVM 2.9 (1.5x slower, Dell T5300 6-core, make -j12)

* GUILE 2.0.
 - Must run inside srcdir.
 - 1.1x slower, Dell T5300 6-core, make -j6

* Emacs 24
 - 1.8x slower (Lenovo T60, make -j2)
 - Must run in srcdir.
