Menagerie

An orchestration platform for Docker containers running batch jobs. It was driven by the need to run multiple malware analyzers side by side, each with a different set of installation requirements and technologies, and to expose them as services to other systems and users.

The following common services are added on top of the raw engines:

  • Job submission and tracking
  • Job history
  • Execution and resource containment
  • Engine version management

Quick start

$ cd environments/generic-ami
$ vagrant up
  • Launches the entire system, with a sample engine wrapping apktool
  • We use an Ubuntu 14.04 box and Docker 1.10. You can adapt this to your environment, but keep in mind that at least Docker 1.10 is required for the features we rely on
  • We recommend installing the vagrant-vbguest plugin; it aligns the guest additions in the imported box

Once the system is up and running it can be used as follows (from outside the Vagrant box):

  • Submit a job to an engine via curl -v -XPOST http://localhost:8100/apktool/upload -F "upload=@<path-to-sample-apk>". The response contains a job-id number
  • Get the result via curl -v http://localhost:8100/result/<job-id> (an end-to-end example follows this list)
  • Access the console and result viewer via menagerie console
  • Monitor RabbitMQ via the RabbitMQ admin, using the credentials provided in confs/default.json (menagerie|menagerie by default)
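
For reference, here is an end-to-end sketch of the flow above as a small shell script. It is not part of the repo; it assumes the Vagrant box from the quick start (localhost:8100, the apktool engine), that the upload and result responses are JSON carrying the jobid and status fields listed in the API section below, and that jq is available for parsing.

SAMPLE=/path/to/sample.apk   # replace with a real APK
JOBID=$(curl -s -XPOST http://localhost:8100/apktool/upload -F "upload=@${SAMPLE}" | jq -r .jobid)
# Poll until the engine reports Success or Failed
while true; do
  STATUS=$(curl -s http://localhost:8100/result/${JOBID} | jq -r .status)
  echo "job ${JOBID}: ${STATUS}"
  [ "${STATUS}" = "Running" ] || break
  sleep 5
done
# Fetch the raw result file
curl -s http://localhost:8100/link/${JOBID} -o result.out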

API

Menagerie supports the following HTTP calls:

Method | URL                   | Parameters                             | Result (JSON)
------ | --------------------- | -------------------------------------- | -------------
POST   | /<engine-name>/upload | upload (multipart): filename and body  | jobid: job tracking ID
GET    | /result/<id>          | <id> (URL): jobid from the upload call | status: one of [Running, Success, Failed]; summary: excerpt from the result file; link: link to the result file
GET    | /link/<id>            | <id> (URL): jobid from the upload call | Result file bytes

curl calls:

  • curl -XPOST http://<server>:<port>/<engine-name>/upload -F "upload=@<path-to-file>"
  • curl -XGET http://<server>:<port>/result/<jobid>
  • curl -XGET http://<server>:<port>/link/<jobid>

Architecture

(Architecture diagram)

The entire system is built with Docker containers that interact with each other. The Vagrant box loads all of them in one place, but some services can be separated out to external locations (see the scalability section below). Here is what we use, each running in its own container:

  • Secure private Docker registry - for storing engines and the menagerie containers, as well as component version management
  • RabbitMQ
  • Mysql - for job tracking and history
  • Go frontend - HTTP API, console webapp, and internal storage service
  • Go backend worker threads that launch the engine containers

Job lifecycle is as follows:

  • A file is submitted to a specific engine over the HTTP API
  • The file is stored in the file storage over HTTP, a job is inserted into a named queue in RabbitMQ, and a job entry is created in Mysql
  • Workers configured to monitor queues by name pull jobs when free. When a job is pulled:
    • A directory is created, and the input file is pulled from the storage
    • An engine container is launched, with the directory mounted as configured (see the sketch after this list)
    • When the engine is done - the worker acks RabbitMQ, stores the result, and sends a completion request over an internal HTTP API
    • The completion call updates the database as needed
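
The engine launch step above boils down to a docker run assembled from the engines.json fields. Below is a rough sketch for the sample apktool engine; it is only illustrative, since the real worker does this programmatically, enforces the configured timeout, and mounts a Docker volume rather than a host path, so the -v source and the UID here are placeholders.

# runflags, mountpoint, user, image, and cmd all come from the engine's entry in engines.json
docker run --rm \
  --security-opt apparmor:menagerie.apktool \
  -v menagerie_job_1234:/var/data \
  --user 1000 \
  {{regserver}}/apktool:stable /engine/run.sh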

HOWTOS

Adding engines

Creating additional engines is as easy as creating a Dockerfile. See the sample wrapper we provide for apktool here.
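
To show the contract the worker expects, here is a hypothetical engine entry point (the actual apktool wrapper in the repo may differ): the input file appears under the configured mountpoint with the configured inputfilename, and the engine leaves its output in that same directory for the worker to collect; the output file names below are assumptions.

#!/bin/sh
# Hypothetical entry point; "cmd" in engines.json points here
set -e
INPUT=/var/data/sample.apk     # mountpoint + inputfilename from the engine config
OUTDIR=/var/data/out           # assumption: output is left under the mountpoint for the worker
mkdir -p "$OUTDIR"
apktool d -f -o "$OUTDIR" "$INPUT" > /var/data/result.log 2>&1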

Once the engine is built and pushed to the private registry, you need to:

  • Add an entry to both 'engines.json' files (see the file system structure below). An entry has the following structure (see also the apktool engine config):
{"engines": [
  { "name": "apktool",                        # engine queue name
    "workers": 2,                             # how many workers listening
    "image": "{{regserver}}/apktool:stable",  # regserver:port/new-engine-name:tag
    "runflags": [                             # additional run flag strings, concatenated with spaces to cmd
      "--security-opt",
      "apparmor:menagerie.apktool"
    ],
    "cmd": "/engine/run.sh",                  # entry point
    "mountpoint": "/var/data",                # where the engine expects the input (is ephemeral docker volume)
    "sizelimit": 50000000,                    # limit on upload size, bytes
    "inputfilename": "sample.apk",            # the script should expect a single file as input, by this name
    "user": {{uid}},                          # templated, using the UID provided to install. can be HC
    "timeout": 240                            # seconds on engine run
  }
]}
  • Pull the engine container on the deployed machine
  • sudo stop menagerie && sudo start menagerie

You will see a new engine/queue added to the console and the RabbitMQ admin. Submitting jobs works as described in the quick start; use the engine name instead of apktool in the curl request.

When you build and push new versions of engine containers, follow the steps above, simply correcting the config to point to the image tag you want (note that if the config uses the latest tag, pulling the latest image is enough; there is no need to restart services).
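
On the deployed machine the rollout amounts to standard Docker and upstart commands; for example (registry address, image name, and tag are placeholders that should match the engines.json entry):

docker pull <regserver>:<port>/new-engine-name:stable
sudo stop menagerie && sudo start menagerie   # skip the restart when the entry tracks the latest tag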

Configuration

The menagerie container is launched twice as part of normal operation: once as the frontend server and once as the workers controller. For each of the two instances we push the same two config files into the container, located under /data/confs (see also the volumes section).

For engine configuration - see the engines section above.

The second file is the global configuration file (default.json), which contains the locations of internal services, credentials, and miscellaneous directories.

The cleanup script (for volumes) is located inside the menagerie containers at /usr/local/menagerie/scripts/cleanup.sh; edit this file if you need to increase or reduce the period for which files are kept. The script is launched periodically via an external cron task (see /etc/cron.d/menagerie-cron).
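
A retention policy of this kind typically reduces to a find over the job and store directories. The snippet below is only a sketch of the idea, not the shipped cleanup.sh, and RETENTION_DAYS is an illustrative knob:

RETENTION_DAYS=7   # assumption: tune to control how long job files are kept
# run against /data/store (frontend) or /data/jobs (menage), whichever the container holds
find /data/store -mindepth 1 -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} +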

Deployment

It is recommended to read the Vagrantfile, as it contains the most up-to-date example of deploying the system. Note that Vagrant places all required services in one box; this can be distributed as described below. Also note that the Vagrant box functions as a dev/build box as well, since we compile the Go binaries on it. This is not required in production deployments, where the menagerie container can be built on a separate machine and pushed to the Docker registry (we use Jenkins; feel free to use whatever you fancy).
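
Building and publishing the menagerie image from a separate build machine needs nothing beyond standard Docker commands; for example (registry address, image name, and tag are placeholders):

docker build -t <regserver>:<port>/menagerie:stable .
docker push <regserver>:<port>/menagerie:stable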

Volumes

We use Docker named volumes that are mounted under /data inside the core containers:

vagrant@vagrant-ubuntu-trusty-64:~$ docker volume ls
DRIVER              VOLUME NAME
local               menagerie_menage
local               menagerie_mysql
local               menagerie_rabbitmq
local               mngreg_registry_conf
local               mngreg_registry_data
local               menagerie_frontend

Structure inside the frontend volume:

frontend_1$ tree /data/
.
└── data/
    ├── keys/
    │   ├── engines.json
    │   └── frontend.json
    ├── log/
    ├── mule/
    └── store/
       ├── <<job-id>>/
       │   ├── input
       │   └── result
       └── ...

Structure inside the backend/menage volume:

menage_1$ tree /data/
.
└── data/
    ├── keys/
    │   ├── engines.json
    │   └── frontend.json
    ├── log/
    ├── mule/
    └── jobs/
       ├── <<running/failed job-id>>/
       └── ...

Docker volumes are supported from version 1.9 and allow data to be persisted in Dockerland. These volumes can also be made remote later on when using Swarm or other scaling solutions.

Services

The containers are launched by Docker Compose and monitored by upstart. The upstart scripts are located under /etc/init/ and are all named menagerie*.conf.
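
To list the defined services and restart the stack, the standard upstart tooling is enough; for example:

ls /etc/init/menagerie*.conf
initctl list | grep menagerie
sudo restart menagerie   # or sudo stop menagerie && sudo start menagerie, as above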

Logs

All logs can be viewed using the docker logs <container-name> command. The maintenance tasks launched via cron produce logs tagged with the container name, which can be viewed in /var/log/syslog.
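
For example (the exact container names depend on the compose project, so check docker ps first):

docker ps --format '{{.Names}}'       # list the running containers
docker logs -f menagerie_frontend_1   # assumption: compose-style name; adjust to your docker ps output
grep menagerie /var/log/syslog        # cron-driven maintenance entries, tagged with the container name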


Extra info

Security

Setting aside the Mysql/RabbitMQ/Docker-registry containers (all of which can run as external services in production environments; see scalability), the two core services are frontend and menage, the worker controller.

We ran the docker-benchmark test and remediated the relevant findings (not all of them apply to a developer/Vagrant box). We highly recommend running it on your production deployment.

We run Docker with user namespace mapping (DOCKER_OPTS="--userns-remap=default", see the Vagrantfile), a feature new in 1.10. This means that although the containers run internally as the root user, the UID is actually that of a less privileged user on the host. The engines are also confined by a timeout, and we highly recommend adding additional run flags to limit them via the JSON file described above.
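
For example, the runflags array in the engine's entry can carry standard docker run resource limits alongside the AppArmor profile. The values below are purely illustrative; the flags are plain docker run options (memory/CPU limits and disabled networking), not something specific to menagerie:

"runflags": [
  "--security-opt", "apparmor:menagerie.apktool",
  "--memory", "512m",
  "--cpu-shares", "256",
  "--net", "none"
]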

Since we are using Docker volumes, there is no mapping of host disk into the containers.

Finally, we implemented AppArmor profiles for the menage/frontend containers, and provide a hands-on wiki page for adding an AppArmor profile to your custom engines. The profiles are defined in complain mode.

Scalability

(Scalability diagram)

As mentioned earlier, in real production environments we should break out the generic components:

  • Mysql
  • RabbitMQ
  • Docker registry
  • File store

Once we have that, we can have multiple nodes running only menagerie and engine containers, and place a load-balancer in front of the API.

Note that when using a private Docker registry, you need to make sure the certificate is installed and a docker login has been performed. This can be done via the reg-connect.sh script, which is also used in the Vagrantfile.
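
In essence this means placing the registry's CA certificate where the Docker daemon expects it and logging in. A sketch of the manual steps is below (registry address, port, and certificate path are placeholders; reg-connect.sh is the supported way to do this):

sudo mkdir -p /etc/docker/certs.d/<regserver>:<port>
sudo cp ca.crt /etc/docker/certs.d/<regserver>:<port>/ca.crt
docker login <regserver>:<port>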


Contribution

Contributions to the project are welcome. However, you must provide one of the contribution forms (CLA) that are part of the project alongside the pull request. If the contributor is operating in an individual or personal capacity, they should use the individual CLA; if operating in their role at a company or entity, they must use the corporate CLA.


License

(c) Copyright IBM Corp. 2015, 2016

This project is licensed under the Apache License 2.0. See the LICENSE file for more info.

3rd party software used by menagerie:

Project icon from iconka, under a free license