Dagger

Dagger is a dynamic realtime stream processing framework based on publishing and subscribing to streams of data.

Computations are only executed on a stream (or streams) of data if someone is actually subscribed to and using that computation. For example, suppose a process that persists data to a DB is subscribed to a stream of a node's CPU utilization averaged over one-minute intervals. The computation producing those one-minute averages will run for as long as that process is running. When an administrator opens a dashboard that displays a realtime chart of CPU utilization averaged over 5 seconds, a second computation spins up to supply the 5-second averages. When the admin closes the dashboard, that computation shuts down, because no one is interested in its data anymore.

This way, computation and data transfer over the network happen only when necessary, conserving compute cycles and reducing network traffic.
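To make that lifecycle concrete, here is a minimal Go sketch of a computation that stays alive only while someone is subscribed. Everything in it (the in-memory subscriber counter, the function names) is an assumption made for illustration; an actual Dagger worker coordinates through Consul or etcd rather than a local counter.

```go
// A minimal sketch of the subscriber-driven lifecycle, using only the
// standard library. Names here are illustrative, not Dagger's actual API.
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// subscribers stands in for the coordination KV store: the number of
// subscribers currently interested in the derived stream.
var subscribers int64

// runComputation keeps a derived-stream computation alive for as long as at
// least one subscriber is registered, then shuts down.
func runComputation(stream string, done chan<- struct{}) {
	fmt.Println("spinning up computation:", stream)
	for atomic.LoadInt64(&subscribers) > 0 {
		// ... consume input records, emit averaged records to subscribers ...
		time.Sleep(50 * time.Millisecond)
	}
	fmt.Println("no subscribers left, shutting down:", stream)
	close(done)
}

func main() {
	atomic.StoreInt64(&subscribers, 1) // a dashboard subscribes
	done := make(chan struct{})
	go runComputation("avg(cpu_util, 5s)", done)

	time.Sleep(200 * time.Millisecond)
	atomic.StoreInt64(&subscribers, 0) // the dashboard closes
	<-done
}
```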

This software is a prototype in the making. It is being developed for the purpose of my master's thesis.

Properties

  • Multilang plugins

    Plugins for stream processing can be written in any language and are interfaced via JSON-RPC (a minimal sketch follows this list).

  • Exactly once delivery

    Ensured using retries and efficient deduplication (see the deduplication sketch after this list).

  • Timestamp-ordered processing

    Even when processing records from different streams, Dagger processes records in the order of their timestamps. This is achieved using low watermarks, also used in Google's MillWheel stream processing system, which indicate when all records up to a certain point in time have been received (see the low-watermark sketch after this list).

  • Fault tolerant

    Achieved without a separate database system: redundant copies of a computation run in parallel, enabling failover without data loss.

  • Kafka-like rewinds

    Using the approach pioneered by Kafka, Dagger persists all streams to disk. This means you can choose the position from which to start receiving stream data, or even rewind to an earlier point in time.

  • Decentralized

    No single point of failure. Workers are fail-fast and coordinate via a consistent KV store such as Consul or etcd.

  • Easy deployment

    Written in Go, so all you need to deploy is a Consul or etcd cluster and a Go binary.
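The sketch below illustrates how a plugin could expose itself over JSON-RPC, using Go's standard net/rpc/jsonrpc package. The method name Plugin.ProcessRecord and the record shape are assumptions for illustration only, not Dagger's actual plugin protocol.

```go
// A hypothetical stream-processing plugin served over JSON-RPC.
package main

import (
	"log"
	"net"
	"net/rpc"
	"net/rpc/jsonrpc"
)

// Record is the unit of data exchanged with the framework (illustrative shape).
type Record struct {
	Timestamp int64   `json:"timestamp"`
	Value     float64 `json:"value"`
}

// Plugin implements a trivial transformation: doubling each value.
type Plugin struct{}

// ProcessRecord would be called by the framework once per input record.
func (p *Plugin) ProcessRecord(in Record, out *Record) error {
	*out = Record{Timestamp: in.Timestamp, Value: in.Value * 2}
	return nil
}

func main() {
	if err := rpc.Register(&Plugin{}); err != nil {
		log.Fatal(err)
	}
	ln, err := net.Listen("tcp", "127.0.0.1:7070")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go jsonrpc.ServeConn(conn) // speak JSON-RPC over this connection
	}
}
```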
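Next, a minimal sketch of the deduplication half of exactly-once delivery: records carry unique IDs, so a retried delivery can be recognized and dropped. The Record and Deduplicator types are illustrative assumptions, not Dagger's internals.

```go
// Deduplication by record ID: retries may deliver a record twice,
// but it is only processed once.
package main

import "fmt"

// Record is a stream record carrying a unique ID.
type Record struct {
	ID    string
	Value float64
}

// Deduplicator drops records whose ID has already been processed.
type Deduplicator struct {
	seen map[string]bool
}

// Process returns true the first time an ID is seen and false for duplicates.
func (d *Deduplicator) Process(r Record) bool {
	if d.seen[r.ID] {
		return false
	}
	d.seen[r.ID] = true
	return true
}

func main() {
	d := &Deduplicator{seen: make(map[string]bool)}
	// The same record arrives twice (a retry after a lost ack),
	// but only the first delivery is processed.
	for _, r := range []Record{{"r1", 0.42}, {"r1", 0.42}, {"r2", 0.57}} {
		fmt.Println(r.ID, "processed:", d.Process(r))
	}
}
```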
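Finally, a minimal sketch of the low-watermark idea: the watermark over several input streams is the minimum of the per-stream marks, and only records with timestamps at or below it are processed, which yields timestamp order. This illustrates the concept rather than Dagger's implementation.

```go
// Low watermark across input streams: no stream can still deliver a record
// older than the minimum of the per-stream watermarks.
package main

import (
	"fmt"
	"time"
)

// lowWatermark returns the earliest per-stream watermark.
func lowWatermark(perStream map[string]time.Time) time.Time {
	var earliest time.Time
	first := true
	for _, t := range perStream {
		if first || t.Before(earliest) {
			earliest = t
			first = false
		}
	}
	return earliest
}

func main() {
	base := time.Date(2015, 6, 1, 12, 0, 0, 0, time.UTC)
	perStream := map[string]time.Time{
		"cpu_util@node1": base.Add(10 * time.Second),
		"cpu_util@node2": base.Add(4 * time.Second), // lagging stream holds the watermark back
	}
	fmt.Println("safe to process records with timestamps up to:", lowWatermark(perStream))
}
```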

Dynamic work assignment example

1: the monitored node has a 'cpu_util' stream available

monitored node
+---------------+
|               |
| pub: cpu_util |
|               |
+---------------+

2: a subscriber comes along that wants a stream of 'cpu_util' averaged over 5 minutes (say, a process that persists this to a DB)

monitored node                                                        subscriber
+---------------+                                                    +---------------------------+
|               |                                                    |                           |
| pub: cpu_util |                                                    | sub: avg(cpu_util, 5min)  |
|               |                                                    |                           |
+---------------+                                                    +---------------------------+

3: a Dagger worker spins up, starts computing this average, and makes it available to subscribers

monitored node           worker node                                  subscriber
+---------------+       +-----------------------------------+        +---------------------------+
|               |       |                                   |        |                           |
| pub: cpu_util +-------> computation: avg(cpu_util, 5min)  +--------> sub: avg(cpu_util, 5min)  |
|               |       |                                   |        |                           |
+---------------+       +-----------------------------------+        +---------------------------+

4: another subscriber shows up, this time wanting an average of 'cpu_util' over 10 seconds (say, an admin opens a web dashboard that displays live data)

monitored node           worker node                                  subscriber
+---------------+       +-----------------------------------+        +---------------------------+
|               |       |                                   |        |                           |
| pub: cpu_util +-------> computation: avg(cpu_util, 5min)  +--------> sub: avg(cpu_util, 5min)  |
|               |       |                                   |        |                           |
+---------------+       +-----------------------------------+        +---------------------------+

                                                                     +---------------------------+
                                                                     |                           |
                                                                     | sub: avg(cpu_util, 10sec) |
                                                                     |                           |
                                                                     +---------------------------+

5: a new worker is started, providing the required stream

monitored node           worker node                                  subscriber
+---------------+       +-----------------------------------+        +---------------------------+
|               |       |                                   |        |                           |
| pub: cpu_util +---+---> computation: avg(cpu_util, 5min)  +--------> sub: avg(cpu_util, 5min)  |
|               |   |   |                                   |        |                           |
+---------------+   |   +-----------------------------------+        +---------------------------+
                    |
                    |   +-----------------------------------+        +---------------------------+
                    |   |                                   |        |                           |
                    +---> computation: avg(cpu_util, 10sec) +--------> sub: avg(cpu_util, 10sec) |
                        |                                   |        |                           |
                        +-----------------------------------+        +---------------------------+

6: the second subscriber unsubscribes (say, the admin closes the dashboard)

monitored node           worker node                                  subscriber
+---------------+       +-----------------------------------+        +---------------------------+
|               |       |                                   |        |                           |
| pub: cpu_util +---+---> computation: avg(cpu_util, 5min)  +--------> sub: avg(cpu_util, 5min)  |
|               |   |   |                                   |        |                           |
+---------------+   |   +-----------------------------------+        +---------------------------+
                    |
                    |   +-----------------------------------+
                    |   |                                   |
                    +---> computation: avg(cpu_util, 10sec) |
                        |                                   |
                        +-----------------------------------+

7: the second worker detects that it has no subscribers left, so it shuts down

monitored node           worker node                                  subscriber
+---------------+       +-----------------------------------+        +---------------------------+
|               |       |                                   |        |                           |
| pub: cpu_util +-------> computation: avg(cpu_util, 5min)  +--------> sub: avg(cpu_util, 5min)  |
|               |       |                                   |        |                           |
+---------------+       +-----------------------------------+        +---------------------------+
