Skip to content

nmilford/phylactery

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

phylactery

In D&D, a phylactery is a magic container used to store a Lich's soul.

phylactery

This is mainly a proof-of-concept Go program for interacting with Cassandra. Partially as a toy to learn Go with and partially to flesh out the design for a global file ledger abstraction to sync up geographically disparate object stores and to make different object store pluggable.

This is mainly to get the idea and notes out of my head.

Right now I am dealing with a globally namespaced, single MySQL master MogielFS object store hosung ~10PB of data across 3 datacenters.

Looking to the future I'd rather each data center have it's own distinct MogileFS installation that looks to itself and maintains it's own replica of the data.

Then, I would have something external that keeps a list of all of the files that should be available (this project, the Phylactery)

When a file is written to any data center, the client/service notifies the Phylactery and that same process notes the files are missing and grabs them.

Finally, there would be a proccess interacting with each data center's MogileFS installation to make sure it has all of the files listed in the Phylactery and take corrective action if needed (get the missing file from another datacenter)

This has the advantages of:

  • Smaller failure domains and we are not beholden to a single datastore in one data center.
  • Allowing each datacenter to work standalone and allow repair to happen offline.
  • Make adding new datacenters easier and incremental.
  • As we have different asset/object types we can have a finer grain control of what goes in what data center if needed.
  • We can swap out MogileFS, and use Swift or Ceph or what have you and just hook it up to the Phylactery. I love not being beholden to any one technology.

I use Cassandra to make sure the data is in every datacenter and since I do not care how long writes take, I can write a CL.ALL to assure all Cassandra nodes get the update.

To tinker:

First, Install Cassandra:

Installing via apt-get will start it automatically, you need not do any config changes.

Use cqlsh to connect to the local Cassandra instance and copy-paste from the setup.cql file.

Assuming you have Go installed and your GOPATH set grab the gocql dependancy:

go get github.com/gocql/gocql

Then, just build it

go build phylactery.go 

And run

./phylactery 

Here is some example usage:

List a file that is not in tx01

curl http://localhost:8080/file/bad/tx01

Toggle that file as now in tx01

curl -X POST -d "{\"Fid\":\"1234.fid\"}" http://localhost:8080//file/add/tx01

Add a new file to the ledger.

curl -X POST -d "{\"Fid\":\"1234.fid\",\"Origin\":\"ma01\",\"Ma01\":true,\"Tx01\":false}" http://localhost:8080/file/new

Obviously the schema would need to be improved and I'd like to use a map in Cassandra instead of each datacenter... but whatever, will get to it.

Moreover, I can build plugin in MogileFS that will write back other metadata data to Cassandra if I want to do some real analytics with Spark/Shark

About

A proof-of-concept Go program for interacting with Cassandra.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages