
Hi. Co-founder and lead Pachyderm dev here. This is a really interesting comment and raises some questions that I haven't fully considered. Building the infrastructure in this decoupled way has let us move quickly early on and made it really easy to scale out horizontally, but we haven't paid as much attention to what you describe here. I think there might be some optimizations we can make within our current design that would help a lot. Probably the biggest source of slowness right now is that we POST files into jobs over HTTP. That's nice because it gives you a simple API for implementing new jobs, but it's really slow. A much more performant solution would be to hand the data to the job as a shared volume (a feature Docker offers), which would give very fast I/O because the container doing the processing could read the data directly off disk.
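
To make the contrast concrete, here's a minimal Go sketch of the two approaches. The paths, handler name, and process function are hypothetical (not Pachyderm's actual API); it's only meant to show why reading from a bind-mounted volume skips the copies an HTTP POST implies.

    package main

    import (
        "io"
        "net/http"
        "os"
        "path/filepath"
    )

    func process(data []byte) { /* job-specific work */ }

    // HTTP approach: every input file is POSTed to the job, so the bytes
    // travel through the network stack and are copied into user space
    // before any work starts.
    func httpJobHandler(w http.ResponseWriter, r *http.Request) {
        data, err := io.ReadAll(r.Body) // a full in-memory copy of the payload
        if err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
        process(data)
    }

    // Shared-volume approach: the host bind-mounts the input directory into
    // the container (roughly `docker run -v /pfs/input:/in ...`), and the job
    // reads the files straight off disk with no extra network hop.
    func volumeJob(inputDir string) error {
        return filepath.Walk(inputDir, func(path string, info os.FileInfo, err error) error {
            if err != nil || info.IsDir() {
                return err
            }
            data, err := os.ReadFile(path)
            if err != nil {
                return err
            }
            process(data)
            return nil
        })
    }

    func main() {
        http.HandleFunc("/job", httpJobHandler)  // the slow path described above
        if err := volumeJob("/in"); err != nil { // the shared-volume path
            panic(err)
        }
    }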



Andrew knows this stuff better than I do, but one way to start thinking about this is to look at what the sources of latency are, where your data gets copied, and what controls allocation of storage (disk/RAM/etc). For example, mmapped disk IO cedes control of the lifetime of your data in RAM to the OS's paging code. JSON over HTTP over TCP is going to make a whole pile of copies of your data as the TCP payloads get reassembled, copied into user space, and then probably copied again as that gets deserialized (and oh, the allocations in most JSON decoders). As for latency, you're going to have some context switches in there, which doesn't help. One way you might be able to improve performance is to use shared memory to get data in and out of the workers and process it in-place as much as possible. A big ring buffer and one of the fancy in-place serialization formats (capnproto, for example) could actually make it pretty pleasant to write clients in a variety of languages.
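
To make the copy-counting concrete, here's a small Go sketch (Linux/macOS, using the syscall package; the record layout and file path are made up) contrasting a JSON decode, which allocates and copies, with fixed-layout records read in place from an mmapped file:

    package main

    import (
        "encoding/binary"
        "encoding/json"
        "fmt"
        "os"
        "syscall"
    )

    type Record struct {
        ID    uint64 `json:"id"`
        Value uint64 `json:"value"`
    }

    // jsonPath: the decoder copies bytes out of the payload and allocates a
    // fresh Record (plus whatever intermediate buffers the decoder needs).
    func jsonPath(payload []byte) (Record, error) {
        var r Record
        err := json.Unmarshal(payload, &r)
        return r, err
    }

    // mmapPath: the file is mapped into the process's address space and the
    // fixed-layout records are read in place; no per-record allocation or copy.
    func mmapPath(path string) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        fi, err := f.Stat()
        if err != nil {
            return err
        }
        buf, err := syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
            syscall.PROT_READ, syscall.MAP_SHARED)
        if err != nil {
            return err
        }
        defer syscall.Munmap(buf)

        const recSize = 16 // two uint64s per record
        for off := 0; off+recSize <= len(buf); off += recSize {
            id := binary.LittleEndian.Uint64(buf[off : off+8])
            val := binary.LittleEndian.Uint64(buf[off+8 : off+16])
            fmt.Println(id, val) // work happens directly on the mapped bytes
        }
        return nil
    }

    func main() {
        if r, err := jsonPath([]byte(`{"id":1,"value":42}`)); err == nil {
            fmt.Println(r.ID, r.Value)
        }
        _ = mmapPath("/tmp/records.bin") // hypothetical file of packed records
    }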


Thanks for the tips :D. These all seem like very likely places to look for low-hanging fruit. We were actually early employees at RethinkDB, so we've been around the block with low-level optimizations like this. I'm really looking forward to getting Pachyderm into a benchmark environment and tearing it to shreds like we did at Rethink.


So, I'm not so sure that building a product that mandates both btrfs and Docker really counts as "decoupled".


Hi, jd @ Pachyderm here.

Long term we don't want to mandate either of these technologies. We'd like to offer users a variety of job formats (Docker and Rocket come to mind; more domain-specific things like SQL are interesting as well) and a variety of storage options (zfs, overlayfs, aufs, and in-memory, to name a few). However, we had to be pragmatic early on and pick what we thought were the highest-leverage implementations so that we could ship something that worked.
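
Purely as a sketch of the kind of abstraction that would make that swap possible (none of these names are Pachyderm's; it's a hypothetical Go driver registry, not our actual design), the storage side could sit behind a small interface:

    package storage

    // Driver is a hypothetical interface that a copy-on-write backend
    // (btrfs, zfs, overlayfs, an in-memory store, ...) could implement, so
    // the rest of the system never depends on one filesystem directly.
    type Driver interface {
        // Init prepares the backing store rooted at path.
        Init(root string) error
        // Commit snapshots the current state of a repo and returns a commit ID.
        Commit(repo string) (string, error)
        // Checkout materializes a given commit as a readable directory tree.
        Checkout(repo, commit, target string) error
    }

    // drivers is a registry keyed by name, e.g. "btrfs" or "in-memory".
    var drivers = map[string]Driver{}

    // Register makes a backend available under a name chosen by its package.
    func Register(name string, d Driver) { drivers[name] = d }

    // Get looks up a backend by name; ok is false if it was never registered.
    func Get(name string) (Driver, bool) {
        d, ok := drivers[name]
        return d, ok
    }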

We'll certainly be looking into getting rid of this mandate in the near future; we like giving people a variety of choices.


A very performant way would be native RDMA verbs access over an InfiniBand fabric, with 40 to 56 Gb/s of throughput. At that rate, remote storage is faster, latency-wise, than local SATA. Many HPC shops operate in this fashion, and the 100 Gb Ethernet around the corner only solidifies this.





