
Hi. Co-founder and lead Pachyderm dev here. This is a really interesting comment and raises some questions that I hadn't fully considered. Building the infrastructure in this decoupled way has let us move quickly early on and made it easy to scale out horizontally, but we haven't paid as much attention to the issues you raise here. That said, I think there are optimizations we can make within the current design that would help a lot. Probably the biggest source of slowness right now is that we POST files into jobs via HTTP. That's nice because it gives you a simple API for implementing new jobs, but it's really slow. A much more performant approach would be to hand the data to the job as a shared volume (a feature Docker offers); that would give much faster I/O because the container doing the processing could read the data directly off disk.

Andrew knows this stuff better than I do, but one way to start thinking about this is to look at the sources of latency, where your data gets copied, and what controls allocation of storage (disk/RAM/etc.). For example, mmapped disk I/O cedes control of your data's lifetime in RAM to the OS's paging code. JSON over HTTP over TCP is going to make a whole pile of copies of your data: TCP payloads get reassembled, copied into user space, then probably copied again as that gets deserialized (and oh, the allocations in most JSON decoders). On the latency side you're also going to have some context switches in there, which doesn't help. One way you might be able to improve performance is to use shared memory to get data in and out of the workers and process it in place as much as possible. A big ring buffer and one of the fancy in-place serialization formats (Cap'n Proto, for example) could actually make it pretty pleasant to write clients in a variety of languages.

Thanks for the tips :D. These all seem like very likely places to look for low-hanging fruit. We were actually early employees at RethinkDB, so we've been around the block with low-level optimizations like this. I'm really looking forward to getting Pachyderm into a benchmark environment and tearing it to shreds like we did at Rethink.

So, I'm not so sure that building a product that mandates both btrfs and Docker really counts as "decoupled".

Hi, jd @ Pachyderm here.

Long term we don't want to mandate either of these technologies. We'd like to offer users a variety of job formats (Docker and Rocket come to mind, and more domain-specific things like SQL are interesting as well) and a variety of storage options (zfs, overlayfs, aufs, and in-memory, to name a few). However, we had to be pragmatic early on and pick what we thought were the highest-leverage implementations so that we could ship something that worked.

We'll certainly be looking into getting rid of this mandate in the near future; we like giving people a variety of choices.

A very performant approach would be native RDMA verbs access over an InfiniBand fabric, with 40 to 56 Gb/s of throughput. At those rates, remote storage is faster latency-wise than local SATA. Many, many HPC shops operate in this fashion, and the 100 Gb/s Ethernet around the corner only solidifies this.
