

HDFS-alike in Go - michaelmaltese
https://github.com/michaelmaltese/golang-distributed-filesystem

======
vicaya
One missing feature/todo is end-to-end data checksums, one of the important
features that differentiates HDFS from many "academic" distributed file systems
(including Ceph). The checksums are computed in the client, stored on the data
nodes, and verified in the client upon retrieval.
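
To make the idea concrete, here is a minimal Go sketch of that flow. It is an illustration only: `StoredBlock`, `Upload`, `Download`, and `ErrChecksumMismatch` are hypothetical names, not taken from the linked repository, and CRC-32C is simply one reasonable choice of checksum.

```go
package main

import (
	"errors"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C polynomial table from the standard library.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// ErrChecksumMismatch is returned when retrieved data does not match
// the checksum that was computed at upload time.
var ErrChecksumMismatch = errors.New("checksum mismatch")

// StoredBlock is a hypothetical stand-in for what a data node keeps:
// the raw bytes plus the checksum the client computed.
type StoredBlock struct {
	Data []byte
	Sum  uint32
}

// Upload runs on the client: it computes the checksum before the data
// leaves the client, so corruption anywhere downstream is detectable.
func Upload(data []byte) StoredBlock {
	return StoredBlock{
		Data: append([]byte(nil), data...), // copy, as if sent over the wire
		Sum:  crc32.Checksum(data, castagnoli),
	}
}

// Download also runs on the client: it recomputes the checksum over the
// bytes it received and compares it with the stored value.
func Download(b StoredBlock) ([]byte, error) {
	if crc32.Checksum(b.Data, castagnoli) != b.Sum {
		return nil, ErrChecksumMismatch
	}
	return b.Data, nil
}

func main() {
	blk := Upload([]byte("hello, block"))
	if _, err := Download(blk); err != nil {
		fmt.Println("unexpected error:", err)
	}

	blk.Data[0] ^= 0x01 // simulate a bit flip on disk or in transit
	if _, err := Download(blk); err != nil {
		fmt.Println("corruption detected:", err)
	}
}
```

The point of doing both steps in the client is that the check covers the network, the data node's memory, and its disk in one go.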

Personally, I'd avoid any distributed storage systems without end-to-end
checksums.

Another related mandatory feature is background scrubbing (periodically
verifying block checksums and discarding corrupted replicas).
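
A scrubber along those lines could look roughly like the Go sketch below. Everything in it is assumed for illustration: `BlockStore`, `ScrubOnce`, and `RunScrubber` are hypothetical names, blocks live in an in-memory map rather than on disk, and CRC-32 (IEEE) stands in for whatever checksum the system actually stores.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sync"
	"time"
)

// replica pairs block data with the checksum recorded when it was written.
type replica struct {
	data []byte
	sum  uint32
}

// BlockStore is a hypothetical in-memory stand-in for a data node's storage.
type BlockStore struct {
	mu     sync.Mutex
	blocks map[string]replica
}

func NewBlockStore() *BlockStore {
	return &BlockStore{blocks: make(map[string]replica)}
}

// Put stores a copy of data together with its checksum.
func (s *BlockStore) Put(id string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cp := append([]byte(nil), data...)
	s.blocks[id] = replica{data: cp, sum: crc32.ChecksumIEEE(cp)}
}

// Len reports how many replicas are currently held.
func (s *BlockStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.blocks)
}

// corrupt flips one bit in a stored replica; it exists only for the demo.
func (s *BlockStore) corrupt(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.blocks[id]; ok && len(r.data) > 0 {
		r.data[0] ^= 0x01
	}
}

// ScrubOnce re-verifies every replica, discards the ones whose checksum
// no longer matches, and returns the IDs of the discarded replicas.
func (s *BlockStore) ScrubOnce() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	var bad []string
	for id, r := range s.blocks {
		if crc32.ChecksumIEEE(r.data) != r.sum {
			delete(s.blocks, id)
			bad = append(bad, id)
		}
	}
	return bad
}

// RunScrubber calls ScrubOnce on every tick until stop is closed and
// passes any discarded IDs to report (e.g. to trigger re-replication).
func (s *BlockStore) RunScrubber(interval time.Duration, stop <-chan struct{}, report func([]string)) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			if bad := s.ScrubOnce(); len(bad) > 0 {
				report(bad)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	s := NewBlockStore()
	s.Put("blk-1", []byte("alpha"))
	s.Put("blk-2", []byte("beta"))
	s.corrupt("blk-2")
	fmt.Println("discarded:", s.ScrubOnce(), "remaining:", s.Len())
}
```

In a real node, `RunScrubber` would be started in its own goroutine, and the `report` callback would tell the metadata service that the replica is gone so the block can be re-replicated from a healthy copy.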

~~~
michaelmaltese
Thanks for the advice! It was on my list of todos, but I was more interested
in getting to the point where I could experiment with a federated leadership.

I started implementing checksums (on upload and replication) and found an
error in my upload process, so it's already been useful. At some point I'll
add them to the download path and start background scrubbing. The cluster
already re-balances on the fly, so it'll just need a background routine to
remove the corrupted files and update the node state.

