XtreemFS – Fault Tolerant Distributed File System (xtreemfs.org)
59 points by turrini on Feb 25, 2016 | 26 comments

One of its creators here. If XtreemFS looks interesting also check out Quobyte (www.quobyte.com), our next software storage system.

Quobyte does not appear to be a mature product.

Clicking 'get started' merely leads to a 'please let me waste hours talking with your enterprise sales reps' form.

It should be possible to evaluate and try a mature product on my own, and I would rather not waste time doing so if I don't even know what it costs.

You are totally right with your expectations. But in a startup you can't do everything at once, and thus there is no direct download yet. If you contact us you'll talk directly to the developers and get a download link.

That said, Quobyte is in production for business critical workloads.

Of course you're right, a startup can't do everything -- but given that you're busy and have a lot to do, why not just post a download link on your website rather than require people to talk to you to get one?

Doing this thoroughly would mean that we would need to provide some form of support as well.

How does Quobyte compare to Ceph?

Edit: From the product page the main advantage seems to be end-to-end checksums and an adaptive placement engine.

Architecture and performance comparisons would be very interesting, though I understand if you can't disclose that yet.

How are the two related exactly? Is XtreemFS part of the core of the Quobyte product, or is it a dead-end, with all new development happening in Quobyte?

XtreemFS is being maintained by the folks at Zuse Institute.

Quobyte is a new implementation and only shares the architecture blueprint with XtreemFS.

How are XtreemFS and Quobyte different than GlusterFS?

Looking at the architecture, there are two major differences:

* XtreemFS has POSIX file system semantics and split-brain safe quorum replication for file data. With Quobyte, we pushed that further and have now full fault tolerance for all parts of the system, working at high performance for both file and block workloads (Quobyte also does erasure coding). GlusterFS replication is not split brain safe, and there are many failure modes that can corrupt your data.

* XtreemFS and Quobyte have metadata servers. This allows them to place data for each file individually, since the metadata server stores the location of the data. With Quobyte we pushed this quite far and have a policy engine that lets you configure placement. When the policy is changed, the system can move file data transparently in the background. This way you can configure isolation, partitioning, and tiering. GlusterFS has a pretty static assignment of file data to devices.
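To make the metadata-server idea concrete, here is a toy sketch of policy-driven placement. All names and the policy shape are hypothetical illustrations, not Quobyte's actual API: the point is only that a metadata server can pick devices per file according to a configurable policy and record the resulting mapping.

```python
# Toy sketch of per-file, policy-driven placement via a metadata server.
# Device names, fields, and the policy shape are invented for illustration.

from dataclasses import dataclass

@dataclass
class Device:
    name: str
    tier: str        # e.g. "ssd" or "hdd"
    free_bytes: int

def place_file(devices, policy_tier, replicas):
    """Pick `replicas` devices matching the policy, most free space first."""
    candidates = [d for d in devices if d.tier == policy_tier]
    candidates.sort(key=lambda d: d.free_bytes, reverse=True)
    if len(candidates) < replicas:
        raise RuntimeError("not enough devices for policy")
    return [d.name for d in candidates[:replicas]]

devices = [
    Device("ssd-1", "ssd", 500), Device("ssd-2", "ssd", 800),
    Device("hdd-1", "hdd", 9000), Device("ssd-3", "ssd", 300),
]
# The metadata server would store this per-file mapping; changing the
# policy later just means recomputing it and migrating in the background.
print(place_file(devices, "ssd", 2))  # ['ssd-2', 'ssd-1']
```

Because the mapping lives in the metadata server rather than in a fixed hash, a policy change only requires recomputing placements and moving data in the background, which is what enables tiering and isolation.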

GlusterFS developer here. Sure, we have bugs just like anybody else. So do you. Nonetheless, just saying that GlusterFS replication "is not split-brain safe" is FUD. Neither is yours, not 100%. Likewise, "static assignment" is simply not true. We move files around to rebalance capacity, we have tiering, and users can explicitly place a file on a particular node (though the mechanisms for that are clunky and poorly documented). I've been fair and even complimentary toward XtreemFS in many blog posts and public appearances. I'd appreciate the same honesty and courtesy in return.

I am sorry that this came across this way. I did not intend to say bad things about GlusterFS but to carve out where the technical differences are in the big picture.

Also, I actually tried not to make any valuing statements. I am using the term split brain safety as a technical term, i.e. the P in CAP. My understanding is that GlusterFS does not have this in its system model, and you and the documentation seem to support this: "This prevents most cases of "split brain" which result from conflicting writes to different bricks."

Quobyte generally (and XtreemFS for file data only) does quorum replication based on Paxos, where split brain is part of the system model. They are CP, and hence data is not always available but is always consistent for reads if the quorum is there. Like Ceph.
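The CP behaviour described above comes down to simple arithmetic: a write commits only if a strict majority of replicas acknowledge it, so two disjoint network partitions can never both accept writes. This is a minimal sketch of that principle, not XtreemFS or Quobyte code:

```python
# Majority-quorum replication in miniature: with n replicas, only a
# partition holding a majority can make progress, so at most one side
# of a split brain ever commits writes. Illustrative sketch only.

def majority(n):
    """Smallest strict majority of n replicas."""
    return n // 2 + 1

def quorum_write(replicas_up, n):
    """Return True if enough replicas are reachable to commit a write."""
    return replicas_up >= majority(n)

n = 3
print(quorum_write(2, n))  # True: 2 of 3 is a majority, write commits
print(quorum_write(1, n))  # False: minority side of a split must refuse
```

This is exactly the availability trade-off of CP: the minority partition refuses writes rather than diverging, which is why reads against a quorum always see consistent data.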

I am sorry that I missed the progress on placement. It seems like I need to catch up on what happened after the volume types.

"Split brain safety" is not a commonly used term, and even if that weren't the case I'd say it's not a term that should be thrown around lightly. Also, using Paxos or Raft doesn't guarantee split-brain safety, as aphyr has proven over and over again with Jepsen. So what we have is two systems that take different approaches to quorum and split brain and all that. It seems a bit disingenuous to throw stones at the older open-source project while ignoring the potential for the exact same problems in the newer proprietary one.

FWIW, I do think the current Gluster approach to replication is not sufficiently resistant to split-brain in the all-important edge cases. That's why I've been working on a new approach, much more like Ceph and many other systems - though few of them use Paxos in the I/O path. That's wasteful. Other methods such as chain or splay replication are sufficient, with better performance, so they're more common.
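Chain replication, mentioned above as the cheaper alternative to running Paxos in the I/O path, can be sketched in a few lines. This is a bare-bones illustration of the idea (writes flow head to tail; reads hit the tail, which only holds fully replicated data), not any particular system's implementation:

```python
# Sketch of chain replication: a write propagates head -> ... -> tail
# and is considered committed when the tail applies it; reads go to the
# tail, which by construction only sees fully replicated data.

class ChainNode:
    def __init__(self, name):
        self.name = name
        self.store = {}
        self.next = None

def build_chain(names):
    """Link nodes into a chain; return (head, tail)."""
    nodes = [ChainNode(n) for n in names]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    return nodes[0], nodes[-1]

def write(head, key, value):
    node = head
    while node:                 # propagate down the chain in order
        node.store[key] = value
        node = node.next        # in a real system, the tail's ack commits

def read(tail, key):
    return tail.store.get(key)  # tail holds only committed writes

head, tail = build_chain(["a", "b", "c"])
write(head, "x", 42)
print(read(tail, "x"))  # 42
```

Compared to per-I/O consensus, each write crosses every replica exactly once in sequence, which is why chain (and splay) schemes tend to give better throughput for the same fault tolerance.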

I gather that XtreemFS and Quobyte can tolerate benign but not byzantine faults. Is there such a thing as a byzantine fault tolerant file system?

Since 1999:


Google "byzantine fault tolerant file systems" to find quite a few more.

Okay, but are there any in-production file systems implementing that (or any other BFT consensus protocol)?

As far as I'm aware, nobody wants to pay the price for that. Much like with high assurance systems in general: demos, prototypes, and so on are available with no uptake, due to the tradeoffs involved. Or maybe I.P. issues, too.

Yes, benign faults only -- no byzantine fault tolerance.

So, do you have any reasons for us to use XtreemFS vs its top competitors? What's the main benefits today that others don't have?

XtreemFS is BSD, what is Quobyte?

Quobyte is not open source.

This looks a lot like ceph. Can you give a short overview of the differences?

> requires no special hardware or kernel modules


If not already loaded, load the FUSE kernel module:

> modprobe fuse

Except that FUSE is pretty mainstream.

It was officially merged into the mainline Linux kernel tree in version 2.6.14. Anything before that is a special kernel module.

How does it compare to Ceph and Tahoe-LAFS?

But more importantly, how are you using any of those? :)
