
GlusterFS developer here. The OP is extremely misleading, so I'll try to set the record straight.

(1) Granted, snapshots (volume or file level) aren't implemented yet. OTOH, there are two projects for file-level snapshots that are far enough along to have patches in either the main review queue or the community forge. Volume-level snapshots are a little further behind. Unsurprisingly, snapshots in a distributed filesystem are hard, and we're determined to get them right before we foist some half-baked result on users and risk losing their data.

(2) The author seems very confused about the relationship between bricks (storage units) and servers used for mounting. The mount server is used once to fetch a configuration, then the client connects directly to the bricks. There is no need to specify all of the bricks on the mount command; one need only specify enough servers - two or three - to handle one being down at mount time. RRDNS can also help here.
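
To make that concrete, here's roughly what a native mount looks like (hostnames and volume name are made up for illustration); the server named on the command line is only contacted to fetch the volume configuration, after which the client talks to the bricks directly:

    # "server1" is only used to fetch the volfile; data then flows straight to the bricks.
    mount -t glusterfs server1:/myvolume /mnt/myvolume

    # A second volfile server can be named in case server1 is down at mount time
    # (option name as I recall it; check mount.glusterfs on your version).
    mount -t glusterfs -o backupvolfile-server=server2 server1:/myvolume /mnt/myvolume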

(3) Lack of support for login/password authentication. This has never been true of the I/O path; it only affects the CLI, which should only be run from the servers themselves (or similarly secure hosts) anyway, not from arbitrary hosts. Adding full SSL-based auth is already an accepted feature for GlusterFS 3.5 and some of the patches are already in progress. Other management interfaces already have stronger auth.
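
For the I/O path, per-volume access control is already there through the normal CLI; a minimal sketch (volume name and addresses made up):

    # Restrict which clients may access the volume (IP/hostname based today;
    # SSL-based auth is the 3.5 work mentioned above).
    gluster volume set myvolume auth.allow '192.168.10.*'
    gluster volume set myvolume auth.reject '10.0.0.100'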

(4) Volumes can be mounted R/W from many locations. This is actually a strength, since volumes are files. Unlike some alternatives, GlusterFS provides true multi-protocol access - not just different silos for different interfaces within the same infrastructure but the same data accessible via (deep breath) native protocol, NFS, SMB, Swift, Cinder, Hadoop FileSystem API, or raw C API. It's up to the cloud infrastructure (e.g. Nova) not to mount the same block-storage device from multiple locations, just as with every alternative.
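
As a small illustration of the multi-protocol point, the same volume can be reached over the native client and over NFS at the same time (names made up; the built-in NFS server speaks NFSv3):

    # Native (FUSE) mount
    mount -t glusterfs server1:/myvolume /mnt/native

    # The same data over the built-in NFSv3 server
    mount -t nfs -o vers=3,mountproto=tcp server1:/myvolume /mnt/nfs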

(5) What's even more damning than what the author says is what the author doesn't say. There are benefits to having full POSIX semantics so that hundreds of thousands of programs and scripts that don't speak other storage APIs can use the data. There are benefits to having the same data available through many protocols. There are benefits to having data that's shared at a granularity finer than whole-object GET and PUT, with familiar permissions and ACLs. There are benefits to having a system where any new feature - e.g. georeplication, erasure coding, deduplication - immediately becomes available across all access protocols. Every performance comparison I've seen vs. obvious alternatives has either favored GlusterFS or revealed cheating (e.g. buffering locally or throwing away O_SYNC) by the competitor. Or both. Of course, the OP has already made up his mind so he doesn't mention any of this.

It's perfectly fine that the author prefers something else. He mentions Ceph. I love Ceph. I also love XtreemFS, which hardly anybody seems to know about and that's a shame. We're all on the same side, promoting open-source horizontally scalable filesystems vs. worse alternatives - proprietary storage, non-scalable storage, storage that can't be mounted and used in familiar ways by normal users. When we've won that battle we can fight over the spoils. ;) The point is that even for a Cinder use case the author's preferences might not apply to anyone else, and they certainly don't apply to many of the more general use cases that all of these systems are designed to support.




1) It is great that snapshots are on their way. I am looking forward to using them. Still, you cannot benefit from them in Cinder right now.

2) I don't claim that all bricks must be specified in the mount command. I just point out that with, say, 4 bricks it is impossible to mount the volume by specifying only 2 servers if both of them are down, even though the other 2 still work.

3) As I wrote, it only concerns the CLI.

4) I totally agree with you. Mounting a volume from many locations is one of the advantages; it just is not supported by OpenStack. I don't blame GlusterFS for that.

5) My intention was not to describe GlusterFS's cool features but the current state of its integration with OpenStack (and a preview of the OpenStack Havana implementation).


(1) You can in some configurations. If you use qemu there's a block-device driver in qemu and another on the GlusterFS back end (as of 3.4), which both allow snapshots via methods external to us. I meant what I said about it being a hard problem. We're determined to deliver a general snapshot function. That's much harder than delivering snapshots that rely on an uncommon and/or unstable base technology, so it's taking a while.
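
For the qemu path that looks roughly like this (disk name and size made up; assumes a qemu built with the GlusterFS block driver talking to a 3.4 server via libgfapi):

    # Create an image directly on the volume via qemu's gluster:// driver
    qemu-img create -f qcow2 gluster://server1/myvolume/vmdisk.qcow2 10G

    # qcow2 internal snapshot, handled entirely by qemu rather than by GlusterFS
    qemu-img snapshot -c before-upgrade gluster://server1/myvolume/vmdisk.qcow2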

(2) Yes, if you want to survive N concurrent failures you need N+1 mount servers, and currently released code only supports N=1. However, http://review.gluster.org/#/c/5400/ has already been merged and will be available in the next release.
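
Once that's released, the mount will be able to list several fallback volfile servers, roughly like this (option name taken from the patch; the exact spelling may differ in the release):

    # Any of the listed servers can hand out the volfile if server1 is down at mount time
    mount -t glusterfs -o backup-volfile-servers=server2:server3 \
        server1:/myvolume /mnt/myvolume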

(3) IMO you should also have mentioned that the problem only manifests in a specific deprecated use of the CLI (from machines other than the servers). Nonetheless, this is a known deficiency which I've been personally pushing to fix.

(5) The current state includes many of these "cool features" (thanks!) without need for any specific OpenStack integration. That's kind of the point. Unlike some, we don't need to re-implement features for every access method or use case. 90% of that functionality would be available e.g. to CloudStack or OpenNebula today. IMO making a big deal of snapshots as a differentiator in one direction without mentioning myriad differentiators in the other doesn't leave people with the information they need to make progress toward their own decisions.


1) Nobody says it is easy. I keep my fingers crossed for this feature.

2) Exactly, I pointed out this ticket. When do you think it will be available, and in which version?

3) Unfortunately, I have not found any document describing how to set up authentication other than IP-based. I would like to set it both through OpenStack and by hand. Is that possible?

5) Of course it is great that most features do not require any specific OpenStack integration. I just wanted to describe those that do, since that mostly shows what still needs to be done in OpenStack.


(2) It'll automatically be in 3.5, around the end of this year. It will probably also be backported into the next 3.4.x, but I can't comment on schedules either for that or for the "downstream" Red Hat Storage releases.

(3) I don't know of any such instructions off the top of my head. Basically you'll have to edit /etc/glusterfs/glusterd.vol and add by hand the same auth options that are available for volumes via the CLI. I think there are (possibly hidden) CLI options that you'd need to specify username/password to work with that. If you try it and things aren't seeming fairly obvious, feel free to ping me via email (jdarcy@myemployer.com) or Freenode IRC.
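
Very rough sketch of what I mean (the exact option keys for glusterd.vol are an assumption on my part - they mirror the auth.login.* options that brick volfiles accept - so treat this as a starting point, not documentation):

    # Hand-edit /etc/glusterfs/glusterd.vol on each server so the management
    # volume carries login auth options, e.g. (keys are assumed, not verified):
    #
    #   volume management
    #       type mgmt/glusterd
    #       option working-directory /var/lib/glusterd
    #       option auth.login.admin.allow on
    #       option auth.login.admin.password s3cret
    #       ...
    #   end-volume
    #
    # then restart the management daemon so it rereads its volfile:
    service glusterd restart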


(3) I would definitely like to avoid modifying files by hand. I knew about this method, but it just does not work for me. I am automatically creating many volumes with the CLI, and modifying them in a configuration file is not convenient. I guess it is also not feasible for the Cinder devs to implement it reasonably well.
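
For context, my provisioning is just a loop over the CLI, roughly like this (hostnames and brick paths are made up):

    #!/bin/sh
    # Rough sketch of the kind of scripted provisioning I mean.
    for vol in vol01 vol02 vol03; do
        gluster volume create "$vol" replica 2 \
            server1:/bricks/"$vol" server2:/bricks/"$vol"
        gluster volume set "$vol" auth.allow '192.168.10.*'
        gluster volume start "$vol"
    done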


You're opposed to manipulating config files by hand, but you recommend Ceph over GlusterFS? Interesting.


Well, I am not opposed to manipulating files by hand. I just don't want to do that when I have a CLI for everything else. Keeping the content of a file the same on all servers is more toilsome than a simple command. Of course there are tools like Chef, but that is not the point.



