Ask HN: What distributed storage technology are you using?
71 points by gtirloni on Apr 21, 2017 | hide | past | favorite | 45 comments

I use (and contribute to) OpenStack Swift.

It's an object storage engine (think S3, but it's open source and you can put it in your own data center) that's excellent at storing unstructured data.

It's completely deployable and usable without any other OpenStack projects.

There's an S3 API compatibility layer for it. It supports globally distributed clusters and multiple storage policies that can be either replicated or erasure-coded. It's designed for very high availability, very high durability, and high aggregate throughput.

One of my favorite features is being able to create sharable, expiring signed URLs to any object in the cluster.
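For the curious, a Swift TempURL is just an HMAC-SHA1 over the method, expiry timestamp, and object path, appended as query parameters. A minimal Python sketch of the signing step (the account, container, object, and key below are made up):

```python
import hmac
import time
from hashlib import sha1

def temp_url(path, key, method="GET", ttl=3600):
    """Build a Swift TempURL query string for `path`.

    `path` is the full object path (e.g. /v1/AUTH_account/container/object)
    and `key` is the secret set as X-Account-Meta-Temp-URL-Key.
    """
    expires = int(time.time()) + ttl
    body = f"{method}\n{expires}\n{path}"
    sig = hmac.new(key.encode(), body.encode(), sha1).hexdigest()
    return f"{path}?temp_url_sig={sig}&temp_url_expires={expires}"

# Hypothetical object and key:
url = temp_url("/v1/AUTH_demo/photos/cat.jpg", "my-secret-key")
```

Anyone holding the resulting URL can fetch that one object until the expiry passes, with no other credentials.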

Some of the common uses for Swift include storing user-generated content (e.g. images, videos, game saves), static web assets, movies, scientific data sets, backups, document sharing, VM and container images, etc.

API docs: https://developer.openstack.org/api-ref/object-storage/

Docs: http://swift.openstack.org

Vagrant All-In-One setup: https://github.com/swiftstack/vagrant-swift-all-in-one

Come say hi! - #openstack-swift on freenode IRC (I'm notmyname)

For those who might not know and didn't look at his profile, notmyname "contributes to" Swift in the sense of being its technical lead.

Just a quick question, how good is the S3 compatibility layer? Can I switch, say, a Rails app to use Swift easily?

We had been using S3 and moved to Swift due to S3's high cost. Our integration took just a day. Most things went smoothly; as far as I remember, only some of the metadata properties were different, that's all. I can't recommend Swift highly enough.

Okay, I'm on board.

Where's your Dockerfile and "migrate from S3 to OpenStack Swift in 20 minutes or less" tutorial?

We're looking at minio/riak and now you as alternatives to S3.

Is there a recent-ish comparison with ceph's radosgw?

I haven't used it extensively, but I've read some of the source code and I'm excited about where Minio is going -- the erasure coding storage capability in particular: https://www.minio.io/

The question is quite open ended as to whether it means backup, or something else.

I use IPFS. IPFS is great for sharing multi-gigabyte files between machines in a cluster, BitTorrent-style. In my case it's a couple of hundred Amazon spot instances that can come and go very quickly and need to get the data ASAP to start some calculation, the same data for all nodes.

I really want to know what you're working on, given that the data can just live in the mist.

My background is particle physics and it is quite a common paradigm there. Many people, or indeed the same person, want to run many calculations over the same large dataset. The results clearly need to be posted somewhere and are generally much smaller than the input data, so this isn't a problem.

Uber uses IPFS, if I'm not mistaken.

I am super interested in more details on this. Can you do a blog post?

What are you looking for that you wouldn't get from looking at the IPFS website, or just googling in general?

The details of using ipfs in production? It's very early beta software and I don't know of anybody using it for real work.

The only difference from standard IPFS usage is keeping the data private. To do that I use a private subnet to block incoming connections, make sure the IPFS instances don't try to contact the public IPFS network, and run my own permanent bootstrap IPFS node.

I was actually considering tying it into consul somehow so the nodes could discover each other.

I think your interest makes sense, considering the GGGGP post claims to use IPFS in what seems like a useful setup without sharing the details.

I'm not currently using any but I've tried:

- Ceph: Very flexible. Supports many different kinds of replication. Has high overhead compared to local disk (on the order of ~50%) and was (for me) prone to hard-to-diagnose issues. Can be annoying to set up if you're not doing it on a supported Linux distro with ceph-deploy. It looks like BlueStore (a new on-disk format for data) will significantly improve performance, but BlueStore is extremely RAM-hungry.

- GlusterFS: Much faster than Ceph but less flexible. Has odd requirements about "bricks" being the same size. Much less RAM hungry than Ceph.

- A bunch of smaller ones I can't recall. Mostly discarded because they performed badly or lacked replication options (I really wanted erasure coding).

In the end I'm simply sharding my data manually. It's not as scalable but it's much more performant.
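For anyone curious, "sharding manually" can be as little as a stable hash over the key; a toy Python sketch, where the node names are made up:

```python
import hashlib

NODES = ["store-1", "store-2", "store-3"]  # hypothetical storage hosts

def node_for(key: str, nodes=NODES) -> str:
    """Map a key to a storage node via a stable hash.

    md5 is used only for its even distribution; any stable hash works.
    """
    digest = hashlib.md5(key.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]
```

The obvious trade-off: adding or removing a node remaps most keys, which is exactly what consistent-hashing schemes (like Swift's ring) are designed to avoid, so this only scales as far as a fixed node set.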

Take a look at MooseFS -- it's been several years since I eval'd all these options and MooseFS was the best -- it is a very sensible system.

edit - this is distributed block storage -- if you just need object storage, perhaps something else is in order.

I think MooseFS is file storage, not block, like Gluster.

I actually found it better than Gluster in terms of robustness and performance. It's got support for multiple masters, failover and a nice dashboard.

One of those projects that should be much better known than it is, given how few open-source distributed storage solutions there are. The MFS devs are good but maybe lack marketing savvy, or perhaps they're just happy where they are.

I also had a lot of issues with Gluster -- when it was broken, it was hard to know what was wrong. With MooseFS, the dashboard would usually give you a good idea, and the logs were pretty good at showing how it was replicating data after a node or disk failed.

From memory MooseFS was one of the ones that lacked erasure coding. The documentation also left a lot to be desired.

For what it's worth, same here. It's easier to shard files than it is to admin these systems.

We're using LinkedIn's Ambry ( https://github.com/linkedin/ambry ). It's an easy-to-use distributed object store, basically a self-hosted version of Amazon S3.

Not very popular compared to the likes of Swift and Ceph, but we've been using Cleversafe for media storage and it's been super solid and performant. It has adapters for S3, Swift, and even FTP and NFS (supposedly, according to some press releases I've seen from years ago). Outside of IBM, not sure who the heck is using it, honestly. It's a pity; it's got some pretty interesting technology under the hood for cost-effective replication and availability.

Right now I'm using a home-made system, which is a simple object store replicated across six nodes.


I think in production I've used NFS, DRBD, GlusterFS and OpenStack. Each has their pros and cons, and without a precise set of constraints it's hard to know how to usefully answer any question of the form "Which would you recommend? Why would you choose this?"

Distributed storage tends to be required either because you want redundancy and availability, or because your "stuff" is too large for a single box to host. But with a vague question it could mean "How do you back up boxes?" or something entirely different. (For example, "distributed storage" could end up mapping to a pair of MySQL hosts, or a replicated PSQL database.)

While the question doesn't really specify use case, I am surprised HDFS hasn't come up yet. I guess this is focusing a lot more on online object stores for web use versus data processing distributed storage.

For backups, I have two git-annex [1] repositories,

- Personal files, stuff I can't afford to lose (photos, documents, etc.): full archive on S3, full archive on a home server, 4 clients with partial copies.

- Big data stuff I can afford to lose (VM images, media files, etc.): around 6 TB, each file has two copies split between 5 hard drives on a home server and Hubic.

[1] https://git-annex.branchable.com/

How do you backup your git-annex repositories?

All clients have a copy of the git repo, plus a server on DigitalOcean.

I'm starting to use Sia (sia.tech) to back up some heavy files, starting with data I can afford to lose (even though I'm quite confident in the tool).

Thoughts on Gluster? I've used it for pet projects; really simple to get going initially.

I'm currently using Google Cloud Storage for storing and archiving data. Using regional storage has helped us while running production jobs which ingest this data. Once the data is processed, we move to coldline storage for archiving.

At Instamotor we use Elasticsearch, Redis, and S3 for files/photos. In my previous job (Nest), we looked at a decent number of options and ended up going with Cassandra.

Lustre (Intel Enterprise Lustre) over 10 Gbps networking.

Intel seems to have just made Enterprise Lustre free.

See https://www.nextplatform.com/2017/04/20/intel-shuts-lustre-f...

We use this extensively at the NRAO and mostly love it.

Have you thought about investing in InfiniBand? Even a generation or two old beats the pants off of 10GigE in price and performance. As long as you're on a LAN....

We are using Infiniband in Socorro. We just established 10Gb ethernet from Charlottesville to Socorro last week. The sites with instruments typically are a little ahead of the main office in terms of connectivity.

Ceph and ElasticSearch. Elastic is even running on Ceph for now. That's a bit yuk though; they should both live on physical machines.

Google Cloud Storage is working quite beautifully for me. S3 would be similar I believe.

Currently S3, Hadoop, DynamoDB... probably Elasticsearch again before too long.

We use Pithos (S3-compatible API, Cassandra backend) pretty successfully.


We're using Ceph exclusively.
