And yet Facebook appear to be successfully running a somewhat popular website while using NFS in production, which implies that the situation is somewhat less absolute than you claim.

Oh, you can definitely make money with a design from 2005 that uses NFS in production, 4 layers of caching, and enough servers that rebooting them constantly doesn't impact your service. I've worked at those companies. It was a bad idea then, it's a bad idea now.

That's not how it's done, and I know because I was in that group. Also, note that "production" at a company like Facebook includes a lot more than the direct web-content-serving path. I think I might agree that NFS doesn't make sense in that particular path, because it provides features and guarantees that nothing in that path needs and that means needless complexity, but there are plenty of other roles where NFS is a better fit than blob stores or copying files around.

And there are plenty of other solutions that don't have all the problems (and overhead) NFS has. There are precious few actually good reasons to use NFS, and none of them are customer-facing products: namely, when you want a rootless thin terminal on an embedded system, or read-only transfer of non-sensitive files (say, for bootstrapping a machine from bootp, or sharing a corpus of data).

For everything else (in production), NFS is a PITA, and most other networked filesystems should be avoided for similar reasons. Again: other solutions exist that provide the same functionality but do it without the inherent problems. Usually they are ignored because someone wanted to cut cost, or because legacy. Ask a storage engineer or infrastructure architect.

> Ask a storage engineer or infrastructure architect.

I've been both, and known hundreds. I'd love to hear about this solution you have that has the same functionality (your words) as a network/distributed filesystem, same compatibility, same semantic richness, same consistency, etc. without any of the tradeoffs. What are you selling?

Well, to start, virtually any networked filesystem is better. Even CIFS won't hose your system from network blips, but there's also GFS2, HDFS, GPFS, OCFS, VMFS, Ceph, Lustre, Gluster, Lizard, Orange, Hammer2, MapR, Xtreem.

If you don't want to roll your own OSS solution (and I wouldn't recommend it), most modern SANs have all those features and more, available via a variety of interconnects and protocols (including shared volumes). Real replication, real snapshots, real encryption, real RBAC, and not going over the same interface as random network traffic. Storage manufacturers even offer custom drivers for K8s or VMware so you can control the persistent storage for your cloud apps directly from your orchestrator's control plane, or manage it in your SAN.

> there's also GFS2, HDFS, GPFS, OCFS, VMFS, Ceph, Lustre, Gluster, Lizard, Orange, Hammer2, MapR, Xtreem.

Until quite recently I was a maintainer for one of those. I've worked on two others. One more isn't done, two more are unmaintained, two more will trash data in specific ways I have identified to their developers, and HDFS isn't even a real filesystem. Also a couple more are proprietary. We're nowhere near "same functionality" here.

> most modern SANs have all those features and more

A SAN is not a filesystem, so it fails the "equivalent functionality" test again. No, that's not a "No True Scotsman" fallacy because you set the goalpost and it hasn't moved. If it's not mountable, not shared at file granularity, or not writable at byte granularity, it's not equivalent to NFS. You're also mixing cluster, network, and distributed filesystems in a way that makes me doubt your ability to comment cogently on their strengths and weaknesses. I developed a SAN-based filesystem at EMC; it made sense at the time (no later than 2002!), but nowadays it's utterly insane to posit that as a serious alternative. But at least now I know what you're trying to sell, so thanks I guess.

What I was trying to sell was that NFS is buggy, but past that, that it's just bad design to require the "equivalent functionality" of NFS at all. Really old applications and systems with really old designs needed NFS years ago, but now we have much better and more robust solutions for whatever janky thing a particular app is trying to accomplish with NFS-like semantics (which is really just emulating local filesystems over a network, and there's no reason a backend app needs to emulate local filesystems over a network).

If your project is greenfield, it is absolutely insane to require NFS's functionality in your design. Using NFS outside of production is okay, because when it ends up sucking, it won't sap your engineering time or budget or restrain your ability to scale. (Until non-production is a giant performance testing lab, and then NFS's suckitude does indeed restrain your business.)

> Really old applications and systems with really old designs needed SANs years ago


Is there anything that's a good idea? Or is this just "everything has bugs and issues"?

Seems like the latter. I'd guess peterwwillis has had some bad experiences with NFS. I've had some bad experiences with some of the same so-called alternatives. Every time I see someone put a text blob into a database that needs to be in a file at point of use, I want to slap them. Ditto for everyone who tries to implement a hierarchical namespace on top of a flat blob store. Ditto for everyone who distributes a thousand copies of a rarely-changing file via scp or chef, then complains about update speed or inconsistency. NFS is not generally a good idea in high-volume, high-node-count traffic paths, but it can be a less-bad solution in many other situations, including many that are still part of production.
