EFS gives you a shared folder that a number of different workloads can mount at once. If you want a POSIX-compatible shared filesystem in the cloud, you're going to pay for it.
For example: I set up developer workspaces that can mount an EFS share on their Linux boxes, and anything they put in there is accessible from the Kubernetes Jobs they kick off and from their JupyterHub workspace.
I can either pay AWS to do it for me, or figure out how to get a 250k-IOPS GlusterFS cluster working across multiple AZs in a region. I think the math works out to roughly the same cost at the end of the day.
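For context on the workspace setup above: EFS speaks NFSv4, so mounting it from a plain Linux box is a one-liner. A sketch (the filesystem ID, region, and mount point are hypothetical; the options are the ones AWS documents for mounting EFS without their helper):

```shell
# Mount an EFS filesystem over plain NFSv4.1 (no amazon-efs-utils needed).
# fs-0123456789abcdef0 and us-east-1 are placeholders for your own share.
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 \
  -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport \
  fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /mnt/efs
```

The same share can then be exposed to pods via a Kubernetes volume, which is what makes the workspace/Jobs/JupyterHub triangle work.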
If you don’t need this level of durability, then a plain old local filesystem can work, too: XFS or ZFS or whatever, on a single machine, serving NFS, should handily outperform EFS.
(If you have a fast disk. Which, again, AWS makes unpleasant, but which real hardware has no trouble with.)
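A sketch of what "a single machine serving NFS" amounts to, assuming a Debian-style box with the kernel NFS server installed (the export path and subnet are made up):

```shell
# Export a local XFS/ZFS filesystem over NFS to one subnet.
# /tank/shared and 10.0.0.0/16 are placeholders for your own layout.
echo '/tank/shared 10.0.0.0/16(rw,async,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra                         # re-read /etc/exports
sudo systemctl enable --now nfs-server    # service name varies by distro
```

That's essentially the whole setup; the hard parts are the disks underneath it and the redundancy story, as the next comment points out.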
This depends on your use case: what types of storage you use, your familiarity with tuning systems, setting up RAID layouts, and so on.
I love ZFS. It's incredibly powerful. It's also incredibly easy to screw up when designing your drive layout, especially if you intend to grow your storage later. And that's before the effort needed to make your filesystem redundant across datacenters, or even just between racks in the same closet.
At the end of the day, if I screw up a setting on EFS, I can always create a new EFS filesystem and move my data over. If I screw up a ZFS layout, I'm going to need a box of temporary drives to shuffle data onto while I rebuild the array.
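To make the layout trap concrete, a sketch with hypothetical disk names (note that raidz expansion only landed recently, in OpenZFS 2.3; before that, a raidz vdev could not be grown in place):

```shell
# A four-disk raidz2 vdev: fine on day one.
sudo zpool create tank raidz2 sda sdb sdc sdd

# Years later, the tempting way to "grow" the pool bolts on a single-disk
# vdev with no redundancy -- zpool even makes you force it with -f -- and
# top-level vdevs can't be removed from a pool containing raidz, so the
# only way out is to rebuild onto that box of temporary drives.
sudo zpool add -f tank sde
```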
> At the end of the day, if I screw up a setting on EFS, I can always create a new EFS filesystem and move my data over. If I screw up a ZFS layout, I'm going to need a box of temporary drives to shuffle data onto while I rebuild the array.
True, but…
At EFS pricing, this seems like the wrong comparison. There’s no fundamental need to ever grow a local array to compete — buy an entirely new one instead. Heck, buy an entirely new server.
Admittedly, this means that the client architecture needs to support migration to a different storage backend. But, for a business where the price is at all relevant, using EFS for a single month will cost as much as that entire replacement server, and a replacement server comes with compute, too. And many more IOPS.
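To put a number on "a single month of EFS costs as much as the replacement server," a back-of-envelope sketch (the 20 TB size and the ~$0.30/GB-month EFS Standard rate are assumptions, not figures from the thread; check current pricing):

```shell
# One month of 20 TB on EFS Standard at an assumed $0.30/GB-month.
awk 'BEGIN {
  tb = 20; gb = tb * 1024; price = 0.30
  printf "EFS, one month of %d TB: $%.0f\n", tb, gb * price
}'
```

Roughly $6,000/month for storage alone, which is indeed in whole-server territory, before counting request and throughput charges.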
In any case, AWS is literally pitching using EFS for AI/ML. For that sort of use case, just replicate the data locally if you don’t have or need the absurdly fast networks that could actually be performant. Or use S3. I’m having trouble imagining any use case where EFS makes any sort of sense for this.
Keep in mind that the entire “pile” fits on ~$100 of NVMe SSD with better performance than EFS can possibly offer. Those fancy “10 trillion token” training sets fit in a single U.2 or EDSFF slot, on a device that speaks PCIe x4 and costs <$4000. Just replicate it and be done with it.
Buuttt... you're trying to compare apples (a rack in a DC) to oranges (an AWS-native solution that spans multiple DCs). And that's before you get into all the AWS bullshit that sucks, but it sucks more to do it yourself.
A rack in a DC isn't a solution that's useful to people who are already in AWS.
For example: I set up developer workspaces that can mount an EFS share on their Linux boxes, and anything they put in there is accessible from the Kubernetes Jobs they kick off and from their JupyterHub workspace.
I can either pay AWS to do it for me, or figure out how to get a 250k-IOPS GlusterFS cluster working across multiple AZs in a region. I think the math works out to roughly the same cost at the end of the day.