EFS gives you a shared folder that a number of different workloads can mount at once. If you want a POSIX-compatible shared filesystem in the cloud, you're going to pay for it.
For example: I set up developer workspaces that can mount an EFS share on their Linux boxes, and anything they put in there is accessible from the Kubernetes Jobs they kick off and from their JupyterHub workspace.
I can either pay AWS to do it for me, or figure out how to get a 250k-IOPS GlusterFS cluster working across multiple AZs in a region. I think the math works out to roughly the same cost at the end of the day.
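For context on the workspace setup above: EFS speaks NFSv4, so mounting it from a plain Linux box is a one-liner. A sketch (the filesystem ID, region, and mount point are hypothetical; the options are the ones AWS documents for mounting EFS without their helper):

```shell
# Mount an EFS filesystem over plain NFSv4.1 (no amazon-efs-utils needed).
# fs-0123456789abcdef0 and us-east-1 are placeholders for your own share.
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 \
  -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport \
  fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /mnt/efs
```

The same share can then be exposed to pods via a Kubernetes volume, which is what makes the workspace/Jobs/JupyterHub triangle work.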
If you don’t need this level of durability, then a plain old local filesystem can work, too: XFS or ZFS or whatever, on a single machine, serving NFS, should handily outperform EFS.
(If you have a fast disk. Which, again, AWS makes unpleasant, but which real hardware has no trouble with.)
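A sketch of what "a single machine serving NFS" amounts to, assuming a Debian-style box with the kernel NFS server installed (the export path and subnet are made up):

```shell
# Export a local XFS/ZFS filesystem over NFS to one subnet.
# /tank/shared and 10.0.0.0/16 are placeholders for your own layout.
echo '/tank/shared 10.0.0.0/16(rw,async,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra                         # re-read /etc/exports
sudo systemctl enable --now nfs-server    # service name varies by distro
```

That's essentially the whole setup; the hard parts are the disks underneath it and the redundancy story, as the next comment points out.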
This depends on your use case: what types of storage you use, your familiarity with tuning systems, setting up RAID layouts, and so on.
I love ZFS. It's incredibly powerful. It's also incredibly easy to screw up when designing your drive layout, especially if you intend to grow your storage later. And that's before the effort needed to make your filesystem redundant across datacenters, or even just between racks in the same closet.
At the end of the day, if I screw up a setting on EFS, I can always create a new EFS filesystem and move my data over. If I screw up a ZFS layout, I'm going to need a box of temporary drives to shuffle data onto while I rebuild the array.
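To make the layout trap concrete, a sketch with hypothetical disk names (note that raidz expansion only landed recently, in OpenZFS 2.3; before that, a raidz vdev could not be grown in place):

```shell
# A four-disk raidz2 vdev: fine on day one.
sudo zpool create tank raidz2 sda sdb sdc sdd

# Years later, the tempting way to "grow" the pool bolts on a single-disk
# vdev with no redundancy -- zpool even makes you force it with -f -- and
# top-level vdevs can't be removed from a pool containing raidz, so the
# only way out is to rebuild onto that box of temporary drives.
sudo zpool add -f tank sde
```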
> At the end of the day, if I screw up a setting on EFS, I can always create a new EFS filesystem and move my data over. If I screw up a ZFS layout, I'm going to need a box of temporary drives to shuffle data onto while I rebuild the array.
True, but…
At EFS pricing, this seems like the wrong comparison. There’s no fundamental need to ever grow a local array to compete — buy an entirely new one instead. Heck, buy an entirely new server.
Admittedly, this means that the client architecture needs to support migration to a different storage backend. But, for a business where the price is at all relevant, using EFS for a single month will cost as much as that entire replacement server, and a replacement server comes with compute, too. And many more IOPS.
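To put a number on "a single month of EFS costs as much as the replacement server," a back-of-envelope sketch (the 20 TB size and the ~$0.30/GB-month EFS Standard rate are assumptions, not figures from the thread; check current pricing):

```shell
# One month of 20 TB on EFS Standard at an assumed $0.30/GB-month.
awk 'BEGIN {
  tb = 20; gb = tb * 1024; price = 0.30
  printf "EFS, one month of %d TB: $%.0f\n", tb, gb * price
}'
```

Roughly $6,000/month for storage alone, which is indeed in whole-server territory, before counting request and throughput charges.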
In any case, AWS is literally pitching using EFS for AI/ML. For that sort of use case, just replicate the data locally if you don’t have or need the absurdly fast networks that could actually be performant. Or use S3. I’m having trouble imagining any use case where EFS makes any sort of sense for this.
Keep in mind that the entire “pile” fits on ~$100 of NVMe SSD with better performance than EFS can possibly offer. Those fancy “10 trillion token” training sets fit in a single U.2 or EDSFF slot, on a device that speaks PCIe x4 and costs <$4000. Just replicate it and be done with it.
Buuttt... you're trying to compare apples (a rack in a DC) to oranges (an AWS-native solution that spans multiple DCs). And that's before you get into all the AWS bullshit that sucks, but it sucks more to do it yourself.
A rack in a DC isn't a solution that's useful to people who are already in AWS.
For example: I set up developer workspaces that can mount an EFS share on their Linux boxes, and anything they put in there is accessible from the Kubernetes Jobs they kick off and from their JupyterHub workspace.
I can either pay AWS to do it for me, or figure out how to get a 250k-IOPS GlusterFS cluster working across multiple AZs in a region. I think the math works out to roughly the same cost at the end of the day.