I wish AWS would be a little more technical in their product descriptions and announcements. If there was ever an audience that wants technical data about how this type of technology scales and how it compares to existing technology like GlusterFS, it would be AWS users. Instead, the performance specification is ".. and is designed to provide the throughput, IOPS, and low latency needed for a broad range of workloads." We know that NFS to a lot of nodes is hard; show us that this scales better, or am I the testing team?
Fair enough, but one of the biggest reasons I ignore these types of posts is that too often they lack these critical details; they come off as puff pieces to me. IMO these posts would hold value for a larger audience if that info was included in them.
The trick is, based on the title and other marketing tricks, one doesn't find out it's an actual puff piece until one has already spent time reading at least a portion of the puff. By then, you just feel used.
There's a business side to it. Besides recruiting future customers via a preview, you also subliminally put off would-be customers' decisions until your solution is ready to compete with existing solutions. Microsoft did it very successfully for years.
I am a huge AWS fan and user. Look at my accounts for the sums I have spent over the years. I love new AWS services. But this is my main complaint with AWS -- don't announce until you have usable numbers and workable examples for early adopters!
I can understand that writing up technical details can be difficult as things are constantly improving and also as AWS has to be careful about what they claim. Perhaps a reasonable compromise would be to team up with some 3rd parties to let them try it, optimize a bit with techs from AWS and write up their experiences. Similar to how Apple does the marketing for the iPhone but then lets other people review it. Not perfect but perhaps something that would help the community understand and deploy these technologies better.
Yea, and when is that? If a product is announced, even if it's a preview version, shouldn't the details be sorted out by then? This is a 10,000 ft view of a product, a great product at that. But numbers to back the claims of scalability and performance would have been best.
As a developer (but not a sysadmin) I feel the opposite. Not saying your criticism is wrong, just giving another view from an AWS newbie perspective.
When I read most AWS product descriptions, I cannot understand what real world situations they are for. I either could not say, or could only vaguely say, what the product is in my own words. There seems to be a great scaffolding of assumed knowledge about the AWS system and distributed computing. Maybe that's intended, but it's intimidating and keeps me in simpler places like Heroku land.
I feel similarly about AWS and I've been using it for a while. AWS is very powerful and flexible but the number of options and different services can be bewildering. I've suggested AWS to several developers I know that have never used it before and they were turned off from not being able to understand what was being offered and how much it would cost them.
For example, compare the Heroku pricing page and the AWS pricing page for EC2:
I think this just illustrates your point, but EC2 and Heroku Dynos are two different types of services (IAAS vs PAAS). The appropriate comparisons are AWS Elastic Beanstalk to Heroku Dynos and AWS RDS to Heroku Postgres.
Yes, a link to Elastic Beanstalk would be a fairer comparison of what is being offered by Heroku but my point was most AWS pages are like that and aren't easy to follow, especially if you're new to AWS.
The "you" in the comparison changes as well, though. Indeed, the copy for (say) EC2 isn't comprehensible to the same people that the copy for Elastic Beanstock is. But different people will be reading the copy for those two things. The person AWS is selling EC2 to isn't a newbie developer wanting to run their app; it's someone wanting to build out something like Heroku: someone with lots of system-architecture experience.
Or, to put it more simply: the intended audience for AWS ad copy is exactly the set of people who are currently trying to compare some specific AWS service to similar services of competitors. AWS is never trying to "create a need" for something nobody was already asking for; they're just trying to serve the needs people already do have better/cheaper/more flexibly.
I completely agree that the AWS descriptions and documentation are written in a style where actual best practices are left as an exercise for the reader. Without much nurturing, the community is left to develop solutions like boto-rsync for moving files into S3 and GlusterFS to overcome the 1 TB limit on drives. I'm not sure if that is intentional or a side effect of something else, but it certainly turns off a lot of newbies, for whom it seems overwhelming, and a lot of technical folks, as it is difficult to get a grasp on exactly what to use and how in AWS.
Heroku uses EC2 under the hood. Heroku is heresy for any competent programmer. Despite AWS being packaged services, at least both managed DB and EC2 show you your actual servers and let you choose specs.
Until you see technical details on any file, network or storage system you should always assume your writes (or data for that matter) are/is not safe. It's disappointing to see a product launched and hyped without the appropriate details required to make an informed decision around its use.
I think that you are correct, that some technical details are definitely needed for comparison... To me, this matches up against Azure Files (which is CIFS/SAMBA based)...
Though, even if redundancy isn't a factor, it's nice that you can have network file shares without having to run your own dedicated instance in a given cloud. There are plenty of situations where having a common networked filesystem makes sense across a few servers for purposes of sharing some information, without it being physically on all of them... seeding static content, or user-uploaded files for example. It makes a given solution simpler to implement initially (though other considerations may take hold as a site/application grows).
Typically Azure seems to perform better in terms of storage I/O over AWS. Though, if you are disk constrained in ways you can't reasonably scale horizontally, then you may be better off with something using local disks, and your own backup strategy, on something cheaper (Linode, DigitalOcean, Joyent, etc).
This is really a powerful product, and it shows the wisdom and work in Amazon's product/market research.
Outside of the world of startups and young companies who "grew up" in the world of cloud-based solutions, there is a large ecosystem of more traditional enterprises who still have a lot on-premise computing.
These companies have a lot of lock in: Racks of physical on-site servers, Sharepoint-based access control, custom hardware and clusters for everything from large file storage to compliance metadata, and custom software built around this infrastructure.
Of those, one of the biggest lock-in dependencies I've seen is NFS. Not just because NFS is one of the oldest protocols, but because of the nature of the NFS abstraction. Fundamentally, software that assumes a filesystem is shared, globally mounted, and read/write is very hard to adapt to a cloud solution. Many times, it requires re-writing the software, or coming up with an NFS shim (such as a FUSE solution) that is so underperforming it blocks usage.
If AWS implements this correctly, this could provide the cost/performance balance to potentially move such a solution completely to AWS. This would eliminate not just large amounts of physical overhead for these companies, but the productivity costs that come with the downtime that inevitably occurs when you don't have good redundancy.
These companies (and the industries they comprise) are trying to find out how best to leverage Amazon. Recently, even more conservative industries, such as Law, are becoming more aware of AWS and other cloud-based solutions. Let's hope that solutions like this, that bridge the old with the new, can empower that transition so we can all feel better about how our software is managed.
As a GlusterFS developer, and furthermore the founder of a project to create a "cloud" version of GlusterFS aimed at exactly this use case, this is pretty darn interesting to me. I guess I'm supposed to pick away at all the feature differences between EFS and GlusterFS-on-EC2, or something like that, but for now I'm more pleased to see that this use case is finally being addressed and the solution seems well integrated with other Amazon features. Kudos to the EFS team.
I'd love to see an easy to manage version for containerized cloud stacks like Kubernetes and Mesos. I think the AWS move here validates that NFS is still an ok pattern to use for some applications.
I've had mixed experiences with Gluster a year ago, including lost files, so something that was rock solid and easy to manage would be a great product.
Yeah, that's something the Ceph and Gluster teams have been working on to integrate with Kubernetes seamlessly (or at least, easily). The Gluster core drivers for volumes landed recently, and self-service, on-demand FS provisioning is a goal.
But this is EBS plus a redundant NFS-mountable server, which costs at least 2 EC2 instances as well as 2x the EBS cost; that makes the DIY equivalent ~$100 per month (m3.mediums) plus $0.20/GB/month for SSDs.
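Back-of-the-envelope check on that figure, assuming the m3.medium on-demand rate at the time was roughly $0.07/hr (treat the exact rate as an assumption):

    echo "2 * 0.07 * 730" | bc   # two instances, ~730 hours in a month => ~102 USD

That's before the doubled EBS cost, so the ~$100/month floor for the instances alone sounds about right.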
> - (Not mentioned) Both Linux and Windows have built-in NFS clients.
Does anyone use NFS for Windows? Is it reliable enough that I can move servers away from CIFS altogether if I want to access a Linux box from Windows? For instance, if I click a folder's Properties in Windows, will I see the proper Unix perms and metadata and be able to edit them?
FYI: In Windows (desktop versions) it's now only in the "Enterprise" edition, and apparently will be removed in Windows 10. Not sure if there are limitations on which server versions are currently supported.
CIFS/Samba in Linux has been pretty good, IMHO, for the past couple of years... though I still get some occasional wonky behavior from my NAS box.
We used to do it over 10Base-T, which is slower than a lot of residential connections now. Well, latency will be higher over the internet, but bandwidth will be ok.
That's the whole point. Since NFS is such a round-trip oriented protocol, it's very sensitive to latency. You'll never fill up the bandwidth unless you do large bulk transfers in parallel.
EFS is a great addition to AWS. We have SAN as a service via EBS, now we get NFS as a service. Great.
The question (for me) now becomes "where do we go from here?"
Infinite NFS is great, but what I've always wanted is infinite EBS that is fully integrated from file system to SAN. In other words, something that behaves like a local file system (without the gotchas of NFS like a lack of delete on close), but I don't have to snapshot and create new volumes and issue file system expansion commands to grow a volume. I want seamless and automatic growth.
Furthermore, there's so much local SSD just sitting around when using EBS. I want to make full use of local SSD inside of an EC2 instance to do write-back or write-through caching. I could do this in software, but maybe there's an abstraction begging to be made at the service level.
Throw in things like snapshots, and this would make for a fairly powerful solution, and it would certainly remove a lot of operational concerns around growing database nodes and such.
Don't get me wrong, you can pull together a few things and write some automation to do this today. You could use LVM to stitch together many EBS volumes, add in caching middleware (dm-cache, flashcache, etc.), and then automate the addition of volumes and file system growth. However, it's clunky, and there's an opportunity to make this much easier.
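For the curious, a rough sketch of that DIY stack (device names, volume group name, and mount point are made up; the caching middleware is left out):

    # stitch two attached EBS volumes into one logical volume
    pvcreate /dev/xvdf /dev/xvdg
    vgcreate data_vg /dev/xvdf /dev/xvdg
    lvcreate -l 100%FREE -n data_lv data_vg
    mkfs.xfs /dev/data_vg/data_lv
    mount /dev/data_vg/data_lv /data

    # later, to grow: attach another EBS volume, then
    pvcreate /dev/xvdh
    vgextend data_vg /dev/xvdh
    lvextend -l +100%FREE /dev/data_vg/data_lv
    xfs_growfs /data    # grow the live filesystem into the new space

It works, but every one of those steps is something you have to automate and monitor yourself.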
I recognize that what I'm describing doesn't serve the same purpose as NFS - for example, EBS isn't mountable in multiple locations at once - but I'd really like to see the "seamless infinite storage" idea applied to EBS.
> something that behaves like a local file system (without the gotchas of NFS like a lack of delete on close), but I don't have to snapshot and create new volumes and issue file system expansion commands to grow a volume. I want seamless and automatic growth.
Wouldn't EBS w/ thin provisioning get you most of this? Just create a massive volume, and you get billed for the space actually used. (and the volume size could also function as a limit on your bill.)
Yes, good point. This would provide effectively "infinite" backing storage. There might be some hurdles to overcome, though. For example, when you delete a file, will EBS know that the blocks are now free and thus can be decommissioned? This might mean the whole stack needs to support things like TRIM. I'm not sure the rest of the stack is smart enough yet. I'd love to hear from a storage/FS expert on this.
Edit: coincidentally, I just saw this article about XFS which observes the following:
"Over the next five or more years, XFS needs to have better integration with the block devices it sits on top of. Information needs to pass back and forth between XFS and the block device, [says Dave Chinner]. That will allow better support of thin provisioning."
https://lwn.net/Articles/638546/
The reason I suggested it is because thin volumes are well understood, so the issues are straightforward... and it's been implemented in many products (lvm; virtually every san; xenserver & vmware; etc). So there really shouldn't be many surprises if amazon were to implement it.
And yes, trim is used to mark blocks as free.
Honestly, it's so widespread, I would be surprised if Amazon weren't already using it to over-commit ebs.
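For anyone unfamiliar, the client side of that is pretty simple (device and mount point are made up):

    # either mount with online discard...
    mount -o discard /dev/data_vg/data_lv /data
    # ...or run a periodic trim so the backing store learns which blocks are free
    fstrim -v /data

Whether the layer underneath actually honors those discards is the part we can't see.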
That would be awesome. A huge engineering effort - I imagine that would require building a radically different filesystem more or less from scratch - but AWS has the resources to do that sort of thing.
I think this is a great product, but I've been actively avoiding NFS for a while now. It's a real shame that there isn't a better network FS protocol standardized already - NFS is complex, is usually a single point of failure (would be interesting to know if EFS isn't...) and comes with a whole set of cruft.
I understand why they did it, though. It's either this (everyone supports NFS, even Windows) or get crucified for vendor lock-in. Just like Github did yesterday (https://news.ycombinator.com/item?id=9343021), even when they released their product as an open standard with an open source reference implementation.
There's a world of difference between what Github did, and the hypothetical where AWS chose a proprietary protocol to access a filesystem. For AWS, as long as the underlying filesystem was "more or less POSIX", the access mechanism is largely irrelevant to lock-in; it would be as easy to switch from AWS as it would be to move between filesystems.
Git was not designed for large files. But what github released yesterday primarily serves to promote github's central-server model for git. Moreover, it seems that it could have been better done within the git protocol itself (modify git to do more sparse pulls, and then try to fetch on a checkout when it is missing blobs, rather than erroring immediately).
I suspect AWS chose to use NFS for expedience, the net effect is positive, but I don't think it would have much mattered anyway.
Github is trying to inject their own server-model into the git protocol, with an extension that is only half thought through; that is a huge step backwards, open-source or not.
Well, indeed, I'm not sure I could offer a different choice I would have preferred - so the criticism definitely isn't constructive in that sense. It does seem, though, that the world is missing a sane cross-platform network filesystem. If there had even been some FUSE-based system that used a more robust protocol, I'd probably prefer that, although if they can more or less guarantee that the NFS service won't go down (easily) then I suppose most of my qualms would be put aside.
Can you do a simple "performance" test?

    time sh -c "dd if=/dev/zero of=testfile bs=8k count=11000"; rm testfile
Also do the read test on a really big file: first from your host, so the file gets cached, then from within the qemu VM where the virtfs Plan 9 share is mounted:

    dd if=bigfile of=/dev/null
Please, I want to confirm it's not only my machines suffering from really totally shitty read/write performance through virtfs.
From my own benchmark, on kernel Linux 3.19.3.201503270049-1-grsec with qemu 2.2.1, 9p was 500% slower on read and immensely slower on write: I gave up after 9 minutes on a write test that took 29s on the host.
This is what should have existed instead of EBS all along.
I'll never consider this to be as reliable as S3, but if I'm going to have a network filesystem I'd rather be dealing with NFS as my abstraction instead of virtualized network block devices.
I believe EBS was the right building block at the right time, and I'll still use it over EFS in the majority of my deployment designs.
I still like to "trifurcate" the storage into objects, local disposable, and local volumes. Having durable local volumes still makes sense in a lot of scenarios.
AWS is lean so they build what's easy for them to build, not what should exist. You can gauge how hard a feature is by how long it took them to implement it.
Not sure why NFS gets such a bad rap. On a low-latency network, properly tuned NFS has very few, if any, performance issues.
I've personally seen read/write rates exceeding 800MByte/sec on more or less white-box hardware, at which point it was limited by the underlying storage infrastructure (8Gbit fiber), not the NFS protocols.
Dell has a 2013 white paper (I'm not affiliated with them, FWIW) about their fairly white-box setup that achieved >100,000 IOPS, 2.5 GByte/s sequential write, and 3.5 GByte/s sequential read:
http://en.community.dell.com/techcenter/high-performance-com...
Not sure how it would ever be technically possible for a networked filesystem to get anywhere near directly attached storage.
But, for sure, the typical carrier-grade EMC or NetApp is MUCH slower than a good SAN. I'm talking about petabytes of very small (average maybe 20kB) files with lots of _random_ sync writes and reads. NFS has a lot of other benefits, but it surely is not super high performance in every use case. Regardless of what a theoretical marketing whitepaper has shown in some lab setup.
Someone who thinks that you can put a network protocol around a filesystem without _any_ performance impact is nuts.
BUT if your usecase fits NFS you might as well get very good performance out of it. As always, pick the right technology for your specific case.
Well, how would that help in terms of NFS? You'd still have to tell NFS to read 20kB. Whether it's 20kB from one big file or one 20kB file doesn't matter much. It's common to have one file per email, and the usual filesystem has no problem with that.
My only test case has been a VMware virtual machine, mounting an NFS share from the host so I could work on my local filesystem and execute within the VM. Switched to a filesystem watcher + rsync combo after struggling with poor random read performance. Maybe it was due to bad configuration, but I always thought it would be a poor choice for anything serious.
That very much depends on what you're using and how you tune things. With NFSv4.1, you can use parallel nfs, which is essentially striping reads and writes over multiple nfs servers.
Depending on how you tune it, it can be a monster. Several years ago I was managing a cluster with ~5K linux instances all mounted to ~4PB of spinning disk served with NFS. Worked very well.
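On the client side it's just a mount option; if the server hands out pNFS layouts, the striping happens transparently (server name and paths are made up):

    sudo mount -t nfs -o vers=4.1,proto=tcp bignas:/export/scratch /mnt/scratch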
Unlike typical local Unix file systems, NFS does not support "delete on last close" semantics.
Ordinarily, even if you unlink a file, the operating system keeps the inode around until the last filehandle referencing it goes away. But an NFS mount cannot know when all filehandles on all networked systems have closed. When you attempt to read from an NFS file handle whose underlying file has been deleted out from under you, BOOM -- `ESTALE`.
The solution is typically to guard against file deletion using read locks... which are extremely annoying to implement on NFS because of portability issues and cache coherency problems.
I'm not sure I'd describe that as a "scaling problem" per se, because it gets bad quickly and stays bad. It's more of a severe limitation on how applications and libraries can design their interaction with the file system.
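A minimal illustration of that failure mode, assuming /mnt/nfs is the same NFS export mounted on two clients (paths are made up):

    # on client A: hold an open descriptor on a file
    exec 3< /mnt/nfs/shared.dat

    # on client B: delete the file out from under client A
    rm /mnt/nfs/shared.dat

    # back on client A: reads through the now-stale handle may fail with ESTALE
    cat <&3

On a local filesystem that last read would happily keep working against the unlinked inode.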
It very much depends on your workload, particularly with NFSv3 and earlier. We were able to reliably handle multiple gigabit streams no later than 2005 but that was writing to huge files (backing up a ~2-3Gbps data acquisition system being processed by 4 Mac or Linux clients).
Small files were much worse because they require a server round-trip every time something calls stat() unless you know that all of the software in use reliably uses Maildir-style practices to avoid contention. That meant that e.g. /var/mail could be mounted with the various attribute-cache values (see acregmin / acdirmin in http://linux.die.net/man/5/nfs) but general purpose volumes had to be safe and slow.
If you read through the somewhat ponderous NFSv4 docs, there are a number of design decisions which are clearly aimed at making that use-case less painful. I haven't done benchmarks in years but I'd assume it's improved significantly.
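For reference, that tuning is just mount options (values and export are made up; actimeo is shorthand for setting all four attribute-cache timers at once):

    mount -t nfs -o acregmin=30,acregmax=120,acdirmin=30,acdirmax=120 mailhost:/export/mail /var/mail
    # or
    mount -t nfs -o actimeo=60 mailhost:/export/mail /var/mail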
You run PostgreSQL or any other DB server with its DB data dir on the NFS mount.
Oracle supports this- and they even wrote a user-space NFS client to "get the highest level of performance" (because they thought the kernel NFS implementation sucked).
The important bit is to ensure the NFS client and server implementation handle whatever POSIX features are required by the DB server.
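As a sketch, the mount options end up looking something like what Oracle's NFS guidance has recommended over the years; check your DB vendor's docs for the exact set (server and paths are made up):

    mount -t nfs -o rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768,actimeo=0,timeo=600 \
        nfs-server:/exports/oradata /u02/oradata

The hard and actimeo=0 options are the usual correctness-oriented choices, and they cost you performance.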
Why would you want to? You can't share across two instances at the same time anyway, it's going to be slower/more edge case-y, and the cost with Amazon is higher?
On the one hand, I've been wanting something like this for a while. On the other hand, I have so many bad memories of problems caused by NFS in production from the 90s that I'm leery.
At least the Linux NFS client made some big improvements in stability by the mid-2000s. FreeBSD took longer but I've heard they've fixed the kernel deadlocks as well.
The other interesting note is that they apparently only support NFSv4, which has some welcome improvements: it uses TCP over a single port, avoids the entire portmap/statd/lockd train-wreck, UTF-8 everywhere, etc. One of the more interesting ones is that it has referrals (think HTTP redirect) so one server doesn't have to handle every request from every client and you can do things like load-balancing. Clients are also assumed to be smarter so e.g. local caching is safe because the server can notify you about changes to open files rather than requiring the client to poll.
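The single-port bit alone is a big operational win; the firewall rule collapses to something like:

    # NFSv4: one well-known TCP port, no portmap/statd/lockd to track
    iptables -A INPUT -p tcp --dport 2049 -j ACCEPT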
I'd be surprised if that didn't factor into the equation, although even just regular referrals would be acceptable for many uses where you can do some degree of load-balancing / fail-over with reasonable recovery times.
Not necessarily. IIRC, pNFS is an optional feature of NFSv4.1, so they might not have implemented that flavor. If they had, I'm pretty sure they'd be advertising it. They're not shy.
Technical details aren't out yet. They may not have implemented it, but given their comments about performance and scaling, it would be a very good reason not to support NFSv3.
Although skeptical about it going into the project, I've deployed a virtualized oracle rac environment over 10G NFS and with some tuning it was stable and performant. If it is good enough for rac, which has some of the most stringent latency / performance requirements that I've seen, I'd say that it is probably good enough for quite a few production use cases, although to be fair this was only an 8 node cluster.
Some naive questions if you don't mind... (I'm really curious about RAC .. my only DB experience has been small MySQL and MS SQL Server clusters).
1) I thought the really high end DBs like to manage their own block storage. Your NFS comment suggests that the database data files were running on an NFS mount, and you had a 10 gig Ethernet connection to the file server.
2) What would you say is the average size of a RAC cluster in (your opinion)? Is 8 considered a small cluster in this realm?
3) DBs have stringent requirements when it comes to operations like sync. Can you actually get ACID in an NFS backed DB?
Does anyone else cringe when someone suggests using XYZ in production?
I can't be the only one that has been woken up at 2am because of an XYZ outage.
XYZ could be NFS, SCSI, MySQL, Rails, KVM, ..., you get the idea. Any technology that has seen wide use has caused someone to be woken up at 2am because of an outage. NFS has been very widely used for a very long time. As a distributed file system developer who once helped design a precursor of pNFS, I think NFS has some pretty fundamental problems, but the fact that NFS servers sometimes go down is not one of them. Often that's more to do with the implementation and/or deployment than the protocol, and no functionally similar protocol would do much better under similar circumstances.

People get woken up at 2am because of SMB failures too. My brother used to get woken up at 2am because of RFS failures. Nobody gets woken up at 2am because of 9p failures, but if 9p ever grew up enough to be deployed in environments with people on call, I'm sure they'd lose sleep too. EBS failures have bitten more than a few people.
Citing the existence of failures, other than proportionally to usage, isn't very convincing. I'd actually be more concerned about the technology on the back end of EFS, not the protocol used on the front.
I can't say I have any problems with NFS - we use it for shared storage on some pretty busy servers without any issue. I'm not saying they don't happen - just that we don't experience them. I'd be interested to hear the problems you've encountered - did you submit bug reports for them that you could perhaps link to?
On a previous workplace, we had a pretty beefy VMware set up backed by NFS. Performance was excellent, and file level access offers a lot of functionality you can't have with for example iSCSI.
That sounds like an issue with the implementation, not the protocol. There are countless large environments I know of running NFS in production on NetApp without any issues at all.
As has been mentioned, EBS can only be mounted to one instance at a time. If you could mount it to multiple it would effectively be the same thing, but then you have all sorts of write-locking issues.
As the rest of the comments also allude, a lot of the cloud-entrenched world has abandoned NFS, or at least in AWS circles.
I'm not one of these people. Rather than relying solely on puppet->all instances to handle multiple deploys, the convenience of an NFS instance was appealing, so essentially I have a relatively small (read: medium) instance that does nothing but manage the deployment filesystems and allow new in-security-group instances to connect. I have often thought of abandoning this for doing deploys in a more "modern" way, but I'm still not sure what the benefit would be other than eliminating a minor source of shame.
The analog here is running a MySQL or Postgres database in an instance prior to RDS. RDS provided enough benefit that the minor price difference in rolling your own no longer factored in. A more reliable, fault-tolerant and extensible file system is, like RDS, a huge upgrade. It may not be for everyone, but for some of us it's just another reason why AWS keeps making it hard to even look anywhere else.
As simonw points out, it produces a real problem with regard to MySQL and the way it handles its data source(s) in general, but more importantly, it's wholly unnecessary.
The right way to do that (in AWS world) would be RDS accessible to all instances in the security group. This yields locking control to the application level and not the file system. For obvious reasons this makes a lot of sense.
There are of course non-ACID / NoSQL solutions for which this might be an acceptable practice, but in general I'd say it's fraught with peril.
This idea got shot down pretty quick, but I love that it was asked. Coming up with potentially disastrous sideways use-cases for new toys is a favorite pastime of mine.
You have to create a snapshot of the EBS volume and create a new (larger) volume from said snapshot to grow it. EFS, yep, resizes automatically.
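For anyone who hasn't done it, the manual EBS version goes roughly like this (all IDs, sizes, and device names are made up):

    aws ec2 create-snapshot --volume-id vol-1a2b3c4d
    aws ec2 create-volume --snapshot-id snap-9f8e7d6c --size 500 --availability-zone us-east-1a
    # detach the old volume, attach the new one in its place...
    aws ec2 attach-volume --volume-id vol-5e6f7a8b --instance-id i-0a1b2c3d --device /dev/xvdf
    # ...then grow the filesystem into the extra space
    sudo resize2fs /dev/xvdf    # or xfs_growfs for XFS

Hence the appeal of a filesystem that just grows on its own.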
NFS? I just don't see the place for this product. The world has moved towards smart data formats on platforms like S3 and HDFS. Where would an NFS service be better? It is kind of hard for me to see using a distributed filesystem over a distributed datastore. I'm wondering about the outage scenarios with that. Historically speaking, the drivers that deal with this in the Linux kernel are not the best in terms of locking.
There are two things wrong with the "nobody should use a distributed file system" meme. (Disclaimer: I'm a distributed file system developer.)
(1) Not all development is green-field and thus subject to using this week's fashionable data store. A lot of applications already use file systems and rely on their semantics. Ripping up the entire storage layer of a large and complex application would be irresponsible, as it's likely to create more problems than it solves. Look down your nose at "legacy code" all you want, but those users exist and have money to give people who will help them solve their business problems. Often the solutions are as cutting-edge as anything you'd find in more fashionable quarters, even if the facade is old fashioned.
(2) Even for green-field development, file systems are often a better fit than object stores. Nested directories are more than just an organizational scheme, but also support security in a way that object stores don't. What's the equivalent of SELinux for data stored in S3 or Swift? There is none. File system consistency/durability guarantees can be important, as can single-byte writes to the middle of a file, while parsing HTTP headers on every request drops a lot of functionality and performance on the floor. Distributed databases are much more compelling, but the fact remains that the file system model is often the best one even for new applications.
Go ahead and use something else if it suits you. Other people wouldn't have much use for your favorite storage model, and this one suits them perfectly.
Funny you mentioned that this is a meme to you, while it is really a technical consideration to me, and I supplied some details about my concerns.
Answers following your numbering:
(1) Calling S3 "this week's fashionable data store" is like saying that an elephant is an interesting microbe. As for the rest of your points, which amount to "please do not innovate, we have had filesystems for 40 years and this is how you store your data": I do not agree. Disclaimer: I was a member of the team that moved amazon.com from NFS-based storage to S3. It was a great success, and it solved many of our problems, including the insane number of issues introduced by running an NFS cluster at that scale. And I would like to emphasize scale, because operational problems quite often grow worse than linearly with scale.
I know about legacy code, and I'm running several legacy services in production as of now. I can tell you one thing: there is a point when it is not financially viable to keep rolling with the legacy code. That point is very different based on your actual use case: banks tend to run "legacy code" while web 2.0 companies tend to innovate and replace systems at a faster pace. I don't see any conflict here. We even built a compatibility layer for the new solution, so it was possible to run your legacy code against the new system with your software untouched.
(2) Nested directories are a logical layer on top of how the data is stored, a.k.a. a view; you are a distributed FS developer, so I guess you understand that. S3 also supports nested directories, no biggie here.
Security. Well, this is kind of weird, because last time I checked S3 had an extensive security model: http://aws.amazon.com/s3/faqs/#security_anchor
Now the rest of your question can be rephrased as: "I am used to X, why isn't there X with this new thing???" I am not sure how many file system users use SELinux; my educated guess is roughly 1-10%. It is a very complex system that not many companies invest in using. For our use cases the fine-grained ACLs were good enough, so we are using those. File system durability: yes, it is very important, which is why I was kind of shocked about this bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/...
I guess you are right about the overhead of reading and writing, dealing with HTTP headers, etc. If the systems that benefit the most from S3 were single-node systems, it would be silly to use S3 in the first place. We are talking about 1,000 - 10,000 computers using the same data storage layer. And you can tell me if I am wrong, but if you would like to access the same files on these nodes using an FS, then you are going to end up with locking hell. This is why modern software that is IO heavy moved away from in-place edits towards "lock free" data access. Look at the implementation of Kafka log files, or how Aeron writes files. This is exactly the same scheme as how we use S3. Accident? ;)
I would like to repeat my original question: I don't see huge market for a distributed FS. I might be wrong, but this is how I see it.
"please do not innovate, we have filesystems for 40 years and this is how you store your data"
Please don't put words in my mouth like that. It's damn rude. I never said anything that was even close.
"S3 also supports nested directories no biggie here."
Not according to the API documentation I've seen. There are buckets, and there are objects within buckets. Nothing about buckets within buckets. Sure, there are umpteen different ways to simulate nested directories using various naming conventions recognized by an access library, but there's no standard and thus no compatibility. You also lose some of the benefits of true nested directories, such as combining permissions across different levels of the hierarchy. Also no links (hard or soft) which many people find useful, etc. Your claim here is misleading at best.
"last time I checked S3 had an extensive security"
Yes, it has its very own permissions system, fundamentally incompatible with any other and quite clunky to use. That still doesn't answer the question of how you'd do anything like SELinux with it.
Open up your bug list and we can have that conversation. Throwing stones from behind a proprietary wall is despicable.
"you can tell me if I am wrong but if you would like to access the same files on these nodes using a FS than you are going to end up with a locking hell."
You're wrong. Maybe you've only read about distributed file systems (or databases which have to deal with similar problems) from >15 years ago, but things have changed a bit since then. In fact, if you were at Amazon you might have heard of a little thing called Dynamo which was part of that evolution. Modern distributed systems, including distributed file systems, don't have that locking hell. That's just FUD.
"I don't see huge market for a distributed FS."
Might want to tell that to the EFS team. Let me know how that goes. In fact you might be right, but whether there's a market has little to do with your pseudo-technical objections. Many technologies are considered uncool long before they cease being useful.
We're contemplating moving our static image files (JPG/PNG) from an EBS volume to serving them from an S3 bucket (so we can deploy a HA setup). It sounds like it would be a lot less code if we used EFS instead. Would you guys recommend S3 or EFS for this scenario?
One thing to consider is that EFS is priced at 10x the cost of the most expensive tier of S3 storage ($0.30/Gbyte vs $0.03/Gbyte).
If they are just static assets, you would do better to put them in an S3 bucket, set appropriate Cache-Control headers and serve them via Cloudfront. This reduces your outbound bandwidth cost to the Internet versus EC2/S3, and yields better performance.
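Concretely, that's just something like (bucket name and paths are made up):

    aws s3 cp ./static s3://example-assets/static --recursive \
        --cache-control "public, max-age=31536000"

then a CloudFront distribution with that bucket as the origin.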
While there are a lot of unknowns at this point, my money would be on sticking with S3. Serving static files quickly is a large part of what it is designed to do.
From what I'm reading here, EFS seems more geared towards shared volumes where it being a filesystem is a critical part of the interface. If you can get away with not needing a posix filesystem layer over your datastore, you should.
S3 is going to be a lot cheaper. It isn't perfect, but it's pretty reliable. Then you can look at cloudfront. And they did recently release cross-region S3 replication. Then you can be really safe and keep a backup copy at a whole other location.
I don't think anyone can really recommend EFS for anything yet until it's in the wild for testing. I'd probably still store static files like that on S3 - with EFS you'd still need a server to handle the requests.
Depends, I work with a number of HA Drupal sites, and Drupal 7 can do most files in S3 but it still likes to put generated CSS/JS files and tmp files on local disk. In most of these cases (or for already existing sites) it's usually easier to just use NFS or Gluster instead of trying to force everything to S3.
S3fs filesystems are really slow. We tested around 10 MB/s for file upload. Where it really struggles is when you have a lot of files in a folder. Try doing an 'ls' on a folder with hundreds of files to see it break.
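For context, a typical s3fs-fuse mount looks something like this (bucket and cache path are made up); the local cache option helps with repeated reads but does nothing for those huge directory listings:

    s3fs example-bucket /mnt/s3 -o passwd_file=${HOME}/.passwd-s3fs -o use_cache=/tmp/s3fs-cache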
As always, it depends on your use case. Just because it can be slow doesn't mean it's not a viable (and in some cases superior) option.
We use it to store petabytes of large video files and our system is structured such that no folder ever has more than a couple files in it (>20 is rare). With properly tuned caching this works fantastically well for our use case and I would take the simpler code and reduced points of failure over NFS nonsense any day.
That of course doesn't mean s3fs is the solution to every problem, it simply means it's good to have options and don't write something off because it "might be slow."
Know your data, know your use case, and know your tools. You can make smart decisions on your own rather than driven by anecdotal comments on HN.
I agree with your points. It is indeed a viable, if somewhat clunky solution.
For getting the data into S3, we found dramatic improvements using the AWS CLI, as I believe it handles uploads in a multi-threaded way.
S3fs turned out to be viable for our use case, storing Magento Enterprise content assets which are then served directly from S3, so the app's upload features rely on s3fs as well as the file checks from the app itself (which are indeed quite slow).
I've always wanted to do it natively, mounting EBS volumes on more than one instance (which is not currently possible) or wishing for a native NFS service like AWS released.
All in all, it is a happy day for me. More options make us more powerful.
The FUSE-based ones that I've tried were riddled with problems and poor error handling. Hangs and truncated files were the rule rather than the exception.
s3fs-fuse has its share of problems, but master has fixes for some of the error handling and truncated files issues. Please report any bugs you encounter on GitHub!
> Q: What data consistency model does Amazon S3 employ?
> Amazon S3 buckets in the US Standard region provide eventual consistency. Amazon S3 buckets in all other regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.
I'd be very interested to know what kind of consistency guarantees EFS provides. The history of NFS is plagued by syscalls whose docs have a variation of the phrase "this operation is atomic (except on NFS)".
> Finally, note that, for NFS version 3 protocol requests, a subsequent commit request from the NFS client at file close time, or at fsync() time, will force the server to write any previously unwritten data/metadata to the disk, and the server will not reply to the client until this has been completed, as long as sync behavior is followed. If async is used, the commit is essentially a no-op, since the server once again lies to the client, telling the client that the data has been sent to stable storage. This again exposes the client and server to data corruption, since cached data may be discarded on the client due to its belief that the server now has the data maintained in stable storage.
I am not certain how this works in NFSv4 which is what EFS will be. The safe solution is to use the sync option for mounting the NFS volume, at the cost of performance.
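That is, something along these lines (server and export are made up):

    mount -t nfs4 -o sync,hard mynfs:/export/data /mnt/data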
This would be big for us. When we initially looked at the problem of sharing or keeping a large number of files in sync, the prospects were dim. DRBD? etc. So we ended up using Gluster. Gluster has been temperamental at best. We've been able to move some data out and into elasticsearch, but not all. So, I've nudged my AWS rep and signed up already. Reliable NFS is good for me.
We couldn't find a solution either, so we built a POSIX filesystem with an S3 backend that is easy to run and scale. If you want to give ObjectiveFS (https://objectivefs.com) a try, I'll be happy to hear your feedback.
Congratulations on launching a killer feature. Once you're addicted to it, it's very hard to move out of AWS :) (but honestly, this is very awesome technology we've been dreaming of for years)
If anything, their use of NFSv4 means there are plenty of competitive offerings if you decide that performance, security, or physical access constraints dictate migrating off their service.
If you don't want to manage your own Linux/BSD/etc. NFS infrastructure, Oracle, Netapp, and EMC will all happily sell you a storage appliance that supports it. I don't see much lock-in here.
I think parent meant it in a positive way, as opposed to negative. "So good you don't want to leave" as opposed to proprietary tech results in vendor lock-in.
This is a filesystem. You mount it directly inside your EC2 boxes and work with it as if it's local (or really NFS-mounted). It's going to be a couple of orders of magnitude faster, but not directly web-connected, so you can't use it to serve content directly as if it was a CDN.
I can see this being useful for a few cases. For one, I can immediately use it for one of the projects I have where I have multiple worker servers and one of them needs to periodically process a few GB's of data, yet I don't want to give that much storage to every server, and I don't want to make any one server special.
Another use case: you don't know how big your data will grow, yet you want to access it in a random fashion. S3 isn't great for this, but NFS is, and unlimited mounted storage is nice.
Edit: Third use case is logs. You can collect all the logs from all of your servers in one place and access them from any server.
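Presumably each instance just mounts the same filesystem over NFSv4, something like the line below run on every box (the endpoint name is made up; the real mount-target format isn't documented yet):

    sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-west-2.amazonaws.com:/ /mnt/shared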
I think they're meant to solve different problems. On one hand S3 is designed for (among other things) serving files over the web. EFS sounds like it's designed to be used as a filesystem multiple instances can write to (say for a high-availability setup).
One nice thing would be hosting wordpress on multiple EC2 servers. All local copies of your wordpress directory would be kept in sync automatically with no other configuration.
If you need something like EFS today, there is ObjectiveFS (https://ObjectiveFS.com), a posix cloud file system with S3 backend. Disclaimer: I am a co-founder :-)
Except that your one EC2 instance is a single point of failure, from instance through host through availability zone, and that providing reliable, multi-node NFS is hard (even with GlusterFS).
And you have to resize your block devices to store more stuff.
Well, this is a problem with a lot of ephemeral services hand-spun on AWS. There are also ways of mitigating it on your instances, but those are clunky, too.
Which is why new services like these are always preferable.
This may be a problem for this new service as well, but with Windows, isn't there a problem getting a network drive to mount without a user physically logging into the machine?
All services should eventually be like this. Just as you don't want to deal with provisioning BTUs of air conditioning or watts of power needed for your cloud infrastructure, why should you concern yourself about allocating a certain number of bytes of storage?
> Just as you don't want to deal with provisioning BTUs of air conditioning or watts of power needed for your cloud infrastructure
Maybe you don't want to, but there is definitely someone out there dealing with these issues.
E.g., during a heat wave (100 F+), a transformer on top of the building (at a previous employer) caught fire. When the dust settled, we found out that the person in charge of it had not upgraded it as our power requirements increased. It was over-taxed and the heat wave put it over the edge.
> All services should eventually be like this. Just as you don't want to deal with provisioning BTUs of air conditioning or watts of power needed for your cloud infrastructure, why should you concern yourself about allocating a certain number of bytes of storage?
Infrastructure guy here. Abstraction is to reduce workload; you still need to understand the underlying concepts. Otherwise, you're just the guy who freaks out when their DB is at 100% CPU utilization or has hours of replica lag, without knowing why.
WHY WAIT? Zetta.Net has been delivering enterprise-grade capabilities, for more than 6 years, that Amazon/Google/MS are just now getting into. Don't wait for the preview, go to www.zetta.net and get all you can handle today. Customers are moving in excess of 5 TB in/out natively over the internet on a daily basis, with the ability to do more given the available WAN connectivity. And add 100% cloud-based DRaaS, and tomorrow is here today at Zetta.
Until now, you could bind an EBS volume to only one instance, or had to use Gluster/HDFS otherwise.

> Multiple Amazon EC2 instances can access an Amazon EFS file system at the same time, providing a common data source for workloads and applications running on more than one instance.

You effectively have a distributed filesystem now. This is great news.
I was wondering that too, since Aurora uses shared multi AZ storage. When I learned that yesterday I found myself wishing they made just that available on AWS as it's own service. One day later, they announce it. I love what these guys are doing.
Nice. Do they have any details on how the resizing works? Is it automatic as I generate more files? Or will I have to explicitly provision more space as I generate the files?
It's going to be really nice if you are able to snapshot these like EBS volumes. I didn't see any reference in the details on how data recovery would work.
Concurrent versions of your app sitting next to each other somewhere in $PATH, meaning you can roll forward/back by typing app-$version (or whatever your convention is); see the sketch below.
Quickly and efficiently share files between instances.
It basically means you can treat a node group as a single proper Linux cluster, without having to buy GPFS licenses or indulge in the horror that is GlusterFS.
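A sketch of that versioned layout on the shared mount (paths and version numbers are made up):

    ls /shared/bin
    # app-1.4.2  app-1.5.0
    export PATH=/shared/bin:$PATH
    app-1.5.0    # run the new release
    app-1.4.2    # instant rollback: just invoke the previous version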
MP3 players were available long before the iPod. Apple has a pretty good track record of turning niche markets into mainstream ones - I'll be interested to see if they can do it with the smartwatch category.
I've actually long advocated that the pocket brick is a temporary form factor for a phone/computer. The watch will soon be the brain of your computing, to which you will be able to pair any number of dumb displays, keyboards, earpieces, etc.
But you're right, I will likely get an iWatch. It will go in my QA toybox with my macbooks and ipads and iphones.
Are you kidding? Apple reinvents every space they enter with profound invention.
Do you remember computers before the Mac? Notebooks before the Macbook? Music players before the iPod? Phones before the iPhone? Tablets before the iPad? And in a month you'll think: watches before the Apple Watch?
I think their skill is more profound commercialization. I do in fact remember all of those things. I even owned all of those things before Apple entered those markets. Apple invented very little for any of them.
Their strength lies in taking existing technologies, rebuilding them with a strong user focus, and then marketing the hell out of them. So much so that many people apparently forget what came before.
Most of their success comes from the invention of new interfaces that fit new form factors. Before that, only dissatisfying shit is available and the market is mostly novelty; after that, the new device category has actual features a human can use, a market happens, and knock-offs compete.
Which interface were you thinking they invented? The WIMP-style interface was borrowed from Xerox PARC. The MacBook was a shiny laptop. The iPod had a menu system much like other MP3 players of the time; they mainly made it a well-organized, consumer-friendly menu system with a bigger screen. The iPhone's main interface innovation I can think of was making the screen bigger; they did use a virtual keyboard, but so did the Palm Pilot and the Apple Newton of years before. The iPad, interface-wise, was basically a big iPod Touch.
Don't get me wrong; those are all very polished products, and they took a lot of technical smarts. They were much more usable to a mainstream consumer audience. But the non-Apple versions of those products were generally fine for non-consumer audiences. And Apple's marketing is masterful; I've never seen a tech company so good at generating hype.
Yeah, they were pretty good until I used the Apple version... :) (Exception: MacOS 9 laptops, MacBooks pre-titanium)
My phone before the iPhone was horrible, though. Partly that was my fault for getting a Razr. The user interface somehow managed it so that my thumb was always in exactly the wrong spot for whatever I needed to push right then. It was a joyous day when I found someone to give that thing away to. (On the good side, it woke up two years after I stopped using it, plugging it in, etc., to give me an alarm I had set. Now that is reliability!)