
Amazon Elastic File System - suprgeek
http://aws.amazon.com/efs/
======
chuckcode
I wish AWS would be a little more technical in their product descriptions and
announcements. If there was ever an audience that wants technical data about
how this type of technology scales and compares to existing technology like
GlusterFS, it's AWS users. Instead the performance specifications are ".. and
is designed to provide the throughput, IOPS, and low latency needed for a
broad range of workloads." We know that NFS to a lot of nodes is hard; show us
that this scales better, or am I the testing team?

~~~
jeffbarr
We'll be providing a lot more details later.

~~~
click170
Fair enough, but one of the biggest reasons I ignore these types of posts is
that too often they lack these critical details and come off as puff pieces to
me. IMO these posts would hold value for a larger audience if that info was
included in them.

~~~
vacri
It's an upcoming product announcement page from a vendor asking for preview
registration; how can it be anything other than a puff piece?

~~~
jsprogrammer
The trick is, based on the title and other marketing tricks, one doesn't find
out it's an actual puff piece until one has already spent time reading at
least a portion of the puff. By then, you just feel used.

------
lchengify
This is really a powerful product, and it shows the wisdom and work in
Amazon's product/market research.

Outside of the world of startups and young companies who "grew up" in the
world of cloud-based solutions, there is a large ecosystem of more traditional
enterprises who still have a lot of on-premise computing.

These companies have a lot of lock-in: racks of physical on-site servers,
SharePoint-based access control, custom hardware and clusters for everything
from large file storage to compliance metadata, and custom software built
around this infrastructure.

Of those, one of the biggest lock-in dependencies I've seen is NFS. Not just
because NFS is one of the oldest protocols, but because of the nature of the
NFS abstraction. Fundamentally, software that assumes a filesystem is shared,
globally mounted, and read/write is very hard to adapt to a cloud solution.
Many times, it requires re-writing the software, or coming up with an NFS shim
(such as a FUSE solution) that is so underperforming it blocks usage.

If AWS implements this correctly, this could provide the cost/performance
balance to potentially move such a solution completely to AWS. This would
eliminate not just large amounts of physical overhead for these companies, but
the productivity costs that come with the downtime that inevitably occurs when
you don't have good redundancy.

These companies (and the industries they belong to) are trying to find out how
best to leverage Amazon. Recently, even more conservative industries, such as
law, are becoming more aware of AWS and other cloud-based solutions. Let's hope
that solutions like this, which bridge the old with the new, can empower that
transition so we can all feel better about how our software is managed.

------
notacoward
As a GlusterFS developer, and furthermore the founder of a project to create a
"cloud" version of GlusterFS aimed at exactly this use case, this is pretty
darn interesting to me. I guess I'm supposed to pick away at all the feature
differences between EFS and GlusterFS-on-EC2, or something like that, but for
now I'm more pleased to see that this use case is finally being addressed and
the solution seems well integrated with other Amazon features. Kudos to the
EFS team.

~~~
tristanz
I'd love to see an easy-to-manage version for containerized cloud stacks like
Kubernetes and Mesos. I think the AWS move here validates that NFS is still an
OK pattern to use for some applications.

I had mixed experiences with Gluster a year ago, including lost files, so
something rock solid and easy to manage would be a great product.

~~~
smarterclayton
Yeah, that's something the Ceph and Gluster teams have been working on, to
integrate with Kubernetes seamlessly (or at least easily). The Gluster core
drivers for volumes landed recently, and self-service, on-demand filesystem
provisioning is a goal.

~~~
ipedrazas
Isn't that the same thing OpenStack uses? I can see a lot of potential with
containers + EFS.

------
DenisM
Salient points:

- NFS (v4).

- Supports petabyte-scale file systems, thousands of concurrent NFS
connections.

- Automatically grows/shrinks in size.

- Multi-zone storage and access.

- $0.30 / (Gigabyte * month).

- (Not mentioned) Both Linux and Windows have built-in NFS clients.
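
Since both Linux and Windows ship NFS clients and EFS is plain NFSv4, mounting
it from an instance should look like any ordinary NFS mount. A rough sketch in
Python (the mount-target hostname and mount point are hypothetical
placeholders; AWS hasn't published the exact naming scheme yet):

    import subprocess

    # Hypothetical EFS mount target DNS name and local mount point.
    mount_target = "fs-12345678.efs.us-west-2.amazonaws.com"
    mount_point = "/mnt/efs"

    subprocess.check_call(["mkdir", "-p", mount_point])
    # EFS speaks standard NFSv4, so a stock nfs4 mount should work.
    subprocess.check_call(
        ["mount", "-t", "nfs4", mount_target + ":/", mount_point])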

~~~
e40
For comparison, EBS is $0.10 / GB / Month for SSDs and $0.05 / GB / Month for
regular disks.

~~~
hrez
For comparison, keep in mind that EBS is $0.10/GB of provisioned storage,
which is often over-provisioned, versus EFS's $0.30/GB of utilized storage.

------
andrew311
EFS is a great addition to AWS. We have SAN as a service via EBS; now we get
NFS as a service. Great.

The question (for me) now becomes "where do we go from here?"

Infinite NFS is great, but what I've always wanted is infinite EBS that is
fully integrated from file system to SAN. In other words, something that
behaves like a local file system (without the gotchas of NFS like a lack of
delete on close), but I don't have to snapshot and create new volumes and
issue file system expansion commands to grow a volume. I want seamless and
automatic growth.

Furthermore, there's so much local SSD just sitting around when using EBS. I
want to make full use of local SSD inside of an EC2 instance to do write-back
or write-through caching. I could do this in software, but maybe there's an
abstraction begging to be made at the service level.

Throw in things like snapshots, and this would make for a fairly powerful
solution, and it would certainly remove a lot of operational concerns around
growing database nodes and such.

Don't get me wrong, you can pull together a few things and write some
automation to do this today. You could use LVM to stitch together many EBS
volumes, add in caching middleware (dm-cache, flashcache, etc.), and then
automate the addition of volumes and file system growth. However, it's clunky,
and there's an opportunity to make this much easier.
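
For what it's worth, a rough sketch of that kind of glue in Python, using
boto3 for the EBS calls and shelling out for the LVM/filesystem steps (the
volume group, logical volume, device name, and mount point are all
hypothetical, and waiting for the device to actually appear in the guest is
glossed over):

    import subprocess
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    def grow_data_volume(instance_id, az, size_gb, device="/dev/xvdf"):
        # 1. Create a new EBS volume and attach it to the instance.
        vol_id = ec2.create_volume(AvailabilityZone=az, Size=size_gb,
                                   VolumeType="gp2")["VolumeId"]
        ec2.get_waiter("volume_available").wait(VolumeIds=[vol_id])
        ec2.attach_volume(VolumeId=vol_id, InstanceId=instance_id, Device=device)
        ec2.get_waiter("volume_in_use").wait(VolumeIds=[vol_id])

        # 2. Stitch the new device into the LVM volume group and grow the LV.
        subprocess.check_call(["pvcreate", device])
        subprocess.check_call(["vgextend", "vg_data", device])
        subprocess.check_call(["lvextend", "-l", "+100%FREE", "/dev/vg_data/lv_data"])

        # 3. Grow the filesystem online (xfs_growfs for XFS, resize2fs for ext4).
        subprocess.check_call(["xfs_growfs", "/data"])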

I recognize that what I'm describing doesn't serve the same purpose as NFS -
for example, EBS isn't mountable in multiple locations at once - but I'd
really like to see the "seamless infinite storage" idea applied to EBS.

~~~
rgbrenner
_something that behaves like a local file system (without the gotchas of NFS
like a lack of delete on close), but I don't have to snapshot and create new
volumes and issue file system expansion commands to grow a volume. I want
seamless and automatic growth._

Wouldn't EBS w/ thin provisioning get you most of this? Just create a massive
volume, and you get billed for the space actually used. (and the volume size
could also function as a limit on your bill.)

~~~
andrew311
Yes, good point. This would provide effectively "infinite" backing storage.
There might be some hurdles to overcome, though. For example, when you delete
a file, will EBS know that the blocks are now free and thus can be
decommissioned? This might mean the whole stack needs to support things like
TRIM. I'm not sure the rest of the stack is smart enough yet. I'd love to hear
from a storage/FS expert on this.

Edit: coincidentally, I just saw this article about XFS which observes the
following:

"Over the next five or more years, XFS needs to have better integration with
the block devices it sits on top of. Information needs to pass back and forth
between XFS and the block device, [says Dave Chinner]. That will allow better
support of thin provisioning."
[https://lwn.net/Articles/638546/](https://lwn.net/Articles/638546/)

~~~
rgbrenner
The reason I suggested it is that thin volumes are well understood, so the
issues are straightforward... and it's been implemented in many products (LVM,
virtually every SAN, XenServer and VMware, etc.). So there really shouldn't be
many surprises if Amazon were to implement it.

And yes, TRIM is used to mark blocks as free.

Honestly, it's so widespread that I would be surprised if Amazon weren't
already using it to over-commit EBS.
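
(On the guest side the hand-off is the discard machinery: either mount with
the discard option or run fstrim periodically. A tiny sketch, assuming a
filesystem mounted at a hypothetical /data that supports discard:)

    import subprocess

    # Ask the filesystem to report its unused blocks down to the block layer,
    # so a thin-provisioned backing store could reclaim them.
    subprocess.check_call(["fstrim", "-v", "/data"])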

------
ealexhudson
I think this is a great product, but I've been actively avoiding NFS for a
while now. It's a real shame that there isn't a better network FS protocol
standardized already - NFS is complex, is usually a single point of failure
(would be interesting to know if EFS isn't...) and comes with a whole set of
cruft.

~~~
koenigdavidmj
I understand why they did it, though. It's either this (everyone supports NFS,
even Windows) or get crucified for vendor lock-in. Just like Github did
yesterday
([https://news.ycombinator.com/item?id=9343021](https://news.ycombinator.com/item?id=9343021)),
even when they released their product as an open standard with an open source
reference implementation.

~~~
justinsb
There's a world of difference between what Github did, and the hypothetical
where AWS chose a proprietary protocol to access a filesystem. For AWS, as
long as the underlying filesystem was "more or less POSIX", the access
mechanism is largely irrelevant to lock-in; it would be as easy to switch from
AWS as it would be to move between filesystems.

Git was not designed for large files. But what github released yesterday
primarily serves to promote github's central-server model for git. Moreover,
it seems that it could have been better done within the git protocol itself
(modify git to do more sparse pulls, and then try to fetch on a checkout when
it is missing blobs, rather than erroring immediately).

I suspect AWS chose to use NFS for expedience; the net effect is positive, but
I don't think it would have mattered much anyway.

Github is trying to inject their own server-model into the git protocol, with
an extension that is only half thought through; that is a huge step backwards,
open-source or not.

------
alrs
This is what should have existed instead of EBS all along.

I'll never consider this to be as reliable as S3, but if I'm going to have a
network filesystem I'd rather be dealing with NFS as my abstraction instead of
virtualized network block devices.

~~~
alexchamberlain
May I ask why? My experience with NFS is pretty bad performance-wise.

~~~
matheweis
Not sure why NFS gets such a bad rap. On a low-latency network, properly tuned
NFS has very few, if any, performance issues.

I've personally seen read/write rates exceeding 800 MByte/sec on more or less
white-box hardware, at which point it was limited by the underlying storage
infrastructure (8Gbit fiber), not the NFS protocol.

Dell has a 2013 white paper (I'm not affiliated with them, fwiw) about their
fairly white-box setup that achieved > 100,000 IOPS, 2.5 GByte/s sequential
write, and 3.5 GByte/s sequential read:
[http://en.community.dell.com/techcenter/high-performance-computing/b/hpc_storage_and_file_systems/archive/2013/04/05/over-100-000-iops-with-plain-ol-nfs-and-7-200-rpm-drives](http://en.community.dell.com/techcenter/high-performance-computing/b/hpc_storage_and_file_systems/archive/2013/04/05/over-100-000-iops-with-plain-ol-nfs-and-7-200-rpm-drives)

~~~
buster
(me: Working in the Messaging Business for ISPs)

Not sure how it would ever be technically possible for a networked filesystem
to get even near directly attached storage. But, for sure, the typical
carrier-grade EMC or NetApp is MUCH slower than a good SAN. I'm talking about
petabytes of very small (average maybe 20kB) files with lots of _random_ sync
writes and reads. NFS has a lot of other benefits, but it surely is not super
high performance in every use case, regardless of what a theoretical marketing
whitepaper has shown in some lab setup.

Someone who thinks that you can put a network protocol around a filesystem
without _any_ performance impact is nuts.

BUT if your use case fits NFS, you may well get very good performance out of
it. As always, pick the right technology for your specific case.

~~~
dekhn
Petabytes of 20k files?

I think you might want to use your filesystem more effectively.

~~~
buster
Well, how would that help in terms of NFS? You'd still have to tell NFS to
read 20kB; whether it's 20kB from one big file or from one 20kB file doesn't
matter much. It's common to have one file per email, and the usual filesystem
has no problem with that.

------
zippergz
On the one hand, I've been wanting something like this for a while. On the
other hand, I have so many bad memories of problems caused by NFS in
production from the 90s that I'm leery.

~~~
acdha
At least the Linux NFS client made some big improvements in stability by the
mid-2000s. FreeBSD took longer but I've heard they've fixed the kernel
deadlocks as well.

The other interesting note is that they apparently only support NFSv4, which
has some welcome improvements: it uses TCP over a single port, avoids the
entire portmap/statd/lockd train-wreck, UTF-8 everywhere, etc. One of the more
interesting ones is that it has referrals (think HTTP redirect) so one server
doesn't have to handle every request from every client and you can do things
like load-balancing. Clients are also assumed to be smarter so e.g. local
caching is safe because the server can notify you about changes to open files
rather than requiring the client to poll.

~~~
evol262
It's almost certainly NFSv4-only so they can utilize pNFS.

~~~
notacoward
Not necessarily. IIRC, pNFS is an optional feature of NFSv4.1, so they might
not have implemented that flavor. If they had, I'm pretty sure they'd be
advertising it. They're not shy.

~~~
evol262
Technical details aren't out yet. They may not have implemented it, but given
their comments about performance and scaling, it would be a very good reason
not to support NFSv3.

------
jtchang
Does anyone else cringe when someone suggests using NFS in production?

I can't be the only one that has been woken up at 2am because of an NFS
outage.

~~~
wildlogic
Although I was skeptical going into the project, I deployed a virtualized
Oracle RAC environment over 10G NFS, and with some tuning it was stable and
performant. If it is good enough for RAC, which has some of the most stringent
latency/performance requirements that I've seen, I'd say that it is probably
good enough for quite a few production use cases, although to be fair this was
only an 8-node cluster.

~~~
throwaway1979
Some naive questions if you don't mind... (I'm really curious about RAC .. my
only DB experience has been small MySQL and MS SQL Server clusters).

1) I thought the really high end DBs like to manage their own block storage.
Your NFS comment suggests that the database data files were running on an NFS
mount, and you had a 10 gig Ethernet connection to the file server.

2) What would you say is the average size of a RAC cluster (in your opinion)?
Is 8 considered a small cluster in this realm?

3) DBs have stringent requirements when it comes to operations like sync. Can
you actually get ACID in an NFS-backed DB?

Thanks for satisfying my curiosity :)

~~~
jsmeaton
Just to offer a little bit of information, we're currently running a 2 node
RAC cluster. I'm not entirely sure about the storage mechanism though.

------
positr0n
As someone who doesn't know AWS much, what are the major differences between
this and EBS?

~~~
nkozyra
As has been mentioned, EBS can only be mounted to one instance at a time. If
you could mount it to multiple instances it would effectively be the same
thing, but then you have all sorts of write-locking issues.

As the rest of the comments also suggest, a lot of the cloud-entrenched world
has abandoned NFS, at least in AWS circles.

I'm not one of these people. Rather than relying solely on puppet->all
instances to handle multiple deploys, I found the convenience of an NFS
instance appealing, so essentially I have a relatively small (read: medium)
instance that does nothing but manage the deployment filesystems and allow new
in-security groups to connect. I have often thought of abandoning this for
doing deploys in a more "modern" way, but I'm still not sure what the benefit
would be other than eliminating a minor source of shame.

The analog here is running a MySQL or Postgres database in an instance prior
to RDS. RDS provided enough benefit that the minor price difference in rolling
your own no longer factored in. A more reliable, fault-tolerant and extensible
file system is, like RDS, a huge upgrade. It may not be for everyone, but for
some of us it's just another reason why AWS keeps making it hard to even look
anywhere else.

~~~
vizzah
Would it be a good idea to store a MySQL db on EFS to be shared among
load-balanced instances?

~~~
simonw
No, absolutely not. See the warning on
[http://dev.mysql.com/doc/refman/5.0/en/multiple-data-directories.html](http://dev.mysql.com/doc/refman/5.0/en/multiple-data-directories.html)

------
istvan__
NFS? I just don't see the place for this product. The world has moved towards
the use of smart data formats on platforms like S3 and HDFS. Where would an
NFS service be better? It is kind of hard for me to see using a distributed
filesystem over a distributed datastore. I'm wondering about the outage
scenarios with that. Historically speaking, the drivers that deal with this in
the Linux kernel are not the best in terms of locking.

[http://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch11_03.h...](http://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch11_03.htm)

To me, this is a step in the wrong direction.

~~~
notacoward
There are two things wrong with the "nobody should use a distributed file
system" meme. (Disclaimer: I'm a distributed file system developer.)

(1) Not all development is green-field and thus subject to using this week's
fashionable data store. A lot of applications already use file systems and
rely on their semantics. Ripping up the entire storage layer of a large and
complex application would be irresponsible, as it's likely to create more
problems than it solves. Look down your nose at "legacy code" all you want,
but those users exist and have money to give people who will help them solve
their business problems. Often the solutions are as cutting-edge as anything
you'd find in more fashionable quarters, even if the facade is old-fashioned.

(2) Even for green-field development, file systems are often a better fit than
object stores. Nested directories are more than just an organizational scheme;
they also support security in a way that object stores don't. What's the
equivalent of SELinux for data stored in S3 or Swift? There is none. File
system consistency/durability guarantees can be important, as can single-byte
writes to the middle of a file, while parsing HTTP headers on every request
drops a lot of functionality and performance on the floor. Distributed
databases are much more compelling, but the fact remains that the file system
model is often the best one even for new applications.

Go ahead and use something else if it suits you. Other people wouldn't have
much use for _your_ favorite storage model, and this one suits them perfectly.

~~~
istvan__
Funny that you call this a meme, while it is really a technical consideration
to me, and I supplied some details about my concerns.

Answers following your numbering:

(1) Calling S3 "this week's fashionable data store" is like saying that an
elephant is an interesting microbe. As for the rest of your points, which
amount to "please do not innovate, we have had filesystems for 40 years and
this is how you store your data": I do not agree. Disclaimer: I was a member
of the team that moved amazon.com from NFS-based storage to S3. It was a great
success, and it solved many of our problems, including dealing with the insane
number of issues introduced by running an NFS cluster at that scale. And I
would like to emphasize scale, because operational problems quite often grow
worse than linearly with scale.

I know about legacy code, and I'm running several legacy services in
production as of now. I can tell you one thing: there is a point when it is
not financially viable to keep rolling with the legacy code. That point is
very different based on your actual use case; banks tend to run "legacy code"
while web 2.0 companies tend to innovate and replace systems at a faster pace.
I don't see any conflict here. We even built a compatibility layer for the new
solution, so it was possible to run your legacy code on the new system with
your software untouched.

(2) Nested directories are a logical layer on top of how the data is stored,
a.k.a. a view; you are a distributed FS developer, so I guess you understand
that. S3 also supports nested directories, no biggie here. Security: well,
this is kind of weird, because last time I checked S3 had extensive security
features:
[http://aws.amazon.com/s3/faqs/#security_anchor](http://aws.amazon.com/s3/faqs/#security_anchor)
Now the rest of your question can be re-phrased as "I am used to X, why isn't
there X with this new thing???" I am not sure how many file system users use
SELinux; my educated guess is roughly 1-10%. It is a very complex system that
not too many companies invest in using. For our use cases the fine-grained
ACLs were good enough, so we are using those. File system durability: yes, it
is very important, which is why I was kind of shocked about this bug:
[https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45](https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45)

I guess you are right about the overhead of reading and writing, dealing with
HTTP headers, etc. If the systems that benefit the most from S3 were
single-node systems, it would be silly to use S3 in the first place. We are
talking about 1000 - 10000 computers using the same data storage layer. And
you can tell me if I am wrong, but if you would like to access the same files
on these nodes using a FS, then you are going to end up with locking hell.
This is why modern software that is IO-heavy has moved away from in-place
edits towards "lock-free" data access. Look at the implementation of Kafka log
files or how Aeron writes files. This is exactly the same scheme as how we use
S3. Accident? ;)

I would like to repeat my original point: I don't see a huge market for a
distributed FS. I might be wrong, but this is how I see it.

[http://kafka.apache.org/](http://kafka.apache.org/)
[https://github.com/real-logic/Aeron](https://github.com/real-logic/Aeron)

~~~
notacoward
"please do not innovate, we have filesystems for 40 years and this is how you
store your data"

Please don't put words in my mouth like that. It's damn rude. I never said
anything that was even close.

"S3 also supports nested directories no biggie here."

Not according to the API documentation I've seen. There are buckets, and there
are objects within buckets. Nothing about buckets within buckets. Sure, there
are umpteen different ways to simulate nested directories using various naming
conventions recognized by an access library, but there's no standard and thus
no compatibility. You also lose some of the benefits of true nested
directories, such as combining permissions across different levels of the
hierarchy. Also no links (hard or soft) which many people find useful, etc.
Your claim here is misleading at best.
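
(For concreteness, the convention being argued about is key prefixes plus
delimiter-based listing; a rough boto3 sketch with made-up bucket and key
names:)

    import boto3

    s3 = boto3.client("s3")

    # "Directories" are just key prefixes; S3 itself only has buckets and objects.
    s3.put_object(Bucket="example-bucket", Key="reports/2015/q1.csv", Body=b"...")

    # Listing with a delimiter makes S3 group keys by common prefix, which is
    # as close to "ls of a directory" as the API gets.
    resp = s3.list_objects_v2(Bucket="example-bucket",
                              Prefix="reports/", Delimiter="/")
    for p in resp.get("CommonPrefixes", []):
        print(p["Prefix"])  # e.g. "reports/2015/"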

"last time I checked S3 had an extensive security"

Yes, it has its very own permissions system, fundamentally incompatible with
any other and quite clunky to use. That still doesn't answer the question of
how you'd do anything like SELinux with it.

"File system durability: yes it is very important, this why I was kind of
shocked about this bug:
[https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/...](https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/...")

Open up your bug list and we can have that conversation. Throwing stones from
behind a proprietary wall is despicable.

"you can tell me if I am wrong but if you would like to access the same files
on these nodes using a FS than you are going to end up with a locking hell."

You're wrong. Maybe you've only read about distributed file systems (or
databases which have to deal with similar problems) from >15 years ago, but
things have changed a bit since then. In fact, if you were at Amazon you might
have heard of a little thing called Dynamo which was part of that evolution.
Modern distributed systems, including distributed file systems, don't have
that locking hell. That's just FUD.

"I don't see huge market for a distributed FS."

Might want to tell that to the EFS team. Let me know how that goes. In fact
you might be right, but whether there's a _market_ has little to do with your
pseudo-technical objections. Many technologies are considered uncool long
before they cease being useful.

------
joemccall86
We're contemplating moving our static image files (JPG/PNG) from an EBS volume
to serving them from an S3 bucket (so we can deploy an HA setup). It sounds
like it would be a lot less code if we used EFS instead. Would you guys
recommend S3 or EFS for this scenario?

~~~
skuhn
One thing to consider is that EFS is priced at 10x the cost of the most
expensive tier of S3 storage ($0.30/GB vs $0.03/GB).

If they are just static assets, you would do better to put them in an S3
bucket, set appropriate Cache-Control headers, and serve them via CloudFront.
This reduces your outbound bandwidth cost to the Internet versus EC2/S3, and
yields better performance.
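
A rough sketch of the upload side with boto3 (the bucket name and max-age are
placeholders; the CloudFront distribution itself is set up separately and not
shown):

    import boto3

    s3 = boto3.client("s3")

    # Upload a static asset with a long cache lifetime so CloudFront and
    # browsers can cache it aggressively.
    s3.upload_file(
        "logo.png", "example-assets-bucket", "img/logo.png",
        ExtraArgs={
            "ContentType": "image/png",
            "CacheControl": "public, max-age=31536000",
        },
    )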

------
willcodeforfoo
Looks interesting... but for my use case I'd rather not deal with NFS, use a
FUSE-based S3 mount and save 90%.

~~~
fi788
S3fs filesystems are really slow. We tested around 10 MB/s for file uploads.
Where it really struggles is when you have a lot of files in a folder. Try
doing an 'ls' on a folder with hundreds of files to see it break.

~~~
bmurphy1976
As always, it depends on your use case. Just because it can be slow doesn't
mean it's not a viable (and in some cases superior) option.

We use it to store petabytes of large video files and our system is structured
such that no folder ever has more than a couple files in it (>20 is rare).
With properly tuned caching this works fantastically well for our use case and
I would take the simpler code and reduced points of failure over NFS nonsense
any day.

That of course doesn't mean s3fs is the solution to every problem; it simply
means it's good to have options, and you shouldn't write something off because
it "might be slow."

Know your data, know your use case, and know your tools. You can make smart
decisions on your own rather than being driven by anecdotal comments on HN.

~~~
moe
Which s3fs are you using?

The FUSE-based ones that I've tried were riddled with problems and poor error
handling. Hangs and truncated files were the rule rather than the exception.

~~~
gaul
s3fs-fuse has its share of problems, but master has fixes for some of the
error handling and truncated files issues. Please report any bugs you
encounter on GitHub!

------
b1twise
This would be big for us. When we initially looked at the problem of sharing
or keeping a large number of files in sync, the prospects were dim. DRBD? etc.
So we ended up using Gluster. Gluster has been temperamental at best. We've
been able to move some data out and into elasticsearch, but not all. So, I've
nudged my AWS rep and signed up already. Reliable NFS is good for me.

~~~
graceofs
We couldn't find a solution either, so we built a POSIX filesystem with an S3
backend that is easy to run and scale. If you want to give ObjectiveFS
([https://objectivefs.com](https://objectivefs.com)) a try, I'll be happy to
hear your feedback.

------
netcraft
Blog link:
[https://aws.amazon.com/blogs/aws/amazon-elastic-file-system-shared-file-storage-for-amazon-ec2/](https://aws.amazon.com/blogs/aws/amazon-elastic-file-system-shared-file-storage-for-amazon-ec2/)

------
tszming
Congratulations on launching a killer feature; once you're addicted to it,
it's very hard to move away from AWS :) (but honestly, this is a very awesome
technology we've been dreaming of for years)

~~~
rcoder
How does this make it harder to move out of AWS?

If anything, their use of NFSv4 means there are plenty of competitive
offerings if you decide that performance, security, or physical access
constraints dictate migrating off their service.

If you don't want to manage your own Linux/BSD/etc. NFS infrastructure,
Oracle, Netapp, and EMC will all happily sell you a storage appliance that
supports it. I don't see much lock-in here.

~~~
Nexxxeh
I think the parent meant it in a positive way, as opposed to negative: "so
good you don't want to leave," as opposed to proprietary tech resulting in
vendor lock-in.

------
shogun21
What's the advantage of this over S3?

~~~
IgorPartola
This is a filesystem. You mount it directly inside your EC2 boxes and work
with it as if it's local (or really NFS-mounted). It's going to be a couple of
orders of magnitude faster, but not directly web-connected, so you can't use
it to serve content directly as if it was a CDN.

I can see this being useful for a few cases. For one, I can immediately use it
for one of my projects where I have multiple worker servers and one of them
needs to periodically process a few GBs of data, yet I don't want to give that
much storage to every server, and I don't want to make any one server special.

Another use case: you don't know how big your data will grow, yet you want to
access it in a random fashion. S3 isn't great for this, but NFS is, and
unlimited mounted storage is nice.

Edit: Third use case is logs. You can collect all the logs from all of your
servers in one place and access them from any server.

~~~
shogun21
Okay, thanks for the explanation!

------
netcraft
We have been needing this very thing for hosting a large number of assets
across multiple EC2 instances. I hope they roll this out quickly.

~~~
nkozyra
For the record, there has never been anything stopping you from spinning up an
NFS EC2 that all of your other EC2s can use.

~~~
jdub
Except that your one EC2 instance is a single point of failure, from instance
through host through availability zone, and that providing reliable, multi-
node NFS is _hard_ (even with GlusterFS).

And you have to resize your block devices to store more stuff.

And...

And...

:-)

~~~
nkozyra
Well, this is a problem with a lot of ephemeral services hand-spun on AWS.
There are also ways of mitigating it on your instances, but those are clunky,
too.

Which is why new services like these are always preferable.

------
biot
All services should eventually be like this. Just as you don't want to deal
with provisioning BTUs of air conditioning or watts of power needed for your
cloud infrastructure, why should you concern yourself about allocating a
certain number of bytes of storage?

~~~
pyre
> Just as you don't want to deal with provisioning BTUs of air conditioning or
> watts of power needed for your cloud infrastructure

Maybe _you_ don't want to, but there is definitely someone out there dealing
with these issues.

E.g., during a heat wave (100 F+), a transformer on top of the building (at a
previous employer) caught fire. When the dust settled, we found out that the
person in charge of it had not upgraded it as our power requirements
increased. It was over-taxed, and the heat wave put it over the edge.

~~~
tlrobinson
Right... which is why software should do it automatically.

------
JoelMaki
WHY WAIT? Zetta.Net has been delivering enterprise-grade capabilities, for
more than 6 years, that Amazon/Google/MS are just now getting into. Don't wait
for the preview; go to www.zetta.net and get all you can handle today.
Customers are moving in excess of 5TB in/out natively over the internet on a
daily basis, with the ability to do more given the available WAN connectivity.
Add 100% cloud-based DRaaS, and tomorrow is here today at Zetta.

------
bosky101
The best and key feature seems highly understated.

Until now, you could bind an EBS volume to only one instance.

Or you had to use Gluster/HDFS otherwise.

    
    
        Multiple Amazon EC2 instances can access an Amazon EFS file system
        at the same time, providing a common data source for
        workloads and applications running on more than one instance.
    

You effectively have a distributed filesystem now. This is great news.
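
If several instances also write to it, you still have to coordinate somewhere.
NFSv4 carries byte-range (POSIX) locking in the protocol itself, so something
like this sketch should work across machines (the path is hypothetical, and
whether you want to lean on NFS locking at all is a separate question):

    import fcntl

    # The same file on the shared EFS mount, opened from any instance.
    with open("/mnt/efs/jobs.lock", "w") as lockfile:
        # Blocks until no other instance holds the lock.
        fcntl.lockf(lockfile, fcntl.LOCK_EX)
        # ... do the work that must not run on two instances at once ...
        fcntl.lockf(lockfile, fcntl.LOCK_UN)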

------
arthursilva
This could be a byproduct of the shared storage they developed for Aurora
(their upcoming redesigned MySQL). Or probably the other way around.

~~~
eloff
I was wondering that too, since Aurora uses shared multi-AZ storage. When I
learned that yesterday, I found myself wishing they made just that available
on AWS as its own service. One day later, they announce it. I love what these
guys are doing.

------
agrothberg
How does this compare to just using AWS Storage Gateway
([http://aws.amazon.com/storagegateway/](http://aws.amazon.com/storagegateway/))
sitting in front of S3?

------
dumbfounder
Finally! Seems like this should have been a product day one on AWS.

------
daemonk
Nice. Do they have any details on how the resizing works? Is it automatic as I
generate more files? Or will I have to explicitly provision more space as I
generate the files?

~~~
zwily
Looks like it's automatic.

------
davidmichael
It's going to be really nice if you are able to snapshot these like EBS
volumes. I didn't see any reference in the details on how data recovery would
work.

------
simonjgreen
NFS as a service on AWS. That is going to shake things up.

------
kaizen2015
Does EFS include snapshot capabilities? If so, can I use object for my
protection scheme? How about EFS to Glacier connectors? In the works?

------
mrfusion
Will this break my backup strategy of snapshotting my EBS volumes?

Or is this a feature you need to enable, and you keep using EBS otherwise?

~~~
dangrossman
It's a totally separate product. It'll have no effect on your EBS usage.

------
joncooper
Love it. I hope they can deliver meaningful throughput.

------
afrc
Will this be available on Elastic Beanstalk?

------
jkrejci
Pretty sweet little feature.

------
Shreyansh1911
whoa

------
tschellenbach
Sounds like they are recommending an anti-pattern: sharing your storage across
instances... You really don't want to do that.

~~~
teraflop
Care to explain why not? It works well enough for Google:
[http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.reverse-proxy.org/en/us/university/relations/facultysummit2010/storage_architecture_and_challenges.pdf](http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.reverse-proxy.org/en/us/university/relations/facultysummit2010/storage_architecture_and_challenges.pdf)

------
rjurney
Amazon is the only company other than Apple that consistently thought leads
and innovates ahead of other companies and open source.

~~~
vially
Would you mind explaining how exactly "Apple consistently leads and innovates
ahead of other companies and open source"?

~~~
rjurney
Are you kidding? Apple reinvents every space they enter with profound
invention.

Do you remember computers before the Mac? Notebooks before the Macbook? Music
players before the iPod? Phones before the iPhone? Tablets before the iPad?
And in a month you'll think: watches before the Apple Watch?

~~~
wpietri
I think their skill is more profound commercialization. I do in fact remember
all of those things. I even owned all of those things before Apple entered
those markets. Apple invented very little for any of them.

Their strength lies in taking existing technologies, rebuilding them with a
strong user focus, and then marketing the hell out of them. So much so that
many people apparently forget what came before.

~~~
rjurney
Most of their success is owing to the invention of new interfaces that fit new
form factors. Before that, only dissatisfying shit was available, and the
market was mostly novelty. After that, the new device categories have actual
features a human can use, a market happens, and knock-offs compete.

~~~
wpietri
Which interface were you thinking they invented? The WIMP-style interface was
borrowed from Xerox PARC. The MacBook was a shiny laptop. The iPod had a menu
system much like other MP3 players of the time; they mainly made it a
well-organized, consumer-friendly menu system with a bigger screen. The iPhone's
main interface innovation I can think of was making the screen bigger; they
did use a virtual keyboard, but so did the Palm Pilot and the Apple Newton of
years before. The iPad, interface-wise, was basically a big iPod Touch.

Don't get me wrong; those are all very polished products, and they took a lot
of technical smarts. They were much more usable to a mainstream consumer
audience. But the non-Apple versions of those products were generally fine for
non-consumer audiences. And Apple's marketing is masterful; I've never seen a
tech company so good at generating hype.

