
Amazon FSx for Lustre - mcrute
https://aws.amazon.com/blogs/aws/new-amazon-fsx-for-lustre/
======
davidmr
This was inevitable, and I’m sure the engineering from Amazon’s side is
impressive because Lustre is an absolute beast to run well at scale, but I’m
not sure how great an idea it is for most people.

Coming from an academic HPC background and then moving into the private
sector, I’ve mostly come to believe that parallel filesystems (especially
POSIX-compliant ones) are rarely the right solution outside of MPI
simulations. Like NFS, they make it extremely easy and attractive to implement
anti-patterns like using the filesystem for IPC or generating a bazillion
files and then needing to reduce them to move to the next stage of the
pipeline. In my experience, it’s rare that people don’t regret doing that sort
of stuff in the long run.

That said, I’m sure the AWS team knows their customers and what they’re doing
better than I do!

------
pinewurst
That would have to be the worst job in the world - keeping Lustre going as an
Amazon service, with management that utterly lacks understanding and sympathy.

~~~
davidmr
They’d have to pay me a lot of money to do it, that’s for sure. I’d love to
see the disaster recovery plans. Every major Lustre site I’m aware of has had
a data loss “incident” at some point in their history. It’s possible AWS has
it all figured out with background backups and block device replication and
whatnot, but I’m skeptical.

~~~
mbreese
Given that they call the non-S3 linked version 'ephemeral', I'm not sure there
is a plan. I think S3 _is_ the plan.

~~~
pinewurst
'Ephemeral' was/is the original Lustre design model. It was intended for high
performance swap/scratch at Livermore with a short data lifespan - your higher
priority bomb sim forces mine to roll out to disk and back in later, and
that's it. Lustre, even today, isn't long-term stable. The longer you leave
data on it, the greater the probability of corruption - even silently.

~~~
mbreese
I've seen Lustre backed with ZFS listed in a few places. Is the idea here to help
mitigate the possibility of corruption?

~~~
agapon
LLNL is the core force behind ZoL and it's primarily them who use ZFS-backed
Lustre.

~~~
pinewurst
I think ZoL is LLNL’s attempt to make up for inflicting Lustre on us.

------
fold_left
finally, something that can handle node_modules! :)

~~~
tardismechanic
Ayyy!

------
damnhungry
I don't mean to be cynical, but for the last few days someone has been
bombarding us with a lot of Amazon news.

~~~
SSilver2k2
This week is the AWS re:Invent conference, so every day new services are being
introduced.

~~~
damnhungry
okay, my apologies :)

~~~
SSilver2k2
it's all good :)

