
That's an artifact of the limited choices available. Plus the core HPC market is both notoriously inbred and tolerant of the low system reliability and high delicacy that the enterprise wouldn't let in the door.

And no, "single limited use case" I mean, is a relatively small number of very large files that need to be sequentially streamed out/in as fast as possible. That's a small subset of HPC.

GPFS is the gold standard but it's expensive.

Lustre is more a byproduct of the HPC vendors trying to get more margin by using open source. They've drunk from the poisoned chalice in that the amount of time, money, and effort required to get Lustre to acceptability (limited as that may be) has been far more than the GPFS license/support cost. Even from the brain-eating zombie that is today's IBM.




> GPFS is the gold standard but it's expensive.

Confirming! And going on a small tangent (ha!).

Our previous configuration was Lustre and XFS/NFS; the former was the scratch file system for HPC applications and the latter for home directories and what not.

Lustre was definitely a beast (in a good sense), but we'd occasionally get bit by a workload high in metadata operations, which would bring Lustre to its knees due to latency.

XFS/NFS was great for its purpose (no HPC workloads), but we'd also get bit by a user or users reading seemingly tiny, innocuous files (inadvertently in parallel), which would cause load averages to spike; surprisingly, the latency wasn't as bad as the Lustre workload mentioned above.

Not to drink the GPFS Kool-Aid here, but it's definitely solved both problems above. It has its issues, but it definitely handles the common I/O patterns seen on our cluster.


If I walked you through the Lustre metadata state machine you wouldn't be surprised by Lustre latency any longer. It's a veritable Rube Goldberg machine without the amusement.


I believe it.

At least the Lustre developers (pre-Intel) had the foresight to enable extremely good debugging - you could simply enable a few procfs settings and easily find the offenders.

It's still amusing to me, however, that the biggest offenders behind Lustre slow-downs were single-core processes. I'd check the RPC counts per node, find the violator, and then check per-PID statistics; it was almost always (95%+) a single-core application performing thousands of calls.
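
For illustration, a minimal sketch of that per-node RPC count check in Python, assuming the client exposes metadata RPC counters via lctl get_param -n mdc.*.stats with the usual "op_name count samples [unit]" lines; parameter names and formats vary by Lustre version, so treat the paths as assumptions rather than a definitive recipe:

    #!/usr/bin/env python3
    """Sketch: rank metadata RPC counters on a single Lustre client node."""
    import subprocess
    from collections import Counter

    def read_mdc_stats():
        # One counter blob per MDC device; data lines look roughly like:
        #   open    12345 samples [reqs]
        out = subprocess.run(
            ["lctl", "get_param", "-n", "mdc.*.stats"],
            capture_output=True, text=True, check=True,
        ).stdout
        counts = Counter()
        for line in out.splitlines():
            fields = line.split()
            # Skip headers (e.g. snapshot_time) and anything non-numeric.
            if len(fields) < 2 or not fields[1].isdigit():
                continue
            counts[fields[0]] += int(fields[1])
        return counts

    if __name__ == "__main__":
        for op, count in read_mdc_stats().most_common(10):
            print(f"{op:20s} {count}")

Once a node stands out, the same idea extends to whatever per-process statistics the installed Lustre version exposes under procfs, which is how you'd pin down the offending PID.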

We never did experiment with the 2.x branch of the software. I recall one of our co-workers saying that even the developers did not consider the dual-MDS setup production-ready at that time.



