It doesn't seem like they actually provided or did anything new here. (Hard to tell, because there's no obvious code either. Is it just cachefsd?)
About a year and a half ago I did something similar with essentially the exact same hardware.
A BeeGFS filesystem across three IBM POWER8 Minsky nodes with NVMe drives, for distributing data quickly, in parallel, and effectively across multiple machines. It really helped with scripting batch jobs, since the filesystem was unified across the platform and had good in-memory caching.
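To give a sense of why the unified mount helped with batch scripting, here's a minimal sketch (my own illustration, not anything from the article; the mount point, node names, and process_shard.py script are all hypothetical): because every node sees the same namespace, the driver only has to build the file list once and can hand each node a slice of it.

    #!/usr/bin/env python3
    # Sketch of fanning a batch job out over nodes that all share one
    # BeeGFS mount. Paths and hostnames are made-up placeholders.
    import subprocess
    from pathlib import Path

    # Hypothetical BeeGFS mount point, visible identically on every node.
    BEEGFS_ROOT = Path("/mnt/beegfs/dataset")

    # Hypothetical worker nodes (e.g. the three Minsky boxes).
    NODES = ["node01", "node02", "node03"]

    def main() -> None:
        # The namespace is unified, so the file list is built once;
        # each node then gets every Nth file as its shard.
        files = sorted(str(p) for p in BEEGFS_ROOT.glob("*.dat"))
        procs = []
        for i, node in enumerate(NODES):
            shard = files[i::len(NODES)]
            # Launch the per-node work over ssh; process_shard.py stands
            # in for whatever the actual per-file processing is.
            procs.append(subprocess.Popen(
                ["ssh", node, "python3",
                 "/mnt/beegfs/scripts/process_shard.py", *shard]))
        for p in procs:
            p.wait()

    if __name__ == "__main__":
        main()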
I have the numbers somewhere, but I felt like theirs aren't really that great considering 40 Gbps Mellanox & NVMe; you should really be able to get quite a bit better throughput. I also ran thousands of jobs over many TB of data, not just over 24 gigabytes.
(FWIW, I didn't read the actual article that thoroughly.)