I've spent a lot of time writing Spark code, and its ability to store data in a column-oriented format in RAM is the only reason why: disk is goddamned slow.

As soon as you're touching the data more than once, sticking it in RAM when you first read it makes everything much faster.
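
To make that concrete, here's a rough plain-Python sketch of the idea (not Spark itself, where you'd reach for `cache()` or `persist()` on a DataFrame). The file contents, column names, and the `load_columns` helper are all made up for illustration: read from disk once into per-column lists, then run every later pass against memory.

```python
import csv
import os
import tempfile

def load_columns(path):
    """Read a CSV from disk once and return it as {column_name: [values]}."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        columns = {name: [] for name in reader.fieldnames}
        for row in reader:
            for name, value in row.items():
                columns[name].append(float(value))
    return columns

# Hypothetical sample data, written to a temp file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("price,qty\n1.5,2\n2.5,4\n")
    sample_path = tmp.name

cached = load_columns(sample_path)  # the only disk read
os.remove(sample_path)              # later passes never touch the disk again

# Two separate passes over the data, both served from RAM.
total_qty = sum(cached["qty"])
revenue = sum(p * q for p, q in zip(cached["price"], cached["qty"]))
```

The column layout means each pass only walks the lists it actually needs, which is a big part of why the in-memory version pays off once you have more than one pass.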

