You can just write Spark SQL, set the executor memory to whatever the machine can spare, and not worry about whether the data fits in RAM.

Spark will use RAM first and then spill to disk as needed.
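
If it helps, here is a rough sketch of how you might pick the memory numbers. The helper name `executor_memory_settings` and the 2 GB reserved for the OS are my own assumptions, not anything Spark provides. It also assumes a cluster manager such as YARN or Kubernetes, where Spark by default adds an overhead of 10% of executor memory (minimum 384 MiB) on top of the heap, so the two together should fit in what the machine has.

```python
def executor_memory_settings(machine_ram_gb: int, reserved_for_os_gb: int = 2) -> dict:
    """Hypothetical helper: split a machine's RAM into Spark heap and overhead.

    Assumes one executor per machine and Spark's default overhead rule
    (10% of executor memory, at least 384 MiB).
    """
    usable_mb = (machine_ram_gb - reserved_for_os_gb) * 1024
    if usable_mb <= 384:
        raise ValueError("not enough RAM left for an executor")

    # Pick the heap so that heap + 10% overhead fits in the usable RAM.
    heap_mb = int(usable_mb / 1.10)
    overhead_mb = max(384, usable_mb - heap_mb)

    # On small machines the 384 MiB floor applies, so shrink the heap to fit.
    if heap_mb + overhead_mb > usable_mb:
        heap_mb = usable_mb - overhead_mb

    return {
        "spark.executor.memory": f"{heap_mb}m",
        "spark.executor.memoryOverhead": f"{overhead_mb}m",
    }


if __name__ == "__main__":
    # Example: a 16 GB worker machine.
    for key, value in executor_memory_settings(16).items():
        print(f"--conf {key}={value}")
```

The printed pairs can be passed to `spark-submit` as `--conf` flags, or set with `SparkSession.builder.config(key, value)`. After that you just run your SQL and let Spark decide what stays in memory and what goes to disk.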

