This is one of the things that has generally frustrated us about the Hadoop ecosystem. Hadoop's implementation of MapReduce, which only supports batch processing, has become conflated with MapReduce as a paradigm, which can be used for far more than batch processing.
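As a rough illustration of the paradigm point, here is a toy Python sketch. It is not Hadoop's API and not our actual implementation; the function names (`map_words`, `reduce_counts`, `batch_count`, `streaming_count`) are made up for the example. The same map and reduce functions are driven once as a batch job and once incrementally, one record at a time:

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(acc: Dict[str, int], pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: fold (key, value) pairs into the running totals."""
    for key, value in pairs:
        acc[key] += value
    return acc


def batch_count(lines: List[str]) -> Dict[str, int]:
    """Batch style: consume the whole input, then return one final result."""
    acc: Dict[str, int] = defaultdict(int)
    for line in lines:
        reduce_counts(acc, map_words(line))
    return dict(acc)


def streaming_count(lines: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Streaming style: same map and reduce, but yield a snapshot after
    each record instead of waiting for the input to end."""
    acc: Dict[str, int] = defaultdict(int)
    for line in lines:
        reduce_counts(acc, map_words(line))
        yield dict(acc)
```

Only the driver differs between the two: `batch_count` waits for all the input, while `streaming_count` produces an up-to-date result after every record. The map and reduce functions themselves know nothing about batching.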
Hope this clears a few things up!
Our storage layer is built on top of btrfs. We haven't put together comprehensive benchmarks yet, but our experience with btrfs is that there isn't a meaningful penalty to reading a batch of data from a commit-based system compared to more traditional storage layers. I really wish I had concrete performance numbers to give you, but we haven't had the time to put them together yet. I will say that our observations match btrfs' reported performance numbers and are what I'd expect given the algorithmic complexity of their system.