But small fields won't compress so well on their own. Often there's a lot of redundancy across records for the same fields which is great for compression. This might be a great way to achieve the benefits of field name tokenization too (which is similar to part of how most compressors work). I'd like to see block compression, rather than field compression .
_hopefully_ the mmap interface would provide the best of all worlds: mongodb continues to be simple with respect to how it handles getting data from disk and the kernel/fs can do it's magic behind the scenes of mmap. Of course, it could be that mmap + compressed filesystem leads to some unexpected (and bad) perf results. But then again, I've never tried :) have you?