You don't really get into why mmap is an unpopular choice. It's not as if other programmers just forgot to read the man page. Traditional RDBMSs dislike the OS's buffer cache because the DBMS has information that could better drive the caching algorithms; e.g., streaming data should not be cached, and should not compete with useful items in the cache. The page replacement algorithm is similarly blind; yeah, madvise exists, but it rarely has teeth. mmap is convenient, and performant enough. But if you found yourself driving hard to get the last 1% of performance out of this system, I would argue that you'd end up doing explicit file I/O and manual management of memory; e.g., the only way to use large pages to reduce TLB misses on popular OSes is to use funky APIs like hugetlbfs on Linux.
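To make the contrast above concrete, here is a rough sketch (not a benchmark, and not anyone's production code) of the two styles: scanning a file through mmap with an madvise hint, versus explicit reads into a single reusable buffer so the application decides what stays in memory. The function names and the CHUNK size are made up for illustration; the madvise and posix_fadvise calls assume a Unix-like system with Python 3.8+ and are skipped where unavailable.

```python
import mmap
import os

CHUNK = 1 << 20  # 1 MiB per step; an arbitrary illustrative size


def sum_bytes_mmap(path):
    """Scan a file through mmap, hinting sequential access where supported."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Purely advisory: the kernel is free to ignore this hint.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                try:
                    mm.madvise(mmap.MADV_SEQUENTIAL)
                except OSError:
                    pass
            total = 0
            for off in range(0, size, CHUNK):
                total += sum(mm[off:off + CHUNK])
            return total


def sum_bytes_read(path):
    """Scan a file with explicit reads into one buffer that we reuse."""
    buf = bytearray(CHUNK)
    view = memoryview(buf)
    total = 0
    # buffering=0 avoids Python's own buffer layer; the OS page cache
    # is still involved unless something like O_DIRECT is used.
    with open(path, "rb", buffering=0) as f:
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass
        while True:
            n = f.readinto(buf)
            if not n:
                break
            total += sum(view[:n])
    return total
```

Both functions return the same byte sum. The difference is who owns the memory policy: in the mmap version the kernel's page replacement decides what stays resident, while in the read version the application's working memory is bounded by the one buffer it allocated.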

Also, a pet peeve: mmap != "memory-mapped I/O." The latter refers to a style of hardware/software interface where device registers are accessed via loads and stores, rather than magical instructions. If you're not writing a device driver, you don't know or care whether you're using "memory-mapped I/O". mmap is ... just mmap.

I'd be interested in knowing more about why it's unpopular. I'm a fan of mmap() because I like the way it can simplify my code, and so far I've been pleased with the speed as well. But if there are subtle downsides I'd love to be aware of them. My instinct was that mmap() isn't used much because it's relatively new, and because it's traditionally had poor support on Windows.

I'm primarily a Linux user, but the best discussion I was able to find with a quick search was this exchange on freebsd-questions from several years ago: http://lists.freebsd.org/pipermail/freebsd-questions/2004-Ju...

Do you know of any updated articles about its performance tradeoffs?
