
There's also the "not all memory is RAM" trick: plan ahead with enough swap to fit all the data you intend to process, and just pretend that you have enough RAM. Let the virtual memory subsystem worry about whether or not it fits in RAM. Whether this works well or horribly depends on your data layout and access patterns.

Don't even need to do that. Just mmap it and the virtual memory system will handle it.
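For anyone who wants to see what that looks like in practice, here is a minimal sketch in Python using the standard `mmap` module. The helper names (`sum_bytes_mmap`, `make_demo_file`) and the 1 MiB chunk size are made up for illustration. The point is only that the file is mapped, not read into a buffer up front, and the kernel decides which pages are resident at any moment.

```python
import mmap
import os
import tempfile


def sum_bytes_mmap(path, chunk_size=1 << 20):
    """Sum every byte of a non-empty file through a read-only memory map.

    Hypothetical helper for illustration. Pages are faulted in as the
    slices below touch them, and the kernel may evict them again under
    memory pressure, so the file can be larger than physical RAM.
    """
    total = 0
    with open(path, "rb") as f:
        # Length 0 means "map the whole file". Mapping an empty file raises
        # ValueError, so this sketch assumes the file is not empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), chunk_size):
                # Each slice copies at most chunk_size bytes out of the map.
                total += sum(mm[offset:offset + chunk_size])
    return total


def make_demo_file(size):
    """Write `size` bytes of a repeating 0..255 pattern to a temp file."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(i % 256 for i in range(size)))
    return path


if __name__ == "__main__":
    demo_path = make_demo_file(4096)
    try:
        print(sum_bytes_mmap(demo_path))
    finally:
        os.remove(demo_path)
```

The access pattern here is sequential, which is the friendly case. Random access over a file much bigger than RAM goes through the same code path but can be far slower, which is the "depends on your data layout and access patterns" caveat from the parent comment.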

Interesting. Can you provide some examples of where this is the correct approach?

This is how MongoDB originally managed all its data. It used memory-mapped files to store the data and let the underlying OS memory management facilities do what they were designed to do. This saved the MongoDB devs a ton of complexity in building their own custom cache and let them get to market much faster. The downside is that since the OS page cache is shared between processes, other competing processes can mess with your working set (pushing warm data out, etc.). The other downside is that since you're turning over the management of that “memory” to the OS, you lose the fine-grained control that could be used to optimize for your specific use case.

Except nowadays with Docker / Kubernetes you can safely assume the DB engine will be the only tenant of a given VM, pod, or whatever, so I think it’s better to let the OS do memory management than to fight it.

Might not be exactly the same use case, but a simple example is compiling large libraries on constrained/embedded platforms. Building OpenCV on a Pi certainly used to require adding a gig of swap.

With the Varnish HTTP cache the authors started out with a very "mmap or bust" type of approach, but later added a malloc-based backend.
