
Suppose 100 processes on the same box call the same function in glibc. If each process had its own redundant copy of it mapped into memory, the code would occupy 100 sets of physical pages instead of one shared set, which means far more cache and TLB misses than if they all shared the same pages.
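Here's a minimal Linux-specific sketch of what that sharing looks like; it just scans the current process's /proc/self/maps for anything with "libc" in the path (the substring match is an assumption about how the library file is named on a given distro). Every dynamically linked process shows libc mapped from the same file, which is what lets the kernel back the read-only and executable segments with a single set of physical pages:

    /* Print the libc mappings of the current process (Linux only).
     * Dynamically linked processes all map libc from the same file,
     * so the kernel can share the physical pages behind the
     * read-only and executable segments. */
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        FILE *maps = fopen("/proc/self/maps", "r");
        if (!maps) {
            perror("fopen");
            return 1;
        }
        char line[512];
        while (fgets(line, sizeof line, maps)) {
            if (strstr(line, "libc"))  /* e.g. /usr/lib/x86_64-linux-gnu/libc.so.6 */
                fputs(line, stdout);
        }
        fclose(maps);
        return 0;
    }

Run it from a few different processes and the same file-backed mapping shows up in each one; a statically linked binary shows nothing, because its copy of the libc code lives in its own private text segment.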

I get that only a handful of libraries are heavily shared, but that handful accounts for most of the mappings, and performance will suffer if those libraries are always statically linked. This matters less if the program runs in a container (which isn't sharing anyway) or is launched as a throwaway AWS Lambda. I dunno, maybe dynamic linking isn't relevant anymore now that most Linux programs are meant to run in containers on an expensive cloud with tons of memory.

> In practice I much prefer the microservice model with a formal IPC API.

This converts a simple and solved problem, sharing memory, into a client-server distributed systems problem. There is no need. It's all on the same box.
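For what it's worth, same-box sharing really is just a few syscalls. A minimal sketch using POSIX shared memory (the region name "/example_region" and the 4 KiB size are made up for illustration; on older glibc you may need to link with -lrt):

    /* One process creates and maps a named shared-memory region; any other
     * process on the same box can shm_open the same name and see the same
     * bytes. No sockets, no serialization, no client-server machinery. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        const char *name = "/example_region";   /* arbitrary name for the sketch */
        size_t len = 4096;

        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, (off_t)len) < 0) { perror("ftruncate"); return 1; }

        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        strcpy(p, "visible to every process that maps this region");
        printf("%s\n", p);

        munmap(p, len);
        close(fd);
        shm_unlink(name);  /* cleanup; real cooperating processes would coordinate this */
        return 0;
    }

Of course this only gets the bytes shared; coordinating concurrent access to them is a separate problem.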




I would highly recommend you shift your thinking & consider even something as basic as threads to be a distributed systems problem. The distributed systems field has a much richer and more successful history of designing robust systems at scale (not just in terms of compute resources, but also in terms of the number of people working on the problem & the number of components). Sharing memory is actually not a simple & solved problem, whereas distributed systems problems generally have lots of formal methods & tools for dealing with the unreliability of any given component.

That also ignores that measuring the cost of a distributed component is fundamentally simpler than measuring the cost of a library mapped into 50 million processes.
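To make the measurement point concrete, here's a minimal Linux-specific sketch that sums the Pss ("proportional set size") the kernel attributes to libc mappings in the current process; Pss splits the cost of every shared page evenly across the processes mapping it, so a library's apparent per-process cost changes whenever the number of sharers does (the "libc" substring match is again just an assumption about the file name):

    /* Sum the Pss the kernel attributes to libc mappings in this process.
     * Pss divides each shared page's size by the number of processes
     * mapping it, which is why "what does this library cost?" has no
     * fixed per-process answer. */
    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        FILE *smaps = fopen("/proc/self/smaps", "r");
        if (!smaps) { perror("fopen"); return 1; }

        char line[512];
        int in_libc = 0;
        long pss_kb = 0, kb;

        while (fgets(line, sizeof line, smaps)) {
            /* Mapping header lines contain a hex address range like
             * 7f1a2b000000-7f1a2b021000; the "Key: value" lines that follow do not. */
            if (isxdigit((unsigned char)line[0]) && strchr(line, '-'))
                in_libc = strstr(line, "libc") != NULL;
            else if (in_libc && sscanf(line, "Pss: %ld kB", &kb) == 1)
                pss_kb += kb;
        }
        printf("Pss attributed to libc here: %ld kB\n", pss_kb);
        fclose(smaps);
        return 0;
    }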

Shared libraries can be useful but their value is often vastly overstated.



