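The problem here, of course, is that a CGI-like approach does a fork plus execve for each request, so very little memory ends up shared: the execve replaces the process image, throwing away the pages the child would otherwise have shared with its parent.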

With a simple forking socket-based server on Linux (I assume OS X is no different), the amount of memory per process is much lower, because fork uses copy-on-write pages and each child is largely the same process as its parent.
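
To make that concrete, here is a minimal sketch of such a forking server in Python. The names `SHARED_BLOB`, `handle` and `serve` are made up for illustration, and the toy protocol (send a number, get back one byte of the blob) is just an assumption to keep the example short. The point is only that the expensive data is built once in the parent and each request is served by a plain fork, with no execve, so the children can keep sharing the parent's pages.

```python
import os
import signal
import socket

# Hypothetical stand-in for whatever is expensive to load. It is built once
# in the parent, before any fork, so every child inherits it copy-on-write.
SHARED_BLOB = bytes(i % 251 for i in range(1_000_000))


def handle(conn):
    """Runs in the child. It only reads SHARED_BLOB, so the kernel can keep
    nearly all of the blob's pages shared with the parent."""
    with conn:
        data = conn.recv(64).strip()
        index = int(data or b"0") % len(SHARED_BLOB)
        conn.sendall(b"%d\n" % SHARED_BLOB[index])


def serve(listener, max_requests=None):
    """Accept connections and fork one child per request (no execve)."""
    # Ask the kernel to reap finished children so they do not become zombies.
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    served = 0
    while max_requests is None or served < max_requests:
        conn, _addr = listener.accept()
        pid = os.fork()
        if pid == 0:
            # Child: same program image as the parent, memory shared
            # copy-on-write until one side writes to a page.
            try:
                listener.close()
                handle(conn)
            finally:
                os._exit(0)
        # Parent: the child owns the connection now.
        conn.close()
        served += 1
```

To try it, create a listening TCP socket on a port of your choice and pass it to `serve`. A CGI-style server would call one of the `os.exec*` functions in the child instead of `handle`, and at that point the inherited pages are gone.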

