
It's pretty easy if you're keeping everything in RAM and don't have layers upon layers of frameworks. You've got about 3B cycles per second to play with, with a register reference taking ~1 cycle, L1 cache (48K) = 4 cycles, L2 cache (512K) = 10 cycles, L3 cache (8M) = 40 cycles, and main memory = 100 cycles. This entire comment page is 129K, gzipping down to 19K; it fits entirely in L2 cache. It's likely that all content ever posted to HN would fit in 128G of RAM (for reference, that's about 64 million typewriter pages). With about 30M random memory accesses being possible per second (3B cycles / 100 cycles per access), or about 7 billion consecutive memory accesses (you gain a lot from caching), it's pretty reasonable to serve several thousand requests per second out of memory.
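To make the arithmetic above concrete, here is a rough sketch in Python. The clock rate, latencies, and sizes are the comment's own ballpark figures; the 5,000 requests/second used at the end is a hypothetical stand-in for "several thousand", and everything assumes a single core doing strictly dependent accesses.

```python
# Back-of-envelope figures from the comment above. All are rough estimates,
# not measurements of any particular CPU.
CYCLES_PER_SECOND = 3_000_000_000  # ~3 GHz, one core

LATENCY_CYCLES = {
    "register": 1,
    "L1": 4,
    "L2": 10,
    "L3": 40,
    "main memory": 100,
}

L2_BYTES = 512 * 1024    # L2 size quoted in the comment
PAGE_BYTES = 129 * 1024  # uncompressed size of the comment page


def accesses_per_second(latency_cycles: int) -> int:
    """Dependent accesses per second if each one costs latency_cycles."""
    return CYCLES_PER_SECOND // latency_cycles


def random_accesses_per_request(requests_per_second: int) -> int:
    """Random main-memory accesses each request can afford on one core."""
    return accesses_per_second(LATENCY_CYCLES["main memory"]) // requests_per_second


if __name__ == "__main__":
    for level, cycles in LATENCY_CYCLES.items():
        print(f"{level:>12}: {accesses_per_second(cycles):>13,} accesses/s")
    print("page fits in L2:", PAGE_BYTES <= L2_BYTES)
    # 5_000 is an assumed request rate, not a number from the comment.
    print("random accesses per request:", random_accesses_per_request(5_000))
```

Even at that assumed rate, each request gets a budget of thousands of cache-missing lookups, which is far more than rendering one page from in-memory data should need.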

For another data point, I've got a single box processing every trade and order coming off the major cryptocurrency exchanges, roughly 3000 messages/second. And the webserver, DB persistence, and a bunch of price analytics also run on it. And it only hits about 30% CPU usage. (Ironically, the browser's CPU often does worse, because I'm using a third-party charting library that's graphing about 7000 points every second and isn't terribly well optimized for that.)
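As a rough sanity check on those numbers, the sketch below works out the cycle budget per message. It assumes a single ~3 GHz core (the comment doesn't say how many cores the box has) and charges all of the 30% CPU usage to message handling, so the result is an upper bound that also covers the webserver, persistence, and analytics.

```python
# Assumptions (not stated in the comment): one core at ~3 GHz, and the
# whole 30% CPU figure is attributed to handling messages.
CYCLES_PER_SECOND = 3_000_000_000
CPU_USAGE_PERCENT = 30
MESSAGES_PER_SECOND = 3_000


def cycles_per_message(
    cycles_per_second: int = CYCLES_PER_SECOND,
    cpu_usage_percent: int = CPU_USAGE_PERCENT,
    messages_per_second: int = MESSAGES_PER_SECOND,
) -> int:
    """Average CPU cycles spent per message, using integer arithmetic."""
    used_cycles = cycles_per_second * cpu_usage_percent // 100
    return used_cycles // messages_per_second


if __name__ == "__main__":
    print(f"{cycles_per_message():,} cycles per message")
```

Under those assumptions that's on the order of 300,000 cycles per message, so there is plenty of headroom even with everything else running on the same machine.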


Software gets slow because it has a lot of wasteful layers in between. Cut the layers out and you can do pretty incredible things on small amounts of hardware.

Would love to see a blog post on building it sometime. I think the last vaguely similar (but serious) rundown in this vein was “One process programming notes”.

