For another data point: I've got a single box processing every trade and order coming off the major cryptocurrency exchanges, roughly 3000 messages/second. The webserver, DB persistence, and a bunch of price analytics also run on it, and it still only hits about 30% CPU usage. (Ironically, the browser side often fares worse on CPU, because I'm using a third-party charting library that's graphing about 7000 points every second and isn't terribly well optimized for that.)
Software gets slow because it piles up a lot of wasteful layers. Cut the layers out and you can do pretty incredible things on small amounts of hardware.