However, unlike the author, I'm serving 25,000 real requests per second with only 8 dynos.
The app is written in Scala and runs on the JVM. Even so, I was dissatisfied: 8 dynos seems like too many for an app that can serve over 10K requests per second on my localhost.
The traffic comes from an integration with an OpenRTB bidding marketplace.
And 25K is not the whole story. In a lot of ways it's similar to high-frequency trading. Not only do you need to decide in real time whether to respond with a bid, but the total response time should be under 200ms, preferably under 100ms. Otherwise they start bitching about latency, and they could drop you from the exchange.
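To make the latency constraint concrete, here is a minimal sketch of one way to enforce a response budget on the JVM: race the bidding logic against a timer and fall back to a no-bid when the timer wins. This is not our production code. The `Decision`, `Bid`, `NoBid` and `DeadlineBidder` names are made up for illustration and are not OpenRTB types, and the budget is whatever you pass in.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

// Hypothetical types, for illustration only (not the real OpenRTB objects).
sealed trait Decision
final case class Bid(priceCpm: BigDecimal) extends Decision
case object NoBid extends Decision

object DeadlineBidder {
  // One daemon thread that only fires timeouts; it never runs bidding logic.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "bid-deadline-timer")
      t.setDaemon(true)
      t
    }
  })

  /** Completes with the outcome of `work` if it finishes within `budget`,
    * otherwise with NoBid. The slow computation is not cancelled, only ignored.
    */
  def withDeadline(work: Future[Decision], budget: FiniteDuration)
                  (implicit ec: ExecutionContext): Future[Decision] = {
    val timeout = Promise[Decision]()
    timer.schedule(
      new Runnable { def run(): Unit = { timeout.trySuccess(NoBid); () } },
      budget.toMillis,
      TimeUnit.MILLISECONDS
    )
    // Whichever future completes first decides the response.
    Future.firstCompletedOf(Seq(work, timeout.future))
  }
}
```

Something like `DeadlineBidder.withDeadline(evaluate(request), 100.millis)` would then always answer within roughly the budget, assuming `evaluate` is your own (hypothetical) function returning a `Future[Decision]`. The budget has to leave room for network time, since the exchange measures the total round trip.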
And the funny thing is that 25K is actually nothing compared to the bigger marketplace we are pursuing, which will probably send us 800K requests per second at peak.
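For a sense of scale, here is the back-of-the-envelope arithmetic as a small Scala sketch. It assumes throughput scales linearly with the number of dynos, which is an assumption and not something I have measured, and the `Capacity` object and its member names are made up for illustration.

```scala
object Capacity {
  // Figures taken from the numbers quoted above.
  val currentRps: Int = 25000
  val currentDynos: Int = 8
  val targetPeakRps: Int = 800000
  val localhostRps: Int = 10000 // "over 10K" on localhost, so a lower bound

  // Average load each dyno handles today.
  val rpsPerDyno: Int = currentRps / currentDynos

  // Ceiling division: instances needed to absorb targetRps at perInstanceRps each.
  def instancesNeeded(targetRps: Int, perInstanceRps: Int): Int =
    (targetRps + perInstanceRps - 1) / perInstanceRps
}
```

With these numbers, `Capacity.rpsPerDyno` is 3125, `Capacity.instancesNeeded(800000, 3125)` is 256 dynos at today's per-dyno rate, and `Capacity.instancesNeeded(800000, 10000)` is 80 if each dyno could match the localhost figure.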