Agreed, but note that at high scale, stress-testing isn't as easy as it sounds.
For example, at an ad-serving company I worked with, stress-testing the ad servers required a botnet of hundreds of machines in order to approximate the load of millions of real users.
The hardware required to run a stress test can easily be 25% or 50% of the hardware required to run your production environment. It's not just a workstation with JMeter.
Another challenge is the difference between volume of traffic and diversity of traffic. Naively set-up load tests tend to be really good tests of your caching mechanism and non-tests of other parts of your system.
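For example, here's a tiny sketch of generating a diverse URL mix for a load test instead of hammering one page over and over (the host, URL patterns, and traffic ratios are all invented for illustration, not taken from the article):

    # Sketch: generate a diverse URL mix so the test exercises more than the cache.
    # The host, paths, and traffic ratios below are placeholders.
    import random

    BASE = "http://test-target.example.com"
    search_terms = ["shoes", "laptop", "coffee", "bike", "lamp"]

    def random_url():
        r = random.random()
        if r < 0.3:
            return f"{BASE}/"                                      # hot, highly cacheable page
        if r < 0.8:
            return f"{BASE}/product/{random.randint(1, 100_000)}"  # long-tail pages
        q = random.choice(search_terms)
        return f"{BASE}/search?q={q}&page={random.randint(1, 20)}" # uncached searches

    # Dump a URL list that a tool like JMeter or siege can replay.
    with open("urls.txt", "w") as f:
        for _ in range(100_000):
            f.write(random_url() + "\n")

Feeding a list like that into your load tool spreads the requests across hot pages, long-tail pages, and uncacheable queries, instead of just measuring how fast your cache can serve the homepage.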
At high scale, you should have a few grand to pay Gomez for some real-world load testing without having to buy your own hardware. Also, it sounds like the author did not even try a "workstation with JMeter". I am by no means a scalability expert, but even I know better than to stress-test on a live system.
Sure, it's a different story if you have a server farm and need a client farm to test it. But in his scenario you could use an off-the-shelf laptop to saturate the local link with requests...
Even if it's not easy to tell the min/max bounds in high-scale deployments, it's possible to give an approximate value by measuring the individual components. You have load balancers that don't do more than X req/s, servers that take X ms/req, databases that can do X q/s, and caches with a delay of X ms on X% of requests. That gives you a rough estimate - then it's about locating the bottlenecks which prevent you from reaching those numbers. (Simplified model.)
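To make that concrete, a back-of-the-envelope version of that model might look like this (every number below is made up purely for illustration):

    # Back-of-the-envelope capacity estimate: each component's sustainable
    # throughput in req/s; the system's ceiling is the minimum, i.e. the bottleneck.
    num_app_servers = 40
    ms_per_request = 25                                          # average service time per app server

    throughput = {
        "load_balancer": 10_000,                                 # max req/s it can forward
        "app_servers": num_app_servers * 1000 / ms_per_request,  # 40 * 40 = 1600 req/s
        "database": 1_500,                                       # q/s, assuming ~1 query per request
    }

    bottleneck = min(throughput, key=throughput.get)
    print(f"Estimated ceiling: {throughput[bottleneck]:.0f} req/s (bottleneck: {bottleneck})")

It won't replace a real stress test, but it tells you roughly what number the test should be able to hit, and which component to look at first when it doesn't.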
It's actually pretty amazing to see how many people don't put their systems under any kind of stress to validate their performance needs. The place I work at now never took the time to do it and things exploded. Good buddies at start-ups launched sites with no performance metrics and exploded.
I could not agree more. But I have learnt that not many clients have the money for a hardware solution. One other thing I have learnt: stay away from Pound etc., just stick to LVS.
A solution that involves something other than software isn't automatically buy-vs-build, nor an appliance.
Broadly, to me, it's usually about choosing the right hardware and assembling/configuring it in a "custom"[1] way. This is usually significantly cheaper than trying to get anything off-the-shelf to scale.
stay away from Pound etc., just stick to LVS
I've learned to stay away from anything with a heavy dependency on kernel code. My preference is HAProxy.
[1] Optimized for the problem at hand, which, usually, is remarkably commonplace, just nowhere near as commonplace as a lowest common denominator web server.
I'd KILL to have scalability issues... we had one band that said they were likely to sell several hundred thousand copies of their single using BitBuffet, which got me all excited, but that never came to pass. I honestly believed it might happen because: 1) they were donating the money to charity, 2) it was pay-what-you-want, and 3) they were pretty well known.
I was actually a little sad he didn't kill my cute little Linode server...
I'm a pretty big noob in the scaling world, but as I read the article I kept thinking, "With services out there like Heroku, isn't scaling something of a solved problem (at least with a Rails app)?" Do web startups still manually handle the scaling process?
Well, one reason is that Heroku and the like don't allow for all the options and configurations you might run into. If your app requires some obscure back-end software, then you kind of have to host (at least part of) it yourself.