

7 Stages of Scaling Web Applications - vp
http://www.slideshare.net/davemitz/7-stages-of-scaling-web-applications?type=powerpoint

======
tdavis
The only change I would make to the "stages" is to add caching up front. Most
modern web frameworks make caching pretty trivial, which makes it less of a
pain than setting up load balancers, multiple web servers, etc. Static content
should also use caching headers and be served by a separate server from
dynamic content; this is also a trivial improvement.
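
As a rough illustration of both kinds of caching mentioned above, here is a minimal Python sketch. It is not taken from any particular framework: `ttl_cache`, `static_cache_headers`, and `render_front_page` are hypothetical names, and a real deployment would more likely use the framework's own cache backend.

```python
import time
from functools import wraps

def ttl_cache(seconds, clock=time.monotonic):
    """Cache a function's results for `seconds`, keyed by positional args."""
    def decorator(func):
        store = {}  # args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator

def static_cache_headers(max_age_seconds):
    """Headers telling browsers and proxies they may reuse a static file."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}

calls = []  # records how often the expensive path actually runs

@ttl_cache(seconds=60)
def render_front_page(section):
    calls.append(section)  # stands in for database queries and templating
    return f"<h1>{section}</h1>"
```

Calling `render_front_page("news")` twice within a minute runs the expensive body only once; the second call is served from the in-process cache.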

With only a single web server, database server, and static content server,
plus liberal use of caching and smart querying, our benchmarks indicate we can
sustain around 1900 req/sec. Even if that isn't entirely accurate, our
reaching the 82 million page views per month needed to find out is all but
inconceivable.
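
A quick sanity check on these figures, assuming a 30-day month, evenly spread load, and one request per page view (simplifications that are not stated in the comment). Under those assumptions, 82 million page views per month is roughly 32 req/sec sustained. The 82 million figure also happens to equal 1900 multiplied by the number of minutes in a month, so it may have been computed per minute rather than per second.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30  # 30-day month: 2,592,000 seconds

def monthly_requests(req_per_sec):
    """Total requests in a month at a constant per-second rate."""
    return req_per_sec * SECONDS_PER_MONTH

def sustained_rate(requests_per_month):
    """Average per-second rate needed to reach a monthly total."""
    return requests_per_month / SECONDS_PER_MONTH

print(monthly_requests(1900))                # 4924800000
print(round(sustained_rate(82_000_000), 1))  # 31.6
print(1900 * 60 * 24 * 30)                   # 82080000 (1900 per *minute*)
```

Either way the conclusion holds: the benchmarked capacity is far above the traffic in question.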

~~~
alecco
1900 req/sec? What kind of requests?

Dynamic content is typically well below 300 req/s per core, and that drops
quickly depending on many factors (e.g. a non-local database). Most web
frameworks are below the 300 req/s mark. Of course, it is a very complex
subject with many things to consider. Good cache management, a CDN, SSI, and
a reverse proxy cache come to mind.

As for databases, they can become a problem if you do writes on the same
database, as locking starts to show its dangerous tentacles.

------
jwilliams
Great post! It points out up front that performance != scale.

The other scaling point is concurrency. You can't (for example) serially add
features to a page. Eventually that page will simply take too long to render.

A lot of "dashboard" pages suffer from this problem. If you reach that stage
you need to start thinking about decoupling features - using things like
messaging.
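
A minimal Python sketch of that decoupling, under stated assumptions: an in-process `queue.Queue` stands in for a real message broker, and `compute_widget`, `render_dashboard`, and the widget names are hypothetical. The page render never computes a feature itself; it only shows whatever the workers have already produced.

```python
import queue
import threading

jobs = queue.Queue()  # stands in for a real message broker
results = {}          # widget name -> latest computed fragment
results_lock = threading.Lock()

def compute_widget(name):
    # Hypothetical slow feature (report, feed, stats query).
    return f"[{name} ready]"

def worker():
    while True:
        name = jobs.get()
        if name is None:  # sentinel: shut the worker down
            jobs.task_done()
            break
        fragment = compute_widget(name)
        with results_lock:
            results[name] = fragment
        jobs.task_done()

def render_dashboard(widgets):
    # Render time no longer grows with the cost of each feature.
    with results_lock:
        return [results.get(w, f"[{w} loading]") for w in widgets]

widgets = ["sales", "signups", "alerts"]
before = render_dashboard(widgets)  # nothing computed yet

t = threading.Thread(target=worker, daemon=True)
t.start()
for w in widgets:
    jobs.put(w)
jobs.join()  # wait here only so the demo is deterministic
after = render_dashboard(widgets)
jobs.put(None)
t.join()
```

Adding another widget then means adding another message and worker rather than another serial step in the page render.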

~~~
Retric
For most companies performance might as well be scale. For example, Slashdot
uses 4 machines. Now, if you are serving up a lot of media, that changes
things, but a single web server can handle a _lot_ of traffic if your app is
reasonable.

PS: This is a presentation by Rackspace; of course they are going to say write
bad software and buy lots of hardware.

~~~
jwilliams
I'll admit that, depending on your situation, you may be more concerned with
performance than scalability, but that doesn't make them the same thing.

Scale is getting higher throughput by adding resources; performance is
getting higher throughput using the resources you already have.

------
alecco
I agree on many points, such as performance and scalability not being the
same, but his traditional approach of just making a bigger clustered database
is not good. It's like fixing the bottleneck by making a gigantic bottle: it'll
work, but the cost will be many times what was really necessary.

If any single component is a critical bottleneck, the best option is either to
replace it or to redesign so that you no longer require it.

The same goes for the "more memory" approach.

~~~
litewulf
The day your website starts generating more traffic than one server can
handle, you need to do something _very quickly_ to make sure it all stays up.

So I think it's often reasonable to pay the exorbitant prices as a stopgap
before you can start doing any of the more complex scalability rearchitecting.

~~~
alecco
That's a very addictive and self-validating reasoning path. Beware.

~~~
litewulf
Scaling is painful and requires a good deal of upfront work. I can't guarantee
that _anyone_ will use stuff that I make, so I'd rather improve performance
when I actually need to.

It's addictive also to spend all your time thinking about scalability and
sharding and clustering options instead of making something people use.

In summary: yes, but be pragmatic about these things. If you are the only
user, scaling might not be worth it. If you _know_ that people will be using a
feature all the time, think about caching from the get-go. Obviously figuring
out which of the cases you are in can sometimes be tricky. ;)

~~~
alecco

      It's addictive also to spend all your time thinking about scalability and sharding and clustering
      options instead of making something people use.

I never proposed that. But leaving doors open for possible alternatives and
quick fixes doesn't take much time, maybe 5 minutes of thinking. Sane
programming practices help, like modularization and partitioning your code
into different processes. Scalability and rapid development are not always
mutually exclusive.

------
ken
There's something weird about reading an August 2008 presentation that says
"Source Control: RCS, CVS, Subversion" -- right between slides mentioning
"64-bit" and "cloud computing".

I haven't used CVS in at least 5 years, and I've never even seen RCS in use.
Part of me wonders if RCS is Rackspace's secret weapon.

