

How StackOverflow Scales with SQL Server - jswinghammer
http://www.brentozar.com/archive/2011/11/how-stackoverflow-scales-sql-server-video/

======
wmwong
This wasn't so much about fine tuning SQL Server itself, but breaking
traditional thinking. The 5 rules he presents originally are all replaced in
the end.

    
    
      - Everybody's the DBA
      - Do what it takes to get what you want
      - Tune later, cache & separate now
      - NewEgg your way out of problems
      - Share for great good
    

For each point, he argues why the old rule no longer applies and what the new
solution is.

I felt a lot of the presentation was about tuning SQL Server without tuning
SQL Server: caching, leaving full-text search to Apache Lucene (because it's
not querying), and using SSDs to speed things up without having to touch
any code.
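
To make the caching point concrete, here is a minimal cache-aside sketch. It is
an illustration only, not what StackOverflow actually runs: the `QueryCache`
class, the 60-second TTL, and the `questions` table are all assumptions, and
SQLite stands in for SQL Server so the example is self-contained.

```python
import sqlite3
import time


class QueryCache:
    """Hypothetical cache-aside wrapper: check memory first, then the database."""

    def __init__(self, conn, ttl_seconds=60.0):  # TTL value is an assumption
        self.conn = conn
        self.ttl = ttl_seconds
        self._store = {}   # (sql, params) -> (expires_at, rows)
        self.db_hits = 0   # counts how often the database was really queried

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # served from memory, database untouched
        rows = self.conn.execute(sql, params).fetchall()
        self.db_hits += 1
        self._store[key] = (now + self.ttl, rows)
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO questions (title) VALUES ('How do I scale SQL Server?')")

cache = QueryCache(conn)
first = cache.query("SELECT id, title FROM questions WHERE id = ?", (1,))
second = cache.query("SELECT id, title FROM questions WHERE id = ?", (1,))
print(cache.db_hits)  # the second call never reaches the database
```

The query itself is never tuned; the second read just never reaches the
database. The cost is staleness for up to one TTL, which is the tradeoff any
cache like this has to accept.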

~~~
BrentOzar
Yep, absolutely, you nailed it. There's a gazillion presentations out there
about tuning databases, but that only takes you so far. I wanted to show that
you need to take a step back before you go into query tuning details.

------
krmmalik
I had no idea StackOverflow was using Microsoft SQL Server. I guess it's just a
learned response, but I've come to expect any large networking site to be
using a non-MS solution.

I haven't had a chance to watch the video, but I hope to later on. What's
interesting, even with just the link, is that MS SQL is a viable solution for
a website like StackOverflow.

We have been using SQL Server in our own company for our projects and I was
really starting to get annoyed with it. I found it to be heavy on resources and
slow to respond, and let's not forget cost. I just completed a project that I
have been working on for the last 4 months, and the majority of the work was
within SQL Server. One thing I learned was that it's actually quite a powerful
beast.

When used correctly, SQL Server is a very capable database solution.
I'm glad that we decided to stick with it. There is a lot I learned
about SQL Server in the last 4 months that I had no idea it was capable of.

~~~
silverbax88
Big networks (banks, insurance companies) basically use one of two options
most of the time: Oracle or SQL Server.

It only seems otherwise when you read sites like Hacker News, where most
of the posters are not working in big environments. Facebook is a large-scale
solution that doesn't use either, but their data management and caching are so
bad nobody should be considering them as a best practice.

~~~
icebraining
Google, Craigslist, Twitter, Wikipedia, YouTube and Netflix are just a few
examples that don't use Oracle or MS SQL Server either.

It's not about being big; banks and insurance companies simply have different
requirements for how they access and store their data compared to most
websites/services.

~~~
arethuza
Google appears to use Oracle Hyperion and Oracle Essbase - obviously not for
their main product offerings - but they do use Oracle products:

[http://www.google.com/intl/en/jobs/uslocations/mountain-
view...](http://www.google.com/intl/en/jobs/uslocations/mountain-
view/engops/opsit/hyperion-developer-financial-management-essbase-planning-
mountain-view/index.html)

~~~
nl
They use it for their internal finance systems, not to run any external facing
systems.

Google _does_ use MySQL for their AdSense/AdWords transactional system (i.e.,
buying AdWords).

------
3am
I can't watch the video right now, but the idea that "Everybody's the DBA" is
risky in general. I'm hugely in favor of developers writing and optimizing
their own SQL, being able to create a normalized schema (and knowing when it's
worth the tradeoff to denormalize), knowing how to read a query plan, and
generally being as competent as a DBA. But... it's good to have one person who
has the global view of the database for things like tuning extent sizes,
selecting the optimal types of storage for various partitions, doing reviews on
the schema, capacity planning, etc.

The right setup (IMO) of having a DBA in an operational role with developers
that are highly proficient/self-sufficient is hard to get right and expensive
enough that it probably isn't right for an early stage company. And a bad DBA
can be a nightmare. So there are tradeoffs on both sides.

~~~
Duff
I think that with a modern database like SQL Server, the old-school "high
priest" DBA is obsolete.

But... if you don't have someone dedicated to thinking about database issues,
you need to treat database changes just like your code. It needs to be in a
repository, it needs to be reviewed, and you need a change management regime.

From an anecdotal POV, I've noticed that many folks have a good process (or at
least a consensus approach) to managing their code... but the database is
often a red-headed stepchild that doesn't get the attention it deserves.

~~~
MartinCron
_but the database is often a red-headed stepchild that doesn't get the
attention it deserves_

On my last three major projects, I have committed to devoting the proper level
of attention to the database: automatically building databases in some
environments, scripted schema changes and seed data loading as part of the
mainline code base, etc. I have found the difference to be immense in
practical terms. I can move more quickly, more safely, and have a better
quality of life as a developer.

One of the best returns on (effort) invested I have ever seen.
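
As a rough sketch of what "scripted schema changes and seed data loading" can
look like, here is a tiny versioned migration runner. Everything in it is
hypothetical (the `MIGRATIONS` list, the `schema_version` table, the `users`
schema), and it uses SQLite rather than SQL Server so it runs on its own. In a
real project each migration would live as a file in the repository and go
through code review like any other change.

```python
import sqlite3

# Hypothetical migrations, ordered by version number.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
    (3, "INSERT INTO users (name, email) VALUES ('seed-admin', 'admin@example.com')"),
]


def migrate(conn):
    """Apply every migration newer than the recorded version, in order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    applied = []
    for version, sql in MIGRATIONS:
        if version > current:
            with conn:  # commits on success, rolls back the open transaction on error
                conn.execute(sql)
                conn.execute(
                    "INSERT INTO schema_version (version) VALUES (?)", (version,)
                )
            applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # builds the database from scratch
second_run = migrate(conn)  # nothing left to apply
print(first_run, second_run)
```

Because the runner records which versions it has applied, the same script can
build a fresh database in a test environment and bring an existing one up to
date.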

~~~
lukencode
I definitely agree. We have been using FluentMigrator
(<https://github.com/schambers/fluentmigrator>) for our .NET-based projects at
work for database migrations and versioning.

~~~
MartinCron
Thanks. Fluent Migrator looks pretty cool. I've been rolling my own stuff as
needed. As a result, it has evolved into something pretty useful for the
problems I've encountered so far, but does nothing outside of what I've
already thought of. I can see using the migrator as a good jumping-off point.

------
andrewheins
I love that these guys are so consistently open on their scaling strategies.
Does anyone have any similar resources for Rails stacks?

~~~
moonboots
[http://www.readwriteweb.com/cloud/2011/04/twitter-drops-
ruby...](http://www.readwriteweb.com/cloud/2011/04/twitter-drops-ruby-for-
java.php)

~~~
astrodust
You should be so lucky as to have to take something to Twitter-scale.

I worry that if they went hand-rolled assembly, everyone would be jumping on
that bandwagon.

~~~
gaius
Twitter is actually not that high-volume. They peak at 6000 messages/sec and
average much lower (<http://blog.twitter.com/2011/03/numbers.html>). Meanwhile
Tibco will sell you a device doing 100,000 messages/sec sustained
([http://www.tibco.com/products/soa/messaging/messaging-
applia...](http://www.tibco.com/products/soa/messaging/messaging-
appliance/default.jsp)). There are plenty of people working on systems more
scaley than Twitter.

~~~
orcadk
That's not really a fair comparison, however. There's a non-negligible
difference between just crunching messages and distributing said 6k
messages/sec into a directed graph of users. Those 6k messages/sec are just the
input; the output is far greater.
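
A toy sketch of the input-versus-output point, assuming a fan-out-on-write
model where each message is copied to every follower's timeline. The follower
graph and messages are made up, and nothing here claims to describe how
Twitter actually delivers messages.

```python
from collections import defaultdict

# Hypothetical follower graph: author -> set of followers.
followers = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice"},
    "carol": set(),
}


def fan_out(messages, followers):
    """Copy each (author, text) message into every follower's timeline."""
    timelines = defaultdict(list)
    deliveries = 0
    for author, text in messages:
        for follower in followers.get(author, ()):
            timelines[follower].append((author, text))
            deliveries += 1
    return timelines, deliveries


messages = [
    ("alice", "hello"),
    ("bob", "hi"),
    ("carol", "anyone?"),
    ("alice", "bye"),
]
timelines, deliveries = fan_out(messages, followers)
print(len(messages), deliveries)  # 4 messages in, 7 deliveries out
```

Four messages in become seven deliveries out even in this tiny graph; the
multiplier depends entirely on how many followers each author has.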

~~~
gaius
That is what RV does.

------
scottshea
I love it when companies offer insight into their practices like this.

------
mwsherman
It comes down to two things: economics and computer science, in that order.

For most companies, diving deep into your data persistence layer is probably
not worth it in the beginning. Your devs need to work on features. (MS SQL
does pretty well untuned.) This is the economics part.

The computer science part comes in when you are doing enough traffic that 200
hours of dev time toward a 10% performance improvement becomes good economics.
Then you dig into the data layer (and every other layer) and start counting
milliseconds.

Which is what we did at Stack O by using Dapper and renting Brent Ozar. :)

~~~
BrentOzar
RENTING, hahaha, I never thought of myself that way. I like it. I'm going to
use a "For Rent" sign around my neck in an upcoming blog post!

------
revetkn
The other part of this is throwing out the ORM in places where performance
matters and replacing it with a simple utility, Dapper, that maps SQL to
objects:
[http://samsaffron.com/archive/2011/03/30/How+I+learned+to+st...](http://samsaffron.com/archive/2011/03/30/How+I+learned+to+stop+worrying+and+write+my+own+ORM)
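
For readers who haven't seen the "map SQL to objects" idea, here is a small
Python analogue. It is not Dapper's API (Dapper is a .NET library); the `query`
helper, the `Post` class, and the `posts` table are hypothetical, and SQLite is
used only to keep the sketch runnable.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Post:
    id: int
    title: str
    score: int


def query(conn, cls, sql, params=()):
    """Run hand-written SQL and build one `cls` per row by column name."""
    cursor = conn.execute(sql, params)
    columns = [d[0] for d in cursor.description]
    wanted = {f.name for f in fields(cls)}
    return [
        cls(**{c: v for c, v in zip(columns, row) if c in wanted})
        for row in cursor
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO posts (title, score) VALUES (?, ?)",
    [("Scaling SQL Server", 42), ("Why ORMs are slow", 7)],
)

posts = query(
    conn,
    Post,
    "SELECT id, title, score FROM posts WHERE score > ? ORDER BY score DESC",
    (10,),
)
print(posts)
```

The SQL stays fully under the developer's control, and the helper only does
the tedious part of turning rows into typed objects.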

------
blhack
"Don't worry the cloud takes care of all of this!"

This is horrifying. Are there really people that think this way?

~~~
BrentOzar
Sadly, yes - obviously it's not this site's target audience, but a lot of PHBs
out there are buying into it hook, line, and sinker.

------
arthurprs
Very good stuff.

