
SQL Databases Don't Scale - DavidSJ
http://adamblog.heroku.com/past/2009/7/6/sql_databases_dont_scale/
======
mbrubeck
It's not the scale, it's the reliability.

When I worked at Amazon.com we had a deeply-ingrained hatred for all of the
SQL databases in our systems. Now, we knew perfectly well how to scale them
through partitioning and other means. But making them highly available was
another matter. Replication and failover give you basic reliability, but it's
very limited and inflexible compared to a real distributed datastore with
master-master replication, partition tolerance, consensus and/or eventual
consistency, or other availability-oriented features.

I think everyone is missing the boat by evaluating datastores in terms of
scalability. Even at Amazon's size, database scaling is only occasionally a
problem. By the time your site is that big, hopefully you've split your
software into independent services, each with its own (relatively) small
database. And for the services that do need a distributed cache or persistence
layer, there's now a huge body of literature and software that solves those
problems.

Reliability is another matter. Every business can benefit from improved
reliability, regardless of size. And reliability is harder. Scaling is
basically an engineering problem, which can be solved with the right
architecture and algorithms. Reliability is more like security: it's a
weakest-link problem and requires constant vigilance in all areas, including
engineering, testing, operations, and deployment. Adding redundant components
improves reliability against hardware failures (which are independent between
hosts) but not software bugs (which typically are not). Many of the comments
on <http://news.ycombinator.com/item?id=859058> are missing this point.
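To make the independence point concrete, here's a quick back-of-the-envelope
sketch in Python (the uptime numbers are made up for illustration):

```python
# Redundancy helps against independent failures, not correlated ones.
# Hypothetical per-host uptime figures, purely for illustration.

def availability_independent(per_host_uptime: float, replicas: int) -> float:
    """System is up if at least one replica is up (failures independent)."""
    return 1 - (1 - per_host_uptime) ** replicas

# Hardware: three replicas at 99% each -> the failure probabilities multiply.
hw = availability_independent(0.99, 3)
print(round(hw, 6))  # 0.999999

# Software: a bug deployed to every replica takes them all down together,
# so adding replicas doesn't help -- availability stays at the single-copy
# figure, 0.99, no matter how many hosts you add.
sw = 0.99
print(sw)
```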

Anything that mitigates reliability problems has huge potential for saving
costs and increasing value. CouchDB, for example, doesn't even provide basic
scaling features like automatic partitioning, but it does provide reliability
features like distribution, replication, snapshots, record-level version
history, and MVCC. _That_ is what makes it interesting and useful in my
opinion. Similarly, although people talk about Erlang as a concurrency-
oriented language, it's really a reliability-oriented language. The
concurrency in Erlang/OTP is not designed for high-speed parallel computation.
Rather, it's one part of an overall strategy for reliability, a strategy which
also includes the ability to update code in running processes, a standard
distributed transactional database, and built-in process supervision trees.
(Steve Vinoski wrote something similar in his blog a while ago:
<http://steve.vinoski.net/blog/2008/05/01/erlang-its-about-reliability/>)
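As a toy illustration of the supervision-tree idea (a Python sketch, nothing
like real OTP semantics -- the restart limit and the `flaky` worker are
invented for the example):

```python
# Toy sketch of process supervision: a supervisor runs a worker and
# restarts it when it crashes, up to a limit, instead of letting one
# crash take the system down. Erlang/OTP supervisors do this for real,
# with configurable restart strategies and nested supervision trees.

def supervise(worker, max_restarts: int = 3):
    """Run worker(); on a crash, restart it, up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker()
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # give up: escalate to the supervisor's own supervisor

attempts = {"n": 0}

def flaky():
    """A worker that crashes twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("crash")
    return "ok"

print(supervise(flaky))  # ok, after two restarts
```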

~~~
timr
_"By the time your site is that big, hopefully you've split your software into
independent services, each with its own (relatively) small database. And for
the services that do need a distributed cache or persistence layer, there's
now a huge body of literature and software that solves those problems."_

Yes, yes, a thousand times, _yes_. We seem to be slowly moving toward this
ideal at JTV as we grow, because scaling monolithic database architectures is
a pain, and it isn't reliable. Even if you scale your database via replication
or sharding, you've still got a global choke point for reliability and
performance. If your database(s) get slow, if replications lag, if a server
dies or whatever, there's a good chance it will take down your whole site. On
the other hand, if you use services, you can lose individual features without
taking down everything.

I would go so far as to say that it's probably a good idea to build a service-
oriented architecture from the start. It's only a bit harder to do, it keeps
your code base cleaner and more orthogonal over time, and it makes scaling
much easier, should the need ever arise.

~~~
bayareaguy
Amazon's CTO certainly is a big proponent of SOA.

Here is a light interview (by none other than Jim Gray) on this published in
ACM Queue from 2006.

[http://portal.acm.org/ft_gateway.cfm?id=1142065&type=pdf](http://portal.acm.org/ft_gateway.cfm?id=1142065&type=pdf)

------
dasil003
I could do without the sensationalist headline. In fact I've pretty much come
to hate any headline that says "X doesn't scale". It might be fair in some
circumstances, but most of the time it's an indication of an uninformed or
koolaid-driven opinion.

Scale is all relative; the only things that scale infinitely are problems
where the parts are all independent. Therefore you need to pick a solution
that will scale appropriately to your needs. For the vast majority of
applications in existence, the properties of SQL databases scale just fine for
their needs. Where advanced SQL doesn't scale, you can often squeeze several
orders of magnitude more scalability through a variety of optimization
techniques including sharding and denormalization. A "NoSQL" (for lack of a
better term) database might have better "scalability" out of the box, but it
will also tend to require more work up front wrangling data sets, and it will
have more bugs than a well-tested SQL database offering the same data
management functionality. Some people think this is offset by avoiding the
impedance mismatch between OO code and relational data sets, but I think that
is more a symptom of people who don't really grok the relational model and the
power and assurances it provides.

Increasingly as we move to more and more cloud services there is definitely a
need for new approaches to more scalable data stores. However there's not
going to be something that comes along and is both more scalable and equally
as powerful as the relational model across the board. It's going to require
tradeoffs to solve different classes of application-dependent scalability
profiles. However the relational model is a solid and proven theoretical basis
for describing the majority of data we have. Sure you have some edge cases
like graphs and trees, but most data is not better described by key-value
stores. That's just wishful thinking.

~~~
timwiseman
Like most decisions, choosing your datastore is all about deciding which trade
offs are best for you. For a great many of those situations, it will be SQL,
but for some others it will not. But each way comes with compromises.

------
johnnybgoode
Heroku has been hyped up far and wide as being some kind of magical scaling
solution. Just give Heroku your app and it will scale infinitely on EC2,
right? Wrong. As they've admitted before, they haven't done anything to scale
the database. Adding more app servers is trivial by comparison.

This isn't a knock on the Heroku folks; to their credit, they've been open
about the DB issue. But you wouldn't see this most basic piece of information
in media coverage from TechCrunch and many others. Unfortunately (?) for
Heroku, the press has been writing checks in their name that the company can't
cash.

Edit: Literally seconds after I made this post, blasdel posted about the same
DB issue: <http://news.ycombinator.com/item?id=859545>

~~~
cmelbye
Yet adding more database disk capacity will cost you a fortune for anything
more than 0.5GB...

------
sprachspiel
I think scalability and reliability are mostly orthogonal issues and should be
discussed separately.

Scalability is a farce. In my experience, applications are implemented in
hilariously inefficient ways because it's hard to see where the time, and
especially the IO, is spent. But adding more machines is not a good solution,
because it makes the performance behavior of the system even harder to
understand. It also turns the system into a distributed system, and therefore
adds a whole new level of complexity and failure causes. Most websites don't
have the same problems as Google and Amazon.

Optimizing the application could prevent the need for scaling out to multiple
machines. Optimization is not rocket science: Find out the bottleneck and fix
it. Often it is random IO, so either load the data into RAM or change the
algorithm to use sequential IO (e.g. through batch computation).
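As a rough sketch of the batch-computation idea: turn many random lookups
into one sorted, sequential pass. Here an in-memory list stands in for a
key-sorted file; a real system would stream it from disk.

```python
# Sketch: replace N random seeks with one sequential scan. Sort the
# requested keys, then merge them against the key-sorted data in a
# single pass -- the classic batch alternative to random IO.

def batch_lookup(sorted_records, wanted_keys):
    """Merge sorted request keys against sorted (key, value) records."""
    wanted = sorted(set(wanted_keys))
    results = {}
    i = 0
    for key, value in sorted_records:  # one sequential pass over the data
        while i < len(wanted) and wanted[i] < key:
            i += 1  # this requested key isn't in the data; skip it
        if i < len(wanted) and wanted[i] == key:
            results[key] = value
    return results

data = [(k, k * 10) for k in range(100)]  # stand-in for a file sorted by key
print(batch_lookup(data, [42, 7, 99]))  # {7: 70, 42: 420, 99: 990}
```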

Reliability means that the system must work correctly all the time and there
is no way to fix it (e.g. a satellite on a mission). Thus, for most
applications availability, which is the fraction of time a system works, is a
more appropriate metric.

There are two basic strategies for achieving high availability in a system:
perfection and fault-tolerance. Perfection is the default programming model,
which assumes that every hardware and software component of the system works
as expected. Fault-tolerant systems, on the other hand, are hard and require
that the system be designed around this idea. Also, fault-tolerance makes it
harder to change the system. An often overlooked fact is that for many systems
it is actually much easier to achieve a particular availability goal through
perfection rather than trying to build a more complex fault-tolerant system.

------
blasdel
Heroku : Ruby/Rack :: Google AppEngine : Python/WSGI

But there's a huge problem -- AppEngine succeeds at seamless multi-tenant
truly-distributed clustered hosting thanks to BigTable. Heroku needs to
support standard Rails apps, so Postgres is the best they can do, and it's a
huge hole in their offering.

You just can't make Postgres (or Oracle) scale on an ideal horizontal the same
way you can distribute IP, DNS, HTTP proxying, HTTP serving, memcache, message
queues, or bigtable. You can't expose Postgres as an ideal service that just
keeps up with what you throw at it.

~~~
alnayyir
>You just can't make Postgres (or Oracle) scale on an ideal horizontal the
same way you can distribute IP, DNS, HTTP proxying, HTTP serving, memcache,
message queues, or bigtable.

Is there something about MySQL that exempts it from this scaling weakness that
I'm not aware of?

I'm not bull-baiting, I'm just covering the bases and wondering if there's
some reason it's not presently under discussion. Part of the reason I ask is
because I've been pushing for a shift to Postgres at work after we get some
more pressing matters dealt with.

~~~
johnnybgoode
> Is there something about MySQL that exempts it from this scaling weakness
> that I'm not aware of?

No, there isn't. We're discussing Postgres since that's what Heroku uses.

~~~
alnayyir
In that case, I'll carry on with my Postgres evangelism. Thank you for your
time.

------
jacquesm
we've been here before:

<http://news.ycombinator.com/item?id=690656>

I had to look for a long time to see why the dupe detector didn't pick that
up, 'adam.blog' vs 'adamblog'.

~~~
jpwagner
Good to link to the prior discussion, but that item is no longer _open_ for
discussion, so it's not inappropriate to re-post if an interesting discussion
is to continue.

    
    
Feature request: dupe only detected when the discussion is still open.

~~~
jacquesm
The cutoff is only 45 days; we'd literally be flooded with dupes if that were
done.

I think this was one of PG's recent optimizations to keep HN responsive. It
makes good sense to archive articles when they're no longer being updated.

------
crad
I think the author is unaware of a great tool for scaling PostgreSQL. A few
years ago Skype released plProxy and pgBouncer, both of which are key to
PostgreSQL scaling. What they in essence do is allow for transparent
horizontal scaling of servers. One defines a cluster in plProxy and then when
you run stored procedure queries, they remotely execute on the proper shard
based upon the primary key data. It's fast and effective, and I have
been using it in a very large OLTP environment using PostgreSQL, driving a US
Top-25 pageview site.
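For readers unfamiliar with plProxy, the routing idea can be sketched in
Python. This is only the concept; plProxy itself does it with SQL functions
and typically hashes the key, and the dicts below are stand-ins for Postgres
servers.

```python
# Conceptual sketch of primary-key shard routing, the job plProxy does:
# map each primary key to one shard, then run the query on that shard.

SHARDS = [dict() for _ in range(4)]  # pretend each dict is a Postgres server

def shard_for(pk: int) -> dict:
    """Pick the shard that owns this primary key (plProxy hashes the key)."""
    return SHARDS[pk % len(SHARDS)]

def put(pk: int, row: str) -> None:
    shard_for(pk)[pk] = row  # "INSERT" routed to the owning shard

def get(pk: int):
    return shard_for(pk).get(pk)  # "SELECT" routed to the same shard

put(1001, "alice")
put(1002, "bob")
print(get(1001), get(1002))  # alice bob
```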

~~~
crad
I should clarify, plProxy is responsible for the transparent horizontal
scaling. pgBouncer is a proxy that allows you to run all of this using a
minimal number of PostgreSQL backends, which is essential because of the
amount of inter-process communication locking and information sharing in
PostgreSQL. pgBouncer basically manages a smaller pool of active backends
reducing the overhead in running high transaction velocity servers.

------
cturner
I'm glad this was reposted because it links into something I've been working
on for a while. I'm on what must be the ninth rewrite now, but have had a
breakthrough on a long-term problem this weekend (notation).

In the past I've noticed that organisations run into trouble when they have
one schema with multiple apps off it, because schema changes become difficult.
I'm trying to write an engine that contains application business logic without
polluting it with user interface considerations. It aims to solve a problem
for which many people currently use stored procedures. However, with a better
syntax, and by being stateful, it should be far more effective at enclosing
the schema and supporting rapid changes. The dynamic of interacting with the
application is a bit like a MUD, in that each session can only do things that
are exposed to it via the 'room' mechanism.

I had intended that its mainstay backend would be relational databases. But I
now realise - if I'm building business logic into this layer (which will be
the single point-of-entry into the schema) the datastore can afford to be a
lot less complex than a traditional relational db.

My focus this weekend, to 'prove' the new design, is a tool that wraps a
generic-webapp-framework-style CRUD app around a schema on the fly. So if
you're on a CRUD screen rendered for the "person" table, there'll be an
organisation-foreign-key datapoint in the generated user interface. It will
be rendered as a radio button list, a dropdown, or a search interface
depending on the number of records.

How do map/reduce programmers manage the structural integrity of their schemas
without the sort of foreign key protections that you get with relational
databases? Is there an analogue to the ORM layer that they use for this?

~~~
mr_luc
I had two thoughts:

a) Changing 'schemas' is supported by document stores (or advanced key-value
stores) like Couch or Cassandra. You end up thinking of your objects more as
hashes and program accordingly; everything else is just a mapreduce task. Say
you've updated your schema: if you coded well, nothing should break, but
you'd still like to update all of the old objects, so you write a mapreduce
task, like a migration, that does that.

b) DataMapper, for ruby, is used mostly as an ORM; however, the "R" is pretty
optional, as people have written data adapters for other architectures -- from
CouchDB to the SalesForce API.

Those are just some associated thoughts. I'm playing around with Cassandra and
Couch on my server at the moment (grr, when I have a moment).
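The migration-as-a-map-task idea in (a) might look like this in Python (the
field names are made up for illustration):

```python
# Sketch of a "migration as a map task" over schemaless documents:
# old records lack the new field, so one pass over the data fills it in.
# Already-migrated documents are left alone, so the task is safe to re-run.

def migrate(doc: dict) -> dict:
    """Map step: upgrade one document to the new 'schema'."""
    upgraded = dict(doc)
    upgraded.setdefault(
        "full_name",
        f"{doc.get('first', '')} {doc.get('last', '')}".strip(),
    )
    return upgraded

old_docs = [
    {"first": "Ada", "last": "Lovelace"},   # old shape: needs migrating
    {"full_name": "Alan Turing"},           # new shape: left untouched
]
new_docs = [migrate(d) for d in old_docs]   # in Couch: a map job over all docs
print(new_docs[0]["full_name"])  # Ada Lovelace
```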

------
jrockway
SQL is not a data storage scheme, it's a query language. If you mean
"relational database", say "relational database".

------
tzury
Seems like nowadays everyone's saying SQL databases don't scale, while
apparently the busiest sites are using MySQL (or other SQL-based databases)
and scaling quite well, producing speedy websites that serve billions of
pages every day.

~~~
kscaldef
Serving billions of pages a day is not a particularly hard scaling problem if
most of those page views only require reads. It's writes, and particularly
transactional updates, that are hard.

~~~
drusenko
yes, but even harder are the entire set of problems where the data is
extremely interrelated, like Facebook's is -- making it very difficult to
shard effectively.

------
datums
I wouldn't put HA into the same bucket as scaling. Scaling includes
performance, where HA doesn't. You can take a simple HTML site and decide to
eliminate SPOFs (single points of failure); you immediately increase your
OpEx. I enjoy scaling problems.

------
fauigerzigerk
It's not so much the DBMS that doesn't scale. It's particular data access
patterns in combination with particular requirements that don't scale. RDBMS
alternatives do little more than educate me about that fact.

------
ecq
Teradata, Oracle RAC/Oracle Exadata, Vertica, Netezza, etc.

~~~
jbellis
All (?) [edit: I stand corrected: only some] of which rely on A Big SAN, which
falls under the "vertical" classification, i.e., scaling-by-throwing-money-at-
it. Which is fine as long as you're aware that your price/performance sucks,
and as long as your SAN is big enough.

~~~
wmf
No, most of the scalable SQL databases are shared-nothing.

~~~
jbellis
... A little more looking shows that while some systems, like Oracle RAC (the
one I'm most familiar with) and Sybase, do rely on shared storage, many do
not. Sorry to generalize from insufficient data!

~~~
ecq
You can have a shared-nothing architecture using traditional "shared-
everything" RDBMS.

------
gaius
Translation: Heroku doesn't know how to scale databases.

Meanwhile, out here in the real world, banks, stock exchanges, airlines,
telcos, retailers are doing transaction and storage volumes on relational
databases that say otherwise.

Also "SQL database" is what the PHB in Dilbert calls 'em. No-one with any
experience in the field uses that term.

------
chanux
Apparently we can put old stuff on HN again.

<http://news.ycombinator.com/item?id=690656>

