When I worked at Amazon.com we had a deeply ingrained hatred for all of the SQL databases in our systems. Now, we knew perfectly well how to scale them through partitioning and other means. But making them highly available was another matter. Replication and failover give you basic reliability, but it's very limited and inflexible compared to a real distributed datastore with master-master replication, partition tolerance, consensus and/or eventual consistency, or other availability-oriented features.
I think everyone is missing the boat by evaluating datastores in terms of scalability. Even at Amazon's size, database scaling is only occasionally a problem. By the time your site is that big, hopefully you've split your software into independent services, each with its own (relatively) small database. And for the services that do need a distributed cache or persistence layer, there's now a huge body of literature and software that solves those problems.
Reliability is another matter. Every business can benefit from improved reliability, regardless of size. And reliability is harder. Scaling is basically an engineering problem, which can be solved with the right architecture and algorithms. Reliability is more like security: it's a weakest-link problem and requires constant vigilance in all areas, including engineering, testing, operations, and deployment. Adding redundant components improves reliability against hardware failures (which are independent between hosts) but not software bugs (which typically are not). Many of the comments on http://news.ycombinator.com/item?id=859058 are missing this point.
Anything that mitigates reliability problems has huge potential for saving costs and increasing value. CouchDB, for example, doesn't even provide basic scaling features like automatic partitioning, but it does provide reliability features like distribution, replication, snapshots, record-level version history, and MVCC. That is what makes it interesting and useful in my opinion. Similarly, although people talk about Erlang as a concurrency-oriented language, it's really a reliability-oriented language. The concurrency in Erlang/OTP is not designed for high-speed parallel computation. Rather, it's one part of an overall strategy for reliability, a strategy which also includes the ability to update code in running processes, a standard distributed transactional database, and built-in process supervision trees. (Steve Vinoski wrote something similar in his blog a while ago: http://steve.vinoski.net/blog/2008/05/01/erlang-its-about-re...)
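The record-level version history and MVCC mentioned above can be sketched in miniature. This is a toy in-memory store illustrating CouchDB-style optimistic concurrency, not CouchDB's actual API: every write must cite the revision it read, and a stale write is rejected rather than silently overwriting newer data.

```python
# Toy sketch of CouchDB-style MVCC (not the real CouchDB API):
# a write must carry the revision it was based on; if the stored
# revision has moved on, the write is rejected as a conflict.

class Conflict(Exception):
    pass

class MvccStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body)

    def get(self, doc_id):
        return self._docs[doc_id]  # returns (rev, body)

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        if current is not None and current[0] != rev:
            # caller read a stale revision; refuse to clobber
            raise Conflict(f"stale rev {rev!r}, store has rev {current[0]!r}")
        new_rev = 1 if current is None else current[0] + 1
        self._docs[doc_id] = (new_rev, body)
        return new_rev
```

Readers never block writers and lost updates surface as explicit conflicts for the application to resolve, which is the reliability property the comment is pointing at.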
Yes, yes, a thousand times, yes. We seem to be slowly moving toward this ideal at JTV as we grow, because scaling monolithic database architectures is a pain, and it isn't reliable. Even if you scale your database via replication or sharding, you've still got a global choke point for reliability and performance. If your database(s) get slow, if replications lag, if a server dies or whatever, there's a good chance it will take down your whole site. On the other hand, if you use services, you can lose individual features without taking down everything.
I would go so far as to say that it's probably a good idea to build a service-oriented architecture from the start. It's only a bit harder to do, it keeps your code base cleaner and more orthogonal over time, and it makes scaling much easier, should the need ever arise.
Here is a light interview on this topic (conducted by none other than Jim Gray), published in ACM Queue in 2006.
Also, when used in an active-active configuration, replication can actually decrease the reliability of your application as you add replicated nodes, because you're increasing the number of points of failure (although scalability increases). Replication increases reliability only when used in an active-passive configuration. I've run into this issue over and over again.
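A rough probabilistic model makes the point concrete. Assume node failures are independent and each node is up with probability p. In a naive active-active setup where the application breaks if any node is down (e.g. every write must reach every replica synchronously), availability is p to the n; with active-passive failover, the system is down only when all nodes are down. These are simplifying assumptions, of course: real failover is neither instant nor perfect.

```python
# Availability of n nodes, each independently up with probability p.

def active_active_all_required(p, n):
    # naive active-active: any single node failure breaks the system
    return p ** n

def active_passive(p, n):
    # failover: the system is down only if every node is down
    return 1 - (1 - p) ** n

# With p = 0.99 and n = 3:
#   active_active_all_required -> ~0.9703 (worse than one node)
#   active_passive             -> ~0.999999
```

So piling on active replicas without failover semantics literally multiplies your failure probability, exactly as described above.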
Here's an interesting paper on sharding with Oracle vs. MySQL.
After seeing some real-world, big-business problems solved with weakened consistency guarantees, I'm skeptical that there are as many problems that "need" ACID as most people think. Rather, I think that (a) most engineers have not yet been exposed to the reasonable alternatives to ACID, and so have not actually thought about whether they "need" it, and (b) most businesses do not yet have the knowledge they need to weigh ACID's benefits against its availability costs.
For balance it would be helpful to have an analysis from a more independent source than Oracle.
Sure, sharding is scaling, although there are several types of sharding with varying degrees of scalability. Usually people use a combination of database sharding (which is what I was talking about) and table sharding (with replication, clustering, etc.) to achieve scalability. I've encountered several highly scalable DB environments like these.
It's not application-transparent
There are several databases that have features that can make sharding transparent to your application.
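One common way to get that transparency at the application layer is a thin routing wrapper that hashes the shard key, so calling code never mentions shards. This is a hypothetical sketch (the class and names are invented, and the "connections" are just stand-in handles); real systems also have to handle resharding and cross-shard queries, which is where the operational pain tends to live.

```python
import hashlib

class ShardRouter:
    """Pick a database connection by hashing the shard key.

    Hypothetical sketch: `connections` would be real DB handles in
    practice; here they can be any list of objects.
    """

    def __init__(self, connections):
        self._conns = connections

    def _shard_for(self, key):
        # md5 gives a stable, evenly distributed hash across processes
        digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self._conns)

    def connection(self, key):
        return self._conns[self._shard_for(key)]
```

The point is that the rest of the application calls `router.connection(user_id)` and stays oblivious to how many shards exist, though changing the shard count still forces a data migration with this naive modulo scheme.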
and it's an operational nightmare
It depends on the RDBMS that you're using.
Scale is all relative; the only problems that scale infinitely are those whose parts are all independent. Therefore you need to pick a solution that will scale appropriately to your needs. For the vast majority of applications in existence, the properties of SQL databases scale just fine. Where advanced SQL doesn't scale, you can often squeeze several orders of magnitude more scalability out of it through a variety of optimization techniques, including sharding and denormalization. A "NoSQL" (for lack of a better term) database might have better "scalability" out of the box, but it will also tend to require more work up front wrangling data sets, and to produce more bugs than a well-tested SQL database would for the same data-management functionality. Some people think this is offset by eliminating the object-relational impedance mismatch between OO code and relational data sets, but I think that is more a symptom of people who don't really grok the relational model and the power and assurances it provides.
Increasingly, as we move to more and more cloud services, there is definitely a need for new approaches to more scalable data stores. However, nothing is going to come along that is both more scalable and equally as powerful as the relational model across the board. It's going to require tradeoffs to solve different classes of application-dependent scalability profiles. The relational model remains a solid and proven theoretical basis for describing the majority of data we have. Sure, you have some edge cases like graphs and trees, but most data is not better described by key-value stores. That's just wishful thinking.
This isn't a knock on the Heroku folks; to their credit, they've been open about the DB issue. But you wouldn't see this most basic piece of information in media coverage from TechCrunch and many others. Unfortunately (?) for Heroku, the press has been writing checks in their name that the company can't cash.
Edit: Literally seconds after I made this post, blasdel posted about the same DB issue: http://news.ycombinator.com/item?id=859545
Scalability is a farce. In my experience, applications are implemented in hilariously inefficient ways, because it's hard to see where the time, and especially the IO, is spent. But adding more machines is not a good solution, because it becomes even harder to understand the performance behavior of the system. It also turns the system into a distributed system, and therefore adds a whole new level of complexity and failure causes. Most websites don't have the same problems as Google and Amazon.
Optimizing the application could prevent the need for scaling out to multiple machines. Optimization is not rocket science: find the bottleneck and fix it. Often it is random IO, so either load the data into RAM or change the algorithm to use sequential IO.
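The random-versus-sequential-IO point can be illustrated with a toy example: when you must fetch many records, sorting the requests by on-disk position turns scattered seeks into one forward sweep. The "disk" below is just a list, so both functions return the same data; on a spinning disk the access-order difference is what costs you orders of magnitude.

```python
# Toy illustration: same records fetched two ways.

def fetch_random(disk, offsets):
    # touches the disk in request order: seeks all over the platter
    return [disk[o] for o in offsets]

def fetch_sequential(disk, offsets):
    # read in ascending offset order (one sweep), then restore
    # the caller's requested ordering
    found = {o: disk[o] for o in sorted(offsets)}
    return [found[o] for o in offsets]
```

Databases do the same thing internally when they sort row IDs before hitting the heap; applications can often apply the identical trick one level up, by batching and sorting their own lookups.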
Strictly speaking, reliability means that the system must work correctly all the time because there is no way to repair it (e.g. a satellite on a mission). For most applications, availability, the fraction of time the system works, is therefore the more appropriate metric.
There are two basic strategies for achieving high availability in a system: perfection and fault tolerance. Perfection is the default programming model, which assumes that every hardware and software component of the system works as expected. Fault-tolerant systems, on the other hand, are hard to build and require that the system be designed around the idea from the start. Fault tolerance also makes the system harder to change. An often overlooked fact is that for many systems it is actually much easier to achieve a particular availability goal through perfection than by trying to build a more complex fault-tolerant system.
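Some rough numbers for that tradeoff: compare one node you have polished to 99.9% availability against a fault-tolerant pair of 99% nodes. The numbers below assume independent failures and perfect, instant failover, which in practice it never is; the failover machinery itself is a common source of outages, which is part of why perfection often wins.

```python
def downtime_hours_per_year(availability):
    # 8760 hours in a (non-leap) year
    return (1 - availability) * 24 * 365

single_polished = 0.999                # one node, "perfection" strategy
pair = 1 - (1 - 0.99) ** 2             # two 99% nodes, ideal failover

# downtime_hours_per_year(single_polished) -> ~8.76 hours/year
# downtime_hours_per_year(pair)            -> ~0.876 hours/year
```

On paper the redundant pair wins tenfold, but only if failover actually works every time; a failover mechanism that misfires even occasionally can erase the whole advantage, which is the overlooked fact the comment describes.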
But there's a huge problem: AppEngine succeeds at seamless, multi-tenant, truly distributed clustered hosting thanks to BigTable. Heroku needs to support standard Rails apps, so Postgres is the best they can do, and it's a huge hole in their offering.
You just can't make Postgres (or Oracle) scale horizontally in the ideal way that you can distribute IP, DNS, HTTP proxying, HTTP serving, memcache, message queues, or BigTable. You can't expose Postgres as an ideal service that just keeps up with whatever you throw at it.
Is there something about MySQL that exempts it from this scaling weakness that I'm not aware of?
I'm not bull-baiting, I'm just covering the bases and wondering if there's some reason it's not presently under discussion. Part of the reason I ask is because I've been pushing for a shift to Postgres at work after we get some more pressing matters dealt with.
No, there isn't. We're discussing Postgres since that's what Heroku uses.
I had to look for a long time to see why the dupe detector didn't pick that up: 'adam.blog' vs. 'adamblog'.
feature request: dupe only detected when discussion is still open.
I think this was one of pg's recent optimizations to keep HN responsive; it makes good sense to archive articles once they're no longer being updated.
In the past I've noticed that organisations run into trouble when they have one schema with multiple apps built on it, because schema changes become difficult. I'm trying to write an engine that contains application business logic without polluting it with user-interface considerations. It aims to solve a problem for which many people currently use stored procedures; however, with a better syntax and being stateful, it should be far more effective at enclosing the schema and supporting rapid changes. The dynamic of interacting with the application is a bit like a MUD, in that each session can only do things that are exposed to it via the 'room' mechanism.
I had intended that its mainstay backend would be relational databases. But I now realise - if I'm building business logic into this layer (which will be the single point-of-entry into the schema) the datastore can afford to be a lot less complex than a traditional relational db.
My focus this weekend, to 'prove' the new design, is a tool that wraps a generic-webapp-framework-style CRUD app around a schema on the fly. So if you're on the CRUD screen rendered for the "person" table, there'll be an organisation-foreign-key datapoint in the generated user interface. It will be rendered as a radio button list, a dropdown, or a search interface, depending on the number of records.
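That widget-selection rule is simple enough to sketch. The thresholds below are invented for illustration, and `count` would come from counting rows in the foreign key's target table:

```python
def widget_for_foreign_key(count):
    """Pick a UI widget for a foreign-key field based on how many
    candidate records exist. Thresholds here are made-up examples."""
    if count <= 5:
        return "radio_buttons"   # few enough to show all at once
    elif count <= 100:
        return "dropdown"        # scannable in a select list
    else:
        return "search"          # too many to enumerate; search instead
```

The nice property is that the generated UI adapts as the data grows: a table that starts with three organisations gets radio buttons, and the same screen silently upgrades to a search box at ten thousand.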
How do map/reduce programmers manage the structural integrity of their schemas without the sort of foreign key protections that you get with relational databases? Is there an analogue to the ORM layer that they use for this?
a) Changing 'schemas' is supported by document stores, or advanced key-value stores, like Couch or Cassandra. You end up thinking of your objects more as hashes and programming accordingly; everything else is just a map/reduce task. (I.e., you've updated your schema. If you coded well, nothing should break because of that, but you'd still like to update all of the old objects. So write a map/reduce task, like a migration, that does it.)
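A migration of that shape might look like the sketch below. The store is a plain dict standing in for Couch or Cassandra, and the field names are invented: the "old schema" kept a single `name` field, and the "new schema" splits it into `first_name` and `last_name`. Well-behaved code tolerates both shapes; the map task just upgrades the stragglers.

```python
def migrate_doc(doc):
    # upgrade old-shape documents; leave already-migrated ones alone
    if "name" in doc and "first_name" not in doc:
        first, _, last = doc["name"].partition(" ")
        doc = {**doc, "first_name": first, "last_name": last}
        doc.pop("name")
    return doc

def run_migration(store):
    # the "map" phase: apply the upgrade to every document
    return {doc_id: migrate_doc(doc) for doc_id, doc in store.items()}
```

Because `migrate_doc` is idempotent (it skips documents that already have the new fields), the task can be re-run safely, which matters when it's sweeping millions of documents and gets interrupted halfway.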
b) DataMapper, for ruby, is used mostly as an ORM; however, the "R" is pretty optional, as people have written data adapters for other architectures -- from CouchDB to the SalesForce API.
Those are just some associated thoughts. I'm playing around with Cassandra and Couch on my server at the moment (grr, when I have a moment).
Meanwhile, out here in the real world, banks, stock exchanges, airlines, telcos, and retailers are doing transaction and storage volumes on relational databases that say otherwise.
Also, "SQL database" is what the PHB in Dilbert calls them. No one with any experience in the field uses that term.