Hacker News new | past | comments | ask | show | jobs | submit login

It's not the scale, it's the reliability.

When I worked at Amazon.com we had a deeply-ingrained hatred for all of the SQL databases in our systems. Now, we knew perfectly well how to scale them through partitioning and other means. But making them highly available was another matter. Replication and failover give you basic reliability, but it's very limited and inflexible compared to a real distributed datastore with master-master replication, partition tolerance, consensus and/or eventual consistency, or other availability-oriented features.

I think everyone is missing the boat by evaluating datastores in terms of scalability. Even at Amazon's size, database scaling is only occasionally a problem. By the time your site is that big, hopefully you've split your software into independent services, each with its own (relatively) small database. And for the services that do need a distributed cache or persistence layer, there's now a huge body of literature and software that solves those problems.

Reliability is another matter. Every business can benefit from improved reliability, regardless of size. And reliability is harder. Scaling is basically an engineering problem, which can be solved with the right architecture and algorithms. Reliability is more like security: it's a weakest-link problem and requires constant vigilance in all areas, including engineering, testing, operations, and deployment. Adding redundant components improves reliability against hardware failures (which are independent between hosts) but not software bugs (which typically are not). Many of the comments on http://news.ycombinator.com/item?id=859058 are missing this point.

Anything that mitigates reliability problems has huge potential for saving costs and increasing value. CouchDB, for example, doesn't even provide basic scaling features like automatic partitioning, but it does provide reliability features like distribution, replication, snapshots, record-level version history, and MVCC. That is what makes it interesting and useful in my opinion. Similarly, although people talk about Erlang as a concurrency-oriented language, it's really a reliability-oriented language. The concurrency in Erlang/OTP is not designed for high-speed parallel computation. Rather, it's one part of an overall strategy for reliability, a strategy which also includes the ability to update code in running processes, a standard distributed transactional database, and built-in process supervision trees. (Steve Vinoski wrote something similar in his blog a while ago: http://steve.vinoski.net/blog/2008/05/01/erlang-its-about-re...)

"By the time your site is that big, hopefully you've split your software into independent services, each with its own (relatively) small database. And for the services that do need a distributed cache or persistence layer, there's now a huge body of literature and software that solves those problems."

Yes, yes, a thousand times, yes. We seem to be slowly moving toward this ideal at JTV as we grow, because scaling monolithic database architectures is a pain, and it isn't reliable. Even if you scale your database via replication or sharding, you've still got a global choke point for reliability and performance. If your database(s) get slow, if replications lag, if a server dies or whatever, there's a good chance it will take down your whole site. On the other hand, if you use services, you can lose individual features without taking down everything.

I would go so far as to say that it's probably a good idea to build a service-oriented architecture from the start. It's only a bit harder to do, it keeps your code base cleaner and more orthogonal over time, and it makes scaling much easier, should the need ever arise.

Amazon's CTO certainly is a big proponent of SOA.

Here is a light interview (by none other than Jim Gray) on this published in ACM Queue from 2006.


As i mentioned earlier, you can have a highly-scalable "shared-nothing" architecture using existing RDBMS's. It depends on your design and implementation.

Also, when used in active-active configuration, replication actually decreases the reliability of your application by the number of replicated nodes, because you're increasing the number of point of failures in your application (although scalability increases). Replication increases reliability only if used in an active-passive configuration. I've ran into this issue over and over again.

That's exactly why businesses like Amazon that are concerned with both scale and availability are using non-relational databases that replace ACID with "eventual consistency" or other weakened/modified guarantees. Then you can have active-active replication while increasing (rather than decreasing) reliability in the presence of node failures and network partitions.

That's only applicable to applications that does not require ACID. I've seen architectures where sharding + active-passive replication was used to provide scalability without sacrificing reliability (ACID-compliance).

here's an interesting paper on sharding with Oracle vs. Mysql


The CAP theorem is a very strong limit on providing both availability and consistency in any distributed system. In your sharding+replication example, what happens when the datacenters containing your master and slave lose their network link? There's no way you can maintain write availability for clients in both datacenters while also providing the ACID Consistency guarantee. (But systems like Dynamo or CouchDB can do so while guaranteeing eventual consistency.)

After seeing some real-world, big business problems solved with weakened consistency guarentees, I'm skeptical that there are as many problems that "need" ACID as most people think. Rather, I think that (a) most engineers have not yet been exposed to the reasonable alternatives to ACID, and so have not actually thought about whether they "need" it, and (b) most businesses do not yet have the knowledge they need to weigh ACID's benefits against its availability costs.

I agree with the CAP theorem and it applies to my example. In my example replication only provides a backup copy of the data and is not used by the application, that's why it's active-passive (provides reliability). This configuration provides the highest level of data protection that is possible without affecting the performance of the master database/shard. This is accomplished by allowing a transaction to commit writes to the master database. The same transaction is also written to the slave, but that transaction is optional and written asynchronously. Two-phase commit is not performed.

Yes it's interesting. Yes it compares Oracle and MySQL.

For balance it would be helpful to have an analysis from a more independent source than Oracle.

Sharding isn't scaling. It's not application-transparent, and it's an operational nightmare.

Sharding isn't scaling

Sure sharding is scaling, although there are several types of sharding, with varying degrees of scalability. Usually people use a combination of both database (which what I was talking about) and table sharding (w/ replication,clustering,etc.) to achieve scalability. I've encountered several highly-scalable db environments like these.

It's not application-transparent

There are several databases that have features that can make sharding transparent to your application.

and it's an operational nightmare

it depends on the RDBMS that you're using.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact