SQL Databases Don't Scale

nettdata · on July 7, 2009

I've been an Oracle database architect for almost 20 years.

His whole concept of "SQL doesn't scale" is the typical crap I always hear from people that are either not database experts, are using the wrong database technologies, or don't know what they're doing. More than likely a combination of all three.

And just because you can create an object model, and a simplistic data model, does not make you an architect of large, scalable database systems.

I have, over the past 4 years alone, built Oracle-based, fully scalable databases that handle over 25 million users daily.

Go read up on their RAC architecture, and the "shared everything" implementation.

Sure, it's expensive, but it works great.

For instance, if you play sports games from the world's largest video game corporation, all of your online transactions (achievements, etc) go through exactly such a system that I spent a year architecting and implementing.

If you do any online banking in Canada, or with some of the larger banks in the US, that, too, is on my resume.

I find that this type of FUD comes about from people that aren't good at designing and implementing large databases, or can't afford the technology that can pull it off, so they slam the technology rather than accept that they, themselves, are the ones lacking.

Most of them tend to come from the typical LAMP/SlashDot crowd that only have experience with the minor technologies.

Those of us that do it for a living, using the right technologies, seem to have no problems whatsoever scaling SQL.

Just saying.

joshu · on July 7, 2009

Have you ever noticed that every time there's an article like this there's one 20-year DBA veteran who says the original author doesn't understand SQL?

Yeah.

Anyway, having built both transactional systems (trading algorithms) and big social systems (delicious) I think the main issue is: The right data structure and tool for the right job.

Banking is a radically different problem then internet-scale social software (which I assume is what they are talking about when they say "doesn't scale".) Access patterns are different, read and write loads are different, etc.

The main issue here is that a lot of social software wants something like a fast per-user data store, something like a distributed inverted index for globally finding things.

There's NO great reason for one user's data to be in the same table as someone else's. I guess it was handy for stuff like calculating the average number of tags, etc.

Instead, you want to be able to have better control over data locality and similar, given your access patterns.

Now, personally, I would use an actual SQL engine for a single-machine persistent store, and build a distribution layer on top of that. Concurrency is hard, etc.

But assuming RAC is the right solution to all problems is probably not a good one. I've seen this go terribly awry.

SQL wins at things that btrees and hash indexes are good at; but a lot of things are better with other organizations of data.

blasdel · on July 7, 2009

You totally ignore the background of the author -- his company Heroku : Ruby/Rack :: Google AppEngine : Python/WSGI

But there's a huge problem -- AppEngine succeeds at seamless multi-tenant truly-distributed clustered hosting thanks to BigTable. Heroku needs to support standard Rails apps, so Postgres is the best they can do, and it's a huge hole in their offering.

You just can't make Postgres (or Oracle) scale on an ideal horizontal the same way you can distribute IP, DNS, HTTP proxying, HTTP serving, memcache, message queues, or bigtable. You can't expose Postgres as an ideal service that just keeps up with what you throw at it.

ianso · on July 7, 2009

WRT exposing Postgres, you're right that you can't simply expose a database and then expect the DB to scale under random selects, inserts, etc. (RAC is another matter I guess.)

However, as I said in the first comment to the blog post, if you define an interface using stored procedures (a pattern familiar to many Oracle DBAs), then PL/Proxy (http://pgfoundry.org/projects/plproxy/) lets you do hash-based partitioning in a way that's more or less transparent to the end-user of a DB, again assuming that it has a defined interface. The PL/Proxy installs form a 'bus' between the DB and the user.

Self-promotion: I'm currently working on hacking PL/Proxy into something that can be used to auto-scale a Postgres cluster on-demand, which has interfaces as a pre-requisite, along with some other things. The end-goal is to do exactly what you say: to scale-out Postgres like any other internet service on an ideal horizontal. Link: http://code.google.com/p/hotrepart

(Hum, I'm gonna get accused of spamming for repeating myself so often :-)

bayareaguy · on July 7, 2009

While Skype's PL/Proxy is a great way to make PostgreSQL more scalable, it doesn't do much to refute the basic argument that SQL databases aren't scalable since the SQL it helps you scale is limited to short RPC style operations.

ianso · on July 7, 2009

That's a good point, and it brings to light an unspoken assumption underlying my post, which is that the use-case is a web-based, read-heavy OLTP-style system. If you were thinking of things like OLAP and data warehousing, then I'd agree with you absolutely.

However, under my re-qualified assertion :-) for large, complicated commits, the logic would either have to be at the database level for it to be in the same transaction, or a solution using temporary tables could be put together for more convoluted calls.

Neither of these are elegant, I'll grant you, but the two basic approaches - longer transactions over logic closely coupled to the datastore, or staged writes - are what most DBAs on high-end databases end up doing anyway, and would probably be reproduced in some form or another in any ACID system, no? Either way you'd still have a database that scales.

nettdata · on July 7, 2009

For what it's worth, you can most definitely do that with Oracle. I can't speak to Postgres scalability, but it is my DB of preference. (I guess it might have something to do with the fact it's as close to Oracle as you're going to get in Open Source).

We design apps to treat the datasource as a network service, and have no problems load balancing DB connections across our database cluster, and adding new DB nodes as required.

The biggest problem I've seen is the relative lack of support for Oracle in something like Rails, or non-Oracle app servers in general. I've had to custom write some connection pool failure detection for a few apps to deal with cluster failure (something that comes out of the box with the Oracle app server, but we wanted to use JBoss and GlassFish), so while it's not perfect, it's most definitely doable.

Just because his particular architecture or offering doesn't fit well with typical database deployment architectures, doesn't mean that "SQL doesn't scale".

ntoshev · on July 7, 2009

Maintaining connections in the face of failure is a relatively little concern, what I perceive as the big problem here is having the database handle an arbitrary application with zero application-specific administration.

Google's datastore promises that if you write your app for their platform and it works on the small scale, it will scale out without problems. Datastore latency is constant regardless of how many records you have there. How will you scale a join between two tables, each being partitioned between multiple different servers, transparently for the app?

rythie · on July 7, 2009

Whilst I can believe that Oracle probably does scale, when it costs $17.5k per CPU in licensing then it's not for everyone.

It would seem that problems of a bank with a large already established userbase and lucrative, stable business model is very different to a start-up with no tested business model, may never become popular enough to need to scale and may not survive.

nettdata · on July 7, 2009

Uhmmm... duh?

The original article mentions nothing about cost or the ability (or lack thereof) of any company to afford it.

He made a flat out statement that said SQL DOESN'T SCALE, which is wrong.

There are other methods for dealing with initial growth and the cost constraints... no need to blow your wad on Oracle out of the gate.

But then I imagine a lot of non-Oracle types aren't even aware that Oracle can be very flexible with their licensing for startups. You can, for example, lease/rent your Oracle licenses on a monthly basis to help get the most out of your cash flow.

I've been involved with a number of startups where we did our initial rollout on Postgres, but ensured that the application architecture allowed for us to fairly easily swap in a larger, more scalable solution if/when needed.

For that matter, in my opinion, too many people and startups tend to over-engineer their initial product offerings, making them too complex and worrying more about nonexistent problems (like Google-sized scaling) rather than on solid features and business process. But that's fodder for another thread, I'm sure.

nailer · on July 7, 2009

CPU core. Not CPU (which is generally taken to mean CPU die).

rythie · on July 7, 2009

Really? That's 4 times worse than I thought then

nailer · on July 7, 2009

Yep. Oracle are famous for this - at my office we often talk about counting the molecules to work out Oracle licensing.

rythie · on July 12, 2009

The thing is that this makes it even harder to scale because the cost of licensing additional nodes is now several times the hardware cost.

TheXenocide · on July 7, 2009

For future reference, cold hard facts are much more useful than posting your resume. I'm not trying to say you're unqualified or unintelligent in any way, just that rationalizing an opinion with a job history is far less compelling evidence than factual examples. It's also interesting that you would immediately associate a well-made and supported argument (for those of us who have any experience scaling with technologies other than Oracle) with someone who isn't a database expert and doesn't know what he's doing.

Now let's get to the facts: while perhaps Oracle might have nice features, that doesn't mean that SQL is the best we can come up with, which is one of the primary points of the article (and one that seems completely unaddressed here). In a sense he brings to attention a common occurrence throughout human history wherein we reject change for comfort and these comments are doing nothing more than supporting that. Oracle is, for the most part, entirely too expensive for most businesses and even the businesses that are large enough to adopt it aren't making the profit they could with more affordable technology. You should also make note that, while banks do have strict requirements on data handling, they are really responsible for serving very small user bases when compared to things like Google, Amazon, Ebay, Facebook and the like. Sure the requirements are different for each of these, but ultimately your argument hold no water against these infrastructures, which are inherently not SQL and, at the same time, seem to be much more accepted as innovators of scalable technologies for the future...

One big problem is that real innovation comes at the expense of backwards compatibility, which would involve making a lot of changes. I can relate, since massive changes in most case imply bugs, which is a very unsettling position for some of these, but it doesn't mean there isn't a problem. Sure banks and other largely established companies would rather shell out the cash to support their legacy ideals than innovate new solutions, but he's right in that we've been spending many many years of man-hours trying to tackle the problem of porting a dated philosophy to an age that requires more scalability at lower cost. Lets also say that banks haven't been proving their practices to be economically sounds lately, so what is their input worth in this matter? Other than large boatloads of cash for Oracle that is.

nettdata · on July 8, 2009

Just to clarify, I never said Oracle was the be-all end-all solution for everything.

And I didn't rationalize anything with a "job history", but rather with large systems that actually have been built and are working.

I'd be very interested to see just how many people in this discussion actually have actual experience building large, scalable systems?

The original article made an asinine, generic statement without any context, or mention of cost, and I said it was silly, and pointed to the obvious (and easiest) reason why it was silly, and that was Oracle.

All of a sudden a bunch of people started making statements and assumptions about the scenario the article was probably talking about, that weren't actually made in the article, to discount Oracle as an option. Cost, commodity hardware, etc., etc. Then people started to point out other large websites that actually HAVE scaled, without the use of Oracle, as if that disproves Oracle's abilities... and yet it totally disproves the statement of the original article.

If you want to get into some context-specific details of why certain specific SQL technologies don't scale well (or at all), at a certain price-point, then that's a whole other discussion that I'd be happy to enter into, and would probably agree with.

It's also interesting to note that most of the "DB technologies" that are being used to scale those sites aren't DB's at all, but rather various levels of data caching that are employed to reduce the load on the databases, and are only applicable to the general read-only and non-transactional nature of social sites.

The whole reason I brought up banking sites in the first place is that they are one of the few, more obvious scenarios where most of your end-user interactions are actually hitting the database in real-time, and all data must be current and consistent. There is no real option for caching to save your ass, except at the DB layer itself, via such mechanisms as Oracle's Cache Fusion technology.

Social sites generally don't have any of those real-time, consistent constraints, and are therefore much easier to scale larger, because the nature of the site and the data allows for so much more technology to be used in front of the database.

The plain and simple fact of the matter is that building a large, scalable system is hard work. It requires that you analyze and design ALL aspects of the entire system to scale, not just the database. (Network, caching servers, application, database, hardware, etc).

TheXenocide · on July 7, 2009

Oh, it may also be worth noting that engineers and administrators would have to change their way of thinking in all this too, which doesn't affect some too much, but may have a bit of an impact on somebody who's been sitting on the same technology platform for 20 years.

staunch · on July 7, 2009

Yeah, and if Facebook used Oracle instead of MySQL and memcached, they'd be looking at a $50 million/year+ bill no doubt. Those silly kids with their minor technologies should learn to use a real DB. If those 100 million users that login every day only knew...

nettdata · on July 7, 2009

Firstly, I've never said that you couldn't build big stuff with MySQL or other Open Source tech. But if you for a second think that MySQL has the same kind of performance and feature set as Oracle, you're sadly mistaken.

The example you have given has nothing to do with the technologies, and everything to do with the application requirements.

An Internet Banking site has the need for real-time, centralized, transactional updates. There is no real option to cache a lot of stuff, or to delay the distribution of updates, or to shard/replicate data for reads, etc. It also has the need for real-time transactional replication, sophisticated auditing, global fault-tolerance, etc., etc. It's also a much more transactional site than something like Facebook.

Facebook's application requirements allow for a totally different set of tools to be used in a totally different manner. The dynamics of the site also has a lot to do with how it can be built. For example, Facebook is, for the most part, a read-only, fairly static site. That makes things a HELL of a lot easier to build out with their choice of tech. Same goes for SlashDot.

There are a ton of ways to build out something to the scale of Facebook, Slashdot, or LinkedIn using nothing but open source tools/technologies.

You cannot, however, build a realistic, large scale internet banking site using those same technologies. The only option is something like Oracle.

That's why a truly scalable system is so much more than just bolting on a DB to make it scale. It's about mapping the proper technology and tools to the business requirements, and figuring out how to deal with scaling issues before you even start to write a line of code.

Or, even better, abstract the database layer (Hibernate, etc), so that you can drop in more sophisticated DB technologies as you need to.

Just because something is free (MySQL), doesn't mean it sucks. Likewise, just because something costs money (Oracle), doesn't mean it sucks.

But there are many reasons (some of them even technical) why lots of companies have no problem paying stupid money for Oracle.

nostrademons · on July 7, 2009

"For example, Facebook is, for the most part, a read-only, fairly static site."

FaceBook's actually one of the more dynamic sites out there. I'd bet that the average FaceBook user makes many more updates to their FaceBook profile than they make financial transactions.

The real difference in requirements is that FaceBook can - and does - drop updates on the floor. If your friend throws a sheep at you and you don't get it on your news feed - oh well, FaceBook never claims the news feed is exhaustive anyway. But if somebody writes you a check for $10,000 and they see that you cashed it and you don't see it in your account, that's a problem.

nettdata · on July 7, 2009

I agree... never meant to imply that Facebook had fewer hits/etc than a banking site in raw numbers, just that the ratio of DB reads (not considered a transaction, very easy to cache, etc) to DB writes, makes the site much more read-only than write-only.

Facebook has the ability to do a TON of edge caching, with very few (relatively speaking) operations having to go to the database to perform an actual write operation.

It's the DB writes that really kill performance and limit your caching strategies and gains.

ntoshev · on July 7, 2009

My guess is apps hosted on Heroku (like the majority of the internet applications) are a lot like Facebook and little like online banking: they don't really need reliable transactions, they need to be flexible, scalable and cost-effective.

4buot · on July 7, 2009

For what its worth, Facebook does have a big Oracle install.

I find it strange that everyone trotted out their preformed opinion for this piece, but no one comments on the cassandra article which actually proposes a solution...although it requires you to learn something, how horrible.

TheXenocide · on July 7, 2009

http://incubator.apache.org/cassandra/

jhammerb · on July 7, 2009

are you sure facebook has a big oracle install? how do they use it?

rjurney · on July 7, 2009

It was the failure of Oracle to scale cost-effectively that led them to create HIVE.

bayareaguy · on July 7, 2009

You can't be talking about Hive[1], the SQL system for structuring and quering Hadoop datasets. That would never scale...

1- http://hadoop.apache.org/hive/

rjurney · on July 7, 2009

SQL atop an unstructured datastore, with minimal metadata. Something of an innovation, not a traditional RDBMS, but yes - still SQL :)

The nice thing about HIVE is - you're not limited to just SQL, though. You can still analyze your files on HDFS any whichaway, with Pig, with your own MapReduce jobs, whatever. Personally though, I look forward to Apache Pig getting SQL, and being able to run SQL queries on any intermediate state of a Pig script.

I don't think that anyone is really complaining about SQL. SQL is a swell query language for certain kinds of data. They're complaining about static schemas in relational dbs and having to store objects via SQL - reasonable complaints.

bayareaguy · on July 7, 2009

I don't think that anyone is really complaining about SQL. They're complaining about relational dbs.

Actually I suspect Mr. Wiggins is really only complaining about problems with MySQL and PostgreSQL. He would be less inaccurate if he admitted as much rather than making sweeping uninformed claims about SQL and relational systems.

4buot · on July 7, 2009

They use it for OLAP but are building out most of the new stuff in Hadoop,Hive because its cheaper.

trapper · on July 7, 2009

I've just read all your comments. Do you have a blog?

nettdata · on July 7, 2009

Nope. Had one for a while (http://orageek.com) but found I didn't really have much to say all that often to warrant maintaining it.

rjurney · on July 7, 2009

You can only scale Oracle on Big Iron, which means it doesn't scale well because its too expensive. Cost matters.

nettdata · on July 7, 2009

Absolutely wrong.

We use Dell 2950's for our DB nodes in our Oracle clusters. Biggest one right now is 12 nodes.

And if it scales, but costs too much (a relative concept at best), then it somehow doesn't scale any more?

Of course cost matters, and it comes down to what your application requirements are, and what your revenue model is, and your risk management requirements are, etc., etc.

If you want to talk technology only, it's a no-brainer.

If you want to talk cost and context-appropriate implementation of technology, then provide a detailed context, not generalizations.

Again, you're not going to see an online bank use CouchDB as their back-end, and likewise, you're not going to see a relatively free service use stupidly expensive technology.

rjurney · on July 7, 2009

How much data, and with a SAN? How much did the SAN cost? What kind of data?

But yes - if its too expensive to do, that means it doesn't scale well. Since when does money not matter in everything? The entire point of all of this is to use commodity PCs to achieve linear scalability cost-effectively, and to escape relational structure for data ill-suited to relational schemas.

I don't think anyone is suggesting an online bank should not use existing commercial databases and SQL.

logicalmind · on July 7, 2009

It's one thing when we're talking about facebook. It's entirely another matter when we're talking about a large bank. Do you want your bank to use mysql to store your financial data?

Regarding the importance of money, look at the existing infrastructure in financial institutions. They spend millions of dollars a year on mainframes from IBM. The cost of oracle compared to this is relatively low.

We're talking about two entirely different markets and applications of "databases" here. The claim of SQL db's not scaling is not true.

gnaritas · on July 7, 2009

You know what's funny... no ones talking about a bank but you. For most apps built by most startups, Oracle is absurdly expensive, so yes, it doesn't scale well.

pj · on July 7, 2009

Okay, see... here's more of the no-SQL agenda at hacker news. This isn't a question of "Does SQL Scale or not?" but "How much do SQL Databases Scale?"

I feel like the propaganda here is that because RDBMS doesn't scale to youtube or google scale they suck and that's not true. Like SQL is a waste of time because at some point, you're going to need to shard your database.

Look, at that kind of scale, you're going to have problems with any solution to any problem. Handling that kind of scale is going to be expensive no matter what solution you implement, whether it's map/reduce or flat files or some other solution.

But deciding to build a system from the beginning on something non-relational because someday you may have to accommodate that kind of scale is an example of premature optimization. The vast majority of features you get with SQL are going to outweigh the limitations of noSQL.

I've worked on some pretty high scale systems built on SQL and yes, there are problems, but there's just something irrational going on here and it's off-putting. It's like we are throwing out the baby with the bath water or something.

gaius · on July 7, 2009

As others have pointed out, the "no SQL" crowd are invariably MySQL users who have run into the limitations of MySQL but for ideological reasons can't state that the problems they encounter are specific to MySQL.

DB2, Teradata and Oracle users regularly tackle problems 100x larger than MySQL can handle.

TheXenocide · on July 7, 2009

I beg to differ; the company I'm at now had SAN representatives saying that they've never seen a non-SAN DB push as many IOPS as ours was before we started exploring scalability solutions and I'm still rather compelled by the potential of "no SQL" solutions, thanks. Oh, and no, we're not using MySQL or any other open source DB for that matter, so maybe it's you that has the prejudice here...

jmtulloss · on July 7, 2009

[citation needed]

gaius · on July 7, 2009

OK, try this: http://press.web.cern.ch/press/PressReleases/Releases2003/PR...

jmtulloss · on July 7, 2009

A press release isn't data. I'd be interested in seeing the comparative results of the trials CERN put Oracle through.

It's not that I don't believe that Oracle can scale better than MySQL, it's just that I haven't seen any convincing data that it can scale so much better that it's worth the cost. I'm no expert, so maybe the data exists. I just haven't seen it.

trezor · on July 7, 2009

Agreed. I also fund it hard to sympathize with these startups having ambitions to scale to Google-like sizes, yet are unwilling to pony up cash for a proper database system like Oracle or SQL Server.

Seeing the amount of ugly hacks people are willing to come up with and employ and features they are willing to cut, just to handle trivial loads, kinda makes me think that MySQL can only be considered free if your time is worthless.

gaius · on July 7, 2009

It's amusing that he thinks master-slave replication is MySQL's killer feature. Didn't IBM have it, umm, 30 years ago?

Also that he says SQL database and not relational database. The whole article reeks of inexperience.

timwiseman · on July 7, 2009

I agree with you with a caveat. Everything you say is absolutely true, but remember that many startups are highly focused on conserving cash, and moreso than hacker time.

The retail price (I know discounts can be negotiated, but for a point of reference...) for MS SQL Server Standard edition is $5000, enterprise edition is $25000. I have never had to research Oracle prices but I understand they run even higher.

I worked as an MS SQL Server DBA for a while for a mid-sized company, and I wrote and employed some "ugly hacks" to emulate some of the Enterprise features because management at the time was unwilling/unable to pay for Enterprise Edition.

gaius · on July 7, 2009

SQL Server and the rest of the Windows stack are effectively free under the BizSpark programme.

rythie · on July 7, 2009

It seems odd that startups should "pony up" for Oracle or SQL Server when the very company you mention - Google - does not use them to a significant degree. Nor do Yahoo, AOL, Facebook, Digg etc. AFAIK.

jacquesm · on July 7, 2009

yes, because we all know google runs on a 'proper' database system like Oracle or SQL server.

Google is one of those excellent examples of a non-sql datastore that works just fine and seems to blow the socks of anything the competition has come up with to date.

qhoxie · on July 7, 2009

Look, at that kind of scale, you're going to have problems with any solution to any problem. Handling that kind of scale is going to be expensive no matter what solution you implement, whether it's map/reduce or flat files or some other solution.

It is not a point of whether or not it is expensive. Scaling (nearly) always has expense associated with it. The issue is how much expense, and with some applications it is significantly less with one of the "no-SQL" solutions.

But deciding to build a system from the beginning on something non-relational because someday you may have to accommodate that kind of scale is an example of premature optimization. The vast majority of features you get with SQL are going to outweigh the limitations of noSQL.

You are making it sound as though a relational database is the correct choice barring any scaling. Perhaps you have not yet thoroughly evaluated some of the alternatives out there, because in many applications, there would be no step back from *SQL.

TimothyFitz · on July 7, 2009

RAID doesn't meet his definition of "scalable" because it has a central controller.

This whole post could be summed up as "ACID doesn't scale", which has been proven. Consistency, Availability or Partition Tolerance; pick two (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.1...).

Good introduction to non-ACID databases: http://highscalability.com/drop-acid-and-think-about-data

nettdata · on July 7, 2009

Oh please.

The mere fact that he's talking "RAID" instead of SAN speaks volumes. (No pun intended). Any storage engineer worth their salt would be rolling their eyes right now.

The doc you linked to was authored in 2002, about the same time that Oracle's RAC was introduced (late 2001). Some of his citations are from the 80's. THE 80'S!

DB technology has come a LONG way in 8 years, and that paper is no longer valid, unless you're talking about some basic, "minor"/open source db technologies.

If you want to say "open source DB technologies have problems scaling", then go right ahead, and I'll agree.

Just don't mind those of us who continue to build large, scalable systems, using the proper DB technologies, that disprove that "sql doesn't scale" generalization.

rjurney · on July 7, 2009

SANS are too expensive. They don't scale cost-effectively compared to commodity PC hardware.

nettdata · on July 7, 2009

I'd beg to differ.

Go take a look at Adam Leventhal's work with the Fishworks stuff.

Specifically, go check out the Sun Storage 7310.

It will scale HUGE, and is nowhere near the stupid cost of NetApp or EMC or the other major vendors.

I'd love to see how a bunch of commodity PC's will scale to 100TB, and still be manageable, and have anywhere near the same feature sets.

Again, if you're railing against something as ubiquitous as a SAN, I'm not sure there's anything I can say to change your mind.

Not that I'm really here to change your mind.

Again, the original focus was about SQL not scaling, and you seem to be fixated on the cost of that scaling.

moe · on July 7, 2009

I'd love to see how a bunch of commodity PC's will scale to 100TB, and still be manageable

100T is 666 spindles when you go RAID10 with 300G drives (common size in the SAS/FC area).

If you go S-ATA then it'll fit on 200 spindles using 1T drives. I'll stick to the latter variant for now because I'm too lazy to lookup Sun's pricing for SAS spindles, for my comparison below.

So, 200 spindles amounts to roughly 15 hosts (throwing in a few spares for good measure). The whole setup will comfortably fit into one rack, including the FibreChannel machinery and other fluff that you'll likely want.

Thus from the hardware side this is trivially managable, 100T is just not a lot of data nowadays.

On the software side it's up to your creativity and mostly depends on what you actually need to store. I've seen people setup commodity postgres clusters, as well as more fancy things like HDFS, GFS or homegrown storage layers that way. And it worked.

And, regardless of the indeed relatively sane pricing of the Sun (formerly StorageTek) products, the bottomline is what makes the difference.

In figures, for a 100T SAN on the 7310 you're looking at something like $50k for one head, plus around $75k for three trays. We're in the $125k ballpark, hardware-only. And I'm being rather optimistic here: This setup actually holds only 96T and the head is maxed out (3 trays max per controller/head). That means your next upgrade will incur another $25k markup for the next head, good thing you didn't ask for 150T...

Squeezing the same amount of storage into 15 min-spec supermicro pizzaboxes I arrive at roughly $3000 per node, including spindles. A good buyer will get them cheaper. This commodity cluster sets us back only ~$45k in hardware.

That's 1/3 the price of the 7310 solution, being optimistic on the Sun and pessimistic on the commodity side.

That kind of difference makes up for a lot of development effort for the custom solution - most of which is a one time investment anyways and yields better flexibility in the long run. It's also the reason why most of the big boys don't use off-the-shelf SANs for their primary storage.

itgoon · on July 7, 2009

FWIW, I've built one of those SuperMicro machines, with eight drives, around the beginning of the year.

Allowing for redundancy (RAID 5 on six disks for data, and mirrored the boot drives), we ended up with 6.5TB of storage at about $3500 (that includes i7 proc and 12Gb RAM, and a fancy controller). My labor added some more on top of the $3500.

Mostly, we've been happy. It is more sensitive to heat issues than our HP G5s, and of course, generates more of that heat. If I were to go with any kind of density, I'd need a much more robust cooling solution.

moe · on July 7, 2009

Yep, that spec would be over the top for a storage node, though. A large chunk likely went into the controller and the i7 whereas in a storage-node most cheapo controllers (i.e. 2x8 is cheaper than 1x16) and whatever old xeon/core2 will do.

Wrt cooling, for actual servers (i.e. not storage nodes) we've had good success with Sun XFires, the 4100+ range. They are very well built and the markup over the SuperMicro junk is minimal (around 15% last time I checked). Among the niceties is a nice array of hot-swap fans - something that SuperMicro still doesn't seem to deliver in their popular chassis.

I'd also consider the xfires for storage nodes if they'd take 3,5" drives, but afaik all their low-end models only have max eight 2,5" slots. That's just no worthwhile density when compared to the larger SuperMicro tins (which go up to 30x3,5" now I think).

itgoon · on July 7, 2009

Most of the money went to the hard drives themselves. They've come down a lot since (1.5TB had just been announced), but it was around 70% of the cost. Everything else was relatively cheap.

The 2.5 drives are good, and seem to becoming typical, because they run much cooler. We've got some G5s in the same closet, with just as many drives, and they don't run nearly as warm.

rjurney · on July 8, 2009

I second the 4100s. I had one on a loaner program, and they are sweet machines.

rjurney · on July 7, 2009

You can use Hadoop to scale to 100TB using Commodity PCs and still be manageable, much easier and cheaper than you can use Oracle to do same. The featureset could easily DWARF those available using Oracle, as you can mapreduce all the data using your own hadoop jobs, and it can hold whatever kind of data you want it to - you're not limited to a static schema and precomputed summaries. HIVE is running on commodity PCs at facebook on more than 100TB, and they prefer it to their enormous, overly expensive Oracle OLAP system. So there's your answer, for one use case.

But of course, if you're updating the data often - you wouldn't use hadoop (or if you were, you would run HBase or some such on top of it). But there are many use cases where you only write once at that scale. And in those cases, from my perspective - its much nicer to scale on commodity hardware than on big iron.

TheXenocide · on July 7, 2009

Maybe you should ask Google... they seem to be doing pretty well with commodity hardware and something's telling me that they've got better reliability, higher efficiency and lower costs. Also, unused features merely add unnecessary complexity to a system. Thanks for the info on the cheaper SAN though, I'll have to look into it.

derefr · on July 7, 2009

Why are there no open-source implementations of the "proper DB technologies", then?

nostrademons · on July 7, 2009

People can still make money charging for them?

Open-sourcing is the last step in the technology lifecycle, after the technology has become widely understood and commoditized. When people can still make money off something, they will. What's the incentive for them to give it away?

derefr · on July 7, 2009

Usually, an ideological one. The first group of Linux kernel developers (past the "toy" stage) all hated Microsoft with a passion, and wished to deprive it of as much revenue as possible. They wanted to give people a "free alternative", and lower the total investment people [that is, they] would have to make in owning a computer. I could imagine an analogous situation with some developers and Oracle.

trezor · on July 7, 2009

Being ideologically for open software and an open operating system is not the same as "hating Microsoft with a passion". Mixing these two different concepts into the same bag is misleading, as the first one represents an ideological conviction and the other merely childish spite.

And if you insist on being cheap, don't be surprised when it turns out that your free database was indeed some cheap stuff which doesn't come fully featured.

derefr · on July 7, 2009

It's a causal relationship. If you believe that all software should be free (the GNU folks—those "first kernel developers"), then you must at least dislike any company which tries to profit from the creation of artificial scarcity of software. The two groups (the idealists and the "haters"), which now have very little overlap, originally started much the same.

> Don't be surprised when it turns out that your free database was indeed some cheap stuff which doesn't come fully featured.

But if being "fully featured" is the goal, then it would be very surprising indeed if what you considered a "competitor" was not, in fact, fully featured. It would not, then, by definition, be a competitor. Or, at least, it would not be worth calling version 1.0 yet.

To refine that: Oracle currently has no FOSS competitors, because there are no FOSS databases that are trying to compete with Oracle. They may be trying to take parts of Oracle's market share, but this is a different thing—optimizing their fit for a situation where Oracle itself is a bad fit.

pj · on July 7, 2009

I've seen the term artificial scarcity used a few times in this context. Could you elaborate on what you mean by that?

When I think artifical scarcity, I think something like DeBeers who has warehouses full of diamonds they are hoarding. Whereas with software, there actually is a scarcity of great programmers. There are only so many of them.

With software, you know... it takes time and energy to support it. It costs money to write help documentation, pay for servers where people can download it. All that stuff costs money.

Is open source anti-capitalism?

Is the artificial scarcity coming from the thought that because software is just 1's and 0's that it can be copied almost infinitely? It takes a lot of time and energy to build something that is worth copying. What about all that research and development and risk that was taken to build it. How are those risk takers and R&D staffs going to generate a return on that investment without charging for the product they create?

derefr · on July 7, 2009

Artificial scarcity is not an ethical or legal principle, but an economic one. Because 1s and 0s can be "copied almost infinitely", in order to make money on it one has to enforce an artifical constraint on the number of copies that are allowed to be made. The distinction is simply that with its opposite, natural scarcity, no effort needs to be made to ensure that each object has value on its own.

To put it another way, the "default state" of a naturally scarce commodity, when it is simply produced and then discarded, is still high-value. It can be resold at auction, or traded in a market, and will be fought over even in a state of legal and moral anarchy. The default state of an artificially scarce commodity, however, is value asymptotically approaching zero; without some sort of agency to "prop up" its value, it is worthless, and no secondary party will assign it value unless it is forced to as part of a larger deal (e.g. accepting the legal code to be a citizen of a country.)

Your question ("is open source anti-capitalism?") is just a matter of equivocation. Open Source fits just fine with capitalism, but it is not the capitalism you would first imagine (i.e. that of the US); it is instead "true", or lassiez-faire Capitalism.

In Lassiez-faire Capitalism, there can be no artificially scarce commodities; they are simply a market inefficiency to be eliminated, along with those producing them. A true Capitalism would destroy the value in all "information products"—movies, books, music, games, and, yes, software. In a true Capitalism, value is just "what people are willing to pay"; you don't deserve an ROI just because you worked hard, you only earn money if people feel that your commodity has value to them. In such a market, the only software that could exist is that which was produced for other motives than profit—open source software—or produced as a means to an end of profit, e.g. software that makes a business process more efficient, rather than software that is a product in-and-of-itself.

And now you see why we do not use such a capitalism ;) Though some "creatives" would survive, whether as on-the-whole consultants or by donations from fans (see http://pc.ign.com/articles/967/967564p1.html), the majority would not. In reality, a majority of people desire to keep these creatives around and producing, even at the cost of large market inefficiencies. Realize thus that copyright, more than anything, is a form of socialism, in that it redistributes wealth to those we think deserve a "fair share" for their efforts. Sort of mixes up the arguments most people have on the subject: Open Source is anti-socialist ;)

stcredzero · on July 7, 2009

Artificial scarcity is not an ethical or legal principle, but an economic one. Because 1s and 0s can be "copied almost infinitely", in order to make money on it one has to enforce an artifical constraint on the number of copies that are allowed to be made.

Good customer service and a deep understanding of specific technical issues are not trivially reproducible.

pj · on July 7, 2009

Okay, I think I understand better now. There's only physically so much gold on the planet. Barring alchemy, we can't make more of it, so it has a natural scarcity.

From that, I gather that we can't use the traditional economics of Supply and Demand to allow a market to decide the value of software. Because the denominator in the equation demand divided by supply is infinite, essentially, it doesn't matter what value demand has in the numerator, because anything divided by infinity is zero.

But my initial impression of this model is that it is doomed to failure. It is doomed to failure, because the artificially scarce product being produced itself requires the consumption of naturally scarce products that are subject to the laws of supply and demand. The developers that produce the software for example need to eat, the need to live in a house, they need to consume energy to drive to work, etc...

So how do we find a balance? Do we find a new way to value software or will software just go away? Will software be relegated to a charitable organization like in the article you linked to, which was quite interesting by the way, so thanks for that. Will the only valuable software be that software which is paid for in advance to be developed at the risk of the consumer? "If you pay me, I'll build it, otherwise, I won't" sort of scenario?

It's like the software only has value if it doesn't exist. If the mechanism to produce the software exists only in the minds of a scarce few who can implement the solution or who have the ability to control access to it, perhaps through SaaS or some other monthly subscription mechanism like WoW or battle.net.

Perhaps we can value software based on how much additional revenue it helps you generate or how much savings it generates through optimizations or automations.

How are we going to strike that balance? Is the gold rush for software over? If there is no carrot, who will run those wheels and invest in our future? Who will take the time to solve the problems before the problems arise? Take Oracle vs. MySQL. Oracle has solved a lot of the problems MySQL has. Over time, as people contribute to the MySQL code base, those problems can be resolved, but consumers of MySQL have to wait for someone else to implement the solution, or they have to pay for a computer programmer to find a work around or implement some solution outside MySQL.

It feels like it will slow us down. Corporations like Oracle and Microsoft will survive a little while longer, but if MySQL ever actually does become as good as Oracle, then people will stop paying for Oracle. If people stop paying for Oracle, Oracle can't hire the best and the brightest.

Well, I appreciate the answer, I do understand better, but that just leaves me with more questions, so I'm thinking out loud.

All I can think of right now is that software will all move to aaS models or be embedded in physical devices so that we can attach a natural scarcity to them. There are only so many factories that can produce microchips. The article you linked to suggested selling plastic figurines with the software, which is a similar idea. It harkens back to the age of dongles.

stcredzero · on July 7, 2009

Perhaps we can value software based on how much additional revenue it helps you generate or how much savings it generates through optimizations or automations.

My former company tried to do this. People balk at this. "Why should I pay when there's 'free?'"

nailer · on July 7, 2009

> the GNU folks - those "first kernel developers")

First kernel developers like Linus, Ted, Randy etc aren't really GNU folks.

Confusion · on July 7, 2009

RAID setups can be outfitted with redundant controllers, in which case it does meet his definition of 'scalable'.

antirez · on July 7, 2009

I'm the author of Redis so I should be biased the other way around, but the Redis and KV experience taught me that to be exposed to the KV parading was for me a similar experience as learning the Scheme language, I started to write code in imperative languages in a new way. In the same way once you start thinking at scalability of your data in a new way, from the point of view of partitioning the data, organizing this data in away that is easy to access for your usage pattern, make judicious use of serialization, SQL databases can be a good pick. Simply you need to abandon the paradigm of let's design our nice tables and run our multiple joins and group by against.

But of course designing DBs in this new way makes most of the SQL features not needed in most scenarios... and you still have the overhead. And there are a number of other limits now that complex designs are not a good idea, for instance there is no way to get back the data in the natural ordering (the order you pushed this data, or the reverse)...

This is where KV stores start to be interesting as real world alternative, not just in order to learn the paradigm of scalability. But again, if you don't trust KV stores it is truly possible to use an SQL database in a more conservative and scalable way.

bayareaguy · on July 7, 2009

SQL databases are fundamentally non-scalable, and there is no magical pixie dust that we, or anyone, can sprinkle on them to suddenly make them scale.

I find this claim laughable. If the databases Mr. Wiggins chooses to work with fail to meet his scalability requirements, perhaps he should consider different databases. Scalable SQL databases have been around for over 3 decades.

qhoxie · on July 7, 2009

Your comment would hold more weight if you elaborated some more. Are you referring to scaling by his standards? Please explain.

bayareaguy · on July 7, 2009

Mr. Wiggins would have us think in terms of "true scalability" which conflates the following concerns: more servers creates more capacity (the classic definition of scalability), the business logic of the app is separated from concerns of scaling server resources (this is not possible past whatever limit you set and so has no classic definition) and no single point of failure (more classically formalized as availability).

Companies such as Teradata have long offered SQL systems which meet the classic definitions of scalability and availability.

I think TimothyFitz's reply above was accurate. Mr. Wiggins article would be better titled something like "ACID databases have scalability problems, especially cheap ones startups use" but then, like http://news.ycombinator.com/item?id=690653 , it wouldn't get much response.

Confusion · on July 7, 2009

So while sharding is a form of horizontal scaling, it fails point #2: it is not transparent to the business logic of the application.

I do not believe this is usually true, for two reasons. One may be nitpicking, but 'where to get the data' is not part of the business logic: it's pure application logic, dependent on your solution of the problem. In that sense, his argument is wrong. The other reason definitely isn't nitpicking: you can solve the problem by adding a layer between your DAO's and the databases, that handles the 'where to get the data' question. So yes, it requires some programming, but it is still transparant to the business logic. It does not require an invasive change in your application and I think he is grossly exagerating this point.

prodigal_erik · on July 7, 2009

I don't see a good reason for "which server has my data" to be in-your-face application logic to code, when "which sector of which disk has my data" has long since been delegated to the platform and forgotten.

Confusion · on July 7, 2009

If you are bound to SQL databases, either for historical reasons or because the no-SQL database make other problems harder, then 'scaling the application' is a pretty good reason for writing that logic. There is no silver bullet.

alexgartrell · on July 7, 2009

Crazy idea: Write your data-handling code in it's own tight little module. Use good abstractions so that the rest of your app doesn't give a crap HOW it's happening. When it's time to update to millions of clustered servers running bigTable, you rewrite the module, and you're done.

Software Engineering saves the day!

alexgartrell · on July 7, 2009

3 upvotes and 2 downvotes without a single response.

As a junior member of Hacker News, I demand an education when I'm downvoted! :)

akeefer · on July 7, 2009

I wasn't one of the ones who upvoted or downvoted you, but just because you asked . . . while abstractions around your data-handling code are always a good idea, on some level they always leak. (For example, see the classic post http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Compute... describing how ORM, one of the most common approaches to encapsulating data-handling code, tends to fail).

Most of your code is data-handling code on some fundamental level, so it's pretty inevitable that it's going to matter to your application how that data is stored and what sorts of operations you can do on it, and that's especially true if you care about performance (which you generally always do after some level). So even if you have a nice abstraction of your query layer, the kinds of questions you can ask the database efficiently depend heavily on the underlying storage mechanism, and so your application logic has to be built so it only asks the right kinds of questions.

Even abstracting away the differences between, say, Oracle and MS SQLServer and MySQL is difficult enough, because of the different capabilities and performance characteristics. When you do that, you basically end up coding to the lowest common denominator, which can often limit what your applications does.

Trying to come up with an abstraction that can encapsulate the difference between a row- versus column-oriented database, or between a relational database and some other kind of storage, is pretty much a losing proposition: they're too fundamentally different in terms of what kind of data you can store, how you can store it, how you can query it, what kinds of transactional guarantees you get, and what operations are fast and which ones are slow.

So you really do kind of have to take your best shot at it, choose an approach, and if you choose wrong and have to change, it's just going to hurt. A lot. Good encapsulation and abstraction will ease some of the pain, but it's more like drinking whiskey before your leg gets sawed off than it is like general anesthesia.

tybris · on July 7, 2009

Well, you're right in the sense that Newton was right about gravity. It works perfectly down here on earth, but breaks down out in the universe. In a distributed system you'll want to take into account caching, replication, atomicity, security, etc. In other words, that's either going to be a hell of a complex API or an API that gives you too little control to build a high-performance system. There's no telling what works best, every computer system is a set of trade-offs and it always depends on the application.

stuff4ben · on July 7, 2009

I thought it was a good summary of the problems facing SQL databases as organizations grow. However I would have appreciated some solutions to the problem.

dflock · on July 7, 2009

I think one of the points made at the end was that there kinda isn't one - people have been trying to come up with the perfect solution to scaling RDBMS's for ever and the current state of the art (pretty much) is as described in the article - i.e, not great. People with really massive MySQL setups (eBay, for example) basically just do a huge amount of the stuff described here, but with lots of message-queue type stuff to glue it together and loads of hard work; i.e. you can make it scale but it's really really hard work because RDBMS's aren't inherent scalable.

gaius · on July 7, 2009

the current state of the art (pretty much) is as described in the article - i.e, not great

That simply isn't true.

For me, thousands of transactions per second and 10s of terabytes of data on a single database is normal. It's unremarkable, it's everyday, it's what we do, we have done it for years. And I know of installations handling 10x that. It's only people who's only "experience" is websites that whinge about how RDBMS can't handle their tiny datasets.

dflock · on July 7, 2009

What hardware is this running on, roughly? What the author was getting at, I think, was not that you can't do it, just that it's hard/expensive; you need fairly beefy hardware and experienced DBA's to manage horizontal scaleout manually.

DanielBMarkham · on July 7, 2009

If that were the case, then I think the concluding sentences should be changed.

When hundreds of companies and thousands of the brightest programmers and sysadmins have been trying to solve a problem for twenty years and still haven’t managed to come up with an obvious solution that everyone adopts, that says to me the problem is unsolvable.

gaius · on July 7, 2009

That's a very curious statement, especially for HN. Why would you consider needing good, experienced people to be a disadvantage?

dflock · on July 7, 2009

Because that kind of hardware+DBA team is prohibitively expensive for a startup?

gaius · on July 7, 2009

Hardware yes, but the basic premise of a startup is that your people are top-notch. Scalability doesn't imply building something up-front that can handle enormous loads - it means building something that can grow with you. As opposed to "oh crap, we've got some load now, better start again" a la Twitter.

wildwood · on July 7, 2009

"Sharding kills most of the value of a relational database."

I can appreciate the point the author makes about partitioning schemes requiring heavy integration with the business logic, but I disagree with the claim that sharding doesn't work.

It's probably more accurate to say that sharding only works well if you design your database very carefully, or just get lucky about how your data model maps to sharding schemes. A bad sharded database can probably hobble your app, but a good one does get you remarkably close to true horizontal scaling.

In my experience, at least.

prodigal_erik · on July 7, 2009

Sharding means you can't do "select * from a join b using (k)" unless you can prove the a record and the b record will always be hosted on the same shard. So of course you often have to write application code that knows where they are and goes and gets them separately. But that's precisely the problem that relational databases were developed to solve for you.

Then there's the whole distributed transaction thing, which is so painful that some people just live with wrong answers instead.

psadauskas · on July 7, 2009

If you data model maps well to sharding, it maps well to distributed databases, and you're not gaining much by being on an RDBMS. I think the point they're trying to make is if you're going through the contortions to shard a relational database, you're better off just doing it in a distributed database in the first place.

TheXenocide · on July 7, 2009

There's more than one way to skin a mongoose and in this case the more common shard solutions are fundamentally different from distributed DB in that a distributed DB often times contains large amounts of a a subset of data that has to be merged with data on other DBs whereas a shard can have entire schema duplication on different shards where you can use relational technology just fine and even if the query is distributed you can just add rows to the end of a dataset rather than joining the data columns manually (implementing your own relational joins). In many cases you partition your sharded data in such a way that the majority of your queries are not distributed so that they can continue to leverage the relational model you had before, only performing a distributed query for more complex actions like reporting and such.

Also going back to the first place isn't really an option, shards usually come of a system that has grown, not ground up design. Ultimately I still support the distributed model myself, but a shard model does support relational data much more so than a distributed DB (at least out of the box).

brlewis · on July 7, 2009

This article is about relational databases that you read from and write to. It isn't SQL-specific. It's a good article, not another "SQL is obsolete" as you might think from the title.

Shakescode · on July 7, 2009

In the final paragraph he compares himself with Capt Kirk, facing the Kobayashi Maru: "we can only solve this problem by redefining the question."

Well, perhaps that's as difficult as solving the "SQL scalability that 20 years of brilliant programmers haven't solved"?

...because he declines to attempt to reframe the question.

[ or maybe he's leaving us wanting more: like his next post? ]

Anybody up for reformulating the question?

ojbyrne · on July 7, 2009

At a management level there are larger problems. SQL was a solution to the "letting programmers run the show" problem. I.e. a commoditized language that executives could mostly understand and budget for. See pretty well everything Philip Greenspun has written.

JulianMorrison · on July 7, 2009

Scale like what? If you aren't IBM, Facebook or Google, why do you even care?

jacquesm · on July 7, 2009

ouch, I'm not IBM, Facebook or Google but SQL scalability issues are a good part of my daily workload.

(the other part is made up from file system scalability issues).

Once you you get past a certain level these are non-trivial problems and anybody out there that is busy solving them has my interest.

tybris · on July 7, 2009

Can anyone give me the context? I don't believe in absolutes. I only believe in "it depends on the application/budget/knowledge/etc"

antidaily · on July 7, 2009

Maybe I am getting old, but I no longer find arguments over semantics as interesting as I once did.

jmtulloss · on July 7, 2009

I agree, but I'm not sure this is purely a semantic argument. Care to elaborate?

tommy_chheng · on July 7, 2009

Isn't Facebook using MySQL?

trezor · on July 7, 2009

Summary of an article about how "SQL Databases Don't Scale":

  1. Mentions RAID, not SAN
  2. Mentions MySQL and only MySQL (with the exception of PostgreSQL once).
  3. Mentions Master-slave replication as a killer scalability feature.

I suggest he rename the piece to "A $500 server and MySQL Don't Scale" and then we can all agree and get along.

khaless · on July 7, 2009

Yeah, there certainly are a few more tricks with regards to scaling RDBMS than the author covered. Depending on what your demands are there are different techniques and protocols which best suite you and can go very far to solving your problem. But with that in mind there may be better ways to solve your problem, and we should not forget about those.

If You start sharding you may get write gains, but you have to work very hard to keep things consistent (depending) and you may have to duplicate shards to make them highly available ($$$). Oh - and later down the track your schema might change in ways which your sharding scheme is just not flexible enough to deal with and depending on who you are that may be too much of a risk.

Besides the issue is really with availability, consistency and performance. It is very hard to scale all three of these together and even your cashcow solutions will hit their limits (although some of their limits are quite high :))

TheXenocide · on July 7, 2009

Most of the things he mentions apply to most DB engines, MySQL, MS-SQL, Postgres, Firebird and many editions of Oracle (all of which I've used in data-intensive applications) suffer the same woes. Ultimately it would appear that you went into it reading the article with a narrow minded view of "Yes they do!" instead of "Why don't they?" Also, the disk analogy was for relating information not discussing storage technology or to imply that was what he was using when trying to scale. The portion of vertical scalability clearly presents the option of (and pitfalls therein) using hardware to account for data scalability...