
Linkbait. One person's opinion, and the article doesn't even attempt to validate it against other experts or people with real knowledge of what Facebook's architecture and current concerns are.

Agreed. Not to mention the guy has founded a number of alternative solutions/competing companies. Of course he'd take any opportunity to say sensational things about the incumbent technology.

I truly don't know how such a ridiculous article made it to the top of Hacker News:

The widely accepted problem with MySQL is that it wasn’t built for webscale applications [snip]


In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.


However, it was quickly discovered that while NoSQL might be faster and scale better, it did so at the expense of ACID consistency.

This is ridiculous. He's making claims about a whole group of software whose only definition is a feature or attribute that they don't have. And it's flat-out false. Look at Riak (a key/value store) and how it lets you tune the consistency trade-off on each query or write.
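To make the tuning point concrete, here is a minimal sketch of the Dynamo-style quorum rule that stores like Riak are generally described as following. The names n (replicas per key), r (replicas that must answer a read) and w (replicas that must acknowledge a write) are my assumptions for illustration, not Riak's actual client API.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True when every read quorum shares a replica with every write quorum.

    n, r, w are illustrative names (assumed, not a real client API):
    n = replicas per key, r = replicas consulted on read,
    w = replicas that must acknowledge a write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    # If r + w > n, any set of r replicas and any set of w replicas
    # must have at least one replica in common.
    return r + w > n


# Stricter settings: reads are guaranteed to touch a replica that saw the last write.
strict = quorums_overlap(n=3, r=2, w=2)
# Looser settings: faster, but a read may miss the latest write.
loose = quorums_overlap(n=3, r=1, w=1)
```

The point is only that the trade-off is a per-request dial rather than a fixed property of "NoSQL" as a category.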

The whole article is genuinely a massive trolling piece.

In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.

He lost me here.

when web startups decide they need to build a product in a hurry, MySQL is natural choice. But then they hit that hockey-stick-like growth rate like Facebook did

An awesome problem I'm still waiting to have.

With all due respect to Stonebraker, I consider FB's MySQL use a success story. They've successfully scaled it to truly crazy traffic loads. It might be a pain, and sharding schemes are always a pain in the ass, but they've scaled it to levels unheard of for any of the NewSQL solutions being proposed as silver bullets.

FWIW, as an end user completely unaware of FB internals, facebook.com is as fast as I need it to be.

It’s called NewSQL [..] Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB,

It would be easy to accuse Stonebraker of tooting his own horn, but NewSQL vendors have been garnering lots of attention, investment and customers over the past year.


Facebook's use of MySQL makes total sense, just like their use of PHP does. The site was written in those languages/technologies way back when, and while it might all be held together by duct tape and bubblegum, a wholesale rewrite is a TERRIBLE idea.

Does Facebook run and make money? Are users happy? Is it fast?

In general if those three things are good, then why do a rewrite or a massive switch?

Facebook has very smart developers, and if they thought they could go out and swap database vendors and get a big gain, I bet they would, but they don't. Why? Because I bet the gain isn't as big as advertised.

VoltDB is in-memory so it's fast? Cool, but FB already uses memcached, so you aren't going to get any huge perf wins there. Maybe it would be simpler and maybe if FB started today they might use it, but the cost to swap out their proven architecture for something new and shiny probably doesn't provide a big enough user benefit for it to be worth their time.

At the end of the day, very few companies truly understand what it is like to have half a billion people logging into your website on any given day to chat, send messages, upload photos, play games, and so on. Any platform is going to have problems at that load; swapping databases doesn't solve that at all.

MySQL and PHP were terrible technologies even in 2005. Maybe if Facebook started in 1999 they might have an excuse (even then it's debatable) but 2005?

Also, architecturally there's no reason to do a massive re-write of any software all at once, and it usually ends in tears. They can do it piece by piece and they should be making those investments now if they hope to last another half decade.

I'm intrigued by this, but there's so much ambiguity around what people mean by sql, noSql, and "newSql" (this post is the first time I've read this term).

I clicked through a bit to get more information... here's a potential definition:


“NewSQL” is our shorthand for the various new scalable/high performance SQL database vendors. We have previously referred to these products as ‘ScalableSQL’ to differentiate them from the incumbent relational database products. Since this implies horizontal scalability, which is not necessarily a feature of all the products, we adopted the term ‘NewSQL’ in the new report.

Like noSql, this term also appears to be intentionally broad, indicating a problem to be addressed rather than a very specific spec or solution. I don't have any problem with this, but the ambiguity can cause a lot of confusion...

Are they doing away with the relational model? Or the SQL language itself?

Here's another blurb that may answer this:

"Like NoSQL, NewSQL is used to describe a loosely-affiliated group of companies (ScaleBase has done a good job of identifying, some of the several NewSQL sub-types) but what they have in common is the development of new relational database products and services designed to bring the benefits of the relational model to distributed architectures, or to improve the performance of relational databases to the extent that horizontal scalability is no longer a necessity."

Which is great, but is it really new sql? I'm not sure the old relational model or sql language made any specifications or recommendations about how it should be implemented. This might just be an implementation of sql and relational databases that improves performance in distributed systems. Again, a great idea, but is a new implementation of a back-end the same as a "newSql"?

I was initially very sceptical about "noSql" until I understood what it really meant (from what I've read, it was a movement away from using rdbms and sql as the default data store for all persistence, and toward a consideration of other persistence frameworks that might be more optimal for the task at hand - hard to find fault with that).

Similarly, "newSql" sounded like something more revolutionary than I'm seeing here. As far as I can tell, this is same old sql, but with a new implementation designed to scale well in distributed systems. Great idea, but not loving the name here.

I'll eat my hat if one of these newcomers can outperform Teradata, DB2 or Oracle in any widely-accepted benchmark. Real vendors publish TPC results.


SQL is a language, MySQL is a database implementation (and arguably a pretty crappy one). There is nothing about SQL that makes it so that it can't operate at 'webscale'. Oddly enough, Facebook and Google have both made MySQL work at 'webscale'. Most 'webscale' databases rely on keeping data in memory; you'll see equally impressive performance improvements in MySQL if you run it on a ramdisk.

Google uses MySQL?

Google definitely uses MySQL via YouTube, at least.

From 2007: http://ebiquity.umbc.edu/blogger/2007/12/28/how-youtube-scal...
Undated job opening: http://www.google.com/intl/ln/jobs/uslocations/mountain-view...

Last I heard, AdSense runs entirely on MySQL. They tried migrating to Oracle for some reason, but backed out.

Not for web search obviously

For some things, yes.

I'm always confused by people suggesting that the options are to have either MySQL or NoSQL. (Or, as this article suggests, "NewSQL", which I'm desperately hoping will be clarified by its proponents.) Is Pg not an appropriate suggestion for people looking to improve their SQL-based backends?

Edit: The article includes this gem: "The widely accepted problem with MySQL is that it wasn’t built for webscale applications..." Every time I see "webscale", I get the feeling that the author doesn't actually have a great grasp of what's going on. However, this guy did mention ACID, which confuses me. Did we define "webscale" when I wasn't looking?

The question for most isn't really "MySQL or NoSQL", but "ACID or not." For most people, ACID means MySQL because PostgreSQL (despite my own affection for it) is a major, major pain in the ass to work with on a smaller scale (edit: and Oracle costs more than most programmers' houses, so that tends to rule it out in a lot of cases). PostgreSQL's tooling is considerably worse than MySQL's (phpMyAdmin is damn near ubiquitous and has no reasonable pgsql competitor). The documentation expects you to already be a guru. And there's often just no really compelling performance reason to pick it over MySQL.

And then, when you get to Facebook's scale--though as a coworker just pointed out, Facebook probably isn't particularly ACID-compliant--Postgres is going to have problems that are pretty much the same as those you see in MySQL.

PostgreSQL isn't really "better than" or "an improvement over" MySQL. Different tools, different tasks--though they do have considerable overlap.

Stonebraker has a new DB (VoltDB), so his claims and criticisms should be read in that light.

The false dichotomy between MySQL and NoSQL is really irritating. Just because MySQL isn't solving the problem, doesn't mean that you have to completely change the database structure.

There are more grown-up SQL servers out there that can handle way more than MySQL, and not one of these "NewSQL" things that the author talks about.

Sure, let's bet the whole site on a really new database. Worked really well for Digg!

I doubt Facebook is going to go spend megabucks on Oracle, though. And Postgres isn't going to make your life all that much better at FB levels of scale.

Spending money on Oracle or moving to Postgres will just make things worse.

At that scale you need to do 2 things, denormalize and cache. (And, if you're FB, buy 75% of FusionIO's production). Big complicated queries just don't scale like simple key-value lookups, especially if they impose locking constraints on your database or try to join against things that are sharded to different servers.

Doesn't really matter what database back-end you're using if you're only using the dead-simple features.
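A rough sketch of what "shard, cache, and stick to key-value lookups" looks like, assuming plain in-memory dicts stand in for the database shards and for a memcached-style cache. ShardedStore, shard_for and backend_reads are hypothetical names for illustration, not anything Facebook is known to run.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy cache-aside store: dicts stand in for the shards and the cache."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]
        self.cache = {}
        self.backend_reads = 0  # counts lookups that had to hit a shard

    def put(self, key: str, value) -> None:
        # Write to the one shard that owns the key, then drop any stale cache entry.
        self.shards[shard_for(key, len(self.shards))][key] = value
        self.cache.pop(key, None)

    def get(self, key: str):
        # Cache-aside read: try the cache first, fall back to the owning shard.
        if key in self.cache:
            return self.cache[key]
        self.backend_reads += 1
        value = self.shards[shard_for(key, len(self.shards))].get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = ShardedStore(num_shards=4)
# Denormalized record: everything the page needs lives under one key, so no join.
store.put("user:1", {"name": "alice", "friend_count": 2})
first = store.get("user:1")
second = store.get("user:1")  # served from the cache, no shard lookup
```

Every read touches exactly one shard and needs no join or cross-server lock, which is why the choice of back-end matters so little once you are down to these features.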

It depends what you mean by "spending money on Oracle". They'll happily sell you TimesTen or Coherence as well as the classic RDBMS, remember. And the classic RDBMS in the right hands is bloody quick.
