Hacker News new | past | comments | ask | show | jobs | submit login
Why Uber Engineering Switched from Postgres to MySQL (2016) (uber.com)
129 points by DDerTyp on June 10, 2018 | hide | past | favorite | 47 comments

If what they wanted was a Schemaless, sharded data store, then perhaps Postgres isn’t the best or most fair comparison. Also the claim of issues with table corruption seems strange / unbelievable— I have used Postgres as my daily driver for data (at scale)!for 12 years now and yet to see it. But again, I haven’t got same scale requirement as Uber.

I think Evan's identified a ton of problems with Postgres (at the time) that extend far beyond just the query/data pattern. Many of these problems were directly acknowledged by the PG team, which made for interesting reading: https://www.postgresql.org/message-id/flat/579795DF.10502%40....

Use any database enough and you'll find weird table corruption bugs. I've seen them in MySQL and MSSQL in my professional career. Fortunately in both cases it was caught before there was an unrecoverable issue, but it happens.

I've read the article before and I do agree, Postgres wasn't really the tool for what they're doing.

> Use any database enough and you'll find weird table corruption bugs

I'm not convinced. The one feature I really need a database to have is to guarantee that whatever data I write to it, I will be able to read back unaltered.

I'm a heavy Postgres user since 2001 and I have never seen table corruption, even in the face of defective hardware. The only time something remotely comparable has happened was index corruption on a replica due to a bug in 9.2.0 which was very quickly fixed and as it was just a corrupted index, very easily restored (REINDEX).

If a database I trust with production data was corrupting tables, I would move off that system in a heartbeat (which is incidentally why I stopped using MySQL back then)

Postgres has bugs, or behaves in ways that are likely to trigger bugs in your code. Try a combination of long-running transactions and a high commit frequency, such that an appreciable portion of the xid space is active.

Or don't. You know that combination is tempting fate.

I love postgres. It's so great. The very best thing I can say about software is you have to tempt fate to get bitten by a bug.

> long-running transactions

don't do that if you value your uptime at all. Long-running transactions are the bane of any Postgres setup, no matter the XID overflow.

A few weeks ago, one slipped through my monitoring and a cache table we often sequence scan which normally had 10k rows in it had 28M dead rows that could never be removed due to that one transaction.

That sequence scan suddenly became a lot more expensive.

To be fair, MySQL/InnoDB can also hit performance problems in a similar set of conditions. A long-running transaction will prevent old row versions from being purged, which can hurt performance due to accumulation of them. The impact just isn't quite as drastic as in Postgres, since InnoDB effectively stores old row versions out-of-band (i.e. in a separate file).

Just out of curiosity, what do you find to be long running? Five minutes, five days? Depends on the volume of other transactions happening simultaneously?

As it's the case very often in our business, the answer is: "it depends".

It depends on the write load, the read load, and the write patterns.

Think about it this way: for the duration of any transaction, Postgres is basically in append-only mode.

If you have cache tables that normally get rewritten entirely, no row will be deleted during the duration of any transaction in the whole cluster (so this even affects databases other than the one the transaction is connected to).

In our case, we run some heavy import jobs daily which causes transaction lifetimes in hours, depending on the job. This isn't an issue, or rather, it's an issue, you can throw hardware at in order to circumvent it.

The transaction I was referring to in my earlier comment had a lifetime in days and nearly brought down the whole system.

Practically speaking: A long-running transaction is one that issues BEGIN, locks some rows, then blocks on other network input, and finally COMMIT. Once a day/week/month that input arrives very slowly and the COMMIT happens minutes or hours after the BEGIN.

There was a bug related to replication; it was caught and fixed rather quickly at the time, but Uber still got bitten by it.

See http://thebuild.com/presentations/uber-perconalive-2017.pdf

The missing context here is the scale. What size of data and transaction volume have you been working with?

By now the data is about 6.2 TB with overall 20k transactions per second, so I would say not small any more, but also certainly not Uber scale.

Still. I have had MySQL corrupt tables with way less load and data (in the less than 10GB size range)

6.2TB is single machine scale

The company here is doing petabytes of data and millions of ops per second. That’s scale.

We should be clear about what “at scale” means, and 6.2TB is not.

No matter the scale. Table corruption is never acceptable.

If table corruption is acceptable, you might as well just serve data from /dev/urandom and store user data to /dev/null.

Table corruption may not be 'acceptable,' but that doesn't matter. Even if your software is perfect and has no bugs, it cannot account for issues at the system and hardware level. Backups and redundant systems exist for a reason. I've have data corruption in a PG database before. I don't remember the cause--I think it was the filesystem rather than Postgres that was at fault--but it still happened and I had to deal with it.

Obviously the situation being discussed here is corruption specifically because of the database software. If a comet destroys the entire datacenter and the data is lost, nobody is going to blame Postgres for it.

> but also certainly not Uber scale.

How many Uber rides are happening every second? I suppose it depends exactly on their setup, but I could see that being at least the right order of magnitude.

EDIT: According to the Internet, there are about a billion Uber rides per year, which makes for about 31 rides per second. Even if there are 1000 transactions per ride, 21k/second sounds approximately Uber scale to me.

The table corruption bugs were triggered when the load was very high, as far as I know. It happened twice to me. In both cases, the corruption couldn't be corrected.

The worse part was the corruption was propagated into replica, which brought up much more serious issue with Postgres's replication model. So the only solution was to install the latest good backup.

If Schemaless worked on PG, would this article have seen the light of day?

There are solid alternatives. pg_json with Postgrest on top. I should not have capitalised Schemaless as I was more referring to the notion of a schema-less database.

From the article:

> MySQL’s replication architecture means that if bugs do cause table corruption, the problem is unlikely to cause a catastrophic failure. Replication happens at the logical layer, so an operation like rebalancing a B-tree can never cause an index to become corrupted. A typical MySQL replication issue is the case of a statement being skipped (or, less frequently, applied twice). This may cause data to be missing or invalid, but it won’t cause a database outage.

So, data might be invalid or missing, but it will not cause a database outage...

Also, not too long after that blog post was written, Postgres added support for logical replication.

Tbh I prefer my RDBMS to fail-stop rather than wreck my data like I just asked the Hulk what he thinks about not smashing ever again.

Missing or invalid data can however cause an application outage instead when the guarantees of the RDBMS are no longer true. You can't win so don't try. Go for consistency throughout or handle lack of it and choose a different class of tool for storage.

this is a huge accusation against Postgres and at Uber scale, I would assume some homework has been done here.

Does anyone know if this is relatively true - within the strict constraints of comparing mysql and postgres ?

Is mysql really prone to lesser corruption than postgres.

MySQL is prone to /more/ corruption, not less.

It's gotten better in recent versions, but implicit type conversion and truncation of data is a common "feature" of MySQL.

0000-00-00 isn't a valid date; but in MySQL it is a "null" data on a column with a NOT NULL constraint.

PostgreSQL is more anal about what you put into it, if you try to insert into a row with no value for a NOT NULL constrained column it will fail the transaction.

Once data is in the system mysql will tend to favour binary replicated statement level queries; where postgresql's built-in replication is block level.

This means that if you delete a row on a mysql replica the replication will continue and you will only notice that you're out of sync when you try to alter the (now missing) row from the master; then replication will fail.. It's impossible to even modify data on a postgresql replica.

My understanding is that mysql will also truncate data in a column if the column is altered to where a value would no longer fit (IE; from varchar(20) to varchar(8)) where postgresql will fail and abort the transaction.

(This is my experience from using PGSQL and MySQL with varying degrees of scale for 15 years).

PostgreSQL has/had problems to be sure, the auto vacuum was an issue on the 8.0->8.2 line; and they're slow to adopt features (like baked-in replication and upsert) but I treat this as a feature itself; better to have a working feature that is cleanly engineered than a feature which is half-baked.

> MySQL is prone to /more/ corruption, not less

MySQL isn't prone to "corruption" (meaning, random unpredictable data loss or lack of validity) unless you are using MyISAM, a legacy storage engine that has no valid use-case for persistent data in 2018.

This is my experience as someone who has also been using MySQL for 15 years, including at literally the largest scale on the planet. I am not saying MySQL is perfect (no database is) or "better" than Postgres (both are great databases for different use-cases), nor am I agreeing with Uber's original post or motivations. But I do wish people would stop spreading FUD about MySQL-isms that are long out of date.

Most of your post is describing data conversion issues, which follows predictable (if admittedly esoteric) logic. This is different than random corruption. In MySQL 5.7+ (released in 2015), this undesirable data conversion behavior simply does not occur with default settings. Those settings were also available and widely used in prior versions, just not enabled by default.

> 0000-00-00 isn't a valid date; but in MySQL it is a "null" data on a column with a NOT NULL constraint.

Not correct. 0000-00-00 will error upon insert by default since 2015. https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sql-mo...

> This means that if you delete a row on a mysql replica

MySQL replicas should be read-only unless you're doing something horribly unusual or non-standard. And with global transaction ID enabled (increasingly common on modern MySQL setups), errant transactions that were manually run on non-read-only replicas are more apparent and much easier to detect with automation. Still, I agree this could be improved to be less of a footgun for new users.

> My understanding is that mysql will also truncate data in a column if the column is altered to where a value would no longer fit (IE; from varchar(20) to varchar(8))

Your understanding is out of date. With strict sql mode (again, default since 2015) such an ALTER will fail loudly rather than truncate data.

>MySQL isn't prone to "corruption"

Corruption in the sense of allowing data that is invalid ( clobbering valid data even on transaction error) is corruption by my definition.

>0000-00-00 will error upon insert by default since 2015

How many people have noticed they have invalid data only because of this change? how many people have avoided upgrading to 5.7 because of this change (and, admittedly other issues regarding the new revamped query planner which makes software like Zabbix slow to a crawl)... But, yes, it's a change in the right direction.

>MySQL replicas should be read-only

Ah, the old "it's not a footgun if you know better" argument.

> Corruption in the sense of allowing data that is invalid ( clobbering valid data even on transaction error) is corruption by my definition.

OK. Do you have any examples of a modern version of MySQL or InnoDB doing this?

> how many people have avoided upgrading to 5.7 because of this change

As I said, 5.7 just changed the default. This setting has been available since mysql 5.0.2 in early 2004 -- it just wasn't the default. Or you can use 5.7 and override the new default to get the old unsafe behavior. This change is absolutely a non-issue in terms of avoiding upgrades.

> other issues regarding the new revamped query planner which makes software like Zabbix slow to a crawl

I have heard of very few issues from the query planner improvements. Anecdotal, sure, but I am pretty "in the loop" fwiw.

> Ah, the old "it's not a footgun if you know better" argument.

How is that my argument? I directly said "I agree this could be improved to be less of a footgun for new users"!

The database won't force read_only for replicas because there are replication topologies (e.g. master-master) where nodes are replicating but also still need to be directly writable. These topologies are a bad practice (IMO) but nonetheless are used by enough places that MySQL cannot simply force read_only for replicas.

In any case, these days administrators who are completely unfamiliar with MySQL replication concepts should probably be using a DBaaS rather than setting up replication manually. The DBaaS platforms enforce read_only properly, making this all a non-issue.

A couple of years ago we did a load testing comparison between MySQL and Postgres for our Java-based project. MySQL throughput was 5x better. I'm not sure if it was JDBC driver issue or just database configuration wasn't perfect, but MySQL worked much better for us.

The default PostgreSQL configuration is extremely light-weight.

Allowing it to use (much) more memory is a good start. https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv...

We've got some non production facing stuff being hammered with the default postgres config. It's actually surprisingly good out of the box. We run on the basis that we will tune it if we need to with that and we haven't had to yet.

I don't think there is any issue in PgSQL JDBC drivers. In fact in the latest round of TechEmpower benchmarks[0], if you filter for Java, you will see all top positions are held by PgSQL based stacks for Single Query, Multiple Query and Updates tests.

0: https://www.techempower.com/benchmarks/#section=data-r16&hw=...

Although this is an old article (maybe even posted here earlier), curious to know how's does pg10 or upcoming pg11 serve / address those issues mentioned....


Previous discussion from 2016 ...

Maybe it's just me but I find this article has the following TL;DR: we are going against recommendations and common practices and postgres doesn't work well.

Since I started programming, I have always used MySQL because honestly I think it is the best DB I have used because it has great software support and usability.

I have tried Postgres and while I have no issues with it, I usually go with MySQL if I have a choice because I see no real reason to switch out something I know and like.

Is there any real reason for a person like me to use postgres or another alternative instead of MySQL?

Many reasons, full text indexing is a good one for example. Although mysql claims to have support for it, it is not as far as postgres is (with tsvector).

It has many more extensions. It depends on what frameworks you use, but django is mostly optimized for postgres.

The MySQL ecosystem is a bit weird due to having multiple "mysql"-ish versions. I did a small project where mariadb and mysql where not interchangeable. Some stuff was causing problems, cannot remember exactly anymore.

Edit: EXPLAIN ANALYZE in postgres I love it.

From 2016 and definitely a repost.

This is a 2016 article, can someone add the timestamp?

This is old news and a repost. pass...

While it is old to you, HN prefers to add a year to the title so that others might happen upon it if they missed it in the past. Not everyone is able to watch HN 24/7, and new discourse might have come up since the last time this was posted.

Postgres however has moved quite a way since 2016 when the post was written, so it is possibly even factually incorrect at 2018.

Postgres switched to much more rapid release scheme, and is making a lot of ergonomic and performance improvements since they got the "correctness" part down (such as auto vacuum, etc), which only came together in last couple years.

I feel like the 2016 tag is extremely useful in letting users figure this out, though. And the new discourse that flows from this (read some of the other comments in here, they're explaining that Postgres took a hard look at themselves and adapted).

Except you stopped to comment :)

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact