On Uber’s Choice of Databases (use-the-index-luke.com)
169 points by narfz on July 29, 2016 | 13 comments

All of these response posts seem to miss one big piece that made no sense to me. Uber goes really in-depth around the higher cost of updates in postgres, but then Uber describes Schemaless as an immutable append only data store. They created a custom database with no updates, but one of their primary reasons for changing from postgres to mysql was because of the high cost of updates?

From what I can tell Schemaless has updates but represents them as inserts. When you want to change something (e.g. change the billing status of a trip), it writes a new billing status for the trip and consumers will always ask for the latest billing status for the trip, which is the most recent and up-to-date one.

Take this with a grain of salt, I just read the Schemaless articles yesterday.

That is how I understood it too. Updating your data with a call to insert won't run into the same problems as updating your data with a call to update. Their immutability implementation seems like it would behave nearly the same under both databases.
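
To make the insert-as-update idea concrete, here is a minimal sketch of the pattern as described in this thread. It is not Uber's actual Schemaless schema or API: the `trip_cells` table, its columns, and the helper names are all made up for illustration, and it uses SQLite from the Python standard library rather than MySQL or Postgres so it runs anywhere.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for the sketch).
conn = sqlite3.connect(":memory:")

# Hypothetical append-only table: every change to a trip attribute is a new row.
conn.execute(
    """
    CREATE TABLE trip_cells (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,  -- insertion order
        trip_id     TEXT NOT NULL,
        column_name TEXT NOT NULL,
        body        TEXT NOT NULL
    )
    """
)


def write_cell(trip_id, column_name, body):
    """Record a new value. Existing rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO trip_cells (trip_id, column_name, body) VALUES (?, ?, ?)",
        (trip_id, column_name, body),
    )


def latest_cell(trip_id, column_name):
    """Return the most recently written value, or None if nothing was written."""
    row = conn.execute(
        "SELECT body FROM trip_cells "
        "WHERE trip_id = ? AND column_name = ? "
        "ORDER BY id DESC LIMIT 1",
        (trip_id, column_name),
    ).fetchone()
    return row[0] if row else None


def version_count(trip_id, column_name):
    """Count how many versions of a value have been written."""
    row = conn.execute(
        "SELECT COUNT(*) FROM trip_cells WHERE trip_id = ? AND column_name = ?",
        (trip_id, column_name),
    ).fetchone()
    return row[0]
```

"Changing" the billing status of a trip is then two calls to `write_cell` with different bodies. `latest_cell` returns the newer one while both rows stay in the table, so no SQL `UPDATE` is ever issued.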

Isn't reddit doing schema-less on postgres?

Most of the conversation on the topic misses that Uber's actions are totally irrelevant to 99.9% of us. Uber has unlimited resources and unique problems. Sure, it's interesting but ultimately almost entirely meaningless.

Schemaless is primarily a sharding scheme for MySQL. It's possible that using Postgres in an append-only manner would have created too many records to handle without an analogous sharding scheme.

From the little I can see from their blog posts, Schemaless seems to be both a sharding layer and an immutability layer that can sit on top of postgres or mysql. However, discussing the costs of mutable data when you have created an immutable data storage layer doesn't make much sense.

This is a very good and fair response and anyone serious about tuning their DB should have use-the-index-luke.com bookmarked!

'SELECT * from depesz'[1] is another great resource for weird SQL capabilities and performance-related stuff, focused heavily on PostgreSQL.

[1] https://www.depesz.com/


This really all boils down to "Ford F-150s are bad trucks because your cargo can get wet when it rains, use a U-Haul truck".

Sure, for that VERY specific (and not very common) use case U-Hauls beat F-150s. However, I think we can all agree that F-150s are far more versatile for most use cases, up to and including something trivial like heading to Starbucks for a quick coffee.

The money quote for me was this one:

> for a rapidly growing company, technology is easier to change than people.

This is a response to Uber's article, discussed at https://news.ycombinator.com/item?id=12166585.
