
A PostgreSQL response to Uber [pdf] - kissgyorgy
http://thebuild.com/presentations/uber-perconalive-2017.pdf
======
lucasmullens
Title should be changed to "A PostgreSQL consultancy's response to Uber
[pdf]". Many people in this thread, including myself, assumed this was an
official response by PostgreSQL.

~~~
bmpafa
thanks. comments like yours are why I always check the HN thread before
deciding if something's worth reading.

------
mamurphy
The use of graphics in this PDF/slideshow was incredibly effective, more so
than most articles I read.

The PDF lays out Uber's statements, and either lays them over a real-world
analogy (a road on a sinkhole for database corruption) or lays them over a
picture that primes their response (like a picture of apples and oranges when
they plan to respond that Uber is comparing different features of mysql to
postgresql).

The use of the elephant picture to give "elefacts" (sort of a parody on
politifacts, where they evaluate the truth of uber's statements) is also
great.

The images add humor and reinforce the content - great use of graphics!

~~~
yawaramin
Not to mention the elephant painted to look like a taxi :-)

------
charlesetc
This is missing one huge point of the Uber engineering post
([https://eng.uber.com/mysql-migration/](https://eng.uber.com/mysql-migration/)).

They did not switch from a postgres instance to put all their data in a mysql
instance. They switched from a single postgres instance to shard their data
across many mysql instances. This is an entire reworking of the architecture that
is completely ignored in this powerpoint.
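
To make the difference concrete, here is a minimal sketch of what routing across many instances involves. It is not Uber's actual scheme: the shard names and the `shard_for` helper are hypothetical, and a stable hash of the key is just one common way to pick a shard.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to connection settings.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard deterministically from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and reduce it modulo the shard count.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Every read and write has to go through something like `shard_for("rider:42")` first, and queries that span keys on different shards can no longer be a single SQL statement. That is an application-level change, whichever database sits behind it.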

~~~
SEJeff
Honestly I think Uber switching was more of a business decision as their
employees had trouble figuring out how to properly use Postgres. Instagram
definitely has different data needs, but is probably much bigger than Uber wrt
data. They use Postgres successfully via Django no less!

[https://engineering.instagram.com/sharding-ids-at-instagram-...](https://engineering.instagram.com/sharding-ids-at-instagram-1cf5a71e5a5c)

~~~
pkkim
As someone who participated in this transition, I don't think that was a part
of it. The new database layer took quite a bit of instruction to use properly
as well, and there were plenty of people misusing it.

------
foldr
This comes across as a bit defensive and snarky. I'm not sure what it's hoping
to achieve. The not so subtle message, reading between the lines, is that Uber
engineers had bad experiences with postgres because they are morons. Well,
maybe. But my main take away from that is that postgres is hard to use
correctly, which if true is actually a good reason not to use it.

If only programmers could be polite to each other occasionally.

~~~
grzm
Having just read through the slide deck, I'm having a hard time coming to the
same conclusion you do. Overall, I thought it was quite balanced, pointing out
strengths and weaknesses of both MySQL and PostgreSQL. The author even started
out with a slide with Inigo Montoya, "You insulted my elephant. Prepare to
die", so he was clearly cognizant of the fact that this could be taken as
purely a blind defense of PostgreSQL, and that is not the intent.

Would you mind taking the time to point out what phrases or slides gave you
the impression that this was written from the position that "Uber engineers
had bad experiences with postgres because they are morons"? I know that
different people can get different impressions of the same material, so it
would be helpful for me to understand better what gave you that impression.

There are a couple of comments like:

> _I assume the company the size of Uber can figure it out. C’mon_

> _But... c’mon. Uber?_

Both of these I think reflect the idea that it's likely Uber would have been
able to continue to use Postgres if they were interested in fixing the issues
they had with the system, rather than had additional, other motivations for
doing so, the belief being that the Postgres-specific issues they list are
likely soluble if they had wanted to put in the effort. They're a large enough
organization that they should have had the resources to do so.

That doesn't mean that their decision to move off of Postgres for those other
reasons wasn't the right thing to do: just that there's not enough information
there for us to really understand the decision process. From a Postgres
community standpoint, it's important to make sure that they have quality
answers to the issues publicly raised by very visible companies such as Uber.
Many people will read about Uber's experience with Postgres, and it makes
sense for the Postgres community to be clear what can be done about them.

Your point about "postgres is hard to use correctly" I think is one of those
things that it's hard to use a lot of the systems out there—not just
Postgres—at the scale that Uber or some of the other large installations do.
That's when you really become aware of where the stresses put on the systems
start to show and what you need to be aware of to tune and set them up
correctly for your use case.

Overall, I think 'gdulli's response
([https://news.ycombinator.com/item?id=14223170](https://news.ycombinator.com/item?id=14223170))
is largely on point.

Like I said above, if you'd point out which parts struck you as particularly
unfair, I know I'd benefit from it to hear more from your perspective.

~~~
foldr
I found the use of shrug emojis in response to substantive points from the
Uber paper pretty infuriating.

I didn't say that postgres is difficult to use. I said that the slides give
that impression.

~~~
grzm
Interesting. I'm not a fan of emoji (or memes for that matter) in general, but
I realize they're part of the tech culture now and try not to let them bother
me too much.

The first shrug emoji was after the 9.2 data corruption bug. I took that one
to mean "Yup. What are we going to do? There was a bug, and we fixed it as
quickly as we correctly could. Incredibly regrettable, but that kind of stuff
is going to happen." There are bugs in software. Knowing the Postgres
developer community, they take correctness very seriously.

The second one is after the Uber quote which describes their tolerance for
developers holding open transactions and blocking I/O operations, which they
do from a position of inexperience because they're not database experts. I
understood this one to mean "If you're going to handle transactions in
this manner, that's not something Postgres itself is going to be able to help
you with."

The third was in response to the Uber quote regarding Uber application bugs
which resulted in open idle connections. I took this shrug to mean that if
this is an issue, it likely can (and should) be fixed in the Uber
applications. It's not really a Postgres issue.

The fourth (and last) was in response to the lack of quantitative information
regarding their Postgres issues (which makes honest, in-depth third-party
investigation difficult), Uber's decision to go schema-less, and that MySQL is
more tolerant of the bugs in Uber software. I took this one again to mean that
there's little that Postgres itself is responsible for, or can do anything
about here.

A question I always like to ask myself when someone has issues with something
a third party is responsible for is what is a realistic and reasonable
response from the third party. That often makes me realize that there's little
they can be expected to do or are really responsible for, at least for some of
the issues. For the most part, I think these shrugs reflect that.

If the shrug emojis were removed, would it be okay? Anything else?

~~~
foldr
The issue is that it's rude to quote someone and then respond with an emoji.
If what they're saying is relevant then it merits a proper response. If it
isn't relevant, or you don't have anything to say in response, then you
shouldn't quote it in the first place.

You're welcome to come up with your own theories of what the emojis mean, but
I don't see any point in doing so.

~~~
grzm
This is a slide deck from a presentation at Percona Live 2017.

[https://www.percona.com/live/17/sessions/postgresql-response...](https://www.percona.com/live/17/sessions/postgresql-response-uber)

We're missing everything that was said by the presenter. I would strongly
suspect that the presenter said something about the slide.

~~~
foldr
Yeah that's a fair point. I’d suggest that’s a reason to think carefully
before posting a slide deck, though. If essential points are left off the
slides entirely, then it may not be such a good idea to post the slides by
themselves.

~~~
grzm
I think many people haven't had the issues you personally do with the
presentation. That said, other comments in this thread make clear that you are
not the only one to have read them as snarky. It's very useful to be
charitable and give people the benefit of the doubt when reading their
communication. I think there's a lot of useful information in the slide deck
presented in a succinct way, and publishing them as-is is a useful and
efficient way to do so.

I think with very little effort it's easy to interpret the shrugs as I did
above, and there's little if any additional information that needs to be
addressed along with the final shrug. I wouldn't have presented it this way, but
I don't see any malevolence or negative intent on the part of the author.

------
euyyn
The response to "MySQL handles our devs’ bugs better" is ¯\\_(ツ)_/¯ over and
over, but, in my opinion, it's perfectly valid criticism. When writing a tool
for real-world businesses to use, the path of least resistance needs to lead
to bug-free code, and the tool must handle common buggy usage gracefully.

~~~
gdulli
The path of least resistance should lead to quick and obvious failure rather
than a false sense of security that the system is working. Careless
development creates a price that has to be paid one way or another. You can't
get something for nothing.

~~~
tormeh
If you want your product to succeed it has to be easy to get to work. That
means having insecure defaults etc. When a product is in production, people
will go through hell to make it work a little better; when they're trying out
a product they'll drop it at the slightest inconvenience.

~~~
pg314
Some people work that way. Other people do a more in-depth evaluation. The
first group tends to end up with MySQL, the second with PostgreSQL. I'm
painting with a broad brush here, there are situations where it might make
more sense to use MySQL.

It would be hard to argue that PostgreSQL is not successful. MySQL has a
larger market share, but the PostgreSQL project is alive and thriving.

------
capote
I think this is humorous, and, in a way, brutal. Keep in mind it's not an
official document from Postgres but rather just a guy who is big on Postgres
writing a funny, and yes, snarky response to Uber. He makes some decent
points, if not about the technologies themselves, about people switching to
something more or less equivalent for reasons that could easily be interpreted
as more gut reactions than solid business and tech necessities.

To me, Uber acted as though someone bought a Honda and it had some mechanical
issues (and no seat heat), so he went apeshit, drove it off a cliff, then
bought a Toyota thinking he will never have that problem again.

------
FunkSQL
My personal experience dealing with Postgres people vs. MySQL people has
always been oddly lopsided, with Postgres users seeming to be crazy defensive
about their product and getting very offended about any perceived slight when
compared with MySQL, and MySQL users generally shrugging their shoulders and
saying "post what?"

One camp seems to be made up of perfectionists who spend a lot of time
worrying about how things "should" be, and the other seems to consist of
pragmatists who just want it to work.

I will leave it to the reader to decide which is which and which has more
appeal to business decision makers.

------
StreamBright
In case somebody missed it, there is another article from Uber, from a while
back, justifying a MySQL -> Postgres move.

[https://www.yumpu.com/en/document/view/53683323/migrating-ub...](https://www.yumpu.com/en/document/view/53683323/migrating-uber-from-mysql-to-postgresql)

------
grillorafael
Really nice to see another side of Uber's issues. Mainly because since Uber is
a huge company most people will read their article as the absolute truth.

------
Dowwie
Is there a video of this presentation? Was this from pgconf 2017?

~~~
josnyder
I saw it presented at Percona Live three days ago. As far as I am aware, there
was no video.

------
candiodari
I think this slideshow is doing the project a disservice. Mysql is definitely
a "lesser" database, that seems to be widely known and accepted. It seems to
be also very frustrating for the PostGres team that that is not the sole
factor, but it isn't. For a shared database it's not even that high up there.
This slideshow is bemoaning that Uber seems to find the design decisions of
PostGres a problem. Really, they're fine decisions "just get a department to
deal with them", is the message. And yes, that is very much the PostGres
attitude.

The great thing about MySQL is that it generally just keeps working with
incredibly small amounts of maintenance whereas PostGRESQL just constantly
needs attention. This has always been my personal complaint. From vacuuming
(yes I used PostGRESQL before autovacuum, and you can still fuck up
autovacuum) to upgrades, everything is just fiddly fiddly fiddly.

The end result is that mysql, you start it, you run it, you do your normal OS
upgrades and everything just kinda hums along. For years and years. PostGRESQL
is like all enterprise solutions: you start it and run it, and after a month or
so it suddenly refuses to accept connections, or suddenly it starts using too
much disk (e.g. misconfiguring autovacuum), or ... It has a bazillion things
you need to configure and make cooperate, and there are large procedures for
everything you need to do. Every week some warning light goes all flashy and
won't stop flashing until it made you press a few buttons where it was
perfectly predictable which buttons needed to be pressed. It forces you to
consider 2000 configuration options instead of picking sensible defaults.

But yes, you get something back for that. A bigger, better, more correct and
far more featureful database. In many ways it starts having the issues of
other large databases (e.g. the 3-page-and-totally-inscrutable SQL stored
procedure functions).

This is very much a case of "pick your poison". But frankly, if you want your
app to just run, like we all do, MySQL will serve you better. If your OCD
can't deal with small imperfections, datatypes that fit only 99%, having
values that your text mode SELECT in the database can't print ... if those
bother you, stay away from MySQL. And of course the classic, if you have a
"real database workload" (very heavy load with constant reads AND constant
writes), yes you probably need PostGRESQL.

You could say Mysql is halfway between LevelDB and PostGres.

By the way, if you need a mobile database with zero maintenance, SQLite will
serve you even better. It can't be shared with other applications and is not
meant for database-behind-network approaches, but you'd be surprised how well
it can work.

~~~
anarazel
> I think this slideshow is doing the project a disservice.

I think you know, but I just want to emphasize: This is not the project's
response, it's an individual's response.

------
jchmbrln
This is fun, but here's the actual, less snarky, response from Postgres
developers, as previously discussed:
[https://news.ycombinator.com/item?id=12201353](https://news.ycombinator.com/item?id=12201353)

------
mianos
The index issue and the memory used per connection I can understand but when
they didn't even try to use one of the many logical replication systems that
have been used at a scale bigger than Uber, hello Marco at Skype, the argument
against Postgres gets a bit confusing. Uber seemed to really understand some
things with amazing depth and not understand others that are documented by
others outside of Uber. I think some politics played a big part.

------
grezql
I know PostgreSQL meant to defend themselves, but it just made the matter
worse.

1) I didn't even know Uber switched database, now I know. I also know the
reasons.

2) Comes across as unprofessional, you don't see Microsoft defending MSSQL
this way. They let the users see it for themselves.

~~~
dragonwriter
> I know PostgreSQL meant to defend themselves

[...]

> you don't see Microsoft defending MSSQL this way.

PostgreSQL Experts Inc. is a consultancy that specializes in PostgreSQL; they
don't appear to be particularly linked to Postgres development
organizationally (and none of their staff profiles highlight involvement in
Postgres development). If this was the PostgreSQL Global Development Group,
you'd be a bit more on point.

~~~
protomyth
If it's not PostgreSQL but instead a consultancy then the title should be
changed to reflect that. The misunderstanding is likely to affect the comments
(as above).

~~~
JdeBP
The title does already reflect that. The name of the consultancy is on the
title page of the document.

------
protomyth
I'm a little confused by the pdf. Is it implying that the program is opening
and closing a connection for each query? Is that normal these days?

~~~
syncsynchalt
Some applications will indeed do that to guard against the server-side memory
cost of thousands of postgres pool connections. It makes sense when queries
are relatively rare.

This was the solution that we used at MX Logic in the early years, before we
moved to pgbouncer and went back to long-lived stateless connection pools.
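
A rough sketch of the two styles, for anyone unfamiliar with the trade-off. This is illustrative only: `sqlite3` stands in for a Postgres driver so the example is self-contained, `ConnectionPool` and `query_per_connection` are made-up names, and pgbouncer does its pooling in a separate process rather than in application code like this.

```python
import queue
import sqlite3
from contextlib import contextmanager


def query_per_connection(factory, sql):
    """Open a connection, run one query, close it again."""
    conn = factory()
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


class ConnectionPool:
    """Keep a fixed number of long-lived connections and hand them out."""

    def __init__(self, size, factory):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks if every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing


def connect():
    # Stand-in for connecting to a real database server.
    return sqlite3.connect(":memory:")
```

With `pool = ConnectionPool(2, connect)`, any number of `with pool.connection() as conn:` blocks reuse the same two connections, while `query_per_connection(connect, "SELECT 1")` pays the connection setup cost on every call but holds no server-side memory between queries.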

------
perfmode
Facebook (TAO), Dropbox (Edgestore) also moved to a schema-less database
design. I wonder how these large orgs manage versioning of their models...

~~~
mahyarm
You put a version key & number in the model blob metadata itself.
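
A minimal sketch of that idea, assuming JSON blobs. The `_meta`/`body` layout, the field names, and the v1-to-v2 change are all invented for illustration and are not how TAO or Edgestore actually store data.

```python
import json

CURRENT_VERSION = 2


def upgrade_v1_to_v2(body):
    # Hypothetical change: v1 stored one "name" field, v2 splits it in two.
    first, _, last = body.pop("name").partition(" ")
    body["first_name"] = first
    body["last_name"] = last
    return body


# One upgrade step per old version, applied in sequence on read.
UPGRADES = {1: upgrade_v1_to_v2}


def load(blob: str) -> dict:
    """Parse a stored blob and upgrade it to the current model version."""
    doc = json.loads(blob)
    version = doc["_meta"]["version"]
    body = doc["body"]
    while version < CURRENT_VERSION:
        body = UPGRADES[version](body)
        version += 1
    return body


def dump(body: dict) -> str:
    """Serialize a model, stamping the current version into its metadata."""
    return json.dumps({"_meta": {"version": CURRENT_VERSION}, "body": body})
```

Old rows never need a bulk migration in this scheme: they are upgraded lazily when read, and rewritten at the current version the next time they are saved.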

------
niroze
Love the shrug emoji

~~~
chrisper
I think it is an emoticon, but I am not sure! Or are emoticons always
sideways? Hmm...!

~~~
floatboth
I always thought "emoticon" is the most generic word that includes everything
from ":)" to the shrug to unicode emoji.

------
skc
The reason (and timing) for the somewhat snarky response is probably because
Uber right now have a bit of a black mark against them PR wise.

------
discodave
Meanwhile Amazon and AWS have essentially banned putting a relational database
behind any publicly facing website or service.

~~~
darksaints
While I don't doubt that they'll tell you that that was a scaling decision, it
actually has more to do with an architectural decision that proved to be
difficult to overcome once its flaw became apparent.

They operated warehouses using a monolithic oracle database, one for each FC.
They had hundreds of different services using the same database. Whenever one
service wanted to do something new, they had to spend a massive amount of time
running their proposed database change past every team on the database. I've
seen a single column addition take 9 months and hundreds of engineer hours to
get approved.

So once the warehouses got really big, sharding was the obvious answer but
they couldn't make sharding work because they couldn't coordinate their way
out of the mess they created. They couldn't scale because they engineered
themselves into a corner that made it impossible to use normal best practices
for scaling SQL databases.

NoSQL has an interesting lack of a feature that solves their problem. Because
they're not relational, they don't really work very well sharing data across
services and teams, so they don't get into major coordination tangles on
shared databases. Maybe that works for them, but it's more of an indictment of
their engineering culture than it is a slight on SQL databases. And it's
pretty punitive in a TSA kind of way: We fucked up once so none of you can
have nice things anymore.

------
hobolord
Can anybody explain about the buffer pools part?

~~~
jldugger
Caching can happen in a number of places, with a variety of degrees of
success. In PostgreSQL, a lot of caching is deferred to the OS filesystem
layer. In MySQL, apparently, they have an additional cache in the MySQL address
space.

There are two sides to this. Generally speaking, a general-purpose OS will use
cache management algorithms that suck compared to what an application could
do, because the application has more structured knowledge. In the case of a
DB, it knows about indexes, and row sizes, and is less likely to evict half an
index or row.

On the other hand, the OS is sort of the last authority. Varnish, in
particular, argues that programmers should rely on the OS caching algorithms,
because you have them whether or not you want them. A poor interaction between
userspace and kernelspace caches can end up increasing I/O activity if kernel
pages something to disk before the userspace does (varnish had a doc somewhere
explaining this better, which I can no longer find). The penalty here though
is context switching. A userspace cache is available in memory, whereas a
filesystem / buffer cache will incur a context switch to retrieve the data
from kernelspace to userspace.

Finally, both have a number of caches, so this is more about how much and what
type of userspace caching.
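
For a feel of what a userspace cache does, here is a toy sketch. It is a plain LRU over page numbers, and real buffer pools in either database use considerably more elaborate policies. `PageCache` and the `read_page` callback are hypothetical names, with the callback standing in for a read that goes through the kernel.

```python
from collections import OrderedDict


class PageCache:
    """Toy userspace page cache with least-recently-used eviction."""

    def __init__(self, capacity, read_page):
        self._capacity = capacity
        self._read_page = read_page  # falls through to the OS / disk on a miss
        self._pages = OrderedDict()  # page_no -> data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, page_no):
        if page_no in self._pages:
            # Hit: served from process memory, no call into the kernel.
            self._pages.move_to_end(page_no)
            self.hits += 1
            return self._pages[page_no]
        # Miss: ask the layer below, which may itself be cached by the OS.
        self.misses += 1
        data = self._read_page(page_no)
        self._pages[page_no] = data
        if len(self._pages) > self._capacity:
            self._pages.popitem(last=False)  # evict the least recently used page
        return data
```

The double-caching problem mentioned above shows up here too: any page held in `_pages` is likely also sitting in the OS page cache, so the same bytes can occupy memory twice.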

