
Thoughts on Uber’s List of Postgres Limitations - areski
http://blog.2ndquadrant.com/thoughts-on-ubers-list-of-postgres-limitations/
======
Illniyar
I think it's worth mentioning again that what Uber ended up using bears no
resemblance to an RDBMS (a single table, manual indexes).

So regardless of whether their complaints are justified, it should not be
taken as an endorsement of MySQL over Postgres, but rather as an endorsement
of NoSQL over RDBMS.

Which is really just what every company at this scale does (except for google
and f5, if whitepapers are to be believed).

~~~
mranney
Uber runs a lot of databases in production. We stopped using Postgres a while
ago for new applications that talk SQL, and this article explains some of the
reasoning.

We still have many applications that talk directly to MySQL, and we still have
our original API monolith that talks directly to Postgres.

All new applications are being built using distributed databases like our in-
house Schemaless system which happens to be backed by MySQL, and we also have
Riak and Cassandra in production.

~~~
im_down_w_otp
I would love to see a similarly detailed write-up on your transition from
Riak to Cassandra, especially considering you would have been really, really
deep into the details and guts of operating Riak at the red line from your
time at Voxer.

------
harshal
Potentially the most useful part of this post to me was this part

> 2ndQuadrant is working on highly efficient upgrades from earlier major
> releases, starting with 9.1 → 9.5/9.6.

I hadn't heard of that before. Does anybody know more about this? I'm
currently babysitting a 9.1 deployment which we desperately want to get
upgraded. The amount of downtime it can tolerate is very limited, and I was
recently tasked with coming up with a plan. It's going to get hairy. If such a
tool is really on its way, I could make a case for holding off on the upgrade
for a few more months and save quite a bit of work.

~~~
pilif
You can use pg_upgrade with -k - it will complete within seconds. Afterwards,
things will be slow until a complete ANALYZE updates the statistics, but the
upgrade itself can be done in seconds.

I have upgraded ~2TB of database from 9.0 all the way to 9.5 over the years.
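A minimal sketch of that hard-link upgrade path, assuming both the old and
new server binaries are installed side by side; the paths, versions, and
directory layout below are placeholders to adjust for your install:

```shell
# Hard-link upgrade sketch: pg_upgrade -k links the old data files into the
# new cluster instead of copying them, so it finishes in seconds even on
# multi-TB databases. All paths/versions here are placeholders.

OLD_BIN=/usr/lib/postgresql/9.1/bin
NEW_BIN=/usr/lib/postgresql/9.5/bin
OLD_DATA=/var/lib/postgresql/9.1/main
NEW_DATA=/var/lib/postgresql/9.5/main

# Stop the old cluster, then run the upgrade in link mode (-k).
"$OLD_BIN/pg_ctl" -D "$OLD_DATA" stop
"$NEW_BIN/pg_upgrade" -k \
    -b "$OLD_BIN" -B "$NEW_BIN" \
    -d "$OLD_DATA" -D "$NEW_DATA"

# Start the new cluster and rebuild optimizer statistics; queries will
# plan badly until the analyze completes.
"$NEW_BIN/pg_ctl" -D "$NEW_DATA" start
"$NEW_BIN/vacuumdb" --all --analyze-in-stages
```

Note that because -k hard-links the data files, the old cluster must never be
started again once the new one has run - which is exactly the "no backup
plan" risk raised in the replies.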

~~~
bigato
The problem with this is that if anything fails, you can potentially corrupt
your data and have no backup plan. To make that option safe, you would have to
copy your data directory first, and you need to be offline for that. So you
have to add the time it takes to make that copy.

~~~
pilif
This is why I make sure that the slaves are up to date, then disconnect them,
pg_upgrade the master, and resync the slaves (which is required anyway). If
something goes wrong, I fail over to a slave.

Also: you don't need to be offline to copy the data directory. Check
`pg_start_backup` or `pg_basebackup` (which calls the former).

~~~
bigato
That requires the master and slaves to run different versions for a while,
and that is not possible with stock PostgreSQL, is it?

Regarding your second point, I meant copying the data directory as in a 'cp'
command, or rsync if you will. The functions you mentioned are only useful
when doing a dump, aren't they? And recovering from an upgrade problem using
a dump is way slower than just starting the previous version on the backup
data directory.

~~~
pilif
_> That requires the master and slaves to run different versions for a while.
And that is not possible with stock postgresql, is it?_

Yes. That's not possible. But if I announce the downtime, bring master and
slave down, migrate the slave and run our test-suite, migrate the master, run
the test suite again and bring the site back up, then I know whether the
migration worked.

If the migration on the slave fails, well, then I can figure out where the
problem lies and just bring master back.

If the migration on master fails, but works on slave, then I can bring slave
up as the new master.

No matter what, there's always one working copy and the downtime is limited to
two `pg_upgrade -k` runs (which is measured in minutes).
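The choreography described above, as a sketch. The hostnames, paths, and the
`run_test_suite` command are placeholders (the test-suite runner is
hypothetical), and it assumes streaming-replication slaves that get re-synced
from the upgraded master afterwards:

```shell
# Upgrade choreography sketch: at every step there is one known-good copy.
# Hostnames, paths, versions and run_test_suite are placeholders.

# 1. Announce downtime; stop writes; let the slave catch up; stop both.
ssh master 'pg_ctl -D /pgdata stop'
ssh slave  'pg_ctl -D /pgdata stop'

# 2. Upgrade the slave first and run the application test suite against it.
ssh slave 'pg_upgrade -k -b /opt/pg9.1/bin -B /opt/pg9.5/bin \
               -d /pgdata -D /pgdata-9.5'
run_test_suite --target slave    # hypothetical test-suite runner

# 3. If the slave upgrade failed: investigate and bring the untouched
#    master back up - no harm done. If it worked: upgrade the master.
ssh master 'pg_upgrade -k -b /opt/pg9.1/bin -B /opt/pg9.5/bin \
                -d /pgdata -D /pgdata-9.5'
run_test_suite --target master

# 4. If the master upgrade failed but the slave's worked, promote the
#    slave to master. Otherwise bring the upgraded master up and re-sync
#    the slave from it (required after pg_upgrade anyway).
```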

_> Regarding your second point, I meant copying the data directory as in a
'cp' command. Or rsync if you will._

Yes. You execute `select pg_start_backup()` to tell the server that you're now
going to run cp or rsync and to thus keep the data files in a consistent
state. Once you have finished cp/rsync, you execute `select pg_stop_backup()`
to put the server back in the original mode.

This works while the server is running.

If you don't want the hassle of executing these commands, you can also invoke
the command-line tool `pg_basebackup` which does all of this for you.
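A sketch of both variants, assuming a superuser connection and a 9.x-era
server (where `pg_start_backup` takes a label argument); the paths and label
are placeholders:

```shell
# Online copy of a running cluster's data directory, two ways.
# Paths, user and label are placeholders.

# Variant 1: manual. Bracket the rsync with the backup-mode calls so the
# server keeps the on-disk files in a restorable state during the copy.
# The WAL generated while the copy runs must also be retained (archiving)
# for the copy to be recoverable.
psql -U postgres -c "SELECT pg_start_backup('nightly-copy')"
rsync -a /var/lib/postgresql/9.1/main/ /backup/pg-main/
psql -U postgres -c "SELECT pg_stop_backup()"

# Variant 2: let pg_basebackup do all of the above in one step, over a
# normal replication connection.
pg_basebackup -U postgres -D /backup/pg-main -P
```

Either way the server stays up and serving queries throughout.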

~~~
jlgaddis
Any chance you have a blog or website where you could write up and post an
example of this entire process? It sounds like the details you've posted
above would be extremely helpful to many others.

------
3manuek
I do think that you need to learn how to make the best use of the
technologies you have chosen, or that are present in your current setup. No
matter whether it is MySQL, Postgres or any other DB, part of the job is
learning how to use it at its best. The points in the article are good;
however, it's true that Postgres had problems with scalability not so long
ago. That's changing, but I think other data stores addressed the problems of
availability and scalability earlier and have gained maturity over the last
few years.

Also, there is something in the article that gave me pause:

    
    This point is correct; PostgreSQL indexes currently use a
    direct pointer between the index entry and the heap tuple 
    version. InnoDB secondary indexes are “indirect indexes” 
    in that they do not refer to the heap tuple version 
    directly, they contain the value of the Primary Key (PK) 
    of the tuple.
    

That's true, but the article doesn't make explicit that the PK on InnoDB is a
clustered index, and that there are other optimizations, like adaptive
hashing, that make read queries faster.

~~~
evanelias
Agreed 100%, and the author also failed to mention several other advantages of
having a clustered PK and indirect secondary indexes. A few off the top of my
head: reads in PK order are faster due to lack of indirection; clustered index
takes up less space due to lack of storing pointers to tuples; secondary
indexes will be covering (no need for PK lookup at all) if the query only uses
columns in the PK and secondary index.

It is interesting/ironic to see the article complain "those limitations were
actually true in the distant past of 5-10 years ago, so that leaves us with
the impression of comparing MySQL as it is now with PostgreSQL as it was a
decade ago." In the MySQL world, we very very frequently see the opposite --
Postgres fans bashing MySQL for things that haven't been true in 10-15 years,
as well as things that simply have never been true. It certainly is
frustrating, just like what the author is experiencing!

Having a favorite/preferred database is fine, but I don't understand all the
extreme views - why do so few of these articles take the view that Postgres
is a better fit for some workloads, and MySQL/InnoDB is a better fit for some
other workloads?

Or even just an acknowledgement that the authors of these articles rarely, if
ever, have a comparable amount of expertise in both databases -- which would
be necessary to make a fair comparison. Yes, Uber's original article clearly
shares this same problem, but at least they seem to acknowledge it more
clearly than the author of this response article. Take the section on
replication comparison, for example: the author is describing logical
replication support in Postgres even though it's currently a third-party
addon. Cool, but MySQL has all sorts of third-party replication systems too.
Alibaba has implemented physical replication in MySQL. And meanwhile even in
MySQL core, there are two different types of logical replication -- there's no
restriction to only use statement-based logical replication as this article
implies.

------
andy_ppp
Might be worth someone writing "a panicky guide to installing Varnish" for
such database issues. Pretty embarrassing though!

------
Dowwie
Does anyone know the back story to Uber - why didn't it try to improve
Postgres rather than move on to feed on another host?

~~~
tehbeard
Most likely they don't have any engineers with the skills necessary to build
or improve an RDBMS.

(Note I said build, not use - they are different skillsets, like the
difference between driving a car and reconfiguring the engine to run on seed
oil.)

------
Sevrene
google cache:

[https://webcache.googleusercontent.com/search?q=cache:blog.2...](https://webcache.googleusercontent.com/search?q=cache:blog.2ndquadrant.com/thoughts-on-ubers-list-of-postgres-limitations/&num=1&strip=1&vwsrc=0)

------
f055
I came here to write a snarky comment, but now I can write two ;)

First: if you think any particular DB platform is clearly the winner of the
"DB wars", you are naive. There are so many factors involved in configuring
the DB, the backend, the frontend, etc. that you can always find a case where
the supposedly winning DB is failing, or the supposedly worse DB is
performing perfectly fine. And from my experience, you should always use the
platform/framework/language that is best for the current project, not the one
you madly love. Clearly Postgres wasn't working for Uber; that does not mean
it will not work for your project. I have a recent experience where a binary
file of a programming object works much, much faster than MySQL and solves
several other problems. Would I say "use binary files instead of an RDBMS"?
Of course not. But in this one case it does wonders. The "tech-vs-tech" wars
need to end; they are pointless.

Second: if you cannot set up your blog to withstand an HN spike, then maybe
you don't have as much real-world experience with scalability (albeit simple)
as you might think. (Hint: a static page cache behind a CDN will make you
almost bulletproof - and with Azure, for example, that's dead cheap.)

~~~
arnarbi
That second point is a really unnecessarily belittling straw man, and I think
such comments are counterproductive to the discussion.

~~~
debaserab2
I don't know, if you're going to critique the very talented engineers at Uber,
seeing your blog fall over due to capacity doesn't lend you a lot of
credibility.

~~~
thinkMOAR
On what information do you base your opinion that the engineers at Uber are
very talented?

So far, from what I have seen, the only thing Uber is talented at is
violating local laws and then throwing sacks of money at the problem to pay
fines or whatever (and inflating their own (bubble) value, but probably not
many people agree with that).

Seeing their blog posts about a mysql -> postgres migration followed by a
postgres -> mysql migration doesn't give me the impression that they are very
talented (they still might be, but so far no data has proven this to me).
Talented would be to foresee these issues and avoid encountering them at all.
At least, that would be my definition of very talented.

~~~
debaserab2
I'm not sure what anything you said has to do with their choice of database.

~~~
thinkMOAR
Then, by your own reasoning, I'm also not sure that it is worth a comment of
yours.

I clearly indicated that their recent switching of DB software twice to avoid
issues (which experts could solve) is, in my eyes, an indication that one is
not very talented.

~~~
debaserab2
Actually no, you didn't clearly say that at all. How do the experts solve the
fact that Postgres rewrites entire indexes on row updates?

~~~
thinkMOAR
"Seeing their blogs mysql -> postgres followed by a postgres -> mysql
migration" - is this not clear to you? Perhaps a visit to an optician is in
your best interest.

And why do you assume that I'm a PostgreSQL expert who can answer that
question?

Others did answer it, though, so if you had bothered to read the other
thousands of comments on the matter, you would not have needed to ask:

PostgreSQL Heap-Only Tuples (HOT), from: [http://use-the-index-luke.com/blog/2016-07-29/on-ubers-choic...](http://use-the-index-luke.com/blog/2016-07-29/on-ubers-choice-of-databases)

------
known
Doesn't Postgres use mmap files internally?

~~~
anarazel
No. It uses mmap(MAP_ANONYMOUS) for most of its shared memory; that's pretty
much all of its use of mmap.

------
raisingqs
Would simply coughing up the money for an expensive Oracle/SQL Server
solution have worked in this case?

------
asah
To me, the real news is that Uber ($50B company) didn't bother to engage the
postgres community before migrating - they'd have jumped to support Uber.

~~~
rch
I don't get the sense that Uber has had much stability in technical leadership
over its lifetime. Big moves like this can be as much cultural as technical.

~~~
snaky
What would be better to have in resume, "we were using PostgreSQL and after
some tuning it worked just fine" or "we designed and implemented scalable
modern BigData realtime OLTP solution"?

------
bechampion
"Error establishing a database connection" ^ Running on pg too?

------
grabcocque
'Error establishing database connection'

How very meta.

~~~
awalGarg
The site is powered by Wordpress, which, to the best of my knowledge, can work
only with MySQL. Now _that_ is very meta.

~~~
simon2Q
I believe the blog site does run MySQL. We eat our own dogfood wherever
possible, especially on important services... we hadn't regarded the blog site
in the same category until now.

------
ZeKK14
"Error establishing a database connection"

Genius !

------
ungzd
It seems that after lots of hyped NoSQL systems, companies are still using
MySQL or PostgreSQL as simple storage backends with lots of custom crutches
on top, like it's 2007. So did all these Cassandras and Riaks fail
expectations?

~~~
threeseed
This is one of the more ridiculous posts I've seen on HN.

Companies are taking a whole range of approaches to storing data, with a
combination of NoSQL, SQL, filesystems (e.g. HDFS), and everything in
between. Cassandra in particular is killing it right now under the
stewardship of DataStax, which is why they've grown from 1 person to 400+
employees.

