
Ask HN: What are pros and cons of PostgreSQL and MySQL? - stanhou
I searched the web but most of the articles are written several years before. Many cons have been fixed.
======
mrmondo
From many years of hosting both - MySQL seems easier at first but there are
so, so many problems with its internal design that hide problems until its too
late. PostgreSQL in my eyes is engineered in a much smarter way and has
sensible defaults. PostgreSQL won't implement something unless it's _going to
work_ and work properly. MySQL is analogous to software like Drupal and
'hobbiest' projects that have grown organically without proper engineering -
and when it's too late you realise you're missing data, MySQL makes it easy to
do things the wrong way, PostgreSQL enforces you do things the right way.
We've also found that PostgreSQL performance far outshines MySQL as long as
you've setup basic things like shared buffer sizes etc correctly.

~~~
jypepin
do you have specific example? (sincerely interested)

~~~
rmetzler
\- MySQL example: for MySQL there is one thing called Collation, which decides
about the correct ordering of chars in table values. The default was
`latin1_swedish_ci` for years, not `utf8_unicode_ci` which might be more
sensible.

\- Postgres Example: to set up Postgres correctly, you need a Postgres user in
the OS, who runs the database. That should enforce to not run the database
process as root.

~~~
CoachRufus87
I spent this past week learning/solving this very issue. MySQL's utf8
character set doesn't actually include _all_ utf8 characters. Use utf8mb4
instead. Same goes for collation; use utf8mb4_unicode_ci.

~~~
lathiat
The good news is they are changing the defaults in 5.8:
[http://mysqlserverteam.com/sushi-beer-an-introduction-of-
utf...](http://mysqlserverteam.com/sushi-beer-an-introduction-of-utf8-support-
in-mysql-8-0/)

------
tjelen
Here are some issues that I've seen becoming important when scaling to a
higher traffic with both MySQL and PostgreSQL (we're running both for
different kinds of databases). Some of this is second-hand from my colleagues,
so it might not be exact:

* Thread-per-connection (as opposed to process-per-connection with PG) means much lower memory overhead with many concurrent connections; with PG you might need to use something like pgbouncer, which doesn't come without its own issues.

* Logical replication; the MySQL format is compatible between minor release versions (e.g. between all 5.x releases), which enables upgrading individual replicas or master without complete shutdown. With PG this often a becomes way more complex.

* Better caching/buffer management. Essentially, InnoDB has its own implementation of LRU cache while PG relies on kernel page cache. This generally allows finer tuning on the database configuration level and you don't have dive into kernel page cache tuning (which will be always OS-specific).

EDIT: most of this stuff is probably even better described in the Uber
article, linked in other comments here.

~~~
dijit
A note on logical replication. That exists in pg as an add-on right now. And
will be in base/core when pg10 ships.

~~~
falcolas
Any ETA on the PG equivalent of mixed replication - row based whenever
possible, logical when row-based doesn't work?

Logical replication has the problem that missed messages can have really nasty
effects on your data down the road.

~~~
evanelias
fwiw, all 3 of MySQL's binlog formats (statement-based, row-based, mixed) are
still actually logical replication:

* Statement-based is logging raw write SQL

* Row-based is logical effects of executing that SQL

* Mixed uses statement-based for everything except statements that are non-deterministic or otherwise unsafe for statement-based replication

Physical replication in MySQL would be storage-engine dependent. For example,
in InnoDB physical replication would mean shipping the InnoDB transaction logs
rather than the separate binary logs. Alibaba built something to do this, and
from what I understand Amazon may be doing something along these lines under
the hood in RDS MySQL and/or RDS Aurora.

------
rawnlq
Uber switched from mysql to postgres in 2013:

[https://www.yumpu.com/en/document/view/53683323/migrating-
ub...](https://www.yumpu.com/en/document/view/53683323/migrating-uber-from-
mysql-to-postgresql)

but switched back to mysql in 2016 which generated a lot of discussion:

[https://news.ycombinator.com/item?id=12166585](https://news.ycombinator.com/item?id=12166585)

There were a lot of followup posts afterwards arguing both sides if you search
for them:

[https://news.ycombinator.com/item?id=12216680](https://news.ycombinator.com/item?id=12216680)

~~~
heimatau
It's strange to me everyone touts Postgres but when you finally scale to a
medium to large architecture HN has a handful of articles on great start-ups
(now full fledge companies) switching FROM Postgres to MySQL. It's more
efficient, atm, especially when accounting for labor. I wish the HN community
would see that.

~~~
falcolas
There are more MySQL experts out there than there are PostgreSQL experts
(though the number is still probably under 10,000 worldwide). MySQL
replication and scaling is a well understood solution, and thoroughly battle
tested. There are dozens of good solutions out there for handling near-
instantaneous failovers, using little more than the tools provided by MySQL.

It has a lot of problems, but when you need to scale, you can do so much more
affordably with MySQL.

------
jgowdy
ITT: Many people bashing MySQL for the exact things that OP is referring to in
"Many cons have been fixed."

MyISAM isn't used anymore. MySQL has ACID transactions and foreign keys. Get
over it. The engine is still there for users who want to use it and that's not
a reason to knock MySQL.

Almost every one of the correctness and SQL standard flaws in MySQL is handled
by the sqlmode flag. It allows users to choose correct behavior, but doesn't
suddenly break the millions of MySQL apps when they upgrade. To suddenly break
millions of apps to be correct by default without sufficient time for apps to
migrate makes no sense.

The fact is, MySQL is significantly faster for simple CRUD applications which
is really common in today's REST and Microservice designs.

That being said, Postgres is an amazing RDBMS, and would probably become the
#1 database server if administration tasks like SHOW DATABASES weren't
accomplished through \l (or if \l where a shortcut).

Postgres users wonder why people use MySQL. They wonder why people use NoSQL
when storing JSON in Postgres can be faster. The fact is, the learning curve
of Postgres is needlessly higher than these other systems.

~~~
morgo
Product Manager for the MySQL Server here -

I would like to re-enforce a good point you mention: quirks can be a side-
effect of making sure you have a good upgrade story. Sure; we would prefer not
to have them, but you want a vendor which has a strong commitment to backwards
compatibility too. There is a balance: and I think we've gotten better at it
in the last 5 years. The default is now STRICT mode on (since 5.7/2015 and new
configurations since 5.6/2013).

Upgrade is a feature too :-) Because of the rate of CVEs at the moment, it is
important to keep within the vendor EOL policy. Because MySQL Replication is
logical, most users do major-version upgrades without downtime.

~~~
kazagistar
Quirks are a function of compatibility, yes, but also of how bad your initial
design was. Mysql having a lot of quirks doesn't meant it's better at
compatibility, it means it just had a worse design to begin with and had to
compensate.

------
Shelnutt2
Currently we are using MariaDB for most database needs. The main reason we've
continued to use MariaDB is for the multi-master replication. It is simple and
easy to setup, we've designed our applications assuming there is always a
write master, but aware of possible slag lag. So far having a multi-master
setup is not as simple or robust in PostgresSQL.

We also use tokudb storage engine for a large number of servers. This engine
is designed for datasets that do not fit into memory, and we've had a great
experience with it. We are able to load CSV's in 1/3 to 1/5th the time of
innodb. Most queries aggregate queries run faster, there is online index
creation and column changes. From a DBA prospective tokudb prints detailed
statuses in the show processlist! It is definitely a niche storage engine, and
not right for all cases but this has contributed to our continued use of
MariaDB of PostgreSQL.

With PostgreSQL 10 things are looking interesting. We tested out 9.6 with it's
parallel query and saw some significant improvements on aggregation queries.
It was not enough by itself to warrant use moving off of MariaDB but has me
very excited going forward.

MariaDB is not sitting still, there are many improvements coming to 10.2 which
should be released soon. For most applications I'd use whatever database you
are familiar with. If you have a specific case, terabytes of data, millions of
incoming writes, requirements for live aggregation, that is where the
PostgreSQL vs MariaDB really gets exciting.

~~~
gaius
_This engine is designed for datasets that do not fit into memory_

Can you explain a bit more about why this is "a thing" please? Because pretty
much every database does this, and has done since the 1970s.

~~~
phamilton
For the last little while, it's been recommended to have enough RAM for your
entire working set.[0]

While most engines will still work with frequent disk paging, performance can
degrade rapidly. An engine designed for the paging can be smarter about
performance, at the cost of performance on smaller datasets.

[0]
[http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_B...](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html#CHAP_BestPractices.Performance.RAM)

------
ddebernardy
MySQL works fine if you chiefly interact with your DB through an ORM or using
basic queries.

Postgres shines if you need advanced functionality: recursive queries, lateral
joins, window functions, foreign data wrappers, partial indexes, advanced
types... The list of things MySQL doesn't do as well or at all, be it user
facing or under the hood, is fairly long.

There are plenty of examples of both DBs scaling successfully in the wild.

IMO it's a coin toss decision for most CRUD apps. For more advanced apps, you
might want to pick Postgres over MySQL for the same reason some companies
prefer niche languages over Java or PHP: the pool of competent engineers is
smaller but tends to be of higher quality.

~~~
pg314
> MySQL works fine if you chiefly interact with your DB through an ORM or
> using basic queries.

I ran into an issue [1] with MariaDB a year and a half ago: 'SELECT * FROM
table' failed to return a large fraction of the rows of that table. Selecting
them in a number of different queries with ranges of ids worked fine. Nothing
fancy, no transactions, on an idle database. I might have been unlucky, but I
will pick PostgreSQL over MySQL/MariaDB any day if I have the choice. Having
had a cursory look at the MariaDB source code and a more in-depth look at the
PostgreSQL source code confirms that preference.

[1]
[https://jira.mariadb.org/browse/MDEV-9347?page=com.atlassian...](https://jira.mariadb.org/browse/MDEV-9347?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-
tabpanel)

~~~
falcolas
MariaDB != MySQL

They're protocol compatible, but MariaDB has been drifting away from the core
storage engines and optimizer for some time. For good and ill (I think you hit
the latter).

------
dijit
Background on me; I abuse databases. I currently use pgsql as a k/v store with
locking metadata because it's the only database technology that supports being
forced onto disk (not just writing into VFS). It's also pushing huge row sizes
(1MB)

My team spent many months trying to mangle MySQL into doing a similar thing
(which it will do) but eventually we hit a performance stop with MySQL.

We did many things to improve the performance; switched to using fastsockets
in the kernel, changed memory allocators (tried two different ones) but at
40cores and 8 pci-e ssds it was spin locking some function in memory.

Not to mention it's history with throwing data away.

Most people use MySQL because they know it already- and I would say that's
fair comment. But getting started with PostgreSQL is easier and it scales
better, so it's hard for me to agree with using it in 2017.

To get started with a random test db you can do:

mkdir database

pg_ctl -D database initdb

pg_ctl -D database start

psql -h database

~~~
evook
> But getting started with PostgreSQL is easier and it scales better, so it's
> hard for me to agree with using it in 2017.

Facebook and Youtube use MySQL, YT via Vitess, in 2017. I am just curious if
you could exactly point out what part of PSQL scales better.

~~~
evgen
To say that Facebook uses MySQL is a bit of a misnomer. No one at Facebook
outside of a hundred or so people on the db team uses MySQL directly, it is
just an on-disk storage system with a basic SQL front-end that can be used as
an intermediate translation layer and even then it was more of a legacy issue
than anything inherent to MySQL as a technology. Please don't use it as an
example of why someone should consider MySQL, as there is almost no way anyone
else is going to be using it in a similar fashion.

Facebook has spent literally thousands of man-years working to make this
system scale and perform in the way they need it to, but there is nothing
about this effort that required MySQL and in fact a lot of effort was expended
working around glaring deficiencies in MySQL. If they had it to do over again
I think that no one at Facebook would select MySQL at this time.

~~~
evanelias
Just about everything you've said here is factually incorrect.

There are far, far fewer than 100 people working on MySQL at Facebook. I say
this as a former member of the team, and I am still in close contact with
several team members.

Additionally, every engineer at FB has the ability to issue ad hoc SQL
queries, and many teams do use special-case MySQL DBs that they interact with
directly. I was the lead on FB's DBaaS so trust me that I know what I'm
talking about here. Over 1000 engineers had used the DBaaS at the time I left
the company.

There is no legacy issue involved. FB is changing storage engines in MySQL
right now; they have the resources to completely move away from MySQL if there
was a compelling reason. Your statement of "If they had it to do over again I
think that no one at Facebook would select MySQL at this time" is completely
incorrect.

Even if you ignore FB, a much larger percentage of the largest sites on the
internet use MySQL than Postgres.

~~~
evgen
As a former manager on the ops side who worked closely with the managers of
the db teams, and the hadoop/hbase teams, and the presto folks, wormhole,
scuba, a few other teams that may not be public knowledge, etc. I can say with
absolute certainty that the only people in the company who thought it was not
a sad joke that we used MySQL down in the bottom layers of the stack were the
people on the db team. And why wouldn't that team like MySQL? Knowledge of its
intricacies and bugs and how to work around its problems is what got them this
high-paying gig.

I will simply state that having also spent a long time inside FB I disagree
with your assessment regarding the high regard in which you think MySQL is
held within the company.

~~~
evanelias
> I can say with absolute certainty that the only people in the company who
> thought it was not a sad joke

You can say with _absolute certainty_ that you know the opinion of _thousands
of engineers_ at Facebook? This is clearly hyperbole. It does not appear you
are interested in having a meaningful discussion on this topic.

> I disagree with your assessment regarding the high regard in which you think
> MySQL is held

Where did I say it's held in high regard? I simply said there is no compelling
reason to move away from it, and also said your numbers were way off base in
terms of how many people are working on MySQL at FB, as well as how many
people directly interact with MySQL at FB.

I also greatly question how closely you were interacting with the managers of
the db teams if you think there were 100 people on these teams!

[EDIT: It's also possible that we just have very different definitions of
people "on the db team". For people "working on MySQL" I would only include
MySQL Eng, DB Client, MySQL Infra PE, and the majority of Data Performance PE.
Arguably you could include some RocksDB people at this point, though their
work is used in many non-MySQL ways too, and even then that brings the total
to still way under 100 people. Overall I would not include teams like Wormhole
or TAO as "working on MySQL", though even if you add them in, still
significantly under 100.]

------
thom
If you are just looking for something to point an ORM at, you are probably
fine with MySQL, you may see performance gains, and administration is probably
a little more straightforward (and certainly more googlable).

If you spend actual time in the database using SQL, Postgres offers vastly
more functionality, e.g. window functions, broader and more mature GIS
support, more data modelling options. Postgres also certainly _can_ outperform
MySQL in a variety of situations - it supports different indexing options
(i.e. last I checked MySQL didn't have partial indexes), and is moving to more
parallelism in its query plans. MySQL's partitioning support is a little more
natural than Postgres's but that's changing in 10.0.

So, I would say, if you just want a back end for an app and have never
_really_ cared about your RDMBS, use MySQL or whatever you're most familiar
with. If you have complex query requirements, time series data, analytics
workloads etc, try Postgres.

~~~
scrollaway
Don't use a relational DB if you deal with time series data. There's time
series databases which are far, far better at that. InfluxDB is a good
example.

But you're also making it sound like MySQL is easier to deal with than
Postgres and I find that to be completely untrue. Postgres has far better
documentation, sane defaults and only one engine whereas the mysql
documentation is all over the place and "googlable information" conflicts
depending on your engine or hell, even whether you're using MySQL or MariaDB.

In fact if you don't want to have to care about administration, I'll go as far
as recommending Amazon RDS Postgres.

~~~
thom
I think there are perhaps 'soft' and 'hard' requirements for time series data.
I do a lot of analytics based on (mostly offline) data that is sequential, and
I benefit greatly from clustered indexes and window functions for many ad-hoc
queries. If you're talking inserting millions of readings per minute, then
yes, more targeted solutions exist. I think moving to esoteric databases
before exhausting what's possible with RDBMSs is generally suboptimal, but
other people are smarter than me.

~~~
stymaar
In my experience, the main benefit of influxdb are its built-in functions
(percentiles, integration/derivation, etc.) but its scaling ability is over-
emphazised : in my last company my coworkers have had issues exceeding 30k
insertions per minutes, way less than «millions per minute».

~~~
profquail
How long ago were you at your last company? I wonder which version of InfluxDB
your coworkers were using? There've been some significant performance
improvements in v1.0.0 and beyond, compared to the earlier beta releases.

~~~
stymaar
It was in 2015 and early 2016, before version 1.0. You're right that's an
important piece of context I should have added.

------
falcolas
Disclaimer - I worked professionally with MySQL for several years, and
actually don't mind using it. I've also used PosgreSQL professionally, and
don't mind using it either.

The best advice I can offer is, "Use what you're more comfortable with." Both
can be configured on a single machine to serve a ton of traffic. Both can be
backed up to recover when they falling over. Both can be configured to keep
your data consistent and safe. Don't worry about scaling, don't worry about
sharding or how much better indexing method A is than B. Worry about your
application.

Now that the obvious is out of the way, I personally prefer MySQL over
PostgreSQL. It will be easier to scale when the need arises (the corollary to
"avoid premature optimization" is "design to make optimizations easier"), it
performs like a madman when configured properly, and there are simply more
people out there who can help you configure it properly.

Tangent time: yes, it is comparatively feature poor. I can't store native JSON
documents in it, I can't run Python stored procedures, I can't create nearly
the same breadth of indexes. Then again, I have never needed to. JSON maps
nicely to objects, and an ORM maps those objects (somewhat less nicely) to
native tables and columns. I don't run stored procedures. I use purpose built
tools for handling (for example) full text indexing when the built-in searches
get too slow.

My biggest question, when everyone points out how many features PosgreSQL
offers, is "why are you doing that in the database?" Your database is your
biggest chokepoint. It's your biggest point of failure. If you're using it for
anything more than "transactionally store and retrieve data", you're asking
for even more trouble when it goes down.

Not if, when.

It also makes scaling harder; if you're doing most of your computations in the
DB, you have to scale the DB when one machine can no longer handle it. Scaling
DBs is inherently harder than scaling just about any other component in your
stack; you can't just add another server to the pool.

In summary, if you use your database as a database, and only as a database,
either option will stand you in good stead. The biggest advantage MySQL offers
at that point is that there are simply more experts out there who can help you
with it when you need that help (and will probably cost you less
professionally).

~~~
scrollaway
I want to rebuke a couple of points here:

> _there are simply more people out there who can help you configure it
> properly._

There are also more people out there who can help you mess it up. MySQL has
two engines and three different mainlines. Advice that applies to MariaDB
doesn't necessarily apply to MySQL. Advice that applies to MyISAM doesn't
necessarily apply to InnoDB.

A lot of people in this thread are underestimating how much of a cognitive
burden that is. Postgres has one mainline, one engine, one set of very solid
documentation and its community is relentless about strictness. This greatly
improves the _quality of help_ you will receive (be it through Google,
documentation, or contractors) compared to MySQL.

> _" why are you doing that in the database?"_

I'm a poor person to answer this but a lot of your post really stems on this
question. You don't see the need to do certain things in the db, thus it
doesn't bother you that you can't do those things in MySQL. That makes sense.
At any sort of load though doing certain things in the database is a
requirement. You may have to scale the DB earlier than you otherwise would,
but it also sometimes means that application-level code you would spin up 8
dedicated mapreduce instances for requires orders of magnitudes less work and
you _don 't_ have to deal with scaling as early in the first place!

------
ubernostrum
Postgres adhered to a philosophy of "make it correct, then make it fast".

MySQL adhered to a philosophy of "make it fast, then if someone complains
enough, consider making it correct".

The result of this is predictable: Postgres is now fast and correct in its
implementations, while MySQL is fast and burdened with a large number of
"gotchas" and legacy incorrectness.

For example:

MySQL's traditional "fast" table type doesn't support transactions. So you
have to select a different table type if you want transactions, but then when
you do, you discover that you need to usually go "up" one level from what
you'd expect, because the REPEATABLE READ level only affects SELECT, not
INSERT, UPDATE or DELETE queries. Which in turn means you can have an
inconsistency between the set of rows you'd see in a SELECT with a set of
conditions, and the set of rows that would actually be affected if you issued
an UPDATE with that same set of conditions. You have to go up one level to
READ COMMITTED in order to start solving this.

MySQL's Unicode collations have some bugs; when I was at Mozilla, one we ran
into was that we had to set up our own custom collation to get, for example,
"e" and "é" to be considered distinct characters (if you ever used MDN, and
once saw a strange bug where French versions of topic tags showed on an
English article? That's why -- MySQL thought "CSS Reference" and "CSS
Référence" were the same tag).

MySQL infamously accepted almost any kind of garbage values regardless of
declared column type, accepted values larger than the declared column width
(i.e., you could insert a 300-character string into a 256-character VARCHAR
column), and would silently or near-silently mangle, truncate or just drop the
data.

etc., etc.

I cannot at this point imagine using MySQL for any kind of serious project.

~~~
nailk
> MySQL's traditional "fast" table type doesn't support transactions

InnoDB is the traditional table type now. Myisam was forgotten long time ago.

> we had to set up our own custom collation to get, for example, "e" and "é"
> to be considered distinct characters

You could just use "varchar CHARACTER SET utf8 COLLATE utf8_bin"

> MySQL infamously accepted almost any kind of garbage values regardless of
> declared column type

There is a setting that allows strict checking.

> I cannot at this point imagine using MySQL for any kind of serious project.

It is used by many for serious projects.

~~~
philliphaydon
In regards to charset and strict checking, these are settings which should be
defaulted. I think the problem is that people run into these issues then have
to seek solutions, which shouldn't be issues to begin with. I don't know why
they aren't now defaults.

~~~
morgo
Strict has been the default since MySQL 5.7. The default character set
switches to utf8mb4 in MySQL 8.0.

~~~
philliphaydon
That's great to hear.

~~~
i336_
I'm reminded of
[https://developers.slashdot.org/comments.pl?sid=204433&cid=1...](https://developers.slashdot.org/comments.pl?sid=204433&cid=16703529)
as I read this exchange.

I found the above link in another HN thread some time ago. When I showed it to
some PHP devs recently (I unfortunately don't remember exactly who I talked
about it to) I was promptly told PHP does not do what is described in the
above link (from 2006) and is generally smarter nowadays.

It's sad that languages and runtimes take so long to fix their core insanities
(if ever), and it's also sad when systems are built without front-and-center
regard to design correctness from the start. But on the flip side, the only
people capable of envisioning good designs of this level of scale are often
worn out shells of their former selves at the "you'd have to pay me 7 figures"
stage... and of course you have to make sure you pick people who aren't
completely out of touch with the latest developments in $category science...

------
nottorp
This is not a technical reason, but MySQL is now owned by Oracle, which
carries a whole host of risks if you're a small fry. A technical comparison
should be done between MariaDB and Postgres if you ask me...

~~~
darkhorn
For example we have upgraded our MySQL 5.5 to 5.7 and we were hit by a
performance issue in few GROUP BY queries. Then we decided to use 5.6 instead
of 5.7. I think the Oracle corporation doesn't care much about MySQL.

Edit: In MySQL 5.5 and 5.6 it takes around 1 second but in 5.7 it takes around
70 seconds.

~~~
morgo
From MySQL 5.6 to MySQL 5.7, the cost model changes significantly. So while
most queries will improve - there are edge cases where regressions occur. In
anticipation of this occurring (even if you are better 99% of the time; there
will be cases where two wrongs made a right), MySQL 5.7 also added new query
hints.

I work on the MySQL team; we are 2x the size under Oracle, and have made many
improvements. Please check out my list here:
[http://www.thecompletelistoffeatures.com/](http://www.thecompletelistoffeatures.com/)

------
antoaravinth
Note I'm not an DB expert. I was trying to evaluate Postgres and Mysql for my
side project. I went ahead with Postgres after doing the following example:

#Mysql

create table test (name varchar(3));

insert into test (name) values ("hello world");

>Query affected 1 row // success!!

Even though my varchar size is 3, Mysql didn't even complain about my string
size in insert statement. The weird thing is that, when you try to retrieve
the value it returns only the first 3 chars which is "hel" (What??). I was
expecting an error message but Mysql was happy to eat my error!

#Postgres

create table test (name varchar(3));

insert into test (name) values ("hello world");

>> Error, value too long for the given varying (3)

The Postgres looks doing simple things correct for me.

------
no1youknowz
I've migrated from MySQL to Postgres to use CitusDB (for a company project,
many years ago).

For my own startup which the primary use-case is analytics, I have moved back
from Postgres to MemSQL.

The four reasons were :-

1) In-memory rowstore, thus eliminating my in-memory nosql dependency.

2) The ability to join in-memory tables with disk to provide up-to the second
data is awesome.

3) Multiple read/writer group nodes.

4) The biggest for me. Columnar storage.

Some examples on how columnar storage is so awesome.

Note that I am comparing MariaDB vs MemSQL on a old desktop. 8 cores, 8 GB
Ram, HD (non ssd) drive.

\- A query with a count of 1 table with 50m rows.

MariaDB - 49.60 sec

MemSQL - 1.12 sec

\- A query with a count, sum, where clause and group by, with 50m rows.

MariaDB - 17 min 44.84 sec

MemSQL - 8 seconds

\---------

Here are some things to be aware. Prior to migrating over to Citus. We were
shopping for MySQL consultants to see whether they could improve on the above
figures. They quoted $70-100k to work on the project, without guaranteeing
results.

With Citus I was able to see far better results and this is why the company
switched over. However fast forward to now, MemSQL beats Citus soundly. 1
desktop beats a cluster.

I wager, that with a MemSQL cluster with decent hardware on the cloud, I will
be seeing sub 1 second queries across the board.

Oh and the amount of data? It's tens of billions of rows a month. That's the
workload I'm talking about here.

\---------

This post isn't to bash postgres. I loved using it. If Postgres 11/12,
included columnar storage. I would literally scream! :)

~~~
lima
And apparently ClickHouse out-performs MemSQL:
[https://clickhouse.yandex/benchmark.html](https://clickhouse.yandex/benchmark.html)

Has anyone here experience with ClickHouse?

~~~
dignan
No experience, but did a thorough read through of the docs. One thing to keep
in mind about clickhouse is that their replication guarantees aren't very
strong. From the docs: "There are no quorum writes. You can't write data with
confirmation that it was received by more than one replica."

That's pretty troubling, but at least they're open about it. That said their
performance claims are pretty spectacular, and it seems solidly engineered.
Further if you're not planning on using replication it certainly seems
interesting. I'd be curious to hear about someone's production experience as
well, since the list of companies running it seems rather thin.

~~~
ztlpn
(ClickHouse dev here)

Yes, replication in ClickHouse is asynchronous by default. For intended use
cases (OLAP queries aggregating data from many rows) data that is a few
seconds stale is usually okay. In a serious production deployment you
absolutely should enable replication, otherwise you risk losing all your data,
not just last couple of seconds of inserts.

That said, sometimes synchronous replication is necessary despite the latency
penalty that comes with it. This feature is actually implemented but not yet
considered ready for prime time.

We have several years of production experience with ClickHouse (as a DBMS
powering Yandex.Metrica - second largest web analytics system in the world).
If you have questions - just ask.

------
lngnmn
PostgreSQL: being written in the sane subset of C instead of being a pile of
amateurish C++.

As a result of this design decision the resource management of each major
part, like the buffer pool, the lock manager, etc is explicit and
straightforward.

This, in turn, leads to indeed stable and predicable runtime behavior during
high loads.

Basically, PostgreSQL is one of very few canonical examples of what sanity and
attention to details could accomplish.

------
some_developer
We just recently switched from MySQL to PostgreSQL. God we're so glad:

\- add a null column to a big table => have fun and plan your maintenance
accordingly. Good luck estimating it. This is basically a no-op in Postgres

\- partial indices; absolutely important for big tables. Think about ticketing
systems where in general the number of "open tickets" is much lower than the
closed ones

\- being able to use advanced SQL features like window functions can easily be
underestimated; you can't use potential you don't have

\- renaming indexes, just like that. If you're into refactoring and lessen
developer surprises, you want to have proper names and fix accidently mistakes
or just legacy stuff

\- in my experience (but I'm no DBA), the logging of PostgresSQL is much more
useful then what I get with MySQL (e.g. deadlocks or slow logs)

\- the EXPLAIN (ANALYZE, BUFFERS) output is ... what you get is beyond
comparison to MySQL

The only technical downside we experienced with Postgres (but in comparison,
these things are benign):

\- limit of length of identifiers; even with aliases. This can trip ORMs which
try to do some clever naming under the hood

\- you cannot change the order of columns, you can only append at the end

To me, everything I do with Postgres feels much more predictable, whereas with
MySQL it always felt a game of gamble.

------
mamcx
All the talk here and everywhere have convinced me that:

\- MySql is the choice for _niche_ workloads and to be used by experts not
only in DB but in MySql Quirks.

\- PG is the sane choice for _everyone_. Experts benefit more, but
inexperienced developers will be better served from a more solid foundation,
and still PG is so good that will be along your side for years to come.

PG _maybe_ will not shine in some niche workloads but we are talking about
problems that most DO NOT HAVE.

\---

Years before, I have read (and believed!) that "PHP, MySql, JS" was "better"
for inexperienced (or new) developers, because are more "easy" to learn and
use.

Then eventually is show how facebook and others used them.

However, after +18 years on the field, I become more and more convinced that
is totally the reverse:

\- PHP, JS, MySql, etc are tools _for experts_.

\- More strict tools are better _for everyone_.

\- Only expert will use correctly any tool, even if have quirks.

\- This is even more notorious if the tool is more dynamic than static, or
provide more "freedom"/less "safety", like C VS. Pascal.

\- More well designed or strict tools provide equal benefits to experts, but
sometimes a tool with not safety-protection allow to do hacky things easier
(or possible).

~~~
ben_jones
It's funny to me because on one hand you have specialization where larger
companies will typically abstract database work to the DBAs and application
work to the developers creating some sane interface for interop and edge
cases. And then on the other hand you have things like "full stack developers"
who leverage new tech like kubernetes to essentially stay "that one guy who
does everything".

"Jack of all trades master of none" and all that. What's the natural
conclusion to this or does it stay this way forever?

------
chris_wot
The other day I saw a Google engineer with a shirt that read "MySQL is the
Windows ME of databases".

~~~
ino
I've seen online far more PostgreSQL fans bashing MySQL than the other way
around.

It's like there's an online task force to make MySQL look like shit.

And for most cases it won't matter and one should pick what they're more
comfortable with. And if the requirements aren't that simple, people should do
their own research.

~~~
evgen
> It's like there's an online task force to make MySQL look like shit.

Yes, we call them the MySQL development team. For a long time MySQL was simply
the default choice for many projects, and this status has persisted long after
it was the best choice and is extending into the period where it is no longer
even a reasonable choice. If you have a specific reason to use MySQL then go
ahead and use it (and 'it is what I know and am comfortable with' is a
reasonable justification) but there are few reasons to suggest it to someone
else except for the fact that misery loves company.

~~~
ino
That's not what I was talking about. I stand by my comment, and you just made
my point stronger.

~~~
evgen
I stand by mine as well. There is no online posse trying to destroy MySQL,
just people who have actually used it and been burned by it who have decided
to warn others.

It is a mediocre at best technology when there are better options available,
and anyone who suggest it to someone else for a greenfield project should be
considered an untrustworthy source of technical advice.

------
pgris
Maybe I'm wrong, but last time I checked, adding a column to an existing (big)
table in MySQl locked the table for a long time (like hours), and the same
operation in posgress was finished in almost no time

~~~
lathiat
This has changed in many cases (though not all):
[https://dev.mysql.com/doc/refman/5.7/en/innodb-create-
index-...](https://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-
overview.html)

There are also a number of tools to work around it for other for other cases:
[https://githubengineering.com/gh-ost-github-s-online-
migrati...](https://githubengineering.com/gh-ost-github-s-online-migration-
tool-for-mysql/) [https://www.percona.com/doc/percona-toolkit/2.1/pt-online-
sc...](https://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-
change.html)

------
known
[https://en.wikipedia.org/wiki/Comparison_of_relational_datab...](https://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems#Database_capabilities)
is updated frequently

------
chvid
I have used PostgreSQL extensively before but now I just use MySQL. It is easy
to install, maintain and use - even on a Windows box.

Basically it gives me everything I need in a relational database and let's me
worry about other things.

~~~
SloppyStone
Would you say learning SQL (or relational databases in general) with mySQL is
easier than starting out with PostgreSQL?

~~~
chvid
Yes - it seems that MySQL is easier to set started with and there is more
information available online. Also a lot of cheap hosting uses MySQL which is
plus for hobby projects.

~~~
floatboth
More like, a lot of garbage outdated tutorials available online. Copy and
paste to get SQL injection vulnerabilities! Easy!

Seriously, what is "not easy" about Postgres? Just read the official
documentation, which is actually readable and good.

About hobby hosting… there is Heroku Postgres, but no Heroku MySQL. (There are
third party addons for everything, yeah). Cheap shared Apache hosting is
TERRIBLE. It should NOT be used for any new projects.

~~~
chvid
Yeah. Popularity and the ability to work at cheap horrible apache hosting
companies all the way to big bad facebook does speak volumes, doesn't it? :-)

------
mcv
I happened to be talking about this with a couple of friends just moments ago.
Their opinion basically amounted to this:

* MySQL is a really awful database. It's usable only for small hobby projects, like a simple Wordpress site or so. For serious datasets, it's a horror. Complex queries on large amounts of data are just broken. Many features don't work or work incorrectly. The design is fundamentally flawed.

* Use PostgreSQL instead.

* If you absolutely have to use MySQL, use MariaDB instead. It's a fork that fixed some of the issues.

* Still better use PostgreSQL. Or even Oracle or SQL Server. They're fine, MySQL is not.

~~~
6t6t6t6
It could be interesting to ask them what are they basing their opinion on and
if they can enumerate all the advantages of one database over the other.

> use MariaDB instead. It's a fork that fixed some of the issues.

Which ones exactly? ;)

~~~
mcv
Check their changelog, but one example I've heard is that a simple 15 minutes
upgrade to MariaDB improved performance for a particular query by a factor 5.

------
nailk
Is there something like TokuDB or MyRocks for PostgreSQL? These things allow
to reduce disk space usage by several times.

~~~
ptman
I guess you could get part of the benefit by using e.g. TokuDB via FDW:
[https://wiki.postgresql.org/wiki/Foreign_data_wrappers](https://wiki.postgresql.org/wiki/Foreign_data_wrappers)

------
mybrid
This kind of question is meaningless in my opinion without the workload being
defined. For example, is this for transaction processing like for banking or
web programming of social networking like with Facebook?

One also needs to consider the entire landscape. For example, the LAMP stack
is mature community with out-of-the-box, ready-to-ship web sites. There exists
lots of support and tools. Software stacks like Django for Python typically
have community involvement for a particular database. If one is using an ORM
like SQLAlchemy, what is SQLAlchemy best used with?

Here are just some of the things that should be qualified when asking database
choice:. 1\. Transaction support. 2\. Scale out support. 3\. NoSQL and other
data store integration. 4\. Cloud support if deploying in a cloud. Really
depends on your cloud vendor. 5\. Location, location, location. Is this an
internal application on your company LAN, or a social networking application
to be installed around the world? Replication in MySQL quite useful. 6\.
SQLite? Why did you exclude SQLite? I'm assuming because the person posting
this question already knows something about the workload that defeats SQLite?
What is that workload? Testing against SQLite is trivial, just create an in
memory database. 7\. ETL support. Are bulk updates due to ETL processing of
files and such common place? Bulk updates happen better when indexes are not
attached, etc. 8\. Reporting support. 9\. Denomalized versus normalized
support. It is common place in web applications with data distributed
applications around the world to shard a database table, completely
denormalized.

The point is that designing a database schema is heavily dependent on the
expected workload. The choice of SQLite, PostesSQL , MySQL, SQL server, Oracle
or what have you depends heavily on the workload. It is nonsensical to ask
what the right tool for the job is without knowing the job.

------
voganmother42
Mysql/Innodb: index organized table Pg: heap table

depending upon workload this can be significant.

[http://use-the-index-luke.com/sql/clustering/index-organized...](http://use-
the-index-luke.com/sql/clustering/index-organized-clustered-index)

~~~
SigmundA
This is a big thing I miss from SQL Server is clustered indexes. SQL server
gives you the choice so you can go either way depending on situation.

Was actually kinda of surprised after using SQL server for years then looking
at Pg that it didn't have IOT at all. Thought everyone had that option.

------
jhonyhacker
Postgres. The main reason I enjoy Postgres is because it is more standards
compliant than MySQL, which makes it a bit easier to find documentation
because it doesn't need to be specific to Postgres and easier to port to other
standards-compliant RDBMSes. I have had a much easier time scaling Postgres
with master-slave replication (it's only just come out in version 9 though).
Also partitioning in Postgres is super-easy. I don't know how MySQL fairs
these days, but Postres is completely transaction ACID compliant. I find user
management much easier in Postgres. You create the user and then specify where
they are allowed to connect from in a file. MySQL uses a separate user for
each host pattern they want to connect from. permalinkembedsavegive gold

------
i336_
I forgot to comment when this question was only a couple hours old :( but
hopefully someone sees this anyway.

MySQL has phpMyAdmin, which is (compared to some of the tools out there)
incredibly basic (and ever so slightly buggy), but really helpful for getting
started and providing a barebones "look at the database" interface. (In all
fairness, it does have some pretty decent features.)

pgPhpAdmin... isn't really the same. In my case I couldn't even get it working
- I don't remember what errors I got but as a complete beginner I wasn't sure
how to respond (I _think_ the errors I was seeing weren't giving any hits on
google either).

So, I'm curious what other web-based or Linux-compatible postgres
administration tools are out there?

~~~
kk_cz
pgAdmin ([https://www.pgadmin.org/](https://www.pgadmin.org/)), of course.

btw, if you want to work with mysql you might want to use mysql workbench
instead of phpmyadmin.

~~~
i336_
Wow, thanks so much!

pgAdmin looks great, and TIL about MySQL Workbench, which looks awesome (wish
there was a PostgreSQL Workbench! :P) - this will be great for learning.

Databases are one of those things I know I badly need to learn more about,
this is great start. Thanks again.

------
pella
For a performance: Check the TechEmpower FrameworkBenchmarks - Data updates -
check the 'DB' columns

\-
[https://www.techempower.com/benchmarks/#section=data-r13&hw=...](https://www.techempower.com/benchmarks/#section=data-r13&hw=ph&test=update)

-> Pg - Pg - Pg - Mo - Pg - Pg - Pg - Pg - My - My - My - My

\----

Round14 - Preliminary data!

[https://www.techempower.com/benchmarks/previews/round14/#sec...](https://www.techempower.com/benchmarks/previews/round14/#section=data-r14&hw=ph&test=update)

-> Pg - Pg - Pg - Pg - Pg - My - My - Pg - Pg - Pg - Ny ....

~~~
raziel2p
I don't think that's useful data. That could just as well be the code is more
optimized for postgres, or the postgres client library in the language being
used is more performant for some reason.

~~~
mwpmaybe
It's useful for comparing full stacks, but the queries used are very basic,
and most of the update benchmarks don't use transactions or locking.

------
jonbaer
The 2 pro things that impressed me on both platforms from recent builds are
probably under utilized: On PostgreSQL: Logical decoders
([https://www.postgresql.org/docs/9.4/static/logicaldecoding.h...](https://www.postgresql.org/docs/9.4/static/logicaldecoding.html))
and on MariaDB, the CONNECT engine
([https://mariadb.com/kb/en/mariadb/connect/](https://mariadb.com/kb/en/mariadb/connect/))

------
cygned
Using Postgres in Docker (for development of course!) is interestingly
challenging, if you have a Windows host - due to the architecture. All our
devs are working with macOS or Linux now.

------
bartl
Using Drupal, MySQL (or MariaDB) simply works much smoother than PostGres.
That is because apparently Drupal has been designed around MySQL. You may be
quite sure that if you are having a problem with Drupal on PostGres, that the
same problem doesn't exist when using MySQL.

In 2011, we started out using Drupal with PostGres. After a year or two, we
migrated to MySQL.

I think it's likely that the situation is quite similar with WordPress.

------
ftaurino
MySQL has BIG problems with views. If the view statement contains aggregate
functions, distinct, group by and some others constructs, the results from the
view are retrieved into a temporary table, which then is used to execute the
statement. If involved tables contain tens of thousands of records, this
behaviour will hinder performance. In PostgreSQL you don't have to worry.

------
camus2
> Many cons have been fixed.

it would be easier if you listed what you mean by "cons". Postgres is much
much more powerful than MySQL if you are really into SQL and exploiting an
RDBMS, MySQL administration is simpler.

This question is a bit too vague to warrant a specific answer.

~~~
mickeyben
> Postgres is much much more powerful than MySQL

I would say this used to be true in 5.5 and before but it's less and less of
an argument. 5.6 and 5.7 bring a lot of features like JSON or geospatial.
Here's a list of the new feature of 5.7:

[http://www.thecompletelistoffeatures.com/](http://www.thecompletelistoffeatures.com/)

~~~
philliphaydon
I've been looking at MySQL's JSON and it's still pretty incomplete compared to
PostgreSQL. It's still far superior to SQLServer 2016's abysmal offering of
JSON support tho.

(Want to be able to support Marten document store on more than 1 database)

------
e98cuenc
For me, a key advantage of MySQL is Sequel Pro. I've never found a GUI for
postgresql that can touch it.

If there was one, I will switch in a heartbeat as I will love to have access
to some more advanced SQL than what's available in MySQL (like window
functions)

~~~
dgelks
Agreed, Sequel Pro is far above anything I've seen in Postgres and is the
primary reason we are still using MySQL in production - business team just
love Sequel Pro too much

~~~
peter_pen
Try compare to Valentina Studio. It can do tons things which Sequel Pro do not
have.

~~~
dgelks
Just downloaded and will investigate - thanks!

------
pbhowmic
From my perspective MySQL does secondary index is much better than PostgreSQL.
Uber found that out the hard way.

~~~
lngnmn
PostgreSQL has implementation of almost every possible indexing technique,
unlike MySQL. It has been academic project (of great scholars) after all.

~~~
SigmundA
Except clustered indexes.

------
samfisher83
[https://db-engines.com/en/ranking](https://db-engines.com/en/ranking)

It seems like Everytime on HNs people say postgre is so much better. Why is
MySQL so much more popular?

~~~
lima
Legacy systems, tooling + operational experience, better clustering (for now).

PostgreSQL wins at everything else and the gap is closing fast.

~~~
evanelias
The largest MySQL-based companies have the resources to switch to another
database if there was a compelling reason, but they do not do so. The db fleet
sizes of these companies also dwarfs that of the largest companies using
Postgres. The "legacy systems" argument is faulty logic.

Facebook is in the process of changing MySQL storage engines (from InnoDB to
MyRocks), which is a massive migration involving a large number of engineers,
yet they're still staying with MySQL overall because it's the best fit for
their OLTP workload. Since pg doesn't even support pluggable storage engines,
it would be a total non-starter, among many other reasons.

~~~
lima
Thanks for the insights!

I'm actually using MySQL (MariaDB, actually) in production because it does
work fine, does all we need in an OLTP data store, ops tooling is very good
and we need reliable multi-master (Galera) for load balancing.

I'm not an expert on databases, but I've heard so many DBAs tell me how
PostgreSQL is superior with regards to features + correctness (features that I
don't actually need for our use case, but still) except for scaling out.

What's your take on this? If you were to start on a green field, what database
would you recommend for a typical SaaS/OLTP workload?

~~~
evanelias
I'd say it depends on your team's level of knowledge on various DB's, the
expertise available in your geographic area, and the expected size that your
data set would grow to.

I'm not sure there is a "typical" workload in the general case ;)

Don't get me wrong; Postgres is a really really good piece of software with
many appealing qualities. But its fanbase is a bit overly rabid, or at least
over-eager to trash MySQL for things that haven't been true in years or never
were true in the first place. There are some very valid MySQL complaints mixed
in there too, but it can take some effort to separate them out...

~~~
i336_
Hmm. Have you seen any wholesale feature comparisons between the two that
you'd consider objective and fair, and which debunk old myths in the process?

~~~
evanelias
Unfortunately I haven't. People who are equally well-versed on both databases
seem to be the rarest of unicorns.

------
vesak
PostgreSQL will always be technically superior, and it is a single product.

MySQL is MySQL or MariaDB.

------
jballanc
Pros and Cons depend largely on what you're hoping to use your DB for and how
you intend to deploy/administer it. Feature-wise, Postgres has a break-neck
speed of new feature adoption compared to MySQL. That could be a Pro or a Con
depending on your perspective. If you're dying to try out deep indexing of
JSON columns in your SQL DB, then you definitely want Postgres. If you need
something stable and reliable, then MySQL might be a better choice.

Not to be trite, but I think the biggest high-level Pro/Con comes down to
this: Postgres has the more active OSS community, MySQL has Percona.
Seriously, if you're in a situation where you have the option of spending
money to solve problems, then having the option of hiring Percona to fix your
problems alone makes MySQL a sound decision. If you don't have a large cash
reserve, but are willing to spend time combing through message boards and blog
posts, pick Postgres.

~~~
JamesMcMinn
You seem to imply that Postgres isn't stable when in the fact the very
opposite is true. Postgres is one of, if not the, most safe and stable
relation database systems to ever exist. The fact that they keep adding or
improving features does not take away from this, it only means that you have
the option to upgrade if you want the new features.

~~~
egeozcan
> Postgres is one of, if not the, most safe and stable relation database
> systems to ever exist.

I would even say that it's one of the most safe and stable _software_ to ever
exist.

