Ask HN: It's 2023, how do you choose between MySQL and Postgres?
219 points by debo_ on May 11, 2023 | 347 comments
Imagine you are starting a new project, and your backing datastore options are restricted to mysql or postgres (and/or their cloud-tailored equivalents). What sort of functional requirements would cause you to choose one over the other?



Postgres. Fast, full-featured, rock-solid, and a great community.

I think many of us can’t be bothered to go over (again) the issues we’ve had with MySQL in the past. The last straw for me was about ten years ago, when I caught MySQL merrily making up nonsense results for a query I’d issued that accidentally didn’t make any sense.

Very likely this particular issue, and others like it, have been fixed in the meantime. But I just got the sense that MySQL was developed by people who didn't quite know what they were doing, and that the people who really did know what they were doing were never likely to be attracted to the project to fix it.


FWIW, it hasn't changed in ten years.

Here is an 18-year-old bug - DELETE triggers don't fire for foreign key cascades: https://bugs.mysql.com/bug.php?id=11472

That makes the entire feature mostly worthless. Reported in 2005, last updated in 2008.
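To make the failure mode concrete, a minimal sketch (hypothetical schema; this is the behavior described in the bug report):

    CREATE TABLE parent (id INT PRIMARY KEY);
    CREATE TABLE child (
      id INT PRIMARY KEY,
      parent_id INT,
      FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE CASCADE
    );
    CREATE TABLE audit_log (child_id INT, deleted_at DATETIME);
    CREATE TRIGGER child_audit AFTER DELETE ON child
      FOR EACH ROW INSERT INTO audit_log VALUES (OLD.id, NOW());
    -- This cascades to child, but the AFTER DELETE trigger never fires,
    -- so audit_log silently stays empty:
    DELETE FROM parent WHERE id = 1;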

---

While I would choose PostgreSQL every time, MySQL has the following advantages:

1. Write-performance, due to fundamental design tradeoffs. [1]

2. Per-connection resources, due to single-process design.

3. Related to #1, no vacuum requirement.

[1] https://www.uber.com/blog/postgres-to-mysql-migration/


Good, you shouldn't be using them. You shouldn't be using foreign keys either. It just makes working with data harder, and doesn't help with constraining it if your data modifications are inside transactions and properly written statements.


Having used postgres for the past decade, I tried MySQL for a side project to see what's changed with it. The sad answer is that it feels like nothing has changed - Oracle seems to have let what used to be a core technology of the industry languish.

I'm sure there are use cases where MySQL will be the better choice over postgres, but the future for the stack looks bleak.


    Oracle seems to have let what used to be a core 
    technology of the industry languish
I think slowly squeezing the life from MySQL was a very explicit goal for them. After the big wins (Wal-Mart, etc.) MySQL had 15-20 years ago, I think it was very clear MySQL was going to chip away at more and more of Oracle's business.

I wonder how much Oracle spends on MySQL every year? They're spending a lot of money to keep MySQL at kind of a "not quite good enough" state. But they can't kill it outright - it'd be like boiling a frog fast instead of slow.

In the end, I wonder what extinguishing MySQL really accomplished for them. It might have bought them some breathing room but Postgres quickly filled MySQL's old segment.


Strange take when FB, Twitter, Square and new startups such as Faire (the #4-valued private YC co) are all using MySQL to some/large degree. Stripe uses MySQL too, in combination with other DBs including Postgres.


You're not considering the timeline of events.

I'm unfamiliar with Faire but the rest were already using MySQL at the time of Oracle's acquisition in 2010. Switching backends would have been rough for those companies and this was... 2010, meaning Postgres was not nearly as performant or full featured as it is today. As mentioned in other comments, FB's investment in customizing MySQL has been extensive. They've poured a lot into their own fork of it.

More to the point, look at MySQL's progress since 2010. Do you think it has been largely stifled since then, or do you think it has kept pace with Postgres? It's been largely stagnant.

I'd love to hear your alternative theory, of course. You think Oracle bought MySQL to... what, exactly? Make it amazing?


(Full disclosure, I work for Oracle)

Anyone who says no investment has been made into MySQL, I suspect, never took the time to read the features/release notes for MySQL 8

https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html


Didn't 8 ship 5 years ago?


Their release model changed with MySQL 8 -- they do rolling point releases every quarter with new features sprinkled in as they're ready. Quite a few new major features have been released that way, including INSTANT alters, parallel index builds, CLONE plugin, major changes to how UNDO logs are sized... it's more like Windows 10's release model.
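For instance, an INSTANT alter (illustrative table name) adds a column as a metadata-only change, without rebuilding the table:

    ALTER TABLE orders ADD COLUMN note VARCHAR(255), ALGORITHM=INSTANT;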

Very recently they've mentioned they'll be changing this again to have separate LTS releases, which is a positive change stability-wise.


Really wish they'd bump up the version number or something. It would make discussions about MySQL a lot easier.


Well, you got me there. It stagnated for so long relative to others that I didn't realize the pace had picked up in a big way.


Imo MySQL has been decent enough for many years, and PG has a relatively major outstanding architectural problem (see my other comment in this thread).

I don’t think either DB is bad, but the whole MySQL is dead and needs serious work idea doesn’t make sense to me. What problem does MySQL still have that needs fixing in your opinion?


Facebook maintains a patch-set, not a full fork. They still track Oracle's releases and apply their patches on top.

Facebook definitely has the resources to migrate to Postgres, if there was any motivating reason to do so. Indeed, Facebook developed their own LSM-based key-value store (RocksDB) and MySQL storage engine based on it (MyRocks) and then migrated the majority of their db fleet to that. In total that's massively more work than migrating to Postgres would have been.

Part of the reason was that MyRocks provides better compression than pretty much any other possible choice, and that adds up to an astronomical amount of cost savings at Facebook's scale. In contrast, what would Facebook have gained from moving to Postgres instead?


You are looking for long term planning where there is none. The only relevant question for those in charge of the project is: “How do I make my quarterly report to investors look slightly better than last quarter’s?”


Creating a series of connections very quickly is cheaper in MySQL and MariaDB than in PostgreSQL. Typically, a connection pooler is placed in front of PostgreSQL to support connection scalability.

I'm not sure if there has been a recent breakthrough that has changed that. I think that still applies today. Correct me if I'm wrong.


You can create a series of connections in postgres just as fast. The connection pooler you are referring to - putting PgBouncer or Pgpool between your pgdb and your client software - is for expanding beyond the physical limits of connections and optimizing clustered architectures. MySQL at scale is replication only. A few commercial offerings for MySQL like PlanetScale have brought MySQL into the 21st century. Postgres has a couple of ways of clustering, sharding, and scaling beyond your Wordpress database.


It took me until here to realise we were talking about MySQL, not SQLite - because honestly, in 2023 isn't that the comparison, pg vs sqlite?


Actually I would argue that there isn't a single reason to use MySQL over Postgres (barring the obvious - it's what the team already knows, or the company already uses, etc.)


> I'm sure there are use cases where MySQL will be the better choice over postgres, but the future for the stack looks bleak.

see, I'm pretty sure there basically weren't. It lucked out at the right moment in the late 1990s. Also, Slashdot used it.

The only use case I can think of is when you want an application, and it requires or is highly optimised to MySQL. Otherwise, it should actively be avoided.


I think one is also referring to MariaDB here and not just MySQL. Maybe that's better enough? I wouldn't know, I just go with postgres.


Yep. The real question here is: it's 2023, why would you choose MySQL over PostgreSQL?

Not that there aren't reasons. There are some. But for starting out with a new app without a very, very good reason to do something different? PostgreSQL every day of the week.


Ease of updates is a very good reason. Handling connections too.


I work with MySQL (MariaDB, but it doesn't make much difference) and never get any issues that couldn't be explained by architectural or development mistakes.

Just as example - how do you create read-only user (SELECT only) in Postgres? In MySQL it's extremely simple and it works, while in Postgres it's a nightmare to create and maintain


> Just as example - how do you create read-only user (SELECT only) in Postgres? In MySQL it's extremely simple and it works, while in Postgres it's a nightmare to create

Isn’t that

  GRANT SELECT ON ALL TABLES IN SCHEMA foo TO bar;
?

> and maintain

If you mean you want to grant a user select rights to whatever table gets created in the future (a somewhat questionable idea from a security viewpoint):

  ALTER DEFAULT PRIVILEGES IN SCHEMA foo GRANT SELECT ON TABLES TO bar;
I think both are possible in PostgreSQL 9 and later (https://www.postgresql.org/docs/9.0/sql-grant.html , https://www.postgresql.org/docs/9.0/sql-alterdefaultprivileg...)

That version is from 2010.

I guess the lesson is that both these systems evolve fairly rapidly. You can’t use quirks you remember from over 5 years ago to judge their current versions.


It is correct but apparently not sufficient - you also need to grant the user CONNECT on the database.
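I.e. something like this, assuming a database mydb and schema foo:

    GRANT CONNECT ON DATABASE mydb TO bar;
    GRANT USAGE ON SCHEMA foo TO bar;
    GRANT SELECT ON ALL TABLES IN SCHEMA foo TO bar;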


It's a little unfair, because a lot may have changed in 10 years, so it might be worth reconsidering.

But I’m like you, MySQL did some nonsense once that took me hours to work out. So now I really can’t be bothered with any potential quirks it may still have. This is not an SNL sketch.


Unpopular opinion on HN apparently, but MySQL

- It's less featureful, and I'd consider that a strong virtue in the YAGNI camp - less to go wrong, less mental overhead.

- Maintenance is simpler and far less necessary in my general experience.

- Replication is simpler and more reliable.

- You can tell the query optimizer what to do. When this is needed, you'll be thankful. It's a godsend.

That said, I wouldn't run Oracle MySQL. I opt for MariaDB on smaller projects and AWS Aurora MySQL for larger projects. Aurora scales insanely well, and replication lag is almost non-existent.

In my general experience MySQL was always significantly faster but it's been a number of years since I've worked with Postgres and the comments here seem to indicate that that may no longer be the case. YMMV


> It's less featureful, and I'd consider that a strong virtue in the YAGNI camp - less to go wrong, less mental overhead.

This doesn't really hold water in my opinion.

It's not like PostgreSQL is some minefield of misfeatures and quirky behavior. Some of these features exist, but have zero impact on you unless you actually opt to use them. But if you end up needing to: they're there, and you can just start using them.

Compare this to MySQL where they simply don't exist no matter how much you may need them. Need to apply an index to the result of a function to quickly fix a performance issue in prod? Sorry, you can't. Need window functions to accurately compute some analytics in a sane period of time? Sorry, you can't. The list of things you can do in PostgreSQL that you simply can't with MySQL is massive and grows every day.

The odds that you'll want, need, or greatly benefit from at least one of these features are not small. Having the flexibility of knowing these features exist, should you ever have a use-case for them, is massive.


Your examples regarding MySQL's features are not correct.

Need to apply an index to the result of a function? No problem, use a functional index, supported since October 2018: https://dev.mysql.com/doc/refman/8.0/en/create-index.html#cr...

Need to use a window function? No problem, supported since April 2018: https://dev.mysql.com/doc/refman/8.0/en/window-functions.htm...
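Both now look like this (illustrative table/column names):

    -- Functional index (note the extra parentheses MySQL requires):
    CREATE INDEX idx_created_month ON orders ((MONTH(created_at)));
    -- Window function:
    SELECT id, amount,
           SUM(amount) OVER (PARTITION BY customer_id ORDER BY created_at) AS running_total
    FROM orders;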


Fair. I picked two instances that had long been personally painful with MySQL and didn’t exist when I finally fled to greener pastures.

There are far more examples than just those two though. :)


Sure, and there are also plenty of examples of things MySQL can do that Postgres can't.

There are no silver bullets in software or databases, no "one size fits all". Just various trade-offs between different design decisions.


I mean, for most mature ecosystems / technology stacks I think the “no one size fits all, just strategic design trade-offs” theory largely holds true as an objectively accurate account of the “choosing the best tool“ situation ….

but IMO, I think it's worth noting that occasionally, some rare, unique, and functionally superlative technology comes along that in practice transcends every alternative from the onset and indefinitely going forwards, sometimes even at a more prominent scale than the MySQL / Postgres projects under discussion (which are not small by any means).

something maybe like Git, most immediately comes to mind, as an example of the de-facto standard for distributed VCS basically since … 2005* when Linus decided to create it?

edit: not 1995


> something maybe like Git, most immediately comes to mind, as an example of the de-facto standard for distributed VCS basically since … 1995 when Linus decided to create it?

hot take. might be the most popular, maybe even by a large margin, but I think you'll find a good chunk of people who have actually tried different VCSs don't think it's the best.


Yah, I think that's probably true. But that's also hard to reconcile with the reality of adoption trending consistently away from any alternative and only towards Git. And "large margins" are indeed pretty objectively the case (from the largest developer surveys, the breakdown 10 years ago was like 70% Git to everything else, growing to ~95% in 2022). Usually the phenomenon you're describing leads to other alternatives becoming more popular, not less (even if the most popular standard continues to eclipse the field). Here it would seem these highly likable alternatives, for those who took the plunge, are nevertheless dwindling into irrelevance…

I suspect the die-hard proponents of Mercurial, or SVN, or whatever else - these few pagan heretics that might exist out there, wherever they're hiding - have found themselves in a camp different to the Git standard likely on the basis of electing to be intentionally contrarian / anti-normative as the general catalyst, and not as a function of struggling with Git to the point of being so disillusioned they call it quits and head out looking for greener pastures. I think in practice the most common result of encountering problems with Git is: fix the problems. And functionally I think that's resulted only in furthering its supremacy over alternatives, despite there existing a handful of cultish weirdos who are _really_ into Mercurial and prefer not to fux with Git as a personal lifestyle choice haha.


Giant companies like Google and Meta use alternatives to Git for their internal / closed-source code, because Git couldn't scale to their needs. If I recall correctly, Google transitioned from Perforce to an in-house system, and Meta transitioned from using Mercurial to a custom derivative of it.

Again, no one size fits all...


Are you serious? Anyone who disagrees with you about what VCS is best must be intentionally contrarian?


Haha. That's a pretty aggressive and surface-level read of my comment. For starters it was mostly a joke, and explicitly a conjecture. I was saying that most users of a different system probably do so for some reason other than Git being objectively inferior for their purposes (and as others in the thread pointed out, the latter actually is the case at the large-scale end of the spectrum, for big companies concerned with scaling, not choosing Git).

Secondly, no one has disagreed with me on the matter, as I haven't put forth a personal opinion; I'm simply making reference to the fact that Git is the de-facto standard. And from an intellectual perspective, I was hoping someone might elucidate why that is the case, given my conception is an inferential deduction at best.

While I'm personally relatively familiar with Git internals - by no means an expert - it's the only distributed VCS I've ever used, and I don't know anything substantive enough about the alternatives to credibly make a relative value comparison here. They could be the bees knees for all I know, but it seems unlikely given Git's position as the consensus standard choice.


What are the “plenty of examples of things MySQL can do that Postgres can’t”? I honestly can’t think of one, much less “plenty”.


These are just a few random ones off the top of my head!

* Handling several thousand connections per second without needing a proxy or pool (MySQL's connection model is thread/conn vs Postgres using process/conn)

* Handle extremely high volume of primary key range scan queries (MySQL's InnoDB uses a clustered index, see https://news.ycombinator.com/item?id=35909053)

* Handle workloads that lean heavily on UPDATE or DELETE, without major MVCC pain (see https://ottertune.com/blog/the-part-of-postgresql-we-hate-th... for example)

* Semi-synchronous logical replication, for environments which cannot tolerate any data loss during failover; this ensures the statements have reached at least one replica but without the huge latency of synchronous replication

* Use index hints, to ensure random index stats changes don't cause unexpectedly negative query plan adjustments (sketched after this list; see also discussion in subthread https://news.ycombinator.com/item?id=35909340)

* Handle very high-volume OLTP workloads using direct I/O, since InnoDB's buffer pool design is completely independent of filesystem/OS caching

* Achieve best-in-industry compression by using the MyRocks storage engine (MySQL's pluggable storage engine design has a lot of tradeoffs but it is inherently what makes this even possible)

* Use UNSIGNED int types, to store twice as high max value in the same number of bytes, if you know negative numbers are not going to be present

* Use case-insensitive or accent-insensitive collations out-of-the-box without having to monkey with confusing user-defined CREATE COLLATION

* Ease-of-use commands like SHOW CREATE TABLE

* Silly cosmetic things like the ability to reorder columns in an ALTER TABLE (see https://wiki.postgresql.org/wiki/Alter_column_position)

* A tooling ecosystem which includes multiple battle-tested external online schema change tools, for safely making alterations of any type to tables with billions of rows
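A couple of these made concrete (illustrative names; FORCE INDEX and UNSIGNED are standard MySQL syntax):

    -- Pin a query to a known-good index:
    SELECT * FROM orders FORCE INDEX (idx_customer_created)
    WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;
    -- Unsigned column: 0..4294967295 in the same 4 bytes:
    CREATE TABLE counters (hits INT UNSIGNED NOT NULL DEFAULT 0);
    SHOW CREATE TABLE counters;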


ratio


Yeah, I can't think of any. I mean, if I had to come up with something, I would talk about tooling. MySQL Workbench (when it works) is by far the easiest to use and most feature-rich of all the RDBMS management tools. It runs on any OS too, which is a ding on SQL Server's tooling but not Postgres', since Postgres' is a web app - which has the limitations of a web app (no right-click, for example - which I know is possible in a web app, but last I checked not implemented for pgAdmin).


I would disagree on maintenance being simpler. I have never had Postgres randomly munge a table and require me to run a command to fix it.


I have not had that happen in MySQL either, at least, not with innodb. what command would that be?

I do remember getting bad tables with myisam tables a decade ago, sometimes after a bad shutdown.


I believe this was the command in question: https://dev.mysql.com/doc/refman/8.0/en/repair-table.html


My needs are meager (simple CRUD, low volume), but I haven't had that happen in MySQL either, in over 15 years of running it in production.

Not saying it can't happen, but I don't think it's a common occurrence.


WordPress may be uniquely bad.


I had the misfortune of inheriting a MySQL 5.6 database and I had to manage it for a year and a half. It wasn't very big (less than 100GB), but:

- it hung periodically and had to be restarted during the night, for no apparent reason

- compared to Postgres, the tooling is garbage (both for backups and even more for general database administration)

- essential features are missing, the most important one of which, for me, was proper CSV import/export. CSV-related functionality is so broken and terribly inconvenient to use. In a specific case I had to write a program to export millions of records manually since MySQL could not generate correct CSV export due to some columns containing text with special characters, quotes, newlines. Any combination of the export parameters ("ENCLOSED BY", "ESCAPED BY" and all the other garbage options) failed in one way or another. I even tried to use non-standard characters like \x15 and \r for column and row separation but even that failed. With Postgres, "with csv header" is simple and works every time.
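For comparison, a sketch with made-up paths/table names:

    -- Postgres: correct, quoted, escaped CSV in one line:
    COPY mytable TO '/tmp/export.csv' WITH (FORMAT csv, HEADER);
    -- MySQL's rough equivalent, with all the fiddly options:
    SELECT * FROM mytable INTO OUTFILE '/tmp/export.csv'
      FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
      LINES TERMINATED BY '\n';

(Client-side, psql's \copy does the same without needing file access on the server.)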

I also managed bigger Postgres databases (up to tens of terabytes) and never had the issues I encountered with MySQL.


> It's less featureful, and I'd consider that a strong virtue in the YAGNI camp - less to go wrong, less mental overhead.

Imagine when you actually need any of the features that PostgreSQL provides like pub/sub, logical replication, JSONB etc. With MySQL you might have to hack a solution that is much more complex or you have to set up an entirely separate tool. What I find nice with PostgreSQL is that for simple cases you can get away without a dedicated key/value store or a queue or a full text search engine. You can do a lot of these kinds of tasks with just a single database.
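A sketch of the sort of thing I mean (channel/table names made up; assumes a docs table with a jsonb column body):

    -- Simple pub/sub, no extra infrastructure:
    LISTEN job_events;
    NOTIFY job_events, 'job 42 finished';
    -- Queryable JSON documents, with index support:
    CREATE INDEX ON docs USING gin (body);
    SELECT body->>'title' FROM docs WHERE body @> '{"status": "published"}';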


> features that PostgreSQL provides like [...] logical replication

MySQL has offered logical replication for considerably longer than Postgres, and it's a substantially more configurable and mature implementation. MySQL's built-in replication has always been logical replication. It's a feature MySQL has had for 23 years -- built-in logical replication is literally one of the top reasons why all the biggest MySQL users chose MySQL originally!


Correct - as far as I know MySQL does have those features, but with PostgreSQL the overall management effort seems to be considerably lower.


Fewer features, simpler admin, more compatibility, more familiarity. I agree.


samesies


What's the ratio of solving DB perf issues by optimizing it and letting the planner do its work, to telling it what to do? For me it's like 1000:1.

And that one case I remember was perfectly solvable the regular way, with a little more time.


The problem is that when you need to tell the planner what to do, you can do it in MySQL, but not postgres. Imagine you've got a production database with lots of traffic which suddenly can't handle anything, because it inserted an extra row which tipped the balance and a common query now takes seconds to process.

Do you know how to fix the table statistics quickly? Do you know how to change that query to force the execution plan you want? Do you know how long the solution will last until the stats change again?

MySQL is a bit more predictable for this case and if things go really bad for some unexpected reason, one comment can fix it.

I'm looking at it from ops perspective. The ratio during development doesn't matter that much - all issues are solvable at that stage. For me it's rather "which situation would I rather be in at 3am".
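The "one comment" fix refers to optimizer hints, which in MySQL 8 look roughly like this (hypothetical query; the INDEX() hint form only landed in later 8.0 point releases, older versions use FORCE INDEX instead):

    SELECT /*+ INDEX(o idx_customer_created) */ *
    FROM orders AS o
    WHERE customer_id = 42
    ORDER BY created_at DESC;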


I googled a bit and had no idea how many questions about Postgres query planner going nuts are out there. I just imagined this is a problem that creeps over time (giving you time to notice and act in advance, assuming you have monitoring/alerts set up) rather than suddenly tipping the scale - though it probably can happen suddenly after large data import.

Personally never ran into this with Postgres, nor had anyone I know worry about it - the query planner was reliable for me in 99.99% of cases. But yeah, I admit that it's a black box for me that I expect to take care of its internals - hopefully it continues to do so, but I've got to give it to MySQL for allowing you to override it, then.


There is the pg_hint_plan extension that gives most of what you would want in a hinting system.

I think a better fix lies in the direction of making queries more of a first class object with options to nail down plans, add custom logic to pick plans dependent on parameter values, etc.


I am on GCP/AWS; it's not possible to use extensions. They allow only a few whitelisted extensions. If it is built into the DB, then I can use it without any hassles.


Postgres query planner suddenly deciding to do something silly in the middle of the night has taken Notion down a few times. It's quite annoying, and it's very frustrating to have no recourse.


Any posts on this? Are there bulk data loads that make table stats more stale and affect plans? I’m wondering what would suddenly make a plan selection change a lot that might be a contributing factor.


> bulk data loads that make table stats more stale

This is the usual culprit (cure: "ANALYZE tablename").

Collecting more samples (ALTER TABLE ... ALTER COLUMN ... SET STATISTICS ...) may be useful.

"Extended Statistics" covers most(?) other cases: https://www.postgresql.org/docs/current/planner-stats.html#P... https://www.postgresql.org/docs/current/multivariate-statist...


Interesting - how do you approach it when it happens and you're under time pressure to bring it back online, assuming you can't just fix the query plan? I'd probably start by tweaking stats options and resetting them for problematic tables, but I don't have further ideas off the top of my head.


There is almost no good operational reason to choose MySQL over PostgreSQL. I did a deep dive many moons ago (before major improvements in performance to postgres) when people were saying that MySQL was faster. I found that not to be true, and the differences have only shifted further in postgres's favour.

also, I assume you mean MariaDB as MySQL is owned by Oracle and I would greatly implore anyone and everyone to avoid Oracle as if it has herpes.

There are a lot of historic problems with MySQL accepting invalid data, committing data even when there are constraint issues, and having very poor transactional isolation, I am not sure if these have improved.

Truthfully, the only benefits you gain from using MariaDB or MySQL are:

* Memory tables

* Having inconsistent replicas (which can be useful when you want your downstream to have less data than your upstream and you know it won’t get updated.)


> avoid Oracle as if it has herpes

herpes isn't that bad. most people will get it in their lifetime. 1 in 6 people have hsv-2, the less common variant. trying to avoid herpes is like trying to avoid chickenpox (although herpes isn't nearly as harmful as chickenpox).

you should avoid Oracle like it's a blood pathogen.


As a person who has herpes firmly in his nerves, I would say don't underestimate herpes.


I read this guy's book and he has some good ideas (and references to back them up). You don't need to get his book, as his website also has all of that material. http://doctoryourself.com/herpes.html

Another related site: http://orthomolecular.org/resources/omns/index.shtml (scroll down for articles).

None of the above will help though in deciding between Postgres or MySQL ;)


Suddenly the analogy got very accurate.


Chickenpox is actually caused by a herpesvirus, varicella zoster.


hello, fellow person with herpes! (I assume)

The worst part about having it is having to talk about having it. It's really not bad as a condition separate from societal concern.


I find similar societal concern when I tell friends I use Oracle.


Having Oracle experience on the resume is a positive, I suppose, but I'm not sure it's been worth the exposure.


It’s not so bad for most people but if you’re one of the unfortunate few who suffer chronic symptoms it can be truly awful. Not worth playing that lottery if you can avoid it.


> most people will get it in their lifetime

Citation needed.



Probably in certain countries, given that it is mainly an STD. There are many conservative nations where this won't be an issue hopefully.


As it says in the linked article, it’s a global epidemic.

If you are an adult of typical sexual activity, it is likely you have already had sex with someone infected with herpes.

That doesn’t mean you have contracted it — carriers aren’t always shedding the virus.

I’m not sure I see any correlation between a country being conservative and an absence of sexually transmitted infection; the 10 countries where HIV is most prevalent are all (as far as I’m aware) relatively conservative.

Furthermore, here are some statistics from Wikipedia on HSV which may be referring to some of these conservative countries you’re referring to:

> Turkey— High levels of HSV-1 (97%) and HSV-2 (42%) were found amongst pregnant women in the city of Erzurum in Eastern Anatolia Region, Turkey. In Istanbul however, lower HSV-2 seroprevalence was observed; HSV-2 antibodies were found in 4.8% of sexually active adults, while HSV-1 antibodies were found in 85.3%. Only 5% of pregnant women were infected with HSV-2, and 98% were infected with HSV-1. Prevalence of these viruses was higher in sex workers of Istanbul, reaching levels of 99% and 60% for HSV-1 and HSV-2 prevalence respectively.

> Jordan— The prevalence of HSV-2 in Jordan is 52.8% for men and 41.5% for women.

> Israel— HSV-1 seroprevalence is 59.8% in the population of Israel and increases with age in both genders but the adolescent seroprevalence has been declining as in most industrialized nations. An estimated 9.2% of Israeli adults are infected with HSV-2. Infection of either HSV-1 or HSV-2 is higher in females; HSV-2 seroprevalence reaches 20.5% in females in their 40s. These values are similar to levels in HSV infection in Europe.

> Antibodies for HSV-1 or HSV-2 are also more likely to be found individuals born outside of Israel, and individuals residing in Jerusalem and Southern Israel; people of Jewish origin living in Israel are less likely to possess antibodies against herpes. Among pregnant women in Israel a small scale cross sectional study found the prevalence of HSV-2 infection was 13.3% and that of HSV-1 was 94.9%. The HSV-2 infection rate was 3-fold higher among immigrants from the former Soviet Union (27.5%) than among Israeli-born Jewish and Arab women (9%). Approximately 78% of HSV-2 infections in Israel are asymptomatic. HSV-1 causes 66.3% of genital herpes in the Tel Aviv area.

> Syria— Genital herpes infection from HSV-2 is predicted to be low in Syria although HSV-1 levels are high. HSV-1 infections is common (95%) among healthy Syrians over the age of 30, while HSV-2 prevalence is low in healthy individuals (0.15%), and persons infected with other sexually transmitted diseases (9.5%). High risk groups for acquiring HSV-2 in Syria, include prostitutes and bar girls; they have 34% and 20% seroprevalence respectively.


> If you are an adult of typical sexual activity, it is likely you have already had sex with someone infected with herpes.

Again, not in conservative populations where marriage is the typical way to have sexual relations. Syria is one example that you quoted. A smart person would get his/her partner tested if they suspect anything before getting married.


I'm trying to work out which logical fallacy you're employing here. No True Scotsman? Personal Incredulity?

In any case, assuming that infidelity or anything else sinful/haram doesn't occur in "conservative" populations strikes me as frightfully naïve.

The data is in. Arabs get herpes too.


Infidelity does happen, but it's much much less common and is severely looked down upon.

Edit: it seems that HSV-1 is not an STD, but it causes genital herpes if engaging in oral sex? Which can explain the big gap between the two in conservative cultures.


I think that the only reasons to choose MySQL (or Maria) over Postgres for a new project are operational. Postgres is probably the better database in almost all respects, but major version upgrades are much much more of a pain on Postgres than on almost any other system I have ever used. That being said, I would choose Postgres pretty much every time for a new project. The only reason I would use Maria or MySQL would be if I thought I later would want to have something like Vitess, for which I think there isn't really an equivalent for Postgres.


> but major version upgrades are much much more of a pain on Postgres than on almost any other system I have ever used.

This is a thread comparing MySQL and Postgres and your claim is that postgres is harder to do major version upgrades than anything you have used??

Context is important here - have you honestly actually upgraded a MySQL node? It's a lesson in pain, and "major" version changes happen on minor versions, like the entire query planner completely trashing performance in 5.6->5.7.

Postgres has two forms of updates:

1) in place binary upgrade.

Fast, clean, simple, requires that you have the binaries for the old and the new database.

2) dump/restore.

Serialise the database into text files, load a new database and deserialise those files into it.

Slow, but works flawlessly & consistently with relatively low danger.
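Sketched (option 2; assumes the new cluster is already initdb'd):

    pg_dumpall > dump.sql         # serialise everything from the old cluster
    psql -d postgres -f dump.sql  # load into the new cluster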

MySQL can only do option 2.

You can sort of fake an "update" by abusing the fact that MySQL's replication offers no guarantees: you can make a new server a replica, then roll over. But it is impossible to know what data was lost in that transition, and MySQL will happily continue without ever telling you.

I have experienced this behaviour in large e-commerce retailers. MySQL was very popular for a very long time and I am intimately aware of operational best practices and how they are merely patching over an insane system.


MySQL doesn't use SemVer. MySQL 5.6 vs 5.7 are different "release series", and switching between them is considered a "major" version change.

MySQL absolutely fully supports in-place binary upgrades, saying otherwise is pure FUD. And the upgrade process in MySQL doesn't even iterate over your table data in any way, so claiming it will cause "data loss" is also pure FUD.

At Facebook we automated rolling in-place updates of our entire fleet, with new point builds of fb-mysql going out several times a month, to the largest MySQL deployment in the world. Worked flawlessly and this was a full decade ago.

MySQL is widely considered easier to upgrade (relative to Postgres) because MySQL's built-in replication has always been logical replication. Replicating from an older-version primary to a newer-version replica is fully supported. When upgrading a replica set, the usual dance is "upgrade the replicas in-place one at a time, promote one of the replicas to be the new primary while temporarily booting out the old primary; upgrade the old primary and then rejoin the replica set".


Facebook has, at minimum, 3 teams maintaining MySQL, including a team who genuinely modifies it into submission - so much that they needed 1,700 patches to port their modified version to 8.0.

It is not relevant to the discussion how Facebook has managed to munge it to work reasonably well by pouring thousands of hours of engineer time into the effort; and MySQL's in-place upgrades absolutely do not work the way you describe consistently.

I know this because I have been in the code, and only after having experienced it. Maybe some of your lovely colleagues have helped your particular version to be marginally more sane.

It genuinely must be nice having a dozen people who can work around these issues, though. I certainly wouldn't consider it an operational win; most companies have no DB automation engineers, DB performance engineers, or MySQL infrastructure engineers.

> Replicating from an older-version primary to a newer-version replica is fully supported.

Here also be dragons, as alluded to. I know it works quite often, I have used it.

FWIW: I ran global AAA online-only game profile systems on a handful of Postgres machines at about 120k transactions/s in 2016, I would have needed 5x as many instances to do it in MySQL, and this was only tiny part of our hosted infra.. which included a global edge deployment of game servers, auth servers, matchmaking, voice bridges and so on.

and we only had two people responsible for the entire operation


Please educate me on how my statement about MySQL upgrades is incorrect, I'd love to hear this. I've been using MySQL for 20 years, and while 2 of those years were at Facebook, 18 were not. I've performed MySQL upgrades in quite a wide range of environments, and what you're saying here about lack of in-place upgrades or eating data is simply not aligned with reality.

I haven't made any comments regarding performance comparisons, and have also run extremely large DB footprints with tiny teams, but I don't see how any of that is relevant to the specific topic of new-version upgrade procedure!


Because it depends so much on your storage engine and schema, I have never seen it recommended: there are circumstances where you end up with unrepresentative data unless you are very careful or you don't actually use the expressiveness of the DB.

I mean, I've also seen my share of "ERROR 1071 (42000) at line xxx: Specified key was too long; max key length is xxx bytes" randomly, which basically means the machine needs manual recovery.

God help you if you don’t have innodb_file_per_table enabled to begin with too.

I know you want me to cite exactly. That will take me time to find because I stopped caring about MySQL 7 years ago, but I will dig for you.


FWIW, while I use Postgres for my own development, I've had to administer a number of MySQL servers for other devs. Upgrades have always been updating the MySQL package, restarting MySQL, running `mysql_upgrade`, and restarting the server again. I'm pretty sure the mysql_upgrade step has even been missed a number of times and it's worked fine.

I won't say it's impossible you ran into issues doing this, but it is the documented and supported upgrade path.

I love Postgres, but as someone who's maintained both for years, upgrades (at small scale) are the one area where I'd say MySQL has Postgres beat.


As long as you upgrade within a major version (i.e. minor releases), you will have the same experience with postgres.

11.0->11.2 will work totally fine, with no command needed.


Right, but now go from 11->12, which is the equivalent of the upgrade path I was describing for MySQL. I either need to install both versions and use pg_upgrade to convert the binary files, then remove 11 (and extensions may break this flow) or do pg_dump/restore.

Minor versions on both Postgres and MySQL are painless, just install and restart the server. Major upgrades on MySQL are significantly less painful.
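The 11->12 pg_upgrade flow, roughly (Debian-style paths assumed):

    pg_upgrade \
      -b /usr/lib/postgresql/11/bin  -B /usr/lib/postgresql/12/bin \
      -d /var/lib/postgresql/11/main -D /var/lib/postgresql/12/main

(--link avoids copying the data files, at the cost of not being able to fall back to the old cluster once the new one has started.)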


> I've also seen my share of "ERROR 1071 (42000) at line xxx: Specified key was too long; max key length is xxx bytes" randomly, which basically means the machine needs manual recovery.

What? This error has nothing to do with upgrades, nothing to do with manual recovery, and hasn't been a common problem for many many years.

In old versions of MySQL, it just meant you needed to configure a few things to increase the InnoDB index limit to 3072 bytes, instead of the older limit of 767 bytes:

    innodb_file_per_table=ON
    innodb_large_prefix=ON
    innodb_file_format=barracuda

and then ensure the table's row_format is DYNAMIC or COMPRESSED.

But again, all of this happens by default in all modern versions of MySQL and MariaDB.

Should it have been the defaults much earlier? Absolutely yes, MySQL used to have bad defaults. It doesn't anymore.


The error I gave is similar to the ones I used to get with "major" upgrades that happened when Ubuntu decided it was time to upgrade.

It happens, and I seriously never claimed that it was an ultra-common problem, merely that upgrades in Postgres are more intentional and not painful except for a little extra work between major versions. The standard upgrade path within major versions - 9.x or 10.x or 11.x or 12.x - works just the same as MySQL, except I have much more experience of MySQL completely fumbling its "automatic unattended" upgrade, or even the mysql_upgrade command.

Mostly because in the real world outside of engineering cultures, databases are massively abused: MyISAM tables that are constantly updated, InnoDB ibdata1 in the terabytes, poor configs, replicas that have skipped a few queries, column changes inside a transaction that failed but actually modified data. It happens. Usually I am called in to clean up the mess.

The major difference here is that Postgres doesn't leave a mess, so I never have the kind of issues that I am describing in this thread with it. And you don't, because I am guessing that you're there when they're installed, someone with knowledge was actively maintaining them, or you have a lot of people to help with shortcomings.

I get it though: you've got your sunk-cost knowledge of MySQL and you've been on large support teams for it. Maybe you're afraid I'm suggesting that this knowledge goes out the window. And it has gotten better, but I wouldn't give my kids watered-down lead-infused soft drinks just because I had suffered through lead poisoning. I remember coming to blows with you in other threads over the years because you think MySQL can be saved or is totally fine, but honestly, just, no.


I'm primarily a software engineer, not a member of "large support teams". I've also worked for many years as an independent consultant, brought in when things go wrong, certainly not when they were first "installed". I'm not "afraid" of anything concerning my knowledge going "out the window". If MySQL suddenly disappeared worldwide, I could happily pivot to some other area of software engineering, or I could simply retire. Please stop making assumptions about other people who you know nothing about.

I'm responding to you because you're repeatedly posting factually incorrect items, for years. For example you and I have directly discussed the "MySQL doesn't use SemVer" thing before on HN, and yet here you are again in this thread, claiming 5.6 to 5.7 should be a "minor" upgrade.

Anyway, to the topic at hand, as others have also mentioned in this thread: historically the difficulty with Postgres upgrades has been the lack of cross-version replication, due to Postgres WAL replication being a low-level physical replication system. This made it difficult to perform an upgrade while keeping your site fully online. Perhaps the newer logical replication support makes this easier these days. I hope to learn more about it someday. If you can share your process for upgrading a Postgres cluster while keeping it online, that would be helpful and informative.


1. The log-replication method of upgrading can be performed using the built-in logical replication facilities as well as using external logical replication systems such as pglogical, Slony, Londiste, and Bucardo. Most of which have existed essentially forever.

2. Failovers of any database are not instant, but they are indeed quick! So let’s not claim that you can do an upgrade with zero downtime.

3. In-place upgrades are extremely fast and you can test the speed using a physical replica before hand, usually it’s a couple of seconds though the docs say minutes.

4. MySQL's major version being in the minor position is exactly the kind of "you should be sure you know what you're doing but we won't make it obvious" territory that I really despise.


While you two have agreed on approximately nothing, this has been an informative discussion and I do thank you both.


I echo the sentiment and think yours is likely the most pertinent takeaway having made it this far absent reaching any consensus whatsoever haha.

It was nevertheless a pretty epic journey of dialectic discourse plunging _deep_ into the esoteric and nuanced realm of expert-level technical minutiae. A mostly intellectual journey, albeit distinctly punctuated by an undertone of emotional angst that steadily progressed in its growing intensity in a manner proportional to the magnitude of your collective disagreement… epic indeed.


Is there a good way to do case-insensitive accent-insensitive collations yet in postgresql? It’s been a holdup for using that for some use cases like searching for data, like a person’s name, in pgsql when the casing or accents don’t match perfectly.

MSSQL has had this forever, and I'm pretty sure MySQL has it as well.


Maybe this helps: https://stackoverflow.com/posts/11007216/revisions ?

My gut tells me that I would do it in the query itself though, and not rely on the collation. Maybe I am misunderstanding.


Look under "Update for Postgres 12 or later", there they create a collation, an index to make use of it and then a query to make use of it:

   SELECT * FROM users WHERE name = 'João' COLLATE ignore_accent;
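For reference, the collation itself is created along these lines (ICU locale string as given in that answer - double-check it there):

    CREATE COLLATION ignore_accent (
      provider = icu,
      locale = 'und-u-ks-level1-kc-true',  -- ignore accents, keep case
      deterministic = false
    );
    CREATE INDEX users_name_ia ON users (name COLLATE ignore_accent);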


Not really, no, it's doable but not easily


> There is almost no good reason to choose MySQL over PostgreSQL for any operational reason

Galera is the main one I can think of:

* https://galeracluster.com/library/documentation/tech-desc-in...

* https://mariadb.com/kb/en/what-is-mariadb-galera-cluster/

* https://packages.debian.org/search?keywords=galera

I'm not aware of any multi-master, active-active(-active) replication system that is open source for PostgreSQL.


We ran a Galera cluster for 4 days and it died on the first ALTER TABLE we did. Only option was to reduce it down to a Primary / Replica setup. I really can't recommend anyone using this.


Last time I looked at Galera it had a limit at 5TiB of data. Which is fine I guess.

In the Postgres space there is citusdb which provides multi master.

There is also BDR from 2ndquadrant if you want a paid/supported solution. https://www.enterprisedb.com/products/edb-postgres-distribut...


yeah, that's the use case "I run an application that requires or is highly optimised for MySQL".


How does multi-master MySQL handle conflicts?


The main problem with herpes is the stigma against it. Don't besmirch it by associating with Oracle.


Postgres is >50x slower for range queries (example below) and is akin to using array-of-pointers (i.e. Java) whereas MySQL supports array-of-structs (C). Illustration from the Dropbox scaling talk below.

Sneak peek photo [1] (from [2]). Just imagine it's literally 500-1000x more convoluted per B-tree leaf node. That's every Postgres table, unless you CLUSTER periodically.

[1]: https://josipmisko.com/img/clustered-vs-nonclustered-index.w...

[2]: https://josipmisko.com/posts/clustered-vs-non-clustered-inde...

Mind-boggling how many people aren't aware of primary indexes in MySQL, which are not supported at all in Postgres. For certain data layouts, Postgres pays either 2x storage (a covering index containing every single column), >50x worse performance by effectively N+1 bombing the disk for range queries, or blocking your table periodically (CLUSTER).

In Postgres, the messiness of loading primary data after reaching the B-tree leaf nodes pollutes caches and takes longer. This is because you need to load one 8kb page for every row you want, instead of one 8kb page with 20-30 rows packed together.

Example: Dropbox file history table. They initially used autoinc id for primary key in MySQL. This causes everybodys file changes to be mixed together in chronological order on disk in a B-Tree. The first optimization they made was to change the primary key to (ns_id, latest, id) so that each users (ns_id) latest versions would be grouped together on disk.

Dropbox scaling talk: https://youtu.be/PE4gwstWhmc?t=2770

If a dropbox user has 1000 files and you can fit 20 file-version rows on each 8kb disk page (400 bytes/row), the difference in performance for querying across those 1000 files is 50 + logN disk reads (MySQL) vs 1000 + logN disk reads (Postgres). AKA 400KiB data loaded (MySQL) vs ~8MiB loaded (Postgres). AKA a ~20x improvement in query time and disk page cache utilization.
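The key change, sketched (hypothetical column names/types; the point is that InnoDB physically orders rows by primary key):

    CREATE TABLE file_versions (
      ns_id  BIGINT UNSIGNED NOT NULL,   -- user/namespace
      latest BOOLEAN NOT NULL,
      id     BIGINT UNSIGNED NOT NULL,   -- version id
      meta   VARBINARY(300),             -- ~400 bytes/row all-in
      -- One user's latest versions now pack together on adjacent pages:
      PRIMARY KEY (ns_id, latest, id)
    ) ENGINE=InnoDB;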

In Postgres you get two bad options for doing this: 1) Put every column of the table in the index, making it a covering index, and pay to store all data twice (index and PG heap) - there is no way to disable the heap primary storage. 2) Take your DB offline every day and CLUSTER the table.

Realistically, PG users pay that cost without thinking about it. Any time you query a list of items in PG, even using an index, you're N+1 querying against your disk and polluting your cache.

This is why MySQL is faster than Postgres most of the time. Hopefully more people become aware of disk data layout and how it affects query performance.

There is a hack for Postgres where you store data in an array within the row. This puts the data contiguously on disk. It works pretty well, sometimes, but it’s hacky. This strategy is part of the Timescale origin story.

Open to db perf consulting. email is in my profile.


I made the same comment elsewhere before finding this one. I can vouch for this speed-up in the ratio of records per page. It is very real. It only applies for small records where you can pack many rows into a page, and where you can cleanly partition by user/tenant, but that's common enough.

I will say: we kept every table we didn’t have to migrate for perf reasons in Postgres, and never regretted it.

Edit: and the index "fix" for Postgres often doesn't work. Postgres will validate the row on disk, even if it's in the index, if the page's visibility map bit isn't set. If your data isn't write-once, there's a decent chance your page is dirty and it will still make the heap fetch.


You are confusing two concepts here. In InnoDB, the tables are always ordered by the primary key when written to actual disk storage.

This is not the same as "having a primary key", Postgres also has primary keys. It just stores the PK index separately from the bulk of the data.

Oracle also has primary keys, even if the order of the rows is different to the key order. In Oracle, when the rows are stored in the same order as the keys in the primary index, it is a special case and these tables are called IOTs, index-organized tables.

The disadvantages of IOT are that inserts are slower, because in a normal table, the data is appended to the table, which is the fastest way to add data, and only the index needs to be reordered. In an IOT, the entire table storage is reordered to take the new data into account.

Select queries, OTOH, are much faster when using IOT, for obvious reasons, and this is what you describe in your comment.

If you use TEXT, BLOB, or JSON fields, even in MySQL, the actual data is stored separately.


I said primary index, not primary key (primary key and primary index are synonymous in the MySQL Dropbox example). Primary index is database-theory lingo for storing all the primary row data inside a B-tree. It's synonymous with what you call IOT, although that's a new term for me.

You're incorrect about IOT reordering the entire table, at least w.r.t. MySQL. MySQL uses a B-tree to store rows, so at most it's an insertion sort on a B-tree node and a rare B-tree rebalance. Most B-tree leaf nodes have empty space to allow for adding new data without shifting more than a few hundred other rows. Also, non-IOT tables need to do a similar process to write to each of their indexes. Last, it's sort of a tossup, since if you're appending to an IOT table frequently, the right edge of the B-tree is likely cached (similarly for any small number of paths through the primary index B-tree). At worst, Postgres's heap will need to surface one new heap disk page for writing, although I'm sure they have some strategy for caching the pages they write new data to.

Sorry to spam this info! Glad to see we both love databases, and I'm always pleased to see engagement on this topic!


When using clustered indexes, one tradeoff is that if a non-clustered index isn't covering for a query, it will need to perform B-tree traversals to find rows in the clustered index. This can significantly increase the amount of (at least logical) IO compared to heap tables, where the non-clustered indexes can refer directly to the row id.

Because you can only have a single clustered index, you're effectively paying for efficient range queries on a single clustering key by making all other queries slower.

This tradeoff may or may not be worth it depending on your query patterns. In my experience, you can often get away with adding some subset of columns to a non-clustered index to make it covering, and get efficient range queries without making a copy of the entire dataset.

And even with clustered indexes, as soon as you want a range query that's not supported by your clustered index, you're faced with the exact same choices, except that you have to pay the cost of the extra B-tree traversals.


Appreciate the thoughtfulness. I believe the branching factor of even wide-keyed tables doesn't add significant cost to point lookups. At most one or two extra disk pages need to be read.

Example: 80-byte keys give you a branching factor of roughly 100. 10M rows, and say you can pack 20 rows per page. That's a 4GB table, give or take. That B-tree still only has 3 intermediate layers and primary data on a 4th layer. (Calculation: log(10M/20/0.75)/log(100)+1.) The first two layers take up less than a megabyte of RAM and are therefore easily cached. So you wind up only needing 2 disk reads for the final two layers. Unless Postgres is caching the entire index for point lookups, it should come out about even.

I can't find any resource saying that B-tree height exceeds 5, so I'm thinking it's at worst 2x the (very small) disk read cost vs Postgres.


(Admitting bias: I've only ever worked with postgres in production with update-heavy tables so I've dealt with more of its problems than MySQL's)

Postgres also has other gotchas with indexes - MVCC row visibility isn't stored in the index for obvious performance reasons (writes to non-indexed columns would mean always updating all indexes instead of HOT updates [1]) so you have to hope the version information is cached in the visibility map or else don't really get the benefit of index only scans.

But OTOH, I've read that secondary indexes cause other performance penalties with having to refer back to the data in clustered indexes? Never looked into the details because no need to for postgres which we've been very happy with at our scale :)

[1] https://www.postgresql.org/docs/current/storage-hot.html


Interesting. The PG docs don't clarify whether the visibility map gets updated for a HOT update. Maybe even a HOT update spoils index-only scans - although I can't see why: no new index entries, and the heap visibility status hasn't changed for any indexes. I wished to find some answers here but could not.

Wrt secondary indexes, yes and no. There is a cost to traverse a B-tree for point lookups. Also, foreign keys may now be composite keys if primary key is composite as in the Dropbox example.

If the secondary index is very different from the primary, it will be more expensive. However, it's pretty common to at least use a "user_id" as the first part of the primary key. This will make partial full scans a lot faster for queries regarding a single user; you only need to scan that user's data, and it comes at a 1-2 orders of magnitude cheaper disk read cost. So you'd need a secondary index only if the data you need is spread across 1000s of pages (megabytes of data for a single user in one table) and you're looking for only a handful of rows randomly located in that sequence.

Twitter is a characteristic case where you need many different clustered sets of the same data (tweets) to power different people's feeds. I believe Twitter just stores many copies of tweets in different clusters in Redis - basically the same as having an (author_id, ts) primary-key tweets table and a (follower_id, ts) primary-key feed table, both having tweet data inlined. If one clustered table isn't enough, use two.


That was a great read in contrast to all the "there's no reason to use mysql" nonsense in this thread


That's what I'm going for! I love this type of info, so against the grain and useful it's basically a scandal. :-) Check out your sibling comment about pg_repack though, it's a really awesome PG perf improvement tool.


I've definitely run into these kinds of issues and learned about them the hard way, but I found that in PostgreSQL it's quite a bit easier to understand what is actually going on, due to better documentation and tooling, and I've found this very valuable. Maybe it's just because I've spent more time with PostgreSQL than MariaDB, but MariaDB has often left me quite a bit more confused (on performance, but also other topics).


If your pg database improves with CLUSTER, you can use pg_repack instead to achieve the same effect without downtime. Besides reordering the heap, it will also clear out bloat from the heap and indexes. I highly recommend partitioning if you have heavy write traffic on large tables since that will keep overhead low and make it complete faster.
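For reference, a typical invocation is a one-liner; the database, table, and ordering columns here are placeholders:

  $ pg_repack --dbname=mydb --table=public.events --order-by=user_id,created_at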


Thanks. I actually thought of a similar concept to pg_repack on my own a couple days ago (influenced by gh-ost probably..) I was googling stuff like how to switch a PG replica to master. I was imagining having 2 dbs and running CLUSTER on the inactive one. Probably wouldn't work, but anyways, I found out about pg_repack after researching.

I'm now super interested in the perf aspects of running pg_repack. It would definitely require scratch storage space to be able to copy over the largest table in the DB (I'd guess 2.1x the largest table vs 2x the total DB size). I imagine repacking isn't as efficient as putting stuff into a B-tree. But I wouldn't expect it to be anything like 50x worse, as I portrayed above.


It does CREATE TABLE ... AS SELECT * FROM table ORDER BY, which isn't too bad. Then it runs CREATE INDEX for all your indexes, which is usually faster than CREATE INDEX CONCURRENTLY. However, while these are running, a trigger is inserting records for all new writes into a special log table. Once all the indexes are created, it replays that log table until it's empty, and then it briefly locks the source table, flushes the last bit of the log table, and swaps the tables.

If you have a lot of write traffic, that log table will get really big, and pg_repack will have to do a lot of work to replay it. It's possible that the log table grows faster than pg_repack can replay it, which will cause it to never finish and eventually exhaust your storage space. You can also run into an issue where the log table never gets completely empty, and pg_repack never initiates that final lock and swap.

Partitioning helps a lot because it divides both the size of the tables and the write traffic to each table.


Thanks for that informative link. It's rare in these sorts of discussions.


The only reason I’d consider MariaDB, which I’m surprised I don’t see mentioned currently, is its bitemporal features. There are whole worlds of problems solved that are almost universally badly done in schema/business logic instead.

Granted I haven’t had to make a decision like this for several years, I’ve hardly even touched a database except to debug some stuff on localhost that’s outside of my explicit purview. So maybe Postgres solutions have narrowed the gap on this without my knowing it.


Does Postgres have binlog replication yet?


Yes, for over half a decade at least. But "binlog" is a MySQL term; for PostgreSQL it has the much more apt name: write-ahead log (WAL).

It is the only official, in-binary replication mechanism.


Postgres WAL replication is a physical replication stream. MySQL binlog replication is a logical replication stream, a higher-level abstraction which is independent of the storage engines.

Postgres does separately support logical replication now, but it has some limitations, such as not permitting replication of DDL: https://www.postgresql.org/docs/current/logical-replication-...


MySQL is certainly faster for writes.

https://www.uber.com/blog/postgres-to-mysql-migration/


I would be wary putting too much stock in a seven year old post on the topic; lots of stuff has changed. Specifically, at least one person claimed that a patch greatly improved PostgreSQL for this use case: https://news.ycombinator.com/item?id=26285452


The MVCC requirement is still there.

It's like saying C is faster than Ruby. It always will be.


MySQL is free, regardless of Oracle's ownership.


> * Memory tables

PostgreSQL has memory tables too.


Does Postgres have an archive mode?


If you say what you're actually trying to achieve, I can help with a solution. Asking if it supports an arbitrary feature is not going to get the answer you want: depending on what you're actually using an archive table for, Postgres might have something already built in, but it will almost assuredly not be exactly like an archive table storage type.


Sorry, ARCHIVE is a MySQL storage engine. It supports only non-destructive operations: INSERT, REPLACE, and SELECT, but not DELETE or UPDATE.

It's an excellent alternative to a WORM drive when you're trying to preserve everything (say, a list of financial transactions).

I’ve looked for something like this in Postgres (which I love!), but sadly it doesn’t seem supported.

https://dev.mysql.com/doc/refman/8.0/en/archive-storage-engi...
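For anyone curious, using it is just a matter of the ENGINE clause; the schema here is made up:

  CREATE TABLE audit_log (
    id BIGINT NOT NULL AUTO_INCREMENT,
    entry TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
  ) ENGINE=ARCHIVE;
  -- INSERT and SELECT work; UPDATE and DELETE are rejected by the engine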


Saying that REPLACE is non-destructive is a bit weird. Overwriting data is destructive, by definition -- and even if ARCHIVE is append-only underneath, it seems this use of REPLACE is very much a "logical overwrite", as there doesn't seem to be any kind of time travel view of the old state.

Also, you might be interested in Parquet, perhaps as seen through Delta Lake https://delta.io/ or Postgres Foreign Data Wrappers like https://github.com/adjust/parquet_fdw -- Delta Lake's simple "LSM of Parquet files in an object store" design is pretty sweet.


Why is revoking permissions to run deletes/updates not an option?


PostgreSQL every time, unless you have a specific reason, or as already pointed out, you're sure you don't just need SQLite.

PSQL in my experience has vastly better tooling, community, and is ahead of the curve tech wise. Same with extensions availability. Or perhaps you need to move away from it to say CockroachDB, or similar which is much easier.


nit: psql is the command line client. postgres or pg are the more common shortenings of PostgreSQL.

https://www.postgresql.org/docs/15/app-psql.html


Choose whichever one you/your team is more familiar with. Both are battle-tested and proven and will likely scale to whatever needs you have.


This is the correct answer.

Whichever one you start out with, you will be annoyed if you switch to the other one 5 years later. I started out with mysql, and when I started working on a postgres project, I was shocked at some of the ways it was lacking (how am I supposed to store email addresses in a database without collations?).

But when postgres folks grouse about stuff in mysql, I'm usually nodding along and saying "yeah, that would be nice".

They're both great options. If anybody on your team is already an expert at one of them, use that one.


> when I started working on a postgres project, I was shocked at some of the ways it was lacking (how am I supposed to store email addresses in a database without collations?)

How long ago was this? :)


3 years ago. From this comment thread https://news.ycombinator.com/item?id=35908169 I infer that Postgres still doesn't support collations.


Oh - is this ci collations, not collations in general?


A lot of the collations are useful in different circumstances – but the CI ones probably come up the most often, and are necessary to store email addresses in a way where you store the case information that the user typed in, but can do case-insensitive lookups against in the future.


According to the documentation of Postgres 12 [1] it is possible to use so called non-deterministic collations, which may express case-insensitivity. If that is what you need.
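A minimal sketch of that, close to the example in the PG docs (table made up); note that nondeterministic collations come with caveats, e.g. LIKE doesn't work with them:

  CREATE COLLATION case_insensitive (
    provider = icu, locale = 'und-u-ks-level2', deterministic = false
  );
  CREATE TABLE users (email text COLLATE case_insensitive UNIQUE);
  -- 'John@Example.com' and 'john@example.com' now compare (and unique-conflict) as equal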

The documentation of Postgres 11 [2] states that this was not possible:

    Note that while this system allows creating collations that “ignore case” or  “ignore accents” or similar (using the ks key), PostgreSQL does not at the moment allow such collations to act in a truly case- or accent-insensitive manner. Any strings that compare equal according to the collation but are not byte-wise equal will be sorted according to their byte values.
[1]: https://www.postgresql.org/docs/12/collation.html [2]: https://www.postgresql.org/docs/11/collation.html


> how am I supposed to store email addresses in a database without collations?

Not familiar with MySQL so trying to look that up, but with a constraint? Or just don't do that? An SO answer I found says 'it's much more useful to have johndoe@ and JohnDoe@ treated as the same than it is to support case sensitive email addresses'... OK, it's also non-compliant, but whatever's 'more useful' I guess!


Having used both in production, I agree with the above. It's not going to make or break your business or project.

I will also add that there are giant companies relying on both databases with great success. Facebook still runs on MySQL, and contributes back to it. YouTube I'm not sure about, but it did run on MySQL for a long time, well after it got massive. I'm sure examples exist for Postgres (Amazon, since they moved off Oracle?)


And if you have experience with one and try to use the other, you may end up foot gunned by something you didn't know about.


Use MySQL if you're expecting to have to do large operational database tasks re: migrations and maintenance, with no ability to go offline. gh-ost, Percona's pt-osc, the new INSTANT DDL stuff: it's all quite far ahead in MySQL-land. Additionally, Vitess and PlanetScale are making huge strides in MySQL performance. There are more people and guides in the world to help recover even the most mutilated of MySQL databases. MySQL gets more love in extremely large enterprise-level organizations, and it shows.

Use Postgres if you need some of the quite-good extensions, most notably PostGIS, or if you just want things to work; most documentation will be postgres flavored. Postgres gets more love from web-developers, and it shows.


This is wrong. MySQL does not support transactional DDL, so you cannot run migrations and abort them in the middle.
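Concretely, this is the kind of thing transactional DDL buys you (table and column made up):

  BEGIN;
  ALTER TABLE users ADD COLUMN age int;
  -- ...migration goes wrong halfway...
  ROLLBACK;  -- Postgres undoes the ALTER too;
             -- MySQL would have implicitly committed at the ALTER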

Always use postgresql. It's more logical, more extensible, saner, supports many extensions and is more predictable.

MySQL is inconsistent crap, that trades away consistency, correctness and stability for a little bit of performance in standard use cases.

Do yourself a favor and always use postgreSQL. I switched 15 years ago and never looked back. Have done 15-50 projects since in psql.


With a sane migration tool like pt-osc or gh-ost you can absolutely abort in the middle. What's more, you can pause in the middle or even slow down in the middle based on arbitrary logic (ie, pause migration if replication delay rises above a certain value). Postgres is nice and transactional DDL has its place but postgres stopped halfway through IMO. Vanilla Postgres > vanilla MySQL, but the migration story of MySQL + tooling is so far beyond Postgres + tooling that it's not even funny.

That said, if you don't expect to have tables with 100m+ rows, even vanilla postgres will be good enough.


Spoken like someone who switched 15 years ago and never looked back.


Production MySQL databases of any significant size use pt-osc or gh-ost for schema changes, and these can be throttled, paused, aborted and so on.


Having answered this a ton over the years, don't want to really take shots at MySQL. But Postgres stands in pretty unique ground.

1. It's solid as a really rich data platform (more than just a relational database). Its extension framework is quite unique compared to others. Its JSONB support was the first among relational databases and is feature-rich and performant (see the sketch at the end of this comment). Multiple index types. Transactional DDL. The list goes on.

2. No central owner. A lot of open source has its source code open, but is maintained by a central company.

3. I mentioned extensions, but really that is understated. It can do really advanced geospatial, full text search, time series, the list goes on.

Having explained this a ton of times, first 10 years ago - https://www.craigkerstiens.com/2012/04/30/why-postgres/ - and then again 5 years later with an updated version, I most recently tried to capture more of this in an updated form on the Crunchy Data blog - https://www.crunchydata.com/why-postgres
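To make the JSONB point in #1 concrete, a small sketch (names invented):

  CREATE TABLE events (id bigserial PRIMARY KEY, payload jsonb);
  CREATE INDEX ON events USING gin (payload jsonb_path_ops);
  -- the containment operator @> can use the GIN index:
  SELECT * FROM events WHERE payload @> '{"type": "login"}';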


https://web.archive.org/web/20211206040804/https://blog.sess...

From a former MySQL developer:

> let me point out something that I've been saying both internally and externally for the last five years (although never on a stage—which explains why I've been staying away from stages talking about MySQL): MySQL is a pretty poor database, and you should strongly consider using Postgres instead.


Pick postgres unless you have a good reason not to?


And even then you pick Postgres.


Can't really lose with postgres, I concur


The Postgres query optimizer is more powerful than the MySQL query optimizer [1]. It generally scales better for OLTP. Also tons of extensions that can accelerate your workload.

[1] - https://ieeexplore.ieee.org/document/9537400


It's also more opinionated than the MySQL query optimizer, in that you can't give it hints to prevent it from making a horrible mistake.


There is a way to provide some type of planner hints https://pghintplan.osdn.jp/pg_hint_plan.html
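With that extension loaded, hints are special comments at the start of the query, roughly like this (tables are hypothetical):

  /*+ HashJoin(a b) SeqScan(a) */
  SELECT * FROM a JOIN b ON a.id = b.a_id;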


The main reason I prefer mysql over PostgreSQL is that mysql is just more consistent - in its commands, quirks, etc.

Postgres - is it pg, pgsql, psql, postgres, postgresQL? The answer is "yes."

Plus the case behavior for tables and column names drives me crazy. It's like some leftover VMS shit. I mean seriously fix it. Can you or can you not use a capital letter for a table/column name? I can never remember. Or you can, but you have to quote it? Fuck.

Until recently (which to be fair might be 8-10 years ago) postgres' performance monitoring tools sucked compared to mysql. I know at one point in the last 10 years they still used sunos4 as their base configuration because you know, the OS had been EOL for like a decade at that point.

MySQL is fire and forget. psql (or postgres or pg or postgresql?) is not fire and forget. It's twitchy and requires constant vigilance. I don't want a piece of infrastructure that requires constant vigilance.

That's not to say I won't use it. Its geo stuff is really great. Its JSON support is better than MongoDB's, from what I've heard. Row-level security is awesome. But are those features good enough to overcome psql's quirks? Sometimes.


> at one point in the last 10 years they still used sunos4 as their base configuration because you know,

What exactly do you mean with this? I tried to find some more information, and all I could find were some fixes from 2001[1] (SunOS 4 was supported until 2003), a minor refactor in 2008 with "SunOS 4 is probably broken anyway"[2], and that's pretty much it. SunOS 4 was moved to "Unsupported Platforms" with the release of 7.3, in 2002.[3]

[1]: https://postgrespro.com/list/thread-id/1598869

[2]: https://www.postgresql.org/message-id/20081211091708.0726075...

[3]: https://www.postgresql.org/docs/7.3/supported-platforms.html


If you’re talking about the command line client that’s built in, it’s psql. If you can’t remember the command name to launch it or regularly type those other commands when you meant to type psql, you could add aliases to your shell that point to psql. :)

Learning any new CLI client is a bit daunting at first. With repetition and intention, I think the commands become very memorable. Eg “describe table” is “dt”.


This is like asking how you'd choose between Emacs and Vim, Mac and PC, Monoliths and Microservices, Functional and Object Oriented .. you're likely going to elicit a lot of passion and not a ton of objective information.

For most applications, either choice is going to be just fine. Use what your team has the most experience with. If you have no experience, try them both out and go with whatever you're most comfortable with.


Nah, trivially the correct answers are:

Vim, *NIX (so Mac), monoliths, and lambda calculus all the way—FP!

jk, but FWIW I think sometimes, in rare instances, there does exist a pretty unequivocal consensus “right answer” to this sort of question … maybe such as like: Git vs any other distributed VCS ?


For anything involving location, choose Postgres because PostGIS is just so good.


PostGIS is somewhat of a standard interface for geolocation data and one of the main interfaces of QGis, the other being based on SQLite. It's kind of cool how powerful it is, being a "real" database instead of some more contrived format.

My adventures with QGis+PostGIS have also led me to discover the fun fact that even setting the TLS mode to "enforce" may enforce that you're using TLS ... but not that the certificates are correct. Silly. It also won't use the system's store when set to "full". Oh well.

By the way: I find Railway to be a great home for low-use Postgres/PostGIS databases. You basically pay for operations and the free tier is totally fine for semi-casual use (about 1-2k entities) with a few people from across the pond.


Good to know! I’m starting a project that will use location, and I chose Postgres, I’m happy to hear this :-)


One big factor that keeps us on MySQL is the MyRocks engine. We have huge databases with billions of rows. MyRocks lets us run them with heavy compression, which PostgreSQL can't handle: it is much slower and uses 30x more disk, even with heavy TOAST tuning and/or ZFS compression.


To be fair, at the scale of your use case there I really hope you have a proper DBA who understands multiple database systems and their details, and is able to make the best choice for your setup. (At some point the commercial and/or oddball SQL servers become an option too…)

For everyone else who's in most cases not even stuffing a million rows into their database… just stick with Postgres :)


At our company, we provide Postgres 24x7 support. We have partners that provide support for other databases, and for some projects we work together with companies that have multiple databases.

We have over the years compared the rate of production incidents Postgres vs MySQL. It's roughly 1:10 (MySQL has around 10 times more production incidents than Postgres).

You may consider this anecdotal evidence, but numbers managed here are quite significant.

The gist is that Postgres is not perfect nor free from required maintenance and occasional production incidents. But for the most part, it does the job. MySQL too, but with (at least from an operational perspective) many more nuances.


I find PostgreSQL permission management quite convoluted. In MySQL it is simple to query for what grants a user has, but in PostgreSQL you need to write 100 lines of SQL to do the same... and you can't run \du and other commands without psql. Why couldn't they just come up with `SHOW` shortcuts that work in any SQL client?


You can likely get the SQL for a meta command, and you could run the SQL from your preferred client if you don’t use psql. Here is one example: https://dba.stackexchange.com/a/131031

I also highly recommend investing in psql skills though if you are a Postgres user.
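For instance, starting psql with -E (--echo-hidden) prints the catalog query behind each backslash command, which you can then paste into any client; something like:

  $ psql -E -c '\du'
  -- psql first echoes the underlying SELECT ... FROM pg_catalog.pg_roles ... query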


Yes, that's the "100 lines of SQL to do the same" the previous poster mentioned (obviously they were exaggerating a bit and it's less than literally 100 lines, but it's pretty complex).


The difference is not significant enough to matter for most projects, esp just starting out. Hence, I mostly choose Postgres, since I don't like Oracle as a company very much.


Whenever I see MySQL, my brain automatically sees MariaDB. What is this Oracle thing you speak of ;-)


MySQL and MariaDB have diverged quite a bit. I recently wrote a roundup of just the differences in table functionality and DDL, and despite keeping the post focused to that relatively-narrow topic, the list of differences is getting VERY long: https://www.skeema.io/blog/2023/05/10/mysql-vs-mariadb-schem...


>MySQL and MariaDB

Again, what is this thing you are trying to compare? I just see MariaDB and MariaDB =)

However, because of my blindness to actual MySQL, I have totally not paid attention to any differences between the two. I guess "drop in replacement" isn't actually true any more. Thanks for the info


I know it seems dumb, but postgres really needs to add the simple developer experience stuff like:

SHOW CREATE TABLE;

SHOW TABLES;

SHOW DATABASES;

SHOW PROCESSLIST;

CockroachDB added these aliases ages ago.


This.

For anything at home, I would use MySQL just for those things. The psql client feels very primitive by comparison to me - even though it isn't.


I highly suggest investing time learning psql, autocomplete works great and it has a ton of useful slash commands. \d for instance shows you the list of tables. Awesome tool


I don't need it often enough to invest the time. I generally set up a database as backing store to some project, fiddle with it until I'm happy it's working at the scale/performance I want, and then move on to something else.

During those few weeks I'm actively using the database on the project, I can either get frustrated beyond belief with the CLI for Postgres, or just use what's at hand with MySQL. In fact, these days SQLite is getting more of my attention anyway, and I wrote a small CLI for it a decade or so back (before the sqlite3 client gave us most of the below) to provide:

- Timings for the queries (in fact I called it 'tsql')

- Aligned-column displays, with | separators between columns and index-indicators

- Ability to parse some blobs (Plists on the Mac, for example) and display

- Things like "SHOW DATABASES", "SHOW TABLES", "SHOW TABLES LIKE" etc.

Mainly I wrote it to do some benchmarking, but I eventually preferred it over sqlite3 as the CLI.

Note that all this is personal stuff - When I do commercial stuff, the answer is always "what is best understood by the people maintaining it afterwards"...


Each of the bullets you listed have very straightforward and memorable meta commands that I use on a regular basis with psql. It may be worth learning them just for when you use Postgres. There is also a built in help. These can also be saved into your dot files so you don’t need to memorize them. Happy to show you if you’re interested!


I forget if there's an equivalent for the first one, but from psql there is a translation of mysql's "DESC table" as "\d table", and the rest are:

\dt

\l

SELECT * FROM pg_stat_activity;


> I forget if there's an equivalent for the first one

Not really; PostgreSQL doesn't store the original query, so you'll need to re-create it from pg_class, pg_attribute, and all of that (which is really what \d and such in psql do). The easiest way is probably pg_dump, but it's best to just get used to \-commands because it's really just the same thing.


Find the SQL from meta commands. Example: https://dba.stackexchange.com/a/131031


That is not equivalent to "show create table" at all.


There is an extension that does that: pgddl. [1]

[1]: https://github.com/lacanoid/pgddl


Generally speaking it should always be PostgreSQL.

It's the default choice for a number of reasons, but chief among them is just that it's higher quality. That is, it's developed to higher standards, due to the community bar being really high (thanks Tom Lane, et al. for your stewardship) and testing and analysis of changes to the database being ingrained in the culture.

By pursuing correctness first before performance for many years PostgreSQL has built a stronger foundation that is now paying dividends in terms of both performance but also ability to ship powerful features quickly. This has generally resulted in the gap between MySQL and PostgreSQL only continuing to widen over the last 10 years.

So when would you consider picking MySQL?

To me that comes down to exactly one set of use-cases, and that is workloads that are fundamentally incompatible with VACUUM. The PostgreSQL MVCC system requires that the table heap be maintained by the VACUUM process, both to ensure safety (txid wraparound) and to reclaim/reuse heap storage. This process is very expensive for workloads that do a lot of updates, especially on indexed columns (as indices need VACUUMing also); it's less of an issue for non-indexed columns if you can use HOT (heap-only tuple) updates and tune the target fill ratio of heap pages appropriately.

In most cases it's highly unlikely your business will reach the level of write load where these deficiencies in write behaviour actually matter, but it is possible. Uber famously migrated from PostgreSQL primarily because of their experiences with write amplification and VACUUMing.

If for instance though your data consists of a smaller live hot set and a warm set that is less frequently updated and easily separable by a deterministic factor like time you can very easily use PostgreSQL table partitioning to isolate the two and continue to scale for a very very long time on pure PostgreSQL.
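A sketch of that hot/warm split with declarative partitioning (schema invented for illustration):

  CREATE TABLE events (
    id bigint NOT NULL,
    created_at timestamptz NOT NULL
  ) PARTITION BY RANGE (created_at);

  CREATE TABLE events_2023_05 PARTITION OF events
    FOR VALUES FROM ('2023-05-01') TO ('2023-06-01');
  -- older partitions go cold, so VACUUM effort concentrates on the hot partition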

In practice this may be fixed in PostgreSQL one day. There was a project called zheap to implement an UNDO-log-style storage system for PostgreSQL (which would avoid all the VACUUM maintenance etc.), but it stalled out, largely I believe because it wasn't able to provide obvious wins quickly enough to stimulate further interest. However, OrioleDB has picked up the torch now and does in fact seem to be showing very impressive results.

If such a storage engine is merged, in my mind there will no longer be any reason to consider MySQL for production workloads.


Thanks for calling out table partitioning. Besides implementing it at one level, multiple levels can be used simultaneously (e.g. list and range). Tables can be grouped and split out to their own database (aka functional/vertical sharding) and partitioned again. This all takes more effort and investment, but keeps you on PostgreSQL. As you said, fillfactor can be tuned for more HOT updates. It's even worth analyzing whether updates could be turned into inserts written at a high rate (incurring no bloat), leaving fewer updates, made at a rate that does not outrun vacuum.
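For the fillfactor point, it's a per-table setting; 70 is just an illustrative value:

  ALTER TABLE hot_table SET (fillfactor = 70);
  -- keeps ~30% of each heap page free so updates can stay on-page (HOT);
  -- applies to newly written pages, so existing data needs a rewrite to benefit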


I've been using MariaDB (MySQL) as a hobbyist for years. I just set up a couple of MySQL servers with phpMyAdmin on Raspberry Pis and use them for local development. Basic CRUD apps, etc.

I've always assumed that PostgreSQL is a step up, but never really bothered to look into what I get for the effort. Do I really get anything if I'm not trying to make apps at scale?


> I've always assumed that PostgreSQL is a step up, but never really bothered to look into what I get for the effort.

You're making a (mistaken) assumption that Postgres giving you a "step up" means that you also have to put in more effort. You don't, at least not in my experience. Both are database servers with a bunch of install & setup. There's phppgadmin if you want an 1:1 replacement for phpmyadmin (no opinion on these, haven't used either).

Postgres just gets you farther if you need to at a later point.

I would recommend you swap out mysql for postgres on your raspis. You're gaining experience on one of the two. But experience on Postgres seems to be more useful and valuable (cf. rest of the HN comments), for the same cost of your time.


Same position.

There's so many things I want to learn I'm not sure postgres is such a step up from MySQL that it's worth being at the top of the list.


If you’re not at the level of detail to care about transaction isolation levels, clustered indexes, and the implementation details of replication, you can invest your time elsewhere. For certain use cases and workloads, the differences can matter a lot, but many developers are just looking for a backing store for their CRUD app and in that case, either is likely fine.


Most answers seem written by fanboys rather than legit answers.

I would say go with what you know and are most comfortable with. You are more likely to get the better outcome.


This.

I've heard countless times that Postgres is better, and I've watched talks where they show how loosey-goosey MySQL is with some things, but I know how to backup, restore, tune, secure and replicate MySQL. I grok its permissions in depth more than I ever have with Postgres, and I've even written a MySQL plugin in C, so I have that in my toolbox if I need it. So I'd usually go with MySQL by default (or in some cases SQLite). But if I didn't have to administer or maintain it, and someone else was handling that, I think I'd be fine with Postgres too.


I’m going to go for the esoteric opinion: MariaDB. Specifically to get system versioned tables. Imagine having the full change history of every single row for any odd task that you need without the performance penalty of keeping historical data in the same table. It can be a huge amount of leverage.
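For anyone who hasn't seen it, the syntax is pleasantly small (table made up):

  CREATE TABLE accounts (
    id INT PRIMARY KEY,
    balance DECIMAL(12,2)
  ) WITH SYSTEM VERSIONING;

  -- history is kept automatically; query the table as of any point in time:
  SELECT * FROM accounts FOR SYSTEM_TIME AS OF TIMESTAMP '2023-01-01 00:00:00';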

If that’s not your interest, I will admit that Postgres array support is far ahead of any of the MySQLs. Most ORMs don’t use it but you can get optimal query prefetching via array subqueries.


I’m 100% with you regarding system versioned tables. However, I think they’re coming to Postgres soon-ish. I was following the tracker for a while and it looked like it was done.


Postgres for anything with a single database size < 4TB.

But if you need something that can handle 100TB+, go with Vitess (MySQL compatible).


What’s the reason though for Vitess? Postgres supports tables up to 32TB but hopefully you’re splitting them up using declarative partitioning in one or more ways before that. If you have tables that are smaller than a TB and a large memory DB (>1 TB RAM) Postgres should run ok right? I’d also imagine you’re splitting up your database into multiple databases and multiple instances (the writers) well before that as well right?


Why not Citus?


A lot of comments for Postgres, but it's the only major DB in 2023 that does not let you choose your character collation when creating a database. That is pretty much a deal breaker on day 1. Guess you'll be doing a tolower() on every db search and not using indices (which will kill performance), or using column collation casts on every search query. I just don't get it.

I once tried to migrate a SQL Server DB to Postgres and eventually gave up, with MySQL being a pretty easy switch with some minor stored procedure rewrites.

Also it tends to do things way differently than every other DB. VACUUM is just a completely different concept that can footgun you pretty fast.

Postgres is pretty powerful but it has certainly made some interesting design choices.


> doing a tolower() on every db search and not use indices

If you create the index with lower() it will use that; e.g. "create index on tbl (lower(email))" and then "select * from tbl where lower(email)=lower($1)". That's more or less the standard way to do this, but there are some other ways as well. It's more explicit than MySQL, so in that way it's better. It's more effort and easy to forget, and in that way it's worse. Either way: it's definitely possible to do case-insensitive searches with indexes.
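One of those other ways, for completeness, is the citext extension, which makes the comparison case-insensitive at the type level (sketch, table made up):

  CREATE EXTENSION IF NOT EXISTS citext;
  CREATE TABLE tbl (email citext UNIQUE);
  SELECT * FROM tbl WHERE email = 'John@Example.com';  -- matches john@example.com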

When I first used PostgreSQL I ran in to "how the hell do I do this?! MySQL just lets me [...]" kind of issues, but after many years of PostgreSQL usage I now have the opposite when I use MariaDB, which also has its share of awkwardness and issues (just different ones).


> It's more explicit than MySQL, so in that way it's better.

It sounds like you're under the impression that MySQL just makes everything case-insensitive and is silent about this? That's decidedly not the case.

MySQL 8 ships with 41 different character sets, supporting a total of 286 different collations. Collation names explicitly include "ci" (case-insensitive) vs "cs" (case-sensitive), as well as "ai" (accent-insensitive) vs "as" (accent-sensitive), and also the language/region conventions used for sorting purposes.

You can choose collation at the column-level granularity, as well as setting defaults at the table, schema, and server levels. It's completely explicit and very configurable.


There is no way to see from the query itself if it's case-sensitive or insensitive; that is what I meant.


Eh, just from a SQL query alone, there's no way to see that (lower(email)) is indexed, or see column data types etc. That metadata lives in the table definition, which is a normal place for it, so it seems reasonable for the collation / case-insensitivity to not be explicit in the query text.

Besides, MySQL also supports functional indexes, so you could do the (lower(email)) approach in MySQL too if you really want!


No, you can't see everything, but you can see the exact comparison it's making. Is that useful? You can decide that for yourself but I like being able to see as much of the logic (and thus verify the correctness) in the query itself. Also helps with copy/paste and some other things.

I never said you can't use functional indexes in MySQL. Someone said "you can't do this in PostgreSQL" and I just commented "here's (one way) to do this, maybe that's helpful some day". That's it.


My apologies, I misunderstood "It's more explicit than MySQL" to imply that you were saying that approach couldn't be used in MySQL.


I was disappointed with MariaDB and tried it before using MySQL. It is so far behind MySQL that it can't be considered equivalent anymore. And I really wanted to use MariaDB instead.

The "MySQL just lets me" stuff eventually adds up and hinders development. For example, your lower() on the param example is now incompatible with most ORMs unless you do that in code or create a special SQL statement for that. This isn't all fringe cases that you run into when you're months in either. I really wonder on some of these comments saying they've vetter both and Postgres wins hands down.

Postgres is solid but it definitely has its warts and downsides too.


Am I missing something here? Postgres does allow you to choose character collation when creating a database, as well as when creating new columns.


I'm sorry, I was wrong in that Postgres does let you specify a few character collations at the DB level, but they are pretty much ASCII vs Unicode, with no case-insensitive configurations. Since v12 you can create collations and assign them to particular columns.


Friends don't let friends use #Horracle software.

That includes VirtualBox, MySQL, Horracle Cloud. Just step back. Walk away. Do not pass go, do not collect $20,000 in lawyers' fees for unintended actions.


That's quite silly. VirtualBox is great, and so is MySQL. They're also both OSS, so no lawyers in the sense you're implying.


Then you don't know what you're talking about, and are ignorant of the risks of using VirtualBox.

That VirtualBox extension pack? That ain't free... well, it is for personal use only, because they're not shaking individuals down. However, Oracle watches which domains download that extension pack, and sues companies when too many employees download it.

You can see that in this reddit thread: https://www.reddit.com/r/sysadmin/comments/d1ttzp/oracle_is_...

And this article also addresses the risks of dealing with Horracle.

And a company I worked for had dealings with them as well. Again, Horracle played dirty and did bullshit, except over Horracle DB itself.


You're basically saying something that isn't free, isn't free.


Why would I choose MySQL in any year? There is no context you can provide to this question where I wouldn't always choose Postgres.


It's simple AF - I just always pick the well-proven PostgreSQL database.

... if that is too big I use SQLite.


Postgres for almost anything. Robustness, ecosystem, accuracy, all top notch.

One exception: I did migrate two very large tables (15B+ rows) from Postgres to MySQL for performance reasons. InnoDB (MySQL's storage engine) can arrange the records on pages by primary key, and if you have a multi-value PK (user_id, uuid) it means all records from a user are in the same set of pages. Huuuuge improvement over having your data for a user spread out over N different pages. Memory cache way more efficient. Orders of magnitude speedup on cold queries, better cache hit rate, and cost reduction from smaller servers.
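The layout trick is just in the primary key definition; columns here are invented:

  CREATE TABLE user_records (
    user_id BIGINT NOT NULL,
    uuid    BINARY(16) NOT NULL,
    payload BLOB,
    PRIMARY KEY (user_id, uuid)  -- InnoDB physically clusters rows by this key,
  ) ENGINE=InnoDB;               -- so one user's rows share the same pages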


Did you implement table partitioning with Postgres or consider that before moving?


Been a while. IIRC it didn’t work with millions of users, and partitioning with many users per partition would have same issues. Also looked at CLUSTER but there was no workable online version.


Also interested in the responses, not because it seems like a close decision but because I would pick postgres by default for anything (anything that isn't simple enough to be done in sqlite).


Same.


MySQL is still ahead operationally (no vacuum, group replication, gh-ost, optimizer hints, etc.). I would choose it unless one of the Postgres extensions is a killer fit for your project.


If Postgres was that much better than MySQL then you would expect to see exact reasons on why to pick it. Every comment so far has not listed any reason.


Ok, here's one: When you give MySQL invalid data, its standard behavior is to just silently store garbage in your database in many cases where PostgreSQL would've told you that your data is invalid.

MySQL's handling of unicode has also been terrible historically, with way too many foot guns, but I don't know if that may be better with recent versions.

People aren't providing strong reasons because the question wasn't "what are some objective reasons for picking one over another", but "how do you pick between them". People are simply answering the question OP asked, and a lot of people's process is simply to pick PostgreSQL.


A lot of the MySQL issues historically have been fixed. UTF-8 is better now, invalid data handling is better (by default even! though your distro's default config probably puts you back in permissive mode!) but regardless, I'm still using Postgres every single time.

The fact is that MySQL historically was terrible for complex schemas with complex types, while Postgres was a pleasure to work with. MySQL had a huge performance edge for many years, but that came at a cost of resiliency and reliability. MySQL has greatly improved on these key areas, and Postgres has also made significant performance improvements. Systems these days are also so powerful that the database probably isn't your bottleneck.

Regardless, I always use Postgres every single time because I am just scarred from years of dealing with MySQL. What even is MySQL is also an open question at this point, there's MySQL and MariaDB and Percona flavors and the founder of Percona was just ousted and I can't be bothered to put in mental energy to untangle all this to even figure out what MySQL I should be developing against.

Compare this to Postgres where the community seems to have an extremely steady hand and constant progress. There's no forks, there's no infighting, there's no drama, there's a great regular release schedule with incremental improvements.


In MySQLandia, we do not speak of utf8, we only speak of utf8mb4.


> When you give MySQL invalid data, its standard behavior is to just silently store garbage

This is a common misconception, but this hasn't been the case for over 7 years. MySQL 5.7, released in Oct 2015, changed its defaults to enable strict sql_mode. All prior versions have hit end-of-life for support years ago, so there is no modern version of MySQL with this silent truncation behavior.

The only reason this problem persists is because Amazon RDS (all versions and forms, including Aurora) uses nonstandard default settings which disable strict mode!

That all said, I do believe Postgres is an excellent database, and a great choice for quite a large range of use-cases. But please, let's compare 2023 Postgres with 2023 MySQL, not 2023 Postgres with much older MySQL. It's only fair.


> But please, let's compare 2023 Postgres with 2023 MySQL, not 2023 Postgres with much older MySQL. It's only fair.

Heh Heh Heh

On a humorous note, the official MySQL page (in the early 2000's) comparing MySQL vs other databases had the same problem.

They'd list the latest and greatest MySQL version, but compare it against archaic versions of the others. Clearly on purpose, because "Marketing" probably.

Asked them (via official @postgresql.org email address) to please update that page to more a recent PostgreSQL, for fairness. And was completely ignored of course.

So it's kind of amusing to see a request for fairness in the opposite direction (which I agree with anyway) ~20 years later. ;)


Why do they have these settings? Because shit software used shit MySQL and got data corruption.

It might be better now. But for me it's in the same shithole as MongoDB, PHP (old style, so no recovery there, even though it's possible to write proper code) and most JavaScript.

Another thing is that people don't even want to use Oracle's MySQL but MariaDB. Why the hell would I want to run a fork of something and still keep calling it something else?

The only reason for MySQL is WordPress.


> MySQL's handling of unicode has also been terrible historically, with way too many foot guns, but I don't know if that may be better with recent versions.

Unicode generally "just works" if the charset in use is utf8mb4. As of MySQL 8.0, this is the default.


Ah, that's good to hear. I haven't looked seriously at databases other than SQLite for a long time, it would be interesting to see a more up to date evaluation.


Hasn't been the case for a few major versions, unless you want to anthropomorphise your db and hold it accountable for past behaviour.


> unless you want to anthropomorphise your db and and hold it accountable for past behaviour

No one is holding the literal bits that make up the database executable accountable here, they are indicating they don't trust the devs of MySQL/MariaDB to do a good job. Whether or not that is an accurate view on their part is arguable, but it's pretty clear from context that they don't think that several if/else statements had it out for them.


A few reasons:

- Transactional DDL statements (schema modifications)

- Better support for UPSERT operations

- Better JSON support (including ability to index into JSON columns)

- the RETURNING statement to return data that was inserted/updated (see the sketch after this list)

In general Postgres is a lot more featureful than MySQL.
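The RETURNING one in particular removes a whole class of read-after-write round trips; a sketch with hypothetical tables:

  UPDATE wallets SET balance = balance + 1 WHERE id = 42 RETURNING balance;
  INSERT INTO users (email) VALUES ('a@example.com') RETURNING id;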


If you're doing any sort of spatial logic (e.g. mapping), you'll want to use PostGIS.


One thing that bit me on MySQL is that triggers don't execute on cascade deletes/updates: https://bugs.mysql.com/bug.php?id=11472

That issue was filed in 2005 and it still isn't fixed.

Another gripe I have is that MySQL's JSON Path capabilities are much more limited than the Postgres JSON Path capabilities. MySQL doesn't have filter expressions nor any operators/methods but Postgres does. Don't get me wrong, neither is jq, but I hate having to jump through extra hoops to get my JSON in the right format.

Compare

https://www.postgresql.org/docs/current/functions-json.html#...

and

https://dev.mysql.com/doc/refman/8.0/en/json.html#json-path-...


Postgres has a worse implementation of MVCC. It results in more bloat being produced, and slightly slower updates in a highly concurrent environment. Most businesses don't operate at the scale where this matters. But on the flip side, the tooling and developer experience is much better.


Array data type.

This has saved my ass a bunch of times. Not even as a column type, just in complex queries that otherwise became unwieldy monsters. The usual "started out simple but now it's a frankenbase" and you're stuck with a shitty schema. (The one thing worse than refactoring code: refactoring databases!)

In one case I was able to replace a 15-minute process with thousands of queries with one single query that aggregated all the data into a bunch of arrays. It completed in a few seconds. (Doing it without arrays would have been possible, but duplicated a lot of data in the result set.)
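The aggregation trick looks roughly like this (schema invented); instead of one row per order duplicating the user columns, you get one row per user:

  SELECT user_id,
         array_agg(order_id ORDER BY created_at) AS order_ids,
         array_agg(total ORDER BY created_at) AS totals
  FROM orders
  GROUP BY user_id;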


The only instance where I'd choose mysql over postgres is if your database needs are very simple, but you need to be able to scale hard, and your devops aren't skilled enough to manage an advanced postgres setup.


We choose Postgres for extensibility and stability :)


I was going to make a similar joke: you look at them and choose Postgres


PostgreSQL is a community thing and MySQL is Oracle. Maybe make some basic benchmarks for comparison?


1) The choice is Postgres if you care about your data at all.

2) Yes, if you are already HUGE and have requirements that call for Vitess then by all means use it. If so, you are not asking this question—see #1.

3) It's a blog or something where it doesn't matter, use a static site generator.


By choosing SQLite.

No server process and a single file per DB which I can put wherever I like.


I like SQLite. But I really wish its default behavior wasn't to simply allow garbage data into the database. If I have an int column, don't let me accidentally store a string.


https://www.sqlite.org/stricttables.html

"In a CREATE TABLE statement, if the "STRICT" table-option keyword is added to the end, after the closing ")", then strict typing rules apply to that table. ... The STRICT keyword at the end of a CREATE TABLE statement is only recognized by SQLite version 3.37.0 (2021-11-27) and later.

  sqlite> create table x (a int);
  sqlite> insert into x values ('hello');
  sqlite> select * from x;
  hello
  sqlite> drop table x;
  sqlite> create table x (a int) strict;
  sqlite> insert into x values ('hello');
  Runtime error: cannot store TEXT value in INT column x.a (19)


Yeah, I know about that, and I'm doing that on all my tables these days. It's just sad that the default behaviour is to allow garbage data, and that if you ever forget to put 'strict' on your tables you'll have a perfectly functional application with no sign that anything is wrong until you suddenly find corrupt data in your database.


create table strict


SQLite is great for its scope - but not in the same class as a full-fledged RDBMS.


Can you give a specific example of what you are missing when using SQLite?


At a certain scale, you'll want replication or clustering, which SQLite doesn't really do AFAIK. At a scale below that, you'll probably want to be able to have multiple web servers talking to one database server, which SQLite doesn't really do either. I also think SQLite's performance under heavy write workloads is worse than PostgreSQL's?

Basically, AFAIK, SQLite becomes problematic once you need more than one computer to handle requests.


Just to point out, there are now SQLite replication and various "distributed database" projects which seem to work fairly well.

They're probably not as battle tested as the PostgreSQL ones, but they are around, have users, and are actively developed.

The ones I remember off the top of my head:

* https://litestream.io

* https://github.com/rqlite/rqlite <-- more of a "distributed database using RAFT" type of thing

* https://github.com/canonical/dqlite


Yeah, there are SQLite forks or products built on SQLite which have these sorts of features, but SQLite itself doesn't. There remain reasons why someone may want to use another database (such as PostgreSQL, or an SQLite fork, or a database which uses SQLite as a storage engine) instead of SQLite.

Honestly though, if I need these sorts of distribution features, I would probably prefer the database to have them built in. I don't really see the point in using SQLite at that scale.


rqlite[1] creator here, happy to answer any questions.

[1] https://www.rqlite.io


Here's a good place to start:

https://www.sqlite.org/whentouse.html

https://www.sqlite.org/quirks.html

Full-scale RDBMSs, especially Postgres, have lots of goodies that either SQLite doesn't have (or which it does have, but which aren't so richly featured). Once you've gotten hooked on a few of these, the distinction will feel a lot more clear.

Meanwhile the tipping points in favor of SQLite seem to be embedded systems, and its whole "service-less" architecture and plain ease of use. Which is why it still gets lots of love, for those contexts.


Multi-writer support, access over the network without 3rd-party software, etc.


Performance, especially after a while and at a certain size.


> By choosing SQLite.

It's a good reminder to give some thought to whether one actually needs MySQL|Postgres. If not, SQLite is the way to go. Most of my code that uses a DB is using SQLite.

But obviously, if you actually need MySQL|Postgres then SQLite is not an option.


If you're a gigacorp you might have reasons to go with a customized version of MySQL otherwise pick Postgres.


I dunno if this is still true but a couple of years ago MySQL was cheaper on AWS (RDS/Aurora) than Postgres.


For a typical db.t3.xlarge instance, you're talking about 29c/hour vs 27.2c per hour. That's $157.68 as the total difference for one year's runtime, when the whole instance cost for postgres would be $2540.4 for the year, or about 6%. The larger the machine, the closer to parity. Given the absolutely small difference, I hope this isn't the dividing line in any commercial project.


Again, I don't still know if this is the case, but you could use smaller instances with MySQL aurora than with Postgres, given the way our application worked it would have made a big difference for us if we had used mysql.

RDS/Aurora was our most expensive resource so we were looking at ways to cut that cost down and mysql was one option (though the way the app worked and the extensions that it relied on made it not a possibility.)


If I were starting from zero knowledge, I'd choose Postgres, but you're not. Unless you rely on particular database extensions (full-text search, PostGIS, etc), or you're operating at the kind of scale where all bets are off, it won't matter very much; use the one your team has more expertise with. MySQL sucked twenty years ago, but it's had twenty years of people beating on it, so it largely sucks in precise, well-understood ways now. This is nearly as good as being good.


Always Postgres. It will be more painful to migrate in the future versus starting with it.


MySQL - Better performance and tune-ability. Higher connection limits/lower connection costs. Better multi-master and clustered replication story. 2+ decades in extremely high data and availability environments (telecom).

Postgres - Better SQL semantics.

Having supported both in 'bet the business' scenarios, I would choose MySQL hands down. Operating state of the art HA postgres clusters today feels like running MySQL clusters in 2005.


It depends on my requirements and context, really. THESE DAYS with docker, postgres is almost a no-brainer for basically anything that _I_ do, anyway. If I ever need a heavy write load or external access via network, then PG; otherwise, for my mostly simple CRUD needs, SQLite, quite honestly.

I have a production app on MySQL, but that was before docker, and MySQL was a fair bit easier to set up back then. That was my need at the time.


Easy. Choose Postgres. I worked with Oracle for 15 years before moving to Postgres. In my experience, Postgres has been better documented, more performant, better standards compliant, etc. Now, you are asking about MySQL and not Oracle, but I’ve never heard anyone say MySQL is better than Oracle, while I have heard that about Postgres many times.


You choose postgres.


For myself it’s about what I know and that’s MySQL / Mariadb

If there is a problem, I have probably seen it. MySQL won't start for some reason? I know what to do. Adding and removing users, permissions, running with SELinux, optimizing queries, indexes - all within my scope. Backups and restores, working with snapshots, MySQL replication or master-slave setups. I have been able for years to upgrade MySQL versions without much hassle. I know the table structure well enough that I could downgrade in almost all cases - with adjustments I once downgraded MySQL 8 to 5.7 because years ago it was too slow.

Now if I had the desire (I'm not past 40, so meh) or I didn't have such a difference in knowledge, I'd probably seriously look at PostgreSQL. So my suggestion is to go within your circle of competence. Regardless, if you go with MySQL it is a fine database that many use at high scale.


For my little hobby sites I use whichever best fits the application. If an application has a specific need for a function that is in Postgres and not MySQL then I use Postgres.

In my former work life we used Percona MySQL for the commercial support and very fast response to fix bugs and add features, but we also used Postgres and Oracle. In those cases it was more important to have awesome DBAs that could do anything with any database. I learned a lot from them and they earned a lot of respect from me. One of them could find significant design flaws in an application just by reviewing the schema. They cut our memory usage in half and bought us time to wait for the server vendors to support more memory in their next motherboard release.


There was a thread asking this a month ago, here's what I wrote then:

https://news.ycombinator.com/item?id=35618726


I don't use either but I don't think I've seen MySQL mentioned in any meaningful way in almost a decade at this point. If forced, knowing only that, I'd choose postgres.


It depends on the project but a couple things that would help me pick MySQL (MariaDB) more often (wish list) are:

- Row-level security (RLS): depending on the tenancy and DB-level safeguards you need, Postgres supports it (see the sketch after this list) and MariaDB doesn't

- Transactions for schema changes: Postgres (and even SQLite) supports this but MariaDB doesn't. Without it, a faulty migration can require manual clean-up to get the database back to the state you need
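A sketch of the RLS side, with placeholder table and setting names:

  ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
  CREATE POLICY tenant_isolation ON documents
    USING (tenant_id = current_setting('app.tenant_id')::int);
  -- roles subject to RLS now only see rows for the tenant set via
  -- SET app.tenant_id = '42';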


Absolutely Postgres!!! Yesterday I had to migrate from Postgres to MySQL and I found some disadvantages of MySQL in my use case.

There is no RETURNING for insert/update/delete, which forces me to do a second query to get the new data. Why not just return what I have updated? Because it is something like: UPDATE wallets SET balance = balance + 1

I have to give up the JSON functions because it's hard to work with them on MySQL. I have to do the aggregation in my code instead.

No UUID v4 support by default.


Easy. Always postgres.

I also use the same logic applied to document databases. Mongo or Postgres? Postgres

Also pub sub. Postgres or redis? Postgres

Use postgres until it’s not technically feasible to use it anymore.


I worked for many years with MySQL and then started working in a company that used PostgreSQL. Things I liked from PostgreSQL:

  - Better SQL standard support (modernish?).
  - JSONB type, also with indexes! Very useful to have JSON fields to allow dynamic fields or avoid sub-querying some tables (i.e. I mean caching some computed results in a JSON field).
I have to say that this was in 2018-2019; maybe MySQL has improved since then.


I flipped a coin, it landed on its side so I went with SQLite.


Completely tangential, but I needed to do some transformations on some CSVs and decided I'd just import them into a local database, run my SQL, and export. The CSVs had the literal strings TRUE/FALSE in a column. MySQL would require me to import into a varchar column and then transform; PostgreSQL automatically converted it to a boolean.

Plenty of big reasons, but it's the small quality of life stuff that makes a lot of difference.


"PostgreSQL supports most of the major features of SQL:2016. Out of 177 mandatory features required for full Core conformance, PostgreSQL conforms to at least 170. In addition, there is a long list of supported optional features."[1]

[1] https://www.postgresql.org/docs/current/features.html


> What sort of functional requirements would cause you to choose one over the other?

Simple: I don't like having headaches. Therefore, I chose postgres.


Not even funny or clever, and definitely not helpful to the actual question being posed


Easy. I keep using MySQL because that's all I know.

Kinda j/k. I'm intrigued by Postgres but I don't start that many new projects. It has at least 1 feature that MySQL doesn't: deferred referential integrity, for which there's no good workaround that I'm aware of. It arises extremely rarely, when you need 2 tables to each have a FK to the other.
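Rough shape of the Postgres feature, for the curious (tables made up):

  CREATE TABLE a (id int PRIMARY KEY, b_id int);
  CREATE TABLE b (id int PRIMARY KEY,
                  a_id int REFERENCES a DEFERRABLE INITIALLY DEFERRED);
  ALTER TABLE a ADD FOREIGN KEY (b_id) REFERENCES b
    DEFERRABLE INITIALLY DEFERRED;

  BEGIN;
  INSERT INTO a VALUES (1, 1);  -- would fail an immediate FK check
  INSERT INTO b VALUES (1, 1);
  COMMIT;                       -- both FKs are checked here instead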


What's to choose? Postgres all the way.


> What sort of functional requirements would cause you to choose one over the other?

MySQL failed me big time in the past, so my functional requirement would be "don't f** with my data". And so far, PostgreSQL and SQLite never did. Don't have time to give second chance, don't need to.


Funny to be so blind: no one mentioned how the HeatWave analytics engine of MySQL is 10 years ahead of what PostgreSQL can do. Oh, but that's because of Oracle, so it doesn't count, right?

Mixing OLTP and blazing fast analytic queries on the same database, at the very same time, removing ETL needs...


That's EASY

PostgreSQL. The answer has always been PostgreSQL, even at the height of MySQL's popularity and LAMP craze.

It is a VASTLY better piece of software and ecosystem - yet it's boring at the same time - all things I demand from the DB system whether it's a toy project or an enterprise app.


Friends don't let their friends choose Mysql :)

A super long time ago (decades), when I was using Oracle regularly, I had to make a decision on which way to go. Although MySQL had the mindshare then, I thought that Postgres was more similar to Oracle, more standards-compliant, and more of a real enterprise type of DB. The rumor was also that Postgres was heavier than MySQL. Too many horror stories of lost data (MyISAM), broken transactions (MyISAM lacks transactional integrity), and a really long list of MySQL gotchas influenced me.

In time I actually found out that I had underestimated one of the most important attributes of Postgres, and a huge strength over MySQL: the power of community. Postgres has a really superb community, found on Libera Chat and elsewhere, that is very willing to help out, and I think that gives it a huge advantage. RhodiumToad [Andrew Gierth] https://github.com/RhodiumToad & davidfetter [David Fetter] https://www.linkedin.com/in/davidfetter are incredibly helpful folks.

I don't know whether Postgres' licensing made a huge difference or not, but my perception is that there are a ton of 3rd-party products based on Postgres but customized to specific DB needs, because of the more liberal PG license, which is MIT/BSD-derived: https://www.postgresql.org/about/licence/

Some of the PG based 3rd party DBs:

Enterprise DB https://www.enterprisedb.com/ - general purpose PG with some variants

Greenplum https://greenplum.org/ - Data warehousing

Crunchydata https://www.crunchydata.com/products/hardened-postgres - high security Postgres for regulated environments

Citus https://www.citusdata.com - Distributed DB & Columnar

Timescale https://www.timescale.com/ - Time-series

Why Choose PG today?

If you want better ACID: Postgres

If you want more compliant SQL: Postgres

If you want more customizability to a variety of use-cases: Postgres using a variant

If you want the flexibility of using NOSQL at times: Postgres

If you want more product knowledge reusability for other backend products: Postgres


The license issues are relevant. I've had to convert an application from using MySQL to using Postgres because we wanted to distribute copies of it. The Postgres license allowed us to distribute it without running afoul of GPL, which MySQL uses.

Even distributing the MySQL JDBC driver with your software means that you need to provide source code of the Java app that uses it.

I used MySQL by default for years. If you use it for internal-only apps then the GPL doesn't matter. But once you cross the line into sharing what you've built with others, MySQL is a non-starter unless you want to GPL everything you are sharing.


couldn't you have used the maria db connector? it's LGPL https://mariadb.com/kb/en/about-mariadb-connector-j/


You're quite right. The LGPL license would allow you to distribute the connector along with a non-GPL application.

This is good to know. Thanks!


None of them if you need to do something like pivot tables with a dynamic number of columns. Postgres has some half-baked pivoting functionality (the tablefunc extension's crosstab), but you MUST define the column names in advance, which kind of defeats the purpose of pivot-table functionality.
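For reference, a sketch of what that looks like with tablefunc's crosstab (the sales table is made up); note the output columns are fixed when you write the query:

  CREATE EXTENSION IF NOT EXISTS tablefunc;

  -- The column list in "AS ct(...)" must be declared up front
  SELECT * FROM crosstab(
    $$ SELECT region, quarter, sum(amount)
       FROM sales GROUP BY 1, 2 ORDER BY 1, 2 $$
  ) AS ct(region text, q1 numeric, q2 numeric, q3 numeric, q4 numeric);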


For MySQL users today, there are modern cloud-native replacements, such as TiDB (tidbcloud.com; github.com/pingcap/tidb), which provides a MySQL-compatible syntax layer but is not MySQL.

Disclosure: I work on TiDB.


Am I the only one who thinks postgresql’s timestamp and timestamptz types are incredibly stupid?

I just want to either save a local date and time, or a UTC timestamp. PostgreSQL's timestamp(tz) types do neither and both at the same time.


I've avoided `timestamp` and haven't needed localized times stored in the DB, but in what way is `timestamptz` not exactly the UTC timestamp you expect?


It converts the timezone automatically, which I do not want
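For context on what's happening: timestamptz stores an absolute instant (normalized to UTC internally) and renders it in the session's TimeZone setting on output, which is the conversion being objected to. A minimal sketch:

  SET TimeZone = 'UTC';
  SELECT TIMESTAMPTZ '2023-05-11 12:00:00+02';
  -- 2023-05-11 10:00:00+00

  SET TimeZone = 'America/New_York';
  SELECT TIMESTAMPTZ '2023-05-11 12:00:00+02';
  -- 2023-05-11 06:00:00-04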


Practically speaking, they're very similar. MySQL and Postgres differ in their approaches to replication & clustering, which can have a big impact on high-availability, high-volume database configurations.


If I need disk encryption for very basic security in my production database, I know which one I choose, and it doesn't start with a P.


There is nothing to argue about.

Postgres is better in almost every possible aspect you can come up with.

So you choose MySQL in 2023 only if you have very, very specific requirements and constraints to justify the sacrifice.


Go with postgres. Unless you need php/wordpress, then pick mysql.


Am I working on a WordPress site? Yes -> MySQL. No -> Postgres.


What is the go-to solution for Postgres to have data encrypted on disk?

I think MySQL is more popular in the enterprise because its enterprise edition has single-click transparent encryption.


Encrypt at the block level (e.g. LUKS) or at the filesystem level (e.g. ZFS).


I chose MySQL because at work they don't want to install extensions (geo stuff), and we are using many PHP things that use MySQL. And WordPress. Choose boring tech, or something.


Always PostgreSQL. Unless the MySQL protocol is absolutely needed.


I would quit that job and get a gig or project where I can make my own decisions. Then stick everything in JSON files or whatever seems convenient for my particular problem.


I kind of learned Git rebasing thanks to the Postgres GitHub repo, so I tend to use it nowadays. Does MySQL even have a public GitHub repo?


Just choose Postgres. If you want something more nimble, go with Redis (which does support persistent storage, contrary to popular belief).


pick the database that you are most familiar with


It's pretty easy, it's always Postgres. MySQL has been awful at every place I've had the misfortune of using it.


> It's 2023, how do you choose between MySQL and Postgres?

Easy, you pick Postgres. There's no reason to ever use MySQL.


I think the answer is really easy: you should choose the one you or/and your team has more experience with.


In 2023 you don't even debate between the two. The decision to go with Postgres is instant and by default.


Actually I would like to know just ONE advantage MySQL has over PostgreSQL... are there actually any?


Easy: SQLite


piggybacking on this: does anyone know a Postgres alternative to PlanetScale?


We use YugabyteDB and have had a lot of success with it. We deploy it across 5 continents and it's remarkably performant.



CockroachDB is PostgreSQL compatible and is distributed SQL supporting single-region, multi-region and multi-cloud deployments.


It's PostgreSQL wire-protocol compatible; it's not PostgreSQL compatible. There are many differences. For more compatibility, check out YugabyteDB.


Can it (the database) do simple pub/sub without third-party software? :)


Easy. You don't ever pick MySQL because it's essentially garbage.


the answer is:

* always use Postgres.

* if you need something else, it better have an actual reason.

MySQL/MariaDB is only for when you have a framework that requires it or is highly optimised for it (e.g., WordPress or MediaWiki).


It's not even a choice anymore. One is far better than all of the others.


Can I actually work with it locally and on a production server?


Sigh. Hoping dang will swoop in and close this flame war...


If row level security is a need, I’ll go with Postgres


It's OracleSQL instead of MySQL now?


Neither. Use either TiDB or YugabyteDB.


If you are using .NET then Postgres might be the better choice (much better support for drivers and the default ORM).

If you need replication go with MySQL.


https://mysqlconnector.net/ is very good.

Dapper and ^ that works very well IME.

Agreed about replication.


By going with Apache Solr instead


Simple: just pick Postgres :)


postgres just because I'm more familiar with it, and the extension ecosystem (TimescaleDB, PostGIS, etc etc).

That said, I'm not going to be sad with MySQL, though I'd probably go with MariaDB just because it's fully open source (note: I don't know the details there, being a Postgres guy).


Same way you did in 2003: you pick Postgres. There have been zero times MySQL was the better choice.


You choose Postgres.


postgresql always, that’s how :)


Most of the answers here are choosing Postgres over MySQL, but there are cases where MySQL is far better than Postgres.

1. MySQL's replication is simpler, more reliable, and takes up less disk space than Postgres's.

2. The ability to control query optimization with optimizer hints. At scale, to lower tail latencies, you will definitely need to help the query planner, since you have domain knowledge about your app and queries that the planner doesn't have. Yes, in PG you can use pg_hint_plan, but it is still not an official solution. (See the sketch after this list.)

3. MySQL has better connection scaling without a separate component like pgBouncer. Also, the cost of connection creation is lower, due to MySQL's thread-per-connection model vs Postgres' process-per-connection model.

4. Collations suck in Postgres, since it decided to depend on the OS for collation support. Version updates of other packages (glibc) or the OS can corrupt your database due to this collation mess. The fix is to use ICU collations, but even they have multiple limitations (you can't specify a non-deterministic collation at the database level, etc.).

5. Postgres's MVCC implementation is the most inefficient among modern databases. [1] Not only is it inefficient, it causes maintenance headaches with managing and tuning autovacuum. It also causes increased disk usage and write amplification (the entire row is rewritten on each update, and every index is updated even if the column being updated is not indexed), plus increased bloat due to multiple dead copies of rows. If you use wide tables (100s of columns) with updates (not even high-frequency, just moderate updates), MySQL will be far better. Heap-only tuples (HOT) will only help if your rows are narrow and there is sufficient free space in the page of the row being updated. For wide tables, most often this is not the case, so even HOT won't help much. Almost everyone will have read Uber's story of moving to MySQL [2], but even if you are not at Uber scale, Postgres's MVCC implementation and its associated pains are better avoided. Unfortunately, attempts to fix this in Postgres (zheap) have long been abandoned.

6. In MySQL (InnoDB), rows are physically ordered by primary key. This improves cache-hit ratio and performance if most of your queries select or order by the PK, or by a prefix of the PK in the case of composite PKs.
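On point 2, a minimal sketch of MySQL hints (the orders table and index name are made up; the /*+ ... */ comment syntax is MySQL 5.7+):

  -- Classic index hint syntax: force the planner onto a specific index
  SELECT id, total FROM orders FORCE INDEX (idx_orders_created)
  WHERE created_at > '2023-01-01';

  -- Optimizer hint comment syntax: cap this query at 1000 ms
  SELECT /*+ MAX_EXECUTION_TIME(1000) */ id, total
  FROM orders WHERE created_at > '2023-01-01';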

So, if you need performance at scale, reliable replication for HA, less maintenance, and good connection scaling, go for MySQL. If your app depends on an extension or FDW that only Postgres has, then choose Postgres.

Often, people complain about some obscure SQL syntax that does not work in MySQL, or that Postgres implements more correctly, but mostly there will be an alternative you can use in MySQL.

[1] - https://ottertune.com/blog/the-part-of-postgresql-we-hate-th... [2] - https://www.uber.com/en-US/blog/postgres-to-mysql-migration/


By choosing Postgres. Next!


I choose Postgres every time, because that is what I am familiar with.

But nerdy stuff:

Postgres stores data linearly (in a heap - which has nothing to do with the heap data structure used for sorting; it just means a pile of data). If you need fast access to the data, you add secondary indexes - and the secondary indexes point to a location in the heap: "this is where you find the data".

MySQL stores data in a tree - a table is a tree sorted by primary key. You can create secondary indexes, and instead of a pointer they contain the primary key value.

That means, for example, that data with similar primary keys will be located physically near each other in MySQL but not in Postgres. At the same time, inserting new data with a random (like UUID) primary key in MySQL will write all over the table, but will mostly "append at the end" in Postgres.

Postgres also implements MVCC with Serializable Snapshot Isolation - so data that someone changes exists in multiple copies and needs to be cleaned up later, but readers don't block writers. MySQL (InnoDB) keeps old row versions in undo logs instead, so there's no duplication in the table itself, but it leans more heavily on row locks, so you might see transactions waiting for each other. I don't remember if MySQL implements proper serializable isolation - but that is not really the default on any database anyway.

Interestingly, Oracle has a very similar design to Postgres (though it uses a rollback segment for old data, so there's no bloat and no vacuum, but you might get a "snapshot too old" error), while MS SQL Server is also a tree- and lock-based database like MySQL.

Does this impact you? It might - like in cases where MySQL performs terribly due to UUID keys, or Postgres can't vacuum fast enough due to a high volume of updates. Or you're implementing money-settlement logic and need proper serializable transactions, who knows. But it is cool to know the implementation details.
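A minimal sketch of how that plays out in DDL (the table and column names are made up):

  -- MySQL/InnoDB: the table IS a B-tree ordered by the primary key,
  -- so random v4 UUID keys cause page splits all over the tree
  CREATE TABLE events (
    id BINARY(16) PRIMARY KEY,
    payload JSON
  );

  -- Postgres: the heap just appends new tuples regardless of key order;
  -- only the secondary B-tree index on id sees the random values
  CREATE TABLE events (
    id uuid PRIMARY KEY,
    payload jsonb
  );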


just pick MySQL. If you want 100 half-baked non-production-ready features pick Postgres.


MySQL for the quick and dirty and Postgres for anything else


Why is mysql better for quick and dirty? I feel like pg extensions offer a lot more "dirtiness" running inside pg than mysql has.


yeah, that's a weird take. if you want quick and dirty you use sqlite, and if you need something more you go with postgres. some replication things are nicer in mysql, apparently, but postgres is the better option for most workflows.


Are there easy offsite backup solutions for SQLite?

The easy deploy tools (DigitalOcean Apps, Dokku, Beanstalk, Heroku, etc.) destroy the instance (losing state). You need to configure an offsite backup tool for your DB, or you have to manually set up a server to live forever.

Having a separate DB machine is so much easier to set up, and it plays better with the DIY tools.


if you're working with ephemeral services like those, then no, you'd never use sqlite, because they are temporary. in my mind, quick and dirty means a backing store for something like a local script where you wouldn't want to spin up a full server, or a small webserver that you are running locally. if you're going to do a proper website, you'd ideally want something with more flexibility and data guarantees. sqlite is grand for a specific sort of read-heavy or single-writer workload.


Postgres doesn't have optimizer hints, and it can create a false impression of robustness until it messes up table statistics and does a FULL SCAN against a table with millions of rows, ignoring all indices. It happens super rarely, though, so if you run anything critical you'll probably be down for only a few hours once every year or two. Be ready for it to happen.

Apart from that (and noticeably higher memory consumption), Postgres is most likely preferable.



