I've used mysql a lot, and pgsql a little. I can't tell the difference, other than pgsql being slower and having less support. Some people swear by it, so I'm curious what I am missing.
In performance terms, granted, pg was a bit of a dog in those days, but it outperformed MySQL in every benchmark I could throw at it, with minimal tuning. So, if you're experiencing it as less performant, you're most likely running with a default configuration — which is deliberately tuned for something like a 486-class box with 128MB RAM or so — and/or haven't ANALYZEd your data. (Those are complete WAGs, knowing nothing more than what you've said about your scenario, but they tend to be among the more common reasons for lackluster performance.)
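For what it's worth, refreshing planner statistics is a one-liner (in Postgres you'd just run ANALYZE from psql). A minimal sketch of the idea using Python's built-in sqlite3 as a stand-in; the users table and index names are invented:

```python
import sqlite3

# Refresh planner statistics after a bulk load -- the sqlite3 stand-in
# for Postgres's ANALYZE. The users table here is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX users_name ON users (name)")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(i, "user%d" % i) for i in range(1000)])
conn.execute("ANALYZE")  # gathers per-index row statistics for the planner
stats = conn.execute("SELECT tbl, idx FROM sqlite_stat1").fetchall()
print(stats)  # one row per analyzed index, e.g. [('users', 'users_name')]
```

Without those statistics, any cost-based planner is guessing, which is why un-ANALYZEd data is such a common cause of lackluster performance.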
As for support, as a complete noob, I had a weird performance problem I couldn't make sense of, so I went to the mailing lists. Within a few hours, I was exchanging stack traces and other sundry debugging/profiling dumps with Tom Lane. (He's in the Wikipedia. Even if you've never touched PostgreSQL, you use his code every day of your life.) I don't think you can get much better support than that. Since then, I've never encountered a problem that I haven't been able to have addressed, or at least get pointed in the right direction, by asking on — or searching the archives of — the relevant mailing list.
From there, it was largely a matter of, "This is the one I already know how to use...", along with the better feature-set (not mentioned by any of the sibling posts thus far: transactional DDL); the lack of a known-evil corporate overlord who could pull the plug at any time; the consistent tens-of-percent performance improvements in every major release; a development community that will punt a feature to the next release if it's not 100% ready and provably correct; and, let's be honest, the fact that, as someone who's been doing pg work for as long as I have, I can command a very comfortable hourly rate on the basis of that depth of experience — particularly when it's been focused in high availability and replication.
Back then, MySQL also didn't support transactions, where Postgres did. Essentially MySQL was a dumb datastore with an SQL interface, whereas Postgres was a database.
That's changed now, but Postgres is still ahead in reliability (shit just doesn't break) and feature-set, and, I hear, in speed these days. Not that I care much about speed any more, as an SSD-backed Postgres handles anything I can possibly throw at it.
At present, it looks like the best choice is the forthcoming Intel 710 series drives, but if you need an SSD now, their 320 series, the Sandforce controller based drives with supercaps (like the OCZ Vertex Pro models, though I wouldn't touch those with a competitor's database), or a FusionIO card are the only remotely safe options.
The benefit that has in terms of increased concurrency is worth far, far more than having to implement a workaround for quick-and-dirty row counts for simple things like pagination, IMO.
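The usual workaround for quick row counts is a counter table maintained by triggers. A rough sketch of that pattern, using Python's sqlite3 as a stand-in (the table and trigger names are invented):

```python
import sqlite3

# Maintain a running row count in a side table via triggers, so that
# pagination-style counts are O(1) instead of a full MVCC scan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY);
CREATE TABLE items_count (n INTEGER NOT NULL);
INSERT INTO items_count VALUES (0);
CREATE TRIGGER items_ins AFTER INSERT ON items
BEGIN UPDATE items_count SET n = n + 1; END;
CREATE TRIGGER items_del AFTER DELETE ON items
BEGIN UPDATE items_count SET n = n - 1; END;
""")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(10)])
conn.execute("DELETE FROM items WHERE id = 3")
n = conn.execute("SELECT n FROM items_count").fetchone()[0]
print(n)  # 9
```

The trade-off is write overhead on every insert/delete, which is usually negligible next to the concurrency win.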
(Aside: InnoDB is also MVCC-based, which is why an exact row count is expensive and the quick counts it reports (e.g. via SHOW TABLE STATUS) are estimates. The MySQL folks apparently decided that it was better to provide a cheap estimate, while the PostgreSQL folks decided an exact COUNT(*) was worth the scan. There's a part of me that wants to call that symbolic of the way the two projects operate on a much broader level...)
The Percona team said that Oracle has a lot of customers who are using Oracle RDBMS for big stuff and MySQL for smaller stuff, and that they like being able to sell support for both, but they don't see MySQL as competing with their flagship DB. They also said that Oracle has so far kept its promise to keep developing MySQL, and that Percona has pulled in the upstream changes.
Besides that, if Oracle decides to stop supporting MySQL, Percona is a drop-in replacement, as is MariaDB, developed by some of the original MySQL developers. So you won't get stuck.
All of that said, however, I'm interested in PostgreSQL just because I've heard that it's a well-made database. There's nothing wrong with switching for other reasons; I just don't think that uncertainty about MySQL's future is a good reason right now.
• Subqueries are optimised as well as JOINs. I can go all-Inception in my queries and they perform well (I find subquery style often easier to understand than equivalent JOIN).
• You can do UPDATE … SELECT on the same table.
• Postgres has query rewrite (RULE) that can be used to implement writeable VIEWs (which is awesome for migrating legacy applications to new schema)
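To illustrate the second point: an UPDATE whose subquery reads the very table being updated, which classic MySQL rejects ("You can't specify target table for update"). A sketch using Python's sqlite3 as a stand-in; the scores table is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, pts INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
# Zero out below-average rows; the subquery reads the same table being
# updated -- fine in Postgres (and SQLite), rejected by classic MySQL.
conn.execute("UPDATE scores SET pts = 0 WHERE pts < (SELECT AVG(pts) FROM scores)")
pts = [r[0] for r in conn.execute("SELECT pts FROM scores ORDER BY id")]
print(pts)  # [0, 20, 30]
```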
The only thing I really miss from MySQL is ON DUPLICATE KEY UPDATE. Postgres has only weak 1-row hacks emulating this, and "standard" MERGE syntax for this is horribly ugly.
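The 1-row hack in question is usually "UPDATE first, INSERT if nothing matched". A sketch with Python's sqlite3 (the kv table is invented); note that without locking or a retry loop this is racy under concurrency, which is exactly why it's a hack:

```python
import sqlite3

def upsert(conn, key, value):
    # Classic upsert emulation: try an UPDATE, and INSERT only if no
    # row matched. Racy without a lock/retry loop around it.
    cur = conn.execute("UPDATE kv SET value = ? WHERE key = ?", (value, key))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO kv (key, value) VALUES (?, ?)", (key, value))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
upsert(conn, "a", "1")   # no row yet: falls through to INSERT
upsert(conn, "a", "2")   # row exists: UPDATE wins
v = conn.execute("SELECT value FROM kv WHERE key = 'a'").fetchone()[0]
print(v)  # 2
```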
Disclaimer: I don't think of myself as a database guy, and I couldn't give you a definitive list of the differences. I remember some times in the past when things like transactions, triggers, and stored procedures were on the list of differences the pgsql fans quoted to me.
Historically, PostgreSQL has been considerably more SQL-feature-rich than MySQL. It still is, though to a lesser extent (see previous paragraph) due to MySQL catching up. For example, MySQL has only had views, triggers and stored procedures since version 5, while Postgres had them back in 7.x/8.x, which were already mature when MySQL 5 was still in development. I guess a lot of people became converts at that time.
Subqueries, though we don't rely on them in production.
Wariness of Oracle's conflict of interest.
Worse still, subqueries in the FROM clause are (documented to be) implemented as an unindexed temp table.
Google apparently compiles subquery support out of their mysql instances so people don't mistakenly think they're usable.
That said, I'm generally more frustrated when using MySQL than when using PG. Here's a sample of the problems I've encountered from using both MySQL and PG. I haven't updated my list in a while now - please feel free to correct me on things - but hopefully it's a little more illustrative than that Wikipedia feature matrix, and a little more specific to MySQL vs. PG. (This list is a cleaned-up selection from my notes wiki at http://yz.mit.edu/notes/Hackery.)
No referential integrity (with MyISAM; InnoDB does enforce foreign keys).
No constraints (CHECK).
No sort merge join, let alone hash-join. http://www.dbms2.com/2008/07/10/how-is-mysqls-join-performan..., http://www.mysqlperformanceblog.com/2006/06/09/why-mysql-cou...
Generally poor at analytical workloads, since it's designed for transactional workloads.
Can't reopen a TEMP table, i.e., refer to it more than once in the same query - WTF? (Still not fixed!) http://bugs.mysql.com/bug.php?id=10327
Multiple storage engines have always restricted progress: http://www.mysqlperformanceblog.com/2010/05/08/the-doom-of-m... (PG also supported multiple storage engines in the '80s, then concentrated on one)
No WITH clause: http://stackoverflow.com/questions/324935/mysql-with-clause
Crappy errors: “Incorrect key file for table ‘stock’; try to repair it” on “alter table stock add constraint pk_stock primary key (s_w_id, s_i_id);”, where stock is in InnoDB (which has no “repair table”), actually means I've run out of /tmp space (and Google turns up nothing for it)
Crappy EXPLAIN output - somewhat better when using the visual-explain tool from Percona.
InnoDB auto-extends ibdata1 file; only way to trim (garbage collect) is dumping and loading.
Scoping is broken:
mysql> create table t(a int, b int); Query OK, 0 rows affected (3.30 sec)
mysql> select a, (select count(*) from (select b from t where a = u.a group by b) v) from t u;
ERROR 1054 (42S22): Unknown column ‘u.a’ in ‘where clause’
“InnoDB is still broken…Just last week we had to drop/re-create an InnoDB-table in one project because it would not allow to add an index anymore, no matter what we tried…Mysql::Error: Incorrect key file for table 'foo'; try to repair it: CREATE INDEX [...]” http://news.ycombinator.com/item?id=2176062
MySQL only recently got triggers and procedural language support at all, and its triggers are still row-level only (no per-statement triggers).
MySQL has only its own internal auth system, whereas PG supports a wide array of auth providers.
PG has a more flexible ALTER TABLE implementation.
MySQL doesn’t support ASC/DESC clauses for indexes http://explainextended.com/2010/11/02/mixed-ascdesc-sorting-...
Optimizer only recently started working properly with certain subqueries
OK documentation, but still considerably unpolished compared to PG's. Random omission: auto_increment jumps up to next power of 2 but inconsistently across versions (platforms?).
(Older issue, not sure if it's still relevant) Crappy concurrency, >3 cores sucks vs PG: http://spyced.blogspot.com/2006/12/benchmark-postgresql-beat...
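For contrast, the WITH clause complained about above is ordinary SQL in other engines. A minimal example via Python's built-in sqlite3 (the orders table is invented):

```python
import sqlite3

# A plain WITH (common table expression), the feature MySQL lacked.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 30), (3, 50)])
n_big = conn.execute("""
    WITH big AS (SELECT * FROM orders WHERE amount >= 30)
    SELECT COUNT(*) FROM big
""").fetchone()[0]
print(n_big)  # 2
```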
Some features that make Postgres awesome:
* Transactional DDL. You can do "create table" in a transaction. _Everything_ is transactional, it's not a tacked-on feature, it's the basis of everything.
* No legacy cruft. Compared to MySQL, which is filled to the brim with historical warts, the Postgres people have been careful to weed out obsolete functionality. There are essentially no sneaky edge cases that a developer needs to be aware of, no weird special values like "0000-00-00 00:00" having special meaning.
* No need for a "strict" mode, since Postgres is always strict. Postgres doesn't allow invalid dates, doesn't allow byte sequences that violate character encodings, etc. It diligently enforces constraints and generally doesn't allow you to screw up. To Postgres, data integrity is paramount.
* PostGIS. Simply awesome. (MySQL's geospatial stuff also tries to implement the OGC API, but last I looked, it was a half-hearted attempt that neglected to provide the fast R-tree-based (actually GiST-based) indexing that makes PostGIS so super fast.)
* Replication. It's late to the party, but I rather prefer how Postgres has implemented its replication, even though it has some downsides where it will abort a long-running query if some data has changed under its feet (but if you're using transactions it's easy to simply restart the query). 9.1 will be getting synchronous replication, which is pretty cool.
* Extensions. Postgres can integrate languages like R and Ruby as first-class languages that can be called from SQL. It also has a module system that can extend the type system (a bit of trivia: this was originally the main reason why Michael Stonebraker invented Postgres) with new types, e.g. multidimensional matrix columns, or new features, like remote tables.
* The "text" type. Seriously, why do people keep writing things like varchar(255)? Postgres' text type is an unlimited string. Unlike MySQL's text type, it can be efficiently indexed without limitations. (Varchar is internally implemented as a bounded text type.)
* Cost-based planner backed by row-level statistics. This is the stuff that allows Postgres to do complex nested queries and still perform incredibly well.
* Partial indexes. You can do something like "create index ... on themes (name) where color = 'blue'". Whenever you run a query whose conditions imply the index's WHERE clause, Postgres can use that index, potentially vastly reducing the search space.
* Functional indexes. You can do something like "create index ... on (lower(name))". If you then do a query such as "select ... where lower(name) = 'xyz'", then Postgres will recognize that it's the same expression, and it will be able to use the index.
* Windowing functions and recursive queries (recursive queries are from SQL:1999, window functions from SQL:2003). Look these up, they're great.
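Several of the features above can be tried in miniature. The sketch below uses Python's built-in sqlite3, which happens to share transactional DDL, partial/expression indexes, and recursive CTEs with Postgres; all table and index names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manage transactions by hand

# Transactional DDL: roll back a CREATE TABLE and it never happened.
conn.execute("BEGIN")
conn.execute("CREATE TABLE scratch (x INTEGER)")
conn.execute("ROLLBACK")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
assert "scratch" not in tables

# Partial and functional (expression) indexes.
conn.execute("CREATE TABLE themes (name TEXT, color TEXT)")
conn.execute("CREATE INDEX blue_idx ON themes (name) WHERE color = 'blue'")
conn.execute("CREATE INDEX lower_idx ON themes (lower(name))")

# Recursive query: generate 1..5 with a recursive CTE.
nums = [r[0] for r in conn.execute("""
    WITH RECURSIVE nums(n) AS (
        SELECT 1 UNION ALL SELECT n + 1 FROM nums WHERE n < 5
    )
    SELECT n FROM nums
""")]
print(nums)  # [1, 2, 3, 4, 5]
```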
There are some bad points, none of them significant and all of them a matter of taste:
- I have never really liked Postgres' text indexing, which feels a bit creaky and antique. At least with 8.x, GIN index updating was slow as hell.
- Partitioned tables are a great feature, but I will never use them because you have to do the plumbing yourself (creating partitioning rules and so on); I keep waiting for something like Oracle's automatic partitioning.
- Stored procedures -- i.e., running logic inside the database -- feel wrong to me, and always have. For some people this is a requirement, so I'm not really complaining. In some cases, writing a stored procedure can be essential to speed up queries/operations by saving on database roundtrips.
- Still no "upsert" SQL command (aka "insert or replace", "insert or update") for asserting the existence of a row atomically.