

Comparing MySQL and Postgres 9.0 Replication - rgeorge28
http://www.theserverside.com/feature/Comparing-MySQL-and-Postgres-90-Replication

======
dstorrs
My takeaways from this are:

1] MySQL supports more replication options, whereas PostgreSQL focuses on the
ones that 95% of users will care about and does not implement the rest.

2] MySQL's replication has several failure modes that can lead to data loss or
non-synchronization. For example, it is statement based by default, so any
statement which includes a call to NOW() will return different results on
master and slave. PostgreSQL also has failure modes, but the defaults are
designed to minimize them -- e.g. it always uses WAL and data file rewrites to
ensure that data is properly synced and consistent.

3] Due to points 1 & 2, if you aren't an expert and don't need the extra
features offered by MySQL, you are probably better off with PostgreSQL.

4] 3rd party tools exist for both servers which close many of the feature gaps
between them.
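
The statement-based hazard in point 2 can be sketched in a few lines of SQL (table and column names here are hypothetical):

    -- Unsafe under statement-based replication: UUID() and RAND() are
    -- re-evaluated on the slave, so master and slave can store different values.
    INSERT INTO tickets (token, weight) VALUES (UUID(), RAND());
    
    -- Row-based replication ships the resulting rows instead of the SQL text,
    -- sidestepping the problem (the format is switchable since MySQL 5.1).
    SET GLOBAL binlog_format = 'ROW';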

Could someone who is more knowledgeable than me tell me if this is a fair
reading?

~~~
spudlyo
1. I don't pretend to know what 95% of users will care about.

2. MySQL writes the timestamp for NOW() into the binlog, so provided your
master and slave are in the same timezone, it's replication safe. Functions
like UUID() and RAND() will behave as you describe.

3. I disagree. MySQL's replication is more mature and more widely understood,
and it has more documentation than PostgreSQL's.
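
On point 2, what the binlog records looks roughly like this in mysqlbinlog
output (positions and timestamps here are made up); the SET TIMESTAMP line is
what makes NOW() deterministic on the slave:

    # at 256
    #101005 12:34:56 server id 1  end_log_pos 380  Query  thread_id=7
    SET TIMESTAMP=1286274896/*!*/;
    INSERT INTO t (created_at) VALUES (NOW())/*!*/;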

~~~
moe
I disagree strongly with "more mature", "widely understood" and "more
documentation".

1. Mature?

I have seen MySQL replication blow up in so many awful ways, it's not funny.
And yes, I've seen it happen on 5.* deployments. The worst part is that MySQL
hardly ever detects a problem, and you hardly ever know _why_ it went wrong -
it just silently corrupts. The only way to detect the corruption is through
regular checksum runs, which can be costly on large databases (there's a reason
why the checksum utility from Maatkit has fairly sophisticated options for
incremental checksumming).
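
For reference, the Maatkit workflow looks roughly like this (host names and
the checksum table name are placeholders, and the exact options may vary
between versions):

    # On the master: checksum tables in chunks, writing the results into a
    # table that itself replicates down to the slaves.
    mk-table-checksum --replicate=test.checksum --chunk-size=10000 h=master.example.com

    -- On each slave: rows where the slave's CRC differs from the master's.
    SELECT db, tbl, chunk FROM test.checksum WHERE this_crc <> master_crc;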

2. Widely understood?

I challenge you to prove your understanding and explain only a small subset of
the functionality: Please enumerate all failure modes that can lead to
corruption in statement based replication, and how to avoid them.

3. More documentation?

That has to be a joke. The MySQL documentation is a _mess_. It is poorly
structured, poorly written and full of conflicting and misleading bits. Please
show me the MySQL document about replication that comes remotely close to
<http://www.postgresql.org/docs/9.0/static/hot-standby.html> in terms of
clarity and exhaustiveness. Notice the long section about "query conflicts" -
where is the MySQL equivalent?

~~~
spudlyo
1. This is the first release of baked-in replication for PostgreSQL. MySQL
has had years to evolve its approach and to fix bugs. That's not to say
either system is perfect, but their relative maturity seems obvious.

2. No. It's widely understood because MySQL has a much larger mindshare than
PostgreSQL. Statement based replication has its faults, but complexity isn't
one of them.

3. It's not a joke. There is _more_ documentation. There is the official
documentation, but if you don't like that there are plenty of blog posts,
books, and recorded talks on the subject.

~~~
moe
_MySQL has had years to evolve their approach and to fix bugs._

That sounds nice in theory, but it completely ignores the reality of the
respective code bases. Features in MySQL have a tendency to rot rather than
mature. It could hardly be more obvious than in the comparison of this
version 1 of Postgres replication with the "evolved" MySQL equivalent.

Postgres may have the smaller feature set, it may be the first version, yet
I'm still inclined to trust it a lot more than what MySQL has to offer. Simply
because postgres has a track-record of being rock solid and MySQL, well, not
so much.

Sure, in theory you can build fancy rings, cascades and even multi-master
topologies in MySQL. But what is that worth when even the simplest
master/slave mode still randomly and silently desyncs in various, undocumented
situations?

_Statement based replication has its faults, but complexity isn't one of
them._

There's a nice quote from Alan Perlis:

    Fools ignore complexity,
    experts avoid it;
    geniuses remove it.

Some complexity is inherent to replication. The question remains which parts,
if any, they removed and which parts they ignored...

 _It's not a joke. There is more documentation._

Sorry, measuring documentation in terms of quantity sounds like yet another
joke to me. As you can tell by my snarky comments, I'm responsible for
babysitting the odd MySQL cluster in my day job (and Postgres, too, so I have
a frame of reference). It is indeed true that for most common problems I'll
find the solution not in the official docs but rather on some more or less
obscure forum, in broken English, linked from page >3 of the Google search
results (after 2 pages of outdated and plain wrong information).

So, in all fairness, yes, most _common_ problems can eventually be solved that
way. But good luck debugging a not-so-common ring, multi-master, or otherwise
non-standard scenario that way. Or encoding issues. Or silent truncation. Or
hot backups that are strangely much smaller than they should be. Or the
dreaded 'mysql' database being screwed after restoring a backup to a newer
minor version...

Anyway, the way Postgres handles this seems more sensible to me. In Postgres,
the overwhelming majority of the "common" MySQL issues simply don't exist.

~~~
spudlyo
You win. Your sweeping generalizations, personal anecdotes, and FUD have won
me over. _Bravo_.

~~~
wanderr
Do you have any experience with MySQL replication? I do, and mine aligns with
what moe is describing. Nice in theory isn't worth much if it's crappy in
reality.

~~~
spudlyo
Yes. Over the last five years I've supported replication on hundreds of
servers, powering sites you've probably used. I've had to clean up after
crashes with unsynched binlogs, writes that completed partially on master,
corrupted relay logs, failed hubs, and endless inconsistencies for a myriad of
reasons.

Guess what, things have gotten incrementally better and continue to improve.
Bugs get found and fixed. MySQL replication is mature enough to power huge
sites like Twitter, Facebook, and countless others. There has been exactly
_one_ PostgreSQL release that supports replication suitable for read scaling.

~~~
wanderr
I measure stability by how well something works, not by how many releases it
has had. MySQL replication is getting better, but it still has major
problems. Sure, it's good enough to power huge sites (we use it here as
well), but it's frustrating to deal with. Fortunately it's being used by Web
2.0 companies and not banks, so if a slave is corrupted and returns weird
results, customers generally aren't going to be too upset, and in a lot of
cases they might not even notice. It's still an annoying problem to deal
with, though. I wonder how many more releases it's going to take for them to
get it solid.

------
sph
This is a wonderful in-depth article, and the PostgreSQL team deserves a lot
of kudos for the work they put into this. I'm looking forward to using it.
One thing MySQL offers that I find interesting is multi-master and circular
replication. During DjangoCon 2008, Cal Henderson noted that Flickr uses
multiple masters replicating to slave servers in their setup (a great
presentation if you haven't seen it:
<http://www.youtube.com/watch?v=i6Fr65PFqfk>).

Granted, multi-master setups aren't needed for most sites. And since
PostgreSQL's new WAL replication makes sure that the WAL is written on the
slave before the transaction is committed on the master, data integrity
should be top notch. It's exciting to see PostgreSQL's development
progressing so well.
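
For anyone curious, the 9.0 streaming setup described above boils down to a
handful of settings (host and user names here are placeholders):

    # postgresql.conf on the master
    wal_level = hot_standby        # log enough detail for a standby
    max_wal_senders = 3            # allow streaming connections from slaves

    # postgresql.conf on the slave
    hot_standby = on               # serve read-only queries during recovery

    # recovery.conf on the slave
    standby_mode = 'on'
    primary_conninfo = 'host=master.example.com port=5432 user=replicator'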

~~~
wiredfool

      it makes sure that the WAL is written on the slave
      before the transaction is committed on the master 
      means that data integrity should be top notch.
    

That's not my reading of the documentation. The slaves may delay applying a
transaction for some configurable period of time to allow conflicting read-
only queries to finish. Slaves can also disconnect and reconnect at will
without stopping the master.
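
The configurable delay mentioned here is governed by two 9.0 settings on the
standby:

    # postgresql.conf on the standby
    max_standby_streaming_delay = 30s   # how long WAL replay waits for
                                        # conflicting read-only queries
    max_standby_archive_delay = 30s     # same, for WAL restored from archive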

~~~
sph
The slaves may delay applying the transaction, but the data is already on the
slave machine. As the article points out, the data gets written to the WAL on
the slave before the transaction is committed on the master. The slave might
not update the database from the WAL immediately, but the data _is_ on the
slave in case the master goes down.

Basically, it means that there is the potential for replication lag and so you
have to make sure you query from the master for just-updated data. However, it
also means that if the master goes down, any data committed on the master is
already on the slave in the WAL, just possibly not applied yet. So, it doesn't
isolate you from replication lag, but it is a nice thing when it comes to
making sure that the data remains intact even if a complete failure of the
master occurs.
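
That received-but-not-yet-applied gap is visible directly on a 9.0 standby
via two functions:

    -- WAL position received and synced to disk from the master
    SELECT pg_last_xlog_receive_location();

    -- WAL position last replayed into the database; if it trails the
    -- receive position, data is on the slave but not yet applied
    SELECT pg_last_xlog_replay_location();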

~~~
wiredfool

       As the article points out, the data gets written to the WAL on
       the slave before the transaction is committed on the master
    

Then the article is wrong; it doesn't. I can reboot or kill the slave and the
master will still happily accept transactions. It's also pretty clear from
tcpdump that the data only goes over on commit, and that it flows one way,
from master to slave.

