
Bucardo 5, async replication for PostgreSQL with true multi-master - rosser
http://blog.endpoint.com/2014/06/bucardo-5-multimaster-postgres-released.html
======
mbell
I'm a bit unclear what they mean by 'true multi-master'. The overview and FAQ
lead me to believe this is a Perl script with its own _singular_ db sitting
between multiple databases. Specifically, what are the CAP guarantees for this
system? At a surface level it almost seems like it's split-brained by design.

~~~
rosser
Multi-master means exactly what you'd expect: N (where N >= 2) masters, each
of which can accept writes, which will be propagated to all the other nodes
Bucardo knows of.

The "_singular_" db is Bucardo's configuration store, and is otherwise
uninvolved in replication.

~~~
mbell
> Multi-master means exactly what you'd expect: N (where N >= 2) masters

But the FAQ says it only supports 2 masters?

> each of which can accept writes, which will be propagated to all the other
> nodes Bucardo knows of.

How are conflicts dealt with? Is the resolution serializable?

> The "_singular_" db is Bucardo's configuration store, and is otherwise
> uninvolved in replication.

If it fails does sync stop? If sync stops how is healing performed?

I appreciate the response but it doesn't really tell me anything about how I
can expect the system to perform with nodes/network failures. I swept through
the docs but didn't see much of anything on the topic.

~~~
rosser
The sibling comment is correct: the FAQ is out of date. Bucardo 4 supported
only 2 masters; B5 is N >= 2.

I'll try briefly to answer your questions, having used Bucardo for several
years, having a (woefully under-utilized) commit bit, and being a somewhat
active participant on the mailing list and in #bucardo.

Conflicts are handled in one of a number of user-selectable ways ("latest"
wins, "source" wins, etc.). They're selectable on a per-table basis.

If the configuration DB were to stop, I don't actually know what would happen.
I've never faced that scenario. AIUI, however, Bucardo only reads its
configuration on startup, so it shouldn't break anything unless you were to
try to change your replication configs (or restart, which doesn't typically
need to happen), at which point you'd be able to address it.

If syncing stops, "deltas" (which are merely a timestamp and the pkey of the
changed record) accumulate in their source databases until syncing is resumed.
Healing consists of those deltas being replayed against their targets, as
managed by the selected conflict resolution strategy.
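
To make the delta idea concrete, here is a minimal Python sketch of replaying accumulated deltas under a pluggable conflict strategy. It is an illustration only, not Bucardo's actual code or API: the `Node` class, `sync`, `latest_wins`, and `make_source_wins` are all hypothetical names, rows live in in-memory dicts rather than in Postgres, and deletes are not handled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one master database."""
    name: str
    rows: dict = field(default_factory=dict)    # pkey -> current row value
    deltas: dict = field(default_factory=dict)  # pkey -> timestamp of last local change

    def write(self, pkey, value, ts):
        # A local write changes the row and records a delta (pkey + timestamp only).
        self.rows[pkey] = value
        self.deltas[pkey] = ts


def latest_wins(candidates):
    """Pick the node whose change to this pkey has the newest timestamp."""
    return max(candidates, key=lambda c: c[1])[0]


def make_source_wins(source_name):
    """Build a strategy that prefers one named node, else falls back to latest."""
    def pick(candidates):
        for node, _ts in candidates:
            if node.name == source_name:
                return node
        return latest_wins(candidates)
    return pick


def sync(nodes, resolve=latest_wins):
    """Replay all pending deltas so every node ends up with the winning rows."""
    changed = {}  # pkey -> list of (node, timestamp) that touched it
    for node in nodes:
        for pkey, ts in node.deltas.items():
            changed.setdefault(pkey, []).append((node, ts))

    for pkey, candidates in changed.items():
        # Only one writer means no conflict; otherwise ask the strategy.
        winner = candidates[0][0] if len(candidates) == 1 else resolve(candidates)
        for node in nodes:
            if node is not winner:
                node.rows[pkey] = winner.rows[pkey]

    for node in nodes:
        node.deltas.clear()  # deltas are consumed once replayed


# Deltas accumulate while syncing is stopped, then one sync heals all nodes.
a, b, c = Node("a"), Node("b"), Node("c")
a.write(1, "from a", ts=10)
b.write(1, "from b", ts=20)   # conflicts with a's write to pkey 1
c.write(2, "only c", ts=15)   # no conflict
sync([a, b, c])
```

After the `sync` call all three nodes hold the same rows, and a's write to pkey 1 has been overwritten by b's newer one. In this sketch, as with any "pick one winner" strategy, the losing write is simply discarded.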

------
philjackson
"sudo chmod 777" \- ALERT!

------
glogla
We need to get Aphyr to do his funny picture magic with this before we can
trust it.

~~~
teraflop
Just looking at the documentation is enough to make me wary: all of the
default conflict resolution strategies[1] will, by design, throw away data in
the event of a conflict (except for "skip" and "abort", which probably don't
do what you want either).

Supposedly you can write a custom conflict handler (in Perl, naturally). But I
find it worrying that a feature that's so critical for correctness is also
entirely undocumented. I found an open issue to write some docs[2], which
refers the curious to a single unit test that was deleted in 2011 with the
commit message "Spelling cleanups, other maintenance" [3]. Doesn't inspire a
lot of confidence.

[1]: http://bucardo.org/wiki/Swap
[2]: https://github.com/bucardo/bucardo/issues/61
[3]: https://github.com/bucardo/bucardo/commit/5eeb51d5bb7dd379bbe5ca205bbd80187debbd4b

------
opendais
I wonder what performance of Bucardo + Postgres is like vs. Galera + MariaDB.
I'm mainly curious if switching from sync to async would be worth the
performance benefits. Then again, I suppose since Bucardo can handle
MySQL/MariaDB it'd be fairer to compare Galera vs. Bucardo on MySQL.

~~~
dscrd
Does Galera still suffer from the "gap locks" and related deadlocks? I was
quite horrified to find this out.

I would hope that that's not a problem with Bucardo or in fact any serious
database system.

~~~
opendais
You mean InnoDB's gap locks?

http://www.mysqlperformanceblog.com/2012/03/27/innodbs-gap-locks/
http://pooteeweet.org/blog/745

If you actually have that problem in production, it is because you are
spreading the writes out to all of your Galera servers and simultaneously
doing conflicting updates/transactions.

I'm uncertain what exactly you expect any database to do in this situation?
Silently discard one?

~~~
dscrd
Spreading writes to all servers: yes. Conflicting updates: no; this problem
happens when doing simple inserts.

See http://www.toofishes.net/blog/mysql-deadlocking-simple-inserts/ -- that's
an old blog post, but this still happens with reasonably recent MySQL and
MariaDB. I have no reason to believe that it's not a fundamental design
failure. It's of course not a Galera issue per se, but rather a more serious
InnoDB issue, but it causes problems very easily when using Galera.

Perhaps there's a workaround that I don't know of yet? I'd be happy to hear of
it. Perhaps it's a fundamental RDBMS problem that I don't understand yet? I'd
be happy to be educated if so.

As to your last question, I'd expect the database system to accept all such
inserts to all instances since they don't conflict with each other in any way.

------
__john
It doesn't look like they've updated their changelog page yet:
http://bucardo.org/wiki/Bucardo/Changes
Does anyone know where a list of changes can be found?

~~~
rosser
https://github.com/bucardo/bucardo/blob/master/Changes

(Note that Bucardo is mirrored to GitHub; the official repo is on
bucardo.org.)

------
rpedela
Are DDL statements replicated yet?

~~~
rosser
No. Unfortunately, that's a limitation in Postgres itself. It's being worked
on, or at least discussed; see "Event Triggers" for more info on that.

~~~
spacemanmatt
Event triggers are available starting with PostgreSQL 9.3.

