Is Bi-Directional Replication in Postgres Transactional? (sdf.org)
39 points by craigkerstiens on Jan 5, 2016 | 17 comments



> It’s better to think of BDR like other eventually consistent data stores […]

with the exception that this will never be consistent.

Silently corrupting data goes totally against the usual Postgres spirit (which might be one of the reasons why the BDR patches haven't been accepted yet). Normally, if $THING can't be guaranteed to be done right, then Postgres just doesn't offer the feature (or throws a specific error).

If handling transactions isn't feasible for BDR, then it should refuse BEGINs instead of accepting them and silently corrupting data.

If handling transactions across tables isn't feasible for BDR, it should blow up once you touch more than one table, forcing you to roll back the whole transaction (which has so far touched only one table).

It's totally fine if your tool can't do $THING, especially in matters of multi-master replication, which is a really hard problem. However, I would really want your tool to be honest about its deficiencies and tell me, so I can try to find a workaround.

(Of course this could also just be a "simple" bug, at which point you can disregard this rant and just fix the issue :p)


> Silently corrupting data goes totally against the usual Postgres spirit (which might be one of the reasons why the BDR patches haven't been accepted yet). Normally, if $THING can't be guaranteed to be done right, then Postgres just doesn't offer the feature (or throws a specific error).

There's a good set of scenarios, particularly globally distributed datastores, where latency makes it infeasible to make strong guarantees. That's what BDR was designed for. That's why you can configure conflict handling and add conflict handlers to resolve conflicts.
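
For what it's worth, that looks roughly like the following. This is a minimal sketch from my memory of the BDR docs - the registration call, the handler signature, and the enum values may differ between versions, so verify against your version's documentation before relying on any of it:

    -- Sketch from memory of the BDR docs; assumes a hypothetical table
    -- like: CREATE TABLE inventory (..., updated_at timestamptz).
    -- Keep whichever version of a conflicting row was updated last.
    CREATE OR REPLACE FUNCTION inventory_newest_wins(
        row1 inventory, row2 inventory,
        table_name text, table_regclass regclass,
        conflict_type bdr.bdr_conflict_type,
        OUT row_out inventory,
        OUT handler_action bdr.bdr_conflict_handler_action)
    LANGUAGE plpgsql AS $$
    BEGIN
        IF row1.updated_at >= row2.updated_at THEN
            row_out := row1;
        ELSE
            row_out := row2;
        END IF;
        handler_action := 'ROW';  -- "use row_out", as I recall
    END;
    $$;

    -- Registration call as I remember it; check your BDR version.
    SELECT bdr.bdr_create_conflict_handler(
        ch_rel  := 'inventory',
        ch_name := 'inventory_newest_wins_handler',
        ch_proc := 'inventory_newest_wins(inventory, inventory, text, regclass, bdr.bdr_conflict_type)',
        ch_type := 'update_update');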

And it's not "silently corrupting" things - it's logging the conflicts to postgresql's log and, if configured, to a historical table.
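
If I remember the knobs right, that's bdr.log_conflicts_to_table plus the bdr.bdr_conflict_history table - names and columns from memory, so treat this as a sketch:

    -- postgresql.conf, as I recall from the BDR docs:
    --   bdr.log_conflicts_to_table = on

    -- Conflicts can then be audited after the fact
    -- (column names approximate):
    SELECT conflict_type, conflict_resolution,
           object_schema, object_name, local_time
    FROM bdr.bdr_conflict_history
    ORDER BY local_time DESC
    LIMIT 20;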

> If handling transactions isn't feasible for BDR, then it should refuse BEGIN's instead of accepting them and silently corrupting data.

That just depends on what you define as "handling transactions". You can have multi-statement transactions which are atomic - something rather useful for a lot of use cases. Other sessions, on any node, will never see a partially applied set of DML from within a transaction. Also useful.
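
Concretely (table names made up here), a transaction like this is applied atomically on every node, even though cross-node conflicts are still resolved row by row:

    BEGIN;
    -- Both statements become visible together on each node, or not
    -- at all; no session ever sees the ledger row without the
    -- account update.
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    INSERT INTO ledger (account_id, delta) VALUES (1, -100);
    COMMIT;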

I agree that it'd be helpful to expand BDR docs on the specifics of what consistency to expect when doing what.


>And it's not "silently corrupting" things - it's logging the conflicts to postgresql's log and, if configured, to a historical table.

I stand corrected. Still, it would be useful if there were a mode where it blew up, because that would make sure the data is always consistent even if the application forgets to check the historical table.


Oh and:

> which might be one of the reasons why the BDR patches haven't been accepted yet

That's primarily because they've not been submitted as a whole. Many individual parts of BDR have been submitted & integrated into Postgres. Simplified versions of others (the logical decoding output plugin, the apply side, replication set management) have been submitted recently and are being reviewed. I don't think any major part has been wholly rejected - changed during review, sure, and heavily so.


Agreed. This is the sort of thing you would expect from a Postgres fork, but not Postgres itself.


What does the author think should happen here?

Two servers independently committed conflicting transactions, which is always a possibility for multi-master in the general case. You can't have independence and dependence on the same variable at the same time.

Now, maybe BDR could do better for the case the author is concerned about, but the author hasn't explained what it should do and why it's better.

The author seems to want the entire conflicting transaction to be rejected, but it's already been committed on the remote node, so I don't see how that helps. The nodes will never be back in sync until you sort it out.

You should use BDR in cases where you can come up with some reasonable conflict resolution. For instance, allow inventory to go negative, and inform one customer of a delay with their order.
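
A sketch of that pattern, with made-up tables: model orders as inserts (inserts don't conflict across nodes if the keys are globally unique) and let derived stock go negative:

    -- Globally unique keys so concurrent inserts never collide.
    CREATE EXTENSION IF NOT EXISTS pgcrypto;  -- for gen_random_uuid()
    CREATE TABLE stock  (item_id int PRIMARY KEY, on_hand int NOT NULL);
    CREATE TABLE orders (
        order_id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
        item_id  int NOT NULL,
        qty      int NOT NULL
    );

    -- Remaining stock may go negative once all nodes' orders have
    -- converged; a periodic query finds oversold items so a customer
    -- can be told about the delay.
    SELECT s.item_id, s.on_hand - coalesce(sum(o.qty), 0) AS remaining
    FROM stock s LEFT JOIN orders o USING (item_id)
    GROUP BY s.item_id, s.on_hand
    HAVING s.on_hand - coalesce(sum(o.qty), 0) < 0;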


What the author of that blog post expects isn't realistically possible in a general async multi-master system. You can't really have conflict-free changes, consistent across values to boot, across distributed nodes unless you: a) are synchronous in some way (e.g. distributed locks) - which you don't really want in a geographically distributed system because of the heavy latency impact - or b) partition your data in a way that doesn't require any changes to span nodes.

In many scenarios these restrictions are unacceptable, and dealing with potential inconsistencies is the smaller (but by no means small) evil. If you indeed need it, you need to design your application accordingly.

There are workloads where it's easy to deal with such problems. If there are only inserts and no constraints, there's often not much to do. In other cases you can simply accept rare inconsistencies.

Quite often the majority of writes don't need to care, but a small portion needs to be much more careful, e.g. by using distributed 2PC, or by always going through a specific node.
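
Postgres already ships the primitive for that careful minority: two-phase commit (requires max_prepared_transactions > 0; identifiers below are made up):

    -- Run on each participating node by an external coordinator:
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    PREPARE TRANSACTION 'xfer-42';  -- persisted; survives a crash,
                                    -- keeps its locks held

    -- Only once every node has prepared successfully:
    COMMIT PREPARED 'xfer-42';      -- or ROLLBACK PREPARED 'xfer-42'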


But how could they really be? Wouldn't you always come up against the CAP theorem (treating delays like short partitions)?


The thing that I can't find a simple answer to, and it bugs me: master-master replication is pretty common in MySQL. Is the MySQL implementation inherently flawed somehow but people are still using it anyway (which would be against the spirit of PG, and rightly so), or is it something about Postgres that makes it specifically difficult to implement?


Master-Master in plain MySQL is incredibly hackish, but it's not that hard to design your database schema around these limitations and avoid conflicting writes in most real-life scenarios.
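
For instance, the classic trick of interleaving auto-increment values so two masters can never hand out the same primary key:

    -- node A (or set in my.cnf): generates ids 1, 3, 5, ...
    SET GLOBAL auto_increment_increment = 2;
    SET GLOBAL auto_increment_offset    = 1;

    -- node B: generates ids 2, 4, 6, ...
    SET GLOBAL auto_increment_increment = 2;
    SET GLOBAL auto_increment_offset    = 2;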

Then there is Oracle's MySQL Cluster (never seen that in production, no idea how well it works) and the Galera-based clusters such as Percona XtraDB Cluster and MariaDB Galera Cluster, which… well: https://aphyr.com/posts/328-jepsen-percona-xtradb-cluster


>is the MySQL implementation inherently flawed somehow but people are still using it

Yes.

What MySQL calls "master master" is actually "master slave".


Programmers worry too much about achieving perfect instant consistency, which is impossible anyway, because so far Einstein's theory of special relativity holds true. The priority of the database needs to be to accurately capture and store facts as they occur in the real world. Yes, certain constraints may be breached, but as long as the activity is tracked correctly, these can be resolved later.

Bank systems are not consistent. With HSBC, for example, even if I do an inter-account transfer, the lag is often several minutes; international payments are slow, antiquated, and involve many human interactions. Checks are even worse: as a student I could go way over my overdraft limit because I deliberately got extra checkbooks at the start of term, and then with my check guarantee card I could cash £50 for each check, even at my own bank, which knew I was over my limit. But of course I was still liable to repay it all, and I did in full, because every check was recorded accurately.


Bolting replication onto an existing product is very hard. It usually has to sit at the core of the product to be done right. It seems this sort of feature cuts vertically from the core all the way to the user-level API.


None of this is surprising. I'm not sure what the author was expecting? Or if they're simply clarifying the scenario for others as a warning.

> "You would be paying that 70ms cost four different times."

Right, and either you're willing to pay that cost OR temporary inconsistency is acceptable to you. BDR doesn't warp the speed of light.

Also: If bank accounts are involved, inconsistency is not acceptable. Someone will bankrupt you. That's not BDR's fault.


> Also: If bank accounts are involved, inconsistency is not acceptable. Someone will bankrupt you. That's not BDR's fault.

Fun fact: a large portion of bank accounting systems are not fully consistent. It's a bit scary at first, but if you think about it, it's not that surprising: e.g. not all ATMs and shops are always online, particularly in the past. Many systems also couldn't, and some still can't, keep up with the full load at peak times in a central manner.


Note that logical decoding (the rather unfriendly name for the API that lets you subscribe to streams of changes coming out of Postgres) does demarcate transactions. The lack of transactionality in BDR is not the API's fault.
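
Easy to see with the test_decoding example plugin that ships with 9.4+ (needs wal_level = logical and a free replication slot; table and slot names here are made up):

    CREATE TABLE t (id int);
    SELECT pg_create_logical_replication_slot('demo', 'test_decoding');

    BEGIN;
    INSERT INTO t VALUES (1);
    INSERT INTO t VALUES (2);
    COMMIT;

    -- Each transaction arrives bracketed by BEGIN/COMMIT markers
    -- (the xid shown is illustrative):
    SELECT data FROM pg_logical_slot_get_changes('demo', NULL, NULL);
    --  BEGIN 705
    --  table public.t: INSERT: id[integer]:1
    --  table public.t: INSERT: id[integer]:2
    --  COMMIT 705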


Indeed. BDR also doesn't even have that fault in general. Conflict handling happening on a row basis and not having any form of transactions are two very different pairs of shoes.

FWIW: I'm not particularly happy about the name either. But we fought over names so long that agreeing on something subpar was better than not having the feature. It was originally named changeset extraction. Not sure if that'd have been better.



