

Databases: The Einstein Hypothesis - Distributed Transactions Cannot Exist - cassandravoiton
http://nerds-central.blogspot.com/2008/06/databases-einstein-hypothesis.html
WTF! I always thought distributed transactions were BS. If only I understood this post, I guess I could now prove it _giggle_ :)
======
cx01
The author is wrong. Distributed transactions are possible, for example using
the Paxos algorithm. In Paxos, if the network fails, the system will block
(i.e. become unavailable) until connectivity is restored. It's not possible
for one node to commit and the other node to abort.

There is even a paper about using Paxos specifically for distributed commit:
[http://research.microsoft.com/apps/pubs/default.aspx?id=6463...](http://research.microsoft.com/apps/pubs/default.aspx?id=64636)
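
To make the blocking claim concrete, here is a minimal single-decree Paxos sketch in Python. It is a toy under stated assumptions: everything runs in one process, a `reachable` flag stands in for the network, and the names `Acceptor` and `propose` are hypothetical. It is not the Paxos Commit protocol from the linked paper, only an illustration that a proposer without a majority gets no decision, and that a value, once chosen, is not overturned by a later proposer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                    # highest ballot this acceptor has promised
    accepted_ballot: int = -1             # ballot of the value it accepted, if any
    accepted_value: Optional[str] = None
    reachable: bool = True                # False models a network partition (assumption)

    def prepare(self, ballot: int) -> Optional[Tuple[int, Optional[str]]]:
        """Phase 1: promise to ignore lower ballots; report any accepted value."""
        if not self.reachable or ballot <= self.promised:
            return None
        self.promised = ballot
        return (self.accepted_ballot, self.accepted_value)

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept the value unless a higher ballot was promised."""
        if not self.reachable or ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted_ballot = ballot
        self.accepted_value = value
        return True


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Return the chosen value, or None if no majority answered (blocked)."""
    majority = len(acceptors) // 2 + 1
    promises = [p for p in (a.prepare(ballot) for a in acceptors) if p is not None]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, the proposer must adopt the
    # one with the highest ballot instead of its own.
    _, prior_value = max(promises, key=lambda p: p[0])
    if prior_value is not None:
        value = prior_value
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None


nodes = [Acceptor() for _ in range(3)]
nodes[1].reachable = nodes[2].reachable = False      # partition: one node left
blocked = propose(nodes, ballot=1, value="commit")   # no majority, no decision
nodes[1].reachable = nodes[2].reachable = True       # connectivity restored
first = propose(nodes, ballot=2, value="commit")
second = propose(nodes, ballot=3, value="abort")     # later proposer adopts "commit"
```

During the partition `propose` returns `None` rather than a guess, which is the "block until connectivity is restored" behaviour described above.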

~~~
cassandravoiton
You are, sorry to say, utterly incorrect. The Paxos algorithm offers a clever
way of making the choice of a single transactional arbitrator dynamic. For each
transaction, it might be different, but for a single transaction, Paxos is
only stable if the transactional co-ordinator is stable. If the network fails
then an unknown state can be reached and the entire system will have to
restabilize to the values held in the single co-ordinator for the broken
transaction.

~~~
cx01
Did you even read the Paxos Commit paper? There is no single arbitrator. And
what is 'unknown state' intended to mean? The system will obviously always be
in a known state. It will just block as long as there is no network
connectivity.

~~~
cassandravoiton
Yes, I did read it... "

In practice, it is not difficult to construct an algorithm that, except during
rare periods of network instability, selects a suitable unique leader among a
majority of nonfaulty acceptors. Transient failure of the leader-selection
algorithm is harmless, violating neither safety nor eventual progress. One
algorithm for leader selection is presented by Aguilera et al. [1] "

" The algorithm satisfies Stability because once an RM receives a decision
from a leader, it never changes its view of what value has been chosen "

Fundamentally, the idea is that the leader is the transactional arbitrator
using the RM as the recorder of that transaction.

------
nerds-central
Thanks - but it really is not that complex. You cannot define the state of two
distant points simultaneously when the speed of information transfer in
space/time is finite. So all transactions collapse to being controlled from
one point, not distributed. The reality of distributed transactions is that
they make failover to the single transactional arbitrator less common; but
the need for that single arbitration point never goes away.

------
beza1e1
Disagree. The underlying problem is consensus [0], which can usually be
solved. For example, read about bitcoin, which got some attention here
recently. It provides global money transactions within a distributed system.

[0] <http://en.wikipedia.org/wiki/Consensus_(computer_science)>

~~~
JanezStupar
The OP is writing about transactions.

Of course distributed systems can be built, of course there are mechanisms to
reach consensus between nodes.

However such a system needs to be designed from ground up to cope with
partitioning. And one MUST accept that there will be conflicts in the system -
and that some of them won't be resolvable automatically or in a timely fashion.
It's a trade-off that cannot be avoided.

And I have sat in more than one meeting where customers and my bosses
_demanded_ that I violate laws of physics and provide them with a distributed
solution that will have no errors whatsoever.

Edit: And work faster than a single node solution to top it off.

~~~
cassandravoiton
Which is why I wrote the post. I got fed up with repeating myself as people
demanded the same of me!

~~~
nerds-central
You mean - when you linked it! I seem to remember writing it myself.

------
david927
For KayaDB, we handle it as follows: elements involved in the transaction are
all duplicated onto a single server, the original values change from scalar to
a reference, redirecting to the new location, the transaction is executed on
the single server, the elements are then re-propagated.
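
The four steps described above can be sketched roughly as follows. This is an illustration only, not KayaDB's code: `Node`, `Ref`, and `run_on_single_server` are hypothetical names, the "servers" are in-memory dictionaries, and the home server is assumed to be separate from the nodes that own the data.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Node:
    name: str
    data: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Ref:
    """Placeholder left behind while the real value lives on another node."""
    node: Node
    key: str


def run_on_single_server(home: Node, owners: Dict[str, Node],
                         txn: Callable[[Dict[str, Any]], None]) -> None:
    """`owners` maps each key in the transaction to the node that stores it."""
    originals = {key: owner.data[key] for key, owner in owners.items()}
    # 1. Duplicate the elements onto the home server and turn the originals
    #    into references that redirect to the new location.
    for key, owner in owners.items():
        home.data[key] = originals[key]
        owner.data[key] = Ref(home, key)
    result = originals                    # what gets propagated if txn fails
    try:
        # 2. Execute the transaction on a working copy held by the home server.
        working = dict(originals)
        txn(working)
        result = working
    finally:
        # 3. Re-propagate the values and drop the temporary copies.
        for key, owner in owners.items():
            owner.data[key] = result[key]
            del home.data[key]


east = Node("east", {"alice": 100})
west = Node("west", {"bob": 50})
home = Node("home")


def transfer(values: Dict[str, Any]) -> None:
    values["alice"] -= 30
    values["bob"] += 30


run_on_single_server(home, {"alice": east, "bob": west}, transfer)
```

If `txn` raises, the original scalars are put back unchanged, so the owners never see a half-applied transaction in this sketch.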

~~~
cassandravoiton
So the single server is the transactional arbitrator. Nice :)

------
zwischenzug
Pretentious but entertaining article that tells us what's fairly obvious with
a little reflection.

~~~
cassandravoiton
If it is so obvious, why do so many people not get it?

~~~
zwischenzug
Who doesn't get it?

------
praptak
You don't need such heavy theoretical machinery (I mean relativity) to get
to the point that distributed transactions are not possible (for some values
of possible.)

Here's the link to the proof that no deterministic protocol achieves
distributed certainty in the presence of network failures, even for two parties
only:
[http://en.wikipedia.org/wiki/Two_Generals%27_Problem#For_det...](http://en.wikipedia.org/wiki/Two_Generals%27_Problem#For_deterministic_protocols_with_a_fixed_number_of_messages)

And yeah, the misconceptions around that are huge. I once had to expose a
totally-not-database-related API as a JDBC-compliant driver because someone
insisted it would provide transactional consistency.
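
The argument in the linked proof can be checked by brute force for one small family of protocols. The sketch below is not the proof itself. It assumes a chain of `n` alternating messages in which each message is a reply to the previous one, and "threshold" generals that attack once they have received enough messages. All function names are hypothetical. General B is required to need at least one message, since B cannot know the plan otherwise.

```python
from typing import Optional, Tuple


def decisions(n: int, need_a: int, need_b: int, lost_from: int) -> Tuple[bool, bool]:
    """Messages 1..n alternate A->B (odd) and B->A (even), each a reply to the
    previous one, so losing message `lost_from` silences everything after it.
    A general attacks once it has received at least `need_*` messages."""
    delivered = min(n, lost_from - 1)
    received_by_b = (delivered + 1) // 2   # odd-numbered messages reach B
    received_by_a = delivered // 2         # even-numbered messages reach A
    return received_by_a >= need_a, received_by_b >= need_b


def first_disagreement(n: int, need_a: int, need_b: int) -> Optional[int]:
    """Return a loss point at which exactly one general attacks, if any."""
    for lost_from in range(1, n + 2):      # n + 1 means nothing is lost
        a_attacks, b_attacks = decisions(n, need_a, need_b, lost_from)
        if a_attacks != b_attacks:
            return lost_from
    return None


def all_protocols_can_disagree(max_n: int) -> bool:
    """Check every threshold protocol in which both generals attack when no
    message is lost, for chains of up to `max_n` messages."""
    for n in range(1, max_n + 1):
        for need_a in range(0, n // 2 + 1):
            for need_b in range(1, (n + 1) // 2 + 1):
                if first_disagreement(n, need_a, need_b) is None:
                    return False
    return True
```

Within this family, A first attacks after an even number of delivered messages and B after an odd number, so there is always a loss point between the two where only one of them attacks.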

~~~
cx01
The two generals problem doesn't mean distributed transactions are impossible.
It just means that you cannot guarantee liveness, i.e. a distributed commit
protocol might block if the network fails. But it will never yield an
inconsistent state (one node committing and another node aborting).
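
A minimal sketch of that trade-off, using plain two-phase commit because it is the shortest protocol to write down (it is not Paxos Commit, and it does rely on a single coordinator). The classes and the `connected` flag are hypothetical and everything runs in one process. A participant that has voted yes but has not heard the decision stays `PREPARED`, which is the blocked state; it never picks an outcome on its own.

```python
from enum import Enum
from typing import List, Optional


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"    # voted yes, waiting for the decision (blocked)
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    def __init__(self, name: str, votes_yes: bool = True) -> None:
        self.name = name
        self.votes_yes = votes_yes
        self.connected = True             # False models a network failure
        self.state = State.INIT

    def prepare(self) -> Optional[bool]:
        """Return the vote, or None if the request never arrives."""
        if not self.connected:
            return None
        self.state = State.PREPARED if self.votes_yes else State.ABORTED
        return self.votes_yes

    def deliver(self, decision: State) -> None:
        """Apply the decision if it arrives and the outcome is still pending."""
        if self.connected and self.state in (State.INIT, State.PREPARED):
            self.state = decision


class Coordinator:
    def __init__(self, participants: List[Participant]) -> None:
        self.participants = participants
        self.decision: Optional[State] = None

    def collect_votes(self) -> State:
        votes = [p.prepare() for p in self.participants]
        # Commit only on a unanimous yes; a missing vote counts as no.
        unanimous = all(v is True for v in votes)
        self.decision = State.COMMITTED if unanimous else State.ABORTED
        return self.decision

    def broadcast(self) -> None:
        """Send the decision; safe to call again after a partition heals."""
        if self.decision is None:
            return
        for p in self.participants:
            p.deliver(self.decision)


a, b = Participant("a"), Participant("b")
coord = Coordinator([a, b])
coord.collect_votes()      # both vote yes, so the decision is COMMITTED
b.connected = False        # the network fails before the decision goes out
coord.broadcast()          # a commits; b stays PREPARED
b.connected = True
coord.broadcast()          # retry once connectivity returns; b commits too
```

At no point in this run does one participant hold `COMMITTED` while another holds `ABORTED`; the cost is that `b` cannot make progress while it is cut off.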

~~~
praptak
On the contrary. The proof actually demonstrates that a finite sequence of
communication in the presence of possible failures cannot assure consistency.

------
bhousel
Unfortunately for the author's hypothesis, there are two very significant
words that don't appear in this article anywhere: "resource" and "lock"

~~~
cassandravoiton
You are way off the mark here. How do you propose to use a resource lock to
solve the distribution problem?

~~~
bhousel
Every major RDBMS that supports ACID compliant distributed transactions does
exactly this.

~~~
cassandravoiton
Actually - they don't. I spent some time working with Oracle at a major
installation where they could not come up with good enough transactional
failure scenarios because a single mainframe was being replaced by a
multiple-DB-node distributed system.

Oddly - the project failed...

Just because a large company says they do something - does not mean they
actually do. Distributed Transaction as said by RDBMS companies actually is
'Distributed as long as nothing goes wrong'.

~~~
bhousel
_Distributed Transaction as said by RDBMS companies actually is 'Distributed
as long as nothing goes wrong'._

Of course, that's the point of ACID. When something does go wrong, your
database will roll back the distributed transaction and throw a scary ORA error
rather than commit inconsistent data across the cluster. I've never known a
properly configured Oracle system to break ACID.

I build these systems for a living, and I am quite certain that they can be
made to work. Oracle is hard to set up, no doubt, but I would never blame it
for the failure of a project...

