

HAT, not CAP: Introducing Highly Available Transactions - pbailis
http://www.bailis.org/blog/hat-not-cap-introducing-highly-available-transactions/

======
jmileham
In order to reconcile ACID with CAP, this defines a weakened form of ACID to
mean whatever some databases currently marketed as ACID-compliant support, in
order to say that you can still offer effective ACID compliance and still
choose CA over partition tolerance (in the
<http://codahale.com/you-cant-sacrifice-partition-tolerance/> sense). For a
lot of applications, the isolation guarantees being weakened aren't, or
shouldn't be, negotiable (if you try to sneak by without them, you'll hit data
integrity issues at scale).

Not saying that the solution doesn't provide a valuable framework for building
robust applications that can overcome those issues (necessarily pushing some
of that complexity up the stack to the application developer), but the
marketing seems a little bit suspicious?

Edited to add: In fairness, the article doesn't actually claim to have evaded
CAP - it recognizes that HAT is a compromise. But I believe it's easy to
understate the practical problems with non-serializable transactions. Without
serializability, it's impossible to prevent duplicate transactions from being
created on split-brain nodes. In banking, for instance, this would be a Bad
Thing, and
lead to potentially hairy application-specific mop up when the nodes resync.
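
To make the split-brain case concrete, here is a minimal sketch in Python. The `Replica` class, the `merge` function, and the replay-the-logs resync strategy are all hypothetical and invented for illustration; they are not taken from the HAT paper or any real database. Each side of the partition checks the balance against its own local state only, so both approve the same withdrawal.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica (hypothetical) that accepts writes without coordinating."""
    name: str
    balance: int
    log: list = field(default_factory=list)

    def withdraw(self, amount: int) -> bool:
        # The balance check only sees this replica's local state.
        if amount > self.balance:
            return False
        self.balance -= amount
        self.log.append(-amount)
        return True


def merge(initial: int, *replicas: Replica) -> int:
    """Resync by replaying every replica's log against the initial balance."""
    return initial + sum(delta for r in replicas for delta in r.log)


initial = 100
a = Replica("a", initial)
b = Replica("b", initial)

# During the partition, each side independently approves the same withdrawal.
ok_a = a.withdraw(100)
ok_b = b.withdraw(100)

merged = merge(initial, a, b)
print(ok_a, ok_b, merged)  # True True -100
```

Neither replica did anything wrong locally, yet the merged account is overdrawn, and deciding what to do about it is the application-specific mop-up.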

~~~
pbailis
Good point, and well-taken. As I mention in
[http://www.bailis.org/blog/hat-not-cap-introducing-highly-av...](http://www.bailis.org/blog/hat-not-cap-introducing-highly-available-transactions/index.html#tradeoffs)
(and devote a full section to in the paper, including documented isolation
anomalies like
lost updates, write skew, and anti-dependency cycles), there are many
guarantees that aren't achievable in a highly available environment. Our goal
is to push the limits of what is achievable, and, by matching the weak
isolation provided by many databases, hopefully provide a familiar programming
interface.
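
As an illustration of one of the anomalies named above, write skew, here is a small Python sketch. It is a toy model, not the paper's algorithm: the on-call example, the `go_off_call` function, and the dictionary standing in for a database are all assumptions made for this example. Two transactions read the same snapshot, each checks an invariant that spans both rows, and each writes a different row.

```python
import copy

# Shared "database" (hypothetical): who is currently on call.
db = {"alice": True, "bob": True}


def go_off_call(snapshot: dict, me: str) -> dict:
    """Return the writes this transaction wants, based on its snapshot.

    Application invariant: at least one person stays on call.
    """
    on_call = sum(1 for v in snapshot.values() if v)
    if on_call >= 2:
        return {me: False}
    return {}


# Both transactions take their snapshot before either one commits.
snap_t1 = copy.deepcopy(db)
snap_t2 = copy.deepcopy(db)

writes_t1 = go_off_call(snap_t1, "alice")
writes_t2 = go_off_call(snap_t2, "bob")

# The write sets touch different keys, so a check that only looks for
# write-write conflicts lets both transactions commit.
assert not (writes_t1.keys() & writes_t2.keys())
db.update(writes_t1)
db.update(writes_t2)

still_on_call = [name for name, v in db.items() if v]
print(still_on_call)  # []
```

Each transaction preserved the invariant against the state it read, but the combined result violates it. A serializable execution would have forced one of the two to see the other's write.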

As I tried to stress in the post, we aren't claiming to "beat CAP" or provide
"100% ACID compliance"; we're attempting to strengthen the semantic limits of
highly available systems. I intended "HAT, not CAP" as a play on acronyms, not
as a claim to achieve the impossible.

edit: We're also certainly not claiming to have a "CA" solution, whatever that
means. There's a lot of confusion between "CAP atomicity"==linearizability and
"ACID atomicity"=="transactional atomicity"/"all or nothing"; see
[http://www.bailis.org/blog/hat-not-cap-introducing-highly-av...](http://www.bailis.org/blog/hat-not-cap-introducing-highly-available-transactions/#cap-note)

~~~
prodigal_erik
> matching the weak isolation provided by many databases, hopefully provide a
> familiar programming interface

I'm not sure it's really that familiar. Just knowing how to make requests
doesn't ensure you really understand all the ways the answers could be wrong,
much less have done the analysis and proven you can withstand all those
failure modes. I think a lot of systems out there are quietly corrupting
themselves in ways the maintainers didn't have high enough scale or good
enough analytics to notice, at least not early enough to recover to a valid
state.

------
ryanpers
Interesting paper. I hope to see a follow-up that actually describes the
algorithm in full. As written, it doesn't cover the failure recovery,
data-drift, and timeout cases.

Also maybe you could speak to a few constraints:

- missing updates
- unique index

and outline your thoughts as to how an application developer might avoid
pitfalls. Most applications I have seen tend to require/run into these
issues.
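
For the unique index case, a rough sketch of why it is hard without coordination, written in Python. The `Replica` class and `find_conflicts` function are hypothetical names made up for this example and do not describe how the paper's system behaves. Each replica enforces uniqueness against its own data, so two sides of a partition can both accept the same key, and the clash only shows up when they resync.

```python
from collections import defaultdict


class Replica:
    """Toy replica (hypothetical) with a locally enforced unique username."""

    def __init__(self):
        self.users = {}  # username -> user id

    def insert(self, username, user_id):
        if username in self.users:
            return False  # local uniqueness check only
        self.users[username] = user_id
        return True


def find_conflicts(*replicas):
    """After the partition heals, list usernames claimed by more than one id."""
    claims = defaultdict(set)
    for r in replicas:
        for username, user_id in r.users.items():
            claims[username].add(user_id)
    return {u: ids for u, ids in claims.items() if len(ids) > 1}


left, right = Replica(), Replica()
accepted_left = left.insert("sam", "id-1")
accepted_right = right.insert("sam", "id-2")  # right cannot see left's insert
right.insert("kim", "id-3")

conflicts = find_conflicts(left, right)
print(conflicts)
```

One way an application might sidestep this, under the assumption that it controls key generation, is to avoid needing the constraint at all (for example, by using randomly generated identifiers) and to route the few operations that truly need uniqueness through a coordinated, non-highly-available path.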

