

Calvin: Fast Distributed Transactions for Partitioned Database Systems [pdf] - cdl
http://cs.yale.edu/homes/thomson/publications/calvin-sigmod12.pdf

======
untitledwiz
Prof. Abadi's post on his blog about Calvin:
[http://dbmsmusings.blogspot.co.uk/2012/05/if-all-these-new-d...](http://dbmsmusings.blogspot.co.uk/2012/05/if-all-these-new-dbms-technologies-are.html)

Reddit comments on the blog post:
[http://www.reddit.com/r/programming/comments/trb7e/if_all_th...](http://www.reddit.com/r/programming/comments/trb7e/if_all_these_new_dbms_technologies_are_so/)

~~~
rdw
Thanks for the supplementary links; they really clarify things for me. I think
the key high-level decision they made is to ignore per-transaction latency in
pursuit of higher throughput. A consequence of that decision is that the
application logic of the transaction must be executed by the coordinator.

A while back my coworkers and I developed a system that made similar tradeoffs
and was capable of linear throughput scaling as well, but it never made it into
production for various reasons. The per-transaction latencies were in practice
"good enough", though they theoretically could grow quite large. Combining the
application logic with the transaction coordinator, which the design required,
was a lot more difficult in practice than I'd expected, especially since we required
completely lock-free logic. It turned out to be a real brainteaser for some
applications. It's going to be much easier to write logic for Calvin, because
locks are somewhat more intuitive and map more closely to existing systems.

------
szopa
I went quickly through the paper, and there are some interesting ideas, like
separating scheduling, sequencing, and storage.
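
To make that separation concrete, here is a toy sketch of how the three layers might be pulled apart. It is only an illustration under my own assumptions, not the paper's implementation: the class names (`Sequencer`, `Scheduler`, `Storage`), the `Txn` fields, and the "add a delta to each key" logic are all hypothetical. There is no replication, locking, or parallelism here, just the division of responsibilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    txn_id: int
    keys: tuple   # keys the transaction touches, declared up front (assumption)
    delta: int    # toy logic: add this amount to every key


class Sequencer:
    """Hypothetical layer that fixes a global order for incoming transactions."""

    def __init__(self):
        self._log = []

    def submit(self, txn):
        self._log.append(txn)

    def batch(self):
        # Hand out everything sequenced so far and start a new batch.
        out, self._log = self._log, []
        return out


class Storage:
    """Plain key-value store that knows nothing about transactions."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key, 0)

    def write(self, key, value):
        self.data[key] = value


class Scheduler:
    """Hypothetical layer that executes a sequenced batch in log order."""

    def __init__(self, storage):
        self.storage = storage

    def run(self, batch):
        for txn in batch:
            for key in txn.keys:
                self.storage.write(key, self.storage.read(key) + txn.delta)


def replay(batch):
    """Run one batch on a fresh replica and return its final contents."""
    storage = Storage()
    Scheduler(storage).run(batch)
    return storage.data


seq = Sequencer()
seq.submit(Txn(1, ("a", "b"), 5))
seq.submit(Txn(2, ("b",), -2))
batch = seq.batch()

# Two independent replicas fed the same ordered batch end in the same state.
replica_a = replay(batch)
replica_b = replay(batch)
print(replica_a == replica_b)  # True
```

The point of the split is that the storage layer stays a dumb key-value store, and anything that replays the sequencer's log in order ends up with the same contents.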

Also, there are some details that leave me a bit confused... For example, they
mention they use ZooKeeper for Paxos, while ZooKeeper uses a different
protocol, ZooKeeper Atomic Broadcast (Zab).

------
chatmasta
I took Prof. Abadi's seminar "Database Systems" this past semester and really
enjoyed it. Great class, great teacher and super interesting material. He's
also very good at commercializing academic research, which is something
everyone can learn from.

~~~
untitledwiz
You should come work for Hadapt, we love Yalies :) PM me if interested.

~~~
chatmasta
I applied, didn't get the job. Awwwkward.

------
mattparlane
This is not particularly new; it's from May 2012. There's an easier-reading
writeup here:

[http://gigaom.com/2012/05/16/calvin-a-fast-cheap-database-th...](http://gigaom.com/2012/05/16/calvin-a-fast-cheap-database-that-isnt-a-database-at-all/)

------
kmasters
OK, sounds good, but I'm getting the feeling that serializing database work
across distributed parallel transaction queues is the solution. I think there
is a little sunny optimism here about rollback frequency. Deterministic
parallelism is not going to help you much when you have the same transactions
failing across distributed nodes.

If you don't have a lot of rollbacks I can see this being OK.
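
A rough sketch of the rollback concern, under my own assumptions rather than anything taken from the paper: if every node applies the same ordered list of transactions with the same logic, then a transaction that fails its check (here, a hypothetical "insufficient balance" rule) fails identically on every node. The function name `apply_in_order` and the transfer tuple layout are made up for illustration.

```python
def apply_in_order(balances, transfers):
    """Apply transfers in the given order; return (final_state, aborted_ids).

    Each transfer is a (txn_id, src, dst, amount) tuple. A transfer aborts
    when the source balance is too low, and the abort depends only on the
    state produced by the earlier transfers in the same order.
    """
    state = dict(balances)  # work on a copy so the input is not mutated
    aborted = []
    for txn_id, src, dst, amount in transfers:
        if state.get(src, 0) < amount:
            aborted.append(txn_id)  # deterministic abort, no side effects
            continue
        state[src] -= amount
        state[dst] = state.get(dst, 0) + amount
    return state, aborted


initial = {"alice": 10, "bob": 0}
ordered_log = [
    (1, "alice", "bob", 7),
    (2, "alice", "bob", 5),   # alice only has 3 left, so this one aborts
    (3, "bob", "alice", 4),
]

node_a = apply_in_order(initial, ordered_log)
node_b = apply_in_order(initial, ordered_log)
print(node_a == node_b)  # True
print(node_a[1])         # [2]
```

So the nodes stay in agreement, but the aborted work is still wasted on each of them, which is why the rollback rate matters.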

