
For us (Dropbox), the threshold ended up being multiple product teams having to implement ad-hoc two-phase commit over their datatypes, burning engineer hours not on implementing it but on proving that they had gotten it right and on handling the cleanup after any unsuccessful writes.

You're right that most systems probably don't need 2PC, which is why Edgestore didn't include it until now. As mentioned in the post, we finally felt that we had reached the right balance of tradeoffs to justify the API-level primitive.

When you get to big enough scale, you need cross shard transactions. Google reached it with Spanner.

If I look at Dropbox, I am sure features related to sharing folders between people/organizations cross shards. You can't avoid cross-shard transactions if you want to offer a fully featured product.

After a quick skim of 2PC in Edgestore (sorry, no time), it is unclear whether there is a single transaction coordinator (TC) or not. I assume it is a single TC; if you can scale that to 10m trans/sec, that is impressive. The really hard part is to have multiple TCs and to design protocols to coordinate their recovery after failure. Here is a good example in the open-source NDB (MySQL Cluster) system - https://drive.google.com/file/d/1gAYQPrWCTEhgxP8dQ8XLwMrwZPc...

We have a routing tier of a few hundred machines. Any of those can serve as transaction coordinator. This is one of the places where having a transaction record is useful -- in general, our locking and non-transactional read scheme will wait nicely for a 2PC transaction to complete, but in the face of failures, any actor in the system can abort the 2PC transaction by marking the transaction record. There's also a really nice optimization for non-transactional reads that allows them to read even in the event of a staged but not yet committed 2PC transaction (you can prove to yourself that a linearized read can occur on either side, if the 2PC transaction is still pending when the read begins).
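To make the transaction-record idea above concrete, here is a minimal in-memory sketch. It is not Edgestore's implementation: the names `TransactionRecord`, `try_finalize`, and `Shard` are hypothetical, the lock stands in for whatever atomic conditional write the real storage layer provides, and serving the old value while a transaction is pending is only one of the two orderings the comment says are valid.

```python
import threading
from enum import Enum


class TxnState(Enum):
    PENDING = "pending"
    COMMITTED = "committed"
    ABORTED = "aborted"


class TransactionRecord:
    """Hypothetical single source of truth for one 2PC transaction's outcome."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self._state = TxnState.PENDING
        self._lock = threading.Lock()  # stand-in for an atomic conditional write

    def try_finalize(self, new_state):
        """Move PENDING -> new_state atomically; the first caller wins.

        Returns the state the record ends up in, which differs from
        new_state if another actor finalized the record first.
        """
        with self._lock:
            if self._state is TxnState.PENDING:
                self._state = new_state
            return self._state

    @property
    def state(self):
        with self._lock:
            return self._state


class Shard:
    """Toy shard holding committed values plus writes staged by 2PC."""

    def __init__(self):
        self.committed = {}
        self.staged = {}  # key -> (value, TransactionRecord)

    def stage(self, key, value, record):
        self.staged[key] = (value, record)

    def read(self, key):
        """Non-transactional read that consults the transaction record."""
        if key in self.staged:
            value, record = self.staged[key]
            state = record.state
            if state is TxnState.COMMITTED:
                # Outcome is decided: roll the staged write forward.
                self.committed[key] = value
                del self.staged[key]
            elif state is TxnState.ABORTED:
                # Outcome is decided: discard the staged write.
                del self.staged[key]
            # Still PENDING: serve the old value, ordering this read
            # before the transaction (an assumption of this sketch).
        return self.committed.get(key)


record = TransactionRecord("txn-1")
shard_a, shard_b = Shard(), Shard()
shard_a.committed["x"] = "old"
shard_a.stage("x", "new", record)
shard_b.stage("y", "new", record)

read_while_pending = shard_a.read("x")
# Some other actor gives up on the coordinator and aborts the transaction.
abort_outcome = record.try_finalize(TxnState.ABORTED)
# The coordinator's late commit attempt loses the race and observes the abort.
late_commit_outcome = record.try_finalize(TxnState.COMMITTED)
```

The point of the design is that the outcome lives in exactly one place, so an abort by any actor and a commit by the coordinator cannot both succeed, and each shard can resolve its staged writes lazily the next time it is read.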
