
SDPaxos: Building efficient semi-decentralized geo-replicated state machines - mattdemon
https://muratbuffalo.blogspot.com/2018/11/sdpaxos-building-efficient-semi.html
======
marknadal
After going through the article, I'm not seeing anything that supports or
justifies the phrase "decentralized" or even "semi-decentralized". The
terminology should be "distributed".

Claiming this is semi-decentralized is confusing, and it seems to borrow from
the recent success of decentralized systems (like IPFS, ours, GUN, and others)
without being honest: Their system is distributed, not decentralized. In the
same way, "Serverless" totally requires using servers.

Otherwise, very good article.

~~~
matthelb
I'm not sure the distinction that you're making between "distributed" and
"decentralized" is commonly accepted in the research community (or broader
technology community). In this context, the authors appear to be using
"decentralized" to contrast with the "centralized" nature of leader-based
state machine replication protocols.

------
skyde
Can anyone familiar with the algorithm help answer a few questions?

1- Why can a C-instance come from any node without a Paxos phase 1a Prepare
message? Is it because each node (R0 to R4) has its own distinct replication
log for C-instances?

2- When the sequencer receives a C-Accept, why is it safe to assume this value
was successfully accepted by the other replicas without receiving a Paxos
phase 3 Commit?

3- If replicating large values, are the values only sent in C-instance
messages and not in O-instance messages?

~~~
mad44
1) Yes.

1 & 2) In fact the C-instance messages do not conflict with each other and get
accepted immediately. These messages do not even need a ballotnum, but the
ballotnum used is that of the O-instance, to denote a sort of epoch: which
sequencer the sender thinks is still in charge.

3) If replication messages are large, you can just order the "commands"
referring/pointing to them via Paxos, and not necessarily the data itself.

~~~
skyde
For 3) I mean: in the O-instance, is the command itself replicated, or just
the node id?

~~~
mad44
Yes, for O-instance it could be as small as the command id.
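
To make the split concrete, here is a rough sketch of the idea, not the
paper's actual message format: payloads are kept per originating replica
(the C-instance side), while the ordering log (the O-instance side) holds
only small references. The `Replica` class, its method names, and the
`(replica_id, slot)` reference shape are all made up for illustration.

```python
class Replica:
    """Hypothetical local state: payloads and ordering are kept apart."""

    def __init__(self):
        self.payloads = {}  # (replica_id, slot) -> command payload (C-instance side)
        self.order = []     # ordered list of (replica_id, slot) refs (O-instance side)

    def store_payload(self, replica_id, slot, payload):
        # Each replica writes only to its own slots, so payloads coming from
        # different replicas never compete for the same key.
        self.payloads[(replica_id, slot)] = payload

    def append_order(self, replica_id, slot):
        # The ordering entry carries only a small reference, never the payload.
        self.order.append((replica_id, slot))

    def executable_commands(self):
        # Walk the order and stop at the first reference whose payload has
        # not arrived yet, so commands are never executed out of order.
        ready = []
        for ref in self.order:
            if ref not in self.payloads:
                break
            ready.append(self.payloads[ref])
        return ready
```

With something like this, a large value travels once with its payload
message, and the ordering traffic stays small regardless of value size.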

------
fokker
Curious, why Paxos over Raft?

~~~
TheDong
Note that there's 3-10 variants of Paxos depending on who you ask and how many
research papers you've read.

Paxos has different performance characteristics, different implementations,
and more maturity.

There is no simple answer to your question unless you make it more specific.

~~~
fokker
Considering we’re talking about distributed state machines, I am curious why
you would choose Paxos, a more complex algorithm, over Raft. Raft, to my
knowledge, provides the same guarantees as Paxos, and is simpler to understand
and implement.

I am no expert on Paxos - just hoping for an explanation.

~~~
skyde
Paxos itself (as opposed to something built on top of Paxos, like Multi-Paxos)
is in fact super simple compared to Raft.
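
As a rough illustration of how small the core is, here is a minimal sketch of
a single-decree Paxos acceptor plus the proposer's value-selection rule. The
names (`Acceptor`, `on_prepare`, `on_accept`, `choose_value`) and the tuple
reply format are made up for this sketch; networking, retries, and learners
are left out, and ballots are assumed to be non-negative integers.

```python
class Acceptor:
    """Single-decree Paxos acceptor (illustrative sketch only)."""

    def __init__(self):
        self.promised = -1         # highest ballot this acceptor has promised
        self.accepted_ballot = -1  # ballot of the last accepted value, -1 if none
        self.accepted_value = None

    def on_prepare(self, ballot):
        # Phase 1: promise to ignore lower ballots and report any accepted value.
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", self.accepted_ballot, self.accepted_value)
        return ("nack", self.promised, None)

    def on_accept(self, ballot, value):
        # Phase 2: accept unless a higher ballot has already been promised.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return ("accepted", ballot, value)
        return ("nack", self.promised, None)


def choose_value(promises, own_value):
    # promises: non-empty list of (accepted_ballot, accepted_value) pairs
    # collected from a majority. The proposer must adopt the value with the
    # highest accepted ballot, and may use its own value only if none exists.
    ballot, value = max(promises, key=lambda p: p[0])
    return value if ballot >= 0 else own_value
```

Everything a replicated log needs beyond this (slots, leader leases,
catch-up, reconfiguration) is where the real work sits.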

Where it gets complicated is trying to build a practical replicated log system
using Paxos. Raft just happens to clearly and completely define this use case.

What SDPaxos and EPaxos try to achieve is good performance over WAN, something
that Raft and any Paxos variant that relies on a stable leader are very bad at.

This can’t be easily added to Raft, because the main reason the Raft algorithm
is simpler is that it assumes a stable leader in every operation except
leader election.

