
A simple distributed algorithm for small idempotent information - janerik
http://antirez.com/news/71
======
adolph
Idempotence by example: _Looking up some customer 's name and address in a
database are typically idempotent (in fact nullipotent), since this will not
cause the database to change. Similarly, changing a customer's address is
typically idempotent, because the final address will be the same no matter how
many times it is submitted. However, placing an order for a car for the
customer is typically not idempotent, since running the method/call several
times will lead to several orders being placed. Canceling an order is
idempotent, because the order remains canceled no matter how many requests are
made._

[https://en.wikipedia.org/wiki/Idempotence](https://en.wikipedia.org/wiki/Idempotence)

~~~
vishnugupta
While the wiki article seems to do a decent job of explaining mathematics
behind idempotency it sheds no light on why it's so important in distributed
systems. antirez talks about idempotency in sort of mathematical sense, though
I could be wrong as I've just skimmed through the article.

Since failures are a norm in distributed systems one would want clients to
retry a request without worrying about double updations (placing the order
twice in the example above, debiting the money twice etc.,).

Just elaborate this further. Let's say the "place order" request didn't even
reach the server. In this case client can just retry the request. However, if
the request did reach the server but the response didn't reach the client then
client can't safely retry without creating duplicate order. The problem here
for the client is that there's no way for it to distinguish between a lost
request vs lost response.

In such cases idempotency is achieved by requiring clients to send a unique
identifier with each request. Server can now ensure that for a given request
identifier the operation is performed just once.

------
bch
Knowing the Raft[0][1] protocol, I felt this post was disingenuous going on
for its entire length, and only mentioning Raft basically as a footnote, w/ no
other references.

I think Raft deserves a bit more prominence and credit here, Antirez.

    
    
      [0] (pdf) https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
      [1] I kept reading thinking to myself repeatedly "This is just Raft".

~~~
antirez
In the very first part of the post: "The algorithm is mostly borrowed from
Raft, however because of the premises it uses only a subset of Raft that is
trivial to implement."

Maybe you only scanned the article.

~~~
bch
:) my apologies -- I started reading your introduction and then moved on to
the meat of the article...

~~~
antirez
np ;-) after all as an heavy HN reader I understand very well the
impossibility to read whole posts...

------
bkirwi
It looks like the protocol is broken; can anyone confirm?

Suppose a handful of the processes trigger an election simultaneously, each
with a `currentEpoch` of `n`, so that each process has voted but no proposal
has a majority. Now all processes have a `lastVoteEpoch` of `n`.

According to the description, no process will vote more than once in a single
epoch. However, a process will can only propose a new value using its
`currentEpoch`. Since each proposal must have a `currentEpoch` of `n`, but no
process is still allowed to vote in that epoch, no more state transitions are
possible and the frequency cannot change.

~~~
mjb
Right. For the same reason, the Paxos protocol allows multiple prepare votes
from each acceptor. It makes this safe by requiring that each acceptor tell
the proposer of the value for the highest ballot number it has accepted, and
requires the proposer to switch proposal values when receiving such a message.
This is, at first glance, one of the strange edges of the Paxos (synod)
protocol, but it makes a lot of sense once you see an example where it is
needed.

Raft has somewhat similar constructs, but explained slightly differently.

~~~
antirez
Hey, actually Raft uses randomized retries, since I copied this part (and most
of the rest) from Raft. Voting can result into split brain as this condition
is recovered, and a retry time that is big enough compared to RTT guarantees
liveness.

~~~
gbhn
It looked similar. Can you highlight the differences from Raft? Is it just
that state is small so it can accompany the various voting/heartbeat messages?

~~~
antirez
Because the state is small and does not depend on the previous computation,
the problem becomes trivial, and is accomplished just by broadcasting the full
information with an attached unique version. No need for a log, rules to apply
or not entries from the log to state machine, and so forth.

