
Reply to Aphyr attack on Redis Sentinel - mattyb
http://antirez.com/news/55
======
brown9-2
It's very refreshing to see here that "attack" is not used in the way that one
might expect from just the headline, meaning "a possibly unwarranted criticism
that I didn't like or found unfair, or that I am taking personally".

~~~
Legion
I am endlessly impressed with how antirez responds to any critique of Redis
that I've ever seen. He's always taken it as a positive and looked for the
truth in the critique, rather than searching for something to be wrong and
trying to discredit it.

My opinion of him and the Redis project increases further every time.

------
pygy_
Tangentially related:

In the PostgreSQL evaluation[0], Aphyr noticed that, if a packet confirming a
transaction is dropped, the client ends up in a deadlock.

Does PostgreSQL keep a record of past transactions and their success or
failure? If so, is it possible to query it?

[0] <http://aphyr.com/posts/282-call-me-maybe-postgres>

~~~
aphyr
Yes, you can recover from lost acknowledgements by asking for the transaction
ID from postgres before committing--or by making up your own flake ID and
writing it to a table. Given a queue with at-least-once delivery (which
includes, say, durable storage on the client), you can check for the presence
of that ID at a later time and re-apply the transaction to recover from
network errors safely.

The transaction ID _does_ wrap around, so there's a time limit depending on
your transaction throughput. You can also ask for certain transactional
properties on rows, though this won't allow you to recover in all (most?)
cases.
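
Something like this, roughly (a sketch only; the applied_writes and accounts
tables and the psycopg2 wiring are made up for illustration):

    import uuid
    import psycopg2

    def transfer(dsn, account_id, amount):
        marker = str(uuid.uuid4())  # client-generated flake/UUID, kept durably
        try:
            with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
                cur.execute(
                    "UPDATE accounts SET balance = balance + %s WHERE id = %s",
                    (amount, account_id))
                cur.execute(
                    "INSERT INTO applied_writes (id) VALUES (%s)", (marker,))
            # the with-block commits both statements together on success
        except psycopg2.OperationalError:
            # Ambiguous outcome: the commit or its ack may have been lost.
            with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
                cur.execute(
                    "SELECT 1 FROM applied_writes WHERE id = %s", (marker,))
                if cur.fetchone() is None:
                    transfer(dsn, account_id, amount)  # never landed; re-apply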

~~~
fdr
Database constraints usually catch these problems in the event of
re-submission, especially if the client can assign primary keys (e.g., a
UUIDv4) a priori; this also tends to hold in simpler cases.

All in all, I am not sure anyone should find this surprising: anyone who has
ever had a network stall when clicking the 'confirm' button at a web-based
store is familiar with the uncertainty as to whether the order has been
submitted or not (typically resolved by browsing the order history or waiting
for an email, or not at all).

I would guess modern e-commerce vendors would send you a UUID or moral
equivalent to de-dup cart resubmissions these days...but if not, it'd be
interesting to know why not.
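
In sketch form (psycopg2 here; the orders table and its columns are
invented), the client picks the key first, so a blind retry just bounces off
the primary-key constraint:

    import uuid
    import psycopg2

    order_id = str(uuid.uuid4())  # assigned a priori, before any round trip

    def submit_order(dsn, order_id, cart_json):
        try:
            with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO orders (id, cart) VALUES (%s, %s)",
                    (order_id, cart_json))
        except psycopg2.IntegrityError:
            pass  # duplicate key: a previous attempt already landed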

~~~
aphyr
Correct; if your writes are idempotent, retrying is safe. I cover this in the
post as well. My above comment shows that it's possible to recover consistency
even for writes which are _not_ idempotent--though depending on the semantics
of your retries, there may be some locking required.

------
Glyptodon
Redis is one of those things I both love and love to hate.

I've had good results using Redis as a lock server, but I live in (perhaps
misplaced) fear of a client hanging or crashing and leaving a lock stranded.
Not that this is really Redis's problem.

~~~
antirez
Hello, you can easily build a lock that auto-releases itself after some
timeout using the new (2.6.13) extended SET command (see
<http://redis.io/commands/set>) or simply a Lua script.
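
For example, with the redis-py client it could look something like this (key
name and timeout are just for illustration):

    import uuid
    import redis

    r = redis.StrictRedis()
    token = str(uuid.uuid4())

    # SET key value NX PX 30000: take the lock only if it is free, and have
    # Redis expire it after 30 seconds even if the holder never releases it.
    if r.set("lock:report-job", token, nx=True, px=30000):
        try:
            pass  # critical section goes here
        finally:
            # Release atomically, and only if we still own it.
            release = """
            if redis.call('get', KEYS[1]) == ARGV[1] then
                return redis.call('del', KEYS[1])
            else
                return 0
            end
            """
            r.eval(release, 1, "lock:report-job", token)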

~~~
Glyptodon
Since the jobs we're locking can have somewhat inconsistent run times, we're
actually using an implementation where tasks get a lock with a time limit and
can extend their lock so long as they still hold it, so locks do potentially
auto-release.
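
The extension step is roughly this shape (not our exact code; names are made
up), a Lua script that only bumps the TTL if the task still owns the lock:

    import redis

    r = redis.StrictRedis()
    token = "unique-value-written-when-the-lock-was-acquired"

    extend = """
    if redis.call('get', KEYS[1]) == ARGV[1] then
        return redis.call('pexpire', KEYS[1], ARGV[2])
    else
        return 0
    end
    """
    # Returns 1 if the TTL was pushed out, 0 if the lock is no longer ours.
    still_ours = r.eval(extend, 1, "lock:job-123", token, 30000)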

Even given this, bad lock timing (not that likely) or a crash (more likely)
could let inconsistency in.

 _Shrugs_

Like I said, my problem is not really Redis's. If I can't trust everything
that uses a lock not to crash 99.99% of the time, I should really be looking
at our jobs and not at Redis.

Even then, though, it's probably more a matter of me not trusting things than
it is said things not actually being trustworthy.

~~~
rmaccloy
We're about to open source a similar deal (redis-based "soft guarantee"
mutexes) -- ours is written in Python and mostly used as a way to coordinate
(very frequent) parallel task execution a la CountDownLatch, so 100% reliable
exclusion in the face of failure isn't critical.
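
The core of it is roughly this (heavily simplified, not the actual library;
names are invented):

    import redis

    r = redis.StrictRedis()

    def latch_init(name, count):
        r.set("latch:" + name, count)

    def latch_count_down(name):
        # When the counter hits zero, wake any waiters blocked on the list.
        if r.decr("latch:" + name) <= 0:
            r.rpush("latch:" + name + ":open", 1)

    def latch_wait(name, timeout=0):
        # Blocks until the latch opens (or the timeout expires).
        if r.blpop("latch:" + name + ":open", timeout):
            # Put the token back so any other waiters are released too.
            r.rpush("latch:" + name + ":open", 1)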

I'd be interested to hear about your implementation if you can share (email is
HN username at gmail.com)

~~~
Glyptodon
I sent you a rather quick email.

------
nutmeg
A response to this article on Redis:
<http://aphyr.com/posts/283-call-me-maybe-redis>

~~~
keeran
His continued use of "CP" confused me for a while, so TIL about CAP Theorem

<http://en.wikipedia.org/wiki/CAP_theorem>

~~~
krenoten
If you have the time, this video by Basho's CTO will give you a much better
understanding of the tradeoffs that are involved in distributed system design:
<http://www.infoq.com/presentations/Concurrency-Scale-Distributed>

A great alternative to thinking about things in terms of CAP that Justin
brings up is harvest-yield, where yield is the probability of completing a
request and harvest is the fraction of your data that the response actually
represents. Here's the paper:
<http://lab.mscs.mu.edu/Dist2012/lectures/HarvestYield.pdf>

------
undoware
I'm frustrated that when the HN editors deduped the original story, they
apparently deleted ALL the instances, leaving only this one. I wanted to read
the discussion on the subject of Aphyr's research, not Antirez' response.

It looks bad, HN. We all know that VMWare is litigious as ____ (try looking
up benchmarks sometime). But to (presumably) cave so quickly and effortlessly
suggests... well, I'm not sure.

The other possibility is that Aphyr yanked them himself, probably under
duress (or else there'd just be an '_update_' at the bottom of the research
page). Aphyr, is this what happened? I figure you probably can't talk freely
if so, but say something.

~~~
aphyr
[https://www.hnsearch.com/search#request/all&q=aphyr.com](https://www.hnsearch.com/search#request/all&q=aphyr.com)

HN stories on my original posts are still there, as far as I can tell. They
just never hit frontpage.

~~~
antirez
Aphyr, this is very lame: it's not common to see work like what you did, and
none of your stories hit the HN front page? I don't know what to think, but I
hope that at least my post will help show more people your awesome work.

~~~
hendzen
I think Aphyr's series was a little too meaty for the general HN audience (of
today).

Talking about things like the FLP impossibility result, the CAP theorem, and
specifying protocols in TLA+ may be a bit over the heads of many HN readers
- clearly, people would rather read stories about the latest funding round,
acquisition or frontend UI framework than a substantive article on distributed
systems.

~~~
pyre
It's not fair to imply that these things are over the heads of HN readers.
There are plenty of smart people who just might not care about distributed
systems enough to read through. Does my lack of reading medical journals speak
to my ability to read/comprehend them?

------
JulianMorrison
RethinkDB people, how does your database compare?

------
contingencies
DRBD? <http://drbd.org/>

~~~
aphyr
Same limitations as any asynchronously replicated system; if both nodes
diverge during a partition, you'll probably have to drop one's writes.

<http://aphyr.com/posts/287-asynchronous-replication-with-failover>

~~~
contingencies
Right. By operating at the block level it's a little more portable than most
of the solutions discussed, though. Worth people's consideration, IMHO.

~~~
aphyr
I'm inclined to think just the opposite. It's often possible to recover
divergent data structures logically. Good luck doing that on an arbitrary
block store.

~~~
contingencies
My impression is that most DRBD setups mark the backing volume to record
which node last had 'master' (i.e. write capability), which avoids this
issue. However, to achieve this reliably it needs out-of-band STONITH
(shoot-the-other-node-in-the-head), e.g. IPMI.

