

Redis Sentinel design draft 1.1 - elliotlai
http://redis.io/topics/sentinel-spec

======
KirinDave
I don't understand the motivation for this design. Almost immediately I ask,
"Why have sentinels? Why not have an election process exist within the ring of
master service candidates?" The second question I ask is, "Wouldn't Zookeeper
also be an excellent way to coordinate this sort of operation?"

I know that part of the design goal of Redis is to create a system that is
"Simple" and "Readable", and the redis server as it stands succeeds at this
goal admirably. But the approach in this draft is neither particularly simple
nor is it going to make the codebase more readable. It seems fairly awkward
and introduces more of a burden on the operations and management of your
product.

Can someone explain the value of this design?

~~~
antirez
Hello!

The idea is to provide a system that is built into Redis and does not require
external dependencies, and that exploits Redis-specific things to do the work
better or in a simpler way.

Sentinel (already partially implemented, and you'll see a beta in mid July)
uses Pub/Sub to discover other sentinels, INFO to discover attached Slaves,
and other Redis-specific capabilities as well.

It is created as a special execution mode of Redis itself, and has a memory
usage of 1MB, with CPU usage that is near zero. This means that running a
Sentinel is basically free; you can put one everywhere.

It is written in C as a "sentinel.c" module of Redis. In fact you can think of
Redis itself as a framework to write event-driven services. So there was a lot
of code reuse.

At the same time, since every Sentinel is a Redis instance, it understands the
Redis protocol (but with a different set of commands) and you can query every
Sentinel for the full state. Or you can subscribe to a channel to get events as
they happen using Redis Pub/Sub primitives (events like: this master is down,
or it was failed over, use that instance instead).
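
As a rough illustration of what consuming such an event could look like on the
client side, here is a minimal sketch in Python. The channel name
`+switch-master` and the payload layout
`<name> <old-ip> <old-port> <new-ip> <new-port>` are assumptions made for this
example (the draft does not fix a format here), and `parse_switch_event` and
`MasterSwitch` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple


class MasterSwitch(NamedTuple):
    """A failover notification: which master moved, and from/to where."""
    name: str
    old_addr: Tuple[str, int]
    new_addr: Tuple[str, int]


def parse_switch_event(channel: str, payload: str) -> Optional[MasterSwitch]:
    """Parse one Pub/Sub message into a MasterSwitch, or return None.

    Assumed (not from the draft): the channel is "+switch-master" and the
    payload is "<name> <old-ip> <old-port> <new-ip> <new-port>".
    """
    if channel != "+switch-master":
        return None
    parts = payload.split()
    if len(parts) != 5:
        return None
    name, old_ip, old_port, new_ip, new_port = parts
    try:
        return MasterSwitch(name, (old_ip, int(old_port)), (new_ip, int(new_port)))
    except ValueError:
        # A port field was not an integer; treat the message as malformed.
        return None
```

A client would feed each message it receives on the subscription through this
function and reconnect to `new_addr` whenever the result is not `None`.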

These are a few of the reasons I designed, and am building, a new system.

Currently I'm at 50% of the work after a few weeks of work, but actually a lot
was done in the last 10 days (before that there was still too much 2.6 work). I
think with another few weeks we'll have a working product.

~~~
KirinDave
Okay well if you don't want external dependencies, that's A Design Decision
and it's fine.

What this doesn't answer is my first question: why do there need to be external
special-mode sentinel processes? Why not have the slaves negotiate among
themselves? You're already doing a bit of this work, wouldn't it be simpler?

You say the sentinels are "for free" and they may be extremely inexpensive to
run; but they're not "for free" from an operational context. They're another
process to provision, start, monitor, and consider.

~~~
antirez
Something is "available" or not depending on the observer and point of view.
Users should be able to position Sentinels according to the kind of failure
detection they want, and usually where the slaves are is not an objective point
of view (that is usually where the _clients_ are instead).

Position, number of sentinels, level of agreement. Change these three elements
and you can create many kinds of setups.
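
To make the "level of agreement" knob concrete, here is a minimal sketch in
Python of a quorum check. It only illustrates the idea of counting
observations against a configurable threshold; the function name, the
`reports` mapping, and the sentinel labels are hypothetical, and the actual
agreement protocol in the draft may work differently.

```python
from typing import Dict


def master_considered_down(reports: Dict[str, bool], quorum: int) -> bool:
    """Return True if at least `quorum` sentinels report the master as down.

    `reports` maps a sentinel identifier to that sentinel's own view
    (True means "I cannot reach the master").
    """
    down_votes = sum(1 for is_down in reports.values() if is_down)
    return down_votes >= quorum


# Three sentinels placed next to the clients; two of them lost the master.
views = {"sentinel-a": True, "sentinel-b": True, "sentinel-c": False}
strict = master_considered_down(views, quorum=3)   # all three must agree
relaxed = master_considered_down(views, quorum=2)  # a majority is enough
```

With the same observations, a quorum of 3 does not trigger a failover while a
quorum of 2 does, which is the kind of trade-off the three elements let you
tune.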

About operations: Sentinels are designed to monitor other sentinels
automatically and are as simple as possible to operate.

~~~
mrkurt
I don't disagree with allowing that flexibility, but letting Redis instances
do dual duty would greatly simplify things for some really common types of
deployments. It's awfully nice with Mongo to just run three members and get a
pretty decent failover feature for "free", versus having to worry about
maintaining an entirely different set of processes.

~~~
antirez
I think that flexibility is absolutely needed for reliable failure detection.

~~~
mrkurt
Oh I agree with you, I was really just asking for the ability to run dual mode
daemons. Companies who host these things for people might appreciate the
option. :)

------
famousactress
I find this bit interesting:

 _Modify clients configurations when a slave is elected._

(I assume elected == promoted). This is an idea I haven't really seen in other
servers/services before, and I'm curious how this will be implemented. I
assume some sort of pub/sub subscription to each of the slaves so that your
server is notified when one of them takes over? It sounds tricky, but really
interesting. Document seems a bit scant on details for this part at the
moment.

Regardless, really thrilled about this project.

[Edit: Ah, I missed this part:

 _client reconfiguration are performed running user-provided executables (for
instance a shell script or a Python program) in a user setup specific way._ ]
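
A user-provided reconfiguration program of that kind could be as small as the
following Python sketch. Everything specific here is an assumption for
illustration: the argument order (`<config-path> <new-host> <new-port>`), the
`redis_master=host:port` line format, and the function names are hypothetical
and not taken from the draft.

```python
def rewrite_master(config_text: str, host: str, port: int) -> str:
    """Return config_text with its redis_master= line pointing at host:port.

    If no such line exists, one is appended. The key name and the
    "host:port" value format are assumptions for this sketch.
    """
    new_line = f"redis_master={host}:{port}"
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.startswith("redis_master="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


def main(argv):
    """Hypothetical entry point: reconfig.py <config-path> <new-host> <new-port>."""
    path, host, port = argv[1], argv[2], int(argv[3])
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(rewrite_master(text, host, port))
```

The application would then need to reload that file (or be restarted) to pick
up the promoted instance; how that happens is left to the user's setup.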

------
simonw
As far as I can tell, the core mechanism here is that the sentinels keep an
eye on your redis instances and agree which slave should become a master if
your master dies. They then "inform the clients" of the configuration change.

One thing that worries me... normally if I have a configuration setting that
might change at runtime I store it in redis! Does anyone have a good way of
storing the configuration of where the redis server is in a way that can be
updated at runtime (assuming a standard shared-nothing architecture) - without
putting it in redis?

~~~
stock_toaster
Maybe something like doozer[1] or zookeeper[2].

[1]: <https://github.com/ha/doozerd>

[2]: <http://zookeeper.apache.org/>

------
alpinegizmo
Is there any expectation that this design will be workable over a WAN, or is
this a LAN-only solution?

~~~
antirez
Designed to work over a WAN (if protected from a security point of view).

------
shin_lao
I have an unrelated question: is redis still monothreaded?

~~~
eternalban
The Redis server has multiple threads, e.g. for background tasks, but the data
manipulation functions are all single-threaded.

~~~
dinedal
Does it? I know it forks for BGSAVE and BGAOFREWRITE but those aren't threads
AFAIK

~~~
pedromelo
It has real threads for some I/O operations. See
<http://redis.io/topics/latency> (search for Single threaded nature of Redis)

------
burnhype
Looks great! I like the idea of not using external tools.

