
Learn Redis the hard way: in production - adamnemecek
http://tech.trivago.com/2017/01/25/learn-redis-the-hard-way-in-production/?
======
antirez
One problem with Redis is that it looks superficially simple, but actually,
like any tool used in critical contexts, there are two possibilities: 1) Know
how it works very well and do great things with it. 2) Don't understand it
properly and find yourself in big trouble sooner or later. Because in order to
provide the good things it provides, Redis has, on one side, a set of
limitations, and on the other side is tightly coupled to a number of things
related to the operating system, tradeoffs in the configuration, and so forth.

Certain things I understood over the years made it better. Of course there
were improvements in the implementation, but especially things like LATENCY
DOCTOR and MEMORY DOCTOR (the latter only for versions >= 4.0). That is, most
of the things a Redis operations expert could tell you, Redis can tell you
directly, without any other help, just by observing its own internal behavior.
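
For instance, a minimal sketch of asking for both diagnoses via the redis-py
client (assuming a local instance; LATENCY DOCTOR only has data once
latency-monitor-threshold is set, and MEMORY DOCTOR needs Redis >= 4.0):

    # Ask Redis to diagnose itself (sketch, Python + redis-py).
    import redis

    r = redis.StrictRedis(host="localhost", port=6379)
    # Analyzes latency spikes recorded by the latency monitor
    print(r.execute_command("LATENCY", "DOCTOR"))
    # Reports on memory-related problems it observes
    print(r.execute_command("MEMORY", "DOCTOR"))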

There is another aspect. For a long time I believed a Redis book was not
really needed. Now I've changed my mind: I see no good enough Redis book out
there, and I'm almost planning to write one... For a long time I was writing,
incrementally, a C book, which is currently just an incomplete draft. I'm
thinking about putting that project aside for now and starting with a Redis
book.

~~~
jarpineh
Please, do consider writing the C book as well. Most publishers have beta
programs these days (O'Reilly, Manning, Pragmatic Programmers). Don't leave it
in a desk drawer (so to speak).

Perhaps a simplified Redis could work as a good base to learn C: single
threaded, in-memory only (no persistence), a network communication service, so
no complex stuff like concurrency and file management. Though there's probably
a lot more going on in there...

Perhaps "Redis for devops" is another book waiting to happen, or how to write
Redis modules. By the way, Manning's book (from 2013) is freely available
these days:
[https://redislabs.com/resources/ebook/](https://redislabs.com/resources/ebook/)

~~~
antirez
Yep, I don't want to abandon the C book... since it's already halfway done,
and probably something more general and longer-lasting than Redis. The book
indeed uses certain Redis internals as examples, but isolated, so that you
don't need to understand the Redis code base.

~~~
avtar
I really hope you end up releasing this C book. Sounds like it will be a
keeper.

------
randomdrake
This comment may be a bit snarky, but it's important to understand that the
TL;DR for this is not: you need to learn how to debug Redis and understand a
lot of internal things about it and the libraries using it. It's that you can
really save yourself a ton of time and headache with a better understanding of
application architecture.

Step 1) Throw everything at Redis because you don't know how to architect. Use
it as a cache. Use it as a temporary data store. And use it as a persistent
data store _at the same time._ What could go wrong, right?

Step 2) Watch it break because you're using it wrong.

Step 3) Write a blog post about everything you discovered about how you can
sort of get it working while continuing to do it wrong.

Step 4) Only concede at the very end of the blog post that you were just
_doing it wrong™_ from the beginning.

"The root causes of these errors were more a conceptional[sic] issue. The way
we used Redis was not ideal for this kind of traffic and growth."

Unless I'm missing something, there's really no reason that all three use
cases (caching, temporary data store, persistent data store) should all be
sitting on a single Redis server so far away from the metal of the application
server.

I do not want my search cache waiting on temporary data to be read and flushed
from the cache. I do not want data I'd like to persist competing with search
cache results. I don't want any of these things! That's not how this works.
That's not how any of this works!

~~~
omn1
In any bigger project, architecture grows organically. It's easy to say "I
would have done better". In hindsight, every flawed system looks like a bad
idea. It takes a visionary to foresee all possible error cases.

Things go wrong. I think it's an honest post with lots of practical tips.

~~~
randomdrake
> In any bigger project, architecture grows organically. It's easy to say "I
> would have done better". In hindsight, every flawed system looks like a bad
> idea. It takes a visionary to foresee all possible error cases.

Kudos to the team on their findings and for their honest post with some tips
for their odd architecture. I hope my comment did not imply that "I would have
done better," or that hindsight isn't 20/20. I have enough experience to
understand both of those things.

However, as architecture grows organically, it doesn't take _too_ much
understanding or experience to know that using a single Redis instance for
three different use cases would _probably_ run into issues. That's more than
likely not a symptom of "organic growth," but of poor decision making.

 _Trivago received a $4,000,000,000 valuation._ Why does their engineering
deserve the benefit of the doubt here?

At some point we (technologists and developers) have got to concede that maybe
it's not just a few folks trying to keep up with organic growth, but that some
actual bad decisions were made, and those should be accepted and highlighted
rather than covered up as mere mistakes.

I think the post could have been more useful if it had conceded earlier that
they made bad decisions from the get-go, that they had to play catch-up with
those decisions, and that the advice was about attempting to catch up with
them. It would be very interesting to understand why they made the choices
they did.

Why are they running memcached and Redis at the same time? Why wasn't a
regular database suitable for data persistence or temporary data? Why wasn't
the cache for search installed _on the application server_ to remove network
latency altogether? Why weren't they running php-fpm from the get-go?

I made the comment because doing some research, or having some understanding
from the beginning, might have avoided the post altogether. I think it's
important for us to be able to own our mistakes and honestly approach how we
made those mistakes so they aren't repeated by others. I think that is far
more useful than a writeup on how to adjust some settings to deal with some
fairly basic mistakes. Especially for a technical blog post from a company
with a $4B valuation.

~~~
andygrunwald
Original author here. Thanks for your comment. I don't think you meant any of
it badly or snarkily; I like reading critical feedback. That's how you learn.

But I guess you're mixing things up. So yes, this architecture has grown over
time, and the whole story didn't happen yesterday. It starts on September 3,
2010, more than six years back. The first _real_ problems appeared in 2013,
about four years back.

With our current knowledge, I agree that those three use cases should not fit
in one Redis instance.

> I think the post could have been more useful if it had conceded earlier
> that they made bad decisions from the get-go, that they had to play
> catch-up with those decisions, and that the advice was about attempting to
> catch up with them.

This post is written as a kind of story. Of course it would have been possible
to condense it down to the learnings instead.

> It would be very interesting to understand why they made the choices they
> did.

Feel free to ask any question you want. I am here and happy to answer.

> Why are they running memcached and Redis at the same time?

Because these are two different systems with two different concepts for
different use cases. If you mean caching, I agree: both can cache data. But
for use cases where you need master-slave replication (maybe across data
centers), memcached may not fit. IMO it is hard to say both systems are the
same. E.g. if you deal with cached data that varies a lot in size per entry
(same data concept), one memcached instance / consistent hashing ring might be
the wrong solution, because of memcached's slab allocation concept.

> Why wasn't a regular database suitable for data persistence or temporary
> data?

This question can't be answered in general, but I'll use one use case. We had
a really read-heavy MySQL table running on InnoDB with ~300,000,000 rows. The
indexes were set for the read patterns. Everything fine. A normal insert into
this table took some time, and we wanted to avoid spending this time during a
user request, since it would slow the request down. One option the dev team
considered was to write the data to Redis and have a cronjob that reads it out
of Redis and stores it in the table. With this, the insert time was moved from
the user to the system.
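
For illustration, a minimal sketch of that write-behind pattern (Python with
redis-py; all names here are hypothetical, not trivago's actual code):

    # The user request only pays for a cheap RPUSH; a cronjob later
    # drains the list and performs the slow InnoDB inserts in bulk.
    import json
    import redis

    r = redis.StrictRedis(host="localhost", port=6379)

    def record_row(row):                   # called during the user request
        r.rpush("pending_inserts", json.dumps(row))

    def drain(batch_size=1000):            # called by the cronjob
        batch = []
        for _ in range(batch_size):
            raw = r.lpop("pending_inserts")
            if raw is None:
                break
            batch.append(json.loads(raw))
        if batch:
            bulk_insert_into_mysql(batch)  # one bulk INSERT, not N user-facing ones

    def bulk_insert_into_mysql(rows):
        ...                                # the actual INSERT into the InnoDB table

A production version would want RPOPLPUSH (or a Lua script) instead of a bare
LPOP, so entries popped by a crashing cronjob aren't lost.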

> Why wasn't the cache for search installed on the application server to
> remove network latency altogether?

There is a cache on the application server, but we have several cache layers
in our architecture. This was just one of them.

> Why weren't they running php-fpm from the get-go?

We are working on a switch to php-fpm from the typical prefork model. Such a
task sounds easy, but in a bigger environment it can become quite a challenge.

> I made the comment because doing some research, or having some
> understanding from the beginning, might have avoided the post altogether.

Agree. The challenge here was that the people who introduced Redis at that
time had left the company. So in short: we had the problem and had to fix it.
With our current knowledge, we would fix this in a different way and maybe
choose different approaches. But yeah, I assume this is a normal learning
process.

Anyhow. Thanks for your feedback.

~~~
firebones
Thanks for the awesome writeup. Since you've opened it up for questions:

From the start of your cutover (when you started seeing the 500s) to
resolution of the various issues along the way--how much time elapsed? (e.g.,
deciding to swap client libraries, A/B testing, redis upgrade, shifting load
to dedicated instances, implementing proxies/memcached)?

And what was the end-user impact (e.g., 50% of users would see timeouts during
peak usage for the day, or users of certain functionality would be affected 1%
of the time, etc.)?

Just trying to get a sense of the level of urgency involved in terms of
chasing down all these leads. It seemed pretty methodical, so it's hard to
tell if it was a slow-burning persistent nagging issue that you chipped away
at over a couple of months, or an all hands on deck sequential process of
trying a lot of different things in a relatively short period of time to keep
the site afloat.

------
_Codemonkeyism
I didn't learn a lot from using Redis in production, as at two companies Redis
ran untouched for several years without any problems at all.

------
TeeWEE
Setting up a new connection for every request is considered normal for
stateless applications? Really? This is a big scaling issue in your stack. You
need connection pooling for stateless apps too.
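
As a sketch of what that looks like in a runtime that keeps worker processes
alive (Python with redis-py; names illustrative):

    # Create the pool once per worker process, not once per request.
    import redis

    POOL = redis.ConnectionPool(host="localhost", port=6379, max_connections=50)

    def handle_request():
        # Borrows an existing connection instead of paying the TCP
        # handshake (plus any AUTH/SELECT) on every single request.
        r = redis.StrictRedis(connection_pool=POOL)
        return r.get("some_key")

The PHP subthread below is about exactly this: a classic prefork PHP setup has
no long-lived process to hang such a pool off, short of persistent connections
or an external proxy.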

~~~
chatmasta
If you are containerizing your stateless app, where do you store the
connection pool? I suppose you could set up another container as a TCP proxy
that maintains persistent connections, and then connect to Redis through the
proxy?

~~~
detaro
What do you mean by "containerizing"? Unless you restart app instances for
every single request, I don't see how connection pooling is an issue. Just
putting a worker in a container doesn't stop it from keeping a connection
open.

~~~
CaveTech
PHP is stateless. You don't get to keep connections across requests.

~~~
brianwawok
This is one of the areas where PHP makes stuff so complicated.

With Django, out of the box, it's one flag to keep connections alive. I assume
the same for most other languages.

You CAN do it the right way with PHP if you use a bunch of other products, but
you are making life so much harder than it needs to be ;(

~~~
CaveTech
Persistent connections are far more complicated to reason about. I believe
what you mean is that this makes it more difficult to scale PHP, which is
arguably true, but at the same time it forces you to understand underlying
concepts.

~~~
brianwawok
Connection pooling is fairly well understood at this point in time...

~~~
junker101
It's typically 'SESSION'-specific things combined with connection pooling that
I've seen underlying (sometimes long-uncaught) bugs in production systems.

e.g. an 'ALTER SESSION'/'NLS_DATE_FORMAT' command in Oracle, or even an
unfortunate 'USE <db>' with MySQL.

There are of course safe solutions & techniques for this, but when you have an
otherwise stateless-by-design codebase (such as with PHP), picking up
"possibly state-laden" connections is a bit of an unexpected concept, and I've
seen it catch developers more than once.
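
One common safety net is to scrub session state whenever a connection is
checked out of the pool; a generic sketch (Python, DB-API style, with a
hypothetical pool object and database names):

    # Reset session-scoped state on checkout so one request's
    # ALTER SESSION / USE <db> cannot leak into the next request.
    RESET_STATEMENTS = [
        "USE app_db",  # undo any stray USE <db> (MySQL; name is hypothetical)
        # Oracle flavor: "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'",
    ]

    def checkout(pool):
        conn = pool.acquire()  # `pool` stands in for your pooling library
        cur = conn.cursor()
        for stmt in RESET_STATEMENTS:
            cur.execute(stmt)
        cur.close()
        return conn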

~~~
brianwawok
That seems a terrible idea. Don't do that!

------
arrty88
Here's the fix: fork the Redis client library for your language (Node.js,
Java, C#, whatever it may be) and remove the KEYS function completely. Stop
your entire dev team from shooting themselves in the foot.

~~~
notamy
Isn't this why rename-command exists? Just "remove" it on the server side by
changing KEYS to a blank string.
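
For reference, that's a stock redis.conf directive; renaming a command to an
empty string disables it outright (a sketch, with a hypothetical replacement
name for the second example):

    # redis.conf: disable KEYS entirely, or hide FLUSHALL behind an
    # operator-only name. Note this can confuse proxies and scripts
    # that still expect the original command names.
    rename-command KEYS ""
    rename-command FLUSHALL SOME_HARD_TO_GUESS_NAME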

------
pfarnsworth
The "debugging" they went through appears to be random acts of system config
modifications and upgrades. Talk about making a bad situation worse, they
completely lost track of the issue. Upgrading redis in the middle of figuring
out the problem? My God....

------
andygrunwald
Original author here. Feel free to ask anything.

~~~
ben_jones
I checked out twemproxy because it seemed interesting and noticed a few
things:

* the last code commit was 8 months ago (and was very minor)

* there are 120 open issues

* it's written in C

Now this is far from abandonware (imo), but it still seemed like a risky
dependency to lean on. Have you moved past it already? Any thoughts on the
cost-benefit analysis there, or alternative solutions?

Note: I mention C as a detriment because I myself cannot write production C
and therefore would think twice before using a library/service I could not
fork myself (and probably many other devs would too).

~~~
andygrunwald
twemproxy is not the best piece of software when it comes to maintenance or
code quality, but it works reliably. We are still using twemproxy and we
maintain our own fork of it. This fork often carries changes from us, like
bugfixes or support for new Redis commands that aren't supported yet.

Every bug we fix, we submit upstream as a PR. A few PRs from the repository we
have cherry-picked internally and deployed. But we also have a few issues with
it, like this one here:
[https://github.com/twitter/twemproxy/issues/442](https://github.com/twitter/twemproxy/issues/442)
And live reconfiguration would be a nice feature. Think about whether you
really need this, because it is not only fun. A few pain points:

* Deployments of new configs and restarting twemproxy on your server farm

* New Redis version with new commands? You have to extend twemproxy to support those commands

I don't know of any _real_ alternative to it right now. I know of
[https://github.com/Netflix/dynomite](https://github.com/Netflix/dynomite),
which is a twemproxy fork. If we had only memcached and no Redis (we are using
both with twemproxy), then we would try
[https://github.com/facebook/mcrouter](https://github.com/facebook/mcrouter)
from Facebook. But it is specialized for memcached.

If you come up with an alternative, please let me know.

~~~
ben_jones
Thanks for the response. Sorry, no alternative, but a generic database proxy
in golang would be a great weekend hack I might try sometime.

~~~
andygrunwald
[https://github.com/eleme/corvus](https://github.com/eleme/corvus) might be an
option. Someone tweeted it to me. I haven't tested it yet.

------
pan69
This is a very well written and insightful article. However, while I was
reading it I was wondering: "what is the time-frame over which the debugging
and solution finding occurred?" Was it hours, days, weeks, or even longer? My
guess is that the time-frame was days (not necessarily continuous).

I think an article like this could serve well as an example to non-developers,
demonstrating not only the path taken to solve a problem but also how long it
takes to solve one.

~~~
andygrunwald
Original author here. It really depends on which debugging session. Several
debugging sessions stretched across days, because we didn't understand the
problem at first. Others took hours down to minutes (e.g. when we checked
which commands we were using and optimized them).

Does this answer your question?

------
beering
Not sure if this is a dumb question, but if your cache is behaving so badly
(40% 500 errors?), shouldn't you just get rid of the cache? Maybe allocate
some more resources to the DB; the DB already caches frequently accessed data,
and you'd save yourself the roundtrip of check-cache-then-db.
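
For context, the roundtrip in question is the classic cache-aside pattern; a
minimal sketch (Python with redis-py; the key prefix and DB helper are
illustrative):

    # Cache-aside: the extra hop being asked about. It only pays off
    # when hits are common and the DB query is expensive.
    import redis

    r = redis.StrictRedis(host="localhost", port=6379)

    def get_search_results(query):
        cached = r.get("search:" + query)
        if cached is not None:
            return cached                        # hit: one Redis roundtrip
        result = run_expensive_db_query(query)   # miss: Redis + DB roundtrips
        r.setex("search:" + query, 300, result)  # keep for 5 minutes
        return result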

~~~
tschellenbach
The conventional wisdom is to move as much load as possible from DB to a
cache. The reason is quite simple. Scaling a database is hard and expensive.
Scaling a cache is very easy in comparison.

~~~
scott_karana
In this case, scaling the cache was also hard, or at least non-obvious. ;-)

------
markbnj
Interesting write-up. Did I miss it, or did the author not talk specifically
about the implementation architecture for the redis instances, i.e. single
instances vs. high-availability with replication? If you run replication then
you at least need persistence for the master, and since any of the replicas
can fail over and become master that basically means you need persistence for
all of them.

~~~
andygrunwald
Original author here. Thank you.

The article didn't go into the depth of the server architecture. But in
general, all the instances I wrote about were set up as a classical master <->
slave pair, sometimes with multiple slaves.

Related to persistence: it depends, particularly on your tooling around it.
E.g. if you have a master <-> slave setup with persistence enabled, you have
multiple options: BGSAVE on the master only, BGSAVE on a slave only, a
separate slave without traffic used only for BGSAVE (to avoid the forking
issue; many companies do this), or BGSAVE on every node.

BGSAVE on every node can make sense if you have a running Sentinel that
promotes a slave to the new master once the master has failed.

IMO it depends. We are running master <-> slave environments with persistence
enabled on every node. Where we don't need persistence, we run only single
instances that are combined by the client into a consistent hashing ring.
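
As an illustration of the "BGSAVE on a slave only" option, a minimal sketch
(Python with redis-py; hostnames are hypothetical):

    # Keep RDB snapshotting off on the busy master and snapshot on a
    # replica instead, so the master never pays the BGSAVE fork() cost.
    import redis

    master = redis.StrictRedis(host="redis-master", port=6379)
    replica = redis.StrictRedis(host="redis-slave", port=6379)

    master.config_set("save", "")  # no automatic RDB saves on the master
    replica.bgsave()               # background snapshot on the replica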

Does this answer your question? Are there any questions left that I can answer?

------
jnordwick
> KEYS * ... O(N) with N being the number of keys in the database

How can this NOT be the case? The problem isn't the algorithmic complexity --
O(n) is optimal -- but that you are probably either bouncing around memory or
running through way more of it than you need to, depending on the order
returned and the data structure used.
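
In practice the pain with KEYS is usually that it blocks Redis's single thread
while building one giant reply; the incremental SCAN avoids that. A minimal
redis-py sketch (pattern and handler are hypothetical):

    # SCAN walks the keyspace in small, resumable chunks instead of
    # materializing one huge KEYS reply while everything else waits.
    import redis

    r = redis.StrictRedis(host="localhost", port=6379)

    for key in r.scan_iter(match="session:*", count=1000):
        handle(key)  # hypothetical per-key handler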

This whole article sounds like a bunch of "Full Stack" developers running
around making a mess of things with simple mistakes that any real back-end
person wouldn't have made in the first place.

~~~
matt4077
As mentioned in the article, the documentation wasn't always as explicit about
the problems with KEYS. And the problem was created way back, when being "a
real back-end person" still meant something completely different.

------
frik
Super interesting lessons-learned article. Will definitely try out Redis, to
see if it could replace Memcached (though that works rock solid) or be used as
a queue instead of Kafka (I dislike ZooKeeper).

~~~
kod
Kafka is not a queue, it's a circular buffer.

Comparing Redis and Kafka is like comparing chalk and cheese.

~~~
frik
The Wikipedia page mentions a "massively scalable message queue broker", and
says nothing about whether the buffer is circular or not -- but does it matter
in my case? Probably not. An in-memory message queue that can be optionally
persistent across a restart: that can be done with both Kafka and Redis.

[https://en.wikipedia.org/wiki/Apache_Kafka](https://en.wikipedia.org/wiki/Apache_Kafka)

~~~
kod
It's not a queue. Some people may use it for cases that resemble a message
queue, but it definitely is not one, and many valid use cases for Kafka would
be completely impossible with a traditional message queue or Redis.

How about actually reading the Kafka documentation instead of an inaccurate
Wikipedia page? It's pretty good and won't take you that long:

[http://kafka.apache.org/documentation/](http://kafka.apache.org/documentation/)

------
draw_down
Jeez, what a nightmare.

------
randomsofr
Hotel?

~~~
cmstoken
Trivago.

------
Kiro
I am using MongoDB and it works like I want it to, but I get roasted all the
time on HN because of it and told to use Redis instead. I'm still considering
the switch, but I don't want to run into problems like this.

~~~
phn
Could you give us an idea of your workload?

From what I've been told, MongoDB breaks in really bad and silent ways when
under pressure. And I mean "ignores writes without giving out errors" kind of
bad.

~~~
threeseed
And by "told" I take it you mean "read on HN"?

Because none of what you said is really all that true. MongoDB has never
ignored writes without giving errors; you have always been able to check
whether a write failed. The issue I think you're referring to is the
F_SYNC-immediately-to-disk default, which was changed many years ago and was
never the default for any of the language clients.

MongoDB is a great database in specific situations.

~~~
phn
Actually it's feedback from the data guys at work, not HN.

And you may be right about that configuration; I can't remember the exact
details of the discussion.

Either way, don't get me wrong. I use MongoDB as well, and I'm quite fond of
its aggregation framework. It's just not something that I'd file under
"reliable and high performance" the way I do Redis (which, of course, also has
limitations of its own).

