One problem with Redis is that it looks superficially simple, but actually, like any tool used in critical contexts, there are two possibilities: 1) Know how it works very well and do great things with it. 2) Don't understand it properly and find yourself in big trouble sooner or later. Because Redis, in order to provide the good things it provides, has, on one side, a set of limitations, and on the other side is tightly coupled with a number of things related to the operating system, tradeoffs in the configuration and so forth.
Certain things I understood over the years made it better: of course improvements in the implementation, but especially things like LATENCY DOCTOR and MEMORY DOCTOR (the latter only for versions >= 4.0). That is, most of the things a Redis operations expert can tell you, Redis can tell you directly... without any other help, just by observing its internal behavior.
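For anyone who hasn't run these: both are ordinary commands you can send from any client or from redis-cli. A minimal sketch, assuming a local instance and the Python redis-py client (the host, port, and threshold value are only illustrative):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Give LATENCY DOCTOR something to analyze: record latency events above 100 ms.
r.config_set("latency-monitor-threshold", 100)

print(r.execute_command("LATENCY", "DOCTOR"))
print(r.execute_command("MEMORY", "DOCTOR"))  # MEMORY DOCTOR requires Redis >= 4.0
```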
There is another aspect. For a long time I believed a Redis book was not really needed. Now I've changed my mind: I see no good enough Redis book out there, and I'm almost planning to write one... For a long time I was incrementally writing a C book, which is currently just an incomplete draft. I'm thinking about abandoning that project for now and starting with a Redis book.
Please, do consider writing the C book as well. Most publishers have beta programs these days (O'Reilly, Manning, Pragmatic Programmers). Don't leave it in a desk drawer (so to speak).
Perhaps a simplified Redis could work as a good base for learning C: single threaded, in-memory only (no persistence), a network service, so no complex stuff like concurrency and file management. Though there's probably a lot more going on in there...
Perhaps "Redis for devops" is another book waiting to happen, or how to write Redis modules. By the way, Manning's book (from 2013) is freely available these days:
https://redislabs.com/resources/ebook/
Yep, I don't want to abandon the C book... since it's already halfway done and is probably something more general and longer-lasting than Redis. The book indeed has certain Redis internals as examples, but isolated so that you don't need to understand the Redis code base.
I would also love to read a C book by @antirez! It'd go up there next to my K&R.
I think a Redis book is a good idea, but your docs and blog articles are already excellent. If you combined them into a single document (slightly updated or curated blog articles as chapters, with the second half of the book as a reference section/appendix for the redis.io commands), then your book is done. You could probably even just hire someone to handle the formatting.
Original author here.
First of all, thank you antirez for this beautiful piece of software. And I agree with you 100%.
At first glance it looks simple. Later, once you use it, it can become a beast very fast. But after you gain some more knowledge about it, you know how to deal with it.
What I really like is the approach you mentioned: "Over the last years I learned...". And the users of Redis can feel this.
More and more commands and tools are implemented in Redis that make handling bigger environments way easier, or that speed up your debugging sessions, like the WATCHDOG, SLOWLOG, and latency-based testing (as mentioned in the article).
Don't get me wrong. This article is not about blaming Redis. I, and we as a company, love Redis. Thank you.
As with many things, it's not the solution that's complex, the problem is. A seemingly simple solution to a hard problem often just shifts the problem to a different domain for you to deal with.
Understanding your problem and your tools deeply, as you mentioned, is what allows the great things; otherwise you just keep your fingers crossed and hope for the best.
A lot of problems can often be reduced in complexity by really understanding what you need vs. what you think you want or your user wants. The real challenge is translating what is wanted into what is needed (which is often hard to explain to clients).
Hi, Salvatore. I was actually just recently deciding how to contact you, as I saw an earlier comment of yours about wanting to switch your site over to Markdown.
Obviously I've benefited from your work on Redis, so if you'd like a hand converting the site over to Markdown or with the books (proofreading, technical editing) I'm happy to help. Let me know!
The biggest lesson that I think that most need is that Redis is a very sharp knife. It is perfect for some problems, and a horrible fit for others. Understanding what it is for is critical to having it work out well for you.
Exactly, but the use cases for it cannot be easily put into clearly delimited boxes. There are use cases where Redis is a good fit as a primary database, and use cases where it should not be used even if the data is ephemeral, to give an example. The right applications depend on the exact problem, and even problems that are not a fit can be re-stated so that they become good enough for Redis; similarly, problems that in theory are a great fit can easily be mis-implemented. As I said, IMHO it's a matter of understanding how Redis works exactly... in order to make great choices both in whether to apply it and in how to apply it.
Redis pushes the limit of what can be accomplished with a single-threaded, in-memory data store. If your processing can be done by such a thing, Redis is probably the most efficient option available. (With, of course, persistence, replication, and so on for managing your data.)
But if you throw blocking operations or too much data at it, Redis is going to demonstrate the inherent limitations that come with being a single-threaded, in-memory data store.
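A rough way to see this blocking behaviour for yourself, assuming a local Redis and the Python redis-py client: DEBUG SLEEP stands in for any slow command (a KEYS over a large keyspace, a long Lua script, and so on), and a trivial GET from a second connection stalls until it finishes. Note that DEBUG subcommands may be disabled on managed Redis offerings; this is only a demonstration.

```python
import threading
import time

import redis

blocker = redis.Redis()
reader = redis.Redis()

def hold_the_event_loop():
    # Occupies the single Redis thread for ~2 seconds.
    blocker.execute_command("DEBUG", "SLEEP", 2)

t = threading.Thread(target=hold_the_event_loop)
t.start()
time.sleep(0.1)  # give the blocking command time to reach the server

start = time.time()
reader.get("some-key")  # a trivial read...
print(f"GET took {time.time() - start:.2f}s")  # ...waits until DEBUG SLEEP returns
t.join()
```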
> The right applications depend on the exact problem, and even problems that are not a fit can be re-stated so that they become good enough for Redis, and similarly, problems that in theory are a great fit can be mis-implemented easily
How about adding something about this in the documentation, if you haven't already?
Redis is one of the most reliable pieces of software we have used. Years ago we ran it under a debugger in our staging environment for 2 years, hoping to supply a good bug report. Zero crashes to report in two years under cache and analytics load. Please write the book, I have always appreciated your writing.
> One problem with Redis is that it looks superficially simple, but actually, like any tool used in critical contexts ...
Definitely. And on top of that, Redis is very easy to get started with, and as the article illustrates, you can plug along very smoothly until you suddenly find a mountain to climb. Shouldn't scare someone off, but one of the lessons is to understand what you're implementing and where the bottlenecks will eventually be - much easier said than done, in my experience, but something to aim for.
This comment may be a bit snarky, but it's important to understand that the TL;DR for this is not: you need to learn how to debug Redis and understand a lot of internal things about it and the libraries using it. It's that you can really save yourself a ton of time and headache with a better understanding of application architecture.
Step 1) Throw everything at Redis because you don't know how to architect. Use it as a cache. Use it as a temporary data store. And use it as a persistent data store at the same time. What could go wrong, right?
Step 2) Watch it break because you're using it wrong.
Step 3) Write a blog post about everything you discovered about how you can sort of get it working while continuing to do it wrong.
Step 4) Only concede at the very end of the blog post that you were just doing it wrong™ from the beginning.
"The root causes of these errors were more a conceptional[sic] issue. The way we used Redis was not ideal for this kind of traffic and growth."
Unless I'm missing something, there's really no reason that all three use cases (caching, temporary data store, persistent data store) should all be sitting on a single Redis server so far away from the metal of the application server.
I do not want my search cache waiting on temporary data to be read and flushed from the cache. I do not want data I'd like to persist competing with search cache results. I don't want any of these things! That's not how this works. That's not how any of this works!
In any bigger project, architecture grows organically. It's easy to say "I would have done better". In hindsight, every flawed system looks like a bad idea. It takes a visionary to foresee all possible error cases.
Things go wrong. I think it's an honest post with lots of practical tips.
> In any bigger project, architecture grows organically. It's easy to say "I would have done better". In hindsight, every flawed system looks like a bad idea. It takes a visionary to foresee all possible error cases.
Kudos to the team on their findings and for their honest post with some tips for their odd architecture. I hope my comment did not imply that "I would have done better," or that hindsight is not 20/20. I have enough experience to understand both of those things.
However, as architecture grows organically, it doesn't take too much understanding or experience to know that using a single Redis instance for three different use cases would probably run into issues. That's more than likely not a symptom of "organic growth," but of poor decision making.
Trivago received a $4,000,000,000 valuation. Why does their engineering deserve the benefit of the doubt here?
At some point we (technologists and developers) have got to concede that maybe it's not just a few folks trying to keep up with organic growth, but that some actual bad decisions were made and should be accepted and highlighted more than just covering up for mistakes.
I think the post could have been more useful if it had conceded earlier that they made bad decisions from the get-go, that they had to play catch-up with those decisions, and that the advice was for attempting to catch up with those decisions. It would be very interesting to understand why they made the choices they did.
Why are they running memcached and Redis at the same time? Why wasn't a regular database suitable for data persistence or temporary data? Why wasn't the cache for search installed on the application server to remove network latency altogether? Why weren't they running php-fpm from the get-go?
I made the comment because doing some research or having some understanding from the beginning might have avoided the post altogether. I think it's important for us to be able to own our mistakes and honestly examine how we made them so they aren't repeated by others. I think that is far more useful than a writeup on how to adjust some settings to deal with some fairly basic mistakes. Especially for a technical blog post from a company with a $4B valuation.
> However, as architecture grows organically, it doesn't take too much understanding or experience to know that using a single Redis instance for three different use cases would probably run into issues. That's more than likely not a symptom of "organic growth," but of poor decision making.
If one always had time to reflect upon problems during development, you would be right. But during crunch time one makes a lot of decisions without proper evaluation, just based on a (more or less) educated guess.
> Trivago received a $4,000,000,000 valuation. Why does their engineering deserve the benefit of the doubt here?
Because a lot of readers here have worked in (potentially overvalued) environments that constantly underestimate the time required to build something. That someone reflected upon their mistakes and published a post mortem is a good sign that someone finally found the time to re-visit all those previous decisions with a rested mind and some distance from the frantic time of "just getting things done".
The simple answer is that if you're working on a growing product, your engineering team is probably stretched thin. The why behind all these questions is most likely a short term optimization of time.
A $4B "valuation" isn't particularly relevant. It's just a matter of the actual (not potential) resources they had at the time, and how they were allocated. And I'd bet that was mostly allocated toward product growth and development, not stability. Because, well, of course it was.
Original author here.
Thanks for your comment. I don't think you mean everything here in a bad or snarky way. I like to read critical feedback. That's how you learn.
But I guess you are mixing things up.
So yes, this architecture has grown over time. And the whole story didn't happen yesterday: it starts on September 3, 2010, so >6 years back. The first _real_ problems appeared in 2013, so ~4 years back.
With our current knowledge, I agree that those three use cases should not be packed into one Redis instance.
> I think the post could have been more useful if it would have conceded the point earlier that they made bad decisions from the get-go, that they had to play catch up with those decisions, and that the advice was for attempting to catch up to these decisions.
This post is written as a kind of story. Of course it would have been possible to structure it around the learnings instead.
> It would be very interesting to understand why they made the choices they did.
Feel free to ask any question you want. I am here and happy to answer.
> Why are they running memcached and Redis at the same time?
Because these are two different systems with two different concepts for different use cases.
For pure caching, I agree: both can cache data.
But for use cases where you need master-slave replication (maybe across data centers), memcached may not fit.
IMO it is hard to say both systems are the same.
E.g. if you deal with caching data that varies a lot in size per entry (within the same data concept), one memcached instance / consistent hashing ring might be the wrong solution because of memcached's slab allocation concept.
> Why wasn't a regular database suitable for data persistence or temporary data?
This question can't be answered in general.
But let me use one use case as an example.
We had a MySQL table running on InnoDB that was really read heavy, with ~300,000,000 rows. The indexes were set for the read patterns. Everything fine.
A normal insert into this table took some time, and we wanted to avoid spending that time during a user request, because it would slow the request down.
One option the dev team considered was to write the data to Redis and have a cron job that reads it out of Redis and stores it in the table.
With this, the insert time was moved away from the user request and back to the system.
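A hedged sketch of that pattern (not trivago's actual code): during the request, push the row onto a Redis list, and let the cron job drain the list into MySQL. The list name and the MySQL helper are hypothetical, and the Python redis-py client is assumed.

```python
import json

import redis

r = redis.Redis()

def record_row(row: dict) -> None:
    # Called during the user request: a cheap O(1) RPUSH instead of a slow INSERT.
    r.rpush("pending_inserts", json.dumps(row))

def store_row_in_mysql(row: dict) -> None:
    # Placeholder for the real INSERT into the InnoDB table.
    print("would INSERT", row)

def drain_to_mysql(batch_size: int = 1000) -> None:
    # Called by the cron job: move buffered rows into MySQL, off the request path.
    for _ in range(batch_size):
        raw = r.lpop("pending_inserts")
        if raw is None:
            break
        store_row_in_mysql(json.loads(raw))
```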
> Why wasn't the cache for search installed on the application server to remove network latency altogether?
There is a cache on the application server. But we have several cache layers in our architecture. This was just one of them.
> Why weren't they running php-fpm from the get-go?
We are working on a switch to php-fpm from the typical prefork model.
Such a task sounds easy, but in a bigger environment it can become quite a challenge.
> I made the comment because doing some research or having some understanding from the beginning, might have avoided the post altogether.
Agreed. The challenge here was that the people who introduced Redis at that time had left the company. So in short: we had the problem and had to fix it. With our current knowledge, we would fix it in a different way and maybe choose different approaches. But yeah, I assume this is a normal learning process.
Thanks for the awesome writeup. Since you opened up questions:
From the start of your cutover (when you started seeing the 500s) to resolution of the various issues along the way--how much time elapsed? (e.g., deciding to swap client libraries, A/B testing, redis upgrade, shifting load to dedicated instances, implementing proxies/memcached)?
And what was the end user impact (e.g., 50% of users would see timeouts during peak usage for the day, or users using certain functionality would be affected 1% of the time, etc.)?
Just trying to get a sense of the level of urgency involved in terms of chasing down all these leads. It seemed pretty methodical, so it's hard to tell if it was a slow-burning persistent nagging issue that you chipped away at over a couple of months, or an all hands on deck sequential process of trying a lot of different things in a relatively short period of time to keep the site afloat.
Here's the fix: fork the Redis client API for your language (Node.js, Java, C#, whatever it may be) and remove the .KEYS function completely. Stop your entire dev team from shooting themselves in the foot.
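For illustration, the usual replacement for KEYS is incremental iteration with SCAN, which does the same walk in small, non-blocking steps. A minimal sketch assuming the Python redis-py client; the 'session:*' pattern is just an example:

```python
import redis

r = redis.Redis()

# Dangerous on a large keyspace: KEYS walks everything in one blocking call.
# keys = r.keys("session:*")

# SCAN-based iteration returns keys in small batches, so Redis keeps serving
# other clients between calls.
for key in r.scan_iter(match="session:*", count=1000):
    print(key)  # replace with whatever per-key handling you actually need
```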
Christ, any sympathy I had for them vanished when I saw they were using keys... The documentation is very clear about query complexity so this should have been an obvious one.
Setting up a new connection for every request is considered normal for stateless applications? Really? This is a big scaling issue in your stack. You need connection pooling for stateless apps too.
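For a long-lived worker process, the pooling the parent is describing can be as simple as creating the pool once per process and reusing it across requests. A minimal sketch assuming the Python redis-py client; host, port, and pool size are illustrative:

```python
import redis

# Created once per process, shared by all requests handled by this worker.
POOL = redis.ConnectionPool(host="localhost", port=6379, max_connections=50)

def handle_request(user_id):
    # Borrows an existing connection from the pool: no new TCP handshake per request.
    client = redis.Redis(connection_pool=POOL)
    return client.get(f"user:{user_id}")
```

(As the thread below notes, this only works where the worker outlives a single request, which is exactly what makes it awkward in the classic PHP model.)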
I thought it was normal to do connection pooling as well, but I was pretty shocked when I saw how much the API at my job was murdering our PostgreSQL server. Every request meant a completely new connection. So while reads and writes were taking up maybe 10% of the server at peak times, creating and closing connections was taking up easily 5x more than that. If I was particularly unlucky, the PostgreSQL server would stall for a few minutes.
Still, I had to implement pgpool-II and do connection pooling that way. It was also a PHP application. :(
If you are containerizing your stateless app, where do you store the connection pool? I suppose you could set up another container as a TCP proxy that maintains persistent connections and then connect to Redis through the proxy?
What do you mean by "containerizing"? Unless you restart app instances for every single request, I don't see how connection pooling is an issue. Just putting a worker in a container doesn't stop it from keeping a connection open.
Persistent connections are far more complicated to reason about. I believe what you mean is that this makes it more difficult to scale PHP, which is arguably true, but at the same time it forces you to understand underlying concepts.
It's typically session-specific things combined with connection pooling that I've seen underlying (sometimes long-uncaught) bugs in production systems.
E.g. an 'ALTER SESSION'/'NLS_DATE_FORMAT' command in Oracle, or even an unfortunate 'USE <db>' with MySQL.
There are of course safe solutions and techniques for this, but when you have an otherwise stateless-by-design codebase (such as with PHP), picking up "possibly state-laden" connections is a bit of an unexpected concept that I've seen catch developers more than once.
Nothing, but the 2 parent comments both make strange statements which leads me to believe there is a misunderstanding.
> If you are containerizing your stateless app, where do you store the connection pool? I suppose you could setup another container as a TCP proxy that can maintain persistent connections and then connect to redis through the proxy?
Containerizing doesn't prevent you from having a connection pool, but PHP doesn't allow you to have a proper connection pool regardless. I believe there is a misunderstanding on containers here.
> What do you mean by "containerizing"? Unless you restart app instances for every single request, I don't see how connection pooling is an issue. Just putting a worker in a container doesn't stop it from keeping a connection open.
This comment corrects the misconception about "containerizing", but then overlooks that PHP is effectively restarted per request.
Additionally, it is definitely not stateless: the session cache persists between requests as well. You can't just load balance between 20 php-fpm instances without moving the session cache to something shared.
You can do persistent connections, but they only apply within the same child process, and you don't have fine-grained control over that process if you choose to use them.
Session cache is stored on disk.
The "debugging" they went through appears to be random acts of system config modifications and upgrades. Talk about making a bad situation worse, they completely lost track of the issue. Upgrading redis in the middle of figuring out the problem? My God....
I checked out twemproxy because it seemed interesting and noticed a few things:
* the last code commit was 8 months ago (and was very minor)
* there are 120 open issues
* it's written in C
Now this is far from abandonware (IMO), but it still seemed like a risky dependency to lean on. Have you guys moved past it already? Any thoughts on the cost-benefit analysis there or alternative solutions?
Note: I mention C as a detriment because I myself cannot write production C and would therefore think twice before leaning on a library/service I could not fork (and probably many other devs would too).
twemproxy is not the best piece of software when it comes to maintenance and code quality, but it works reliably. We are still using twemproxy and maintaining our own fork of it. This fork often carries our own changes, like bug fixes or support for new Redis commands that are not supported upstream yet.
Every bug we fix, we submit upstream as a PR.
We have also cherry-picked a few PRs from the upstream repository internally and deployed them.
But we also have a few issues with it, like this one: https://github.com/twitter/twemproxy/issues/442. Live reconfiguration would also be a nice feature.
Think carefully about whether you really need this, because it is not only fun.
A few pain points:
* Deployments of new configs and restarting twemproxy on your server farm
* New Redis version with new commands? You have to extend twemproxy to support those commands
I don't know of any _real_ alternative to it right now.
I know of https://github.com/Netflix/dynomite, which is a twemproxy fork.
If we had only memcached and no Redis (we are using both with twemproxy), then we would try https://github.com/facebook/mcrouter from Facebook. But it is specialized for memcached.
If you come up with an alternative, please let me know.
"Proper monitoring and alerting" do you mean the Redis software watchdog? As a bit above you mention running it with flag =0 is generally not a good idea, and propose =500.
Do you use it or something else for Redis production server monitoring? something with little impact on performance.
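For reference, the software watchdog asked about here is toggled at runtime with CONFIG SET; per the Redis latency troubleshooting docs, a period of 0 turns it off. A minimal sketch assuming the Python redis-py client; it's a debugging aid (it logs a stack trace when a command blocks longer than the period), not routine monitoring, so it's usually enabled only while chasing a specific latency problem:

```python
import redis

r = redis.Redis()

r.config_set("watchdog-period", 500)  # log anything blocking longer than ~500 ms
# ... reproduce the latency issue, then read the Redis log for the traces ...
r.config_set("watchdog-period", 0)    # 0 disables the watchdog again
```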
This is a very well written and insightful article. However, while I was reading it I kept wondering: "What is the time-frame over which the debugging and solution finding occurred?" Was it hours, days, weeks, or even longer? My guess is that the time-frame was days (not necessarily continuous).
I think an article like this could serve well as an example to non-developers to demonstrate the path taken to solve a problem but also how long it takes to solve a problem.
Original author here.
It really depends on which debugging session. Several debugging sessions stretched across days, because we didn't understand the problem at first.
Others took hours or even just minutes (e.g. when we checked which commands we were using and optimized them).
Not sure if this is a dumb question, but if your cache is behaving so badly (40% 500 error?), shouldn't you just get rid of the cache? Maybe allocate some more resources to the DB, but the DB already caches frequently accessed data and you save yourself the roundtrip of check-cache-then-db.
Original author here. I think I wrote that sentence in the post in a confusing way.
The error rates (40% 500 errors) were not normal.
A normal hit rate was ~97%. But because the Redis server itself was overloaded by KEYS requests, BGSAVE / forking, etc., this instance was not able to answer the cache requests.
Due to this we had a high error rate.
But in general I agree: a cache that behaves this badly under normal operation should be avoided.
I hope this makes more sense now?
The conventional wisdom is to move as much load as possible from DB to a cache. The reason is quite simple. Scaling a database is hard and expensive. Scaling a cache is very easy in comparison.
Interesting write-up. Did I miss it, or did the author not talk specifically about the implementation architecture for the redis instances, i.e. single instances vs. high-availability with replication? If you run replication then you at least need persistence for the master, and since any of the replicas can fail over and become master that basically means you need persistence for all of them.
The article didn't go into depth on the server architecture. But in general all the instances I wrote about were set up as a classical master <-> slave connection, sometimes with multiple slaves.
Regarding persistence: it depends. It depends on the tooling around it.
E.g. if you have a master <-> slave setup with persistence enabled, you have multiple options: BGSAVE on the master only, BGSAVE on the slave only, a separate slave without traffic used only for BGSAVE (to avoid the forking issue; many companies do this), or BGSAVE on every node.
BGSAVE on every node can make sense if you have Sentinel running, which promotes a slave to be the new master once the master fails.
IMO it depends. We are running master <-> slave envs with persistence enabled on every node.
Where we don't need persistence, we run only single instances that are combined by the client into a consistent hashing ring.
Does this answer your question? I'm happy to answer anything that's left.
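A minimal sketch of the "BGSAVE on a dedicated slave only" option described above, assuming the Python redis-py client; the hostnames are illustrative, and in practice the second half would run from a cron job:

```python
import redis

master = redis.Redis(host="redis-master.internal")
backup_slave = redis.Redis(host="redis-backup-slave.internal")

# Keep the master free of fork()-related latency: no automatic RDB snapshots.
master.config_set("save", "")

# Trigger the snapshot on the traffic-free replica, so the fork happens there.
backup_slave.bgsave()
```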
Why would you want all of the replicas to be able to fail over and become master?
A common pattern is to separate writes from reads. Writes go to a small pool, reads get replicated everywhere. Your read-only replicas will only ever be slaves, and do not need persistence.
> KEYS * ... O(N) with N being the number of keys in the database
How can this NOT be the case? The problem isn't the algorithmic complexity -- O(n) is optimal -- but that you are probably either bouncing around memory or running through way more of it than you need to, depending on the order returned and the data structure used.
This whole article sounds like a bunch of "Full Stack" developers running around making a mess of things with simple mistakes that any real back-end person wouldn't have made in the first place.
As mentioned in the article, the documentation wasn't quite as obvious on the problems with KEYS. The problem was created way back when being "a real back-end person" still meant something completely different.
Super interesting lessons-learned article. Will definitely try out Redis, to see if it could replace memcached (though that works rock solid) or be used as a queue instead of Kafka (I dislike ZooKeeper).
The Wikipedia page mentions "massively scalable message queue broker", with no word on details such as whether the buffer is circular or not - but does it matter in my case? Probably not. An in-memory message queue that can optionally be persistent across a restart - that can be done with both Kafka and Redis.
It's not a queue. Some people may use it for cases that resemble a messaging queue, but it definitely is not, and many valid use cases for Kafka would be completely impossible with a traditional message queue or Redis.
How about actually reading the Kafka documentation instead of an inaccurate Wikipedia page? It's pretty good and won't take you that long.
I am using MongoDB and it works the way I want, but I get roasted all the time on HN because of it and told to use Redis instead. Still considering the switch, but I don't want to run into problems like this.
Please don't switch DBs on a whim just because someone told you to. Unless it's your project lead or company CTO, you should really understand why you're making the move.
Also MongoDB and Redis have very few overlapping use cases, they're not really comparable.
From what I've been told MongoDB breaks in really bad and silent ways when under pressure. And I mean "ignore writes without giving out errors" kind of bad.
Because none of what you said is really all that true. MongoDB has never ignored writes without giving errors; you have always been able to check whether a write failed. The issue I think you're referring to is the F_SYNC-immediately-to-disk default, which was changed many years ago and was never the default for any of the language clients.
MongoDB is a great database in specific situations.
Actually it is feedback I have from the data guys at work, not HN.
And you may be right about that configuration, but I cannot remember the exact details of the discussion.
Either way, don't get me wrong. I use MongoDB as well, and am quite fond of its aggregation framework. It's just not something that I catalog as "reliable and high performance" the way I do redis (which of course, also has limitations of its own).
Lots of small writes (200-1000 a second at peak maybe), but it only reads them on server start, once a day. Honestly, I don't know if it's failing already, but it doesn't matter that much. I would prefer to use something more stable though.