The Redis criticism thread (antirez.com)
152 points by HeyChinaski on Dec 10, 2013 | hide | past | favorite | 44 comments

To summarize quickly for those that didn't read the article: this focuses mostly on the new "clustering" aspect, not on the "classic" single Redis server.

I think it is important to be honest with the users and make it clear how and what happens behind the scenes, and how data could be lost. Salvatore has done most of this; it could just be made a bit more explicit, as there still seems to be some confusion around it.

All this is in light of two things: 1) With the popularity of, and amount of talks and churn around, distributed systems these days, people expect a point on the map in the CAP triangle. Saying "we kind of do this, and we provide some C, a little A, and a dash of HA" was probably OK 5 years ago; now it needs a bit more definition. 2) Other database systems have misled users about what they can provide (you know which one I am talking about), and that has resulted in lost data, so there is some apprehension and a higher bar that needs to be met for a db product to be accepted.

One good thing that came out recently is NoSQL database writers/vendors pushing for more rigorous tests: tests that run for weeks and months, consistency tests, and network-partition tests like those run by Aphyr. It is a very good idea that those things are talked about and defined better.

I think Antirez is simply doing a master/slave system with async replication. This is what MySQL, Solr, Elasticsearch, Kafka, etc. do. This is not an unusual model at all.

In CAP theorem terms, Redis has picked zero (remember CAP theorem is pick at most two).

There's a bunch of people who've made this choice, but why? C incurs a synchronization cost. A means that you have to reconcile divergent write streams. If you want consistent semantics in a very fast database, you can't pick either. So you end up somewhere in the middle of the triangle.

The consequence of picking zero is that you'll lose a time window of data roughly proportional to the replication lag when the master fails/partitions. There are many applications for which bounded data loss is a perfectly reasonable paradigm.

I assert that this design actually allows for arbitrarily long windows of data loss, but I haven't verified that the implementation matches my understanding of antirez's WAIT/failover algorithm yet. Pretty sure though.

If you read the Redis Cluster documentation, the data loss is acknowledged, and it even explains the most obvious failure modes where this happens. WAIT does not provide strong consistency either, and is actually not even documented in the Cluster doc. However, WAIT as it is makes data loss less likely.
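For readers unfamiliar with WAIT: the real command takes a replica count and a timeout, blocks until that many replicas have acknowledged the preceding writes (or the timeout fires), and returns how many actually acked. Below is a toy, in-memory simulation (all class and method names invented for illustration, not the redis-py API) of why checking the ack count narrows the loss window but does not close it:

```python
# Toy model of Redis-style async replication with a WAIT-like primitive.
# A write acked by only some replicas can still be lost if the master
# fails and a replica that never received it gets promoted.

class Replica:
    def __init__(self):
        self.data = {}
        self.acked_offset = 0      # highest replication offset received

class Master:
    def __init__(self, replicas):
        self.data = {}
        self.offset = 0            # position in the replication stream
        self.replicas = replicas
        self.pending = {r: [] for r in replicas}  # writes not yet delivered

    def set(self, key, value):
        self.data[key] = value
        self.offset += 1
        for r in self.replicas:
            self.pending[r].append((key, value, self.offset))

    def replicate_one(self, replica):
        """Deliver one queued write to one replica (network permitting)."""
        if self.pending[replica]:
            key, value, off = self.pending[replica].pop(0)
            replica.data[key] = value
            replica.acked_offset = off

    def wait(self):
        """Like WAIT: how many replicas have acked up to the current offset?
        (The real command also blocks up to a timeout.)"""
        return sum(1 for r in self.replicas
                   if r.acked_offset >= self.offset)

r1, r2 = Replica(), Replica()
m = Master([r1, r2])
m.set("k", "v1")
m.replicate_one(r1)   # only r1 receives the write before a partition
print(m.wait())       # 1: the write reached one of two replicas
# If the master now dies and r2 is promoted, "k" -> "v1" is lost,
# even though WAIT told us exactly how exposed we were.
print("k" in r2.data) # False
```

The point: WAIT reports the replication state at the moment of the call; it cannot force the failover machinery to promote a replica that has the write.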

Since everything in the Redis Cluster design and documentation is about trading consistency for performance, I would expect the system to be analyzed for what it is: is it good at providing weak but practically usable consistency and performance? In short, does it respect the design intent?

Saying again and again that it does not feature consistency is a sterile exercise.

The strongest guarantee it can offer is how quickly the system detects a failure in order to recover or halt.

In CAP theorem terms, Redis has picked zero (remember CAP theorem is pick at most two).

That might be true, but describing something in the CAP framework is not the only salient thing you can say about a distributed storage system; it only characterizes the failure modes.

You also have to think about what happens in the normal case. In the normal case you have a consistency vs. latency tradeoff. People have written about this, but unfortunately I don't think this broader way of thinking has gotten the attention it deserves:



"If you examine PNUTS through the lens of CAP, it would seem that the designers have no idea what they are doing (I assure you this is not the case). Rather than giving up just one of consistency or availability, the system gives up both!"

"The reason is that CAP is missing a very important letter: L. PNUTS gives up consistency not for the goal of improving availability. Instead, it is to lower latency. "

I was just making the point that since even people with a better-than-average understanding of distributed systems can't quite put their finger on what Redis Cluster's failure modes or guarantees are, perhaps a bit more clarification is needed.

Even you are using "I think antirez is simply doing..."

I'm not arguing that the design is useless or bad. Salvatore's code is great; I think it is just a matter of more testing, docs, or a blog post.

> In CAP theorem terms, Redis has picked zero (remember CAP theorem is pick at most two).

No, he picked P, Partition Tolerance. You can't not pick P unless you assume a never-failing network with never-failing nodes.

This is the first time I've ever heard any criticism of Redis. As far as I know, everyone loves it; it's a great tool for many jobs, and it's amazingly written and solid to boot. I think many of the critics are trying to apply it in ways it wasn't meant to be used.

As far as I know, its main purpose is non-critical data that needs to be accessed as quickly as possible in various different ways, and that's where redis shines.

I don't pay super close attention to this stuff, but it's very obvious to me that not everyone likes Redis; in particular, the kinds of people who have problems with Mongodb tend also to have the same problems with Redis; long-term operational reliability is a big question mark with it.

For my part: on only a few occasions (mostly fuzzer farm stuff) have I ever used Redis and been happy with the decision. I usually regret using it. I always regret using it as an alternative to SQL.

I would hope this is not the case, because it means the nay-sayers are careless developers, or simply internet-attention-seeking by criticizing instead of contributing.

Mongo has taken a lot of knocks in part because its write ack behavior was (initially) documented in a way that not everyone felt was sufficient, but also because it is marketed as a universal solution for RDBMS woes.

Redis has never been touted as a replacement for a primary data store, but more of a "toolbox" in some cases, or "middleware" in others. To apply the same criticisms to Redis and Mongo implies those users did not research their platform decisions at all.

I don't know, I don't think it should be compared (or considered as an alternative to) MongoDB, and certainly never SQL. I use it to store cached keys, session data and similar other data that I can always regenerate. I've also had great success using it as a message queue on multiple projects.
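The message-queue use mentioned above is usually the LPUSH/BRPOP pattern: producers LPUSH jobs onto a list, consumers BRPOP them off, and BRPOP blocks until an item arrives so idle consumers are cheap. Here is the same flow mirrored on a plain `deque` so the sketch runs without a server; with redis-py it would be `r.lpush("jobs", payload)` and `r.brpop("jobs")`:

```python
# LPUSH/RPOP queue pattern, mirrored on collections.deque. The Redis
# list "jobs" is represented by the deque; payloads are JSON strings,
# as they commonly are on a real Redis list.
from collections import deque
import json

queue = deque()  # stands in for the Redis list "jobs"

def lpush(payload):
    """Producer side: push a job onto the head of the list."""
    queue.appendleft(json.dumps(payload))

def rpop():
    """Consumer side (BRPOP minus the blocking): pop from the tail."""
    return json.loads(queue.pop()) if queue else None

lpush({"task": "resize", "id": 1})
lpush({"task": "email", "id": 2})
print(rpop())   # {'task': 'resize', 'id': 1} -- FIFO across LPUSH/RPOP
```

Pushing on one end and popping on the other is what makes the list behave as a FIFO queue.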

What was your use case, and what was your experience with it?

Agreed. If Redis is an acceptable alternative to MongoDB you're using one of the two very incorrectly.

MongoDB is not significantly better at any particular thing so yes, Redis can be used as an alternative for the things I may use MongoDB for. Keep a close eye on http://www.amisalabs.com

Use case: I want to store some json values against some string keys. Which would be very incorrect to use for this?

How often are you updating the json values? 10/s? 100/s? 1M/s?

How many string keys? Millions? Billions? What happens if you lose an update?

Redis is great for lots of rapid reads and moderate write speed, for data that fits on a single server or can be manually sharded well (if it's straight k=>v, as you describe, that basically means your json fits in RAM; if you're using larger objects like sets/zsets/etc, it becomes a slightly different discussion), as long as your application can lose a few seconds of writes without it killing you (BGSAVE isn't instantaneous, of course).
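For the straight k=>v JSON case, the usual shape is a thin wrapper that serializes on write and deserializes on read. A minimal sketch, with a plain dict standing in for the client so it runs anywhere (with redis-py you'd swap in `redis.Redis()` and use `set`/`get` instead of item access):

```python
# JSON-over-string-keys helper. The "client" here is just a dict;
# the commented lines show the equivalent redis-py calls.
import json

class JsonStore:
    def __init__(self, client):
        self.client = client

    def put(self, key, obj):
        self.client[key] = json.dumps(obj)    # r.set(key, json.dumps(obj))

    def get(self, key):
        raw = self.client.get(key)            # r.get(key)
        return None if raw is None else json.loads(raw)

store = JsonStore({})
store.put("user:42", {"name": "Ada", "plan": "pro"})
print(store.get("user:42"))   # {'name': 'Ada', 'plan': 'pro'}
```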

If the data is critical and you must keep it: both. If not: either.

Redis people think it could be used instead of SQL for some designs:


I want to use it in a small project so I can learn the design patterns associated with key-value stores. I'm sure there are good and bad things about using Redis vs Postgres.

But it has to hold my data, so... does it hold my data or not? I'm talking about single box stuff here, of course.

(Of course, I'll just use it and find out myself).

That's supposed to showcase redis' features, not to tell you that you should dump Postgres for redis.

Which is a mistake a lot of developers make - trying to replace one data store for another wholesale, and then being surprised when problems arise.

But why not?

Is all data in a database that important? Maybe alternative databases would generate new webapp patterns, like dynamic languages do.

Anecdotes don't make data, but I dislike MongoDB and love Redis.

MongoDB seems like a solution that was optimized for a certain performance sweet spot that, post-SSD and huge RAM, hardly anybody ever really encounters; 99% of the time it's used, you would be better off with Postgres.

Redis on the other hand hits a (probably permanent) sweet spot of ultra-high performance one level of abstraction lower than your usual database; which is a use case everyone trying to saturate metal and squeeze out the last bit of performance on a minimal budget has.

As an alternative to SQL it's pretty awful (it's explicitly not ACID, for one).

As an alternative to an in-memory cache or (occasionally) a DB de-normalization, on the other hand, it works amazingly well for me.

I see the persistence as basically just a backup that helps your system get back up to speed quickly after a crash or restart. It's not something that should ever be treated as reliable.

I wonder how many people who use it under this or similar use cases have an issue with it.

What went wrong? What's the Redis K/V model for it being used as a fuzzer farm data thing?

It would be great to also enumerate other examples where it shines.

I've used it for caching, session and transient objects connected to Rails without issue for the last two years.

Additionally, I've been looking at it recently for a simple database to perform fast matches on sorted data (E.g. get me the lowest number in this set).
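"Get me the lowest number in this set" maps directly onto a Redis sorted set: `ZADD prices <score> <member>` to insert, then `ZRANGE prices 0 0 WITHSCORES` to read the minimum. A sketch of the same query on a plain heap, so it runs without a server (names are illustrative):

```python
# Sorted-set "min" query mirrored on a heap of (score, member) pairs.
import heapq

prices = []   # stands in for the zset "prices"

def zadd(member, score):
    """ZADD prices <score> <member>"""
    heapq.heappush(prices, (score, member))

def zrange_min():
    """ZRANGE prices 0 0 WITHSCORES: the lowest-scored member."""
    return prices[0] if prices else None

zadd("itemA", 9.99)
zadd("itemB", 4.50)
zadd("itemC", 12.00)
print(zrange_min())   # (4.5, 'itemB')
```

In real Redis the zset keeps members fully sorted, so min, max, and rank/range queries are all O(log N) lookups.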

For three years we've been using it as message bus, statistics tracker, and storing medium-term data (expires after 1 month). We've up/downgraded multiple times without issue.

>>We've up/downgraded multiple times without issue.

Why would you need to downgrade if you didn't have issues? ;) ...kinda joking, kinda honest curiosity.

Sorry, I meant up/downgraded the server. We stopped upgrading software at Redis 2.6 :)

I've used it as you have for two things: - Storing user session data for high-use web applications. - High score tables.

And one more: - Storing diffs like those used for Google Docs (literally every change) for a real-time communication system.

The last one works by the client not getting a response and continuing to queue up its diff and send it once the server is back up. Manual conflict resolution is likely necessary after a drop in availability, but consistency is easy to figure out with monotonically increasing IDs.
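The scheme in that last paragraph is easy to sketch: the client keeps queuing diffs while the server is unreachable, each diff carries a monotonically increasing sequence number, and the server ignores anything at or below its high-water mark, so blind retries after an availability drop are safe. All names below are invented for illustration:

```python
# Toy diff queue with monotonically increasing IDs and idempotent replay.

class DiffServer:
    def __init__(self):
        self.applied = []
        self.last_seq = 0          # high-water mark of applied diffs

    def receive(self, seq, diff):
        if seq <= self.last_seq:   # duplicate from a retry; skip it
            return
        self.applied.append(diff)
        self.last_seq = seq

class DiffClient:
    def __init__(self, server):
        self.server = server
        self.pending = []
        self.seq = 0

    def edit(self, diff):
        self.seq += 1
        self.pending.append((self.seq, diff))

    def flush(self):
        """Send (or resend) everything pending; the server dedups by seq."""
        for seq, diff in self.pending:
            self.server.receive(seq, diff)
        self.pending.clear()

s = DiffServer()
c = DiffClient(s)
c.edit("+hello")
c.flush()
c.edit("+ world")   # server "down": diffs queue up client-side
c.edit("!caps")
c.flush()           # server back: both land exactly once, in order
print(s.applied)    # ['+hello', '+ world', '!caps']
```

Because the IDs are totally ordered per client, figuring out "how far did the server get" after a drop is a single integer comparison; only the semantic merge of concurrent edits needs manual resolution.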

Just to be clear, the criticism is of Redis Cluster, which has been redesigned because the first design was so lax it was effectively unusable. The real issue is that, as of right now, I have no idea what the exact semantics and failure conditions of Redis Cluster are. I'm not sure anyone really knows. Salvatore thinks he knows, but we can't be sure that he does.

With all this talk of practicality, what really makes distributed systems practical is when someone can do a formal analysis of them and conclude what possible states can occur in the system. This is not work that database users should do, it is work that database implementors should do. The failure to use a known consensus system is a failure to deliver a database I can understand.

I find this all a bit disappointing since I've been a huge fan of Redis since the early days. It's an amazing tool that I still have in production, but I get the feeling that its utility will never expand to suit some of my larger needs. Bummer.

He is using known algorithms. He describes them here: https://news.ycombinator.com/item?id=6780342.

Thanks, I hadn't seen this.

> Redis is probably the only top-used database system developed mostly by a single individual currently

isn't SQLite pretty much just Richard Hipp?

Dan Kennedy does a lot of work, certainly appearing to be a similar order of magnitude to DRH. Joe Mistachkin appears to do Windows almost exclusively and I can't tell if he is part time or full time.


I had not realised that. Thanks.

Salvatore is admirably proactive at garnering criticism for his project. He's also incredibly gracious when accepting it.

There has been a lot of constructive discussion in the Google Groups thread, and a lot of negativity on the same topic on Twitter. Ignoring for the moment the difference in discussion platforms, I am uncertain why the sudden uptick in Redis negativity.

Is it from users trying (and failing) to replace RDBMS or distributed platforms feature-for-feature with a single threaded, memory limited store like Redis? Or could it be FUD from people interested in seeing their new platform-du-jour get more attention?

The "clustering" feature is just now becoming stable. Distributed systems are complex (in no small part because they're generally poorly explained) so some mistakes were guaranteed. Hence the uptick in critics makes perfect sense even sans self-promotion conspiracies.

I think many times people criticize new-ish web technologies because they don't want to spend time learning them (which can be okay). They hope to sway others away so that a critical mass of users never compels them to adopt the now-necessary tech (not that Redis is 'necessary').

Quick things:

1) I expected that thread to look like a Cultural Revolution "struggle session". Thankfully it wasn't.

2) As I am sure many others have already said, durability has very little to do with CAP. CAP is about the A and I in ACID; D is an orthogonal concern.

3) Durability doesn't necessarily mean losing high performance. Most databases let the user choose how much data they're willing to lose and for what latency decreases -- the standard approach (used even in main-memory databases like RAMCloud[1] and recent versions of VoltDB[2]) is to keep a separate write-ahead log (WAL) and let the end-user choose how frequently to fsync() it to disk as well as how frequently to flush a snapshot of main memory data structures to disk.
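Redis itself exposes exactly this WAL-style knob through its append-only file. The relevant stock redis.conf directives:

```
# redis.conf: Redis's own version of the WAL-fsync tradeoff.
appendonly yes         # keep an append-only file (Redis's write-ahead log)
appendfsync everysec   # fsync once per second: lose at most ~1s on a crash
# appendfsync always   # fsync every write: strongest durability, slowest
# appendfsync no       # leave flushing to the OS: fastest, largest window
```

RDB snapshots (the periodic dumps of the in-memory data) play the role of the "flush a snapshot of main memory data structures to disk" step described above, letting the AOF be rewritten and kept short.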

There are many papers (e.g., http://citeseerx.ist.psu.edu/viewdoc/summary?doi= that talk about various challenges of building WALs, but fundamentally users who want strongest possible single-machine durability can choose to fsync() on every write (and usually use battery backed raid controllers or separate log devices like SSDs with supercapacitors or even NVRAM if the writes are going to be larger than what fits into the RAID controller's write-back cache). Others can choose to live with possibility of losing some writes, but use replication[3] to protect against power supply failure and crashes -- idea being that machines in a datacenter are connected to a UPS, replicas don't all live on the same rack (to protect against -- usually rack local -- UPS failure), and there's cross-data center replication (usually asynchronous with possibility of some conflicts -- notable exception being Google's Spanner/F1) to protect (amongst many other things...) against someone leaning against the big red-button labeled "Emergency Power Off" (which is exactly what you think it is).

Flushes of main data do also hurt bandwidth with spinning disks and old or cheap SSDs, but there's a solution: use a good, but commodity, MLC SSD with synchronous toggle NAND flash and a good controller/firmware (Sandforce SF2000 or later series, Intel/Samsung/Indilinx's recent controllers). These work on the same principle as DDR memory (latch onto both edges of a signal) to provide sufficient bandwidth to handle both random reads (traffic you're serving) and sequential writes (the flush).

4) I know several tech companies and/or engineering departments therein who absolutely love and swear by redis. There are very good reasons for it: the code is extremely clean and simple[4], and it handles a use case that neither conventional databases nor pure kv-stores or caches handle well.

That use case is, roughly described, data structures on an outsourced heap for maintaining a materialized view (such as a user's newsfeed, adjacency lists of graphs stored efficiently using compressed bitmaps, counts, etc...) on top of a database. So my advice to antirez is to focus the effort on making this use case simpler rather than building redis out into a database: build primitives that let developers piggyback durability and replication onto a database or a message queue. In fact, I've known of multiple startups that have (in an ad-hoc way) implemented pretty much exactly that.

This is still a tough problem, but one which (I think) would yield a lot more value to redis users. Just thinking out loud, one approach could be a way to associate each write to redis with an external transaction id (HBase LSN, MySQL gtid, or perhaps an offset in a message queue like Kafka). When redis flushes its data structures to disk, it stores the last flushed transaction id to persistent storage.
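A toy of that "tag each write with an external transaction id" idea (all names invented for illustration; the id could be an HBase LSN, a MySQL GTID, or a Kafka offset): each write carries the upstream id, and a snapshot persists the data together with the highest id it covers, so recovery knows exactly where to resume replaying the upstream log.

```python
# Writes tagged with an upstream transaction id; snapshots record the
# id high-water mark so recovery can resume replay from the right spot.

class TaggedStore:
    def __init__(self):
        self.data = {}
        self.last_txid = 0

    def write(self, key, value, txid):
        self.data[key] = value
        self.last_txid = max(self.last_txid, txid)

    def snapshot(self):
        """Persist the data plus the txid high-water mark atomically."""
        return {"data": dict(self.data), "last_txid": self.last_txid}

store = TaggedStore()
store.write("feed:1", ["post9"], txid=1041)
store.write("feed:2", ["post3"], txid=1042)
snap = store.snapshot()
store.write("feed:1", ["post9", "post11"], txid=1043)  # not in the snapshot
print(snap["last_txid"])   # 1042 -> on recovery, replay upstream from 1043
```

The key property is that the data and the txid mark are persisted as one unit; a snapshot that covered the data but not the mark (or vice versa) would lose or duplicate writes on replay.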

I would also implement fencing within redis: when in a "fenced" mode, redis won't accept any requests on the normal port, but can accept writes through a bulk batch-update interface that users can program against. This could be made more fine-grained by having both a read fence and a write fence, etc...

This makes it easier for users to tackle replication and durability themselves:

For recovery/durability, users can configure redis such that after a crash it is automatically fenced and "calls back" into the user's own code with the last flushed id, by either invoking a plugin, making a REST or RPC call to a specified endpoint, or simply using fork() and executing a user-configured script which would use the bulk API.

For replication, users could use a high-performance durable message queue (something I'd imagine some users already do): a (write-fenced) standby redis node can then become a "leader" (unfence itself) once it's caught up to the latest "transaction id" (the last consumed offset in the message queue, as maintained by the message queue itself; in the case of Kafka this is stored in ZooKeeper). More advanced users can tie this in with database replication by either tailing the database's WAL (with a way to transform WAL edits into requests to redis) or using a plugin storage engine for the database.
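The fenced-standby handoff can be sketched in a few lines (all names invented for illustration): the node stays fenced while replaying the durable queue and only unfences once its consumed offset reaches the queue's end.

```python
# Fenced standby that replays a durable log and unfences when caught up.

class Standby:
    def __init__(self):
        self.data = {}
        self.offset = 0          # last consumed position in the queue
        self.fenced = True       # refuse normal traffic until caught up

    def consume(self, log):
        """Replay queued writes in order; unfence at the latest offset."""
        while self.offset < len(log):
            key, value = log[self.offset]
            self.data[key] = value
            self.offset += 1
        self.fenced = False

    def get(self, key):
        if self.fenced:
            raise RuntimeError("fenced: still catching up")
        return self.data[key]

log = [("a", 1), ("b", 2), ("a", 3)]   # durable message-queue contents
node = Standby()
node.consume(log)
print(node.fenced, node.get("a"))      # False 3
```

Serving reads before the replay finished would return stale data, which is exactly what the fence prevents.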

Fundamentally, where I see redis used successfully are use cases where (prior to redis) users would have written custom C/C++ code. This cycles back to the "outsourced on-heap data structures" idea: redis lets you use a high-level language to do fast data manipulation without worrying about the performance of the code (especially in a language like Ruby or Python) or garbage collection on large heaps (a problem with even the most advanced concurrent GCs, like Java's).

There have been previous attempts to build these outsourced heaps as end-to-end distributed systems that handle persistence, replication, scale-out, and transactions. These are generally called "in-memory data grids": some simply provide custom implementations of common data structures, others act almost completely transparently and require no modifications to the code (e.g., some by using jvmti). Terracotta is a well-known one with a fairly good reputation (friends who contract for financial institutions and live in hell^H^H^H^H the world of app servers and WAR files swear by it), but JINI and JavaSpaces were some of the first (they came too early, way before the market was ready) and are rightly still covered by most distributed systems textbooks. However, their successful use usually requires Infiniband or 10GbE (or Myrinet back in dotcom days): reliable low-latency message delivery is needed because (with no API to speak of) there's no easy way for users to recover from network failures or handle non-atomic operations.

To sum it up, I'd suggest examining and focusing on the use cases where redis is already loved by its users; don't try to build a magical end-to-end system, as that won't preserve what users love, and make it easy (and to an extent redis already does this) for users to build custom distributed systems with redis as a well-behaved component (again, they're already doing this).

[1] https://ramcloud.stanford.edu/wiki/display/ramcloud/Recovery

[2] http://voltdb.com/intro-to-voltdb-command-logging/

[3] Whether it's synchronous or not is about the atomicity guarantees and not durability -- the failure mode of acknowledging a write and then 'forgetting' can happen in these systems even if they fsync every write.

[4] It reminds me of NetBSD source code: I can open up a method and it's very obvious what it does and how.

Hello strlen, thanks for the interesting reply.

While CAP and durability are orthogonal, they are closely related in actual systems. I don't think it is OK for a system like Redis to assume that users have multi-DC replication and/or other infrastructure preventing mass reboots of small clusters composed of a few nodes. Also note that the more the nodes in a distributed system are "decoupled" from the point of view of failures (different physical networks/equipment, different datacenters), the more latency you are likely adding.

But the point of the discussion was never that: synchronous replication by default is, exactly as you express in your message, not the Redis business, so fsync or not, Redis Cluster is not going to feature "C". WAIT and its semantics ended up taking all the attention because, you know, if your work is to show "C" being violated, you tend to focus there, regardless of whether the system analyzed even claims to be consistent.

As for Redis Cluster: many places where Redis is used in the right way, in the environments where you say people are happy with Redis, would benefit from the automatic sharding and operational simplicity that Redis Cluster can provide. This is why Redis Cluster is IMHO a good milestone in the roadmap.

I had no idea that there was ONLY antirez working on Redis. That is some crazy inspirational stuff.

