Redis Turns 10 – How it started with a single post on Hacker News (redislabs.com)
855 points by mrburton on Feb 27, 2019 | 215 comments

One of the funny things is that I've still never really used Redis directly, only as a sort of opaque part of other things.

I posted that link because I've known antirez for a long time, worked with him at Linuxcare, and when he said he was working on something cool - I believed him!

Thanks for the trust :-D

I thought it was funny that the author of OP said that "...one suggested Salvatore re-write Redis in Erlang" and it turned out to be you!

It was more of an excuse to get him to play with Erlang some... I always thought he might find it fun to work with.

Did he ever try it?

Not too seriously afaik.

Was he one of the Prosa folks back then?

Nope, I joined Linuxcare IT along with many ex-Prosa people. Many of the people were the same, just a different company.

Congratulations! Fast and stable DB.

In 2011 I wrote a PHP client for Redis with features like tags, safe locks, and map/reduce. It was heavily tested and users had no issues - and since there were no issues, it was removed from the list of clients after a few years as "abandoned". Nobody even tried to contact me :) Now I don't use PHP and barely use Redis, so it's all in the past. But that experience taught me a lot of new things.

Salvatore Sanfilippo, you are a great programmer. The Redis API is elegant, safe and effective. I personally have had 0 data losses in all the years I've used Redis. And this DB is always extremely fast.

Funny anecdote though. No patches every few hours, no 1000 issues open? Apparently you were too good of a programmer ;)

I'm still a programmer, just using Rust mostly, instead of PHP. It's not an anecdote; there were 6-7 issues on GitHub and the code was really well tested.

Why not post a link to it? That would be a simple and welcome revival.

Redis evolved, clients evolved, there are better PHP clients right now I believe.

And I'm not here to promote my code; I'm here to congratulate Antirez and thank him for his great job and for being a good example of how to write an API, stable and fast code, and how to communicate with the community. His blog is a source of lessons, wise thoughts and inspiration.

Google says anecdote means "a short amusing or interesting story about a real incident or person."

Edit: This should have been a reply to the comment from EugeneOZ saying it's not an anecdote; I replied to the wrong comment by accident.

The literal translation from Greek is “unpublished” which refers to a short incident irrelevant enough to not be part of the main story. In modern Greek anecdoto means joke.

Hm, I'm not a native speaker. In my language, we use the same word for "stories from the past".

I am a native speaker, and this was correct usage of "anecdote" by any reasonable definition.

That is the irony of software development. In theory, good software could last forever without any maintenance. In practice though, nothing has to be replaced as often as software ;-)

What does that teach us about our software development practices?

I think developers need to learn to build to last. We have medical software that hasn’t been updated for two decades that runs just fine.

That’s often unobtainable with modern software development because we rely so much on things that change too often, but it doesn’t have to be that way.

It’s a paradigm shift of course, but I think our business really needs to take maintainability more seriously than it does. This goes for proprietary software as much as Open Source, but with Open Source there is the added layer of governance.

I work in the public sector and we operate quite a few Open Source projects with co-governance shared between different municipalities, and the biggest cost is keeping up with updates to “includes”. It’s so expensive that our strategy has become to actively avoid tech stacks that change too often.

While I don't disagree that we should strive for maintainability, things like medical software, airplane software, or similar highly tested mission-critical pieces are specifically built to last that long. Nobody is going to pay us to build a webshop to last for 20 years; that's just not a necessity when getting it out quickly is so much more important from a business perspective than making it last forever.

> That’s often unobtainable with modern software development because we rely so much on things that change too often, but it doesn’t have to be that way

The reason we rely on things that change often is because we want to leverage them to get products out faster. Many different layers of that (as every tech stack is essentially a product by someone) and we have lots of updates to deal with. The flipside of slow moving projects is bugs might not be fixed or new helpful features might not be coming in, meaning you have to build it yourself.

As a community we know and have known how to build mission critical software for decades, but we actively often decide not to do it because it isn't that important compared to other factors.

So interestingly - the web shops you’re talking about do want to maintain their client data, and do expect it to be available “forever,” somewhere. The payment processors absolutely do at a minimum. Some of those layers are highly hardened.

While the particular Etsy clone or t-shirt of the day, or customized shower curtain site will certainly come and go, it’d be an entirely different problem if visa, PayPal, stripe, swipe, or whatever payment processor packed it up and went home at random.

We need the foundations/infrastructure to be built to last. People need to identify which kind of software they're making and treat the infrastructure as unchanging. Changes in the basement need to be carefully considered with a default stance of rejecting them unless justified by reasoning that has a time horizon of many years.

Tell that to the number of shops setting up factory IoT based on Node.js...

I have to be honest I do believe Node.js will be around for a long time. The improvements over the past few years have been vast thanks to the ever improving standards and all the major cloud companies are heavily invested. It has the world's largest public package registry as well (not that the sheer quantity means you can always find a high quality library).

I've only recently switched after years of scepticism but for the sort of stuff I do it's more than good enough. It has its warts but so does every language that's stuck around.

I don't necessarily think that the language or the core thing won't stick around. But unless you force people to really decide which packages they require, you end up with an unaudited mess of packages (basically with every package - or is anyone really creating portable, stable apps of a relevant size based on the core Node environment...?)

Yeah for sure the language definitely makes it easy to build something that won't last. I misunderstood your comment so sorry about that.

I agree with you in principle. But it should be noted that any software that interacts with the world outside of itself can't be considered to be in good working order if it hasn't been audited and updated to resist security vulnerabilities.

I'd argue that medical software shouldn't be connected to networks because security is hard, and most people get it so wrong. If that's part of the design, then the goal you're talking about is attainable. But in many cases, software isn't useful for its purpose if it can't access a network, and so the idea of just leaving it alone for decades at a time is an actively bad goal.

You’re absolutely right, but we also operate Django applications that haven’t needed anything but the occasional security update over a lifespan that is longer than the existence of React.js.

I like react by the way, it’s just an example. But we’ve certainly had to spend a lot of dev time on JS frameworks in general.

Most “runs fine after 20 years” software is really “security nightmares that people are afraid to touch”. Great designs and forward thinking are helpful, but “code and walk away” just isn’t the world we live in.

The new paradigm has to be “plan to evolve with the ecosystem.” There are just too many moving parts to treat software as static.

None of our old software that was built to last has security issues.

I know it’s harder to build with security in mind in the modern connected world, but we have a Django app that hasn’t needed anything but security updates that runs perfectly fine as an example of a web-app that doesn’t need much development time post implementation. So it’s not like it’s impossible either.

Don’t get me wrong, we’ve been as guilty of “wow this new tech is cool” as anyone else, which is where the lessons come from.

This debate never ends. Modern SaaS offerings are nothing like software 20+ years ago that was designed to be delivered in one shot, over the course of months or years, with many upfront hours crafting a precise spec that would not change whenever an investor would drop a new Series X round or the C.O.O. suddenly decided to "pivot" and promised a working minimal viable concept in 2 months without consulting anyone.

Even if the software was bulletproof, the context, environment, requirements, and expectations that the software is used under change, requiring software changes if the software is to remain as effective as it was.

And we are lucky that with software we have the flexibility to rebuild without many of the costs other disciplines face. If I want to rebuild a skyscraper in its same location for the same purpose, I can't build it offsite and then quickly publish it to the building site. I also don't get to reuse any of the cement or girders I used to build it the first time. Additionally, I can't easily redesign a skyscraper to support a totally new use case while not impacting existing tenants and the way they use the building.

People say this, and it certainly seems true on its face, but software change is still without a doubt one of the most expensive parts of software development, and in fact we engineers spend a lot of time trying to learn how to design software to support change and how to make changes reliably.

To bring it back to the OP, redis is notable for being developed _very carefully and slowly and intentionally_, compared to much software. You won't get a feature in as quickly as you might want, but redis is comparatively rock-solid and backwards-compatibility-reliable software. These things are related. It takes time and effort and skill to make software that can change in backwards compat and reliable ways, takes lots of concern for writing it carefully in the first place.

Change of software is _not_ in fact easy. It might be easier than a bridge. But of course people just _don't change_ bridges, generally. We understand much better how to make requirements for a bridge that won't need to be changed for decades. Software might be easier to change than a bridge, but dealing with change is nonetheless without a doubt the most expensive and hardest part of producing software that will be used over a long term, and quality software is not cheap. And we haven't learned (and some think it may never be feasible) to make software that can last as long as a bridge without changes.

My most popular open source library is a Redis backed cache for Ruby. It continues to accumulate stars, but I haven’t had any issues in well over a year.

Sometimes software is just finished, not abandoned!

Well, the author himself said that Redis evolved and there are now better PHP clients. That means it wasn't actually finished and was abandoned.

In a closed environment, sure, it's just finished, but the world is much bigger and not closed at all. Even the good old InstallShield required maintenance from Microsoft via their 16-bit compatibility layer.

One thing I think OSS developers / projects would benefit from is (self) certification.

Let's say you write a great piece of software that has no need of updating. But Redis keeps evolving - making new releases.

You could, with each release of Redis, run a series of checks and add a "verification" that you support versions 1.0, 1.1, 1.2 of Redis.

But even better is if someone else does this - why? Because it solves a problem I have seen a lot in government circles - the "we cannot use OSS because it is not supported".

But if another company says "we have reviewed, tested and used" version X of php-Redis then you suddenly have a self supporting eco-system.

Webshops that care about winning tenders can say "this software is supported by dozens of providers around the globe", so if its original author goes offline there are still people who provably have skills and experience with it.

Everyone wins

(Note - I am in no way saying this is something you should have done or thought about or arranged - writing software is hard enough without these sorts of long-term, unprofitable activities - it just matches observations I have on getting OSS into government. Plug: http://www.oss4gov.org/manifesto)

To this end: 18F[0] makes extensive use of open-source technology and open-sources its contributions[1][2].

[0]: https://18f.gsa.gov/ [1]: https://18f.gsa.gov/open-source-policy/ [2]: https://github.com/18F

> But if another company says "we have reviewed, tested and used" version X of php-Redis then you suddenly have a self supporting eco-system.

Companies using it is nowhere near the same as 'supporting' it in the sense of providing assistance if there's a problem.

I think if you are prepared to assign some "certificate" that says

- we have tested version X of this and its full test suite passes, and it runs against this version of the Redis server, or it installs cleanly on RHEL 7.1

then it's a positive move

If you also sign off a different certificate saying

- we are a commercial entity that offers "support" (however we define that) for this software

then we are into a much more interesting eco-system

(Yes, I am looking for way more than "I downloaded it and it works on my laptop" :-)

> If you also sign off a different certificate saying we are a commercial entity that offers "support" (however we define that) for this software

There are companies that do that already. What's the issue? People that say "there's no support" either don't care to look, or have custom stuff that is not supportable via 'average' support companies.

Do you have examples of this? (I am not thinking of the RHEL world, but of people picking up support for specific mid-sized projects.)

Rogue Wave springs to mind. Try searching for "open source support service" or "open source support company".

That must have been annoying. It’s interesting: I’ve been evaluating React CSS frameworks these last couple of days, and one of the things I’m looking at is the commit log, to see what activity is like on the project. It did cross my mind that ones that had gone quiet might be very complete and bug-free... but I decided that would be the exception.

Anyway, kudos for writing the client and kudos to Redis. It’s a brilliant bit of software.

There should be stats somewhere about the number of downloads and clones. While npm has its fair share of problems, I like that it shows how many weekly downloads a package has.

Download count is a reasonable proxy for use... But if someone is happily using the package in production, that won't tick any download numbers even though it's arguably the thing anyone considering a package would really want to know.

Maybe there could be a new curated repo for packages, where users are asked to regularly vouch whether they are still using the package? The problem of course is motivating users to give those answers. Without a critical mass of users regularly vouching, the data isn't much help.

Lol that’s open source marketing 101, GitHub issues are the main source of outreach.

The original HN post only has 23 votes and 11 comments! Good reminder that HN doesn't have to be crazy about your project for it to become successful.


Reminds me of the time when Bitcoin was first posted to Hacker News and only got 5 upvotes and 3 comments: https://news.ycombinator.com/item?id=599852

It's amazing how those three comments still reflect the main opinions about crypto

"This is an absurd waste of energy" is missing.

It's not a waste of energy according to the miners who shell out for that energy.

Cool, but I think the first time it received a whopping nothing:


Says something about society's myopia. I need to research and synthesize the history of projects like these: invisible at first, diagonal... and then booming. Mostly about people's inspiration and thinking process prior to said project.

Booming and busting, don't forget.

(My job is advocating a cryptocurrency that enables privacy through technology, but I think that cryptocurrencies should be a quite small percentage — if any — of most people's investment portfolios.)

"there is absolutely no way that anyone is going to have any faith in this currency"

And Ezra jumps right in and writes the Ruby driver just from that post with almost no traction

Ezra was one of the good guys

RIP Ezra, he was very talented and loved to code and do computer stuff. You could read his excitement in his eyes and words.

Pretty much the same with Dropbox. Quite a few of the top voted comments were rather dismissive.


The top comment on Dropbox is classic. "I can simply replicate this with some sticky tape, bailing twine, wood off-cuts and parts salvaged from a few motorcycles, cars, traction engines and that old WW II bomber turret I have in my back yard. Why would anyone ever use their pre-packaged product instead?"

Indeed, a good reminder for the tech crowd that there's value in usability / accessibility for the general public.

There is value in that for everybody, including the "tech crowd".

The top comment is still true, though. If you have technical knowledge, you can get something better much cheaper.

That's a big "if" though, only a very small percentage of people can do that. And of those that can, do you want to? If someone solves this problem nicely, why invent your own solution? There are many other problems that are still unsolved, or not solved adequately...

Not sure that I agree... with a few referrals, I've got more room on the free tier and mostly just keep some encrypted zip files with 2fa backups and software serial numbers... there's a few other odds and ends as well (shell scripts for mac/lin/win).

Are you sure it will be better?

Depends on what exactly you want.

Dropbox has a fixed set of features, most people care about some other set, and there will be some intersection, and some things that are needed will fall outside of Dropbox's feature set and will need to be implemented by the user anyway (like encryption, automation, etc.). There will also be some things that go against the user's use case and compromise it.

So compared to Dropbox, having two external hard drives and using rsync regularly will get you way faster point in time backups, faster access in case of recovery, privacy, transparent encryption (that you'll not have to care about during access to files), no worries about losing access/account takeover, one-time fixed payment for the drive that will last you probably more than 5 years, instead of subscription (where you'll pay after 1 year more than you'd pay for the drive alone), etc.

Having always-connected extra internal backup drives will also give you some other options. For example, if you're a heavy user of PostgreSQL, you can set up cluster replication locally with synchronous replicas on different drives, and you'll have your databases backed up. Better than Dropbox in this use case, too.

OTOH, if your use case is collaboration, Dropbox may be better. But if you include encryption of the individual files you want to collaborate on, it may again be more cumbersome. I don't know.

> So compared to Dropbox, having two external hard drives and using rsync regularly will get you way faster point in time backups, faster access in case of recovery, privacy, transparent encryption (that you'll not have to care about during access to files), no worries about losing access/account takeover, one-time fixed payment for the drive that will last you probably more than 5 years, instead of subscription (where you'll pay after 1 year more than you'd pay for the drive alone), etc.

The fact that you think this list of inaccurate claims supports rather than refutes your original post suggests you should spend more time learning about what Dropbox does and calculating the operational overhead of supporting a homegrown solution. In particular, thinking about what ease of access means with an external drive could lead you to insights about correlated failure modes such as what happens when the same thief/power surge/accident takes your laptop and the drive sitting next to it, and you realize that if you’d used Dropbox you wouldn’t have lost more than a few seconds of work. Similarly, your scheme has no versioning, bitrot protection, etc. which people always discount until the first time they lose data.

I mean yeah, you are describing two vastly different products. People want ikea not a table saw.

Some may have misread that comment.

True, but there were a lot less people on hacker news 10 years ago (presumably)

I agree. It's difficult to compare, but for comparison:

2009> https://news.ycombinator.com/front?day=2009-02-25

One with 381 points, a few with 100 points, and it descends quite fast and the last few posts in the page have about 20 points.

2019> https://news.ycombinator.com/front?day=2019-02-25

Three with more than 400 points, many with 300/200/100 points, and only the last has less than 50 points.

No wireless. Less space than a Nomad. Lame.

For me, Redis is the most valuable software tool I use. I was hooked from the day I started using it about 8 years ago. It somehow has a place in almost every software project I work on. And it just keeps getting better. Now with RESP3 on the horizon, it will enable me to benefit even further from the many clients available in almost every programming language.


Here is the link from Feb 25, 2009

Note this item number: 494649

Today's: 19,247,493

Nineteen (19) million posts etc. later (more or less)

Is it worthwhile to replace a MySQL with Redis? Am I losing something?

Redis is not a drop-in replacement for an SQL server. It's more like a RAM based key value store with better data type options than "just string" and it's also like a swiss army knife. In a generic Rails/Laravel/MVC setup you could use redis to handle one or all of: sessions, cache, queue, pubsub, (probably more), effectively lifting the burden from your SQL server.
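The "cache" role mentioned above is usually the cache-aside pattern: try Redis first, fall back to SQL, store the result with a TTL. A minimal sketch of those semantics, with an in-process dict standing in for the Redis connection so it runs without a server (the `get`/`setex` calls mirror the commands a real client such as redis-py exposes; `expensive_query` is a made-up stand-in for a database hit):

```python
import time

# Toy stand-in for a Redis connection: a dict plus expiry times.
class FakeRedis:
    def __init__(self):
        self._data = {}          # key -> (value, expires_at)

    def setex(self, key, ttl, value):
        self._data[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily expire, roughly as Redis does
            return None
        return value

def expensive_query(user_id):
    return f"profile-for-{user_id}"   # pretend this hits MySQL

def get_profile(r, user_id):
    # Cache-aside: read through the cache, fall back to the database,
    # and store the result with a TTL so it expires on its own.
    key = f"profile:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return cached
    value = expensive_query(user_id)
    r.setex(key, 60, value)
    return value

r = FakeRedis()
print(get_profile(r, 42))  # miss: computed, then cached
print(get_profile(r, 42))  # hit: served from the cache
```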

I believe if you think of Redis as a data structure server, then there are lots of applications you can build that use simple commands like SET and GET for both lists and hashmaps - slightly different ways of thinking about saving and retrieving your data. The concept of a set is very powerful and, I believe, easier to work with than SQL.
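To make the "data structure server" framing concrete, here is a toy model of Redis's set commands (SADD, SISMEMBER, SINTER are real command names; the in-process dict-of-sets is a stand-in so this runs without a server). The point is the shape of the thinking: named structures and commands instead of tables and JOINs.

```python
# Toy model of Redis set commands, backed by Python sets.
store = {}

def sadd(key, *members):
    # SADD: add members to the set stored at key, creating it if absent.
    store.setdefault(key, set()).update(members)

def sismember(key, member):
    # SISMEMBER: membership test.
    return member in store.get(key, set())

def sinter(*keys):
    # SINTER: intersection of several sets.
    sets = [store.get(k, set()) for k in keys]
    return set.intersection(*sets) if sets else set()

# "Which users are both online and admins?" is one command, not a JOIN:
sadd("online", "alice", "bob", "carol")
sadd("admins", "bob", "dave")
print(sinter("online", "admins"))  # {'bob'}
```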

Relational databases are based on relational algebra which is all about operating on sets.


Your data, eventually? If losing data some of the time is OK, then Redis might be fine, and it will be much faster (since not losing data involves much more work).

Hello, can I lose data with Redis? Doesn't it create snapshots and have a High Availability configuration that's easy to set up?

As far as I'm aware, any setup of Redis, including Redis Cluster, uses asynchronous replication, so you will lose some window of data.

I can truly say that Redis has changed my life and career. I started using it in 2010, got hooked immediately, and I've been a member of the community and contributor ever since. I got to build so many cool things with Redis over the years, it's amazing how versatile and powerful it is.

Having worked for a couple of years at Redis Labs, I got to work closely with antirez, and that's been a transformative experience as well, which made me a better engineer and open source contributor.

Thank you, Salvatore. Here's to the next 10 years.

And to that Erland rewrite!

Thanks Dvir! Working with you was an amazing experience for me as well. See you soon in TLV!

What's the clustering story these days? Last I checked, Redis Cluster still had lots of issues losing writes, and the design doesn't seem to have been revised since Aphyr ran Jepsen tests on it back in 2013, making it (at least from my perspective) practically useless for anything that requires distributed consistency, which means most things that I look to use a distributed data store for.

I've also been warned by fellow devs that much of the clustering logic is actually done in the client, and there's historically been a lack of mature clients for all languages. Even if you find a mature client, the complexity of the implementation implies that not all clients may behave identically, or may have different bugs. Then there's the issue of lock-in, where you become dependent on a specific client library and its development lifecycle. I don't know if all of this is true, but I also don't hear a lot of people talking about Redis Cluster these days.

I know you can use Redis in master/slave mode via Sentinel + Twemproxy, though even this solution seems to have some issues with data consistency. Running all three also appears a lot more complex than an integrated system.

I see a lot of comments implying that Redis is mainly used in single-node setups, so that might be where it shines?

Hello! Redis Cluster still has the same tradeoffs, because they were designed in on purpose; they are not shortcomings. Btw, Aphyr never tested Redis Cluster, and it would not make sense to test it, because he tests things for strong guarantees; he should instead test Redis-CP, for instance: a consistent store for Redis implemented in a module using Raft.

I believe that for the Redis use case the cluster tradeoffs make sense because:

* Best-effort consistency in practice works quite well: even though it comes with no guarantees, Redis does certain things to avoid losing writes in trivial ways.

* If you want to cluster Redis, you want Redis, not a cluster that automatically becomes a lot slower, more memory hungry, and so forth. So replication should be asynchronous.

However, what I may change in the future (and there are plans for that) is to add a failover strategy that does not just pick the replica that is furthest ahead in terms of received writes, but even stops the failover if a majority of slaves isn't reachable. This improves certain properties, and if well orchestrated it can also show strong properties, if writes are acknowledged only after being transferred to a majority of slaves.

Redis Cluster is used in many organizations right now. The next step is to improve it (in different ways than having strong properties mostly), and provide an official proxy for it.

Where can I find information about "Redis-CP"? Google isn't giving me anything.

I don't really understand the nature of the "tradeoffs" you mention. As Aphyr pointed out, Redis (with Sentinel) is not safe to use as a database, a queue or even as a lock service. That really narrows the possible use cases. I can absolutely see Redis being appropriate for many "lossy" applications: Caching, web sessions, rate-limiting counters, precomputed analytics data, intermediate outputs from distributed data processing pipelines, that kind of thing.

But the use cases where I'd reach for Redis seem a lot fewer than with data stores that have high consistency guarantees, such as FoundationDB, TiDB/TiKV, CockroachDB, or Cassandra/ScyllaDB. With the exception of TiDB, these are a bit easier to reason about since there's no Redis/Sentinel/Twemproxy split.

On the other hand, I certainly appreciate the specialized data structures and Lua support that Redis comes with.

Redis is safe to use as a locking mechanism; many people do that. You should dig deeper than "Aphyr said it doesn't work".
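For context, the single-instance locking pattern usually meant here is: acquire with SET key token NX PX ttl, release with a compare-then-delete keyed on a random token. A minimal sketch of those semantics, with a plain dict standing in for the server so it runs without Redis (with a real server the release step must be a Lua script so the compare and delete happen atomically); this does not address the Sentinel/failover concerns debated in this thread:

```python
import time, uuid

# Toy model of the single-instance Redis lock pattern.
# A dict stands in for the server: key -> (owner token, expiry time).
store = {}

def acquire(key, ttl):
    # Mimics SET key token NX PX ttl: only succeeds if no live lock exists.
    now = time.monotonic()
    current = store.get(key)
    if current is not None and current[1] > now:
        return None                    # someone else holds a live lock
    token = str(uuid.uuid4())          # random token identifies the owner
    store[key] = (token, now + ttl)
    return token

def release(key, token):
    # Compare-then-delete: a client may only delete its own lock,
    # so a slow client can't remove a lock that expired and was re-taken.
    current = store.get(key)
    if current is not None and current[0] == token:
        del store[key]
        return True
    return False

t = acquire("jobs:lock", ttl=5)
assert t is not None
assert acquire("jobs:lock", ttl=5) is None   # second acquirer is blocked
assert release("jobs:lock", t)
```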

How do you do that in a distributed cluster with HA failover using Sentinel, considering that Sentinel is susceptible to partitions and drops?

Do you work for NASA, or are you overly paranoid about your infrastructure?

Interesting approach! It reminds me of an option my group considered a couple of years ago, where we purposely didn't set up read replicas, but treated failure to connect/read from the master as a 'C' failure. Hashing keys to multiple masters, we had "highly available" Redis with stronger consistency.

We ended up not using it, but I always wanted to revisit the pattern.

BTW, thanks for such a wonderful piece of software!

Yes. Thanks for this software !



I would strongly recommend that you read the author of Redis himself discussing this issue: http://antirez.com/news/122

Thanks for this link. Some developers see more than others and can articulate differences very well. I wish I'd been scanning his blog for years!

> "There is more: I believe that political correctness has a puritan root. As such it focuses on formalities, but actually it has a real root of prejudice against others. For instance Mark bullied me because I was not complying with his ideas, showing problems at accepting differences in the way people think. I believe that this environment is making it impossible to have important conversations."

Antirez was pressured into adding aliases to soothe the crowd.

This is extremely obvious concern trolling. Please don't do that here.

Nope. It's not trolling if you state a fact.

It absolutely is. Concern trolling means disingenuously expressing concern about an issue in order to undermine or derail genuine discussion, which you've done here, and your other remarks about "SJWs" lay bare your motives.

I'm asking you to knock it off and discuss in good faith next time.

Deemed by who? I'm in the valley and this is the first time I'm encountering it. I need to leave.

Huh, what? Never heard of this DB replication "PC" nonsense before.


In Switzerland we had quasi slaves till the 1980s (Verdingkinder). This is horrible and the skin color of the victims was most of the time white. And many of them are now old white dudes.

There are probably more old white dudes in Europe who suffered directly from slavery than there are Americans with grandparents who were slaves. So please don't judge people by their skin color.

Sigh. The responses to your question (systems falling over on Jepsen) are already the most depressing kind:

- that's not our use case (then why are you advertising distribution???)

- they didn't use this feature (WELL THEN, SHOW THAT FEATURE PASSING THE TESTS)

- we're not that kind of distributed (oh, you mean the one you can use?)

Jepsen has been the first step for trusting anything that involves data and distribution. Everything is going to fail under his hard stare, it just comes down to how much BS smokescreening is done by the purveyors of the software to judge how to trust it.

Redis is the best data store I've had the fortune to work with. No nonsense data structures coupled with extremely reliable performance, beautiful API and extremely easy to manage.

It also helped me earn a lot of praise for using redis-cli --pipe to bulk insert data, which brought a data import job down from 4 hours to a few minutes. I eventually built a wrapper around redis-cli --pipe for using it with a cluster.
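For anyone unfamiliar with the trick: redis-cli --pipe reads the raw Redis wire protocol (RESP) on stdin, so bulk loads are usually built by generating that protocol from your data. A small sketch of such a generator (keys and values here are made up; in practice you'd pipe the output into `redis-cli --pipe`):

```python
# Generate RESP (Redis Serialization Protocol) for bulk loading.
# Usage sketch: python gen.py | redis-cli --pipe
def resp_command(*args):
    # Each command is an array of bulk strings:
    # *<argc>\r\n, then $<byte length>\r\n<bytes>\r\n per argument.
    out = [f"*{len(args)}\r\n"]
    for arg in args:
        data = str(arg)
        out.append(f"${len(data.encode())}\r\n{data}\r\n")
    return "".join(out)

payload = "".join(
    resp_command("SET", f"user:{i}", f"value-{i}") for i in range(3)
)
print(payload, end="")
```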

You clearly haven't used it at scale with sorted sets.

Can you elaborate on that please?

Try inserting more than 30K elements into a key which is a sorted set and watch the insertion time, memory & CPU usage. Now try doing this to millions of keys simultaneously.

We run several million sorted sets, but they are all short (hundreds of elements), and we do thousands of writes/sorts per second without issue.

From memory, there was a setting to turn on/off gzip compression for a list once it went beyond a certain size. Do you have this enabled?

As someone who is fairly tech literate but not familiar with this tech stack - in practical terms, what is Redis, and what is it used for?

A lot of people are saying "k/v store", but I think there's an alternate definition which I've stolen from aphyr: A shared heap.

If you have a cluster of machines operating on a dataset, you can store that dataset in redis to get high performance reads and writes. In the simple case of a cache, it's a key value store. But other complex cases exist: A priority queue, an atomic transaction log, a lock server, and more.

It supports Lua, so if the data structure and operations you need don't exist, you can generally build them yourself.

See that at least makes sense to me :)

I'm not an applications programmer (and don't want to be one), so take what I say with a grain of salt, but I was first introduced to Redis several years ago during an analytical project working with a consultancy, and I asked what it was and why they wanted to bring it in.

"It's a memory-resident key-value store!"...

"So it's...a hash table?"

"It's a memory-resident key-value store!"...

"And don't modern languages already come with those, and why aren't we just using those internal, mature solutions rather than bringing in an arbitrary new external dependency?"

blank look

"It's a memory-resident key-value store!"

The point is not the key value part, it is the "externality" and the "store" part.

> "So it's...a hash table?"

The answer is: yes, exactly! Only, it's a hash table that can be shared across all of the different processes on the server. So if you have e.g. two different web requests that want to update some value, then that's how you do that. The other main alternative is a regular database, but that's much heavier and isn't really built in terms of "data structures".

(Redis isn't just a hash table: it's a list, a set, a queue, etc. In other words, the whole standard library of a programming language, only in a way that can be shared across all the processes.)

It's a well-optimized, mature, memory-resident key-value store that can be used by arbitrary processes and threads, and which has an API accessible from arbitrary programming languages.

Back then, my definition was "Java collections with an API", and to me it still fits the bill nicely. You basically have some solid computer science primitives to build whatever you want on top, only limited by memory size (which is enough for 90% of use cases).

What is it generally used for though? To me, a "k/v" store sounds like a very generic but nice thing to have, but I still don't have a good sense of what it is and what people think of it.

The one place I've run into it is in web development where it's used for caching? In some tutorials I've read.

So let's consider one use case. We have a website where people log in, and we create a session to track that user across different pages on the website. Now the session will have data like user name (to display on the website), the user's location or IP address, the user's preferences, etc. So whilst the browser and the website communicate with a cookie (to identify this session), we need to store this session data somewhere. You can use a file or a database to read and write this session data, which works fine. But Redis shines in this example: the speed is very good, and you also have the session id as the 'key' and the session data as the 'value' that you can store.

Redis gives you amazing speed and because it provides a kv interface, the work is very easy.

Thanks, I'd never heard a practical example like this. I assume the "value" side of key:value can be a giant heap of JSON? So theoretically you can retrieve an entire collection of structured data as long as you know the key, correct?

And where the usefulness of K:V store is beaten by SQL is the point you would want to run a "JOIN" or "GROUP" on the data, for example if you wanted to count the number of keys containing a certain data point, correct?

Yes, the big bag of data which is generally the value in the kv store is JSON (either as a serialized string or as a map, which is natively supported by Redis).

Yes, the usefulness of SQL is always the join or group, but with something like a session, the idea is to just dump values into the store so you don't have to reach into the database every time. Different teams will want to put their own keys and data into the session object, which means a DB session store becomes extremely difficult to maintain over time.

On the other hand, the joins and groups based on this can be handled later in time than in the req-response cycle itself.

In a simplified manner: "I have this resource or a big piece of information I don't want to pull from my slow database or slow external service on each request/call/etc.

I'm going to see if that resource is there by doing `GET external.foo.bar`. if there is nothing here, I perform the slow pull. After the slow pull is done, we store the result in redis under the same name `external.foo.bar` with a timeout for x seconds.

Next time that resource is requested from our code, it will be there, so `GET external.foo.bar` will get us that resource without having to perform a slow call to external service. "
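The pattern quoted above is usually called cache-aside. Here is a minimal sketch of it, assuming a redis-py-style client with `get`/`setex`; the `FakeRedis` class is just an in-memory stand-in so the example runs without a live server:

```python
import json
import time

class FakeRedis:
    """Minimal in-memory stand-in for a redis-py client (get/setex only),
    used here only so the sketch runs without a real Redis server."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]          # expired, behave like a miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

def get_resource(cache, key, slow_fetch, ttl=60):
    """Cache-aside: try the cache first; on a miss, do the slow pull
    and store the result under the same key with a timeout."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    value = slow_fetch()                 # the slow database / external call
    cache.setex(key, ttl, json.dumps(value))
    return value

# Usage: the second call is served from the cache, not the slow fetch.
cache = FakeRedis()
calls = []
def slow_fetch():
    calls.append(1)
    return {"foo": "bar"}

first = get_resource(cache, "external.foo.bar", slow_fetch)
second = get_resource(cache, "external.foo.bar", slow_fetch)
```

With a real deployment you would replace `FakeRedis()` with `redis.Redis()` from the redis-py package; `get` and `setex` have the same shape there.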

Thank you that was a great example. So from what I understand, Redis shouldn't be used unless there's a scaling issue?

Also, where does Redis sit (my guess is between request & database)?

Well, the previous comment was describing the use of Redis as a cache, and in that use case, yeah, it's kind of between the application and the database.

That said, caching is just one of the possible uses for Redis. I think of it as an easy way to share arrays, dictionaries, queues, etc between different applications. Then it's easy to see how it can be used for almost anything.

When used as a cache, Redis is often used as a 'look-aside cache' (vs a 'look-through cache'). Generally speaking, Redis does not talk directly to your database. Instead, your code looks to Redis, and then if it doesn't find what it's looking for, your code looks to your SQL database.

The possibilities are kind of limitless. Redis implements a lot of data structures beyond a simple k/v map.

A simple non-caching example, but we use it for distributed locking, more specifically a distributed countdown latch. This is much cleaner and more performant (for us) than doing a similar operation in a traditional RDBMS.

Another very common example that is used in a lot of tutorials is maintaining a leaderboard using a sorted set.
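The leaderboard idiom boils down to two sorted-set commands: ZADD to record a score, and ZREVRANGE ... WITHSCORES to read the top entries. Here is a pure-Python model of those semantics, just to show the idea (`MiniSortedSet` is a stand-in for illustration, not a Redis API):

```python
class MiniSortedSet:
    """Pure-Python model of a Redis sorted set, just enough to show
    the leaderboard idiom (ZADD / ZREVRANGE ... WITHSCORES)."""
    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, member, score):
        # ZADD on an existing member replaces its score (and re-ranks it).
        self._scores[member] = score

    def zrevrange_withscores(self, start, stop):
        # Highest score first; Redis range bounds are inclusive.
        ranked = sorted(self._scores.items(), key=lambda kv: -kv[1])
        return ranked[start:stop + 1]

board = MiniSortedSet()
board.zadd("alice", 120)
board.zadd("bob", 300)
board.zadd("carol", 250)
board.zadd("alice", 310)   # updating a score automatically re-ranks

top2 = board.zrevrange_withscores(0, 1)
```

With redis-py the same thing is `r.zadd("leaderboard", {"alice": 310})` and `r.zrevrange("leaderboard", 0, 1, withscores=True)`; the server keeps the set ordered for you, so reads stay fast even as scores churn.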

What do you do when Redis dies on you?

All the normal strategies for dealing with distributed services still apply.

In most of these scenarios, you are still persisting data to stable storage. So, you would take the performance hit and load the data from the database.

Devil's advocate: I've seen Redis used a lot, but by people who mostly didn't seem to know what they were doing: NoSQL bandwagon or other magical thinking ("It's fast"). Yeah, but only because it does nothing to avoid losing your data or to be HA (the disk persistence and clustering options don't have pleasant semantics ATM, as far as I can tell).

You can use it as a simple cache, and that's fine. But then why not just stick with memcached -- less is more?

There are probably some scenarios where single point of failure and data loss is fine and the additional data structures redis provides over memcached are handy (e.g. analytics), but I've never seen it used for that.

It's commonly used to enhance the performance of web apps, e.g. to put less load on your database by caching, to provide faster queries, or for session handling so your database doesn't have to do it, and of course many more use cases.

Don't jump to use it unless you really have performance/scaling issues.

In Rails work, two of the most common things it might be used for are: 1) A data cache, 2) a queue of work for background jobs.

This only scratches the surface of what is possible, but it's some things redis is used for.

What does it mean for Redis to be a message broker?

One application inserts a message into Redis, which other application(s) will read. For example, using https://redis.io/topics/pubsub
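Redis pub/sub is fire-and-forget: a message goes only to clients subscribed at the moment of PUBLISH, and PUBLISH returns how many subscribers received it. A tiny in-process model of that behavior (`MiniBroker` is a stand-in so the sketch runs without a server):

```python
from collections import defaultdict

class MiniBroker:
    """Tiny in-process model of Redis pub/sub semantics:
    subscribers only receive messages published after they join."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> list of inboxes

    def subscribe(self, channel):
        inbox = []
        self._subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        # Like the PUBLISH command, return the number of subscribers
        # that received the message.
        for inbox in self._subscribers[channel]:
            inbox.append(message)
        return len(self._subscribers[channel])

broker = MiniBroker()
lost = broker.publish("jobs", "dropped")   # nobody listening yet: lost
inbox = broker.subscribe("jobs")
receivers = broker.publish("jobs", "resize:image42")
```

In redis-py the equivalent calls are `r.publish(channel, message)` and `r.pubsub().subscribe(channel)`. Messages published before a client subscribes are simply lost, which is why Redis later added streams for durable messaging.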

Redis is a general purpose k/v data store that you run in memory. It has support for transactions and atomic operations, and it can be persisted and resumed. You can use it for caching, counting, event sourcing, lat/lng data, pub/sub, and more.

My coworker once called redis "NoSQLite", which I think is a very apt description.

Too much to not mention: davidw, antirez, and Richard Hipp (SQLite) are all current or former heavy hitters in the Tcl world. David contributed to the second ed. of Tcl and the Tk Toolkit[0], co-authored Rivet[1] (an Apache httpd Tcl module), among other things. antirez wrote Tcl the Misunderstood[2] and Jim[3], a lighter weight Tcl implementation, and Richard (drh) is a former Tcl Core Team member[4] who describes SQLite as “a Tcl extension that escaped into the wild.”[5]

[0] https://www.pearson.com/us/higher-education/program/Ousterho...

[1] https://tcl.apache.org/rivet/

[2] http://antirez.com/articoli/tclmisunderstood.html

[3] http://jim.tcl.tk/index.html/doc/www/www/index.html

[4] https://en.m.wikipedia.org/wiki/D._Richard_Hipp

[5] https://www.tcl.tk/community/tcl2017/assets/talk93/Paper.htm...

An out of process, memory-first, strictly consistent, distributed, data structure store.


Also see: https://www.dbms2.com/2008/02/18/mike-stonebraker-calls-for-...

It makes the collections you work with in your programming language (list, set, dictionary/map/hash) available as a dedicated server instead, so you can share data across multiple programs.

It's single-threaded, stores everything in RAM with optional persistence, and has Lua and modules so you can do more than the standard commands.

In memory fast key value store.

Great for caching, or anything distributed systems need quick access to.

So, like memcache?

It also supports other data structures, like lists, sets, and hashes. Super fast, and probably the best documentation I have ever seen.

Yes, exactly like memcache.

It can do things memcache can do and way more. Like storing other data types. Persisting storage to disk. Geospatial queries. Pub/sub. Lua scripting.

So, no, they are not interchangeable! Only if you just store keys and strings and do not care about persistence.

The way I like to define it is: Redis is a "build your own (in-memory) DB kit".

A k/v store is just 10% of what it can do. Your value doesn't need to be a plain value; it can be:

- an array or a set

- an associative map

- an ordered list by score

- A bit array

Also offers functionality like streams and pub/sub

data-structures as a service.

need an atomic lock shared between multiple processes or servers (x)

need a set of unique values sorted by insertion time (x)

want to know the O(x) complexity of using any of the data-structures to design a system for scale (x)

need to notify multiple consumers of a modification ala pub-sub (x)

need to keep track of a stream of events for consumers (x)

need to do geo-spatial lookups (x)

oh and you want this thing to be durable to failure and easy to maintain (sentinel|cluster) (x)

redis is single process and easy to configure and understand. That's my opinion of why it's so amazeballs.

[edit] unicode not supported so x instead

So you can put state that might otherwise exist on the server into Redis. This keeps your app servers stateless which has some advantages.

It's a key value store.

Where the value is not just a string but can also be a set, list, ordered set, hash (dictionary), etc.

You can store basically any common data structure in Redis and operate on individual elements as if they were local variables in your program.

A worse version of the modern hype etcd

Redis is incredible. I built a matchmaker with it as the sole data store and it has been a breeze to use. It's so simple to visualize data structures with it and I just can't see myself using any other in-memory data store. Also Redis Sentinel is fantastic.

What data structure can’t be used in memory?

Any data structure can be used in memory. I just like the abstractions that Redis provides me. It's really easy to visualize how the data is going to be organized.

And it's atomically available to multiple connections in your app.

What does Redis Sentinel do in a nutshell?

Provides high availability for Redis

It's a small monitoring program that checks a master/replica pair and alerts you or initiates a failover.

Easily one of the most versatile and reliable pieces of software I’ve ever had the pleasure of using. To ten more years!

especially for bitcoin miners and unsecured redis instances!

I’m curious. Elaborate?

Around 75% of open redis servers are hacked.


Protected mode, introduced in Redis 3.2, reduced the problem, but folks still actively misconfigure Redis before putting it on a public IP... Now there are ACLs in Redis 6 that will mitigate this even more, but it's a lost game, because images are created with installations of Redis that are made completely accessible on purpose.

Oh, not blaming Redis in the slightest. It's one of my all-time fav tools. Things are only as secure as people configure them.

I don't even want to know how many ElastiCache Redis servers are just sitting unsecured on a public IP because it's so easy to configure that way.

^^^ this exactly.

This happened to me, but it's because our sysadmin left a firewall port open to the whole world without setting a password on Redis, which allowed a random drive-by port scan to inject a Lua script. They couldn't escalate privileges, only run the miner and make the server mostly unresponsive.

Ah yes, the nostalgia

I've been using Redis for many years. It's served me well. Its simplicity and vision are a testament to antirez's hard work. Thank you sir.

Has anyone ever had problems with Redis?

I expect it to always work and it always has. I really like it but am worried I trust it too much now. Please tell me I'm fine to trust single instances!

It's been one of the least problematic things in our infrastructure. We keep it around for a bit of internal caching, transient state, and some non-critical queueing. We have a couple of redis nodes that we've had for years. Saying it is a key value store is selling it short: what really makes it useful is things like queues, sets, TTLs on keys, etc. The API has dozens of different operations and variants of operations. Mostly, redis is rock solid and stable, but because it is not a transactional datastore you should not rely on it preserving your data. Bearing that in mind, you have to plan for the worst. IMHO treating it as a transient thing that can go at any point and that you don't back up is a sensible thing to do. Blindly restarting and wiping a redis node should not cause any harm. Mostly this never happens, but when it does, we simply restart.

Redis cluster is more about availability than it is about consistency. If you are aware of that, it's a fine solution.

A couple of things I do with it:

- buffer log messages before we send them to elasticsearch via logstash. This is a simple queue. Technically it's a single point of failure but worst case we lose a few minutes of logging. This happens very rarely and typically not because of redis. This node is configured to drop older keys when memory runs out. We did this after a few log spikes killed our node by running out of memory. Since then, we've had zero issues.

- we have a few simple batch jobs that we trigger with an API or via a timed job in our cluster. To prevent these things running twice on different nodes, I implemented a simple lock mechanism via redis. Nodes that are about to run check first if they need to and abort if another node is already doing the same or recently completed doing this. This does not scale but it works great and I don't need extra moving parts for some routine things that we run a couple of times a day.

- some of our business logic ends up looking up or calculating the same things over and over again. We use a mix of on server in memory caching and shared state in redis for this. Keys for this have a ttl; if the key is missing the logic is to simply recalculate the value.
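The lock in the second bullet is commonly built on `SET key token NX EX ttl`: NX makes acquisition atomic, and the TTL frees the lock if the holder dies. Here is a sketch of the idea; `FakeRedis` is an in-memory stand-in for the three calls involved, and the function names are hypothetical:

```python
import time
import uuid

class FakeRedis:
    """In-memory stand-in for the calls the lock needs:
    SET key value NX EX ttl, GET, and DEL."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._data.get(key)
        if nx and current is not None and current[1] > now:
            return None                       # key already held: SET NX fails
        self._data[key] = (value, now + (ex or float("inf")))
        return True

    def get(self, key):
        entry = self._data.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]
        return None

    def delete(self, key):
        self._data.pop(key, None)

def try_acquire(client, name, ttl=60):
    """Return a token if we got the lock, else None.
    The TTL makes the lock self-expire if the holder dies."""
    token = str(uuid.uuid4())
    if client.set(f"lock:{name}", token, nx=True, ex=ttl):
        return token
    return None

def release(client, name, token):
    # Only the holder (matching token) may release. With a real Redis,
    # this check-and-delete should be a single Lua script to stay atomic;
    # it is split here for readability.
    if client.get(f"lock:{name}") == token:
        client.delete(f"lock:{name}")

client = FakeRedis()
t1 = try_acquire(client, "nightly-job")
t2 = try_acquire(client, "nightly-job")   # a second node is refused
release(client, "nightly-job", t1)
t3 = try_acquire(client, "nightly-job")   # free again after release
```

With redis-py the acquire maps directly to `r.set(name, token, nx=True, ex=ttl)`; for multi-node setups there is also the more involved Redlock recipe.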

Once you have redis around, finding more uses for it is a bit of an antipattern. It does queuing but you probably should use a proper queue if you need one. It can store data but you probably want a proper database if you are going to store a lot of data, etc. It's great for prototyping though. Use it in moderation.

Only problem I ever encountered was mainly my fault -- my Redis instance was used to hack my server (the attacker manipulated Redis data and dumped it to overwrite /etc/passwd, etc). I was an idiot and hadn't locked down my installation. Luckily my provider had disk snapshots.

Yup, same thing happened to my VPS. I had Redis running on a TCP port instead of a Unix socket and I didn't have a firewall set up.

Sounds interesting. Can you share how and what happened in detail?

There's actually a writeup of this technique on the Redis blog: http://antirez.com/news/96

In my case they overwrote ~/.ssh/authorized_keys, /etc/group and /etc/passwd as well.

Pretty much the only problems I've seen it cause are due to people not understanding its role in the infrastructure, not defects in Redis itself.

Most people run Redis in-memory only (in my experience, at least). Those that don't usually sync to disk only periodically, whether they intend to or not.

The only problems that crop up from that pattern are that many users (especially new ones, or people who haven't worked with Redis before) forget that it's fundamentally an ephemeral cache. Eventually maintenance or failure drops the in-memory dataset, and then a wide variety of disasters occur because it was being treated as a source of truth, or as a datastore with durability.

In situations where the ephemerality of in-memory data was consistently known (or when disk persistence was configured with some thought), I have had the same experience as most others here: Redis was one of the most reliable, least surprising pieces of infrastructure present.

...except for TTL handling with read-only replicas, I guess. That behavior (TTLs can get ignored on replicas) was really rough and surprising, but is fortunately now fixed. Shame on me for running an old enough version to keep getting bitten by it.

I did, but I am not sure the use case was right. Inherited an old Web Forms project and tried switching from Memcached to Redis. It didn't work out due to the large amount of serialization/deserialization of stored objects. The .NET Redis library was causing massive garbage collection storms. I was probably asking it to do too much though.

I was first exposed to Redis when I first started building NodeJS sites using ExpressJS many years ago. There was a session caching layer that used Redis that “just worked” and was fast. Later I had other opportunities to use Redis and the more I used it the more my appreciation grew. However I echo some of the well put discussions in this thread about Redis clustering. Thank you Antirez!

I like antirez's "Linenoise" library. I converted it to Unicode, and gave it a multi-line editing mode with undo, visual copy and paste, edit in external editor, parenthesis matching, incomplete syntax continuation, and more. It's all in the TXR Lisp REPL.

Cool! You should reply here with a link to your fork.


I've tried to keep it decoupled from TXR internals, and to abstract it from the OS a bit, so there is a considerable "struct lino_os" interface now.

There is a dependency on a "config.h" which provides some HAVE_* constants.

There is a user guide to the REPL in the TXR Lisp manual; that provides context for some of new interfaces, like what is the lino_set_atom_cb function for.


Wow a lot of interesting work. Thanks for sharing, I think I should compile a list of the forks at the end of the README.

Tcl was the first language used. I laughed, but only because I've developed in Tcl and it has its time and place. Though I don't remember how to do an if statement in Tcl off the top of my head. Update: ok, it's squiggly brackets; that's right.

I've been using Tcl more and more lately. Its ability to use its own interpreter is super handy. I made a static site generator in it most recently.

In classic Hacker News fashion, the top comment is just suggesting you use something else instead.

Sometimes I worry I'm going to be THAT person

Mandatory Show HN post on Dropbox: https://news.ycombinator.com/item?id=8863

I wonder what the odds are of any new thing posted on HN staying the course and turning out to be useful in the long run. Probably not very high.

About 1 in 57 according to the random statistic generator app someone posted years ago that I'm still using multiple times a day.

Super stoked the developer finally released the "output as odds" feature (though my favorite output setting is definitely still "matter-of-fact tone with a hint of strained patience").

Statistically speaking, I'm fairly sure it's safe to assume that it had up-voted negative comments too?

Link please :)

In typical Hacker News fashion, the top comment is a comment complaining about another comment.

Just coming here to say congratulations to Antirez.

I've never had to use the software myself, but I always read Antirez's posts and his interactions on here. I think it's a great example of technical leadership.

> “It takes guts to be a first follower. You stand out, you brave ridiculing yourself. Being a first follower is an underappreciated form of leadership”

There is a Mark Twain quote that reminds me of this:

“In the beginning of a change the patriot is a scarce man, and brave, and hated and scorned. When his cause succeeds, the timid join him, for then it costs nothing to be a patriot.”

(edited with the full quote)

Why would I use Redis over Couchbase?

I don't know about Couchbase, but Redis tends to be lighter/faster than full-fledged databases.

I never realised it was Ezra that did the initial redis-rb implementation.

He talked about it at Railsconf 2009, as well. I remember that was the first time I installed it.

It's funny to see how HN has changed. People who are actually excited about a project, creating a client for ruby within hours. Not the same anymore

People are still quick to build and add to new projects. Now it’s more likely to be something related to Kubernetes or serverless, and those people are seen as hype chasers not pioneers. Like I’m guessing people were in the past too.

Current Open Source Redis fork: https://goodformcode.com

Redis was/is open source. The licensing thing was with Redis Labs modules (plugins) for Redis.

Slightly off-topic, but what makes Redis cluster preferable over something like etcd as a K/V store?

They are at two extreme ends of the spectrum, basically. One is a very slow CP KV store without any features beyond the basics. The other is a data structure server with a KV shell, weaker consistency guarantees, and good performance.

Doesn't load with ublock origin.

That dancing guy is rad tho :)

Who can help me identify redis timeouts in azure mvc core project?

Why is there JavaScript that prevents me from selecting any text? That is how I read sometimes (subconsciously mostly). When I disable JS, I can select text again. What are you trying to accomplish with this?

The page loads scripts from an insane number of hosts:

redislabs.com 3lift.com eb2.3lift.com adnxs.com ib.adnxs.com adroll.com d.adroll.com s.adroll.com advertising.com pixel.advertising.com s3.amazonaws.com execute-api.us-east-1.amazonaws.com x7ussrk21g.execute-api.us-east-1.amazonaws.com auryc.com mt.auryc.com bidswitch.net x.bidswitch.net bizible.com cdn.bizible.com bizographics.com sjs.bizographics.com casalemedia.com dsum-sec.casalemedia.com cloudflare.com cdnjs.cloudflare.com createjs.com code.createjs.com digitalreachagency.com cdn.digitalreachagency.com doubleclick.net g.doubleclick.net cm.g.doubleclick.net googleads.g.doubleclick.net stats.g.doubleclick.net static.doubleclick.net drift.com api.drift.com chat.api.drift.com 77314-14.chat.api.drift.com conversation.api.drift.com customer.api.drift.com enrichment.api.drift.com event.api.drift.com live.api.drift.com 77314-14.live.api.drift.com metrics.api.drift.com driftt.com js.driftt.com facebook.com www.facebook.com facebook.net connect.facebook.net fontawesome.com use.fontawesome.com ggpht.com yt3.ggpht.com google-analytics.com www.google-analytics.com google.com www.google.com fonts.googleapis.com googletagmanager.com www.googletagmanager.com gstatic.com fonts.gstatic.com hotjar.com in.hotjar.com script.hotjar.com static.hotjar.com vars.hotjar.com imgix.net driftt.imgix.net leadlander.com tracking.leadlander.com linkedin.com ads.linkedin.com px.ads.linkedin.com www.linkedin.com marketo.com lonrtp1.marketo.com lonrtp1-cdn.marketo.com rtp-static.marketo.com marketo.net munchkin.marketo.net mktoresp.com 915-nfd-128.mktoresp.com mrpdata.net j.mrpdata.net openx.net us-u.openx.net outbrain.com sync.outbrain.com pubmatic.com simage2.pubmatic.com reachforce.com cdn.reachforce.com smartformsapi.reachforce.com rlcdn.com idsync.rlcdn.com rubiconproject.com pixel.rubiconproject.com sf14g.com t.sf14g.com stackadapt.com srv.stackadapt.com tags.srv.stackadapt.com taboola.com match.taboola.com trc.taboola.com userty.com cdn.userty.com yahoo.com ads.yahoo.com youtube.com 
www.youtube.com ytimg.com i.ytimg.com

I guess one of all those hosts is doing something that goes haywire and results in this problem.

Would be cool if we could downrank pages like this somehow. Maybe tell the submitter "The page loads scripts (and therefore sends data to) 117 different servers. Please be aware that we will not show it on the front page until it has at least 117 upvotes".

I block all third-party requests with ublock and enable them one at a time (targeted, by looking at what they are manually) to get websites working again (when I really want to). It is generally insane how many external requests are 'needed' to display very basic websites like blogs or general information websites of companies.

There is lots of information on optimizing images, and whenever I speak to designers they always say "don't worry, we optimize the images", which is great of course, but they completely overlook all the JS. Many websites not only load more JS than CSS and images combined in terms of raw file size; the parsing of all this JS then, again, takes more time than loading all of the CSS and images combined. And for what? Usually some annoying pop-ups, image sliders or a chat-box. In fact, those chat-boxes themselves load complete web pages in the iframe they are created in. I also see websites loading multiple versions of jQuery.

Anyway, 117 external requests for anything, let alone a blog, is insane.

I had the same issue. It's caused by an overlay of the "Updated Privacy Policy" button. You can either disable your adblocker and click accept to remove the overlay, or you create a rule in your adblocker to block it: ##div.reveal-overlay:nth-of-type(6)

Thanks for the investigation, but disabling JavaScript seems like a better move than accepting some privacy policy.

Or just hit the back button.

I can't see anything... the entire page is black, and it looks like it has no content at all.

I always find replies like this amusing. It seems to be the habit of Hacker News users to focus on minutiae while either missing or ignoring the focus of a submission. It's baffling.

People who build websites analysing how websites are built. Pretty natural I'd say.

I agree. I can imagine architects going into buildings for reasons other than working on that building, yet still analysing their architecture.

I guess 10 years is not enough to develop a concurrent in-memory data structure store.

Yeah, let's code like it's 1970's and multi-core CPUs are Sci-Fi ;)

Having no threads is exactly what makes Redis so convenient in deploying it to clusters: you have a node, and the node is a single process. It's like a building block.

Also, Redis source code is so much cleaner than memcached!

And yet it works incredibly well and is indispensable for many organizations.

It's almost like less sometimes really is more, and that prioritizing simplicity of implementation does more than just make code maintainable: holding that value as a developer can "leak" all the way up to the reliability of your software, in the best possible way.
