I use redis. I don't need ACLs, but I certainly don't disagree with your assessment, nor begrudge the time, care and attention to detail which is obvious in your design-work.
Not to mention there are now businesses built on the data provided by this free software, being run by volunteers (like me).
I bet you'd receive a hell of a lot of postcards of thanks from around the world. Things like that feel more "real" than emails and words on forums.
@antirez do you have dogs? I’ve got some branded leashes with your name on em!
You are a pillar to a lot of us. (And I’m not a Redis user)
This is why I fell in love with Redis right from the beginning. Having a data store that gets out of the way and just gives you the structures (a lot like programming data structures) to do what you need feels awesome. Picking up the commands and putting them to use feels as natural as reaching for any vector/list/whatever structures I use anyway in my programs.
Thanks so much for everything you're doing antirez.
Thank you for the thoughtful design, for saying no to so many things and keeping Redis pure and useful and for your awesome stewardship. You are an inspiration in all those things!
Not a security expert, but in time_independent_strcmp(), first comment about strlen()s: couldn't the attacker use his own accounts with known passwords to determine the length of some other user's password? Also, given the name of this function I would expect the comparison to be time independent, even if attacker can change both strings' lengths... Or am I missing something? Haven't touched C in a loooong time... :)
> /* Again the time of the following two copies is proportional to
> * len(a) + len(b) so no info is leaked. */
If the attacker controls one of the inputs, the execution time reveals something about the length of the other input, right?
Or maybe you just meant that the length is leaked but the contents are not leaked? (I agree that it's generally considered ok for "timing-safe equals" functions to leak the length of the secret. But if you ARE allowed to leak the length, you can simplify the code by just checking the length in the beginning and exiting if they're not equal.)
And if you don't want to leak the length, it's easy: pre-SHA-512 the secret and then only compare hashes instead of comparing the full strings.
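To make the two options above concrete, here is a minimal Python sketch (not the Redis C code, and the function names are made up for illustration). The first function leaks the length on purpose and only protects the contents; the second compares against a SHA-512 digest of the secret that is assumed to have been computed ahead of time. A pure-Python loop is not a real constant-time guarantee, so treat the first one as an illustration of the idea only.

```python
import hashlib
import hmac


def equal_leaking_length(secret: bytes, candidate: bytes) -> bool:
    """Compare contents without branching on them; the length check exits early on purpose."""
    if len(secret) != len(candidate):
        return False  # reveals only that the lengths differ
    diff = 0
    for x, y in zip(secret, candidate):
        diff |= x ^ y  # accumulate differences instead of returning at the first mismatch
    return diff == 0


def equal_hiding_length(secret_digest: bytes, candidate: bytes) -> bool:
    """Compare against a precomputed SHA-512 digest, so both sides are always 64 bytes."""
    candidate_digest = hashlib.sha512(candidate).digest()
    return hmac.compare_digest(secret_digest, candidate_digest)
```

With the second variant the only length-dependent work is hashing the candidate, which the caller already controls.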
My point was not so much that there is a bug, but more that the assumptions that this function makes are not obvious (to me at least) from its signature or the header comment. I could easily imagine someone using this function assuming that it does, you know, time insensitive string comparison, but this is only true in a very limited context. Even renaming vars to `a_unchangeable` and `b_user_controlled` (or a comment) would help. But again, this is just a very minor point from a passing stranger, so feel free to ignore it. :)
And thanks for providing Redis!
Redis Streams is probably one of the most path-breaking things that Redis is doing - and can completely change what it stands for. That's not a bad thing..but it doesn't even get a title by itself but is rather called "data structures".
Here's a question - do you anticipate competing with Kafka? There are a lot of us that are cheering Redis on for this...but it seems there is internal reluctance.
It's like the long standing "Redis is not memcached" statement - even WordPress has a "choose Redis or memcached for object caching" option.
As was said in the post, Redis isn't memcached, but "a bunch of Redis nodes on the same machine together with a Redis Cluster proxy" is kind of equivalent to memcached.
Redis and memcached are on different architectural levels—one Redis instance represents one shard of a sharded data-store with an IO-concurrency of r=1 w=1, while one memcached instance represents a complete data-store with an IO-concurrency of r=N w=1. To build a caching "virtual appliance" for deploying on some big hardware, a memcached appliance would just be a memcached instance; while a Redis appliance would contain N Redis nodes + 1 Redis Cluster proxy.
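For anyone wondering what the "N nodes + proxy" layer actually has to do, here is a rough Python sketch of the routing step. Redis Cluster maps each key to one of 16384 hash slots using CRC16 (the XMODEM variant) of the key; `node_for_key` and the even split of slots across nodes are my own simplification, and hash tags (`{...}`) are ignored.

```python
import binascii
from typing import Sequence

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """CRC16 (XMODEM variant) of the key, modulo the slot count."""
    # Hash tags ({...}) are deliberately not handled in this sketch.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def node_for_key(key: bytes, nodes: Sequence[str]) -> str:
    """Hypothetical proxy routing: split the slot space evenly across the nodes."""
    slot = key_slot(key)
    return nodes[slot * len(nodes) // HASH_SLOTS]
```

Each node still handles its own slots single-threaded; the proxy only decides which node a command goes to.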
It's a bit like comparing, say, UDP with TCP. There are features you can layer on top of UDP to make it act equivalently to TCP (as e.g. QUIC does), but you can also just use UDP by itself. UDP is not designed to be TCP or "compete" with TCP; but UDP happens to be able to be used—when you add extra stuff on top—for the same things TCP can be used for.
Redis, like UDP, is a flexible, low-level "infrastructure component" that can be customized into tools to solve any number of use-cases—but, like UDP, Redis sets out to address only a very limited set of use-cases when used as a standalone "tool" without such customization. I personally think of Redis a lot like an OS distribution—you could make any number of virtual appliances by customizing e.g. Debian with your own extra packages and building the resulting appliance-nodes into your desired architecture; and you can make any number of data-store servers (solving any number of problems) by customizing Redis with your own commands, and then connecting the resulting nodes into your desired architecture.
Redis itself isn't going to compete directly with Kafka. Kafka, like memcached, is on a higher architectural layer than Redis. But there's nothing stopping some downstream developer from building something on that layer, e.g. a "RedisMQ" server customized into a Kafka-killer.
So this is a philosophical rather than a technical thing. For a hobbyist open source project, I would agree with you there - there are only so many resources that someone can carve out of his day job.
For a company that raised a 60mil venture funding and specifically created a license to prevent IP leakage, this is disappointing.
I want to pay Redis and I don't mind their licensing. But I'm then questioning the philosophy if everything meaningful has to be done by unpaid open-source developers/startups, who might be on tenuous footing to do anything meaningful with it because of the licensing.
Redis is built by antirez as a "hobbyist open source project." It is a small core, and will stay that way. It is an infrastructure component. It is fully open-source.
Redis Labs is the company that raised 60mil venture funding, and Redis Labs' business model is exactly to create such "distributions" of Redis for their customers. These distributions are not open-source; they are the thing that has the weird license.
Redis Labs does not own or control Redis. antirez owns and controls Redis.
You could create a similar company of your own which is also selling Redis distributions (as a product, as a service, etc.) and make money off of doing so. Since Redis itself—the thing antirez develops—is open-source, nothing stops you from doing this, just like nothing stops Redis Labs from doing this. No "unpaid open-source developers" are needed.
antirez just happens to work for Redis Labs—but not in the sense that they can really tell him what to do. They just want to pay him to ensure he keeps making it, because their business depends on the continued health of the open-source project at its core. (It's about the same situation as Guido van Rossum working for Dropbox, or Yukihiro Matsumoto working for Heroku.)
If you want to talk about the well-being of some project, downstream of Redis, to build a full-featured multipurpose DBMS "tool" with Redis at its core, that is fully open-source... well, that project doesn't exist. Redis Labs is not(!) that project; Redis Labs is a services company. But there's nothing stopping such a project from existing, and there's nothing stopping a foundation springing up around it to take care of it and its developers (like e.g. the Linux Foundation, or the PostgreSQL EU+US foundations.)
But that software project, if it existed, would no more be "the Redis project" than https://openresty.org/en/ is "the Nginx project." It'd just be a customized distribution, by separate people.
We have been super excited about Redis Streams for quite some time now. But we are not seeing any implementation usecases, tools...or even your awesome blog posts around streaming usecases.
It would be great to see some more literature on the streaming stuff and the things that you can do (and are doing) with it... and to see it given top-level importance (beyond just a new data structure).
This is super awesome stuff!
Because I prefer not to have both in production.
Whereas with Redis streams I had to write code in my application to periodically poll and claim unacked messages pending for more than some threshold time.
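Roughly, that poll-and-claim code looks like the sketch below. It assumes a redis-py style client object (`xpending_range` and `xclaim` are real redis-py methods); the function name, the batch size and the idea of calling it from a timer are just for illustration.

```python
def claim_stale_messages(client, stream, group, consumer, min_idle_ms, batch=100):
    """Take over messages that were delivered but left unacked for min_idle_ms."""
    # Look at one batch of the group's pending entries list.
    pending = client.xpending_range(stream, group, min="-", max="+", count=batch)
    stale_ids = [
        entry["message_id"]
        for entry in pending
        if entry["time_since_delivered"] >= min_idle_ms
    ]
    if not stale_ids:
        return []
    # XCLAIM applies min_idle_ms again on the server, so a message that another
    # consumer claimed in the meantime (idle time reset) is not handed to us.
    return client.xclaim(stream, group, consumer, min_idle_ms, stale_ids)
```

You would call this periodically from each consumer, then process and XACK whatever it returns. Redis 6.2 later added XAUTOCLAIM, which folds the two steps into one command.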
As soon as this happens I will have no use for Redis. I already have better scalable datastores. I often use Redis as a cache for those. If Redis isn't a low-latency, consistent key-value/key-structure store anymore then in my mind it isn't Redis anymore.
How is this a logical step in the context of microservices? I feel there is a trade-off. If I have to scale a service to N instances, each of which caches the same type of data, I'm potentially duplicating that data N times, which means I need to reserve that much additional RAM per instance. I can avoid this by storing the data centrally in a very fast cache, aka Redis. On the other hand, even the fastest standalone cache will be much slower than accessing data in process memory.
Given that the stability and availability of your central data store in your model is likely more important than the N instances talking to it, caching data on those instances is definitely beneficial, rather than asking it for the same data over and over.
You need not reserve that much additional RAM: modern infrastructures end up with unused RAM anyway, by virtue of being optimised for the CPU usage of stateless microservices. Why not put it to use? A smart cache eviction strategy can help optimise what gets held in RAM and what doesn't. A Redis client working in tandem with the Redis server can help with that and stop you needing to write a bunch of code to do this.
Actually, they don't. If you're scaling a multi-user system to millions of users across hundreds/thousands of instances of your micro-service, they're likely to cache different data because different user sessions will be handled by each instance of your app. So, local caching in each micro-service is valuable. You just need to set an upper limit on how much memory to consume.
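A bare-bones version of "local cache with an upper limit" might look like this in Python. `BoundedLocalCache` and its `fetch` callback are hypothetical names; the limit here is a number of entries rather than bytes, and there is no invalidation, which is exactly the part a cooperating client and server would have to solve.

```python
from collections import OrderedDict


class BoundedLocalCache:
    """In-process LRU cache in front of a slower central store."""

    def __init__(self, fetch, max_entries):
        self._fetch = fetch              # hypothetical loader, e.g. a GET against Redis
        self._max_entries = max_entries  # upper limit on what this instance holds
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)     # mark as most recently used
            return self._items[key]
        value = self._fetch(key)             # miss: go to the central store
        self._items[key] = value
        if len(self._items) > self._max_entries:
            self._items.popitem(last=False)  # evict the least recently used entry
        return value
```

Each instance ends up holding only the keys its own sessions touch most, up to the cap.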
> I/O threading is not going to happen in Redis AFAIK, because after much consideration I think it’s a lot of complexity without a good reason
> the way I want to scale Redis is by improving the support for multiple Redis instances to be executed in the same host, especially via Redis Cluster
Having to run multiple Redis instances, with a separate "Redis Cluster [proxy]", sounds like "I didn't originally use threads, and while threads are technically the correct solution (because it's the standard way to do things and not all that complicated), I don't feel like rewriting Redis to use threads the way it should have been done from the start; thus I've come up with an alternative hack that is not ideal but makes my life easier". While I have always argued that memcached is not automatically better solely because it is multi-threaded, the idea that an additional point of failure – a technically unnecessary proxy/broker – is somehow a better solution than threads seems like an attempt to save face for mistakes made during early development.
As for the ACL:
> You get a library from the internet, and it looks to work well. Now why on the earth such library, that you don’t know line by line, should be able to call “FLUSHALL” and flush away your database instantly
What can I say other than "give me a break". Every large codebase has hundreds/thousands of dependencies, and any one of them could slip in a malicious block of code. My analysis: anyone who would be concerned about some library calling FLUSHALL would never even install Redis - a C program that, for all we know, is spamming emails or running a Bitcoin miner while it runs. The fact is, all 3rd party software/dependencies come with risk; if I'm trusting your software not to screw me while I sleep, I think I can trust a basic library not to hijack my Redis server with a FLUSHALL command. Really, this is not even remotely a valid argument. Whatsoever.
Then there's this:
> maybe you just hired a junior developer that is keeping calling “KEYS” on the Redis instance, while your company Redis policy is “No KEYS command”.
Ok, now you're just making shit up. Here's an idea: you don't need ACL to restrict access to KEYS, as that command should have been deprecated and entirely removed in favour of SCAN long ago. You mentioned "I think it’s a lot of complexity without a good reason" regarding multi-threaded Redis; yet here you are advocating an entire, complex ACL layer... for what purpose? For some imaginary malicious scenario where my Redis library is going to call FLUSHALL, or a "junior developer" is allowed to call "KEYS *" – a command that should not even be available in 2019?
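For reference, the SCAN-based replacement for "KEYS *" is a short cursor loop. This sketch assumes a redis-py style client whose `scan` returns a `(cursor, keys)` pair; `iter_keys` is a made-up helper name (redis-py also ships its own `scan_iter`).

```python
def iter_keys(client, pattern="*", count=100):
    """Yield matching keys in small batches instead of one blocking KEYS call."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield from keys
        if cursor == 0:  # the server returns cursor 0 once the iteration is complete
            break
```

Unlike KEYS, SCAN can return the same key more than once, so callers that care need to de-duplicate.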
I started this comment intending to provide unfiltered – but polite/constructive – feedback. The more I reread your original post, the more frustrated I got at the fact you seem to have lost all common sense. You're trying to justify adding a massive amount of complexity, including a clustering/proxy strategy instead of simple threading, and ACL to save us from imaginary threats. I've previously been in the stage of development that I believe you might be in now – you've run out of truly "good ideas" that would actually benefit your user base, and are now attempting to justify new features as necessary, when the truth is you're just looking to write new code that you personally find to be "new and exciting", rather than iterating on the existing, boring codebase.
I've spent the past 4-5 years advocating that memcached is effectively dead because of Redis. The more you try to make Redis some kind of ACL-secured, clustered database – rather than the lightweight cache store we learned to love – the more likely another project will step in and replace Redis the same way Redis replaced memcached.
* Clients will stay simple, RESP3 is backward compatible with RESP2
* ACLs are mostly an anti-fool protection.
* True multi-threading is impossibly hard to implement, but there will be some ad-hoc workarounds.
* Better persistence will be getting better.
* Existing data structures solve everything, so their number won't be extended.
* Read @antirez twitter to stay tuned.
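On the ACL bullet above: as a sketch of the "anti-fool protection" idea, this is roughly what a restricted user rule looks like, assuming the `ACL SETUSER` rule syntax as it shipped in Redis 6. The helper name, user name, password and key pattern are invented for the example.

```python
def acl_setuser_args(username, password, key_pattern, denied_commands):
    """Build the argument list for an ACL SETUSER call (Redis 6 rule syntax)."""
    args = ["ACL", "SETUSER", username, "on", ">" + password, "~" + key_pattern, "+@all"]
    # Rules apply left to right: allow everything first, then remove specific commands.
    args += ["-" + name.lower() for name in denied_commands]
    return args
```

With redis-py this could be sent as `client.execute_command(*acl_setuser_args("app", "s3cret", "cache:*", ["FLUSHALL", "KEYS"]))`, giving the application a user that can only touch `cache:*` keys and cannot run either command.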
> she/he (not sure)
English is really strange around gender, and idiomatic styles have changed over time. It has always been a valid style in English to use "they" to refer to a single person of unknown gender. This feels natural to English speakers when the subject is unknown and could be one of many potential people, as in:
When the first guest arrives at the party, give them a balloon.
Here, "them" only refers to a single person, but it's correct.
For the past hundred years or so, it was also common to use "he" to refer to an unknown hypothetical person. You see a lot of textbooks that do this. In theory, this wasn't supposed to come across as being exclusionary, but obviously it is. It's an explicitly gendered pronoun.
The feminist movement rightly drew attention to this problem, and writers experimented with a variety of approaches. Sometimes, at the beginning of a book you'll see an explicit "apology" saying something like "we use 'he' but don't mean it to only apply to males". Some simply switched to using "she" for everything. Others switch between "he" and "she" at random throughout, or use the longer "he or she".
But, lately, it seems like the style is settling down towards simply using the already established singular they for all of these cases. To a native speaker, it feels a little weird at first, but you quickly get used to it. It's less jarring than "he" or "she" for most readers now.
It has a lot going for it:
* It has more established history than any other form.
* It doesn't force the author to pick an arbitrary meaningless gender.
* It's shorter than the awkward "he or she" (which also still forces you to decide which gender to put first).
* It doesn't exclude people who prefer neither "he" nor "she" as their pronoun.
So, if you're trying to figure out what pronoun to use when you don't know a specific one that is correct, default to "they".
while gender/sex is often beside the point, number often matters. we can keep the gendered pronouns for the few cases where it's needed, and use ungendered pronouns in the general case, but that requires an unambiguous singular ungendered pronoun.
English already dropped it with 2nd person, settling on the plural (and formal) “you” as the sole pronoun and dropping the informal singular “thee”; and for 2nd person distinguishing number in the pronoun is probably more often an impediment to discussion than 3rd person (a 3rd person pronoun needs an explicit referent in the framing context anyway, so you don't lose much by not adding a reminder of number into the pronoun itself.)
Even if we could, it's still not clear what the ideal solution would be. There are cases where you need to communicate ambiguous number too. It might be nice to have pronouns that distinguish "definitely more than one" from "potentially more than one".
Your assurances don't mean much compared to usage going back to at least 1382: https://en.wikipedia.org/wiki/Singular_they
Language evolves, and this means that the rules of grammar do too. The MLA already allows for pronoun usage to match the preferred pronoun of whomever you are writing about, and is considering allowing the singular they in general. Some style guidelines already allow it. Others still proscribe it.
Plenty of news organizations, including the Washington Post, have set their style guidelines to use the singular they.
It's grammatically correct in plenty of places today, and the current trend is for that to increase, not decrease. You're fighting a losing battle here.
Let me rephrase that reply: Everyone is entitled to his own opinion.
Everyone takes the singular, but in that reply, I paired it with the plural their.
Here's a screenshot: https://i.imgur.com/KKbFXbD.png
That said, it was still a good read. Thank you for the time and effort you put into Redis.
I think there would be value in a "simple Redis": a well-maintained tool for a certain class of problems which do not require a distributed database.
EDIT: I also want to stress that the newer releases include a simplification effort as well. For instance, the Lua scripting side effects problem is going to be removed completely by simplifying how the replication of Lua scripts is performed. Redis 5 is already like that, but the dead code was not yet removed for safety. Redis 6 will completely remove many parts of the code. So the stress is not just on adding, but also on refactoring. ACLs themselves made it possible to refactor authentication into separate functions inside acl.c to lower the overall complexity.
Redis is the most simple, rock-solid piece of software I have ever had the pleasure of using. I find the comparison with FoundationDB jarring.
Don't let the negative comments like that sway your mindset; I think your choices are spot on and the proof is in the pudding.
A quick glance suggests that this would still be similarly possible today. Redis is by no means perfect, but it deserves its reputation for being accessible!
PS: Thanks for the mention of the Lua side effects changes. I'll be curious to read up about your solutions to those puzzles.
I would say that's mostly because, when it gets down to brass tacks, Redis (ignoring the orchestration in Cluster) really is a fairly simple interpreter with simple memory management (largely delegated to the malloc implementation) that glues together a fair number of nice data structure/algorithm implementations that have little codependence. That last point is really, really important for its developer accessibility. (antirez called this modularity elsewhere I think.) It also helps keep code size down.
a) can be used, optionally
b) can be composed with each other into new data processing (eg filtering/aggregation) models
c) did not reduce, significantly complicate, or create many 'exception rules', to otherwise a coherent model
Then those are good additions, in my view.
Also, something I personally find useful about Redis is its protocol. It is a very useful and powerful paradigm when different systems by different authors/companies implement the same protocol. It reduces cognitive overload on users (programmers) and helps them keep building their domain expertise, rather than constantly re-learning the APIs of tools or libraries.
So work on RESP3, in my view, is very welcome.
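To show how little there is to the protocol, here is a small Python sketch of how a client frames a command in RESP: an array header followed by one length-prefixed bulk string per argument. The function name is my own.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]  # array header: number of arguments
    for part in parts:
        data = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length-prefixed bulk string
    return b"".join(out)
```

For example, `encode_command("GET", "foo")` produces `*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n`, and any server that speaks the protocol can parse it.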
Think about it: if you downvote anything you disagree with, eventually all you will see is opinions that you agree with. I don't know about you, but that is not what I expect from HN.