Redis 3.0.0 is out (groups.google.com)
440 points by agonzalezro on Apr 1, 2015 | 78 comments

From the release notes: https://raw.githubusercontent.com/antirez/redis/3.0/00-RELEA...

    * Redis Cluster: a distributed implementation of a subset of Redis.
    * New "embedded string" object encoding resulting in fewer cache
      misses. Big speed gain under certain workloads.
    * AOF child -> parent final data transmission to minimize latency due
      to "last write" during AOF rewrites.
    * Much improved LRU approximation algorithm for keys eviction.
    * WAIT command to block waiting for a write to be transmitted to
      the specified number of slaves.
    * MIGRATE connection caching. Much faster keys migrations.
    * MIGRATE new options COPY and REPLACE.
    * CLIENT PAUSE command: stop processing client requests for a
      specified amount of time.
    * BITCOUNT performance improvements.
    * CONFIG SET accepts memory values in different units (for example
      you can use "CONFIG SET maxmemory 1gb").
    * Redis log format slightly changed reporting in each line the role of the
      instance (master/slave) or if it's a saving child log.
    * INCR performance improvements.
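
The unit handling mentioned for CONFIG SET can be sketched in Python. This is an illustrative parser, not Redis's own code, and the name `parse_memory` is hypothetical. It assumes the convention described in redis.conf, where `1k` means 1000 bytes and `1kb` means 1024 bytes, and that units are case-insensitive.

```python
import re

# Multipliers following the convention described in redis.conf:
# "k/m/g" are powers of 1000, "kb/mb/gb" are powers of 1024.
_UNITS = {
    "": 1, "b": 1,
    "k": 1000, "kb": 1024,
    "m": 1000**2, "mb": 1024**2,
    "g": 1000**3, "gb": 1024**3,
}

def parse_memory(value: str) -> int:
    """Convert a string such as '1gb' or '512mb' to a byte count."""
    match = re.fullmatch(r"(\d+)([a-z]*)", value.strip().lower())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"invalid memory value: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]
```

Under these assumptions, `parse_memory("1gb")` gives 1073741824 while `parse_memory("1g")` gives 1000000000, which is the distinction worth remembering when setting maxmemory.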

You forgot:

WARNING: Redis 3.0 is currently a BETA not suitable for production environments.

I think he just forgot to remove that from the changelog as this is indeed marked as a stable release now, except possibly for the cluster feature.

Sorry, stale warning removed.

Here's the full cluster spec: http://redis.io/topics/cluster-spec

Looking forward to seeing it taken apart by aphyr.

Congrats to Salvatore and the rest of the redis committers!

Then you might want to see both of them on the same stage in June :) http://www.dotscale.io/

You made my day! I'm booking a slot right now.

Is he still doing Jepsen tests? Thought he was doing Clojure work atm?

Yes, he is not doing Jepsen at the moment. The end of his last blog post on it has some explanation. That said, I wouldn't be surprised if he made an exception, given that it's Redis...

He's been tweeting about Jepsen tests on Aerospike lately: https://twitter.com/aphyr/status/582653874894282752.

We are considering Aerospike because apparently a lot of our competitors are using it and I really would love to see those tests.

Pretty sure I saw him tweeting about jepsen the other day

(To note, I have limited experience with using Redis. My questions may be stupid.)

As far as I can tell, most of the advantages of Redis come from the fact that it's all held in memory and so access is fast. Is networked access to other parts of the cluster quick enough that it is quicker than storing the data on one computer, partly on disk? When would one want to use a Redis cluster rather than something stored on-disk and cached in memory?

The performance section of the spec[1] does a pretty good job explaining how the implementation remains fast.

    In Redis Cluster nodes don't proxy commands to the right
    node in charge for a given key, but instead they redirect
    clients to the right nodes serving a given portion of the
    key space.

    Eventually clients obtain an up to date representation 
    of the cluster and which node serves which subset of 
    keys, so during normal operations clients directly 
    contact the right nodes in order to send a given command.

    Because of the use of asynchronous replication, nodes
    do not wait for other nodes' acknowledgment of writes
    (if not explicitly requested using the WAIT command).
[1] http://redis.io/topics/cluster-spec#performance
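
The redirect-then-cache behaviour quoted above can be sketched as follows. This is a toy in-memory simulation, not a real client: `FakeNode`, `ClusterClient`, `owner_of`, and the two-node layout are all hypothetical. It assumes what the cluster spec describes: keys map to one of 16384 hash slots via CRC16 modulo 16384 (hash tags are ignored here), and a node that does not own a slot answers with a MOVED redirection naming the owner.

```python
import binascii

SLOTS = 16384

def key_slot(key: bytes) -> int:
    # binascii.crc_hqx with an initial value of 0 is the CRC16/XMODEM
    # variant; hash tags ({...}) are deliberately ignored in this sketch.
    return binascii.crc_hqx(key, 0) % SLOTS

class FakeNode:
    """Stand-in for a cluster node owning a contiguous slot range."""
    def __init__(self, name, first, last, owner_of):
        self.name, self.first, self.last = name, first, last
        self.owner_of = owner_of  # function: slot -> owning node name
        self.data = {}

    def execute(self, op, key, value=None):
        slot = key_slot(key)
        if not self.first <= slot <= self.last:
            # Nodes do not proxy: they tell the client where to go.
            return ("MOVED", slot, self.owner_of(slot))
        if op == "SET":
            self.data[key] = value
            return ("OK",)
        return ("VALUE", self.data.get(key))

class ClusterClient:
    def __init__(self, nodes, start):
        self.nodes = nodes      # name -> FakeNode
        self.start = start      # node used before the map is learned
        self.slot_map = {}      # slot -> node name, filled lazily
        self.redirects = 0

    def call(self, op, key, value=None):
        target = self.slot_map.get(key_slot(key), self.start)
        reply = self.nodes[target].execute(op, key, value)
        if reply[0] == "MOVED":
            _, slot, owner = reply
            self.slot_map[slot] = owner  # remember the owner for next time
            self.redirects += 1
            reply = self.nodes[owner].execute(op, key, value)
        return reply

def owner_of(slot):
    return "a" if slot < 8192 else "b"

nodes = {
    "a": FakeNode("a", 0, 8191, owner_of),
    "b": FakeNode("b", 8192, 16383, owner_of),
}
client = ClusterClient(nodes, start="a")
client.call("SET", b"123456789", b"v")  # slot 12739: redirected to "b" once
client.call("GET", b"123456789")        # served by "b" directly, no redirect
```

After the first redirect the client talks to the owning node directly, so the steady-state cost is one network hop per command, the same as a single instance on a separate server.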

The primary advantage of Redis is native data types and access patterns beyond simple key-values: lists, sets, sorted sets, hash tables, pub/sub, hyperloglog, and scripting (lua) support.

Being memory-based was simply a feature but not necessarily something that set it apart: Memcached had that area pretty well locked down for being a blazing fast key-value store. And then Membase was basically memcached + persistence and clustering. Now Redis has clustering too!

seconded. memory = fast is not the point. the point is that it gives you specific data types that do specific things and since it's in memory it does those things very quickly. that said, you need to implement your own clustering or sharding solution. jury is still out on the new cluster code, i haven't reviewed it yet.

As well as the optimisations mentioned by famousactress, note that it's common to run Redis (or other in-memory databases such as memcached) on a separate server anyway. So there's already some network overhead but it tends to be small enough if the content is small and everything's co-located.

Right, but I assume the commenter was more curious about what the multiple network accesses usually involved in a cluster that proxies to some other node to satisfy requests would do to the overall performance of the system... which is a totally reasonable question/concern.

For lazy readers: CLUSTERING!

   Maybe it works great from day one,
   maybe it will need a few more iterations, 
   and possibly with 3.2 we'll improve support for many stuff, 
   but my guess is that Redis 3.0.0 today, in some way, changes what Redis is.

It's software. Everybody knows how it works :)

Noooo, apologies for the downvote. That "up" button is tiny and apparently easy to miss.

I just hope it's not an April Fools' joke

The author wrote this on twitter: https://twitter.com/antirez/status/583279481453936640

"p.s. no April fools here. 1 year ago I released HyperLogLog, today 3.0.0, just to contrast with some shit done this April fool lameness."

Coincidentally, the HyperLogLog redis type was introduced on April Fools last year.


Yeah, clustered Redis, announced today? Sounds like a joke to me!

Not a joke. The release was due around now, so I picked 1 April since we now have a tradition of shipping on 1 April. Last year with HyperLogLog support, because of the futuristic name of the data structure, people had a hard time believing it was a real thing and not a joke.

I understand your reasons and nevertheless hope that you can realize that you are still contributing to April Fool's insanity by making it that much harder for people to tell what is serious and what isn't on this day of disregard.

You’re totally within your rights here. I just beg you to consider the audience.

Yep, this can lead to some confusion, but the magic of OSS is that you can download it and check :-)

Won't someone please think of the childr...er, developers!

After many years of DevOps and configuration management, I've learned to think of developers as children...

Does anyone have more details on what the "embedded string" object encoding is and what workloads it helps with? The closest thing I can find is https://github.com/antirez/redis/issues/543, which seems pretty old.

Hey, it's very straightforward. Normally you have something like the redis object structure which has a type field, and a pointer to the actual representation of the object. If type is REDIS_STRING, you have the pointer to an "sds" string (where sds is the name of the string library used).

Now with embedded strings there is a special kind of string object that instead uses a single allocation for both the object structure and the string itself. This is slightly more memory efficient and, more importantly, improves memory locality a lot, so basically everything that uses string objects (string types, or aggregate data types large enough to use string objects as values of the collection) will perform better.

Those special strings are used only for small strings (the majority in most workloads).
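
The size cutoff for those small strings can be worked out with a little arithmetic. The sketch below is not taken from the comment above. It assumes the Redis 3.0 layout on a 64-bit build (a 16-byte object header, an 8-byte sds header, a trailing NUL byte) and a 64-byte allocator size class, which together give a 39-byte limit. All names are illustrative, and integer-encoded strings are ignored.

```python
ROBJ_HEADER = 16     # assumed: type/encoding/lru bitfields + refcount + pointer
SDS_HEADER = 8       # assumed: two 4-byte fields (len, free) in the 3.0 sds header
NUL_TERMINATOR = 1   # sds strings keep a trailing NUL byte
ALLOC_CLASS = 64     # assumed: allocator size class the whole object should fit in

# Longest string that still fits next to its headers in one allocation.
EMBSTR_LIMIT = ALLOC_CLASS - ROBJ_HEADER - SDS_HEADER - NUL_TERMINATOR

def allocations_needed(length: int) -> int:
    """Return 1 if object and string share one allocation, else 2."""
    return 1 if length <= EMBSTR_LIMIT else 2
```

With these assumed sizes the limit comes out at 39 bytes, so a 39-byte value needs one allocation and a 40-byte value needs two. The locality gain comes from the header and the bytes sitting in the same block of memory.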

That was nice timing: just yesterday I hacked on the asyncio-redis package to provide clustering support.

Anyone interested in trying it out, or contributing, you can see the project here: https://github.com/renatomassaro/asyncio-redis-cluster

Edit: as for blocking requests, redis-py-cluster is an awesome lib that provides cluster support over the traditional redis-py client.

I really like antirez's approach to building system software. Build it step-by-step, iterate, and very soon you have an amazing piece of software.

(Or Debian unstable in an hour or so...)

Does the cluster feature deprecate redis-sentinel?

No... Sentinel will still be developed alongside Redis Cluster. For single instance deployments where all you want is HA, Sentinel may be a more obvious way to get it compared to running Redis in cluster mode. In the long term, with plenty of early warnings, we may support the current Sentinel use case with Cluster and merge the two.

Neat! Really looking forward to that!

Came here to ask this too, Sentinels are pretty awful IMO

On https://github.com/antirez/redis/tree/3.0 branch, it shows "This branch is 1004 commits ahead, 1252 commits behind unstable".

What is the best alternative for redis?

Depends on the use case. There is a big set of problems you can solve with different technologies, and depending on your exact details, one will be better in one way and another will be better in some other way. The programmer's job is to pick the right one, weighing the different aspects: how well the data model fits the problem at hand, operational aspects, consistency guarantees, performance (number of nodes needed), scalability, simplicity (do I need support because it's complex?), and so forth.

That would depend pretty strongly on what about redis isn't working for you.

http://memcached.org/ used to be. I haven't done system architecture in about 2 years but when we were looking at in memory databases, it came down to Redis or Memcache.

Depending on what you need your database for, there are some that perform better than others. You can check this site for comparisons: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

Memcached is just a stupid key-value store, and by stupid I mean it just stores and retrieves values. It's extremely primitive compared to Redis.

It's not even close to the same thing as Redis except superficially.

The key/value part of Redis is just the beginning. The values themselves can be of several different types that allow for a lot more flexibility in how you store and query data.

Redis is normally not in memory. Everything is written to disk and survives a power failure.

Memcached is just a cache. Don't store anything there you can't lose.

Not sure that's quite right. From the docs:

>In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on your use case, you can persist it either by dumping the dataset to disk every once in a while, or by appending each command to a log.

Well, it's written to disk either way. You're right that if you do the "once in a while" setup, you can lose some data in a power failure.
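
The two persistence styles in the quoted docs can be illustrated with a toy store. This is a sketch, not how Redis is implemented: `ToyStore` and its methods are hypothetical, the "disk" is just Python objects, and only SET is supported. It shows why a periodic snapshot can lose recent writes while a per-command log can be replayed in full.

```python
class ToyStore:
    def __init__(self):
        self.data = {}       # the in-memory dataset
        self.snapshot = {}   # stands in for a periodic dump on disk
        self.log = []        # stands in for an append-only command log

    def set(self, key, value):
        self.data[key] = value
        self.log.append(("SET", key, value))  # appended on every write

    def save_snapshot(self):
        # Point-in-time copy; later writes are not included.
        self.snapshot = dict(self.data)

    def recover_from_snapshot(self):
        return dict(self.snapshot)

    def recover_from_log(self):
        # Replay every logged command in order to rebuild the dataset.
        recovered = {}
        for _, key, value in self.log:
            recovered[key] = value
        return recovered

store = ToyStore()
store.set("a", 1)
store.save_snapshot()
store.set("b", 2)  # written after the last snapshot

# Simulated crash: memory is gone, only the snapshot and the log remain.
from_snapshot = store.recover_from_snapshot()
from_log = store.recover_from_log()
```

Here the snapshot recovers only `a`, while the log recovers both keys. In a real deployment, how much the log can lose also depends on how often it is flushed to disk (the `appendfsync` setting).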

At a previous job people started using Redis thinking it was a fast in-memory data store. It turned out we had accumulated tens of thousands of records.

I haven't measured, but I doubt it's much faster than Postgres. It does have other nice features. I like using the expiring records for caching.

There isn't really one. Memcached fulfills a subset of its simplest use cases.

Hey, I also think you should try http://tarantool.org if you want persistence and an application server.

Are Oracle Coherence and Infinispan considered alternatives to redis? Don't know much about any of these besides their names to be honest.

I think yes. Take a look at Apache Ignite Data Grid as well.

Look at HyperDex http://hyperdex.org/

Last time I looked into HyperDex it required proprietary extensions to get access to all of its features.

That would be Warp[0], which adds ACID transactions across multiple keys.

[0] http://hyperdex.org/solutions/

One alternative is to use the database you're already using (like Postgres) until you actually need Redis.

What do you want to do?

April 1st is one of the worst dates for announcements ever...

    p.s. no April fools here. 1 year ago I released HyperLogLog, today 3.0.0, just to contrast with some shit done this April fool lameness.

Agreed, I thought it was going to be a spoof of the MongoDB 3.0 announcement/release..


Why not just link to the google groups post or the release notes?

Still no SSL, so using redis-client still just spews your password out all over the internet.

There are plenty of alternatives to every library shipping yet another, probably broken, security layer. It's probably better to keep this layer separate than to have everyone implement it themselves.

Like stunnel: https://www.stunnel.org/index.html and how to set it up (Re: MySQL over stunnel) http://linuxgazette.net/107/odonovan.html

Why are you connecting to a Redis box across the internet? There's a great (and after Heartbleed, prophetic) post on the Varnish web site about why they don't implement SSL, I imagine Redis would be similar:


I love this post. Not every single piece of software needs to include SSL support out of the box. Sometimes, for the exact reasons Varnish explains, it just doesn't make sense.

Varnish is a different thing. For cross-datacenter replication you will want SSL. So for Redis Cluster it's a necessary thing.

For cross-datacenter replication you should be using a secured VPN anyway.

No, if your db can use SSL, then an additional layer of complexity is not required.

upd.: don't get me wrong, Redis is my favorite DB, really. But better to be objective.

It's hard to imagine every service in your infrastructure implementing SSL would be more secure than a single VPN tool. You are very optimistic about the difficulties of getting security right.

It's really simple to imagine and I even have implemented it :) "One single VPN" may (and will) fail sometimes, so count your complexity and stability with and without one extra service.

I'm sorry to be skeptical, but when a random person on the internet claims to have implemented SSL more securely than open source tools that are completely built around security, I tend to not believe it.

Implementing SSL is easy. Implementing SSL correctly is very difficult, and you probably won't find out you did it wrong for a long time, if ever.

I'm not implementing SSL, I just use it. With MySQL you can just use it. With Redis you have to use a VPN, with all the costs of a VPN. Please calm down and stop forcing your preference of VPN as the only right way.

Surely you would keep this in a private network? Layer 2/Encrypted VPNs?

For Redis - yes. But a VPN means additional latency and an additional service you have to monitor/restart/duplicate.

Transport security is mostly better implemented via ipsec (a VPN tunnel).

I'm happy that redis doesn't implement SSL, it just shows that they are prioritizing relevant features.

As an operations person, this is the wrong way to go. The VPN becomes a single point of failure. Attempts at HA fail in my experience.

Also solutions like stunnel create a separate process that has to be managed. If I have one for redis, and then one for something else it is harder to tell them apart, because both will be named stunnel.
