
Redis on Acid: 0.5M ops/sec, 1ms latency and ACID compliance - dvirsky
https://redislabs.com/blog/your-cloud-cant-do-that-0-5m-ops-acid-1msec-latency/
======
aaronblohowiak
"almost always" atomicity means "not atomic". (if i guess correctly, there is
no stage/commit phase for data mutations in the aof persistence, so i believe
incremental changes are written to the AOF as the script runs so you can have
partially-applied updates on restore if your redis process dies in the middle
of executing a watch/multi/exec or lua script.) also, instances die all the
time.

~~~
antirez
Hello, I'm not sure of the setup Redis Labs is using, but vanilla Redis AOF
does not allow partially-applied updates. Not for Redis transactions
(MULTI/EXEC) nor with scripting. The same happens in the replication channel.
In order to enforce this, Redis goes a very long way to avoid partial
applications of Lua scripts; more info is in the EVAL page and in the
description of the -BUSY error for scripts that have already made writes but
are not returning.

~~~
aaronblohowiak
ah, thank you. I should have checked the docs! It is nice to know that you are
writing the script and multi/exec semantics into the AOF log (at least in the
default config) and that my guess is wrong.

I still wonder what the details are around the "almost" in "almost always" and
stand by the conclusion that "almost atomic" is not the same as "atomic".

~~~
antirez
My best guess is that the author is referring to the fact that there are no
rollbacks in Redis transactions, but I'm not sure. I'll try to ask internally.
Thanks!
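
If the "almost" does refer to the lack of rollbacks, the behaviour can be sketched with a toy model. This is not Redis code: `ToyTransaction` and its methods are hypothetical names, and the dict stands in for the keyspace. It only shows that queued commands all run at execute time and that a failing command does not undo the writes around it.

```python
class ToyTransaction:
    """Toy model of queue-then-execute semantics; not real Redis code."""

    def __init__(self, store):
        self.store = store
        self.queue = []

    def set(self, key, value):
        self.queue.append(("set", key, value))

    def incr(self, key):
        self.queue.append(("incr", key))

    def execute(self):
        results = []
        for op, key, *args in self.queue:
            try:
                if op == "set":
                    self.store[key] = args[0]
                    results.append("OK")
                else:  # incr
                    self.store[key] = int(self.store.get(key, 0)) + 1
                    results.append(self.store[key])
            except ValueError as exc:
                # The failure is reported, but nothing is rolled back and
                # the remaining queued commands still run.
                results.append(exc)
        self.queue.clear()
        return results


store = {"name": "redis"}
tx = ToyTransaction(store)
tx.set("a", 1)
tx.incr("name")  # fails at execute time: the value is not an integer
tx.set("b", 2)
results = tx.execute()
```

After `execute()`, both `a` and `b` are set even though the middle command failed, which is the "no rollback" property in miniature.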

------
znpy
Questions I would like to see answered:

- How does OSS Redis compare to Enterprise Redis?

- What are the differences when running the same benchmarks with both OSS
Redis and Enterprise Redis?

- What is the marginal utility of an additional CPU core/thread? That is,
what happens if I run those benchmarks on an AMD Threadripper?

~~~
koolba
I'm curious about all of these as well. Also, is this a first-party (i.e. antirez)
piece of software? If not, has he blessed calling something "Redis
Enterprise"?

~~~
jonesetc
I believe he still works for redislabs, so I'm sure this is all done with his
knowledge.

~~~
dvirsky
Yes, of course it is (I work for Redis Labs as well).

------
segmondy
Very impressive, especially how it doesn't seem to matter much on the
read/write ratio. I've only used redis to cache, do folks really use it as a
DB?

~~~
edem
I'd also like to know what else it is used for.

~~~
neomantra
I use it for a specialized time-series storage / messaging layer. We are
receiving stock market data directly, normalizing it into JSON and also
PUBLISHing these objects via Redis to consumers (generally connected through a
custom WebSocket gateway). We basically turn the whole US stock market into an
in-memory sea of JSON, optimized for browser-based visualization.

Redis is great because of its multiple data structures. Depending on their
"kind", these JSON objects are either `APPEND`ed onto Redis Strings (e.g. for
time&sales or order history) or `HSET` (e.g. opening/closing trade) or ZSET
(e.g. open order book).

Sometimes an object transitions from a SortedSet to a String. We used to
handle this with `MULTI` but now we use custom modules to do this with much
better performance (e.g. one command to `ZREM`, `APPEND`, `PUBLISH`).
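
For reference, the older `MULTI`-based version of that transition might look roughly like the sketch below. It assumes a redis-py style client (`pipeline(transaction=True)`, `zrem`, `append`, `publish`, `execute`); the function name, key names, channel name and the newline separator are all hypothetical, not the actual schema described here.

```python
def move_order_to_history(client, book_key, history_key, channel, order_json):
    """Remove an order from the sorted-set book, append it to the history
    String and publish it, all inside one MULTI/EXEC block.

    `client` is assumed to be a redis-py style client, e.g. redis.Redis().
    """
    pipe = client.pipeline(transaction=True)  # redis-py wraps this in MULTI/EXEC
    pipe.zrem(book_key, order_json)                # drop it from the open book
    pipe.append(history_key, order_json + "\n")    # newline separator is an assumption
    pipe.publish(channel, order_json)              # notify subscribers
    return pipe.execute()                          # one result per queued command
```

With a real server this would be called as something like `move_order_to_history(redis.Redis(), "book:XYZ", "history:XYZ", "orders:XYZ", payload)`. A module command does the same three steps server-side without the round of queued commands.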

We run these Redis/feed-processor pairs in containers pinned to cores and
sharing NUMA nodes using kernel bypass technology (OpenOnload) so they talk
over shared-memory queues. This setup can sustain very high throughput (>100k
of these multi-ops per second) with low, consistent latency. [If you search
HN, you'll see that I've approached 1M insert ops/sec using this kind of
setup.]

We have a hybrid between this high-performance ingestion and long-term
storage. To reduce memory pressure (and since we don't have 20 TB of memory),
we harvest these Redis Strings into object storage (both NAS and S3 endpoints)
with Postgres storing the metadata to facilitate querying this.

We also do mundane things like auto-complete, ticker database, caching, etc.

I love this tech! It's extremely easy to hack Redis itself and now with
modules you don't even need to do that anymore.

------
nosefouratyou
I would think that something like hyperdex/warp[1] would be better for that
use case.

[1] [http://hyperdex.org/](http://hyperdex.org/)

[2] [https://www.cs.cornell.edu/people/egs/papers/hyperdex-sigcom...](https://www.cs.cornell.edu/people/egs/papers/hyperdex-sigcomm.pdf)

[3] [http://rescrv.net/papers/warp-tech-report.pdf](http://rescrv.net/papers/warp-tech-report.pdf)

~~~
lobster_johnson
Hyperdex is no longer maintained [1]. While the technology is impressive, the
author seems to have lost interest, and is now working on something called
Consus [2].

Hyperdex's problem all along was that the author — a very talented developer
from what I can tell — seems more invested in his projects from the
perspective of academic research (he's at Cornell) than in delivering a
practical, living open source project. He tried to form a company around
Hyperdex (the transactional "Warp" add-on thing was commercial) even though
nobody seemed to be using it; and he was the sole developer. Unfortunately, as
interesting as Consus is, history seems to be repeating itself there.

But yeah, Hyperdex seemed to have real potential at one point. It was the only
NoSQL K/V store (at the time) that had transactions.

[1]
[https://github.com/rescrv/HyperDex/issues/233](https://github.com/rescrv/HyperDex/issues/233)

[2] [https://github.com/rescrv/Consus](https://github.com/rescrv/Consus)

~~~
misframer
> It was the only NoSQL K/V store (at the time) that had transactions.

What about FoundationDB?

~~~
lobster_johnson
Not open source, though.

~~~
misframer
Warp wasn't open source either.

~~~
lobster_johnson
Good point, but at least you could try out Hyperdex and consider whether you
wanted transactions. But this is pretty moot at this point, unless someone
picks up Hyperdex development again.

------
cjhanks
I guess I don't understand who requires less than 1ms latency on ACID writes
in a system accessible only through socket interfaces. Even so - isn't this
benchmark simply pushing the requirement of fast disk syncs onto a fast flash
drive designed for DMA? I mean, okay.. I guess... did customers actually think
the networking wasn't the latency culprit?

------
dis-sys
Stopped reading after I saw the following statement - “All or nothing”
property is almost always achieved, excluding cases like...

Almost always achieved? That actually means it is "never really achieved",
fixed for you.

It is just sad that cheap marketing materials like this one keep being
pushed to the front page of HN.

------
b34r
Why wouldn't they say 500kops/s? Using a small number like 0.5 is just bad
marketing.

~~~
overcast
Because anyone who actually understands any of that knows it's an impressive
number regardless. They aren't "marketing" this to the general public.

------
latch
FWIW, I just get a blank page in FF with uBlock running. I believe it's the
"naked-social-share" which various 3rd-party extensions (Fanboy's) are
blocking.

------
patkai
I don't like to admit it, but I still don't get it: could I use Redis as a main
webapp backend?

~~~
immad
Redis won't function till it has loaded all data from disk to memory. So if
your webapp doesn't have too much data then possibly.

Also, since Redis is memory-backed, you need more RAM than data. This can get
very costly.

Another annoyance is that Redis is single-process and single-threaded, so you
really have to avoid long-running queries unless you do extensive
manual sharding.

(Disclaimer: it's been a few years since I had to think about these
constraints so maybe some are removed in more recent versions of Redis)

~~~
yaaminu
Redis can now be multi threaded in custom modules

~~~
dvirsky
Sort of - the model it supports is not really multi-threaded. The modules can
spawn threads, and acquire a "GIL" when they want to touch actual Redis data -
thus only one thread at a time actually "owns" the entire Redis instance.

This allows long running queries to do primitive cooperative multi-tasking,
releasing the GIL and letting other queries have a chance, but there is no
real parallel data access. You only gain real parallelism if a thread has
actual work to do that does not touch the data directly while it is not
holding the GIL. There aren't many cases that this applies to - usually
copying the data aside to do work on it is not worth the gain of parallelism.
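
As a rough analogy only (real modules are written in C against the modules API, and Python has its own interpreter lock), the pattern looks like the sketch below. `data_lock`, `dataset`, `totals` and `long_query` are hypothetical names: each thread takes the lock just long enough to copy a batch aside, releases it so other queries get a turn, and does its summing on the private copy.

```python
import threading

data_lock = threading.Lock()  # stands in for the module "GIL"
dataset = {"k%d" % i: i for i in range(1000)}  # stands in for the Redis data
totals = {}


def long_query(name, keys, batch_size=100):
    """Sum values in batches, holding the lock only while reading the data."""
    total = 0
    for start in range(0, len(keys), batch_size):
        with data_lock:  # acquire: this thread now "owns" the dataset
            batch = [dataset[k] for k in keys[start:start + batch_size]]
        # Lock released: other threads can take a turn, and the work below
        # only touches this thread's private copy of the batch.
        total += sum(batch)
    with data_lock:  # re-acquire to write the result back
        totals[name] = total


keys = sorted(dataset)
threads = [
    threading.Thread(target=long_query, args=("q%d" % i, keys))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The copy inside the locked section is exactly the cost mentioned above: if summing the batch is cheap, the copying eats most of what the extra threads would gain.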

------
sgmansfield
It's worth noting that they're running this benchmark on an absolutely
enormous (and expensive) instance:
[https://aws.amazon.com/ec2/instance-types/x1/](https://aws.amazon.com/ec2/instance-types/x1/)

~~~
mkj
Keep reading - AWS latency was too high so the benchmark was run on some bare
Dell hardware.

------
jjawssd
All I see with uBlock is a black screen. I have to whitelist over a dozen ad
trackers to see the content. Several ad scripts pull in even more scripts from
external domains. Try it yourself.

~~~
janpieterz
uBlock Origin working fine here. 2 requests blocked.

~~~
jjawssd
Are you using the uBlock defaults? I have blocked everything above "Regions,
languages" in the 3rd-party filters tab except the experimental filter.

