
RediSearch – Redis Powered Search Engine - gilad
https://oss.redislabs.com/redisearch/
======
jonstewart
From an information retrieval perspective, this is embarrassing, and the
benchmarks aren’t close to being apples and oranges—it’s more like apples and
Mack trucks. The linked benchmark is for a toy data set of 5.6GB, which we all
know Redis will store in RAM; yet they don’t say whether they set
“index.storage.type: memory” on ElasticSearch (guessing they didn’t).

At the same time, ElasticSearch/Lucene puts considerable effort into analysis
at both indexing time and querying time that goes into ranking the search
results. RediSearch’s ranking is “user provided”—how does that work exactly?
What does that even mean? Ranking is at the heart of information retrieval—of
what value is it to allow for 4x more queries to be run on a cluster when the
result sets are terrible?

ElasticSearch could be better at scaling up on nodes to handle more query
operations on a given cluster. However, if you care about full text search,
this won’t do the job.

~~~
jiofih
It’s amazing that you reached the conclusion that it won’t do the job after
reading a benchmark for 2 minutes!

The benchmarks are quite realistic (5m docs, 25m products) for document
search, and the settings reflect real usage. There is no “memory” storage type
for ES, it has “mmapfs” which the documentation explicitly says is dangerous
and might be removed. The default `hybridfs` already has the ability to use
memory mapped files as an optimization. Forcing mmap would actually make it an
unrealistic benchmark.

> ElasticSearch could be better at scaling up on nodes to handle more query
> operations on a given cluster

Based on? They are both running with 5 shards for the benchmark.

~~~
joking
Well, you have MemoryIndex, which is a different proposition (it's used for the
percolate feature). I can get that redis search can be faster than Lucene
based search engines, but actually running the search is the easiest part,
there's a lot of thought and effort on making a decent search engine, and the
article doesn't tell me if I can get facets, order by any field, make both
fulltext and geospatial queries, and many more things important when you make
a search engine.

I remember long ago when SQL Server included full text, we thought that our
time fiddling with lucene.net had come to an end, but when we tried it, it
failed miserably because the full text engine and the relational one were
like 2 different parts, and if you wanted to make a full text search and order
by a numeric field, it would have to make a temporary table with all the
results of the full text and order that. Those are the things that Lucene
solves so well, that I'm reticent to think that redis search has managed to
make them ok at the first try. So, not an apples to oranges benchmark, but if
you already have redis and the search capabilities that you want to add are
fulfilled by redis search, it can be a good product.

~~~
kristoff_it
> I'm reticent to think that redis search has managed to make them ok at the
> first try.

Your example doesn't really match the reality of Redis. Modules in Redis can
bring their own data types and algorithms and don't need to resort to the same
kind of hack you mentioned. The ecosystem inside Redis is designed for
modularity and clear, minimalistic interfaces between components.

RediSearch might not be perfect, but any problem will stem from a different
set of causes, not because it tries to "emulate" a data model.

The whole point of Redis is not having to emulate data types and algorithms.

~~~
derefr
> Modules in Redis can bring their own data types and algorithms and don't
> need to resort to the same kind of hack you mentioned.

To emphasize this:

One thing people might misunderstand about Redis is that Redis extension
developers _aren't_ expected to fit their data structures to the needs of a
storage-engine. It's not like Cassandra, where your data structures must 'boil
down' to key-value pairs; nor is it like an RDBMS, where your data structures
must 'boil down' to tuple-sets; nor like a graph DBMS, where your data
structures must 'boil down' to EAV triples. Redis data structures aren't
"implemented in terms of" any other simpler 'canonical' data structure.

Instead, when you look at something like Redis Streams or the Redis Graph
module, the whole complex data structure for each stream/graph is a big opaque
in-memory thing dangling off a _single_ key. It doesn't need to be broken down
into parts "legible" to Redis. It can just be what it is. "Objects" in the
Redis keyspace (the things keys are holding) can't hold pointers to one-
another, so the core Redis commands can blindly manipulate them (e.g.
deallocate them.) For everything else, you go through the module's commands,
which walk the internals of the data-structure as what it is: a plain-old in-
memory C struct, defined in your module's header files.

The canonical representation of a data structure in the Redis AOF (WAL log) is
just the sequence of commands used to build it; not the data-structure itself.
So, as a module developer, to get AOF persistence of your module's types, you
don't need to do a thing, other than ensuring that your module's commands are
deterministic.
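
A minimal sketch of that idea, using a toy in-memory store rather than Redis itself: every write command is journaled, and the state is rebuilt by replaying the journal. The `ToyStore` class, its command names, and `replay` are hypothetical and only illustrate why deterministic commands are enough for a command-log style of persistence.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """A toy keyspace that journals every write command (hypothetical, not Redis)."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def execute(self, command, key, *args, journal=True):
        # Each command is deterministic: same inputs always give the same state.
        if command == "SET":
            self.data[key] = args[0]
        elif command == "APPEND_ITEM":
            self.data.setdefault(key, []).append(args[0])
        elif command == "DEL":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unknown command: {command}")
        if journal:
            # The journal stores the command, not the resulting data structure.
            self.log.append((command, key, *args))


def replay(log):
    """Rebuild a store from nothing but the journal of commands."""
    store = ToyStore()
    for command, key, *args in log:
        store.execute(command, key, *args, journal=False)
    return store


original = ToyStore()
original.execute("SET", "greeting", "hello")
original.execute("APPEND_ITEM", "queue", "job-1")
original.execute("APPEND_ITEM", "queue", "job-2")
original.execute("DEL", "greeting")

restored = replay(original.log)
```

If a command depended on something non-deterministic (the current time, a random number), replaying the journal could produce a different state, which is why determinism is the one requirement named above.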

You _do_ need to do a bit of work to get your module's types to serialize into
Redis RDB snapshots. But it's completely up to you how to define your types'
serializations. Redis just provides an API for writing and reading scalar
types from the RDB file stream. How your module uses them to save/load a value
of a type is up to you. (And you can just skip this if you like; RDB
persistence is used far less often than AOF persistence, so support for RDB
persistence isn't even a highly demanded feature for modules. If you don't
bother, then loading your module just disables RDB persistence.)
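
To make the "scalar read/write primitives" point concrete, here is a rough sketch in Python rather than C. The helpers `save_unsigned`, `save_string` and their `load_*` counterparts are hypothetical stand-ins for the kind of primitives described above, and the posting-list layout is an assumption chosen for illustration, not RediSearch's actual on-disk format.

```python
import io
import struct


def save_unsigned(stream, value):
    # Fixed-width little-endian 64-bit unsigned integer.
    stream.write(struct.pack("<Q", value))


def load_unsigned(stream):
    (value,) = struct.unpack("<Q", stream.read(8))
    return value


def save_string(stream, text):
    # Length-prefixed UTF-8 bytes.
    raw = text.encode("utf-8")
    save_unsigned(stream, len(raw))
    stream.write(raw)


def load_string(stream):
    length = load_unsigned(stream)
    return stream.read(length).decode("utf-8")


def save_postings(stream, postings):
    """Serialize a dict of term -> list of document ids using only scalars."""
    save_unsigned(stream, len(postings))
    for term, doc_ids in postings.items():
        save_string(stream, term)
        save_unsigned(stream, len(doc_ids))
        for doc_id in doc_ids:
            save_unsigned(stream, doc_id)


def load_postings(stream):
    """Read back exactly what save_postings wrote, in the same order."""
    postings = {}
    for _ in range(load_unsigned(stream)):
        term = load_string(stream)
        count = load_unsigned(stream)
        postings[term] = [load_unsigned(stream) for _ in range(count)]
    return postings


buffer = io.BytesIO()
index = {"redis": [1, 2, 5], "search": [2, 7]}
save_postings(buffer, index)
buffer.seek(0)
restored_index = load_postings(buffer)
```

The host only sees a stream of integers and strings; the meaning of that stream (counts first, then terms, then ids) lives entirely in the type's own save and load functions.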

~~~
kristoff_it
Thank you, exactly what I was meaning to say.

------
softwaredoug
I’m trying to imagine why I would not use existing, mature search tech
(Solr/Elastic) with existing mindshare I can hire.

Why would I pay to be off on an island very locked into a proprietary search
tech? Even if it is a little faster with the redis brand attached to it? It
doesn’t seem quite as turnkey as Algolia or as deeply featured as Lucidworks
Fusion...

Maybe I just don’t get the pitch yet...

~~~
dvirsky
You don't have to pay for it. The original license was completely Free As In
Speech and even today it costs only for the distributed version. There is even
a fork of it preserving the original license.

It attaches to existing redis hash keys and allows you to add an index to your
already existing data for example.
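
As a rough illustration of indexing existing hashes, the sketch below only builds the command arguments and does not talk to a server. It assumes the RediSearch 2.x `FT.CREATE ... ON HASH PREFIX ... SCHEMA ...` and `FT.SEARCH ... SORTBY ... LIMIT ...` syntax; the index name, key prefix, and field names are made up for the example, and the helper functions are hypothetical.

```python
def build_create_index(index, prefix, text_fields, numeric_fields=()):
    """Build an FT.CREATE command that indexes hashes whose keys start with prefix."""
    args = ["FT.CREATE", index, "ON", "HASH", "PREFIX", "1", prefix, "SCHEMA"]
    for name in text_fields:
        args += [name, "TEXT"]
    for name in numeric_fields:
        # SORTABLE lets results be ordered by this field.
        args += [name, "NUMERIC", "SORTABLE"]
    return args


def build_search(index, query, sort_by=None, limit=10):
    """Build an FT.SEARCH command, optionally ordered by a sortable field."""
    args = ["FT.SEARCH", index, query]
    if sort_by is not None:
        args += ["SORTBY", sort_by, "ASC"]
    args += ["LIMIT", "0", str(limit)]
    return args


create_cmd = build_create_index(
    "idx:products", "product:", ["title", "description"], ["price"]
)
search_cmd = build_search("idx:products", "wireless headphones", sort_by="price")
```

With a client such as redis-py, these lists could be sent with `client.execute_command(*create_cmd)`; check the command reference for the version you run before relying on the exact arguments.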

Disclaimer - I'm the original author of RediSearch (started in 2016), before
the switch to a non FOSS license. I'm not affiliated with it anymore and
haven't followed its development for the past couple of years. But I can tell
you that our first users were people who couldn't get ES working for their
workloads, or had good redis infrastructure in place already and were able to
utilize it for this as well.

~~~
softwaredoug
Awesome, this is really helpful to understand what the value prop is!

The non FOSS licensing mixed with comments about needing an "Enterprise"
account confused me TBH...

~~~
dvirsky
What I was aiming for originally beyond speed, was a simple API and an
intuitive query language
([https://oss.redislabs.com/redisearch/Query_Syntax/](https://oss.redislabs.com/redisearch/Query_Syntax/))
that scales from simple textual queries to structured queries, and a big
emphasis on real time incremental indexing as the core features. There is no
batch vs. incremental indexing - you feed it a document and it's searchable
immediately, which I thought was very important for an in memory database (it
comes at the cost of avoiding some nifty index compression techniques, the
compression is pretty straightforward).
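
The "searchable immediately" property can be sketched with a toy inverted index that is updated in place on every write, with no separate batch or commit step. This is an illustrative model only, assuming whitespace tokenization and AND semantics; the `IncrementalIndex` class is hypothetical and says nothing about RediSearch's real data structures.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that is updated in place on every write."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.docs = {}                    # document id -> original text

    def add(self, doc_id, text):
        # Re-adding an existing id replaces the old version of the document.
        if doc_id in self.docs:
            self.remove(doc_id)
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def remove(self, doc_id):
        text = self.docs.pop(doc_id)
        for term in set(text.lower().split()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        # AND semantics: a document must contain every query term.
        terms = query.lower().split()
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result


index = IncrementalIndex()
index.add(1, "Redis powered search engine")
first = index.search("redis search")   # document 1 is visible right away
index.add(2, "Lucene based search")
second = index.search("search")
index.add(1, "in memory database")     # update replaces the old terms
third = index.search("redis")
```

Keeping the postings as plain sets is what makes in-place updates cheap here; it is also the kind of layout that rules out heavier index compression, which is the trade-off mentioned above.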

But TBH it started as a demo dogfooding project for the redis module system -
something to demonstrate how powerful the API is, and at the same time test
it, find bugs and design problems with it, and gather real developer asks. It
became the reference project for the modules API and a lot of the module
system's features were designed to accommodate it (most notably async
execution of slow commands).

But then a few people started using it (which all of a sudden gave
RediSearch itself user asks and testing and all that), requesting features and
being happy with it when these features got implemented, and it slowly started
to be a thing, with people even contributing code. Then we wrote the
distributed version requested by enterprise users, and it became a product,
which is now developed by a team. I left it a bit over two years ago.

------
dilandau
Good for them on building a business around redis.

That said, fuck paying for some bolt-on approximation of a search engine with
no real durability. I'll use elastic or solr.

They're free, and have massive communities and user-bases. Issues are surfaced
quickly. Compatibility is a priority. And all the other benefits of network
effects.

~~~
RhodesianHunter
> fuck paying for some bolt-on approximation of a search engine with no real
> durability

Presumably you're storing your data elsewhere and pumping it into Redis for
search

------
rawoke083600
I think it's not terrible - redis itself is just brilliant! But wow, a
search-redis? I think they have a lotttt of ground to make up to compete in the
search space. The amount of options/analysis/tokenization/other-search
features you get out the box from anything Lucene based (solr,elastic) is just
staggering.

~~~
gkorland
Did you check the redisearch docs? it covers all the features you mentioned
see:
[https://oss.redislabs.com/redisearch/#primary_features](https://oss.redislabs.com/redisearch/#primary_features)

~~~
rawoke083600
Damn, my bad! I should have checked first! Wow, I didn't realise they were that
feature-complete!

------
say_it_as_it_is
I was under the impression that Redis was intended for short-lived data, but a
full text search suggests otherwise. Are people using Redis, with persisted
storage, as their database?

~~~
sergiomattei
Yes, in a strange way: I use Stream-Framework by GetStream (curiously moved
away from it to GS today).

The backend storage for stream activities is Redis. It's lightweight and fast
enough for most use cases.

Sure, not a full on database use case... But it is data that is persisted.

------
Roritharr
I haven't followed Redis Modules so far, what's the best way to start the road
towards prod usage if otherwise I rely on the AWS ElastiCache version?

~~~
411111111111111
I haven't used RediSearch myself for the simple reason that you need to pay
for a Redis Enterprise Licence in order to use it.

[https://redislabs.com/redis-enterprise-cloud/compare-
us/](https://redislabs.com/redis-enterprise-cloud/compare-us/)

[https://redislabs.com/redis-enterprise-
cloud/pricing/](https://redislabs.com/redis-enterprise-cloud/pricing/)

That's not particularly expensive, but it's a dealbreaker for a hobby project
on which I would want to use it. Sadly, there is no non-commercial licence
either.

~~~
Roritharr
I have no problem paying for it, but what I miss is a clear recommendation how
to get started if you have the rather common case of your own AWS Account
using ElastiCache.

If I buy into the Redis Enterprise Cloud, what are its preferred usage
scenarios, how do I secure the connection properly, etc. etc. etc.

This is just shoving a buy button in my face and I have no idea how to decide
if what I'm buying is actually feasible for me, for example from a compliance
pov.

~~~
gkorland
Redis Cloud Pro can run in your VPC.

~~~
Roritharr
Great. How? Is that recommended? What kind of user rights do I need to give redis
to manage that instance in my vpc? How to lock that down to the minimum
without breaking it? In that configuration, what risks do I have to control
for otherwise? ...

~~~
itamarhaber
All these details should be covered by [https://docs.redislabs.com/5.2/rv/how-
to/creating-aws-user-r...](https://docs.redislabs.com/5.2/rv/how-to/creating-
aws-user-redis-enterprise-vpc/)

------
Kerollmops
Nice work, it’s pretty awesome to be able to plug a search engine as easily as
adding a plugin. I just have one question about the recently published
benchmarks: are the latencies based on the two-word search "hello world"?

------
pqdbr
Does RediSearch require that my dataset fits within RAM? I couldn’t find that
information.

~~~
ddorian43
Yes. Unless you use something like Redis Flash or KeyDB that store data in
flash too. You should check the data structures and functions being done on
them, if it can be efficient to also work with disk. Ex: increment in redis is
fast, but in another db may end up as Rocksdb get+update which is not.

------
faizshah
Are we pronouncing this Ready-Search, Redis-Search, or Redis-urch? Personally
I vote Ready Search but I’m also in the Postgres-Sequel camp :)

~~~
stronglikedan
ree-dih-search

EDIT: reh-dih-search, since the "re" is pronounced like in "red" [0]
[https://redis.io/topics/faq](https://redis.io/topics/faq)

~~~
francislavoie
The icon is red so I call it red-is

~~~
stronglikedan
You're right, according to the last question in their FAQ [0]. Updated my
original comment.

[0] [https://redis.io/topics/faq](https://redis.io/topics/faq)

------
zumachase
I wish Redis would focus a bit more on the standard clustering options.
Cluster has some real drawbacks and Sentinel is about the most brittle service
I’ve seen. We run all of our peer discovery for Squawk[1] on Redis but
recently made the switch to KeyDB and have been thrilled with the resiliency.
We’re definitely not Redis experts so I’m sure we’re doing some things wrong,
but for anyone else looking for a rock solid HA solution, I highly recommend
it.

[1] Squawk - Walkie Talkie for Teams
[https://www.squawk.to](https://www.squawk.to)

------
karterk
> Cluster Support and Commercial Version: RediSearch has a distributed cluster
> version that can scale to billions of documents and hundreds of servers.
> However, it is only available as part of Redis Labs Enterprise.

This is a bummer because high availability is really important for many search
use cases. E.g. think about e-commerce, where search literally prints
money.

EDIT: it seems like the open source version supports a read-only replica for
failover, but my overall thoughts about not crippling/compromising the
clustering story in the open source version still stand.

While I understand the rationale for this move, unfortunately not having HA is
a non-starter.

I had a similar temptation when open-sourcing Typesense
([https://github.com/typesense/typesense](https://github.com/typesense/typesense))
and thought long and hard about keeping clustering as part of a closed source
commercial edition but eventually decided against it. I understand that
commercialising certain features is a necessary evil and trade-offs must be
made. However, I think there are still many avenues to do that without keeping
clustering closed source.

Apart from that, I am happy to see the search space heating up with a lot more
interesting options.

Disclosure: I have stakes in the open source search eco-system
([https://github.com/typesense/typesense](https://github.com/typesense/typesense))
but I'm also a genuine Redis fan.

~~~
gkorland
HA is available and doesn't require Redis Labs Enterprise; the comment above
refers to what is called "Redis Cluster", meaning sharding.

~~~
karterk
Thanks, it seems like read-only replication is supported. I've updated my
comment.

------
sam_lowry_
What is this "source available" thing in the first paragraph? You can see our
source but can't touch it?

~~~
filipn
You can read about their licenses here:
[https://redislabs.com/legal/licenses/](https://redislabs.com/legal/licenses/)

    
    
      Redis Modules created by Redis Labs (e.g. RediSearch, RedisGraph, RedisJSON, RedisML, RedisBloom) are licensed under the Redis Source Available License (RSAL).
    

Here is their RSAL license: [https://redislabs.com/wp-
content/uploads/2019/09/redis-sourc...](https://redislabs.com/wp-
content/uploads/2019/09/redis-source-available-license.pdf)

~~~
teruakohatu
Basically you can't use it for a database product, caching engine, stream
processing engine, search engine, indexing engine or ML/DL/AI serving engine.
I feel just about any web app does at least one of these things.

~~~
411111111111111
the correct way to phrase it would be: you cannot use it unless you pay for a
licence.

its source is only available so you can look at the implementation if you wish
to explore the internals.

~~~
teruakohatu
I was referring to the Redis Source Available License, I am sure if you pay
enough you can modify the source too.

~~~
gkorland
you can modify it even if you don't pay, as long as your application is not a
"Database Product"

~~~
sam_lowry_
What application is not a "Database Product", one way or another?

~~~
jashmatthews
If you resell Redis that’s a DB product. If you sell widgets and use Redis for
search it’s not. There’s obviously some grey areas but I thought it was
relatively clear?

------
artembugara
Any advantages over Elasticsearch?

EDIT: I did read the article, though I overlooked the link to comparison with
Elasticsearch.

~~~
vvangemert
Did you even read the article? There's a link in there which benchmarks it
against elasticsearch. Read the conclusion on:
[https://redislabs.com/blog/search-benchmarking-redisearch-
vs...](https://redislabs.com/blog/search-benchmarking-redisearch-vs-
elasticsearch/)

~~~
bryanrasmussen
I mean speed is nice, but it is not the primary thing I am concerned with in a
search engine, as long as it is acceptable it is not in the list of
requirements - things that would be on my list - not necessarily in this order
but close

1\. What human languages does it support.

2\. In these human languages how does stemming and decompounding work in your
implementation.

3\. how is word importance determined in your index - TF-IDF? other algorithm?
Are least important words automatically dropped from queries?

4\. Do you have ability to rank on both the stemmed/decompounded query/results
and exact matches? So something like raw field access.

5\. Can I create my own semantics - I remember seeing a post on here recently
where someone had created a search engine (in Rust I think) that was faster
than ElasticSearch but from what I could see you couldn't create your own
field names so you were stuck searching in title, description, body,
creationDate and a couple other fields which really decreases the usefulness.

I mean these are the things that right away spring to mind to ask about when
someone tells me they have a new search engine, and when they show me look at
my speed benchmarks I'm thinking "what am I supposed to do with this?"

on edit: formatting

on second edit: So I guess as in most things I am interested in how the
product actually fulfills what should be its primary functionality, so how
does the search engine function as a search engine, I suppose my questions
could be answered with quick - our search engine has feature parity with
ElasticSearch / Solr where features A, B, and C are concerned - features D and
E will be supported in the future.
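
For readers unfamiliar with the TF-IDF weighting asked about in point 3, here is a small self-contained sketch. It uses one common textbook variant (term frequency normalized by document length, natural-log inverse document frequency) and whitespace tokenization; it is not a description of how RediSearch or Lucene actually score documents, and `tf_idf_scores` is a hypothetical helper.

```python
import math
from collections import Counter


def tf_idf_scores(documents, query):
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in counts:
                tf = counts[term] / len(tokens)
                # A term present in every document gets idf = log(1) = 0.
                idf = math.log(n_docs / doc_freq[term])
                score += tf * idf
        scores[doc_id] = score
    return scores


docs = {
    "a": "the redis search module",
    "b": "the lucene search library",
    "c": "the redis cache",
}
scores = tf_idf_scores(docs, "the redis module")
ranked = sorted(scores, key=scores.get, reverse=True)
```

Note how "the", which occurs in every document, contributes nothing to any score. That is the mechanism by which very common words end up effectively ignored even without an explicit stop-word list.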

~~~
gkorland
The answer in general is "Yes, RediSearch supports all of these features".
You can read about it all in the docs
[https://redisearch.io](https://redisearch.io)

I also pointed below to the specific relevant area in the docs.

> 1\. What human languages does it support.
>
> 2\. In these human languages how does stemming and decompounding work in your
> implementation.

[https://oss.redislabs.com/redisearch/Stemming/](https://oss.redislabs.com/redisearch/Stemming/)

> 3\. how is word importance determined in your index - TF-IDF? other
> algorithm? Are least important words automatically dropped from queries?
>
> 4\. Do you have ability to rank on both the stemmed/decompounded
> query/results and exact matches? So something like raw field access.

[https://oss.redislabs.com/redisearch/Overview/](https://oss.redislabs.com/redisearch/Overview/)

~~~
bryanrasmussen
Thanks, I guess I was more taken with answering on the link to benchmarks on
the sub-thread which seemed not what I would consider pertinent. That said
everything looks pretty nice.

------
hartator
Wow didn’t take long to lose direction after Antirez’s departure.

~~~
detaro
How is this multiple years old _add-on module_ a loss of direction for Redis
or Redislabs (who always have done exactly that: building things _around_
redis) after the departure of Antirez, who wasn't in charge of setting
direction at Redislabs?

------
sigmonsays
I am a bit dated but I don't have a high confidence of redis with persistent
storage such as a typical ACID database. How does this fit into an
architecture?

It seems painful to have to write code to reload the search db if it fails.

How long is redis search going to exist and be supported?

If this is delivered as a module, what guarantees do I have that the module
interface won't break and leave redis search in a broken state?

~~~
chmod775
> but I don't have a high confidence of redis with persistent storage

Anything that kills redis persistence is also going to corrupt whatever else
database you'd have used instead. In fact redis persistence is such a simple
model, I'd be surprised if most RDBMSs weren't more likely to corrupt data.

I've been running a redis database since 2013 and haven't had a case of lost
data once and never even had to restore from a backup.

