

Autocomplete with Redis - thomasknowles
http://antirez.com/post/autocomplete-with-redis.html

======
RyanMcGreal
>Too bad that I'm forced to play the game of the SE0 L00z3r

There's building a link farm, and then there's giving your article a
descriptive title.

~~~
antirez
I agree in general terms, but it's sad that search engines are not advanced
enough to do a reasonable job even if your title is not very descriptive. My
non-descriptive titles are lame, like "Redis weekly update number ...", but
there are many artistic, paradoxical, weird-in-an-interesting-way titles in
the writing tradition that may disappear because search engines are not good
enough.

~~~
ergo98
You're defending the ridiculous, and really the first two paragraphs of your
entry are just unnecessary defensive noise that encourages readers to hit back.

Your title is now vastly better for everyone, from search engines to casual
readers.

~~~
antirez
Anyway, what I did was switch to better titles :)

I too think it's a good idea in general btw. Don't want to defend my lame
titles, but clearly there is a technical limit in our technology. A human
will easily recognize that a given article is about something regardless of
the title. Just hoping that in the future we'll be ok even with lame titles,
but the right thing to do now is indeed switching to saner, descriptive
titles.

~~~
ergo98
To some degree I think that which you seek is already available -- everyone is
going to link to your post with link text about autocomplete, and that will
add the semantic meaning to it (in the same way that a googlebomb works).

~~~
antirez
Good point, thanks for the hints.

------
antirez
Update: the Ruby script was broken! Thanks to Pedro Melo for finding the bug;
it's fixed now.

------
geuis
How do the memory and speed requirements of the described technique compare to
other systems like CouchDB or Solr? We use Solr at work for a range of
projects, including autocompletes. Specifically, I believe it has
lexicographic features built in, so we don't have to populate every version of
a potential search.

~~~
antirez
I'm not sure about the performance and memory efficiency of the other systems
you mentioned. In Redis it is possible, in theory, to add a command that
performs this task against a sorted set without adding prefixes. It's also
possible to run a binary search directly against the sorted set using ZRANGE.
This way the sorted set only needs to contain the actual words, without prefixes
(the memory complexity will be the same but with a smaller constant factor).
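
A minimal sketch of the binary-search idea in Python, to make it concrete. It is not code from the article: `FakeSortedSet` is a hypothetical in-memory stand-in that mimics only `ZCARD` and `ZRANGE` on a sorted set whose members all share one score (so they sort lexicographically), and `complete` is an illustrative name.

```python
class FakeSortedSet:
    """Hypothetical in-memory stand-in for a Redis sorted set in which every
    member has the same score, so members are ordered lexicographically."""

    def __init__(self, words):
        self._members = sorted(set(words))

    def zcard(self, key):
        return len(self._members)

    def zrange(self, key, start, stop):
        # Like ZRANGE, the stop index is inclusive.
        # This stand-in only handles non-negative indexes.
        return self._members[start:stop + 1]


def complete(client, key, prefix, count=10):
    """Binary-search for the first member >= prefix, then read forward."""
    if count < 1:
        return []
    lo, hi = 0, client.zcard(key)
    while lo < hi:
        mid = (lo + hi) // 2
        member = client.zrange(key, mid, mid)[0]  # one round trip per probe
        if member < prefix:
            lo = mid + 1
        else:
            hi = mid
    # Members sharing the prefix are contiguous, starting at index lo.
    candidates = client.zrange(key, lo, lo + count - 1)
    return [word for word in candidates if word.startswith(prefix)]


client = FakeSortedSet(["bar", "foo", "foobar", "food", "fox", "zap"])
print(complete(client, "words", "foo"))
```

Each probe is a separate `ZRANGE` call, so one completion costs roughly log2(N) round trips plus the final range read. Assuming a redis-py client created with `decode_responses=True`, its `zcard` and `zrange` methods should be usable in place of the stand-in.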

I don't think we'll add such a command, nor that a more complex implementation
is a good idea, because this algorithm is pretty straightforward and very
fast, and can be modeled using the pre-existing API. Also in most completion
systems you want to complete only frequent items, so I'm not sure if it's a
memory bound problem in practice.

That said I expect the proposed solution to be able to deliver at least 10k
completions per second in a small virtual machine.

~~~
bl4k
I benchmarked it. My binary search using ZRANGE was fast, but not as fast as
your normalized method:

    Ran 10000 queries in 26.52s using bsearch
    Ran 10000 queries in 3.87s using normalized method

The advantage of the bsearch is that you keep a clean data set, in case you
use that set elsewhere, and the data is easier to manage.
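
For comparison, here is a rough sketch of a prefix-populating approach, assuming that is what "normalized method" refers to. Several details are assumptions of this sketch rather than anything stated in this thread: the `*` end-of-word marker, the `window` batch size, and the hypothetical `FakeSortedSet` stand-in, which mimics `ZADD`, `ZRANK` and `ZRANGE` with every score equal.

```python
import bisect


class FakeSortedSet:
    """Hypothetical in-memory stand-in for a same-score Redis sorted set."""

    def __init__(self):
        self._members = []

    def zadd(self, key, mapping):
        # Scores are ignored: with equal scores the order is lexicographic.
        for member in mapping:
            i = bisect.bisect_left(self._members, member)
            if i == len(self._members) or self._members[i] != member:
                self._members.insert(i, member)

    def zrank(self, key, member):
        i = bisect.bisect_left(self._members, member)
        if i < len(self._members) and self._members[i] == member:
            return i
        return None

    def zrange(self, key, start, stop):
        return self._members[start:stop + 1]  # inclusive stop, like ZRANGE


def index_word(client, key, word):
    """Store every prefix of word, plus word + '*' to mark a complete word."""
    for i in range(1, len(word) + 1):
        client.zadd(key, {word[:i]: 0})
    client.zadd(key, {word + "*": 0})


def complete(client, key, prefix, count=10, window=50):
    """Jump to the prefix with ZRANK, then scan forward for marked words."""
    start = client.zrank(key, prefix)
    if start is None:
        return []  # no indexed word begins with this prefix
    results = []
    while len(results) < count:
        batch = client.zrange(key, start, start + window - 1)
        if not batch:
            break
        start += len(batch)
        for entry in batch:
            if not entry.startswith(prefix):
                return results  # left the block of entries sharing the prefix
            if entry.endswith("*") and len(results) < count:
                results.append(entry[:-1])
    return results


client = FakeSortedSet()
for w in ["foo", "foobar", "bar"]:
    index_word(client, "compl", w)
print(complete(client, "compl", "fo"))
```

The trade-off matches the one described above: a lookup needs one `ZRANK` plus a few range reads, but the set also holds every prefix, so it is no longer a clean list of words.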

~~~
antirez
Hello! Like in most Redis benchmarks, without concurrent clients you are
measuring latency this way, not real performance! You need to spawn N
threads, and if you do it the wrong way (N Ruby scripts, for instance) you
won't be able to actually measure the performance, as the clients will use all
the CPU...

I think the normalized method is faster indeed but it's not possible to tell
from this benchmark.
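
To illustrate the latency versus throughput point, here is a small Python sketch of a benchmark harness with concurrent clients. Nothing in it comes from the benchmark above: `run_benchmark` is an illustrative name, and `fake_query` is a hypothetical stand-in that just sleeps to simulate one network round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(query, total_queries, n_clients):
    """Call query() total_queries times across n_clients threads and
    return the observed queries per second."""
    per_client = total_queries // n_clients

    def worker(client_id):
        for _ in range(per_client):
            query()
        return per_client

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        done = sum(pool.map(worker, range(n_clients)))
    elapsed = time.perf_counter() - started
    return done / elapsed


def fake_query():
    # Stand-in for one completion request: the thread mostly waits.
    time.sleep(0.002)


if __name__ == "__main__":
    for n in (1, 10):
        rate = run_benchmark(fake_query, 200, n)
        print(f"{n} client(s): {rate:.0f} queries/s")
```

With a single client the rate is bounded by round-trip latency, while several concurrent clients overlap their waiting. In a real run each thread would presumably use its own connection, and the caveat above still applies: if the client threads saturate the CPU, the numbers describe the client rather than the server.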

