

Querying Riak Just Got Easier: Introducing Secondary Indices - rbranson
http://www.slideshare.net/rklophaus/querying-riak-just-got-easier-introducing-secondary-indices

======
psadauskas
Awesome feature, but why these terrible URLs?

    
    
      # Query for category_bin = "armor"
      curl http://127.0.0.1:8098/buckets/loot/index/category_bin/eq/armor
      {"keys":["gauntlet24"]}
    
      # Query for price_int between 300 and 500
      curl http://127.0.0.1:8098/buckets/loot/index/price_int/range/300/500
      {"keys":["gauntlet24"]}
    

Why not use URI query parameters for the query parameters?

    
    
      /buckets/loot/index?category_bin=armor
      /buckets/loot/index?price_int=[300..500]
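
For anyone scripting against the path-style form from the slides, here is a minimal Python sketch that builds those two URLs. The helper names are made up for illustration, the host and port are copied from the curl examples, and the URL layout may well change before release.

```python
from urllib.parse import quote

# Host and port copied from the curl examples; adjust for your own node.
BASE = "http://127.0.0.1:8098"


def _join(base, parts):
    # Percent-encode each path segment so values containing "/" stay intact.
    return base + "/buckets/" + "/".join(quote(str(p), safe="") for p in parts)


def index_eq_url(bucket, field, value, base=BASE):
    """Hypothetical helper: exact-match lookup, .../index/<field>/eq/<value>."""
    return _join(base, [bucket, "index", field, "eq", value])


def index_range_url(bucket, field, low, high, base=BASE):
    """Hypothetical helper: range lookup, .../index/<field>/range/<low>/<high>."""
    return _join(base, [bucket, "index", field, "range", low, high])


if __name__ == "__main__":
    print(index_eq_url("loot", "category_bin", "armor"))
    print(index_range_url("loot", "price_int", 300, 500))
```

Encoding each segment separately is the main reason to wrap this in a helper at all: a value containing a slash would otherwise be read as extra path components.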

~~~
seancribbs
That is still under debate AFAIK. The latter form would allow you to compose
multiple index lookups as well, which is why I'm for it (although the query
planner might not support it yet).

~~~
KevBurnsJr
What about

    
    
      /buckets/loot/index/category_bin,armor/price_int,300,500
    

Looks a lot like link walk syntax, don't it?

------
arielweisberg
I don't follow how these don't always end up being distributed queries since
the index key doesn't include the partition key. Where is the locality coming
from in the index scanning and value retrieval?

If the key is individual armor type, there could (and usually will) be values
in any price range at every partition.

Is the index partitioned separately on its own key? Is a copy of the value
stored in the index or is it then retrieved separately from the index scan?

~~~
rbranson
It leverages merge_index from Riak Search, which stores the index data on the
same partition as the object. Queries are only performed against a subset of
partitions that the "query planner" thinks will contain the object.

~~~
arielweisberg
If the index key does not start with the partition key, then won't that end up
being everything in the majority of cases?

In the examples given (price, license plate) there is no locality between the
partition keys (armor id, person) and the index key. A query for all armor
priced between 200-400 would end up touching every partition that contains
armor priced between 200-400. Unless the set of armor is small you will end up
needing to scan every partition.

~~~
jtuple
Currently, the entire keyspace is queried, but querying the entire keyspace
does not require touching every partition. Only a covering subset, whose size
is influenced by your N-value (number of replicas), needs to be queried because
the index is replicated alongside your k/v data.

For example, in a 4-partition ring with N=2, keys mapping to p1 are replicated
on p1,p2; p2 on p2,p3; p3 on p3,p4; and p4 on p4,p1. As such, you only need to
query p1,p3 or p2,p4 to cover the entire keyspace.

In general, approximately RingSize / N partitions need to be queried. The new
smart coverage code figures this out as well as deals with routing around
failed nodes and other issues.
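
The covering idea can be sketched in a few lines of Python. This is only a toy model, not Riak's actual coverage code: it assumes 0-based partition numbers, that keys hashing to partition p are replicated on p and the next N-1 partitions around the ring, and that no nodes have failed. The function names are hypothetical.

```python
def preference_list(p, ring_size, n):
    """Partitions holding replicas of keys that hash to partition p."""
    return [(p + i) % ring_size for i in range(n)]


def covering_partitions(ring_size, n, offset=0):
    """Pick every n-th partition, rotated by offset, as a covering set."""
    return sorted((offset + k) % ring_size for k in range(0, ring_size, n))


def is_cover(chosen, ring_size, n):
    """True if every partition's keys have at least one replica in chosen."""
    chosen = set(chosen)
    return all(
        chosen.intersection(preference_list(p, ring_size, n))
        for p in range(ring_size)
    )


if __name__ == "__main__":
    # The 4-partition, N=2 ring from the example, numbered 0..3 instead of p1..p4.
    print(covering_partitions(4, 2))            # partitions 0 and 2 (p1, p3)
    print(covering_partitions(4, 2, offset=1))  # partitions 1 and 3 (p2, p4)
    print(is_cover(covering_partitions(64, 3), 64, 3))
```

When RingSize is not a multiple of N, this picks the ceiling of RingSize / N partitions, which matches the "approximately" above.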

EDIT: Since the replicas value (N) is settable per bucket in Riak, there are
some interesting extreme cases that you could envision here. For example, you
could have a bucket where N = RingSize, in which case the index is replicated
to every node and you only need to query a single partition to look up values.
Of course, then you lose the ability to perform multiple queries in parallel
with a more partitioned/distributed index space (which would be more useful
for large result sets). As with database systems in general, the best
configuration here depends on data and use case.

~~~
strmpnk
I assume queries are done over R=1 consistency then? Is W=N the only way to
keep writes consistent with these indexes at all times?

~~~
jtuple
As far as I know, R=1. Rusty is likely the best to comment on this and things
may change before/after release, but currently there is no way to specify R
for index lookups, and only the minimal set of replicas is queried.

Technically, when you perform a write, Riak will always dispatch to N
replicas. W simply requires Riak to confirm W writes before responding to the
client. So W=N allows you to know N index sets have been updated, but it's not
strictly necessary. At the end of the day, indexes are eventually consistent
like the rest of Riak.

~~~
strmpnk
Right. I'm just saying that if I write and I want to assume a query after that
write will include it, I will need W=N since R=1. Which is fine... but tricky.
W=2,R=2,N=3 has been my favorite combination but I guess there are always
cases to try other setups.
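
The arithmetic behind that is the usual overlap rule: a read of R replicas is guaranteed to hit at least one of the W acknowledged replicas only when R + W > N. Below is a small Python sketch that checks the rule by brute force. It is a simplified model with hypothetical function names, and it ignores failures, handoff and concurrent writes, so it says nothing about Riak's real behaviour beyond the counting argument.

```python
from itertools import combinations


def always_overlaps(n, r, w):
    """Brute force: does every choice of r read replicas share at least
    one replica with every choice of w acknowledged write replicas?"""
    replicas = range(n)
    return all(
        set(reads) & set(writes)
        for reads in combinations(replicas, r)
        for writes in combinations(replicas, w)
    )


def min_w_for_read(n, r):
    """Smallest W that satisfies R + W > N."""
    return n - r + 1


if __name__ == "__main__":
    n = 3
    for r, w in [(1, 2), (1, 3), (2, 2)]:
        print(f"N={n} R={r} W={w} overlap guaranteed: {always_overlaps(n, r, w)}")
    print("W needed for R=1:", min_w_for_read(n, 1))
```

With N=3, the R=1 case needs W=3, while W=2,R=2 also overlaps, which is why that combination works for normal gets but not for index lookups pinned at R=1.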

------
rbranson
This presentation is particularly interesting because I think it really
touches on why secondary indices should be implemented at the datastore level
and are impractical to simply graft onto a distributed K/V store.

------
alnayyir
The candor about the trade-offs involved in picking one database vs. another
is definitely appreciated and is a tone the community needs to take on the
subject in general.

~~~
tolitius
http://www.dotkam.com/2011/07/06/noram-db-if-it-does-not-fit-in-ram-i-will-quietly-die-for-you/

I agree: PR and lots of BUZZ make it really difficult to choose the right DB.

My latest obsession is Riak: Basho is VERY honest and quick to help you out,
telling you where and how Riak can help and, most importantly, what would
NOT be a good fit for Riak. (I am not working for them :)

/Anatoly

