

Riak 1.0 - franze
http://www.basho.com/news_riak_v1_0.php

======
socratic
I am increasingly interested in Riak, in part because a very vocal minority on
HN seems to think it is the One True NoSQL solution.

However, I still don't quite get it. What is Riak? It seems to be some sort of
Dynamo implementation (like the ironically ill-fated Cassandra), but
apparently it has a workflow engine? What do people use it for? What is it
best at?

Right now we're using PostgreSQL, Redis, and S3. PostgreSQL gives us ACID,
Redis gives us fast in-memory access, and S3 gives us an infinite KV store. Is
there some reason to use Riak? Would Riak just replace S3?

~~~
amock
If your values are very small, Riak is probably a good replacement for S3. If
your values are large, then S3 is probably better than Riak.

~~~
siculars
I'm not so certain. A value written to Riak has something on the order of
450 bytes of overhead (as of version 0.14; I'm not entirely certain of the
exact overhead in 1.0). Basically, Riak will write your value to disk with a
bunch of other data that it uses internally to do its thing. Writing a stream
of integers, one integer per Riak key, would be a bad idea (tm), imho.
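
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes the ~450 bytes per object quoted above (a 0.14-era estimate, not a
measured 1.0 figure); the function names are made up for illustration.

```python
# Assumption: ~450 bytes of per-object overhead, the rough figure quoted
# above for Riak 0.14. The real number for 1.0 may differ.
PER_OBJECT_OVERHEAD_BYTES = 450


def stored_size(value_bytes: int, overhead: int = PER_OBJECT_OVERHEAD_BYTES) -> int:
    """Approximate bytes written for one object: payload plus fixed overhead."""
    return value_bytes + overhead


def overhead_fraction(value_bytes: int, overhead: int = PER_OBJECT_OVERHEAD_BYTES) -> float:
    """Fraction of the stored bytes that is overhead rather than payload."""
    return overhead / stored_size(value_bytes, overhead)


if __name__ == "__main__":
    # An 8-byte integer, a 1 KB document, and a 100 KB blob.
    for size in (8, 1_000, 100_000):
        print(f"{size:>7} B value -> {stored_size(size):>7} B stored, "
              f"{overhead_fraction(size):.1%} overhead")
```

Under that assumption, one 8-byte integer per key is almost all overhead,
while for values in the tens of kilobytes the overhead mostly disappears.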

------
superjared
"Riak 1.0 will be available later this month."

I can't stand pre-announcements like this.

~~~
timf
Further, there are only three days left. The HN title should be changed away
from "Riak 1.0".

------
tsuraan
Notice that this is a bit of a pre-announcement:

    Riak 1.0 will be available later this month. To preview
    some of the new features, download Riak, or to inquire
    about a commercial deployment, please visit http://www.basho.com.

Their GitHub page still just has 1.0-rc1 tagged. I'm excited though.

------
jibs
A bit tangential to this particular announcement, but I've been musing about
using Riak, though so far I've been put off by their (seemingly) open-core,
rather than open-source, implementation. Are the paid enterprise functions
stuff you eventually need in most use cases? The lack of multi-site
replication in particular is curious; would this mean I can replicate between
nodes on the cluster, as long as they are in the same datacenter, but not
across the interwebs until I hand over some $$$?

~~~
nirvana
Riak is Open Source. It contains a very complete platform. Riak Core is a
Dynamo-style distributed system platform (not database specific), Riak Pipe is
workflows, Riak KV is a KV database, Riak Search is full text search over that
database. And there's a lot of other stuff I'm not even mentioning (like
Bitcask, the logging stuff, etc.)

When you go to the Riak project on GitHub, what you find is actually sort of a
skeleton that has as dependencies all those projects I mentioned above, such
as riak_kv, riak_pipe, etc.

Riak ES, the commercial offering, is a superset of Riak. It has Riak as a
dependency, and adds the feature of cross datacenter replication. I think the
real reason you buy Riak ES is because you're wanting to buy support.

Riak ES being a commercial product doesn't make Riak any less open source
than Oracle Server being a commercial product makes Linux less open source.

Also, Basho is keen to develop users of Riak ES, and customers of Riak (who
don't spend any money) still get some support from Basho. Basho has a "Riak ES
for startups" program, which gives you a huge discount.

I'm building my business on Riak because Riak is open source. If Basho goes
away, I'll still have Riak. There's nothing missing from Riak that I need.

I figure if I get big enough that I want to be running out of multiple data
centers, I'll be big enough to afford Riak ES, and if I can't afford Riak ES
at that point, then I'll be able to build my own solution. (I don't think it
would be that hard, actually.)

~~~
neonkiwi
> I figure if I get big enough where I want to be running out of multiple data
> centers, I'll be big enough to afford Riak ES

I was thinking along those exact same lines, but a big unknown was pricing on
their enterprise offering. That information is unavailable on the web, and
despite my skepticism of contact-us-for-the-price situations, I filled out
their online form, which is a request to be contacted by a representative.

I haven't heard from them, but they did put me on a mailing list: I got an
email about this 'milestone release' today! Not quite what I wanted to know,
though :)

Nirvana, or someone using their Enterprise offering, perhaps you could fill us
all in on the price?

~~~
mshneider718
Hey neon, I work for Basho and would like to look into why you were not
contacted. Any details are greatly appreciated.

------
latch
I haven't used Riak, but I did look into it for a project a short while back.
One problem I had was that the documentation on their website is heavily
focused on what Riak is, vs how to use it. It's great that you can get such a
fundamental understanding of Riak as-a-dynamo-implementation, and they do a
great job writing that stuff, but it's completely out of touch with what I
expected/needed.

Technically, what eventually put me off is that I couldn't figure out how to
maintain a clean secondary index. If you have a SiteId, UserId, and Data, and
you want data to be accessible by SiteId or SiteId+UserId, I couldn't figure
out a nice atomic way to maintain the secondary index. This is pretty basic
stuff.
I'm glad to see 1.0 will support native secondary indexes, but I think my
inability to figure it out shows that their documentation is poor (or it could
be that I suck).
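
For anyone wondering why this is awkward on a plain key-value store, here is
a minimal sketch of a hand-rolled index. ToyKV, put_record and
records_for_site are hypothetical names backed by an in-memory dict; this is
not the Riak client API and not the native secondary index feature in 1.0.

```python
class ToyKV:
    """Hypothetical in-memory stand-in for a plain key-value store."""

    def __init__(self):
        self.data = {}

    def get(self, key, default=None):
        return self.data.get(key, default)

    def put(self, key, value):
        self.data[key] = value


def put_record(kv, site_id, user_id, data, fail_before_index=False):
    """Store a record under SiteId+UserId, then add the user to a per-site index."""
    kv.put(f"data/{site_id}/{user_id}", data)      # write 1: the record itself
    if fail_before_index:
        # Simulate a crash between the two writes.
        raise RuntimeError("simulated crash between the two writes")
    index_key = f"index/{site_id}"
    users = set(kv.get(index_key, ()))             # read-modify-write, not atomic
    users.add(user_id)
    kv.put(index_key, sorted(users))               # write 2: the index entry


def records_for_site(kv, site_id):
    """Find every record for a site by walking the hand-built index."""
    return {u: kv.get(f"data/{site_id}/{u}") for u in kv.get(f"index/{site_id}", [])}


if __name__ == "__main__":
    kv = ToyKV()
    put_record(kv, "site1", "alice", "hello")
    try:
        put_record(kv, "site1", "bob", "hi", fail_before_index=True)
    except RuntimeError:
        pass
    # bob's record exists, but the per-site index never learned about it.
    print(records_for_site(kv, "site1"))
```

The record and the index entry are two separate writes, so a failure between
them (or two writers racing on the index key) leaves the index out of sync
with the data.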

~~~
bgentry
The secondary index stuff is actually very new, so it's entirely possible that
it didn't exist when you last looked at Riak.

More info here: <http://blog.basho.com/2011/09/14/Secondary-Indexes-in-Riak/>

------
geoffhill
LevelDB support! Would love to see how it compares to Bitcask in terms of
speed.

~~~
willbmoss
Bitcask can guarantee one disk seek, whereas LevelDB will do one disk seek per
level, so at least from that perspective, it can't be better.

Level also has to look down the entire tree if a key is missing. This means
inserts end up being more expensive than reads or updates (which are all just
a hash lookup in Bitcask).
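
A toy model of the difference, for illustration only. ToyBitcask keeps an
in-memory key directory pointing into an append-only log, so a read is one
seek and a miss costs none. ToyLeveled checks one level after another. The
class names and the seeks/probes counters are made up here, and the model
ignores caches, compaction, and everything else a real engine does.

```python
import io


class ToyBitcask:
    """Toy append-only log with an in-memory key directory (keydir)."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for the on-disk data file
        self.keydir = {}          # key -> (offset, length) of the latest value
        self.seeks = 0            # counts simulated disk seeks on reads

    def put(self, key: str, value: bytes) -> None:
        offset = self.log.seek(0, io.SEEK_END)   # always append at the end
        self.log.write(value)
        self.keydir[key] = (offset, len(value))  # insert and update look the same

    def get(self, key: str):
        entry = self.keydir.get(key)
        if entry is None:
            return None           # a miss is answered from memory, no seek
        offset, length = entry
        self.log.seek(offset)
        self.seeks += 1
        return self.log.read(length)


class ToyLeveled:
    """Toy leveled store: newest level first, one probe per level consulted."""

    def __init__(self, levels):
        self.levels = levels      # list of dicts, newest first
        self.probes = 0           # counts levels touched across all reads

    def get(self, key):
        for level in self.levels:
            self.probes += 1
            if key in level:
                return level[key]
        return None               # a miss had to touch every level


if __name__ == "__main__":
    bitcask = ToyBitcask()
    bitcask.put("a", b"old")
    bitcask.put("a", b"new")
    print(bitcask.get("a"), bitcask.get("missing"), bitcask.seeks)

    leveled = ToyLeveled([{"x": 1}, {"y": 2}, {"z": 3}])
    print(leveled.get("z"), leveled.get("missing"), leveled.probes)
```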

~~~
fizx
"Bitcask can guarantee one disk seek, whereas LevelDB will do one disk seek
per level, so at least from that perspective, it can't be better."

Yep, this is a standard tradeoff. When you want your data to be iterable, you
have to take the hit. In practice (I oversee a large Cassandra cluster), this
hit happens about 1% of the time, which is either a lot, or a little,
depending on your constraints.

"Level also has to look down the entire tree if a key is missing."

This is why Cassandra has a bloom filter on top of a very similar data store.
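
For readers who haven't met one, a minimal Bloom filter sketch. The sizes,
the SHA-256 based hashing, and the names are arbitrary choices for
illustration and say nothing about how Cassandra actually implements it.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "go look on disk".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


if __name__ == "__main__":
    bloom = BloomFilter()
    for key in ("alice", "bob"):
        bloom.add(key)
    for key in ("alice", "bob", "carol"):
        if bloom.might_contain(key):
            print(key, "-> possibly present, do the expensive lookup")
        else:
            print(key, "-> definitely absent, skip the lookup")
```

The point is that a negative answer is certain, so most reads for missing
keys can skip the walk down the tree entirely.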

------
lwat
Is anyone here paying for the Enterprise level Riak? I'd love to hear how they
charge and whether you think it's worth it. Currently we're looking towards
the Denali release later this year but Riak is looking more interesting by the
day.

------
vdm
> Interested in Riak Enterprise for your company? Contact Us Now

I don't have a company. Yet.

What I am interested in is a go-away button on this obnoxious ad bar so I can
read your webpages on my vertically-challenged 11" screen.

