
Trying to read through all the hype, this looks like Riak hosted on machines with SSDs, with fewer features, and a nice billing system in front of it.

Of course, for people who want a hosted solution, the fact that it is hosted is what gives it a lot of value. There haven't been a lot of hosted NoSQL databases out there, at least at this scale and availability.

But technologically, what's new here? Is there anything here that is really innovative? (not a sarcastic question)

As you and others have pointed out, hosting + SSDs + synchronous replication across availability zones counts for a lot. If DynamoDB lives up to the hype, it could be a huge step forward in the world of "don't have to think about it" data storage.

DynamoDB does have at least one significant feature not provided by Riak -- range scans. This makes many common access patterns much easier to implement efficiently. Still, as you suggest, there don't appear to be any fundamental technical advances here. The advances are in the service model and operation.

And there are, of course, many limitations. (Just to name a few: items -- i.e. rows -- can't exceed 64K; queries don't use consistent reads; seemingly no atomic update of multiple items.) It's miles ahead of SimpleDB, but still not nearly as flexible as many of the existing NoSQL databases. If Amazon lives up to past performance, they'll make steady improvements, but slowly.

You can choose whether queries are consistent or eventually consistent.

>DynamoDB does have at least one significant feature not provided by Riak -- range scans.

Riak has the ability to select keys for processing via various queries, including key ranges.

Riak also has secondary indexes and full text search.

If there's something significant about the DynamoDB method of doing range scans I'm interested in hearing it. My purpose here isn't so much to bash DynamoDB (in fact, I don't want to do that at all) but to try and spread a little more awareness of Riak.

Riak really came into its own in 1.0.

> Riak has the ability to select keys for processing via various queries, including range of the key.

Based on the resources I can find online, any select-by-range operation in Riak requires broadcasting to all nodes (or at least enough nodes to hit at least one replica of each record), and then performing a scatter-gather operation to fetch the matching records. There also doesn't seem to be any way to specify a sorting order. This is not quite what I would call a range scan: while useful, it presumably doesn't have the same cost or scaling characteristics. It's the difference between scanning a block of data that is stored contiguously, and filtering through an entire table to identify records meeting a criterion which happens to take the form "a <= value <= b".
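To make the distinction concrete, here is a rough sketch of the two access patterns. It is a toy model, not how either Riak or DynamoDB is actually implemented: `range_scan`, `scatter_gather`, and `partition` are made-up names, and the modulo in `partition` only stands in for hash-based placement of keys on nodes.

```python
from bisect import bisect_left, bisect_right

def range_scan(sorted_keys, lo, hi):
    """Contiguous scan: binary-search both bounds, then read one slice."""
    start = bisect_left(sorted_keys, lo)
    end = bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]  # results come back already ordered

def scatter_gather(nodes, lo, hi):
    """Ask every node to filter all of its keys, then merge and sort."""
    matches = []
    for node_keys in nodes:  # every node is contacted
        matches.extend(k for k in node_keys if lo <= k <= hi)
    return sorted(matches)  # ordering has to be imposed by the caller

def partition(keys, n_nodes):
    """Hypothetical placement: modulo stands in for hashing keys to nodes."""
    nodes = [[] for _ in range(n_nodes)]
    for k in keys:
        nodes[k % n_nodes].append(k)
    return nodes

keys = list(range(100))
nodes = partition(keys, 4)
print(range_scan(keys, 10, 15))
print(scatter_gather(nodes, 10, 15))
```

Both calls return the same keys. The difference is the work done: the first touches only the matching slice of one sorted list, while the second examines every key on every node.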

This is not to diss Riak, which is a nice piece of work and does many things that DynamoDB doesn't.

For one, it responds quicker than Riak: Riak has (cold) response times of about 300ms, while this service claims single-digit ms response times. Also setting up Riak is not exactly trivial, and using this service outsources that hassle.

You can use Riak Smartmachines on Joyent's cloud, which would get similar performance for an order of magnitude less than what Amazon is charging. If you are seeing 300ms response times, you are not using SSDs and you are not using as many nodes as Amazon is charging you for.

I'm not hating on Amazon. It is a good move for them and they are doing some things that Riak cannot do, but cost and response time are not among them.

I wouldn't bet on Joyent's Smartmachines being either an order of magnitude cheaper or as fast. On the speed: we run a Riak cluster, and the actual response times we get from Riak are, as described, slower than what Amazon promises in its docs.

On the price: if you got 3x16 GB machines with Joyent, that would cost you $1400/month. You can get a lot of resources for that with this new AWS service.

I don't have any experience with either Joyent or this new DynamoDB, but I do have some experience with Riak, and from the docs this new service would be a very viable competitor.

Riak has many shortcomings, but I wouldn't describe latency or installation as primary concerns. Our cold response times have a 99% bound of 8ms, and median of 5ms on commodity SSDs. Installation is handled by apt and is trivial to automate.

What are some of Riak's shortcomings?

Could you clarify what you mean by "cold response time of 300ms"? Cold as in requesting data that hasn't yet been cached in RAM? How good does it get once the cache is warm?

Yes, that's what I meant by 'cold'. For recently requested data that is cached in RAM, the response can be as quick as 3 ms.

Thanks, that's useful to know.

If anything, from my experience with Riak, the Basho guys should be having an emergency meeting.

Not mentioned here: DynamoDB also has built-in monitoring and management.

What has been your experience with Riak?

Riak is based on Dynamo and the original paper by Werner (if I recall correctly?). Is this just offering up an easy-to-use cloud service version of what's backing S3 already?

Also it is spread across AZs which is handy.
