Can someone please make sense of Riak for me? Why would I use it as the datastore for my webapp?
Say, for example, I have a blogging application that stores blog posts and comments in a JSON document, plus some other "typical" application functionality. I can't see how Riak can do that. It seems to be missing the basic functionality needed to query within those JSON documents and do ordinary selection of data.
Am I missing something? How do people use Riak if it's not really possible to do field-level search and selection, run range queries, and page through results?
Riak's basic advantage is the way it handles scaling. Unlike monolithic databases that scale horizontally using master/slave architectures, Riak is distributed from day one: your starting configuration is already multiple servers. Riak has no master servers. As and when needed, you add new nodes to a Riak cluster with a single command.
What this offers developers is essentially a locally hosted version of S3, with performance and cost characteristics that make it practical to create lots of small objects.
Riak has other features (MapReduce over collections of objects, indexing, full-text search), but fundamentally, you'd use Riak for the kind of application that could reasonably be built around an S3-scale key-value store.
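To make the key-value framing concrete for the blog example, here is a minimal sketch of how posts could be laid out so that paging works with nothing but get and put by key. `KVStore` is an in-memory stand-in I made up for illustration; it is not a Riak client, and the key names (`post:<id>`, `index:<blog>`) are assumptions, not a convention Riak imposes.

```python
import json


class KVStore:
    """In-memory stand-in for a key-value bucket (hypothetical, not a Riak client)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Values are stored as JSON text, the way an opaque blob would be.
        self._data[key] = json.dumps(value)

    def get(self, key):
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)


def add_post(store, blog, post_id, title, body):
    # One object per post, addressed directly by its key.
    store.put(f"post:{post_id}", {"title": title, "body": body, "comments": []})
    # The application maintains its own index object: a list of post ids.
    index = store.get(f"index:{blog}") or []
    index.append(post_id)
    store.put(f"index:{blog}", index)


def page_of_posts(store, blog, page, per_page=2):
    # Paging is a slice of the index followed by plain key lookups.
    index = store.get(f"index:{blog}") or []
    newest_first = index[::-1]
    ids = newest_first[page * per_page:(page + 1) * per_page]
    return [store.get(f"post:{post_id}") for post_id in ids]


store = KVStore()
for n in range(1, 6):
    add_post(store, "myblog", n, f"Post {n}", "...")

first_page = page_of_posts(store, "myblog", page=0)
print([p["title"] for p in first_page])  # ['Post 5', 'Post 4']
```

The point is that the "query" moves into the data layout: you decide up front which lookups you need and keep an object for each. In a real distributed store, the read-modify-write on the index object would also need some care around concurrent writers, which this sketch ignores.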
In practice, I've found Postgres and Riak to be a pretty good 1-2 punch: metadata (anything I'd use to formulate queries) goes to Postgres, bulk data to Riak.
My application doesn't have a zillion users, nor does it store lots of "files"; what it does do is generate a lot of variable-sized historical data. Before Riak, I'd have to think hard about how I'd digest that data and what parts of the data I'd have to throw away. After Riak, I'm finding I can basically throw disk at the storage problem while continuing to get consistent performance characteristics.
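For what it's worth, the metadata/bulk split looks roughly like the sketch below. To keep it runnable anywhere, `sqlite3` stands in for Postgres and a plain dict stands in for Riak; the table, column, and function names are all made up for illustration, not taken from my actual application.

```python
import sqlite3
import uuid

# Stand-in for Postgres: holds only the fields used to formulate queries.
meta = sqlite3.connect(":memory:")
meta.execute("CREATE TABLE readings (source TEXT, day TEXT, blob_key TEXT)")

# Stand-in for Riak: opaque blobs addressed by key.
blobs = {}


def store_reading(source, day, payload):
    """Write the bulk payload to the blob store and its metadata to SQL."""
    key = uuid.uuid4().hex
    blobs[key] = payload
    meta.execute("INSERT INTO readings VALUES (?, ?, ?)", (source, day, key))
    return key


def readings_between(source, start, end):
    """Range query runs against the metadata; blobs are then fetched by key."""
    rows = meta.execute(
        "SELECT day, blob_key FROM readings "
        "WHERE source = ? AND day BETWEEN ? AND ? ORDER BY day",
        (source, start, end),
    ).fetchall()
    return [(day, blobs[key]) for day, key in rows]


store_reading("s1", "2024-01-01", b"first")
store_reading("s1", "2024-01-02", b"second")
store_reading("s1", "2024-01-03", b"third")
store_reading("s2", "2024-01-02", b"other")

result = readings_between("s1", "2024-01-02", "2024-01-03")
print(result)  # [('2024-01-02', b'second'), ('2024-01-03', b'third')]
```

The relational side stays small because it only carries what you filter and sort on, while the variable-sized payloads go wherever disk is cheap.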