

MapReduce from the basics to the actually useful (in under 30 minutes) - cloudant
http://blog.cloudant.com/39351506

======
conesus
Am I crazy or was there not a single reduce query in the post? I've been
working on map reduce queries for a little while now, aggregating statistics
over feeds and stories posted, normalizing over item frequencies, and
generally doing simple stuff.

I was very excited to see the second half of MapReduce. The reduce queries are
always harder to write. If map is addition, reduce is division. I was hoping
for examples of reduce queries that are more advanced than the standard
"count the tags in a list of lists". But there was no reduce, period. Am I
missing something?
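
For reference, the standard tag count mentioned above can be sketched in a few
lines of Python. This is a toy, single-process illustration of the map and
reduce steps, not the article's code or Cloudant's API; the documents, field
names, and function names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical documents; the "tags" field is an assumption for illustration.
docs = [
    {"_id": "a", "tags": ["couchdb", "mapreduce"]},
    {"_id": "b", "tags": ["mapreduce", "sql"]},
    {"_id": "c", "tags": ["mapreduce"]},
]


def map_doc(doc):
    """Map step: emit one (tag, 1) pair per tag in the document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_counts(values):
    """Reduce step: collapse all values emitted for one key into a total."""
    return sum(values)


def run_mapreduce(documents):
    """Group the mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_counts(values) for key, values in grouped.items()}


tag_counts = run_mapreduce(docs)
print(tag_counts)
```

The map step only decides what to emit; all of the aggregation lives in the
reduce step, which is the part the post never spelled out.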

~~~
mlmilleratmit
Plenty of reduce queries; you just didn't need to write the code ;) Good
point, though, and good material for the next round. I'm open to suggestions
if you have an interesting data set or question.

~~~
moultano
Here's a great example of something that is most natural in mapreduce:
<http://www.danvk.org/wp/2007-04-06/nebulabrot/>

------
aheilbut
Terminology is so badly abused in this article that it almost reads like a
parody.

Also, the tasks could have been accomplished in 3 or 4 lines of SQL.

~~~
tdavis
In what way is terminology badly abused? There isn't a single use of the word
"cloud" in the entire article and the only "buzzword" terminology used appears
to be used accurately and sparingly. If that isn't your contention, why not
explain otherwise? How does the abuse/misuse of terminology detract from the
quality of the article or make it inaccurate?

Further, your argument that the same could be accomplished in "3 or 4 lines of
SQL" is a straw man. The article never claimed the specific task was
shorter/easier to do using MapReduce and Cloudant; the author made an example
based on common use cases—one that didn't require architecting a full
requirements specification for when one would find Cloudant/MapReduce/Non-
relational databases superior to a few lines of SQL for various values of
"superior".

Your comment is vague, largely irrelevant, and completely useless regardless
of its accuracy since you provided absolutely no arguments to back up your
claims. The fact that it has even two points is disheartening.

~~~
aheilbut
The phrases that irked me were:

"I have yet to meet a database that isn’t a key/value store"

Dimensions that are "somewhat orthogonal"

"It suffices to say that MapReduce is all about giving programmers an
efficient way to consume data without needing to know how or where it is
actually stored."

I don't think that definition suffices, and it misses (or buries in
'efficient') the rather central point that MapReduce is a programming model
for distributing computation.

I'm all for non-relational databases where they're appropriate, and Cloudant
sounds like it is doing great things. But I think that there's a risk in
presenting toy examples in a way that seems to sell them as the solution to
common use cases that really could be solved more easily with old-school
tools.

~~~
bitdiddle
I suppose "somewhat orthogonal" is like being "almost pregnant"; perhaps the
author was just speaking loosely here.

I agree that the examples were toy ones, but it's precisely these simple
things like projections that we see folks struggling with when coming from the
relational world.

The answer to "can't you just do this in SQL?" is always "yes, you can." But
things change: non-relational dbs were around before relational ones and are
now returning for a variety of reasons.

I'd like to see some follow-up posts that delve into the subtleties of
rereduce, another real pain point for new users.
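
To make the rereduce pain point concrete, here is a rough Python sketch. It
borrows the (keys, values, rereduce) shape of a CouchDB-style reduce function,
but the batches and the driver are hypothetical, and `keys` is passed as None
throughout for simplicity. The point is that a reduce may be called again on
its own partial results, so a row count has to sum on rereduce instead of
counting.

```python
def count_reduce(keys, values, rereduce):
    """Count rows, handling both the first pass and the rereduce pass."""
    if rereduce:
        # values are partial counts from earlier reduce calls: add them up.
        return sum(values)
    # values are raw mapped values: count them.
    return len(values)


def naive_count_reduce(keys, values, rereduce):
    """Buggy variant that ignores the rereduce flag."""
    return len(values)


# Hypothetical mapped values split into two batches, as a sharded or
# tree-structured index might split them.
batch_1 = [1, 1, 1]
batch_2 = [1, 1]


def run(reduce_fn):
    """Reduce each batch, then rereduce the partial results."""
    partials = [
        reduce_fn(None, batch_1, False),
        reduce_fn(None, batch_2, False),
    ]
    return reduce_fn(None, partials, True)


correct_total = run(count_reduce)
naive_total = run(naive_count_reduce)
print(correct_total, naive_total)
```

The naive version reports the number of batches rather than the number of
rows, which is the kind of silent error that trips up new users.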

One thing that is not emphasized enough in my mind is the flexibility that a
schema-less database such as BigCouch gives you. Consider the trivial schema
one would construct to support the 3 or 4 lines of SQL required for these toy
examples and then consider how that schema might evolve as needs for different
queries change, as the data grows, as different apps with different O-R
mapping issues are brought into play and so on.

A schema-less approach does push more of the complexity into the app layer for
sure, but it allows the schema to evolve more naturally. After all, schema-less
doesn't mean the schema really goes away conceptually.

I also agree that Cloudant is doing great things. That's why I joined the
team; that and the free coffee :) Thanks for the feedback.

------
js4all
Thanks for the article. It is like a continuation from the NoSQL tapes. When I
saw the video, I wondered how to use the built-in reduce functions. Now it is
clear.
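
For anyone else wondering, and assuming the built-ins in question are
CouchDB's `_sum`, `_count`, and `_stats`, here is roughly what they compute
over the values emitted for a key. This is a plain Python approximation with
made-up function names, not how the database implements them.

```python
def builtin_sum(values):
    """Approximates _sum: the total of the emitted values."""
    return sum(values)


def builtin_count(values):
    """Approximates _count: the number of emitted rows."""
    return len(values)


def builtin_stats(values):
    """Approximates _stats: summary statistics over numeric values."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sumsqr": sum(v * v for v in values),
    }


# Hypothetical values emitted by a map function for a single key.
values = [2, 4, 6]
stats = builtin_stats(values)
print(builtin_sum(values), builtin_count(values), stats)
```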

I also like the use of json.tool to format the output at the command line.

