

A two-year NoSQL case study: technologies, trade-offs, tips - jhs
http://www.dataversity.net/archives/6714?t=1320768580

======
AdesR
We have had a good deal of luck using ElasticSearch to index and query our
CouchDB databases on some projects. Couch publishes a _changes feed that
allows ES to subscribe to updates and index them as soon as they hit the
database. ES has a very powerful JSON-based query language that eased a lot of
the pain points related to CouchDB views for us.
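The _changes feed is plain HTTP, so a consumer can be tiny. Here is a minimal Python sketch of the checkpointing pattern (the host, database name, and sample payload are made up for illustration; the actual indexing went through ElasticSearch):

```python
import json
import urllib.request

def parse_changes(payload):
    """Extract (seq, doc id) pairs plus the checkpoint from a _changes response."""
    body = json.loads(payload)
    changes = [(row["seq"], row["id"]) for row in body["results"]]
    return changes, body["last_seq"]

def poll_changes(base_url, since=0):
    """One polling round against a live CouchDB's _changes feed."""
    url = f"{base_url}/_changes?since={since}&include_docs=true"
    with urllib.request.urlopen(url) as resp:
        return parse_changes(resp.read())

# Example payload in the shape CouchDB returns:
sample = '''{"results":[
  {"seq":1,"id":"doc-a","changes":[{"rev":"1-abc"}]},
  {"seq":2,"id":"doc-b","changes":[{"rev":"1-def"}]}
],"last_seq":2}'''

changes, checkpoint = parse_changes(sample)
# An indexer would push each changed doc into ES, then persist `checkpoint`
# so the next poll resumes from ?since=<checkpoint> and nothing is reindexed.
```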

I wrote an article about my experiences with it:
http://developmentseed.org/blog/2011/may/31/flexible-faceting-and-full-text-indexes-using-elasticsearch/

------
itaborai83
I'd like to ask the Couch luminaries that hang around here about the status of
the inclusion of Google's Snappy compression library and the rewrite of the
view engine. I'm aware that we are talking about a Cloudant-specific solution
here, but how much of an impact would it have in a scenario such as the one
described in this talk?

~~~
jhs
I am just a user and occasional patch submitter.

With CouchDB, you front-load all of your disappointment. In exchange,
everything that CouchDB _can_ do has compelling big-O performance. For
example, all queries finish in logarithmic time, including one-to-many, one-
to-one, and merge-joins. Map-reduce is not a job you run; it is a living data
set that always exists and always reflects the latest changes to your data.
(Updating a map-reduce result takes time linear in the number of updates, if
I recall.)
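To illustrate the incremental idea (this is a toy in-memory sketch, not CouchDB's actual B-tree-backed implementation): the view keeps a reduced value per key, and when a document changes it retracts that doc's previously emitted rows and applies the new ones, so the cost tracks the updates rather than the size of the data set.

```python
from collections import defaultdict

def map_doc(doc):
    # Toy map function: emit (tag, 1) for each tag on the document.
    return [(tag, 1) for tag in doc.get("tags", [])]

class IncrementalView:
    """Toy incremental map-reduce where reduce = sum of emitted values per key."""
    def __init__(self):
        self.totals = defaultdict(int)   # key -> reduced value
        self.emitted = {}                # doc_id -> rows emitted last time

    def update(self, doc_id, doc):
        # Retract the rows this doc emitted on its previous revision...
        for key, value in self.emitted.get(doc_id, []):
            self.totals[key] -= value
        # ...then apply the rows for the new revision. Cost is proportional
        # to this doc's rows, independent of how many documents exist.
        rows = map_doc(doc) if doc is not None else []
        for key, value in rows:
            self.totals[key] += value
        self.emitted[doc_id] = rows

view = IncrementalView()
view.update("d1", {"tags": ["db", "erlang"]})
view.update("d2", {"tags": ["db"]})
view.update("d1", {"tags": ["erlang"]})  # d1 drops the "db" tag
```

After the three updates, the view reflects the latest revisions only: "db" counts 1 (just d2) and "erlang" counts 1 (just d1), without ever rescanning unrelated documents.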

Plus, the BigCouch builds allow you to specify your redundancy needs. The
preceding paragraph still holds true. Nothing has changed. You just get to
throw hardware at the problem to guard against machine failures.

CouchDB is slow. Its VM is pokey. Its disk format is bulky. Its protocol is
bloated.

CouchDB is fast. Everything that you can do, you can do in logarithmic time.

CouchDB is neither slow nor fast, but _predictable_. Fun fact: the entire
CouchDB Erlang code base is almost the same size as the NodeJS standard
library (20k apples vs. 15k oranges).

To answer your question, snappy compression and view optimizations will be a
welcome boost for the other speed question: speed of development, time to
market. If you think the compile step is a time sink, rebuilding an index on
all of your data is just untenable. So, the optimizations will improve the
day-to-day experience, but they will not change CouchDB's fundamental value
proposition.

------
jhs
CouchDB fails the benchmarks but gets honors in the school of hard knocks.

~~~
rkalla
Exactly. The restrictive query model and data read/write patterns required all
tech decisions to be bounded by Couch's requirements (which is normally a
no-no), but in 2 years, growing to 40TB and going from 1 to 14 nodes
seamlessly, there was no lost data and no crippling cluster events.

That's a big deal.

~~~
jhs
"NoSQL" is meaningless. Instead, look at programming language terminology:
general-purpose languages vs. domain-specific languages.

CouchDB is a domain-specific database.

~~~
banjiewen
This is a good way to look at it, I think. People often ask if I'd use CouchDB
again if I were building another similar system - I would, but I'd restrict it
to the subset of the problem best suited to CouchDB's strengths and quirks.

(note: I gave the linked talk)

~~~
itaborai83
and what would that subset be in your opinion?

------
einhverfr
I thought the discussion of tradeoffs was interesting, in particular the
problem of saying no to customers, and that the business became rigidly
centered around the database structure.

It makes me wonder whether CouchDB and its kin might be at their best when
used alongside, rather than as a replacement for, RDBMSs.

------
banjiewen
I should also mention that Meteor's hiring software developers in Seattle. If
this is interesting to you, shoot us an email at jobs@meteorsolutions.com.

