

CouchDB Implements a Fundamental Algorithm - ropiku
http://jchrisa.net/drl/_design/sofa/_show/post/CouchDB-Implements-a-Fundamental-Algorithm

======
arnorhs
I might be really dense, but I have to admit I didn't understand a word of
that article. Maybe there are a lot of CouchDB-specific terms, maybe it's my
lack of knowledge of map/reduce; I don't know, I just didn't understand it.
Sorry for a lame comment.

------
jrockway
Transactions? Oh, nope, they _removed_ those...

~~~
RiderOfGiraffes
I'm not an expert in these things, and appreciate the opportunity to learn
more.

It seems to me that a transactional database necessarily requires all sorts of
locking and waits to ensure that everyone using the database sees the same,
bang-up-to-date version of the data. This is probably essential in some
applications.

The point of CouchDB seems to be that sometimes the system doesn't need the
most up-to-date version of the data, and that data that's not-very-old will
do, especially if it was valid at the start of the process. CouchDB then has a
different set of invariants, a different collection of guarantees.

CouchDB seems to work on the idea that changes will propagate, that the
system will settle, and that processes will never get inconsistent data,
although the data might sometimes be old/stale. These guarantees might then
suffice, especially in places where locking, and the delays/bottlenecks
threatened by locking, would be unacceptable.

Is that a reasonable summary?

~~~
jrockway
Read this: <http://blog.woobling.org/2009/05/why-i-dont-use-couchdb.html>

The system doesn't really settle. Imagine there are two versions of a record.
Your process reads the old one and generates a completely new record based on
that data. Later on, "eventual consistency" removes that original "old"
record; the "new" one won. That fixes the initial record, but it doesn't
touch the records that were derived from the discarded data. So the database
gradually fills with data that is completely inconsistent. There is no such
thing as "eventual consistency". (Transactions handle this problem: if your
new record is based on old data, your transaction will abort; if you commit
successfully, you know your data is consistent.)
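To illustrate, here is a minimal sketch of the failure mode described above. It is not the CouchDB API; the dict, field names, and "newest revision wins" rule are all invented for illustration:

```python
# Hypothetical in-memory "database": record "a" exists at revision 1.
db = {"a": {"rev": 1, "price": 100}}

# Our process reads the old revision and derives a new record from it.
db["b"] = {"rev": 1, "total": db["a"]["price"] * 2}

# Meanwhile replication delivers a newer conflicting revision of "a",
# and the conflict is resolved by discarding the old one.
db["a"] = {"rev": 2, "price": 50}

# "a" is now consistent across replicas, but "b" still encodes the
# discarded data: nothing re-derives it, so the invariant
# b.total == 2 * a.price no longer holds.
assert db["b"]["total"] != db["a"]["price"] * 2
```

A transactional store would instead abort the write of "b" once "a" changed underneath it, forcing the process to re-read and re-derive.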

Now, if you never use your data to make new data, and none of your records are
related to each other via links, and you never need to create more than one
record at once, CouchDB is fine. But it seems to encourage people to throw
away basic data integrity for some magical scalability reasons, which is
probably a premature optimization.

(And I say this as someone who uses relational databases only rarely. The
relational model does not fit my problem space very often, but strict
transactional semantics always do.)

~~~
silentbicycle
I may be misunderstanding you, but is this loosely correct? It merges
distributed data like git does, except that where git would report a merge
conflict, the newer version just wins?

It's certainly hard to come up with heuristics for automated merging in all
corner cases, and punting and asking the (likely best-informed) user to
merge, as git does, seems like the most practical solution. Unfortunately,
automated replication doesn't seem to allow that option.

Am I really off base here?

~~~
dantheman
If there is a conflict, it is reported to the application using CouchDB, and
the application can use whatever logic it wants to manage the conflict.
"Most recent wins" is one simple method an application can use.
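A rough sketch of what that application-side policy might look like. This is not CouchDB code; the `resolve` function, the revision list, and the `updated_at` field are invented here, standing in for whatever conflicting revisions CouchDB hands back to the application:

```python
def resolve(conflicting_revisions):
    """Pick a winner among conflicting revisions of one document,
    using a simple "most recent timestamp wins" policy."""
    return max(conflicting_revisions, key=lambda rev: rev["updated_at"])

# Three conflicting revisions of the same document, as the
# application might see them after replication.
revisions = [
    {"_rev": "1-abc", "body": "draft",      "updated_at": 100},
    {"_rev": "2-def", "body": "edited",     "updated_at": 250},
    {"_rev": "2-ghi", "body": "older edit", "updated_at": 200},
]

winner = resolve(revisions)  # the rev with the latest timestamp
```

The point is that CouchDB only surfaces the conflict; the policy (last-write-wins, field-level merge, or asking the user, as in git) is entirely up to the application.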

