
How the Dropbox Datastore API Handles Conflicts – Part Two - llambda
https://www.dropbox.com/developers/blog/56/how-the-dropbox-datastore-api-handles-conflicts-part-two-resolving-collisions
======
jchrisa
The problem with this approach is that it requires you to resolve conflicts
when you first see them. So you can't do workflows that accumulate more than
two versions of what happened. Nor can you resolve the conflict
asynchronously.

For an approach that doesn't ever force you to throw away or merge data, see
the data structures in my OSCON talk.
[https://speakerdeck.com/jchris/couchbase-lite-
oscon](https://speakerdeck.com/jchris/couchbase-lite-oscon)

I owe the world a write-up explaining these slides. Or come see me talk at
StrangeLoop in St. Louis or Couchbase SF in September.
[http://www.couchbase.com/couchbase-
sf-2013](http://www.couchbase.com/couchbase-sf-2013)

~~~
erikb
I could interprete it as a multiple-branch model with Git. But it's still
rather confusing for me what use it would have for anybody outside of source
code repos (meaning: I don't have the knowledge, not that there wouldn't be a
usecase).

For some end users it's already confusing to have different states in the same
linear history on different devices because of upload times and devices being
offline. I think giving most users multi-branch capabilities would overpower
them and they might just ignore all branches but one, manually merging parts
of older branches into the newer ones.

So I guess you assume usecases for application writers? I'm also not deep
enough into this topic that I could imagine any usecases. Would you mind
giving some examples?

~~~
jchrisa
As a database, the #1 rule is don't lose data. So a model that allows you to
defer conflict merges, and to do the merge on any peer, keeps the database
doing what it does best, and gives maximum flexibility to the application.

For instance on a customer record you might have three sales folks all qualify
a lead as a 4, one adds a fax number, one adds an email address, and one adds
the same fax but a different email, and a note about the best time to call. If
the database forced the application to merge on encountering each update,
there'd never be a chance to set all three versions side by side and make an
informed comparison. Instead it would be luck that A and B merge, and then the
result merges with C. Or some other order.

Where it gets really useful is if you have different actors that have
ownership of the merge on different fields. So for instance you could
partially resolve the conflict by merging some fields automatically but
present a human with the full set of options for a bigger octopus merge on,
say, the notes field.

~~~
pkj
If a client app is processing the cust record, how would cust.email look like
? Typically I would assume a single value, but in case of unresolved conflicts
will it be a iterator/array ?

~~~
jchrisa
The client will have N copies of the customer record, so the record schema
doesn't change. Instead all the potentially valid versions of it are preserved
until a peer (any client or server) marks the conflict as resolved.

------
pkj
Trying to wrap my head around this. Seems difficult without clear usecases.

Let's say I have 10 devices d1,d2....d10 making updates to "a" on the server
and went offline. a==20 and last update was by d5 before everyone went
offline.

When the devices come back up, the fate of "a" depends on the rulesets.
Following are 3 possible high-level combinations.

i) All devices have "remote" rule. On reconnection, everyone rollback "a" to
20. They are essentially back to the time before going offline. Even the
device which did the last update(d5) before going offline is rolled back too,
which seems bit odd. Still simple to reason with..

ii) All devices have "local" rule. On reconnection, the last device to
reconnect updates "a". It is then broadcasted to all other devices. Note that
it is _not_ the last device to update "a". Rather it is the last to reconnect
(Now, even if all of them reconnect at same time, depending on the queueing at
server, the one at the tail wins). Not really simple..

iii) Mix of "remote" and "local" Let's say d1 had "local" rule and all others
had "remote". On reconnection, d1's "a" will be propagated to everyone. This
is irrespective of the order of reconnection (I am assuming that between
reconnections "a" is not modified). This is pretty simple and perfectly
predictable. Now, if we have more than one "local", we start getting non-
deterministic, and at the extreme move to case ii)

~~~
smarx
Note that every device submits its change with an expected "parent revision."
The server checks the change against the server's current revision, and the
change is accepted if and only if the server revision matches the parent
revision of the submitted change.

So when devices d1 through d10 make a (simultaneous, I assume) change, they
all submit their change with the same parent revision. Assuming they were up-
to-date before they submitted that change, exactly one of the device's changes
will succeed (whichever reaches the server first). For example, if the
previous revision was 100, they'll all submit changes with the parent revision
100, and the first one to reach the server will succeed, at which point the
server revision will be increased to 101. When the other changes come in,
they'll all fail because their parent revision doesn't match the server.

I have to slightly revise your scenario and say all these devices went offline
and then made a local change. This may be what you meant, but I want to
clarify that the changes were queued up locally but not yet sent to the
server.

So in (i), where the conflict resolution strategy is "remote," what will
happen is that one device's change will win (whichever reaches the server
first), and all other devices will throw out their change in favor of the
change that made it to the server. It's not the case that _everybody 's_
changes are rolled back.

In (ii), the first device to connect submits its change and is accepted by the
server. Subsequent devices submit their change (with parent revision of 100),
see that they're out of date (server revision is now 101 or higher), and
resubmit their change with the new parent revision, effectively clobbering any
changes that have been made on the server. So each device in turn clobbers the
value on the server, and the ultimate value is whichever change was submitted
last.

For (iii), really don't do that. Having different devices use different
conflict resolution strategies is a bad idea, and I can't really think of a
valid scenario fro that. Can you?

So to sum up, (i) would mean "first change to reach the server wins," and (ii)
would mean "last change to reach the server wins."

~~~
pkj
Thanks for the comprehensive clarification. Yes you are right that the use of
revision-id would allow "first change to reach the server wins" for case (i).
It would result in a simple and fair outcome. I did not see the mention of
revision usage in the parent link. But it makes absolute sense.

Think we agree on case ii). On case iii) I still think that having 1 local and
N-1 remote might be useful when we want to prioritize a particular writer over
others. Borrowing the sales example from commenter jchrisa, consider a new
user sales-head (local rule). He syncs the data uploaded by his sales folks
(remote rule) and then goes offline. When he is done editing and comes online,
he wants to make sure his delta takes precedence irrespective of any previous
changes by sales folks during his being offline. Since he has local rule, his
update will just win. Further, he does not want sales guys who were offline
and come online after him to overwrite his last update immediately. I am
assuming that the sales folks with "remote" rule will see the data with newer
server version and accept it.

~~~
smarx
I suppose your scenario for (iii) is reasonable in the case of sharing a
datastore among multiple users. In that case, you'd probably still want all of
_that user 's_ devices to use the same conflict resolution strategy.

BTW, (for now) the Datastore API doesn't support sharing among multiple users.

Given the relationship the Datastore API has between the client and the
server, it's also possible for a client to use a custom resolution rule
(though this isn't yet exposed in any of our clients), which would let you do
smarter things like look at the changes that were made and decide based on
app-specific logic which one should win. I personally think that's a big
advantage, but as I said, it's not one that's yet been fully realized.

------
boomzilla
Awesome, this is one of the best explanations of version conflict resolution
in distributed storage.

The best engineers not only needs to write great code, they need to write the
best documentations too :)

------
joejohnson
I might not understand what the Datastore API is for, but if I wanted to make
a simple app that could sync to Word Docs between to clients, would these .doc
files get represented in this database structure (tables, records and values)?
Is that data structure just for example purposes?

How would this algorithm handle changes to the same text file? Is the file a
record, and each row considered a value in that record?

EDIT: Looks like this API is only for "structured data like contacts, to-do
items, and game state."
[https://www.dropbox.com/developers/datastore](https://www.dropbox.com/developers/datastore)

~~~
smarx
Your edit is correct. To sync files, you would use the Core API or Sync API.

([https://www.dropbox.com/developers/core](https://www.dropbox.com/developers/core)
and
[https://www.dropbox.com/developers/sync](https://www.dropbox.com/developers/sync))

~~~
joejohnson
I'd be interested to hear how conflicts are handled with those APIs.

~~~
smarx
The API uses (optional) optimistic concurrency. So when you write a file, you
can specify the expected parent revision, and if the revisions don't match,
the operation will fail. If you want, you can have Dropbox instead just
overwrite whatever file is present on the server or write the new file but
rename it in the process (e.g. "myfile (1).txt").

This is pretty much all that's possible to do generally for files, since their
structure is opaque. One of the ideas behind the Datastore API is to support
developers structuring their data in such a way that changes can be merged
automatically.

------
joshuak
Why does chrome prompt me for keychain access on this page?

I'm not logged in, and it doesn't log me in.

~~~
minikites
I've heard around (I don't use Chrome personally) that Chrome just prompts for
keychain access until you give up and grant it access all the time.

~~~
joshuak
Yes, but even then it only needs to prompt when it wants information from the
keychain to fill in forms and such. The question is why would this blog page
trigger such an event?

~~~
iza
There is a sign in dropdown form in the upper right.

