
Why Vector Clocks are Easy - skorgu
http://blog.basho.com/2010/01/29/why-vector-clocks-are-easy/
======
roder
Justin (CTO of Basho) gave a phenomenal, keynote-worthy presentation at NoSQL
East.

Vector clocks are easy, and once he explained them in the presentation they
"clicked" for me.

You can watch the whole presentation here, but if you want just the bit on
vector clocks, fast-forward to 28m:

<https://nosqleast.com/2009/#speaker/sheehy>

But seriously, I _highly_ recommend watching the whole presentation. These
guys are brilliant.

------
tptacek
Is this basically the same idea as Lamport's logical timestamps?

~~~
evgen
Lamport timestamps sync events between two systems; vector clocks extend the
sync to N systems by using an array of these logical timestamps.

~~~
mbrubeck
That's not true. Lamport clocks work across any number of nodes/processes. The
figures in Lamport's paper all show examples using three processes:

[http://docs.google.com/viewer?a=v&q=cache:IJxXjuFmdHEJ:c...](http://docs.google.com/viewer?a=v&q=cache:IJxXjuFmdHEJ:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.112.7608%26rep%3Drep1%26type%3Dpdf+vector+clock+lamport+clock&hl=en&gl=us&pid=bl&srcid=ADGEESj9G9HmsPpW6K6_8GQQ8xB-
FQG2Y6NoEPNFrc_Cy7jcugKba8os_np3_7KQ2ecdS2M3cf_7VKc20VWaIqFZWpALejCa3mXf3CIcyhvvCCvL67WXK-
bF45N5ewp9ODFem5R5p8aU&sig=AHIEtbQ4rf26DoHRXxGghyaByJnFtTWR2A)

~~~
justinsheehy
Lamport's logical clocks work across multiple nodes, but they can lose
information about causality.

Hence Mattern's extensions to Lamport's work, which introduce the idea of
vector clocks.
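
To make the "lose information about causality" point concrete, here is a
minimal Python sketch. It is only an illustration: the node names "A" and "B"
and the helper functions are made up for this example and are not taken from
Riak or from either paper. Two nodes act independently, a single Lamport
counter still puts their events in an order, and the vector clocks show that
neither event precedes the other.

```python
# Illustrative only: node names and helpers are hypothetical, not a real API.

def vc_increment(clock, node):
    """Return a copy of the vector clock with this node's counter bumped."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def happened_before(a, b):
    """True if the event stamped `a` causally precedes the event stamped `b`."""
    nodes = set(a) | set(b)
    no_entry_ahead = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    some_entry_behind = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_entry_ahead and some_entry_behind

# Node A performs one local event; node B independently performs two.
# No messages are exchanged, so A's event and B's second event are concurrent.
lamport_a = 0 + 1        # A's Lamport counter after its event
lamport_b = 0 + 1 + 1    # B's Lamport counter after its second event

vc_a = vc_increment({}, "A")                     # {"A": 1}
vc_b = vc_increment(vc_increment({}, "B"), "B")  # {"B": 2}

# The Lamport numbers are ordered even though neither event caused the other.
print(lamport_a < lamport_b)          # True
# The vector clocks are incomparable, which exposes the concurrency.
print(happened_before(vc_a, vc_b))    # False
print(happened_before(vc_b, vc_a))    # False
```

With Lamport timestamps alone, 1 < 2 cannot be told apart from a real
"A's event led to B's event" ordering; the per-node entries are what keep that
distinction.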

------
Raphael
Cool, it's like the data is surfing a web made of people (or inanimate
actors), carrying a browsing history with it.

------
DLWormwood
Okay, this article made me feel stupid, given that I've been outside of a CS
department for over a decade.

A brief Googling/Wiki-ing explains that they are used to solve certain
concurrency problems, but without further explanation. Can somebody give me a
practical, real world example of a problem this kind of strategy is supposed
to help solve? I feel like I'm missing out on something...

~~~
agazso
Suppose you have a cluster of database servers in different datacenters, and
the connection between the datacenters goes down. If you want to ensure high
availability and let your users read and modify the data while there is no
connection, then you have to mark each modification somehow, so that later,
when the connection is back up and the servers synchronize, they can
automatically merge most of the changes and identify the conflicting
cases.

That problem is solved by vector clocks and the most prominent application of
them is Amazon's Dynamo system. See also eventual consistency for more
information.
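
A rough Python sketch of that reconnect step, under some loud assumptions: the
datacenter names "dc1" and "dc2", the keys, and the `reconcile` helper are all
invented for illustration, and this is not how Dynamo or any particular
database actually implements it. Each stored value carries a vector clock; on
sync, a value whose clock has seen everything the other side has wins
outright, and otherwise both values are kept as a conflict to resolve.

```python
# Hypothetical sketch of reconciling two replicas after a partition heals.

def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def merge_clocks(a, b):
    """Entry-wise maximum of two vector clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

def reconcile(local, remote):
    """Each argument is a (clock, value) pair; returns (clock, [values])."""
    (local_clock, local_value), (remote_clock, remote_value) = local, remote
    if descends(local_clock, remote_clock):
        return local_clock, [local_value]      # local is newer or equal
    if descends(remote_clock, local_clock):
        return remote_clock, [remote_value]    # remote is strictly newer
    # Neither side has seen the other's write: keep both as a conflict.
    return merge_clocks(local_clock, remote_clock), [local_value, remote_value]

# "cart" was modified in both datacenters during the outage.
cart = reconcile(({"dc1": 2}, "cart+book"),
                 ({"dc1": 1, "dc2": 1}, "cart+pen"))

# "email" was modified only in dc1, so it merges automatically.
email = reconcile(({"dc1": 2}, "new@example.com"),
                  ({"dc1": 1}, "old@example.com"))

print(cart[1])   # ['cart+book', 'cart+pen']
print(email[1])  # ['new@example.com']
```

The single-sided edit resolves without anyone noticing; only the key touched
on both sides comes back with two values for the application (or the user) to
sort out.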

