
Real Differences Between OT and CRDT for Co-Editors - shunza
https://arxiv.org/abs/1810.02137
======
amsha
Here's the key sentence: "concrete implementations of CRDT in co-editors
revealed key missing steps in CRDT literature." This paper may be correct for
academic CRDTs but it is very wrong when looking at industry implementations.

My hunch is that because CRDTs are so much easier to grok than OT, engineers
are empowered to make use case-specific improvements that aren't reflected in
academic literature.

For example, the Logoot/LSEQ CRDTs have issues with concurrent insert
interleaving; however, this can be solved by dedicating a bit-space in each ID
solely for follow-up inserts. The "Inconsistent-Position-Integer-Ordering"
problem is solved by comparing ids level-by-level instead of in one shot.
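
A level-by-level comparison can be sketched roughly like this (a hypothetical
simplification, assuming each position ID is a list of (digit, site) pairs,
not any particular library's encoding):

```python
# Hypothetical sketch of level-by-level Logoot-style ID comparison.
# Each position identifier is a list of (digit, site_id) pairs; comparing
# the pairs level by level (rather than packing the digits into one big
# integer) avoids the "Inconsistent-Position-Integer-Ordering" problem.

def compare_ids(a, b):
    """Lexicographic, level-by-level comparison of two position IDs."""
    for (digit_a, site_a), (digit_b, site_b) in zip(a, b):
        if digit_a != digit_b:
            return -1 if digit_a < digit_b else 1
        if site_a != site_b:
            return -1 if site_a < site_b else 1
    # All shared levels are equal: the shorter ID sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))
```

Comparing pair-by-pair keeps the ordering consistent no matter how many levels
two IDs have, which is exactly where the one-shot integer comparison breaks.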

Fundamentally, CRDTs have strong strategic advantages over OT. Given a
document size N and a document edit history H:

* CRDT size is O(N) where OT is O(H)

* CRDT updates are O(log N) where OT is O(H^2)

In any nontrivial document, H ≫ N. This means CRDTs have much better perf than
OT. Additionally, the best CRDTs (like Logoot/LSEQ) don't require tombstones,
garbage collection, or "quiescence." The complexity burden is far lower.

To top it off, CRDTs are offline-capable by default.

~~~
shunza
The time complexity of most OT systems is not related to H.

~~~
josephg
I've been working in OT systems for years (G Wave, ShareJS, ShareDB, some
other stuff). I'm consistently surprised by how badly academic papers predict
OT systems will perform. In reality, they perform _great_. My little C
implementation of text OT can handle about 20M text operation transforms /
second[1].

Part of the gap is that many academic papers model text operations as just
single character edits. If you do that, copy+pasting some text into an editor
can inject thousands of insert operations into the system all at once. But
that design is really sloppy - nobody actually designs real world OT systems
like that. A much better way to design text OT operations uses a list of skip,
insert and delete parts. That way you can arbitrarily compose edits together.
This way an entire pasted string just gets inserted in one operation and
performance is fine. (Or if the user goes offline, you can merge all the edits
they do into a single change, then transform & commit it all at once).
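
The skip/insert/delete encoding described above can be sketched like this (a
minimal illustration of the idea, not libot's or ShareJS's actual API):

```python
# An operation is a sequence of ('skip', n), ('insert', s) and
# ('delete', n) components, so a whole pasted string travels as one
# component rather than thousands of single-character operations.

def apply_op(doc, op):
    out, pos = [], 0
    for kind, arg in op:
        if kind == 'skip':          # retain `arg` characters unchanged
            out.append(doc[pos:pos + arg])
            pos += arg
        elif kind == 'insert':      # insert the string `arg` at the cursor
            out.append(arg)
        elif kind == 'delete':      # drop the next `arg` characters
            pos += arg
    out.append(doc[pos:])           # implicit trailing skip
    return ''.join(out)
```

So `apply_op("hello world", [('skip', 6), ('insert', 'brave new ')])` yields
`"hello brave new world"` with the entire paste carried in a single component.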

I've still never seen the OT algorithms of anything I've worked on have a
meaningful impact on performance. Actually, thinking about it, I'm not sure
I've ever even seen the OT code show up in profile traces.

[1] [https://github.com/ottypes/libot](https://github.com/ottypes/libot)

~~~
espadrine
I fully agree; I don't understand the emphasis on performance both in the
literature and in comments on the subject.

Additionally, it is more common to see CRDT libraries that perform character-
wise operations and create an object for each character (e.g. y-js.org,
github.com/google/ot-crdt-papers), which is not ideal.

Obviously there are also libraries such as Atom's Teletype that have string-
wise operations.

The cynic in me feels like the CRDT vs. OT war misses the forest for the
trees. What matters is missing features, and the feature that is most needed
and least described is a systematic way to offer a diff editor matching the
normal experience. Indeed, after having been offline for a while, one wants to
see and select how their changes will be integrated into the shared resource.

~~~
nused
I think you'll find the article supports your sentiment to an extent.

One of the key messages we tried to get across is that academic research on
co-editing ought to go beyond theoretical analysis and instead drive research
by pushing the envelope in new areas of application and/or features. OT's
evolution has been symbiotic with new applications, while we haven't seen this
in CRDT (for co-editing) research. The reality is that you'll find new CRDT
papers every year that still (a) make fairly broad claims of benefits that are
either theoretically dubious or not backed/validated by application/system
implementations, and (b) dwell on technical issues that were resolved years
ago. To move beyond this, we wanted to put these issues in context (a general
transformation approach) and dispel the most common fallacies in theoretical
analysis.

Related to the 'visual-merging' feature you mentioned: when I worked on making
MS Word collaborative as a research project, one of the UX research questions
was how to let users visualize the different combined effects of updates to
objects (e.g. if you changed the background of object X to red and I changed
it to yellow at the same time). We came up with a multi-versioning technique
to display the potentially different combinations and help users select the
one that's most desirable. It is definitely an interesting and challenging
problem to consider for text documents.

------
marc_shapiro
The argument of Sun's paper seems to be that CRDTs have hidden performance
costs. Perhaps this is true.

This completely misses the main point. OT is complex, the theory is weak, and
most OT algorithms have been proven incorrect (see
[http://hal.inria.fr/inria-00071213/](http://hal.inria.fr/inria-00071213/)).
AFAIK, the only OT algorithm proved correct is TTF, which is actually a CRDT
in disguise.

In contrast, the logic of CRDTs is simple and obvious. We know exactly why
CRDTs converge. There are several papers proving that specific CRDTs are
indeed correct.

Furthermore, I'm somewhat doubtful of the performance discussion, since Attiya
et al. proved that there is a lower bound on the complexity of concurrent
editing, independently of the technology used. See
[http://dx.doi.org/10.1145/2933057.2933090](http://dx.doi.org/10.1145/2933057.2933090).

Disclaimer: I did not read the paper in detail, just skimmed over it.

~~~
marc_shapiro
"Most OT algorithms have been proved incorrect": a better reference is
[https://doi.org/10.1016/j.tcs.2005.09.066](https://doi.org/10.1016/j.tcs.2005.09.066)

~~~
shunza
Representative CRDTs, at least Logoot and TreeDoc, were never truly proved.
The correctness of Logoot and TreeDoc is claimed under the assumption that
the required property of ids is preserved by the CRDT design. However, the
existing id-generation algorithms cannot achieve the desired property.

To be specific, I will enumerate two examples.

First, in Logoot
([https://hal.inria.fr/inria-00432368/document](https://hal.inria.fr/inria-00432368/document)),
the function generateLinePositions in section 4.2 has a serious error, which
can lead to a failure to generate new identifiers. The same flawed design can
also be found in follow-up research (Logoot-Undo,
[https://hal.inria.fr/hal-00450416/](https://hal.inria.fr/hal-00450416/)).

Second, in TreeDoc
([https://hal.inria.fr/inria-00445975/document](https://hal.inria.fr/inria-00445975/document)),
the function newPosID described in section 3.2 may generate incorrect ids.
This function is meant to generate an id r between any two ids p and q (p<q)
such that p<r<q. However, in line 5, if pn=0, then we would get r<p. Readers
can refer to Figure 3: in the tree, the character dY precedes the character dZ
in infix order, yet the id of dY (10(0:dY)) is greater than the id of dZ
(100(1:dZ)), so their positions are inconsistent with their ids.

In the literature, CRDT researchers always claim their CRDT designs are
formally proved.

However, there are various scenarios in which CRDTs would have inconsistency
issues, as stated above.

In my opinion, there isn't much validity to these claims.

~~~
marc_shapiro
Well, RGA has been proved formally [DOI 10.1145/2933057.2933090]. Regarding
Figure 3 of the Treedoc paper, I believe the IDs of dY and dZ are in the
correct order, according to the rules in the paper.

------
zellyn
If you're interested in this stuff, you might also like Raph Levien's writing
on it:

\- [https://medium.com/@raphlinus/towards-a-unified-theory-of-
op...](https://medium.com/@raphlinus/towards-a-unified-theory-of-operational-
transformation-and-crdt-70485876f72f)

\- [https://github.com/xi-editor/xi-
editor/blob/e8065a3993b80af0...](https://github.com/xi-editor/xi-
editor/blob/e8065a3993b80af0aadbca0e50602125d60e4e38/doc/crdt-details.md)

~~~
raphlinus
Thanks! I think a unification of OT and CRDT is a very promising space, both
for practical implementation and for deeper academic understanding. To be
honest, I only skimmed the posted paper, but it didn't grab my attention. It's
very difficult these days for me to motivate spending time in this space.

~~~
shunza
TTF is a CRDT-like approach. Both maintain an extra data structure that
includes tombstones, and TTF operations are only applied to that dedicated
data structure. To be applied to the document visible to users, TTF operations
need to be translated, just like CRDT operations.

Your solution is to combine GOTO with TTF. Effectively, it is a combination of
OT and CRDT.

------
zzzcpan
"most other OT solutions ... do not require a central server to do (any part
of) the OT work, but only require the use of an external causal-order-
preserving communication service (the same as most CRDT solutions). "

CRDTs don't require an external causal-order-preserving communication service.
That's kind of the whole point of CRDTs. At the same time this is what imposes
certain limitations on how they can be used. But so is automatic conflict
resolution in collaborative editing.

~~~
shunza
Except for WOOT, which CRDT for co-editor does not require causal ordering?

~~~
gritzko
RON CT/RGA may consume inputs in arbitrary order.

~~~
alangibson
Some background on the Causal Tree (CT) that gritzko is referring to:
[http://archagon.net/blog/2018/03/24/data-laced-with-
history](http://archagon.net/blog/2018/03/24/data-laced-with-history).

------
zawerf
Only tangentially related, but today I was reading about collaborative editing
using Quill.js (a rich text editor) and found a really nice solution based on
sharedb (using ottypes/rich-text): [https://github.com/pedrosanta/quill-
sharedb-cursors](https://github.com/pedrosanta/quill-sharedb-cursors)

Before I dive knee-deep into this, does anyone have any opinion on sharedb or
better alternatives? (It was listed in this post under its old name, sharejs,
as one of the more successful OT implementations out there, so that's a good
sign.)

~~~
lowry
The article above says sharedb was originally written by a Google Wave
developer.

~~~
josephg
_waves_ Y'all are talking about me. I was indeed on the wave team, although I
joined Wave right near the end. (And then I stayed on to help opensource the
whole thing.)

We wrote ShareDB at Lever[1], which was in the 2012 YC batch (iirc). We wrote
it to allow realtime collaborative editing of all the data fields in our
application by default. I'm still really proud of that work. ShareDB primarily
uses JSON-OT[2], which lets you do realtime OT over arbitrary application
data. The Lever team has been maintaining & improving sharedb for the last few
years, which is really lovely to see as I've moved on to other projects.

One nice thing about OT is that it's much easier to implement. The original
quilljs author (Jason Chen) wrote the rich-text OT implementation. ShareDB
works with any OT code (so long as operations can be JSON stringified /
parsed). And it's all hooked up in quill, which is super neat. Looking at the
quill github[3], it looks like David Greenspan has been maintaining quill
recently. David Greenspan is one of the original authors of Etherpad, which is
one of the first web-based text OT engines.

[1] [https://lever.co/](https://lever.co/)

[2] [https://github.com/ottypes/json0](https://github.com/ottypes/json0)

[3] [https://github.com/quilljs/quill](https://github.com/quilljs/quill)

~~~
lowry
Thanks a lot for the background on sharedb! Great work, indeed.

------
williamstein
I enjoyed reading this paper. I also recently talked with Chris Colbert about
his new plans to use a CRDT approach to collaborative editing of Jupyter
notebooks. This made me curious again about how CoCalc’s collaborative editing
is related to OT and CRDT approaches, so I wrote up a blog post about that
just now. [http://blog.sagemath.com/2018/10/11/collaborative-
editing.ht...](http://blog.sagemath.com/2018/10/11/collaborative-editing.html)

~~~
nused
Enjoyed reading your thinking around what "intention preservation" means for
CoCalc. I think "intention" was left open-ended since it was always meant to
be defined in the context of a specific application; e.g. image vs. text
editing would likely have different notions of user intentions and what it
means to preserve them.

TP2 has been made into this monster that seems to scare anyone starting to
look at OT. You don't need to worry about TP2 with a suitable protocol,
whether with a server or fully peer-to-peer, and either solution is simple
enough to implement. A fine point is that TP2 and intention violation are
distinct properties: you can easily achieve convergence by serializing the
operations through a total order at every site (but you won't have intentions
preserved); on the other hand, if you insert 'a' and I insert 'b' at the same
location simultaneously, it's reasonable for both results 'ab' and 'ba' to be
intention-preserving, but without transformations we'd each see a different
result (thus divergence).
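
That insert-'a'/insert-'b' scenario can be sketched as follows (a hypothetical
minimal example with insert-only ops and a site-id tie-break, not any
production OT type):

```python
# Two users insert at the same position; transforming each op against the
# other makes both sites converge on one intention-preserving result.
# Ops are (position, text); ties are broken by site id (an assumption here).

def transform_insert(op, other, op_site, other_site):
    pos, text = op
    o_pos, o_text = other
    if o_pos < pos or (o_pos == pos and other_site < op_site):
        return (pos + len(o_text), text)   # shift right past the other insert
    return op

def apply_insert(doc, op):
    pos, text = op
    return doc[:pos] + text + doc[pos:]

doc = "xy"
op_a, op_b = (1, "a"), (1, "b")            # concurrent inserts at position 1

# Site A applies its own op, then B's op transformed against A's:
at_a = apply_insert(apply_insert(doc, op_a), transform_insert(op_b, op_a, "B", "A"))
# Site B applies its own op, then A's op transformed against B's:
at_b = apply_insert(apply_insert(doc, op_b), transform_insert(op_a, op_b, "A", "B"))
assert at_a == at_b == "xaby"              # both sites converge
```

Without the transform, site A would see "xaby" and site B "xbay"; the
tie-break just picks one of the two equally intention-preserving outcomes
consistently.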

We've documented plenty of CRDTs' performance and correctness issues in that
article. A bigger app-design question I would ask is whether you want to
"lock in" your app's internal data model with a consistency-maintenance
scheme -- there is real value in separation of concerns (SoC).

------
gritzko
An article of disappointing quality from well known OT authors.

Like, algo x has issue X, algo y has issue Y, z has Z, so CRDT has issues X, Y
_and_ Z... and many things like that.

~~~
zzzcpan
I agree. I don't understand why they are attacking CRDTs at all.

~~~
alangibson
I'm not accusing the authors of this, but some people seem to have knee-jerk
reaction against CRDTs because from far away they look like magic. "Use this
magic data structure and all your conflicts shall disappear." Of course, once
you look closely you see there are plenty of tradeoffs, just like anything
else.

------
Docky
Are there any CRDT-based industrial co-editing apps? Please, do not mention
the toy implementations and applications unrelated to co-editing.

~~~
gritzko
Recently one guy reverse-engineered Apple Notes. It is a CRDT. I personally
made a CRDT sync for Yandex. There is also a list on GitHub:
[https://github.com/ipfs/research-
CRDT/issues/40](https://github.com/ipfs/research-CRDT/issues/40)

~~~
Docky
Good work. Which CRDT alg was adopted in your product?

------
marknadal
Sorry, what?

OT requires centralized servers to do intention resolution.

CRDTs are a set of operations that can work on any peer and resolve correctly,
without extra coordination or central resolution.

The two things are about as opposite as they can get. Some CRDTs can be used
to make Google Docs-style OT, but are not specific to OT use cases.

Source: Me, I'm the leading industry and open source expert on CRDTs which
power the popular [https://github.com/amark/gun](https://github.com/amark/gun)
decentralized database that the Internet Archive, D.tube (1M+ monthly
visitors), notabug.io (1K/daily users) use. CRDTs have let us scale to 3K
tx/sec on $99 hardware in P2P mesh networks.

~~~
shunza
Why can't OT be applied to P2P networks? Does the fact that Google Docs uses a
transformation-based server mean that all OT needs a central server?

~~~
m12k
Yeah, I thought the whole point of OT was that it allowed any client to do
resolution (because all operations transform other operations in a way that
makes them all resolve the same way no matter their order). Which is what made
it seem like such overkill for Google Docs, where there is a central
authoritative server, so a much simpler architecture could have done the job...

~~~
josephg
Simple OT algorithms do work a lot better with a centralized server.

With a centralized server, you can consider every resolution as happening
between 2 nodes (server and client). This means transform / catchup is as
simple as a for loop.
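
That for-loop catchup can be sketched as follows (hypothetical names and an
insert-only placeholder transform for illustration, not ShareDB's real API):

```python
# A client op based on server version v is transformed, one step at a
# time, over every server op committed since v, then applied/committed.

def transform(op, server_op):
    """Placeholder pairwise transform; a real system plugs in its OT type."""
    pos, text = op
    s_pos, s_text = server_op
    if s_pos <= pos:
        return (pos + len(s_text), text)   # shift past the earlier insert
    return op

def catch_up(client_op, client_version, server_log):
    # Transform the client's op over each concurrent server op, in order.
    for server_op in server_log[client_version:]:
        client_op = transform(client_op, server_op)
    return client_op                       # now based on the latest version
```

Because every resolution is between exactly two parties (server and one
client), no TP2-style multi-way convergence property is ever exercised.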

In contrast, in decentralized context OT code needs to support Transform
Property 2[1] in order to converge correctly in all cases. TP2 dramatically
complicates the OT implementation, and you need a much more complicated
resolution algorithm to merge arbitrary changes between nodes.

For text, this means you need:

\- Tombstones (or something like it) - eg
[https://github.com/josephg/TP2/blob/master/src/text.coffee](https://github.com/josephg/TP2/blob/master/src/text.coffee)

\- Either an operation prune function (anti-transform) or full history,
forever.

\- A resolution algorithm that can flatten DAGs of operations into a
transformed list. This stuff gets really hairy, and hard to implement
efficiently. This is my attempt from a few years ago. It's correct, but super
slow:
[https://github.com/josephg/tp2stuff/blob/master/node3.coffee](https://github.com/josephg/tp2stuff/blob/master/node3.coffee)

Implementing high performance OT with centralized servers is easy.
Implementing OT in a decentralized network is hard to do correctly, and much
harder to implement in a highly performant way. For decentralized systems,
CRDTs are a much better approach imo.

[1]
[https://en.wikipedia.org/wiki/Operational_transformation#Tra...](https://en.wikipedia.org/wiki/Operational_transformation#Transformation_properties)

~~~
m12k
Very insightful, thank you for the detailed comment. About the DAG flattening
algorithm you mention - in what way does this differ from a topological
sorting?
([https://en.wikipedia.org/wiki/Topological_sorting](https://en.wikipedia.org/wiki/Topological_sorting)
) E.g. does it need to know about expensive or cheap combinations of
operations, and try to avoid the former?

~~~
josephg
Oh! I didn’t know that had a name. Yes - it is indeed a topological sort. At
least the way I’ve implemented it, the performance problem is that moving an
operation from position K to position N in the output requires O(|K-N|) steps.
CRDTs can do this work in O(log(|K-N|)) steps.

