
Are CRDTs suitable for shared editing? - signa11
https://blog.kevinjahns.de/are-crdts-suitable-for-shared-editing/
======
lewisjoe
I’m part of the team that makes Zoho Writer (Google Docs alternative) -
[https://writer.zoho.com](https://writer.zoho.com)

Back in 2009, we wrote our real-time collaboration component using Operational
Transformations (OT). But when CRDTs started to gain the spotlight, these are
the reasons why they didn't warrant a rewrite of our stack:

#1 Memory issues with tombstones. Marking deletions as tombstones carries the
cost of maintaining them throughout the session.

#2 CRDTs were perfect for plain strings and arbitrary key/value pairs. But
when it comes to schematic JSON that carries semantic value, CRDTs were an
added overhead. For example, try making a collaborative HTML editor with
support for table operations. It is very likely you will end up with a table
structure that is invalid to render. In OT, analysing, modifying, or
cancelling ops is much easier.
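
A hypothetical illustration of the table problem (made-up types, not Zoho's
actual data model): two individually valid concurrent edits can merge into a
structure no renderer accepts, because a naive element-wise CRDT merge keeps
both edits without understanding the table schema.

```ts
type Table = string[][]; // rows of cells

const base: Table = [
  ['a1', 'b1'],
  ['a2', 'b2'],
];

// Peer 1 appends a column, touching every row it can see.
const peer1: Table = base.map((row) => [...row, 'c']);

// Peer 2 concurrently appends a row, sized for the old two-column schema.
const newRow = ['a3', 'b3'];

// A naive element-wise merge keeps both edits:
const merged: Table = [...peer1, newRow];
console.log(merged.map((r) => r.length)); // [3, 3, 2] -> ragged, invalid table
```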

#3 Since the software is primarily a cloud document editor, a central server
is necessary anyway. So why not use the server for efficient version
management and operation sequencing as well? Why prefer CRDTs, whose bulk of
complexity comes from eliminating the need for a central system?

Such practical reasons have kept us from venturing into CRDTs. As of now,
common editing platforms like Google Docs, Zoho Writer, CKEditor, ProseMirror,
Quill, and CodeMirror all work with OT instead of CRDTs for collaborative
editing.

~~~
cryptonector
Also, if the monoids used in a CRDT implementation do not align with the
mental model users have for what they're doing (hint: they do not), then the
result will be artifacts the users don't like. Single-user text editing is
amenable to using ropes and monoids internally because none of that leaks into
the UI/UX, but multi-user text editing is not: in the multi-user case the
monoidal interactions become evident and confusing.

Or at least that's the lesson I took from all the xi blogs I've read.

~~~
hinkley
This is a problem we need to solve with code editors and version control as
well.

Patches are always a little bit garbage, and can cause extra problems with
3-way merges as a result.

~~~
Chris2048
I feel this is probably a language-level problem, and solving it _only_ at
the editor/VC level will be complicated.

At the same time, changes at the language level would require changes at the
editor/VC level too.

As an example: a MIX-type syntax could separate classes and class methods into
different files, and combine them back again. This way, moving a class method
around in a class won't be seen as removing a method and creating a new one:
the file gives the specific method an identity, and a means of differentiating
between code movement and code alteration.

------
raphlinus
In xi, there are two issues. One is the cost of the CRDT representation, the
other is whether it accurately represents what you're trying to do. Classical
implementations of CRDT use a heavyweight node per character in the text, each
with its own unique and permanent id. In my research in xi, it became clear to
me that more compressed representations are possible, and the xi CRDT is one
such: it is an "op" object per edit, but the string itself is just a string.
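
A rough sketch in TypeScript (hypothetical types, not xi's actual code) of
the difference between the classical per-character representation and a
compressed op-per-edit one:

```ts
type Id = { site: number; clock: number }; // unique, permanent identity

// Classical per-character CRDT: one heavyweight node per character.
interface CharNode {
  id: Id;            // every single character gets its own id
  origin: Id | null; // the character it was inserted after
  ch: string;        // one character
  deleted: boolean;  // tombstone flag, kept forever for convergence
}

// Compressed representation: one op object per *edit*; the inserted text
// is stored as a plain string, and per-character ids are implicit
// (the first character's id plus an offset into the run).
interface InsertOp {
  id: Id;            // id of the run's first character
  origin: Id | null;
  text: string;      // the whole inserted run
}
```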

Lately there has been other work pursuing these compressed representations:
Chronofold, and I think Martin Kleppmann's Automerge has some memory-efficiency
work, among others.

But at least in the context of xi, this is not where things went wrong. As
I've written about (and the author was kind enough to link), it's because CRDT
merges aren't a good fit for the problems a code editor is trying to solve,
particularly when the "collaborators" are automated processes such as language
servers.

In human collaborative editing, it's important to preserve text that's being
entered, even in the face of conflict. But when a peer is an automated
service, it's much better to drop the edit on the floor and recompute. I'm
simplifying here, as it depends on the service - some are history-insensitive,
some are sensitive to a small window of history (in the case of automatically
inserting indentation, etc), and (speculative) other services may be sensitive
to more history.
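
As a hedged sketch of that policy difference (hypothetical names, not xi's
actual API): human edits are rebased and kept through conflicts, while a
service's edit is cheaper to discard and recompute.

```ts
type Source = 'human' | 'service';
interface Edit { source: Source; apply(): void }

// Provided elsewhere: rebase an edit over concurrent history, then apply it.
declare function rebaseAndApply(e: Edit): void;
// Provided elsewhere: ask the service (e.g. a language server) to rerun.
declare function requestRecompute(): void;

function onConflict(local: Edit): void {
  if (local.source === 'human') {
    rebaseAndApply(local); // never lose what a person typed
  } else {
    requestRecompute();    // drop the stale automated edit; recompute fresh
  }
}
```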

In addition, the CRDT constrains the data model considerably. In other words,
it's unfortunately _not_ a clean abstraction where you can easily add
higher-level layers on top of the CRDT; you always have to design those with
the CRDT in mind (i.e., everything still has to be a monotonic semi-lattice).
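
Concretely, the semilattice requirement means merges must be commutative,
associative, and idempotent, and updates may only move state "upward". A
grow-only set is the textbook minimal example of a type that satisfies this:

```ts
// G-Set: one of the simplest CRDTs. merge() is a join (set union), so
// replicas converge regardless of delivery order or duplication.
class GSet<T> {
  private items = new Set<T>();
  add(x: T): void { this.items.add(x); }  // updates only grow the state
  merge(other: GSet<T>): void {           // commutative, associative, idempotent
    other.items.forEach((x) => this.items.add(x));
  }
  has(x: T): boolean { return this.items.has(x); }
}
// A plain "remove" would make the state non-monotonic and break
// convergence; that is why sequence CRDTs resort to tombstones.
```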

So, as with everything else, it's a tradeoff, and it's a question of weighing
the pros and cons. I'm glad to see work being done to improve CRDT, but even
with a very efficient representation and solid algorithms, the problems with
CRDT would be enough for me not to use them in a code editor.

~~~
kevinjahns
Hi Raph,

your GitHub comment about "Why CRDT didn't work out as well for collaborative
editing in xi-editor" [1] was sent to me several times as an argument not to
use CRDTs. I agree with you that CRDTs might be too complex for the way the
xi-editor used them (although I loved the idea, and appreciate that you tried
it).

But the title of the Hacker News post tells a very different story. So many
people misunderstood WHY CRDTs didn't work out well for the xi-editor.

At the very least, my article shows that CRDTs are well suited for shared
editing. You brought up some valid points about why CRDTs are not well suited
for having different editor components communicate with each other (language
server, indentation, syntax highlighting, ...).

That said, I still think there is a lot of merit in using CRDTs as a data
model for a code editor (or any kind of editor). Not for editor components
concurrently modifying the editor model, but simply as a collaboration-aware
model.

• Marijn considered a CRDT as a data model for CodeMirror 6 because positions
in collaborative software can be better expressed [2]. A position in the Yjs
model is simply a reference to the unique id of the character.

• Even without collaboration, Yjs serves as a highly efficient selective
Undo/Redo manager. Each item on the Undo/Redo stack consumes just a couple of
bytes. Furthermore, most existing implementations don't support selective
Undo/Redo. This comes for free when using Yjs as a data model.

• Some components can work in the background and annotate the CRDT model (not
manipulate it). For example, a code analyzer that runs on a remote computer
could annotate a function and notify the user about potential problems. The
position of the annotation will still be valid if users modify the model
concurrently in a distributed environment. (A sketch of the position and undo
ideas follows after this list.)
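
A minimal sketch of the position and undo ideas, using Yjs' public API (the
specific indices are just for illustration):

```ts
import * as Y from 'yjs'

const doc = new Y.Doc()
const ytext = doc.getText('code')
const undoManager = new Y.UndoManager(ytext) // tracks changes to ytext

ytext.insert(0, 'function foo () {}')

// Pin a position to the character at index 9 (the 'f' of "foo"); the pin
// references that character's unique id, not the numeric offset.
const rel = Y.createRelativePositionFromTypeIndex(ytext, 9)

// A later (or concurrent) edit shifts every numeric offset...
ytext.insert(0, '// comment\n')

// ...but the pinned position still resolves to the same character.
const abs = Y.createAbsolutePositionFromRelativePosition(rel, doc)
console.log(abs?.index) // 20 = 9 + the 11-character prefix

undoManager.undo() // selective undo: reverts the tracked insertions
```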

[1]: [https://news.ycombinator.com/item?id=19886883](https://news.ycombinator.com/item?id=19886883)

[2]: [https://marijnhaverbeke.nl/blog/collaborative-editing-cm.html](https://marijnhaverbeke.nl/blog/collaborative-editing-cm.html)

~~~
raphlinus
I'm not sure what to add. I definitely agree that CRDT is viable for
collaborative editing, I'm just saying people need to be aware of the issues.

Regarding position and undo, these are problems that can also be solved just
fine using OT techniques, and are (imho) simpler when there is a central
authority that can order all revisions into a globally consistent sequence.

I think it's inevitable that people are going to misunderstand arguments,
given the complexity of the underlying space and the paucity of good learning
resources. So thanks for your writeup, it adds to the discussion.

~~~
josephg
Just to add my 2c, having written ShareJS and ShareDB and worked on and off
on OT systems for the past 10 years or so:

After playing with a simple implementation of Martin Kleppmann’s newest
Automerge work, I was wrong to doubt CRDTs. I’m seeing about 6M ops/second in
a prototype text engine in Rust. I have much more to say on this - probably
enough for a blog post. But I think CRDTs are the future for many workloads,
and I have a strong sense of despair seeing how much time I’ve wasted
investing in approaches which won’t feature strongly in the future.

Automerge is really good.

~~~
kevinjahns
> I have a strong sense of despair seeing how much time I’ve wasted investing
> in approaches which won’t feature strongly in the future.

Your comment makes me sad on several levels...

Firstly, ShareJS and its OT types were the first open framework for building
collaborative applications on the web. Your work inspired me to work on shared
editing - I just wanted it to work over WebRTC.

Until 2015, JavaScript engines implemented different garbage collection
approaches that wouldn't allow efficient CRDT implementations (as they need to
handle millions of objects).

The idea of distributed applications on the web only popped up a couple of
years ago. WebRTC didn't even exist when you started ShareJS. Even WebSockets
were still a bit experimental. OT was the right technology at the time.

Lastly, I'd love for you to try out Yjs. The current JavaScript implementation
handles 2.5 million operations / second. A Rust/C implementation would surely
handle more than that as it is not limited by automatic garbage collection.

You can compare Automerge's current performance branch with Yjs' current
implementation in [1]. To be fair, their implementation is not finished. If I
understood correctly, they want to load the compressed format directly into
memory. I played with this idea a few years ago and represented the Yjs model
in an ArrayBuffer. This approach will improve load time, but the performance
will be intolerable in other aspects (e.g. when applying document updates,
transforming the document to a String/JSON, or when computing diffs). The
performance of Yjs is the result of more than five years of research. I still
have a couple of ideas to improve performance significantly, but my article
clearly shows that it is already good enough now.

[1]: [https://github.com/dmonad/crdt-benchmarks/pull/4](https://github.com/dmonad/crdt-benchmarks/pull/4)

------
spankalee
I've implemented operational transform based systems several times, and every
time I check out CRDTs I end up staying with OT.

I find the complexity arguments against OTs to be far, far overblown and
mostly academic, and that OT is, for me, much easier to understand.

The first big critique against OT is transform explosion: with N operation
types you could need N^2 transform functions. Except that not every operation
pair interacts. I usually see just a couple of categories of operations -
text, key/value, tree, and list - and they all only need transforms within a
category.
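
A hypothetical sketch of that observation: transforms are declared per
category, and cross-category pairs commute because they touch disjoint state.

```ts
// Ops cluster into categories; only same-category pairs need transforms.
type TextOp = { kind: 'text'; pos: number; insert?: string; delete?: number };
type KVOp = { kind: 'kv'; key: string; value: unknown };

// Needed: how does op a change when concurrent op b is applied first?
declare function transformText(a: TextOp, b: TextOp): TextOp;
declare function transformKV(a: KVOp, b: KVOp): KVOp;

// Not needed: a TextOp and a KVOp edit disjoint state, so applying them
// in either order yields the same document - no transform required.
```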

The second critique is around sequencing in a distributed system, but I've
also never seen a production system that doesn't have a central coordinating
service. With a star topology, OT sequencing becomes simple. You don't even
need a Lamport clock; you can get away with a simple counter. Buffering at the
clients simplifies things even more.
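
A minimal sketch (hypothetical names) of that star-topology setup: the server
orders ops with a plain counter and transforms each submission against
whatever the submitting client hadn't seen yet.

```ts
type Op = unknown; // e.g. a list of retain/insert/delete components

// Provided elsewhere: transform op a against a concurrent op b.
declare function transform(a: Op, b: Op): Op;

class OTServer {
  private history: Op[] = []; // global order; array index = revision number

  // A client submits an op based on the last revision it has seen.
  receive(op: Op, baseRevision: number): { op: Op; revision: number } {
    // Rebase over every op the client hadn't seen when it made the edit.
    for (const concurrent of this.history.slice(baseRevision)) {
      op = transform(op, concurrent);
    }
    this.history.push(op);
    return { op, revision: this.history.length }; // broadcast to all clients
  }
}
```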

There's a great series of posts on collaborative editing by Marijn Haverbeke,
the author of CodeMirror and ProseMirror, that are designed with OT in mind:
[https://marijnhaverbeke.nl/blog/collaborative-editing-cm.html](https://marijnhaverbeke.nl/blog/collaborative-editing-cm.html)

------
AnthonBerg
Following the golden rule, I always post a link to a series of papers
comparing the theoretical properties of CRDTs and OT – here's the latest one:

 _Real Differences between OT and CRDT under a General Transformation
Framework for Consistency Maintenance in Co-Editors_

Proceedings of the ACM on Human-Computer Interaction 2020

Chengzheng Sun, David Sun, Agustina Ng, Weiwei Cai, Bryden Cho

It’s an evolutionary series, here’s the rest I believe:
[https://arxiv.org/search/cs?query=Sun%2C+Chengzheng&searchty...](https://arxiv.org/search/cs?query=Sun%2C+Chengzheng&searchtype=author&abstracts=show&order=-announced_date_first&size=50)

------
nestorD
Here is the best article I have read on CRDTs. It is the thing that took them
from theory to "ok, I think I can implement this if I ever need it, and I
believe it would be a solid solution":
[http://archagon.net/blog/2018/03/24/data-laced-with-history/](http://archagon.net/blog/2018/03/24/data-laced-with-history/)

------
jamil7
Nice overview of Yjs. I went down a bit of a CRDT rabbit hole a few months ago
before ultimately deciding against using it in a project. Very interesting and
fun stuff though, I keep a close eye on these projects. I was surprised not to
see automerge mentioned but it does show up in the benchmarks. They're
(automerge team) working on a major rewrite of some internal data structures
that will eventually improve performance quite a lot, it also has a Rust and a
Swift port.

------
simias
I don't have an opinion on the core of the issue, but this paragraph left me
perplexed:

>Most CRDTs assign a unique ID to every character that was ever created in the
document. In order to ensure that documents can always converge, the CRDT
model preserves this metadata even when characters are deleted.

The entirety of the C code in my checkout of the Linux kernel amounts to
567464661 characters (counted with wc on all .c and .h files). Assuming a
naive algorithm that assigns a 128-bit UUID to every single one of these
characters, you need a little over 8 GB of RAM for the IDs alone. Of course,
you also have to account for the deleted characters (and for storing the
characters themselves, duh).
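
A quick check of those numbers:

```ts
const chars = 567464661;     // wc count over all .c and .h files
const bytesPerId = 16;       // one 128-bit UUID per character
const gib = (chars * bytesPerId) / 2 ** 30;
console.log(gib.toFixed(2)); // "8.46" GiB - "a little over 8GB" checks out
```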

That's assuming you want to keep track of the entire source code forever,
however; surely in practice you can be massively more efficient by "freezing"
the changes past a certain reasonable delay.

So yeah, that does seem pretty memory-intensive, but in the age of
Electron-based code editors and Docker containers it doesn't seem all that
absurd to me, tbh. My Signal desktop client currently uses almost 300 MB of
RAM to display plain text; I find that a lot less reasonable, quite frankly.

~~~
kevinjahns
Based on the [B4] benchmark, we can predict the size of a Yjs document
representing the complete editing history of the Linux kernel (probably the
largest Git repo ever created): 864 MB. The size of the Git repository is
currently 1.1 GB, so Yjs has the better encoding.

If you loaded the Yjs document containing the editing history of the Linux
kernel into memory, you would use about 13.8 GB of RAM. Of course, you
wouldn't write the complete Linux kernel in a single file. As of 2011, the
project consisted of ~37,000 files. If you represent each file as a separate
Yjs document, you only pay for the files you actually load - a few kilobytes
for a typical file.

The editing history of the Linux kernel is a very interesting benchmark
resource. Maybe I will add it to crdt-benchmarks.

~~~
jcranmer
> the complete editing history of the Linux kernel (probably the largest Git
> repo ever created)

Linux is definitely not the largest git repo ever created [0]. The big
corporate monorepos are definitely larger; I know MS has moved Windows to git,
and claims it to be the largest repo ever created (~300GB as of 2017, per
[1]). Google and Facebook both eschew git, though.

Finding data on the largest open repos is more difficult. The largest classes
of projects are those that develop in monorepos that implement critical
operating system [2] functionality, browser engines, and compiler
implementations. The shortlist I'd make comes out to these projects (in no
particular order):

* gcc

* LLVM

* Mozilla

* Chromium

* Linux

* OpenJDK

I haven't finished downloading all of these repos (my disk is begging me to
stop right now), but it looks like Linux is larger than gecko-dev by a very
thin margin (so a putative gecko-dev that included comm-central with its CVS
history would easily outstrip Linux), and Chromium seems to be an order of
magnitude over both.

[0] To be clear here, I'm mostly thinking in terms of primarily textual
repositories. Repositories with large binary assets are clearly not relevant
for your purposes.

[1] [https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/](https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/),
although
[https://news.ycombinator.com/item?id=14411724](https://news.ycombinator.com/item?id=14411724)
claims that the 300 GB measures the size of the checked-out directory on disk,
not the putative size of a full .git folder.

[2] I'm including both kernel roles as well as key userspace roles. Qt and
Gnome would both be on my list of putative largest repos were they monorepos,
but they appear to use many small repos instead.

------
robmccoll
Has anyone tried a pseudo-distributed OT with a floating server (maybe with
leader election)?

Also, would establishing a hierarchy of OT groups (or some division of
responsibility over the dataset) help with scaling?

It just seems like it's easier to make OT work in practice and that there
might be practical techniques and compromises to overcome its limitations.

------
prepend
It seems like CRDTs would be useful for contact tracing, in that distributed
contact tracers collect data on cases and potential cases. They operate
independently for hours through the day, with occasional network access, or
sync at least at the end of the day.

~~~
kevinjahns
Since there is no concurrency in contact tracing (only you will manipulate
your own data), a CRDT might be an unnecessary overhead. The German
Corona-Warn-App is a decentralized approach to contact tracing. It's awesome,
you should check it out:
[https://github.com/corona-warn-app](https://github.com/corona-warn-app)

~~~
prepend
There is concurrency in contact tracing. In developing nations it’s like a
map-reduce problem: you send out the contact tracers in the morning, and they
gather results and sync up at the end of the day.

Being able to sync better through the day means reducing duplicate checks on
families and adding in new contacts that could be checked by others nearby.
Stuff like that.

Thanks for sharing this app. I haven’t seen any data showing these user-run
apps are useful. Even if they got enough users to run them, the noise is too
high to actually do anything with the results.

Although I’m optimistic that someone will figure it out and publish some
papers on it.

------
MrStonedOne
>Conflict-free Replicated Data Type (CRDT)

For anybody wondering.

