
The Xi Text Engine CRDT - jxub
https://github.com/google/xi-editor/blob/e8065a3993b80af0aadbca0e50602125d60e4e38/doc/crdt-details.md
======
archagon
If anyone’s interested, I wrote a long article a few months ago on (what I
believe to be) the ultimate string CRDT, among other things:
[http://archagon.net/blog/2018/03/24/data-laced-with-
history/](http://archagon.net/blog/2018/03/24/data-laced-with-history/)

I really, really love the work that Levien and Hume have put into the Xi CRDT,
and how open they’ve been with their research. IMHO, though, it’s a bit too
“academic” and clever for any sort of general purpose use, though it may well
be ideally suited for its stated purpose. There even appears to be a limit to
the amount of rewinding you can do, which points to O(n^2) complexity for
conflict resolution: a huge problem for offline-first development, where
concurrent timelines don’t really have any limit on the length of their
diverging histories. (I might be mistaken about this—please correct me if I’m
wrong.)

Still a very satisfying algorithm, though! You can see more of Raph Levien’s
thinking here: [https://medium.com/@raphlinus/towards-a-unified-theory-of-
op...](https://medium.com/@raphlinus/towards-a-unified-theory-of-operational-
transformation-and-crdt-70485876f72f)

~~~
repsilat
> offline-first development, where concurrent timelines don’t really have any
> limit on the length of their diverging histories

Is there really great utility in automatically merging wildly divergent
documents? Being able to send your queued changes and agree on a new state of
the world is lovely, but for most of the circumstances I can imagine the ideal
new "state of the world" is "please fix this merge conflict", not "here's my
best guess at a reasonable state based on the semantics of some data
structure."

Maybe it's domain specific. Maybe in code you want merge conflicts and in some
kinds of rich text docs you just want a best-guess auto-merge. Even then,
though, I think it would depend a lot on what the document is for. Maybe blog
posts and company culture docs should get auto-merge and legal contracts
should get merge conflicts.

~~~
archagon
Yes, there is great utility in this approach! For one, most concurrent text
editing, rich or not, is actually quite easy to merge in 99% of cases. But
even that aside, when you have data structures or document formats that don't
ever have to resolve conflicts—or, more accurately, when conflict resolution
becomes something that's naturally done in app-space, on the edges, and not as
part of the network layer—then you open up the possibility of offline-capable,
collaborative apps that require zero coordination between nodes. A node with
changes only has to _deliver_ their content to interested servers and peers,
not engage in lengthy conversations about who has the definitive version,
whether somebody has to revert, etc. You could build a Google Docs clone that
synced entirely over Dropbox!

And in any case, the "merge" part would still be up to the app. The data
structures will always be able to merge, sure, but the app could still throw
up a dialog box on wildly divergent changes saying, "Hey, some crazy stuff
happened here—want to look it over and commit manually?" (Where "commit" means
generating new changes to overwrite any undesired changes, since it's not
possible to literally undo.) You could argue that we've simply implemented ad-
hoc coordination protocols in the app layer at that point, but IMO, that's
exactly where they belong in cases where there's no authoritative store for
your data. (Which is really the case with mobile, offline-first development.)
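
To make the "only has to deliver" point concrete, here is a minimal sketch, not anyone's actual implementation. It assumes a toy last-writer-wins map where each change is a hypothetical `(key, value, stamp)` tuple and stamps are unique `(logical_clock, node_id)` pairs. The names `apply_change` and `replay` are made up for illustration. Changes can arrive in any order, or more than once, and every node ends up in the same state; "reverting" is just a newer change.

```python
from itertools import permutations

# Hypothetical change format: (key, value, stamp), where stamp is a
# (logical_clock, node_id) pair assumed to be unique per change.
def apply_change(state, change):
    """Fold one change into a last-writer-wins map and return the new map."""
    key, value, stamp = change
    current = state.get(key)
    if current is None or stamp > current[1]:
        return {**state, key: (value, stamp)}
    return state  # older or duplicate change: nothing to do

def replay(changes):
    """Apply changes in whatever order they happened to arrive."""
    state = {}
    for change in changes:
        state = apply_change(state, change)
    return state

changes = [
    ("title", "Draft", (1, "alice")),
    ("title", "Final", (2, "bob")),
    ("body", "Hello", (1, "bob")),
]

# Every arrival order, each with one change delivered twice, converges.
results = [replay(order + order[:1]) for order in permutations(changes)]
assert all(r == results[0] for r in results)

# "Undoing" bob's title is just another change with a newer stamp.
fix = ("title", "Draft", (3, "alice"))
assert replay(changes + [fix])["title"][0] == "Draft"
```

Nothing here needs a conversation between nodes, so the transport could plausibly be a shared folder of change files; whether to flag a surprising result to the user stays an app-level decision.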

------
okket
Previous discussions about the Xi editor and rope science:

[https://news.ycombinator.com/item?id=17109930](https://news.ycombinator.com/item?id=17109930)
(3 months ago, 47 comments)

[https://news.ycombinator.com/item?id=16267202](https://news.ycombinator.com/item?id=16267202)
(6 months ago, 295 comments)

[https://news.ycombinator.com/item?id=14129543](https://news.ycombinator.com/item?id=14129543)
(a year ago, 60 comments)

[https://news.ycombinator.com/item?id=11576527](https://news.ycombinator.com/item?id=11576527)
(2 years ago, 177 comments)

------
colemickens
Also in this space, some Atom devs are working on a new editor/engine in Rust
and have recently shifted focus to CRDT as a way to get collaborative editing
and advanced SCM-like scenarios.

The editor is called 'xray' and the CRDT tech is Eon. They have some info here
and you can find more in the repo/branches.
[https://github.com/atom/xray/blob/master/docs/updates/2018_0...](https://github.com/atom/xray/blob/master/docs/updates/2018_05_28.md)

If you're into the client/server model of Xi, xray is targeting the same,
including an in-browser experience connecting to a remote backend. Similarly,
there is Theia-IDE which actually seems the most advanced in terms of a
functional in-browser editor with a client/server model.

I think these tools are going to enable entire new generations of programmers
on super low end hardware where their editor services and toolchains are
running in a remote DC.

There are others in this space with similar tech, but most seem focused on
very specific niches and use cases. If there are other tools that hit the
collaborative editing, CRDT, and in-browser experience points, I'd love to
hear about them.

~~~
iainmerrick
Hmm, what does low-end hardware have to do with it? I can’t think of any
challenge in text editing that requires beefy hardware.

Coordinating multiple editors is tricky, yes, but it doesn’t need fast
hardware, just good software and ideally a reliable network.

Editing text on a phone is hard, but that’s a UI problem -- it’s the small
screen and lack of a keyboard. Most phones these days have very capable CPUs
and plenty of memory.

(I agree that this technology is very cool, though! I’m just curious why you
pick out that low-end use case.)

~~~
colemickens
The point is that the frontend is a "thin" JavaScript UI rendered in the
browser while the entire real dev environment is remote -- LSP (Language Server
Protocol) plugins (autocomplete, semantic highlighting, nav to
reference/definition, etc.), other plugins, the project's toolchain, etc., are
running in a container/pod on a beefy machine.

Theia (and GitPod.io) will give you this today and it is compelling. GitPod
gives you a single button on PRs/Issues that drops you in a dev environment,
ready to build and test at the click of a button. No cloning, no installing a
toolchain, etc.

If Rust and Rust Language Server are running in a container with Theia, this
means I can use a Chromebook-style device for serious, real development
without having to enable dev mode or even Linux apps. Every machine in the
world becomes a real potential development environment.

Theia even has (or is about to have) debug protocol support too. A real, full
IDE running on a remote DC, accessible from your browser. (If you follow what
the Theia and Che devs are doing, they're trying to support the full VS Code
API... Which is SUPER exciting!)

(Note, with the level I'm speaking about here, the CRDT is a bit of an
implementation detail, but it's useful for collaborative editing and in xray's
case, syncing state b/w the browser client and the backend and the underlying
SCM system.)

------
zokier
Does anyone know if there has been any work to apply CRDTs for collaborative
editing of source code _AST_, especially for something more complex than
sexprs? I imagine that could be neat, but also have some pitfalls.

~~~
arianvanp
Yes definitely. I'm currently doing research in this field. And there are
others doing research in this too
([http://www.expressionsofchange.org](http://www.expressionsofchange.org))

------
bkase
I enjoyed the post, but there's something that bothers me about using the name
CRDT.

A CRDT is just what mathematicians (and functional programmers) call a
Semilattice[1], right? In general, I find it frustrating when people make up
new names for existing mathematical concepts because it deprives others of
learning and seeing the big picture. Does this resonate with anyone here?

~~~
rntz
CRDTs and semilattices are definitely connected, but they're not quite the
same thing. A CRDT is a data structure that can be merged across the network
in a way that's commutative, associative, and idempotent. That much is exactly
the same as a semilattice.

But when people talk about "string CRDTs", the underlying semilattice, the
data structure that gets merged, isn't a string; it's usually something rather
more complicated. Then there's a _function_ which interprets that more complex
data structure as the string that the user or application really cares about.

So a CRDT is a semilattice equipped with an interpretation function.
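
A minimal sketch of that split, under loud assumptions: the `Doc` type, the fractional positions, and the element ids are all invented for illustration and are not how Xi or any particular library represents text. The merged state is a pair of grow-only sets (set union is the semilattice join), and `interpret` is the separate function that reads a string out of it.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Doc:
    # Hypothetical state: both fields only ever grow.
    inserted: frozenset = frozenset()  # {(position, element_id, char)}
    deleted: frozenset = frozenset()   # {element_id}, i.e. tombstones

def merge(a, b):
    """Semilattice join: union of both grow-only sets."""
    return Doc(a.inserted | b.inserted, a.deleted | b.deleted)

def interpret(doc):
    """Read the user-visible string out of the merged state."""
    live = sorted(e for e in doc.inserted if e[1] not in doc.deleted)
    return "".join(char for _, _, char in live)

base = Doc(frozenset({(Fraction(1), "a1", "c"), (Fraction(2), "a2", "t")}))
# alice inserts "a" between the two existing characters.
alice = merge(base, Doc(inserted=frozenset({(Fraction(3, 2), "a3", "a")})))
# bob deletes "t" and appends "b".
bob = merge(base, Doc(inserted=frozenset({(Fraction(3), "b1", "b")}),
                      deleted=frozenset({"a2"})))
merged = merge(alice, bob)

assert merge(alice, bob) == merge(bob, alice)                           # commutative
assert merge(merge(base, alice), bob) == merge(base, merge(alice, bob))  # associative
assert merge(merged, merged) == merged                                  # idempotent
assert (interpret(alice), interpret(bob), interpret(merged)) == ("cat", "cb", "cab")
```

Note that the strings themselves are never merged; only the sets are, and the string falls out of the interpretation step.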

But it gets even more complicated, because there are many ways to do CRDTs in
practice. Rather than gossiping your _entire_ complex data structure across
the network ("state-based CRDTs"), usually CRDTs try to only send what's
necessary. This leads to optimisations like delta-based and operation-based
CRDTs. These optimisations are crucial for real-world use of CRDTs, but their
connection to semilattice theory is not immediately clear to me. (That doesn't
mean there isn't one, though!)
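
For the delta-based flavour at least, one way to see a connection is sketched below with a toy grow-only counter. This is an illustration under assumptions, not a claim about any real library: `join`, `increment`, and `value` are hypothetical names, and a "delta" here is simply a small state fragment that gets merged with the same join as a full state.

```python
def join(a, b):
    """Semilattice join for grow-only counters: pointwise maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0))
            for node in a.keys() | b.keys()}

def increment(state, node):
    """Return (new_state, delta); the delta holds only the entry that changed."""
    delta = {node: state.get(node, 0) + 1}
    return join(state, delta), delta

def value(state):
    """Interpretation function: the counter's total."""
    return sum(state.values())

alice = {}
alice, d1 = increment(alice, "alice")
alice, d2 = increment(alice, "alice")
bob = {"bob": 5}

# State-based: ship alice's whole state to bob.
full = join(bob, alice)

# Delta-based: ship only the small deltas; here they arrive out of order
# and d2 arrives twice, yet the same join absorbs them.
via_deltas = join(join(join(bob, d2), d1), d2)

assert full == via_deltas
assert value(full) == 7
```

Operation-based designs that send something like "+1" instead would not tolerate the duplicate delivery above without extra delivery guarantees, which may be part of why their link to the semilattice picture is less direct.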

In any case, the story is a little more complicated than "CRDTs are just a
semilattice". I do wish more people knew about the connection, though.

------
kuwze
While JSON isn't perfect, I hope they base this around the JSON CRDT[0] so it
can handle nested structures well.

Also they might not need tombstones[1].

[0]:
[https://news.ycombinator.com/item?id=12303100](https://news.ycombinator.com/item?id=12303100)

[1]:
[https://news.ycombinator.com/item?id=12303467](https://news.ycombinator.com/item?id=12303467)

------
eddyb
This is an old link, but I could find [https://github.com/google/xi-
editor/blob/master/docs/docs/cr...](https://github.com/google/xi-
editor/blob/master/docs/docs/crdt.md) on master.

------
brunoqc
I wonder when the Xi editor will be ready to use. I can't wait.

------
vesak
Why's it under google's namespace in github? Was it there always?

~~~
adwhit
Google allows its employees to put their 20% projects on the google github,
provided they include a disclaimer in the README. It doesn't mean much but perhaps
it's good for publicity. Xi has always been in there.

------
vasili111
How does the Xi editor compare to Vi and Emacs?

~~~
dangom
Vi and Emacs are older projects with a larger community, and thus there's a
ton of plugins out there even for niche applications. Xi is much newer, and
attempts to solve many of the problems that Vi and Emacs face (examples: in
Vi's case the difficulty of writing extensions, and in Emacs' the difficulty
in maintaining and extending the core of the editor written in C). Xi also
focuses on areas that weren't considered that important when Emacs/Vim were
first developed, such as asynchronous operations and collaborative editing.

------
abakus
This editor is probably gonna be banned in China if it gets noticed by the
government. Hell, they banned Winnie the Pooh.

