
Atom's new concurrency-friendly buffer implementation - madspindel
http://blog.atom.io/2017/10/12/atoms-new-buffer-implementation.html
======
trishume
[https://github.com/google/xi-editor](https://github.com/google/xi-editor)
uses many similar ideas to implement concurrent editing by plugins and other
devices. In fact nathansobo, the author of this post, wrote a summary of some
of the work Raph Levien (Xi's primary author) has done on CRDTs, so this is
probably not a coincidence.

Xi also keeps track of changes using a data structure like Patch. It doesn't
use a tree to do so, although we may eventually switch some to use Xi's
generic Rope tree data structure. One interesting difference is that Xi always
creates another "layer" for every edit and never modifies anything. This leads
to lots of layers, but instead of working forwards from a "base text" we work
backwards from the current text. So accessing the current text is very fast,
and accessing a past snapshot may require mapping in reverse past a small
number of edits.

They both solve the same problems though, although Xi's approach of layers for
every edit also enables seamless multi-device concurrent editing via a full
CRDT merge implementation.

If you're interested in reading in detail about the approach Xi uses, I wrote
a big document about it near the end of my internship working on Xi's CRDT:
[https://github.com/google/xi-editor/blob/master/doc/crdt-
det...](https://github.com/google/xi-editor/blob/master/doc/crdt-details.md)

~~~
archagon
Thanks for this! Great writeup. I'm guessing xi's CRDT uses ideas from Raph
Levien's work on combining OT and CRDTs? I've only given Raph's papers a
cursory read, but there seem to be a lot of similarities and it looks like he
was mentoring you. Does it have an official name yet?

Incidentally, I've also been working on an in-depth article on a particular
CRDT: Victor Grishchenko's Causal Trees[1]. The original paper only uses the
data structure for text, but I believe it can be extended to a variety of
other data types. In brief, the basic convergent structure is a tree of
immutable "atoms". This tree is stored in memory as a DFS in-order traversal.
I treat each atom as an atomic operation ("insert char" or "delete char" in
case of strings), and thus, every user action involves an O(N) mutation of the
tree array to insert the new atom in the correct spot, followed by an O(N) re-
read of the array to update the user-facing data. This requires the effect of
each atom to be O(1) when read in DFS order, but with strings, this follows
naturally. A CT string in memory looks a whole lot like your "union string"
with explicit delete ops instead of tombstones, e.g. ["ins D","ins O","ins
U","del","ins G"] which parses as DOG in O(N), except each atom does in fact
have a unique ID and holds a reference to its causal atom — the character
intended to be on its immediate left. (So no need for crazy coordinate
transforms!) As with xi, you also get a full revision history which can be
garbage collected if needed.

My implementation is a work-in-progress, but I've already extended it to basic
Bézier drawing and I'm really curious to see how it scales compared to a more
OT-like CRDT such as yours. My guess is that CT won't work very well with
large files on account of the full O(N) read/write requirement as well as the
~10x memory cost vs. C strings, but on the other hand, it seems that very few
other CRDTs can be painlessly tweaked for use with other data types, and this
would be a worthwhile tradeoff for me. CT also benefits from a mostly O(N)
merge and O(N) past revision viewing, provided you keep to the reconstruct-
data-in-linear-time-on-DFS-traversal invariant.

Anyway, CRDTs are super cool! Interesting to see them being used in local
contexts, but it totally makes sense, especially since consistency w/garbage
collection isn't a concern.

[1]:
[https://ai2-s2-pdfs.s3.amazonaws.com/6534/c371ef78979d7ed84b...](https://ai2-s2-pdfs.s3.amazonaws.com/6534/c371ef78979d7ed84b6dc19f4fd529caab43.pdf)

[2]: [https://github.com/archagon/crdt-
playground](https://github.com/archagon/crdt-playground)

~~~
trishume
Glad you found my writeup useful, and your CT stuff sounds interesting. You
may also find [https://github.com/google/xi-
editor/issues/250](https://github.com/google/xi-editor/issues/250)
interesting, especially the TODO list of optimizations.

Yes the Xi CRDT (closest we have to a name) is based on Raph's work, plus some
discussions we had and about one week of me standing in front of a whiteboard
figuring out how to apply the concepts to doing a merge on Xi's specific CRDT
structures.

Currently Xi also stores many bytes for each character typed, but one of our
planned optimizations is to delta-encode chunks of old revisions. The
operations are stored with fields that are designed to delta-encode
efficiently so for runs of normal insertions we should be able to get down to
~2 bytes per character.

------
bhhaskin
Thanks for the post. Interesting stuff. Atom has come a long way since initial
launch, and it still has a long way to go, but its great to see these kinds of
improvements.

------
pjmlp
Bit by bit, Atom will become a native editor.

~~~
majidazimi
Sooner or later everything has to become native. JS/Ruby/PHP/... shit is only
for initial launch.

~~~
callumjones
How do web programming languages like PHP and Ruby become native?

~~~
majidazimi
When they decide to abandon:

1\. Dynamic typing

2\. Global interpreter lock

3\. Automatic garbage collection

4\. Interpreting the code (instead of compiling to a single binary).

5\. Virtual machine shit

It is just the matter of time, until the team accepts the pain to learn "How
to do stuff in C/C++/Rust?".

~~~
ninegunpi
>3\. Automatic garbage collection >4\. Interpreting the code (instead of
compiling to a single binary). >5\. Virtual machine shit

When feeling like the rest of the world is running in a direction of utter
madness, it makes sense to re-read "Worse is better" [1] once in a while.
However, I agree, DT and GIL aren't helping anything but writing poor code
faster and blinder.

[1] [https://www.jwz.org/doc/worse-is-
better.html](https://www.jwz.org/doc/worse-is-better.html)

~~~
theprotocol
FYI: When clicked, that link takes you to an image of a testicle in an egg cup
captioned with a sarcastic comment about Hacker News.

You have to copy and paste the url.

~~~
WillyOnWheels
> FYI: When clicked, that link takes you to an image of a testicle in an egg
> cup captioned with a sarcastic comment about Hacker News.

jwz has strong opinions about Hacker News

------
tekkk
If this gets rid off the freezing issues
[https://github.com/atom/atom/issues/7275](https://github.com/atom/atom/issues/7275)
then i thank you atom developers for your efforts in fixing it. Although i
already switched to VScode it's good to know that there's now more choices
available.

------
aphextron
Reinventing the 40 year old wheel for the 20th time

~~~
Klathmon
I always hated this metaphor.

I don't know about you, but I sure don't want my new Audi running on stone
wheels.

~~~
LungButta
Your Audi runs?!?

------
CharlesW
Could WebAssembly, at least when fully realized, be used for this kind of
"custom data structure" optimization?

~~~
CyberDildonics
Yes, but even faster would be actual native instructions run directly by your
CPU.

~~~
bicubic
So... webassembly?

~~~
akmittal
Webassembly run on VM for portability.

~~~
Frizi
Which in turn JIT compiles the code down to your specific hardware, possibly
doing some extra optimizations along the way as it cares only about a single
processor at a time. I do think languages like Rust are really superior to
what we have in JS land, but the VMs there are truly a marvel of modern
engineering.

------
_pmf_
> "JavaScript can be quite fast, but fundamentally it’s a scripting language
> with the unavoidable overhead that implies."

I don't even ... but better late than never.

------
astaroth360
Yay, hopefully no more jankiness. Maybe I can actually use Atom again.

------
yAnonymous
Does Atom have a proper side bar yet to show open files? I can't even begin to
understand how anyone can get by with tabs when working on big projects.
Horizontal file navigation is a terrible idea.

~~~
chrisseaton
Horizontal tabs works better for me. I didn't like the side bar of open files
when I tried VS Code.

So it's just an opinion thing - not objectively terrible as you think.

~~~
yAnonymous
It stops being an opinion thing when you are working with so many open files
that you can't read their names in the tabs. That's why I specifically said
"big projects". VS Code gives users the freedom to show files as they want.
Atom doesn't.

~~~
chrisseaton
I work with so many files that I can't read their names in the tabs - so I
scroll the list of tabs horizontally - which is exactly what you do I guess if
the vertical list doesn't fit on the screen. Or I just find the files again in
the project tree. It works fine for me - they can't add everyone's preferred
style to the UI or the project would eternally bloat.

------
CyberDildonics
Atom: reinventing the wheel and never making it round

~~~
always_good
Speaking of reinvention, thanks for the original humour / quip in every
submission with Atom in the title.

~~~
CyberDildonics
Link me what you are talking about

~~~
doesnt_know
I tried Atom and didn't have a positive experience, so I'm not going to type
out a bunch of words defending it. But the "reinvent poorly" comment comes up
all the time on Atom posts. You're not adding any original insight or humor.

Here you go:

[https://www.google.co.nz/search?lr=&as_qdr=all&q=allintext%3...](https://www.google.co.nz/search?lr=&as_qdr=all&q=allintext%3A+atom+editor+reinvent+site%3Anews.ycombinator.com&oq=allintext%3A+atom+editor+reinvent+site%3Anews.ycombinator.com)

~~~
CyberDildonics
I'm missing the part where someone else said the exact same thing I did.

> the "reinvent poorly" comment comes up all the time on Atom posts.

That's likely because they reinvent things and do it poorly. This post is
about yet another instance of that. It isn't something I've seen discussed
before, it is a brand new part that has been reinvented poorly.

