
Optimizing an Important Atom Primitive - mrbogle
http://blog.atom.io/2015/06/16/optimizing-an-important-atom-primitive.html
======
jerf
You know, it's funny how it's 2015 and we're just dripping with raw power on
our developer machines, yet, open a few hundred kilobytes of text and
accidentally invoke a handful of O(n^2) algorithms and _blammo_ , there goes
all your power. Sobering.

Edit: We need a type system which makes O(n^2) algorithms illegal. (Yes... I
know what I just dialed up. You can't see it, but I'm giving a very big ol'
evil grin.)

~~~
userbinator
It's also funny to see these techniques being reinvented and called
optimisations, when perhaps 30 years ago they would not even be considered
optimisations but the only possible way to do it, because anything else would
be _ridiculously slow_ to the point of being unworkable.

 _For example, the find and replace package uses markers to highlight search
results, and if you ran a search for the letter e in a large file, the
excessive time spent updating thousands of markers on every keystroke was
intolerable_

This suggests another problem they have is a _huge_ constant, in addition to
the algorithmic one --- "thousands" shouldn't be something a modern machine
would choke on, if updating positions is a simple operation like increasing
their offset by 1 after a character is inserted. On a CPU that can do _one
billion_ instructions per second (a good OOM estimate, which still
underestimates how fast real CPUs today are by a few times), adding 1 to 10000
variables should not take any perceptible amount of time --- less than a
millisecond.

Going from 800ms to 50ms is a great improvement, but I think that's still
several orders of magnitude slower than what the machine can really do.
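
A rough illustration of the estimate above, as a sketch rather than a measurement of Atom itself: it assumes markers are plain integer offsets held in a list, which is not how Atom stores them, and the name `shift_markers` is made up for the example. Python is far slower than native code, so the measured time only gives a loose upper bound.

```python
import time

def shift_markers(offsets, insert_at, delta=1):
    """Shift every offset at or after insert_at by delta (absolute positions)."""
    return [o + delta if o >= insert_at else o for o in offsets]

# 10,000 hypothetical marker offsets, one every 10 characters.
offsets = list(range(0, 100_000, 10))

start = time.perf_counter()
shifted = shift_markers(offsets, insert_at=500)
elapsed_ms = (time.perf_counter() - start) * 1000

# Back-of-envelope figure from the comment: 10,000 simple operations
# at one billion operations per second.
estimate_ms = 10_000 / 1e9 * 1000

print(f"measured: {elapsed_ms:.3f} ms, estimate: {estimate_ms:.3f} ms")
```

The estimate works out to 0.01 ms, which is the "less than a millisecond" figure in the comment.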

~~~
nathansobo
> This suggests another problem they have is a huge constant, in addition to
> the algorithmic one --- "thousands" shouldn't be something a modern machine
> would choke on, if updating positions is a simple operation like increasing
> their offset by 1 after a character is inserted.

As I mentioned in the article, a big part of the cost was updating a different
interval storage structure as the markers were updated since the absolute
positions approach left us no way to defer it.

------
martanne
A piece table[0] solves this rather elegantly. Since it is a persistent data
structure, a mark can be represented as a pointer into an underlying buffer.
If the corresponding text is deleted, marks are updated automatically, since
the pointer is no longer reachable from the piece chain. Lookup is linear[1]
(or logarithmic if you store pieces in a balanced search tree) in the number
of pieces i.e. non-consecutive editing operations.

[0] [https://github.com/martanne/vis#text-management-using-a-piec...](https://github.com/martanne/vis#text-management-using-a-piece-tablechain)

[1]
[https://github.com/martanne/vis/blob/master/text.c#L1152](https://github.com/martanne/vis/blob/master/text.c#L1152)
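
The idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code from vis: the class and method names (`PieceTable`, `mark`, `resolve`) are invented here, pieces live in a plain list (so lookup is linear), and a mark is a `(buffer, offset)` pair standing in for a pointer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Piece:
    buf: str     # "orig" or "add": which append-only buffer holds the text
    start: int   # offset of the piece inside that buffer
    length: int

class PieceTable:
    def __init__(self, text):
        self.buffers = {"orig": text, "add": ""}
        self.pieces = [Piece("orig", 0, len(text))] if text else []

    def text(self):
        return "".join(self.buffers[p.buf][p.start:p.start + p.length]
                       for p in self.pieces)

    def _split(self, pos):
        """Ensure a piece boundary at document offset pos; return its index."""
        offset = 0
        for i, p in enumerate(self.pieces):
            if pos == offset:
                return i
            if pos < offset + p.length:
                k = pos - offset
                self.pieces[i:i + 1] = [Piece(p.buf, p.start, k),
                                        Piece(p.buf, p.start + k, p.length - k)]
                return i + 1
            offset += p.length
        if pos == offset:
            return len(self.pieces)
        raise IndexError(pos)

    def insert(self, pos, s):
        if not s:
            return
        i = self._split(pos)
        start = len(self.buffers["add"])
        self.buffers["add"] += s          # buffers only ever grow
        self.pieces.insert(i, Piece("add", start, len(s)))

    def delete(self, pos, n):
        i = self._split(pos)
        j = self._split(pos + n)
        del self.pieces[i:j]              # the text itself stays in its buffer

    def mark(self, pos):
        """Return a (buffer, offset) pointer to the character at pos."""
        offset = 0
        for p in self.pieces:
            if pos < offset + p.length:
                return (p.buf, p.start + pos - offset)
            offset += p.length
        raise IndexError(pos)

    def resolve(self, mark):
        """Current document offset of a mark, or None if its text was deleted."""
        buf, off = mark
        offset = 0
        for p in self.pieces:
            if p.buf == buf and p.start <= off < p.start + p.length:
                return offset + off - p.start
            offset += p.length
        return None

pt = PieceTable("hello world")
m = pt.mark(6)            # points at the "w"
pt.insert(0, ">> ")       # text is now ">> hello world"
print(pt.resolve(m))      # 9
pt.delete(9, 5)           # remove "world"
print(pt.resolve(m))      # None
```

Nothing updates the mark on insert or delete; its position is recomputed from the piece list only when asked for, and a mark whose text is no longer referenced by any piece simply fails to resolve.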

~~~
JoeAltmaier
I used an editor in 1980 that was implemented as piece tables! That was a 64K
HP2000. It worked great!

------
dunstad
I tried out Atom a few weeks ago. I loved the UI! Absolutely fantastic,
beautiful, nothing but praise there.

But I had so many issues with stability, and really missed small but important
features that were present in my other editors. I also found that most of the
plugins worked either poorly or sporadically.

In the end, I decided that it was not worth either using Atom or spending time
contributing to it when I have some "pretty close" solutions today. Definitely
looking forward to the 1.0 version though, and hats off to all those spending
their time contributing to it. I'm sure it's going to become something great!

~~~
ngrilly
I'm curious: which features from your other editors did you miss? And which
editors are those, by the way?

~~~
dunstad
For me specifically, Atom was mostly replacing Notepad++. At the time I was
testing out Atom, I found that it didn't consistently save the documents you
had open during your last session. There may or may not have been other
things, that's the biggest one I remember.

Neither it nor Notepad++ could handle gigantic text files, which is a real
shame as well, though not a point of comparison.

~~~
yulaow
There is a plugin called last-session or save-session, if I remember
correctly; it was one of the first I installed.

~~~
fapjacks
Yes, save-session is good, but it has no idea how to handle multiple window
panes. It just throws all your shit in one big ole pane and you're left to
pick up the pieces.

------
Veedrac
I actually just retried Atom yesterday. Aside from the normal complaints (it's
sloooww, undo doesn't affect markers or selections), one thing that struck me
is that markers can't be zero-width. Well, they can but they won't show up.
I'm wondering if this is related to the technique mentioned here - it's
certainly been a pain to work around. Sublime Text even has multiple options
for this (DRAW_EMPTY and DRAW_EMPTY_AS_OVERWRITE).

That said, I'm loving the API design. Coming from Sublime Text, it's a massive
upgrade. The ability to embed literally anything a web browser can render in a
well-designed framework is mindblowing.

~~~
maxbrunsfeld
> one thing that struck me is that markers can't be zero-width. Well, they can
> but they won't show up. I'm wondering if this is related to the technique
> mentioned here

Thanks for bringing this up. It's not a fundamental limitation at all.
Highlights whose markers are zero-width are deliberately filtered out here[1].
This might not be the right move. If you've found it inconvenient, would you
mind opening an issue on atom/atom, and describing your use case?

[1]
[https://github.com/atom/atom/blob/ebc5758d79e421f61f2b6669a8...](https://github.com/atom/atom/blob/ebc5758d79e421f61f2b6669a886a27ee7283816/src/text-editor-presenter.coffee#L1265)

~~~
Veedrac
[https://github.com/atom/atom/issues/7311](https://github.com/atom/atom/issues/7311)

Thanks in turn. :)

------
twic
Didn't the Xanadu project solve this problem in 1972?

[https://en.wikipedia.org/wiki/Enfilade_%28Xanadu%29](https://en.wikipedia.org/wiki/Enfilade_%28Xanadu%29)

Solve it, keep it secret, and then fail to properly write about it to this
day.

------
drewm1980
I really, really, don't get the whole "implement everything using web
technologies" thing. As an outsider from that dev ecosystem it looks like the
youtube videos you see of people implementing electronic circuits in
Minecraft.

~~~
CJefferson
It makes more sense (to me) than "implement everything in a monospace
text-based terminal you can't resize, hacking around years of horrible different
methods of repositioning cursor / changing colour / setting title".

Of course, one could just write a native interface for every OS, but then that
adds a new layer of complexity. HTML & Javascript engines are nowadays well
optimised and available everywhere, making them the only real choice as a
replacement for terminals.

------
Erwin
If you thought this was an interesting article, here's the obligatory link to
just about the only book on crafting a text editor, "Craft of Text Editing":
[http://www.finseth.com/craft/](http://www.finseth.com/craft/)

------
octref
Recently I learned all contributors will receive a gift for Atom pre-1.0, and
when I asked a GitHub staff member when I would receive the gift (I'm moving
during this summer) he mentioned it would be sent out in early July. I guess we
can expect a pre-1.0 before August.

One of the main remaining pieces of functionality to be implemented is good
support for large files. Looking at this issue [1], it seems the Atom team is
making some progress but there are still some problems to be tackled.

In 0.208.0 (released 7 days ago) they mentioned in the changelog _Atom now
opens files larger than 2MB with syntax highlighting, soft wrap, and folds
disabled. We'll work on raising the limits with these features enabled moving
forward_. I'm a little bit disappointed at the progress, as you could already
open large files with these features disabled a long time ago through a
package, "view-tail-large-files".

Just updated to 0.209.0 and using ember.js (1.9 MB) to test. Editing/scrolling
has some delays but it's better than previous versions.

Good luck Atom team!

[1]:
[https://github.com/atom/atom/issues/307#event-325455529](https://github.com/atom/atom/issues/307#event-325455529)

------
revelation
Yet, the onKeyDown handler still takes 50ms. Are you kidding me? You can push
a billion tris in that time.

~~~
nathansobo
Yeah, still more work to do. There are other things that are slow in that
keydown event. We'll get there.

------
ohitsdom
Appreciate the candidness of the team writing about their naive approach.
Definitely would have been a simpler fix to just search the currently visible
text, but I'm glad they fixed the root issue to make markers more efficient
for all.

------
alexchamberlain
What is the data structure used for the text itself? A rope? The markers could
be stored as offsets to the substrings themselves.

------
msoad
This kind of knowledge and experience has existed on the Microsoft campus for
years thanks to the Visual Studio team. That's why Code is much more efficient.
I only wish it were open source so I could totally move away from Sublime Text.

~~~
ZenoArrow
I thought VSC (Visual Studio Code) and Atom were based on the same codebase?
I'm sure there are differences in functionality, but if open source is a
prerequisite for you I'd suggest helping Atom grow rather than waiting on a
licence change for VSC.

~~~
carussell
To add to chowyuncat's comment:

Electron is not an editor itself. The Atom editor is built as an Electron app,
and so is Visual Studio Code, but Electron is not an editor. It's not even an
editor framework. When you hear "Electron", think "Cordova for the desktop".
Sort of.

That seems to be the extent of the similarity between the two--that they're
each packaged as essentially a single page app for a Chromium-based,
site-specific browser runtime.

For more info on VSC, see castell's comment from last week[1] and this post[2]
from Scott Hanselman a couple years back (including the comments).

1\.
[https://news.ycombinator.com/item?id=9691289](https://news.ycombinator.com/item?id=9691289)

2\.
[http://www.hanselman.com/blog/ARichNewJavaScriptCodeEditorSp...](http://www.hanselman.com/blog/ARichNewJavaScriptCodeEditorSpreadingToSeveralMicrosoftWebSites.aspx)

------
romaniv
This reminds me of how I re-implemented nested sets in relational databases as
spans in a "coordinate" system.

    
    
      | Root              |
      | Node        | Node|
      | Node | Node |
    

I stored only "X" and "Y" coordinates for every node, so you had to read
"next" node in a row to get current node's "size".

It was a bit more human-readable when looking at the data. More importantly,
it reduced (on average) the number of nodes I needed to update on insert
compared to nested set and gave an easy way of retrieving immediate children.
But you still had to "move over" all the nodes "right" of the one you're
inserting.

The structure in the article looks eerily similar. I wonder whether it's
somehow possible to apply GitHub's optimization to this "coordinate" based
schema and make it relative without messing up the benefits of column
indexing. Hm...

~~~
ubercore
Sounds like a version of
[http://www.sitepoint.com/hierarchical-data-database-2/](http://www.sitepoint.com/hierarchical-data-database-2/)?

~~~
romaniv
Isn't that the nested set model mentioned in the post itself?

------
imslavko
Vim also has a similar optimization: when a file changes, Vim only runs syntax
highlighter on a visible part of the text + some buffer in both directions.

~~~
plorkyeran
Which unfortunately tends to result in everything getting highlighted as if it
was a string literal if you have any multi-line strings anywhere.

~~~
serve_yay
Yeah, I don't think there's a good way of highlighting part of a code file,
what you're looking at depends on everything before and (to some extent) after
it. You have to do the whole thing.

~~~
mpu
The 'extent' you talk about is exactly the lexer (parser) state, you just have
to properly serialize it for the beginning of the buffer to get cheap
redisplays. It's not rocket science but almost no editor got it right.
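
To make that concrete, here is a toy sketch, assuming a made-up language where a double quote toggles string state and strings may span lines. The names `lex_line` and `Highlighter` are hypothetical and not taken from any editor. The lexer state at the start of every line is cached, so highlighting a visible range can start from the cached state instead of from the top of the file.

```python
def lex_line(line, in_string):
    """Lex one line from a given entry state; return (spans, exit_state)."""
    spans, start = [], 0
    for i, ch in enumerate(line):
        if ch != '"':
            continue
        if in_string:
            spans.append(("string", start, i + 1))  # include the closing quote
            start = i + 1
        else:
            if i > start:
                spans.append(("code", start, i))
            start = i                               # string starts at the quote
        in_string = not in_string
    if start < len(line):
        spans.append(("string" if in_string else "code", start, len(line)))
    return spans, in_string

class Highlighter:
    def __init__(self, lines):
        self.lines = lines
        # entry_state[i] is the lexer state at the start of line i.
        self.entry_state = [False]
        for line in lines:
            _, state = lex_line(line, self.entry_state[-1])
            self.entry_state.append(state)

    def highlight(self, first, last):
        """Highlight only lines[first:last], resuming from the cached state."""
        state = self.entry_state[first]
        out = []
        for line in self.lines[first:last]:
            spans, state = lex_line(line, state)
            out.append(spans)
        return out

lines = ['x = "multi', 'line string', 'end" + y']
h = Highlighter(lines)
print(h.highlight(1, 2))   # [[('string', 0, 11)]]
```

The middle line is correctly treated as part of a string even though only that line was lexed. After an edit, a real implementation would re-lex from the changed line and stop as soon as a line's exit state matches the cached one; that part is left out here.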

~~~
porges
Yi is one editor that does incremental parsing correctly:
[https://github.com/yi-editor/yi](https://github.com/yi-editor/yi)

------
caiob
Does it open files >2mb yet? My terminal vim does.

~~~
maxbrunsfeld
Yes it does. Currently, syntax-highlighting and soft-wrap are disabled for
these files. We're continuing to work on optimizations and structural
improvements to the editor that will allow it to support arbitrarily large
files with the full feature set.

------
asQuirreL
Hmmm... So the article seems to suggest that for every insertion of a
character, a log time lookup is made. Is that really the case? If so, why is
the leaf node that the cursor is in not saved? If you were to use a B+-tree
implementation then you would already have access to neighbour pointers for
rebalancing purposes, making the majority of incremental changes very cheap
(constant time). This is just a thought, there may be good reasons why it's
not possible.

------
baldfat
Atom is still a hog on my main programming machine. It makes it unusable for
me still.

It is an OLD i3 Dell desktop from 6 years ago.

~~~
davnicwil
Genuinely intrigued to keep seeing this persistent complaint with Atom.

I've been running it for ages on an MBP with a 2.6GHz i5 and 8GB RAM. It's
lightning fast.

 _Very_ occasionally it will appear to be hogging CPU - typing is slow and it
might hang for a couple of seconds when you're navigating around. This only
seems to happen when I've had it running for days if not weeks without a
restart. I just restart it and it's back to being lightning fast.

I really like Atom and would be pretty upset if it stopped working well for
me, so would be curious to know what kinds of things people suspect cause it
to have issues.

Is it just that my machine is powerful enough to not notice, and it's only
really a problem on less powerful machines? Is it that I don't run many
plugins, and installing many of them tends to bring it down? Or does it run
well on OS X but not on other platforms?

~~~
maxbrunsfeld
Packages can definitely have a significant impact on Atom's performance. You
can observe all kinds of events, and binding a _slow_ handler to certain
events like text changes and cursor movements can make the editor feel
sluggish.

It's important for Atom's extensibility that these kinds of events are
provided, because it allows many major editor features to be implemented as
separate packages, but it means that naively-implemented packages can really
slow things down.

Many Atom packages are pretty new, and their authors may not have put a lot of
effort into optimization yet. In the past few months, the team has put a lot
of effort into stabilizing and solidifying our APIs. We're hoping that now
that the APIs are stable, the package ecosystem will really start to mature.

~~~
seanp2k2
What about a built-in mechanism to let you inspect which plugins are causing
the most delay, e.g. during a measuring period? Is that already a thing?

~~~
maxbrunsfeld
The best tool for that (and any profiling in Atom) is chromium's built-in
profiler. See the flame graphs in the blog post.

------
z3t4
One thing I love about vanilla JS is that you can both set and get with the
same property. I wonder if having both setters and getters is enforced by
CoffeeScript or a design decision of the Atom team!?

