Optimizing an Important Atom Primitive (atom.io)
294 points by mrbogle on June 16, 2015 | 90 comments



You know, it's funny how it's 2015 and we're just dripping with raw power on our developer machines, yet, open a few hundred kilobytes of text and accidentally invoke a handful of O(n^2) algorithms and blammo, there goes all your power. Sobering.

Edit: We need a type system which makes O(n^2) algorithms illegal. (Yes... I know what I just dialed up. You can't see it, but I'm giving a very big ol' evil grin.)


It's also funny to see these techniques being reinvented and called optimisations, when perhaps 30 years ago they would not even be considered optimisations but the only possible way to do it, because anything else would be ridiculously slow to the point of being unworkable.

> For example, the find and replace package uses markers to highlight search results, and if you ran a search for the letter e in a large file, the excessive time spent updating thousands of markers on every keystroke was intolerable.

This suggests another problem they have is a huge constant, in addition to the algorithmic one --- "thousands" shouldn't be something a modern machine would choke on, if updating positions is a simple operation like increasing their offset by 1 after a character is inserted. On a CPU that can do one billion instructions per second (a good OOM estimate, which still underestimates how fast real CPUs today are by a few times), adding 1 to 10000 variables should not take any perceptible amount of time --- less than a millisecond.
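To put a rough number on that claim, here is a minimal sketch, assuming markers are nothing more than a flat list of absolute integer offsets (a simplification; this is not how Atom stores them). The `shift_markers` helper and the marker layout are hypothetical. Even in interpreted Python, shifting 10,000 offsets should stay well below anything perceptible, though the exact timing depends on the machine.

```python
import time


def shift_markers(offsets, insert_at, length=1):
    """Shift every marker offset at or after insert_at by length, in place."""
    for i, off in enumerate(offsets):
        if off >= insert_at:
            offsets[i] = off + length


# 10,000 hypothetical marker offsets: 0, 10, 20, ..., 99990
markers = list(range(0, 100_000, 10))

start = time.perf_counter()
shift_markers(markers, insert_at=500)  # simulate typing one character at offset 500
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"shifted {len(markers)} markers in {elapsed_ms:.3f} ms")
```

The loop is O(n) per keystroke, so it only illustrates the constant factor, not a fix for the algorithmic problem.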

Going from 800ms to 50ms is a great improvement, but I think that's still several orders of magnitude slower than what the machine can really do.


> This suggests another problem they have is a huge constant, in addition to the algorithmic one --- "thousands" shouldn't be something a modern machine would choke on, if updating positions is a simple operation like increasing their offset by 1 after a character is inserted.

As I mentioned in the article, a big part of the cost was updating a different interval storage structure as the markers were updated since the absolute positions approach left us no way to defer it.


Or if you use a continuously-fading color in the CSS for your extension (e.g. Rainbow selection), it'll eat 100% of one core just to change the background color on some text. It's a fun extension, but when my fan spins up because I forgot to deselect before switching windows, it kinda makes it less fun.


Hi there. I'm the author of that admittedly ridiculous and unnecessary extension :)

"Extension" should probably be in scare quotes because the implementation is literally just CSS animation: https://github.com/dmnd/rainbow-selection/blob/master/styles...

There's no good reason such an animation should be taxing your machine. It works fine on my MacBook Pro. But someone else reported this issue too: https://github.com/dmnd/rainbow-selection/issues/3

If you're interested in working out what's going on, please respond to that github issue. I understand if you don't want to spend time on such a trivial thing. But you spent the time to write this comment, so I figured maybe you're interested :)


so cool, installed immediately


Also known as the "paste minified JavaScript into a JetBrains IDE" effect.


IntelliJ forks handle pasting large volumes of text much better than Vim or Sublime Text, in my opinion. IntelliJ is able to maintain performant syntax highlighting because it builds an AST and does not use regular expressions.


The problem is not volume but extremely long lines. Minified js is one veeery long line. I think that they have some O(n^2) in their line algorithm. You can paste 60-70 thousand lines and it will highlight instantly. The same code minified will hang the IDE.


On a side note, you should preserve some new lines in your minified js as it makes dealing with production JavaScript errors much easier, assuming you have some kind of onerror hook that sends you js errors. The new lines don't add much to the overall size of your js but they make debugging much easier.


We have been meaning to get around to looking into this on my current project but haven't yet: I believe the browser's error event will return a column number.

> Column numbers in Firefox. Firefox is the only browser that doesn't support column numbers for exceptions. This makes debugging minified javascript essentially impossible. I've sent one patch but there's a lot more to do.

https://bugsnag.com/blog/js-stacktraces/

- written in 2014

but MDN now says:

> Column number for the line where the error occurred (number) - Requires Gecko 31.0

https://developer.mozilla.org/en-US/docs/Web/API/GlobalEvent...

If anybody has live experience with this, I'm eager to hear about it.


Yes you get column numbers and that really helps as well, but as mentioned new lines don't really hurt that much and I'd rather have some new lines than none. In response to others about using source maps, I'm talking about production errors received from users in the wild who may or may not have source maps enabled in their browser.


That's what I do, too. It makes debugging so much easier, and you can save yourself the pain of source maps. A few bytes I gladly spend on keeping my sanity.


I've always wondered why minifiers don't take advantage of ASI to this effect. Your files come out the exact same size as cramming it onto a line with semicolons...


You should use source maps


Tell that to the people that pack jQuery ...


I've never thought about it that way. This makes web dev in Eclipse suddenly make so much more sense.


I'm a huge PyCharm fan, but if this is true then vim and Sublime must be barely usable. Python, it's great at...but open a big JS file and it...will...take...forever...to do the JS syntax highlighting, if it ever finishes. CoffeeScript is even worse, and almost always crashes.

It's always a little weird, because I have no performance issues anywhere else in it, and it's got ~7GB of RAM it could use.


I use PyCharm every day and IntelliJ and I have never had this problem. Maybe it's a resource issue with your hardware though and your version of Java. I use the new versions that bundle OpenJDK on a newer Macbook Pro.


It's amazing how true this is, huge files work fine, but one long line... I wish I knew more about editors to help fix it.


Yeah, I actually looked at atom sometime back. They have a problem with long lines because their regex engine runs in O(n^2). I suggested that they switch to a regex engine that runs in O(n). https://github.com/atom/atom/issues/979#issuecomment-7770371...

I don't know what happened after that. Maybe they will act on this. The only problem with the O(n) engine was that a lot of TextMate grammars for languages wouldn't work with atom but that was easily resolvable by writing them anew.


Let's say your CPU is 2.5GHz. The square root of 2,500,000,000 is 50,000...
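To spell out the arithmetic (a rough sketch, assuming one unit of work per clock cycle, which is optimistic for real code): an O(n^2) algorithm uses up a full second of a 2.5GHz core at around n = 50,000.

```python
import math

# Assumption: 2.5 GHz and one basic operation per cycle (optimistic).
cycles_per_second = 2_500_000_000

# Largest n for which n * n operations still fit into one second.
n = math.isqrt(cycles_per_second)

print(n)      # 50000
print(n * n)  # 2500000000
```

So a quadratic pass over a few tens of thousands of items already costs on the order of a second, before any constant factors.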


Interestingly, both Sublime Text and Atom are two programs that I absolutely cannot use over X11 forwarding without noticeable lag ... Even if the OS and the X server are on the same local machine.


All of this to build 'like' buttons. #cynicism


I know you are joking, but technically this is possible today.

https://youtu.be/4i7KrG1Afbk?t=1251


A piece table[0] solves this rather elegantly. Since it is a persistent data structure, a mark can be represented as a pointer into an underlying buffer. If the corresponding text is deleted, marks are updated automatically, since the pointer is no longer reachable from the piece chain. Lookup is linear[1] (or logarithmic if you store pieces in a balanced search tree) in the number of pieces i.e. non-consecutive editing operations.

[0] https://github.com/martanne/vis#text-management-using-a-piec...

[1] https://github.com/martanne/vis/blob/master/text.c#L1152
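
A minimal sketch of the idea, not the vis implementation: the `PieceTable` class and its method names are made up for illustration, pieces live in a plain list, and a mark is assumed to be a `(buffer name, buffer offset)` pair. Because both buffers are never modified in place (the add buffer is append-only), a mark never needs updating; resolving it is a linear walk over the pieces.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    buf: str     # which buffer the piece points into: "orig" or "add"
    start: int   # offset into that buffer
    length: int


class PieceTable:
    def __init__(self, text):
        self.buffers = {"orig": text, "add": ""}
        self.pieces = [Piece("orig", 0, len(text))] if text else []

    def text(self):
        return "".join(self.buffers[p.buf][p.start:p.start + p.length]
                       for p in self.pieces)

    def _split(self, pos):
        """Ensure a piece boundary at document offset pos; return its piece index."""
        offset = 0
        for i, p in enumerate(self.pieces):
            if pos == offset:
                return i
            if pos < offset + p.length:
                left = pos - offset
                self.pieces[i:i + 1] = [
                    Piece(p.buf, p.start, left),
                    Piece(p.buf, p.start + left, p.length - left),
                ]
                return i + 1
            offset += p.length
        return len(self.pieces)

    def insert(self, pos, s):
        i = self._split(pos)
        start = len(self.buffers["add"])
        self.buffers["add"] += s  # append-only, so existing offsets stay valid
        self.pieces.insert(i, Piece("add", start, len(s)))

    def delete(self, pos, length):
        i = self._split(pos)
        j = self._split(pos + length)
        del self.pieces[i:j]  # the buffers themselves are untouched

    def resolve(self, mark):
        """Map a mark to a document offset, or None if its text was deleted."""
        buf, off = mark
        doc = 0
        for p in self.pieces:
            if p.buf == buf and p.start <= off < p.start + p.length:
                return doc + (off - p.start)
            doc += p.length
        return None


pt = PieceTable("hello world")
mark = ("orig", 6)        # points at the "w"
pt.insert(0, ">> ")
print(pt.text())          # >> hello world
print(pt.resolve(mark))   # 9
pt.delete(9, 5)           # delete "world"
print(pt.resolve(mark))   # None
```

Swapping the list for a balanced tree keyed on cumulative length would give the logarithmic lookup mentioned above.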


I used an editor in 1980 that was implemented as piece tables! That was a 64K HP2000. It worked great!


I tried out Atom a few weeks ago. I loved the UI! Absolutely fantastic, beautiful, nothing but praise there.

But I had so many issues with stability, and really missed small but important features that were present in my other editors. I also found that most of the plugins worked either poorly or sporadically.

In the end, I decided that it was not worth either using Atom or spending time contributing to it when I have some "pretty close" solutions today. Definitely looking forward to the 1.0 version though, and hats off to all those spending their time contributing to it. I'm sure it's going to become something great!


I'm curious which features present in your other editors you missed. What are those editors, by the way?


For me specifically, Atom was mostly replacing Notepad++. At the time I was testing out Atom, I found that it didn't consistently save the documents you had open during your last session. There may or may not have been other things, that's the biggest one I remember.

Neither it nor Notepad++ could handle gigantic text files, which is a real shame as well, though not a point of comparison.


there is a plugin called last-session or save-session if I remember well, one of the first I installed.


Yes, save-session is good, but it has no idea how to handle multiple window panes. It just throws all your shit in one big ole pane and you're left to pick up the pieces.


I had this same experience. Had Notepad++, tried Atom and Brackets, uninstalled Atom and Brackets.


https://github.com/atom/autosave disabled by default.


Switching over from Sublime:

- I noticed the last session problem, found a plugin for that.
- Searching for files and ordering of files presented is a bit different; in Sublime I'd get more relevant files first (in a Rails app); typically you're wanting your controllers/models/views first. I haven't spent time comparing but logically I think it should be presenting the files closer to the top of the tree first (well, at least in a Rails app); maybe this can be another plugin.
- Laziness on my part yet, haven't found the command for vertically selecting all lines at a given position.
- Don't open a large minified JS file... kaboom.

Having said that I'm fully stoked with it in other respects; and I can see the minor annoyances getting fixed.


FWIW, I've been using Atom since it launched. For the past several weeks it's been so bad, unstable and slow I switched to another editor. Now with version 0.209.0 Atom is back to what I'm used to.

Atom will be a really nice editor, it's just going to take some time, and it does have rough patches now and then.


I actually just retried Atom yesterday. Aside from the normal complaints (it's sloooww, undo doesn't affect markers or selections), one thing that struck me is that markers can't be zero-width. Well, they can but they won't show up. I'm wondering if this is related to the technique mentioned here - it's certainly been a pain to work around. Sublime Text even has multiple options for this (DRAW_EMPTY and DRAW_EMPTY_AS_OVERWRITE).

That said, I'm loving the API design. Coming from Sublime Text, it's a massive upgrade. The ability to embed literally anything a web browser can render in a well-designed framework is mindblowing.


> one thing that struck me is that markers can't be zero-width. Well, they can but they won't show up. I'm wondering if this is related to the technique mentioned here

Thanks for bringing this up. It's not a fundamental limitation at all. Highlights whose markers are zero-width are deliberately filtered out here[1]. This might not be the right move. If you've found it inconvenient, would you mind opening an issue on atom/atom, and describing your use case?

[1] https://github.com/atom/atom/blob/ebc5758d79e421f61f2b6669a8...



Didn't the Xanadu project solve this problem in 1972?

https://en.wikipedia.org/wiki/Enfilade_%28Xanadu%29

Solve it, keep it secret, and then fail to properly write about it to this day.


I really, really, don't get the whole "implement everything using web technologies" thing. As an outsider from that dev ecosystem it looks like the youtube videos you see of people implementing electronic circuits in Minecraft.


It makes more sense (to me) than "implement everything in a monospace text-based terminal you can't resize, hacking around years of horrible different methods of repositioning cursor / changing colour / setting title".

Of course, one could just write a native interface for every OS, but then that adds a new layer of complexity. HTML & Javascript engines are nowadays well optimised and available everywhere, making them the only real choice as a replacement for terminals.


Web developers are probably the main target audience, so making it easy for them to help with the development is a good thing.

I'm really happy with Atom's development, it has come very far in a relatively short time.


If you thought this was an interesting article, here's the obligatory link to just about the only book on crafting a text editor, "Craft of Text Editing": http://www.finseth.com/craft/


Recently I learned all contributors will receive a gift for Atom pre-1.0, and when I asked a GitHub staff member when I would receive the gift (I'm moving during this summer), he mentioned it would be sent out in early July. I guess we can expect a pre-1.0 before August.

One of the main remaining pieces of functionality to be implemented is good support for large files. Looking at this issue [1], it seems the Atom team is making some progress but there are still some problems to be tackled.

In 0.208.0 (released 7 days ago) they mentioned in the changelog: "Atom now opens files larger than 2MB with syntax highlighting, soft wrap, and folds disabled. We'll work on raising the limits with these features enabled moving forward." A little bit disappointed at the progress, as you could open large files with these features disabled a long time ago through a package, "view-tail-large-files".

Just updated to 0.209.0 and using ember.js (1.9 MB) to test. Editing/scrolling has some delays but it's better than previous versions.

Good luck Atom team!

[1]: https://github.com/atom/atom/issues/307#event-325455529


Yet, the onKeyDown handler still takes 50ms. Are you kidding me? You can push a billion tris in that time.


Yeah, still more work to do. There are other things that are slow in that keydown event. We'll get there.


Appreciate the candidness of the team writing about their naive approach. Definitely would have been a simpler fix to just search the currently visible text, but I'm glad they fixed the root issue to make markers more efficient for all.


What is the data structure used for the text itself? A rope? The markers could be stored as offsets to the substrings themselves.


This kind of knowledge and experience has existed on the Microsoft campus for years thanks to the Visual Studio team. That's why Code is much more efficient. I only wish it were open source so I could totally move away from Sublime Text.


I thought VSC (Visual Studio Code) and Atom were based on the same codebase? I'm sure there are differences in functionality, but if open source is a prerequisite for you I'd suggest helping Atom grow rather than waiting on a licence change for VSC.


To add to chowyuncat's comment:

Electron is not an editor itself. The Atom editor is built as an Electron app, and so is Visual Studio Code, but Electron is not an editor. It's not even an editor framework. When you hear "Electron", think "Cordova for the desktop". Sort of.

That seems to be the extent of the similarity between the two--that they're each packaged as essentially a single page app for a Chromium-based, site-specific browser runtime.

For more info on VSC, see castell's comment from last week[1] and this post[2] from Scott Hanselman a couple years back (including the comments).

1. https://news.ycombinator.com/item?id=9691289

2. http://www.hanselman.com/blog/ARichNewJavaScriptCodeEditorSp...


VSC is built on Electron (formerly Atom Shell), but does not use the same text editor as Atom. I hope the VSC editor is open sourced at some point.


This reminds me of how I re-implemented nested sets in relational databases as spans in a "coordinate" system.

  | Root              |
  | Node        | Node|
  | Node | Node |

I stored only "X" and "Y" coordinates for every node, so you had to read the "next" node in a row to get the current node's "size".

It was a bit more human-readable when looking at the data. More importantly, it reduced (on average) the number of nodes I needed to update on insert compared to nested set and gave an easy way of retrieving immediate children. But you still had to "move over" all the nodes "right" of the one you're inserting.

The structure in the article looks eerily similar. I wonder whether it's somehow possible to apply GitHub's optimization to this "coordinate" based schema and make it relative without messing up the benefits of column indexing. Hm...



Isn't that the nested set model mentioned in the post itself?


Vim also has a similar optimization: when a file changes, Vim only runs the syntax highlighter on the visible part of the text plus some buffer in both directions.


Which unfortunately tends to result in everything getting highlighted as if it was a string literal if you have any multi-line strings anywhere.


Yeah, I don't think there's a good way of highlighting part of a code file, what you're looking at depends on everything before and (to some extent) after it. You have to do the whole thing.


The 'extent' you talk about is exactly the lexer (parser) state, you just have to properly serialize it for the beginning of the buffer to get cheap redisplays. It's not rocket science but almost no editor got it right.
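
A minimal sketch of that idea, assuming a toy lexer whose only state is "inside a triple-quoted string or not" (real lexers carry more state, but the caching scheme is the same). The `IncrementalHighlighter` class and `lex_line` function are hypothetical names. The lexer state at the start of every line is cached; after an edit, lexing restarts at the edited line and stops as soon as the state it produces matches what was cached, because everything further down is then still valid.

```python
def lex_line(line, in_string):
    """Toy lexer: return the state after this line, given the state before it."""
    toggles = line.count('"""') % 2 == 1  # an odd number of delimiters flips the state
    return in_string != toggles


class IncrementalHighlighter:
    def __init__(self, lines):
        self.lines = list(lines)
        # start_state[i] is the lexer state at the beginning of line i.
        self.start_state = [False] * (len(self.lines) + 1)
        self.relex(0, full=True)

    def relex(self, first, full=False):
        """Re-lex from line `first`; return how many lines had to be lexed."""
        state = self.start_state[first]
        lexed = 0
        for i in range(first, len(self.lines)):
            state = lex_line(self.lines[i], state)
            lexed += 1
            if not full and self.start_state[i + 1] == state:
                break  # state converged: cached states below are still correct
            self.start_state[i + 1] = state
        return lexed

    def edit(self, index, new_line):
        self.lines[index] = new_line
        return self.relex(index)


h = IncrementalHighlighter([
    'x = 1',
    's = """start',
    'middle',
    'end"""',
    'y = 2',
])
print(h.start_state)                # [False, False, True, True, False, False]
print(h.edit(4, 'y = 3'))           # 1 (state unchanged, only one line re-lexed)
print(h.edit(1, 's = "start"'))     # 4 (the string opener vanished, change propagates)
```

To redisplay a window, an editor would take the cached state for the first visible line and lex only the visible lines from there.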


Yi is one editor that does incremental parsing correctly: https://github.com/yi-editor/yi


Does it open files >2mb yet? My terminal vim does.


Yes it does. Currently, syntax-highlighting and soft-wrap are disabled for these files. We're continuing to work on optimizations and structural improvements to the editor that will allow it to support arbitrarily large files with the full feature set.


While I certainly see your point, Vim has quite the maturity advantage. I'm not sure if Atom will ever be as fast, being based on a web browser, but they're making progress and you can follow along if you care, or just wait a few years until it's solid / fast enough for you :)



yes


Hmmm... So the article seems to suggest that for every insertion of a character, a log time lookup is made. Is that really the case? If so, why is the leaf node that the cursor is in not saved? If you were to use a B+-tree implementation then you would already have access to neighbour pointers for rebalancing purposes, making the majority of incremental changes very cheap (constant time). This is just a thought, there may be good reasons why it's not possible.


Atom is still a hog on my main programming machine. It makes it unusable for me still.

It is an OLD i3 Dell desktop from 6 years ago.


Genuinely intrigued to keep seeing this persistent complaint with Atom.

I've been running it for ages on an MBP with a 2.6GHz i5 and 8GB RAM. It's lightning fast.

Very occasionally it will appear to be hogging CPU - typing is slow and it might hang for a couple of seconds when you're navigating around. This only seems to happen when I've had it running for days if not weeks without a restart. I just restart it and it's back to being lightning fast.

I really like Atom and would be pretty upset if it stopped working well for me, so would be curious to know what kinds of things people suspect cause it to have issues.

Is it just that my machine is powerful enough to not notice, and it's only really a problem on less powerful machines? Is it that I don't run many plugins, and installing many of them tends to bring it down? Does it run well on OS X but not on other platforms?


Packages can definitely have a significant impact on Atom's performance. You can observe all kinds of events, and binding a slow handler to certain events like text changes and cursor movements can make the editor feel sluggish.

It's important for Atom's extensibility that these kinds of events are provided, because it allows many major editor features to be implemented as separate packages, but it means that naively-implemented packages can really slow things down.

Many Atom packages are pretty new, and their authors may not have put a lot of effort into optimization yet. In the past few months, the team has put a lot of effort into stabilizing and solidifying our APIs. We're hoping that now that the APIs are stable, the package ecosystem will really start to mature.


What about a built-in mechanism to let you inspect which plugins are causing the most delay, e.g. during a measuring period? Is that already a thing?


The best tool for that (and any profiling in Atom) is chromium's built-in profiler. See the flame graphs in the blog post.


Editing text in Atom isn't instantaneous for me like MacVim or Sublime, but the typing latency is low enough to not bother me (roughly 1-2 frames), roughly the same as the JetBrains family of IDEs. When I load up atom-typescript, typing latency takes a noticeable hit and drops to the annoyingly sluggish range (~3-4 frames). This is on a 2012 MBA with a 1.8GHz i5. I also have 2-3s of startup delay and frequently have a half-second to second delay when opening a file. None of these are terrible but they're all noticeable.

I suspect that people who complain about speed are either on lower-end hardware or are comparing it directly to a native editor and consider any typing latency at all to be unacceptable.


I am talking about 5 seconds a word speed. I type and it takes 5 seconds for my word to show up on the screen. I have only the default plugins installed, nothing else.


I haven't had any speed issues with Atom either. It has worked well for me overall. What I don't get is the people who say Atom is slow and then say Intellij is fast. I'm not sure how I can take them seriously.


> Very occasionally it will appear to be hogging CPU - typing is slow and it might hang for a couple of seconds when you're navigating around.

Do you literally mean that it hangs for a couple of seconds? That seems like an eternity, especially if it happens during navigation.


Literally, and in this state it is totally unusable. But let me emphasise this happens very rarely, once every couple of weeks maybe. When it does I just kick it over and it's fine.


Maybe try removing all plugins and retry. It's possible that one of them is responsible for this slowdown. This was the case for me a couple of months ago.


Running `atom --safe` will start the editor with only bundled packages activated.


When was the last time you tried it? Atom works fine on a Dell computer with an Intel Core 2 Duo from 7ish years ago.


I like Atom! However, sadly a simple regex search can kill it in 5 seconds.


This is exactly the case that is better as a result of these optimizations.


It's still much slower than, for example, Sublime Text. A search for a single letter [1] in a ~5,000 line file occurs instantly in ST, but takes about 1 second in Atom. It's not much, but it's very significant from a UI point of view.

[1] I know such a search is ridiculous, but both editors perform search-as-you-type, although Atom does attempt to delay that if you type quickly enough. Anyway, it's just an example to demonstrate the speed difference.


Just like in emacs and vim there are plugins that let you use ack and grep: https://atom.io/packages/atom-fuzzy-grep

The built in search is slow (for now), but they have been steadily making gains and use performance testing to measure progress.


There's still some major algorithmic optimization to be done regarding the time it takes to run the find-and-replace search. The optimizations discussed in this blog post were more focused on the performance of editing the buffer in the presence of large numbers of search results.


Right. Don't get me wrong, the article was interesting and I appreciate there's a lot of hard work going on. I really hope atom is a success because an open source equivalent of Sublime Text would be very welcome.


>"an open source equivalent of Sublime Text would be very welcome."

Good news, it exists. If you know Go it's possible to see it sooner: http://limetext.org/


What OS are you running on the Dell?


OpenSUSE 13.2 with a tiled window manager and no other Desktops installed.


One thing I love about vanilla JS is that you can both set and get with the same property. I wonder if having both setters and getters is enforced by CoffeeScript or a design decision of the Atom team!?



