
Atom understands your code better than ever before - guessmyname
https://blog.github.com/2018-10-31-atoms-new-parsing-system/
======
WorldMaker
It seems interesting to compare/contrast this Tree-sitter toolkit/approach
with the Language Server Protocol [1] approach that VSCode has been taking.
My gut impression is that VSCode's assumption is the right one: a language's
own compiler is often the best source of truth regarding that language. An
information "onion" of a fast "traditional" syntax highlighter, augmented with
the ability to request a more detailed version from a language
compiler/interpreter, is a better approach for long-term maintenance reasons,
if nothing else.

Admittedly, Tree-sitter has the opportunity to centralize some of the work in
incremental parsing that LSP pushes to individual language tooling, but
arguably in the modern world of watch-based programming most language tooling
is going to feel a need to build it regardless.

The maintenance headache, though, is that Tree-sitter has its own custom
grammar DSL, which likely won't look anything like or share anything with the
native grammars of the languages it is parsing, and which ultimately has to
settle for a lowest-common-denominator approach to grammar parsing. Even if it
is more sophisticated than so-called "crude" RegEx-based syntax highlighters,
it can still at best be a separately maintained emulation of a language's
grammar versus the LSP encouraging a language to provide its own intelligence
presumably from the very same parser it uses to get work done.

[1] https://docs.microsoft.com/en-us/visualstudio/extensibility/language-server-protocol?view=vs-2017

~~~
maxbrunsfeld
Tree-sitter isn't really an alternative to LSP. We think of it as solving a
different set of problems.

LSP is probably the best way to provide the fixed set of classic IDE features
that LSP does - inline diagnostics, autocomplete, go-to definition, etc.

We think that for many _other_ features, Tree-sitter is a much cleaner
solution than pushing the logic out into the language server. For example, if
you want accurate syntax highlighting that's lightweight and updates
immediately (as opposed to in a delayed fashion like in most IDEs), you need
incremental parsing. Incremental parsing is a highly specialized problem, so
writing an incremental parser in the form of a Tree-sitter grammar is much
simpler than modifying a language's compiler toolchain in order to _make_ it
incremental, and bundling that whole toolchain into an app like Atom.
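
To make the "incremental" idea concrete, here is a deliberately tiny sketch. It is not Tree-sitter's algorithm and not a real parser: it only caches tokens per line and redoes the work for the edited line, as a stand-in for reusing the unaffected parts of a syntax tree. The `LineCache` class and its methods are hypothetical names invented for this illustration.

```python
import re

# Toy tokenizer: words/numbers, or any single non-space punctuation character.
TOKEN = re.compile(r"\w+|[^\w\s]")


class LineCache:
    """Hypothetical helper: caches tokens per line, re-tokenizes only edited lines."""

    def __init__(self, text):
        self.lines = text.split("\n")
        self.tokens = [TOKEN.findall(line) for line in self.lines]
        # Number of line tokenizations performed so far.
        self.retokenized = len(self.lines)

    def replace_line(self, index, new_line):
        # Only the edited line is processed again; every other cached
        # entry is reused untouched.
        self.lines[index] = new_line
        self.tokens[index] = TOKEN.findall(new_line)
        self.retokenized += 1


cache = LineCache("let a = 1;\nlet b = 2;\nlet c = 3;")
cache.replace_line(1, "let b = 42;")
print(cache.tokens[1])    # ['let', 'b', '=', '42', ';']
print(cache.retokenized)  # 4 (3 initial lines + 1 edited line)
```

A real incremental parser has to handle edits whose effects cross line boundaries (an unclosed string or comment, for example), which is part of why it is a specialized problem.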

The same goes for other features besides syntax highlighting. The syntax tree
is now available via a uniform, in-process API in Atom, and is always up-to-
date, so you can script the editor to manipulate your code intelligently. I'm
not totally sure what kinds of things we'll end up building on top of that,
but I think it's a different set of things than LSP will end up providing.

~~~
mattbierner
Very good points. Just to add a few more details on the LSP and VS Code side
of things:

- VS Code implements syntax-aware code folding using the LSP. This means that
folding is super flexible, but it also makes computing folds fairly
expensive. Every time the document changes, the language server has to
recompute and update the folds. In almost all cases, language servers are just
generating folds based on the document's syntax anyway.

Tree-sitter is interesting because it lets folding and other syntax-based
language features be calculated accurately and quickly on the client, freeing
up the language server to do more interesting things (or just go to sleep for
a moment). Document outlines are similar; VS Code uses the LSP for this, but in
many cases the same syntax-derived outline could be generated by tree-sitter.

- The LSP probably isn't well suited to general syntax highlighting due to
computation cost and communication chattiness concerns, but the LSP may
eventually support semantic syntax highlighting [1]. This could, for example,
allow an editor to color all singletons hotpink, which requires a semantic
understanding of the code. Semantic highlighting would augment the base
highlighting provided by tree-sitter or by a TextMate grammar.
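
As a rough sketch of the folding point above: once a syntax tree is available on the client, fold ranges fall out of a simple tree walk. The `Node` type, the `FOLDABLE` set, and the node kinds below are made up for illustration; they are not Tree-sitter's or VS Code's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical syntax-tree node with zero-based line positions."""
    kind: str
    start_line: int
    end_line: int
    children: List["Node"] = field(default_factory=list)


# Assumed set of node kinds that an editor would offer to fold.
FOLDABLE = {"function", "if_statement", "class"}


def fold_ranges(node: Node) -> List[Tuple[int, int]]:
    """Collect (start_line, end_line) for foldable nodes spanning several lines."""
    ranges = []
    if node.kind in FOLDABLE and node.end_line > node.start_line:
        ranges.append((node.start_line, node.end_line))
    for child in node.children:
        ranges.extend(fold_ranges(child))
    return ranges


tree = Node("program", 0, 6, [
    Node("function", 0, 4, [
        Node("if_statement", 1, 3),
        Node("return_statement", 4, 4),
    ]),
    Node("call", 6, 6),
])
print(fold_ranges(tree))  # [(0, 4), (1, 3)]
```

Nothing here needs a round trip to a language server, which is the appeal: the walk is cheap enough to rerun on every edit as long as the tree itself is kept up to date.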

I'm the developer of VS Code's JavaScript/TypeScript and Markdown support, and
am interested in tree-sitter if only in the hope that it will free us from
TextMate grammars. If you want to see just how far regular expressions can be
pushed, just go browsing through some of these bad boys; TypeScript's is a
classic [2].

Keep up the great work Max!

[1]: https://github.com/Microsoft/language-server-protocol/issues/513

[2]: https://github.com/Microsoft/TypeScript-TmLanguage/blob/16c5fcb7aaa387579c320bca08bcc7eadddfdcc9/TypeScript.YAML-tmLanguage

~~~
Matthias247
Besides what has already been said by the 2 parent comments: Textmate grammars
had the advantage that setting up syntax highlighting for a new language (e.g.
for a simple DSL) was a pretty easy and low overhead process, and that enabled
getting highlighting for dozens to hundreds of languages in lots of editors.

Developing a language server is, however, a more complicated task, even if
there were a template that does most of the boilerplate. Therefore I would
continue to welcome in-process highlighting mechanisms like TextMate grammars
and tree-sitter as a baseline. They could be augmented by LSP features whenever
a plugin author feels it's necessary.

Regarding tree-sitter itself: I've read the documentation, and it looks
super-interesting. I've developed TextMate grammars before and wasn't really
satisfied with them; tree-sitter looks like it can provide much better
results.

I would love to see those getting into VsCode and other editors too. Then good
grammars can again be shared between editors! Maybe getting it into VsCode is
now easier, since both are now somehow Microsoft editors? :)

------
tnolet
If this is better than the JetBrains IDEs I'll eat a pony. I'm a WebStorm
and IntelliJ user, with Sublime Text for simple stuff. Used VSCode and Atom
once or twice. Ditched them because highlighting and code completion were below
par.

~~~
vegasdew
Can you add details: which languages, and when did you try VSCode?

~~~
pcr0
Not OP, but I've had the same experience with Ruby on Rails. I switched to
RubyMine last month after using VSCode for 2 months. I don't blame VSCode,
Ruby has got to be the hardest language to do static analysis for, but
JetBrains' proprietary code inspection and IDE integration blows every other
open source solution out of the water.

------
maxbrunsfeld
Hello! I worked on this feature and wrote the post. I'd love to hear any
feedback or answer any questions.

~~~
Matthias247
Have you thought about a pure JavaScript implementation of the parser, or
compiling the runtime to WASM? Since a parser shouldn't really use any
platform features it should be possible, and it could enable using the
technology in purely browser-based editors (like Monaco, CodeMirror, etc).

~~~
maxbrunsfeld
Yes! I'm very interested in making the Tree-sitter runtime and all of the
parsers available as pre-built WASM modules. I think it should be
straightforward to do.

------
dpkonofa
Does this require an extension on the file? One of my pet peeves with Sublime
is that every package for beautification, syntax highlighting, and code-
folding requires whatever I'm working on to be saved to a file using a
specific extension to enable those features. Sometimes, I just need to copy
and paste a few blocks of code and format, re-indent or any number of things
but I don't necessarily want it saved to disk. I know this is a really niche
situation but Atom is looking more and more promising to me and this edge case
would help make that switch totally justifiable.

~~~
Ajedi32
It goes by either file extension or the hashbang line. But I think it only
makes the determination when you open the file; it doesn't change the language
when you edit the text.

You can manually set the language pretty easily though via the command palette
or by clicking the language in the bottom-right.
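
For what it's worth, the "extension or hashbang" rule can be sketched in a few lines. This is only an illustration of the idea, not Atom's actual implementation; the function name and both lookup tables are made up.

```python
import os
import re

# Hypothetical lookup tables, for illustration only.
BY_EXTENSION = {".py": "Python", ".rb": "Ruby", ".js": "JavaScript"}
BY_INTERPRETER = {
    "python": "Python",
    "python3": "Python",
    "ruby": "Ruby",
    "node": "JavaScript",
}


def detect_language(filename, first_line):
    """Guess a language from the file extension, else from a hashbang line."""
    ext = os.path.splitext(filename or "")[1].lower()
    if ext in BY_EXTENSION:
        return BY_EXTENSION[ext]

    # Match "#!/path/to/program [argument]" at the start of the first line.
    match = re.match(r"#!\s*(\S+)(?:\s+(\S+))?", first_line)
    if match:
        program = os.path.basename(match.group(1))
        # "#!/usr/bin/env python3" names the interpreter as the argument.
        if program == "env" and match.group(2):
            program = match.group(2)
        return BY_INTERPRETER.get(program)
    return None


print(detect_language("script.rb", ""))                       # Ruby
print(detect_language("untitled", "#!/usr/bin/env python3"))  # Python
print(detect_language("untitled", "some pasted text"))        # None
```

The last case is the unsaved-buffer scenario from the question: with no extension and no hashbang there is nothing to go on, so the language has to be picked by hand.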

------
finchisko
Bit offtopic, but what is the future for Atom in the hands of Microsoft? Will
they continue to develop it or cease development in favor of VSCode? Or merge
the two?

------
partycoder
Atom is slow even on a fast computer.

