
Tree-sitter: new incremental parsing system for programming tools (2018) [video] - ggurgone
https://www.youtube.com/watch?v=Jes3bD6P0To
======
georgewfraser
The most obvious application of tree-sitter is editors. I wrote a VSCode
extension to replace the built-in syntax coloring with tree-sitter-based
coloring:
[https://marketplace.visualstudio.com/items?itemName=georgewf...](https://marketplace.visualstudio.com/items?itemName=georgewfraser.vscode-tree-sitter)

I actually think it would make more sense for the various VSCode language
extensions to just bake in tree-sitter for their language. I have had a PR
open to do this with golang for a while:
[https://github.com/microsoft/vscode-go/pull/2555](https://github.com/microsoft/vscode-go/pull/2555)

~~~
dmortin
What is the point of replacing the builtin syntax coloring? Is it faster or
does it color more things?

~~~
dunkelheit
Builtin syntax highlighting for e.g. rust is laughably bad - the treesitter
highlighting is much better. Side note: I've recently switched to vscode as my
main editor and so far the experience has been full of contrasts - many
advanced features such as remote editing are the real gamechangers and work
flawlessly, but some basic features (the aforementioned highlighting, folding,
basic git integration) are notably lacking in polish. You kind of expect that
if they've gotten advanced stuff right then basic stuff is surely in order,
but that is not the case.

~~~
AnthonBerg
Have you tried JetBrains IntelliJ? In my experience, if you look in the
direction VS Code is heading, that's where you'll find the IntelliJ platform
already.

Tangentially related, there's some tree-sitter activity in the Jetbrains org
on GitHub:
[https://github.com/JetBrains?utf8=&q=tree-sitter&type=&langu...](https://github.com/JetBrains?utf8=&q=tree-sitter&type=&language=)

which is cool

~~~
dunkelheit
I've used intellij a little bit and it is awesome (albeit a bit slow for my
taste). The reason I stick to vscode is remote editing - compiling rust code
locally on my laptop is a torture compared to compiling it on a beefy remote
box! Remote editing in vscode is very well done, even most extensions work
flawlessly without any changes. As I understand, there is nothing comparable
for intellij.

~~~
AnthonBerg
Interesting, thanks!

------
minxomat
An important recent development in tree-sitter is the new query language. Like
TextMate or Sublime grammars, tree-sitter in Atom used CSS selectors, but it
now has a much more powerful s-expression query language that is useful for
more than just syntax highlighting, e.g. static analysis. One application of
that is GitHub's semantic, a Haskell tool for code navigation and call graph
analysis.

Demo and explanation:
[https://github.com/tree-sitter/tree-sitter/pull/444](https://github.com/tree-sitter/tree-sitter/pull/444)
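
To give a feel for what a nested pattern with captures buys you over flat selectors, here is a minimal sketch in Python. It is not Tree-sitter's query engine: `Node`, `match`, `find_all` and the tuple encoding of patterns are all hypothetical, and the tree is built by hand instead of parsed. The pattern is roughly what `(call_expression (identifier) @callee)` would express in s-expression form.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy syntax tree node (hypothetical, not Tree-sitter's node type)."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


# A pattern is (node_type, capture_name_or_None, [child_patterns]).
def match(pattern, node, captures=None):
    """Return a dict of captured nodes if `node` matches, else None."""
    captures = dict(captures or {})
    node_type, capture, child_patterns = pattern
    if node.type != node_type:
        return None
    if capture:
        captures[capture] = node
    # Every child pattern must match some child, preserving order.
    position = 0
    for child_pattern in child_patterns:
        for index in range(position, len(node.children)):
            result = match(child_pattern, node.children[index], captures)
            if result is not None:
                captures = result
                position = index + 1
                break
        else:
            return None
    return captures


def find_all(pattern, root):
    """Try the pattern at every node, in pre-order."""
    results = []
    stack = [root]
    while stack:
        node = stack.pop()
        found = match(pattern, node)
        if found is not None:
            results.append(found)
        stack.extend(reversed(node.children))
    return results


tree = Node("program", children=[
    Node("call_expression", children=[
        Node("identifier", "print"),
        Node("arguments", children=[Node("string", '"hi"')]),
    ]),
    Node("call_expression", children=[
        Node("identifier", "exit"),
        Node("arguments"),
    ]),
])

CALL_QUERY = ("call_expression", None, [("identifier", "callee", [])])
callees = [m["callee"].text for m in find_all(CALL_QUERY, tree)]
```

The same pattern could feed a highlighter (color every `callee`) or a crude call-graph tool (collect every `callee`), which is the "more than just syntax highlighting" part.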

------
adadgar
Neovim is aiming to integrate this in the next major release, v0.5:
[https://github.com/neovim/neovim/pull/11113](https://github.com/neovim/neovim/pull/11113)

------
lewisl9029
I've been following tree sitter for a while, as I find the tech super cool and
can't wait to see more practical applications.

One thing (among many others) that I've found really promising about Dark is
its editor. See the hands-on video on their homepage for a demo:
[https://darklang.com/](https://darklang.com/)

It mostly feels like you're just typing text like in any regular text editor,
but your inputs are actually manipulating the AST directly, and the editor
itself ensures that your inputs can never result in an invalid program (i.e.
there's no such thing as making a syntax error in Dark). It's inspired by
tooling in the lisp world like Paredit and Parinfer, but Dark itself doesn't
have to _look_ like a lisp because the structure of the AST is maintained by
the editor itself instead of by users manually inserting and removing parens.
It's an ingenious way to get most of the productivity benefits of a lisp-style
syntax and all the structural editing tooling that comes with it, without
intimidating newcomers with the super foreign-looking, paren-infested syntax
lisps are infamous for.
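
A minimal sketch of that idea in Python, assuming a toy expression language and operator-first key entry (both simplifications of mine): the editor state is a tree with explicit holes, and each keystroke either fills the leftmost hole with a well-formed node or is ignored, so a syntactically invalid program can't be represented. `Hole`, `Num`, `BinOp`, `type_key` and `render` are hypothetical names, and none of this is taken from Dark's implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Hole:
    """A placeholder for an expression that has not been typed yet."""


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Hole, Num, BinOp]
OPERATORS = ("+", "-", "*", "/")


def fill_first_hole(expr, replacement):
    """Return (new_expr, filled) with the leftmost hole replaced."""
    if isinstance(expr, Hole):
        return replacement, True
    if isinstance(expr, BinOp):
        left, filled = fill_first_hole(expr.left, replacement)
        if filled:
            return BinOp(expr.op, left, expr.right), True
        right, filled = fill_first_hole(expr.right, replacement)
        return BinOp(expr.op, expr.left, right), filled
    return expr, False


def type_key(expr, key):
    """Turn one keystroke into a tree edit; keys that make no sense are dropped."""
    if key.isdigit():
        new_expr, _ = fill_first_hole(expr, Num(int(key)))
    elif key in OPERATORS:
        new_expr, _ = fill_first_hole(expr, BinOp(key, Hole(), Hole()))
    else:
        new_expr = expr
    return new_expr


def render(expr):
    """Project the tree back to text; holes show up as underscores."""
    if isinstance(expr, Hole):
        return "_"
    if isinstance(expr, Num):
        return str(expr.value)
    return f"({render(expr.left)} {expr.op} {render(expr.right)})"


program = Hole()
for key in "+1*2)3":
    program = type_key(program, key)
rendered = render(program)
```

Note that the stray `)` in the input never reaches the tree. That is also exactly the hook a host editor has to provide: something has to intercept the keystroke before it lands in a text buffer.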

The other day I was actually briefly looking into whether or not it could be
possible to replicate something like this in Atom using tree-sitter for some
mainstream language like JS, but ended up getting blocked by the fact that
Atom doesn't seem to offer an API for plugins to block/replace user input.
This is probably for the best, given all the horrible ways this could be
abused, but it does mean if I wanted to explore the idea further I'd probably
have to either fork Atom to experiment with the idea or build something up
from scratch, which is a pretty daunting undertaking given how deceptively
complex modern editors can get these days.

But maybe I'm missing a different way to accomplish this in Atom with its
existing APIs? Or does anyone know if VSCode's extension APIs can support this
use case? I realize I've probably barely scratched the surface given how
little time I've spent on it so far.

~~~
minxomat
I really don't think it's inspired by Parinfer. It's likely based on the
theory of structural editing and AST projections first popularized by
JetBrains' CEO and available for experimentation in the open source project
MPS. An end to end application of this theory is commonly referred to as a
language workbench.

Papers:
[https://confluence.jetbrains.com/display/MPS/MPS+publication...](https://confluence.jetbrains.com/display/MPS/MPS+publications+page)

Language workbenches:
[https://www.martinfowler.com/articles/languageWorkbench.html](https://www.martinfowler.com/articles/languageWorkbench.html)

Nice intro to structural editing:
[https://medium.com/@mikhail.barash.mikbar/looking-at-code-th...](https://medium.com/@mikhail.barash.mikbar/looking-at-code-through-the-prism-of-jetbrains-mps-8e9b70e3257d)
(also mentions scratch)

~~~
carapace
> It mostly feels like you're just typing text like in any regular text
> editor, but your inputs are actually manipulating the AST directly, and the
> editor itself ensures that your inputs can never result in an invalid
> program (i.e. there's no such thing as making a syntax error in Dark).

The basic idea has been around for a while.

Here's something from the 80's: Alice Pascal
[https://www.templetons.com/brad/alice.html](https://www.templetons.com/brad/alice.html)

> One of the first projects I did after forming Looking Glass Software Limited
> was a syntax-directed programming environment called Alice: The Personal
> Pascal.

> Syntax-directed editors are somewhat controversial, however I think they are
> quite good for people learning programming, and Alice was written first to
> be used in education in the school systems of Ontario. Our first sale was a
> contract to develop it for the Ministry of Education there.

------
dmortin
Will tree sitter also stimulate creation of free tools which work on the AST?

E.g. it's a mystery to me why we don't have free refactoring tools like the
ones in IntelliJ. Like some free library which could extract methods, rename
variables, etc. by modifying the AST. It does not seem too hard.
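
For what it's worth, the naive version of a rename really is short. Here is a sketch using Python's standard `ast` module (`ast.unparse` needs Python 3.9+). `RenameVariable` and `rename` are names I made up, and the transformer deliberately ignores scoping, so it renames every matching name in the file, shadowed or not.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename every occurrence of a name, with no scope analysis at all."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Covers reads and writes of the variable.
        if node.id == self.old:
            return ast.copy_location(ast.Name(id=self.new, ctx=node.ctx), node)
        return node

    def visit_arg(self, node):
        # Covers function parameters; keep visiting any annotation.
        self.generic_visit(node)
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename(source, old, new):
    tree = ast.parse(source)
    tree = RenameVariable(old, new).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SOURCE = """
def area(w, h):
    # width times height
    return w * h
"""

renamed = rename(SOURCE, "w", "width")
```

The catch shows up in the result: the comment is gone and the formatting is whatever `ast.unparse` chooses, because an abstract syntax tree doesn't keep them. A usable refactoring tool needs a tree that preserves the original text, plus scope information to know which `w` is which.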

Is it because the current AST parsers are not fast enough or is there some
other reason?

~~~
lioeters
From my limited knowledge/experience, the use of language server protocol
(like in VS Code editor) enables refactoring operations like you describe, for
example, in TypeScript it can create a struct out of function parameters, or
create a class from old function-prototype based definitions. Compared to IDEs
like IntelliJ, though, I imagine the feature set is much, much smaller in
scope.

I did see some discussion about integrating tree-sitter with VS Code, but the
focus seems limited to syntax highlighting, not operating on ASTs.

~~~
lioeters
I found that the last time this talk was posted on HN [0], the author of
tree-sitter mentioned that a couple of language servers are indeed using
tree-sitter.

* Bash - [https://github.com/mads-hartmann/bash-language-server](https://github.com/mads-hartmann/bash-language-server)

* Ruby - [https://github.com/rubyide/vscode-ruby/tree/master/server](https://github.com/rubyide/vscode-ruby/tree/master/server)

[0]
[https://news.ycombinator.com/item?id=18213022](https://news.ycombinator.com/item?id=18213022)

------
dmitriid
So... You write your grammars in JavaScript. Which is then serialized to JSON
by a parser defined in Rust, so that it can be compiled to C?..

That’s... a very roundabout way of doing things.

[http://tree-sitter.github.io/tree-sitter/implementation](http://tree-sitter.github.io/tree-sitter/implementation)

~~~
maxbrunsfeld
Many parser generation tools use their own custom grammar language, and then
generate a C parser based on that. With Tree-sitter, it’s a similar setup,
except the grammars are written in JavaScript instead of some custom language.

The parser generator itself is all written in Rust, but the end user doesn’t
need to use rust in any way.
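
As a rough illustration of the first step only, here is a Python sketch of a grammar written with ordinary functions in a general-purpose language and then serialized to JSON for a separate generator to consume. The helper names echo `seq`, `choice` and `repeat` from Tree-sitter's JavaScript DSL, but everything here is a stand-in: the JSON layout is an approximation for illustration and should not be read as the exact schema the real generator expects.

```python
import json


# Each helper returns plain data, so the whole grammar is just a dict.
def string(value):
    return {"type": "STRING", "value": value}


def pattern(regex):
    return {"type": "PATTERN", "value": regex}


def symbol(name):
    return {"type": "SYMBOL", "name": name}


def seq(*members):
    return {"type": "SEQ", "members": list(members)}


def choice(*members):
    return {"type": "CHOICE", "members": list(members)}


def repeat(content):
    return {"type": "REPEAT", "content": content}


def grammar(name, rules):
    return {"name": name, "rules": rules}


arithmetic = grammar("arithmetic", {
    "source": repeat(symbol("expression")),
    "expression": choice(symbol("number"), symbol("sum")),
    "sum": seq(symbol("expression"), string("+"), symbol("expression")),
    "number": pattern(r"\d+"),
})

# This JSON is the hand-off point: the generator only ever sees data,
# never the host language the grammar was written in.
serialized = json.dumps(arithmetic, indent=2)
```

Seen this way the host language is mostly a convenience: you get variables, functions and loops for building rules without the tool having to invent its own grammar syntax.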

------
rrampage
The project page is at
[https://tree-sitter.github.io/tree-sitter/](https://tree-sitter.github.io/tree-sitter/)

------
dang
Discussed at the time:
[https://news.ycombinator.com/item?id=18213022](https://news.ycombinator.com/item?id=18213022)

------
based2
(2018)

~~~
ggurgone
(it is the title of the talk)

~~~
saagarjha
Dates are usually added to posts that aren't recent.

