Hacker News new | past | comments | ask | show | jobs | submit login

Yeah I think I get what you're saying -- basically that if you have incremental algorithms then you don't need to take a dependency on CRDT or OT to make the UI updates fast?

In other words, you can do things synchronously BUT still quickly, since you have an incremental algorithm. In any case, I basically asked that same question here:


I've seen Treesitter and am impressed by it, which I wrote a little about in that comment.


Since you speak of "sane" languages, I wrote a parser for an "insane" language by hand, with a principled regex-based lexer:

How to Parse Shell Like a Programming Language http://www.oilshell.org/blog/2019/02/07.html

In fact the first thing I tried was to use ANTLR to translate the POSIX shell grammar (which only covers about 1/4 of the shell language). This drove home the limitations of grammar-based approaches. I found its model of lexing to be very limited (along with insisting on materializing a parser tree rather than providing semantic actions in ANTLR v4).

So it's good to hear that it's being updated for "context-aware lexing", and incremental lexing. Are you using ANTLR for IDE-like tools? I wasn't aware that it was good or commonly used for that kind of use case.


I briefly talked with the author of Treesitter on Github about his bash syntax highlighter with Treesitter, which is of course a parser and not just a lexer, and can be made incremental.

I wasn't sure if the context-aware lexing mechanisms here could really handle the shell language:


I think his response was basically that it doesn't have to be 100% correct, which I agree with for an editor. Trading off some corner cases for speed is probably the right thing.

But I have thought about what it would take make a parser for a "crazy" language incremental and 100% correct (which you would want for further semantic analysis).

I wonder if you can write recursive descent parser in the Skip language and then get incrementality for free? The language is designed so that every algorithm can be made incremental, and it's not limited to pure functional constructs like other systems. Supposedly it goes to some lengths to provide you with a more imperative / mutable state modelk, so it should be able to express an "arbitrary" parser.


In other words, instead of having different algorithms for making LL(star) and GLR parsing incremental, Skip might give you a more expressive model for the parser. A bunch of my blog posts are about the limitations of grammar-based approaches.


FWIW the style of context-aware lexing I use is explained here:

When Are Lexer Modes Useful? http://www.oilshell.org/blog/2017/12/17.html

And the other post lists the 14 different modes or contexts that the shell front end has:


It's a very simple and powerful mechanism, but my parser only cares about batch mode for now. It's not clear to me how it differs in expressiveness from what Treesitter offers (or ANTLR). The current lexer mode basically depends on the control flow of the parser.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact