Modular lexing (parallelizable, and it limits the effect of single-character changes on the token stream and AST). This is achieved simply by replacing multiline strings and comments with the “start token on each line” variant, which is already popular for comments but not for strings. For example:
// First line of a multi-line comment
// Second line of a multi-line comment
let multi_line_string =
\\ Hello
\\ world
\\ a multi-line string
This small change makes efficient, incremental lexing much easier. It also reduces the frequency of edits that clobber the AST (such as inserting a single quote that causes the rest of the document to be parsed as a string literal), which makes incremental parsing and everything downstream more effective too.
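To make the idea concrete, here is a minimal sketch in Python of what line-local lexing could look like. The token grammar, the group names, and the helpers `lex_line`, `lex_document`, and `relex_after_edit` are all hypothetical and made up for illustration; they are not taken from any real language or compiler. The only assumption that matters is that no token spans a newline, so each line can be lexed without knowing anything about the lines before it.

```python
import re

# Hypothetical token grammar, for illustration only.
# No token can span a newline, so every line is lexed independently.
TOKEN_RE = re.compile(r"""
    (?P<comment>//.*)             # line comment: runs to end of line
  | (?P<string_line>\\\\.*)       # one line of a multi-line string: \\ to end of line
  | (?P<ident>[A-Za-z_]\w*)       # identifier or keyword
  | (?P<punct>[=;])               # a little punctuation
  | (?P<ws>\s+)                   # whitespace (dropped below)
  | (?P<error>.)                  # any other single character
""", re.VERBOSE)


def lex_line(line):
    """Lex one line in isolation; no state carries over from other lines."""
    return [
        (m.lastgroup, m.group())
        for m in TOKEN_RE.finditer(line)
        if m.lastgroup != "ws"
    ]


def lex_document(text):
    """Lex a whole document as a list of per-line token lists.

    Because lex_line needs no context, these calls could also be
    distributed across threads or processes.
    """
    return [lex_line(line) for line in text.split("\n")]


def relex_after_edit(tokens_by_line, line_no, new_line):
    """Re-lex only the edited line; all other token lists are reused as-is."""
    updated = list(tokens_by_line)
    updated[line_no] = lex_line(new_line)
    return updated


source = "let s =\n\\\\ Hello\n\\\\ world\n// done"
tokens = lex_document(source)

# Insert a stray double quote into the second line of the document.
edited = relex_after_edit(tokens, 1, '\\\\ He"llo')
```

In this sketch the stray quote only changes the tokens of the line it was typed on: `edited[2]` and `edited[3]` are the very same token lists as before the edit. With a delimiter-based string syntax, the same keystroke could change the classification of every token up to the next quote or the end of the file.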
This is something I had never realized. And I thought lexing was the easiest, most settled, least important part of writing a compiler…