On Modularity of Lexical Analysis (matklad.github.io)
3 points by ingve on Aug 3, 2023 | 1 comment



Modular lexing: the lexer can run in parallel, and a single-character edit can only change a bounded part of the token stream and AST. You get it simply by replacing multi-line strings and comments with the “start token on each line” variant, which is already popular for comments but not for strings. Example:

    // First line of a multi-line comment
    // Second line of a multi-line comment
    let multi_line_string =
        \\ Hello
        \\ world
        \\     a multi-line string
This small change makes efficient, incremental lexing much easier. It also reduces how often an edit clobbers the AST (for example, inserting a single quote that causes the rest of the document to be parsed as a string literal), which in turn makes incremental parsing and everything built on top of it more effective.
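
To make the idea concrete, here is a minimal sketch of a line-local lexer in Python. It is my own illustration, not code from the article: the token grammar, the names `TOKEN_RE`, `lex_line`, `lex`, and `relex`, and the token kinds are all hypothetical. The only property it is meant to show is that when no token spans a line break, each line can be lexed on its own, so an edit re-lexes exactly one line.

```python
import re

# Hypothetical token grammar for illustration only.
# Every alternative starts and ends on the same line.
TOKEN_RE = re.compile(r"""
    (?P<comment>//.*)            # line comment, runs to end of line
  | (?P<string_line>\\\\.*)      # one line of a multi-line string literal
  | (?P<ident>[A-Za-z_]\w*)      # identifier or keyword
  | (?P<punct>[=;])              # a little punctuation
  | (?P<ws>\s+)                  # whitespace, dropped below
  | (?P<error>.)                 # any other character becomes an error token
""", re.VERBOSE)


def lex_line(line):
    """Lex one line in isolation; needs no state from neighbouring lines."""
    return [
        (m.lastgroup, m.group())
        for m in TOKEN_RE.finditer(line)
        if m.lastgroup != "ws"
    ]


def lex(text):
    """Lex a whole document as a list of per-line token lists."""
    return [lex_line(line) for line in text.split("\n")]


def relex(lines, tokens, index, new_line):
    """Apply an edit to one line and re-lex only that line."""
    lines[index] = new_line
    tokens[index] = lex_line(new_line)


source = "let s =\n    \\\\ Hello\n    \\\\ world\n// done"
lines = source.split("\n")
tokens = lex(source)

# Typing a stray quote on the first line damages that line only.
relex(lines, tokens, 0, 'let s = "')
```

Because `lex_line` shares no state between calls, the list comprehension in `lex` could equally be spread across worker threads or processes, and the stray quote above yields one `error` token on line 0 while the string and comment lines keep their original tokens.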

This is something I had never realized. And I thought lexing was the easiest, most settled, least important part of writing a compiler…



