Part of the motivation was that while I've written many simple parsers, I mostly used parser generators with BNF grammars and I didn't feel like I had a good sense of how the code I ended up generating actually worked for something with complex syntax and semantics. I felt like I was writing specs for a parser, rather than writing a parser.
My toy language has vaguely C-like syntax with block scope and infix operators at various precedence levels, so it was a bit more complicated than JSON. I ended up using something like Pratt parsing/precedence climbing (see https://www.oilshell.org/blog/2017/03/31.html) and wrote the whole thing in a way that's, hopefully, pretty easy to read for folks interested in wrapping their heads around parsing complex syntax (e.g. with scope and name resolution). The lexer, parser and language definition ended up being about 1000 lines of JS (counting some pretty extensive comments).
Any JS programmers who are interested in really getting into the nitty-gritty of writing their own parser/compiler should check it out. The source is here: https://github.com/j-s-n/WebBS (relevant files for parsing are in /compiler - start with lexer.js, parser.js and syntax.js).
If you want to play around with the language and inspect the generated ASTs and WASM bytecode, there's an interactive IDE with example code here: https://j-s-n.github.io/WebBS/index.html#splash
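For anyone who hasn't seen Pratt parsing/precedence climbing before, here's a rough sketch of the core loop for plain arithmetic. This is not the WebBS code; the operator table, the node shapes and the names `tokenize`, `parse` and `evaluate` are all made up for illustration.

```javascript
// Hypothetical binding-power table: higher numbers bind tighter.
const INFIX = {
  "+": { bp: 10, rightAssoc: false },
  "-": { bp: 10, rightAssoc: false },
  "*": { bp: 20, rightAssoc: false },
  "/": { bp: 20, rightAssoc: false },
  "^": { bp: 30, rightAssoc: true },
};

// Toy lexer: numbers, operators and parentheses. Anything else is silently skipped.
function tokenize(src) {
  return src.match(/\d+(?:\.\d+)?|[-+*\/^()]/g) || [];
}

function parse(src) {
  const tokens = tokenize(src);
  let pos = 0;

  // Prefix position: a number, a parenthesized expression, or unary minus.
  function parsePrefix() {
    const tok = tokens[pos++];
    if (tok === undefined) throw new SyntaxError("Unexpected end of input");
    if (tok === "(") {
      const inner = parseExpr(0);
      if (tokens[pos++] !== ")") throw new SyntaxError("Expected ')'");
      return inner;
    }
    // Unary minus binds tighter than * and / but looser than ^.
    if (tok === "-") return { type: "neg", operand: parseExpr(25) };
    if (/^\d/.test(tok)) return { type: "num", value: Number(tok) };
    throw new SyntaxError(`Unexpected token '${tok}'`);
  }

  // Keep absorbing infix operators whose binding power is at least minBp.
  function parseExpr(minBp) {
    let left = parsePrefix();
    for (;;) {
      const op = INFIX[tokens[pos]];
      if (!op || op.bp < minBp) break;
      const symbol = tokens[pos++];
      // Left-associative operators require a strictly tighter operator on the right.
      const right = parseExpr(op.rightAssoc ? op.bp : op.bp + 1);
      left = { type: "binary", op: symbol, left, right };
    }
    return left;
  }

  const ast = parseExpr(0);
  if (pos !== tokens.length) throw new SyntaxError(`Unexpected token '${tokens[pos]}'`);
  return ast;
}

// Tiny tree-walking evaluator, only here to check the shape of the AST.
function evaluate(node) {
  if (node.type === "num") return node.value;
  if (node.type === "neg") return -evaluate(node.operand);
  const l = evaluate(node.left);
  const r = evaluate(node.right);
  switch (node.op) {
    case "+": return l + r;
    case "-": return l - r;
    case "*": return l * r;
    case "/": return l / r;
    case "^": return l ** r;
    default: throw new Error(`Unknown operator '${node.op}'`);
  }
}
```

The whole trick is the `minBp` argument: `evaluate(parse("1 + 2 * 3"))` gives 7 because `*` outbids `+`, and `2 ^ 3 ^ 2` groups to the right because `^` recurses with its own binding power instead of one higher.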
Acorn (JS AST parser) is an interesting codebase: https://github.com/acornjs/acorn/tree/master/acorn/src
engine262 (JS AST parser and evaluator) is interesting too; here's how its JSON parser is handled: https://github.com/engine262/engine262/blob/master/src/intri...
I get the purpose, of course, but geez.
But an entire JSON parser? That's nuts!
Maybe it makes sense if you assume the input has already been tokenized so you are not expected to deal with the minutiae of string literal escape sequences and such and can focus on the high level design/flow...
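To give a sense of how small the problem gets under that assumption, here's a sketch of a JSON parser that takes an array of tokens rather than text. The token shape (`{ type, value }` with types `punct`, `string`, `number` and `literal`) and the name `parseTokens` are my own invention, and string escapes and number syntax are assumed to be the lexer's problem.

```javascript
// Parses a pre-tokenized JSON document into a plain JS value.
// Assumed token shape: { type: "punct" | "string" | "number" | "literal", value }.
function parseTokens(tokens) {
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];
  const isPunct = (tok, ch) =>
    tok !== undefined && tok.type === "punct" && tok.value === ch;

  function expectPunct(ch) {
    const tok = next();
    if (!isPunct(tok, ch)) throw new SyntaxError(`Expected '${ch}' at token ${pos - 1}`);
  }

  function parseValue() {
    const tok = next();
    if (tok === undefined) throw new SyntaxError("Unexpected end of input");
    // Scalars were already decoded by the (assumed) lexer.
    if (tok.type === "string" || tok.type === "number" || tok.type === "literal") {
      return tok.value;
    }
    if (isPunct(tok, "[")) return parseArray();
    if (isPunct(tok, "{")) return parseObject();
    throw new SyntaxError(`Unexpected token at position ${pos - 1}`);
  }

  // Called after '[' has been consumed.
  function parseArray() {
    const items = [];
    if (isPunct(peek(), "]")) { pos++; return items; }
    for (;;) {
      items.push(parseValue());
      if (isPunct(peek(), ",")) { pos++; continue; }
      expectPunct("]");
      return items;
    }
  }

  // Called after '{' has been consumed.
  function parseObject() {
    const obj = {};
    if (isPunct(peek(), "}")) { pos++; return obj; }
    for (;;) {
      const key = next();
      if (key === undefined || key.type !== "string") {
        throw new SyntaxError(`Expected string key at token ${pos - 1}`);
      }
      expectPunct(":");
      obj[key.value] = parseValue();
      if (isPunct(peek(), ",")) { pos++; continue; }
      expectPunct("}");
      return obj;
    }
  }

  const result = parseValue();
  if (pos !== tokens.length) throw new SyntaxError(`Unexpected token at position ${pos}`);
  return result;
}
```

With the lexing out of the way it's three mutually recursive functions, which is a lot more reasonable to ask of someone than the full thing.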
- PegJS (https://github.com/pegjs/pegjs)
- Nearley (https://github.com/kach/nearley)
But there can also be good reasons to use a hand-rolled recursive descent parser. It simplifies the build system, and it can also be easier to produce good error messages.
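On the error message point, here's a minimal sketch of what I mean, for a toy grammar of nested lists of integers. The grammar, the `ParseError` class and `parseList` are just made up for the example. Since each parsing function knows exactly what it was looking for, the message can say so.

```javascript
// Error type carrying a 1-based line and column (hypothetical, for illustration).
class ParseError extends Error {
  constructor(message, line, column) {
    super(`${message} at line ${line}, column ${column}`);
    this.name = "ParseError";
    this.line = line;
    this.column = column;
  }
}

// Parses a toy grammar: item := integer | "[" (item ("," item)*)? "]"
function parseList(src) {
  let pos = 0;

  // Report what was expected, what was found, and where.
  function fail(expected) {
    const before = src.slice(0, pos);
    const line = before.split("\n").length;
    const column = pos - before.lastIndexOf("\n");
    const found = pos < src.length ? `'${src[pos]}'` : "end of input";
    throw new ParseError(`Expected ${expected} but found ${found}`, line, column);
  }

  function skipWhitespace() {
    while (pos < src.length && /\s/.test(src[pos])) pos++;
  }

  function parseItem() {
    skipWhitespace();
    if (src[pos] === "[") return parseItems();
    const match = /^\d+/.exec(src.slice(pos));
    if (!match) fail("a number or '['");
    pos += match[0].length;
    return Number(match[0]);
  }

  function parseItems() {
    pos++; // consume '['
    const items = [];
    skipWhitespace();
    if (src[pos] === "]") { pos++; return items; }
    for (;;) {
      items.push(parseItem());
      skipWhitespace();
      if (src[pos] === ",") { pos++; continue; }
      if (src[pos] === "]") { pos++; return items; }
      fail("',' or ']'");
    }
  }

  const result = parseItem();
  skipWhitespace();
  if (pos < src.length) fail("end of input");
  return result;
}
```

Feeding it `"[1,\n  2 3]"` throws `Expected ',' or ']' but found '3' at line 2, column 5`. Getting a generated parser to say something that specific usually takes more work.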
I've been looking at the WASM side myself and suspect it's only a matter of time before there's a JS/TS PEG library written mostly in Rust.
Anybody interested in parsers is gonna love it: