Pasukon – The easy JavaScript parser generator (pasukon.rocks)
67 points by gosukiwi on Sept 1, 2020 | 25 comments

I like this a lot!

A small critique: it might be useful to include a small explainer of why parsers / ADTs are useful - maybe in the context of decoding JSON.

(Learning about ADTs shifted the way that I think about data modelling and application development, so I am a huge fan.)

Great point. I'll elaborate on it in the readme and/or website :D Thanks for the feedback!

Agreed. I still don't know what this demo is even about, but looks cool.

This looks cool but the “easy to learn” intro statement set a disappointing expectation. Building up some simple examples would really help.

I mean this as constructive criticism, apologies if it feels too harsh.

Yeah I really need to work on the docs. I want to write a "Getting Started" guide for someone with absolutely zero background on the topic and put it in the repo's Wiki.

Did you see the GitHub repo's README? I do elaborate a bit over there. Do you think I should explain the basics in the website too? Or maybe just link to the wiki once I have that "Getting Started" guide up?

This looks like a nice simplification of yacc/bison and lex/flex. The `then` keyword seems superfluous though; what was the motivation for `then` as opposed to just juxtaposition?

Thanks :) The reason it looks like that is because I need a way to distinguish binary calls from unary calls: `a b` would call the combinator `a` with the parser `b` as input, while `a b c` would call the combinator `b` with `a` and `c` as input.

I certainly could have chosen another syntax, but I settled for that one. Because the user can define their own combinators I can't just hard-code all combinators into the syntax.

For example `*:A` is an alias for `(many0 (token A))`, but if the user defines his/her own unary combinator, I need a way to let him/her call it without changing the syntax.
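To make the unary/binary distinction concrete, here is a minimal sketch in plain JavaScript. It is not Pasukon's actual implementation or API: the parser shape (`(tokens, pos) => { value, pos } | null`) and the bodies of `token`, `many0` and `then` are assumptions made purely for illustration. The point is only that a unary combinator wraps one parser, while a binary one sits between two.

```javascript
// Hypothetical sketch, NOT Pasukon's real code.
// Assumed parser shape: (tokens, pos) => { value, pos } on success, null on failure.

// Matches a single token of the given type.
const token = (type) => (tokens, pos) =>
  pos < tokens.length && tokens[pos].type === type
    ? { value: tokens[pos], pos: pos + 1 }
    : null;

// Unary combinator: takes ONE parser, matches it zero or more times.
const many0 = (parser) => (tokens, pos) => {
  const values = [];
  let result;
  while ((result = parser(tokens, pos)) !== null) {
    if (result.pos === pos) break; // guard: stop if nothing was consumed
    values.push(result.value);
    pos = result.pos;
  }
  return { value: values, pos };
};

// Binary combinator: takes TWO parsers and runs them in sequence.
const then = (left, right) => (tokens, pos) => {
  const first = left(tokens, pos);
  if (first === null) return null;
  const second = right(tokens, first.pos);
  if (second === null) return null;
  return { value: [first.value, second.value], pos: second.pos };
};

// Roughly "(many0 (token A)) then (token B)" written as function calls.
const grammar = then(many0(token('A')), token('B'));
const result = grammar([{ type: 'A' }, { type: 'A' }, { type: 'B' }], 0);
const failed = grammar([{ type: 'A' }], 0); // no trailing B, so this is null
```

With juxtaposition alone, `a b c` could not tell whether `b` is an argument to `a` or an infix combinator between `a` and `c`, which is the ambiguity a keyword like `then` avoids.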

That's what I wondered too.

I'm not sure what this gives me over and above yacc/lex (for C) or PLY (for Python).

Good stuff. I have 2 questions:

1. Is there a Pasukon grammar for JavaScript/ES6?

2. Is there a Pasukon grammar for Pasukon grammars?

1. Nope, I still need to make some more example grammars :)

2. Yup: https://github.com/gosukiwi/Pasukon/blob/master/lib/grammar....

I find (2.) rather interesting. A parser that can parse its own grammar definitions. A language that can define itself.

Yeah it's pretty cool, if I say so myself :) It parses itself (using a pre-compiled AST) to parse other user-defined grammars.

Initially I started using PEGjs, and then swapped once it was ready.

Unfortunately JavaScript is up there with C++ when it comes to sheer number of syntaxes, so it'd be a pretty substantial task. Not that it isn't worth doing, but I'm not surprised this example wasn't available at launch.

Does the term "substraction" mean anything in particular? It's kind of funny to have the same typo four times on the main page... I guess it compiles either way. b^)

It looks like a great package. I have loved parser combinators forever.

It's a typo I keep making. With Spanish being my native language, I always get confused with that one :) Just pushed a new version with a fix.

Similar tool: https://github.com/DmitrySoshnikov/syntax

It can emit parser in many languages.

There are also a couple of udemy courses from the author, about parsing and language creation. They are quite good imho.


Having written my own CommonMark parser, I often wonder if I should have used some kind of parser generator. Can I use this to parse CommonMark (in a 100% compliant way)? And can I parse it such that I can recreate the original Markdown based on the resulting model?

I think Pasukon should perform similarly to a hand-written recursive descent parser. If you already have something going, I don't see much benefit in porting it, besides ease of creating and maintaining the parser.

It can parse any context-free language, so it should have no problem parsing CommonMark.

Awesome! I've been wanting to add a proper parser to my Scheme interpreter for a long time now, and this might lower the barrier enough to motivate me to actually do it.

Awesome, do it! Making languages is always fun :)

I am pretty happy with the parser from MS: https://github.com/microsoft/ts-parsec#readme The API is pretty intuitive and the package is reliable. I can't imagine why I would want to switch to PASUKON (what an odd name, by the way).

For me it was mostly a fun project I've always wanted to make :) I didn't know about ts-parsec though, looks cool!

I originally designed Pasukon as an alternative to PEGjs. A way to quickly prototype a language and generate a usable parser. What I didn't like much about PEGjs is the lack of extensibility, so I added the ability to use your own lexer and combinators to really fine-tune the parser if needed.

Also PEGjs struggles with indent-based grammars, because it can't use caching with those, and it can get quite slow. That gets solved by using a lexing step which outputs proper `INDENT` and `DEDENT` tokens.
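As a rough illustration of that lexing step, the sketch below turns leading whitespace into `INDENT` and `DEDENT` tokens using a stack of indentation widths. It is a hypothetical stand-alone lexer, not Pasukon's lexer API: the function name `lexIndentation`, the `LINE` token and the token object shape are all assumptions. It also does not validate inconsistent indentation and counts a tab as one column.

```javascript
// Hypothetical indentation-aware lexer; names and token shapes are illustrative only.
function lexIndentation(source) {
  const tokens = [];
  const stack = [0]; // indentation widths of the currently open blocks

  for (const line of source.split('\n')) {
    if (line.trim() === '') continue; // blank lines do not affect nesting

    const indent = line.length - line.trimStart().length;

    // Deeper than the current block: open a new one.
    if (indent > stack[stack.length - 1]) {
      stack.push(indent);
      tokens.push({ type: 'INDENT' });
    }

    // Shallower than the current block: close blocks until we match.
    while (indent < stack[stack.length - 1]) {
      stack.pop();
      tokens.push({ type: 'DEDENT' });
    }

    tokens.push({ type: 'LINE', text: line.trim() });
  }

  // Close any blocks still open at end of input.
  while (stack.length > 1) {
    stack.pop();
    tokens.push({ type: 'DEDENT' });
  }

  return tokens;
}

const types = lexIndentation('a\n  b\n    c\nd').map((t) => t.type);
```

Once the nesting is encoded as explicit tokens, the grammar itself no longer needs to track indentation state, so the parser can treat `INDENT` and `DEDENT` much like opening and closing braces.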

I believe the name is a play on

a) the romanization of the Japanese pronunciation of PC ("personal computer" -> "PersCom"; Japanese abbreviations of English tend to concatenate the initial syllable of every word, and "n" and "m" end up sounding very similar).

b) "parser combinator" would have the same abbreviation in Japanese

Exactly that :) Glad someone got it! Heh.
