
Participle: A parser library for Go - pjf
https://github.com/alecthomas/participle
======
kjeetgill
Parsers are one of those fields I wish I had the time to really study deeply.
I'm especially impressed with LALRPOP.

Cribbed from a previous comment of mine:

From time to time I find myself revisiting this thread: "Writing parsers like
it is 2017"[0].

In particular I love a Rust parser generator called LALRPOP and its emphasis
on diagnosing ambiguous grammars [1].

> What I’ve tried to do now in LALRPOP is to do that clever thinking for you,
> and instead present the error message in terms of your grammar. Perhaps even
> more importantly, I’ve also tried to identify common beginner problems and
> suggest solutions.

They work out a fairly deep example with error guidance in ambiguous grammars
in that post.

[0]:
[https://news.ycombinator.com/item?id=15016061](https://news.ycombinator.com/item?id=15016061)

[1]:
[http://smallcultfollowing.com/babysteps/blog/2016/03/02/nice...](http://smallcultfollowing.com/babysteps/blog/2016/03/02/nice-errors-in-lalrpop/)

------
iampims
A good intro by Rib Pike for those who’d like to learn more about
parsers/lexers
[https://youtube.com/watch?v=HxaD_trXwRE](https://youtube.com/watch?v=HxaD_trXwRE)

~~~
deklerk
s/Rib/Rob :)

------
skybrian
Cool idea. The way it handles alternatives looks awkward since each
alternative is a field in a struct. You have to check each field to find the
one that's not nil.

(See Term in the example.)

Normally this would be done with an interface, like in go/ast.

~~~
skybrian
Although, on second thought, this means writing a chain of if-else statements
instead of a type switch in a visitor. Maybe it's not that big a difference?
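For comparison, here's a minimal sketch of both shapes (the types are hypothetical, not Participle's or go/ast's actual ones): the struct-with-nilable-fields approach forces a chain of nil checks, while the interface approach allows a type switch.

```go
package main

import "fmt"

// Struct-field style: one field per alternative, all but one nil.
type Term struct {
	Number *float64
	Ident  *string
}

// Interface style, as in go/ast: each alternative is its own type.
type Node interface{ isNode() }

type Number struct{ Value float64 }
type Ident struct{ Name string }

func (Number) isNode() {}
func (Ident) isNode()  {}

func describe(t Term) string {
	// Struct-field style: check each field to find the non-nil one.
	switch {
	case t.Number != nil:
		return fmt.Sprintf("number %v", *t.Number)
	case t.Ident != nil:
		return fmt.Sprintf("ident %q", *t.Ident)
	}
	return "empty"
}

func describeNode(n Node) string {
	// Interface style: a type switch over the alternatives.
	switch n := n.(type) {
	case Number:
		return fmt.Sprintf("number %v", n.Value)
	case Ident:
		return fmt.Sprintf("ident %q", n.Name)
	}
	return "unknown"
}

func main() {
	x := 42.0
	fmt.Println(describe(Term{Number: &x})) // prints: number 42
	fmt.Println(describeNode(Ident{Name: "foo"}))
}
```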

------
pebers
I've used this before - really nice library and an easy way of getting started
without having to write so much custom code. Only downside is that of course
it's still faster to hand-write a parser :)

------
pjmlp
Clever idea of using struct tags for grammar rules.

~~~
laumars
Wouldn't that then require reflection, which is "slow" (comparatively speaking
- if this isn't in a hot path then perhaps who cares?)

~~~
bpicolo
It's not in the hot path, it's used to generate the parser.
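To illustrate that split, here's a toy sketch (not Participle's actual API; the `grammar` tag name and types are made up): reflection reads the struct tags once, up front, and the result could then drive matchers that run without any reflection.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical example: grammar fragments live in struct tags.
type Assign struct {
	Name  string `grammar:"@Ident"`
	Value string `grammar:"'=' @String"`
}

// rulesOf walks the struct type once with reflection and collects
// the grammar fragment attached to each field.
func rulesOf(v interface{}) []string {
	t := reflect.TypeOf(v)
	rules := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		rules = append(rules, t.Field(i).Tag.Get("grammar"))
	}
	return rules
}

func main() {
	// The reflection cost is paid here, once, when the parser is
	// built; the generated matchers would run without reflection.
	fmt.Println(rulesOf(Assign{})) // prints: [@Ident '=' @String]
}
```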

~~~
laumars
Ahhh, then that's a cool approach to parser generation.

------
weberc2
This looks really cool. I spent a few weeks last year trying to build my first
programming language, but I kept getting bogged down by parsing and eventually
gave up.

~~~
laumars
Parsing isn't actually that hard once you wrap your head around the problem.
There are lots of good guides for writing parsers online which really helped
me out when I was feeling the same frustration as you are.

So what I'm trying to say is stick with it and I'm sure you'll crack it. :)

~~~
weberc2
How do you write a parser in a language like Go? I stumbled upon parser
combinators and tried to shoehorn them into Go, but the lack of generics
mostly prohibited this. I also tried imperatively writing the parser, but this
was tedious and error prone. I also explored parser generators, but these
seemed to have huge learning curves and still produced awful code. Beyond
that, it’s hard to tell what kind of parser I need because of all of the
academic jargon. :/

I would love to be pointed in the right direction!

~~~
zzzcpan
Parser combinators don't need generics, as they are merely higher-order
recursive descent parsers, i.e. recursive parsers where functions are created
via other functions to generalize and compose them. Go is expressive enough to
do that.

For example, let's say we start with a function that parses a character '['.

    
    
      buf, pos := []byte("[123]"), 0
      
      LBRACK := func() bool {
         if pos >= len(buf) || buf[pos] != '[' {
            return false
         }
         pos++ // consume character
         return true
      }
    

It would suck to write such a function for every single character we have to
parse. We can generate it instead:

    
    
      CHAR := func(c byte) func() bool {
         return func() bool {
            if pos >= len(buf) || buf[pos] != c {
               return false
            }
            pos++
            return true
         }
      }
      
      LBRACK := CHAR('[')
    

Now let's do the same for parsing a sequence. We combine the outputs of other
functions and return one that runs them one by one.

    
    
      SEQ := func(args ...func() bool) func() bool {
         return func() bool {
            for _, f := range args {
               if !f() {
                  return false
               }
            }
            return true
         }
      }
      
      parser := SEQ(
         CHAR('['),
         SEQ(
            CHAR('1'),
            CHAR('2'),
            CHAR('3'),
         ),
         CHAR(']'),
      )
    

This is the idea.

This is not a real-world example though. You'll need more primitives, plus a
wrapper that handles various things: restoring the buffer index on failure,
captures that save the starting index, moving buf/pos into a context
structure, a virtual stack to store captures/errors, etc.

    
    
      func SEQ(args ...func(ctx *Ctx) bool) func(ctx *Ctx) bool {
         return wrap(func(ctx *Ctx) bool {
            ...
         })
      }
    

But it's important to get there on your own. And it's pretty much as fast and
as flexible as a completely hand-written recursive descent parser and as
powerful as a parser generator.
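For instance, ordered choice (alternatives) is one of those extra primitives, and it's exactly where restoring the buffer index matters. A runnable sketch in the same toy buf/pos style as above (just an illustration, not library code):

```go
package main

import "fmt"

// Toy parser state, as in the snippets above.
var buf []byte
var pos int

// CHAR generates a parser for a single character.
func CHAR(c byte) func() bool {
	return func() bool {
		if pos >= len(buf) || buf[pos] != c {
			return false
		}
		pos++
		return true
	}
}

// SEQ runs its sub-parsers one after another.
func SEQ(args ...func() bool) func() bool {
	return func() bool {
		for _, f := range args {
			if !f() {
				return false
			}
		}
		return true
	}
}

// ALT tries each alternative in order, restoring the buffer index
// after a failed attempt so the next alternative starts fresh.
func ALT(args ...func() bool) func() bool {
	return func() bool {
		for _, f := range args {
			start := pos
			if f() {
				return true
			}
			pos = start // backtrack
		}
		return false
	}
}

func main() {
	// Matches either "[]" or "[1]".
	parser := ALT(
		SEQ(CHAR('['), CHAR(']')),
		SEQ(CHAR('['), CHAR('1'), CHAR(']')),
	)
	buf, pos = []byte("[1]"), 0
	fmt.Println(parser()) // prints true: the second alternative matches
}
```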

~~~
weberc2
Is that a parser? That looks like a matcher. I actually built something that
looks almost exactly like this, but it doesn't get you an AST, right? What am
I missing?

~~~
zzzcpan
To get an AST you have to create AST nodes. Make a function that captures the
output and put it in the places where you want to create AST nodes. Sort of
like this:

    
    
      SEQ(
         ...
         CAPTURE(
            // parser to capture
            SEQ(...),
            // extra arguments: 
            // a name, a function to call,
            // that creates an AST node, etc.
         ),
      )
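Fleshed out in the same toy style as the earlier snippets (all names hypothetical), a CAPTURE that records the matched span and hands it to a node-building callback might look like:

```go
package main

import "fmt"

// Toy parser state plus a stand-in node store.
var buf []byte
var pos int
var nodes []string // real code would build typed AST nodes

func CHAR(c byte) func() bool {
	return func() bool {
		if pos >= len(buf) || buf[pos] != c {
			return false
		}
		pos++
		return true
	}
}

func SEQ(args ...func() bool) func() bool {
	return func() bool {
		for _, f := range args {
			if !f() {
				return false
			}
		}
		return true
	}
}

// CAPTURE remembers where its inner parser started and, on success,
// hands the matched text to build, which creates the AST node.
func CAPTURE(p func() bool, build func(text string)) func() bool {
	return func() bool {
		start := pos
		if !p() {
			return false
		}
		build(string(buf[start:pos]))
		return true
	}
}

func main() {
	parser := SEQ(
		CHAR('['),
		CAPTURE(
			SEQ(CHAR('1'), CHAR('2'), CHAR('3')),
			func(text string) { nodes = append(nodes, "number("+text+")") },
		),
		CHAR(']'),
	)
	buf, pos = []byte("[123]"), 0
	fmt.Println(parser(), nodes) // prints: true [number(123)]
}
```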

