So far this book seems to be just at the right pitch for my purposes.
Does anyone else have any recommended reading for someone like me? My absolute priority is keeping the code simple and having as few features as possible.
I’ve actually been using my ignorance as a design constraint... I don’t want to allow complex syntax in the editor, (it’s not a general purpose editor) and I want the parser source to be approachable.
But perhaps by this point I may have the resolve to resist the power a proper parser will give me.
There's also this book that, from my understanding, a lot of people consider fairly foundational / useful (I own it, haven't read it):
Writing an Interpreter in Go (https://interpreterbook.com/)
It's a very approachable introduction to lexing and parsing using nothing but Go and its standard library.
The focus is on speed, but honestly I found this code easier to read and write (once I got my head around the general concept) than most other parser code I've seen.
If you're looking to make your lexer as simple as possible, I would recommend looking at (http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-e...). It's written by the same guy, and gives a short look at Pratt parsers, which are one of the simplest ways to write a parser. There's also a simple language he wrote that uses the method, available as an example on GitHub. At the top level you end up with declarations like:
infix(token: "plus", parselet: BinaryOperator, precedence: 13);
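To give a feel for how little machinery that registration style needs, here is a minimal Pratt-style sketch in C++ (single-digit numbers only; all names and structure are illustrative, not taken from the linked article or its example language). The precedence table plus one small loop is essentially the whole parser:

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

struct Pratt {
    std::string src;
    std::size_t pos = 0;
    // Precedence table: higher numbers bind tighter, like the
    // "precedence: 13" argument in the declaration above.
    std::map<char, int> prec{{'+', 10}, {'-', 10}, {'*', 20}, {'/', 20}};

    char peek() { return pos < src.size() ? src[pos] : '\0'; }

    // "Prefix parselet": here only single-digit literals.
    long parse_prefix() { return src[pos++] - '0'; }

    long apply(char op, long l, long r) {
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            default:  return l / r;
        }
    }

    // The core Pratt loop: keep consuming infix operators whose
    // precedence is higher than the current minimum.
    long parse_expr(int min_prec = 0) {
        long left = parse_prefix();
        while (prec.count(peek()) && prec[peek()] > min_prec) {
            char op = src[pos++];
            long right = parse_expr(prec[op]);  // "infix parselet"
            left = apply(op, left, right);
        }
        return left;
    }
};

long eval(const std::string& s) { return Pratt{s}.parse_expr(); }
```

For example, `eval("1+2*3")` yields 7 and `eval("8-4-2")` yields 2, so both precedence and left-associativity fall out of that one loop. A real parser would return AST nodes instead of evaluating on the spot, but the shape is the same.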
Having done a fair bit of prototyping parsers and reading about them, I'd say the hardest parts of parsing JS are regex literals and automatic semicolon insertion.
Of course, try the OP, maybe there is something for you.
You might want to watch Per Vognsen's "bitwise" series on YouTube as well (another guy who was at RAD Game Tools, I think, like Sean Barrett, who is linked in a sibling comment). The recordings are very approachable and mostly straightforward.
Having a little theory to guide you is important, but mostly it's a practical issue. So I mostly recommend trying various approaches yourself. Get high-level ideas somewhere else, and then make implementations on your own.
In that spirit, a few high-level tips for approaching parsing, off the top of my head.
How you parse a language depends heavily on what the parsed language is. You need a good feel for the language to understand how it wants to be parsed. If you have the luxury of making your own language, great!
Pretty much every programming language should be split into parser and lexer for sure, although the stages might need to interact (e.g., parser detects a switch to a different, embedded language, and lets the lexer know).
As for the lexer, absolutely don't use any tools. Probably not even regular expressions. Just handcode the stuff as you go. It's boring but extremely straightforward code. You get a lot of flexibility and debuggability this way, and no painful dependencies. In the simplest case the lexer is only a few type definitions (a token-stream wrapper around some I/O facility, a token-kind enum, a token record) and a lex_next_token() function.
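A sketch of that "few type definitions plus lex_next_token()" shape, for a tiny C-like token set (all names here are illustrative; a real lexer would read from an I/O stream and report errors with positions):

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class TokenKind { Ident, Number, Plus, LParen, RParen, End, Error };

struct Token {
    TokenKind kind;
    std::string text;  // the lexeme, for identifiers and numbers
};

struct Lexer {
    std::string src;   // stand-in for a real I/O-backed stream
    std::size_t pos = 0;

    Token lex_next_token() {
        // Skip whitespace between tokens.
        while (pos < src.size() && std::isspace((unsigned char)src[pos])) pos++;
        if (pos >= src.size()) return {TokenKind::End, ""};

        char c = src[pos];
        if (std::isdigit((unsigned char)c)) {
            std::size_t start = pos;
            while (pos < src.size() && std::isdigit((unsigned char)src[pos])) pos++;
            return {TokenKind::Number, src.substr(start, pos - start)};
        }
        if (std::isalpha((unsigned char)c) || c == '_') {
            std::size_t start = pos;
            while (pos < src.size() &&
                   (std::isalnum((unsigned char)src[pos]) || src[pos] == '_')) pos++;
            return {TokenKind::Ident, src.substr(start, pos - start)};
        }
        pos++;
        switch (c) {
            case '+': return {TokenKind::Plus, "+"};
            case '(': return {TokenKind::LParen, "("};
            case ')': return {TokenKind::RParen, ")"};
            // A real lexer would report the offending character here.
            default:  return {TokenKind::Error, std::string(1, c)};
        }
    }
};
```

Extending it is just adding another branch, which is why handcoded lexers stay easy to debug: every decision is a plain `if` you can step through.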
As for the parser, the best computer languages are those that can be parsed by recursive descent (again, no tools. A C-like language is enough for implementation). You look at the next input token and do the appropriate thing. Parsing is extremely simple if a language is designed to be parsed that way.
The more mathematical approaches to parsing (like LR) are pretty boring and unapproachable in my opinion. They are comparatively hard to understand and debug.
Besides the parsing of global structure, most languages have an expression grammar with infix operators and precedence levels. You can work out how to parse that separately. There is the shunting-yard algorithm, which is a table-driven method, unlike recursive descent. I guess you should know how to use it, but you will also quickly see that it gets hairy as soon as function calls and other non-infix syntaxes come into play. Parsing expressions with recursive descent, too, turns out to be not hard after all, and a nice benefit of that approach is that the parser is written in a consistent style.
Interesting, how initial contact was made... Did the author contact someone from the Dart team, or did the team/Google contact him in the first place?
Impossible is too strong. Unusual, yes, but not impossible. If you pay attention to their job listings, you will occasionally see openings posted for specific roles in specific teams.
It's also beautifully presented, so compliments to author for that as well.
>By which I mean Machine Learning, since the term "AI" smacks of not just determinism, but also a distinct lack of VC funding.
It's all we do, in one way or another:
> Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
That's why "Interpreter" is one of the common OO patterns.
I've done a lot of micro-languages (which I later found are called "Domain Specific Languages" by hipsters). I have one that builds the database interface code automatically from the .sql script that builds the PostgreSQL database.
I code the database by hand, run "builddb.py", and it spits out all the mapping and part of the ORM layer. I do it with bad hacks, regexes, and stupid stuff (from a proper compiler-construction perspective), but I only spent about half a day on it.
Later I also built the inserts/updates and other extra stuff that is sometimes useful. Then I also spit out both F# and Swift code, with the same terrible hacks...
Now, after doing this thing a million times, I've started to think that I'm an idiot who could have done this MUCH better. So this time, as a side project, I'm doing the same thing as an ORM-made-as-a-language.
And boy, this thing is fun. I can not only run "sql" against my RDBMS but also against my own in-memory structures, similar to LINQ but, IMHO, better.
Now, the thing is... if this little project gets to the point where it's good, why not release it as a language?
If you think smaller, though, there's a ton of utility to be had in taking the ideas and applying them to domain-specific languages. I don't know much about Racket, but it feels like that's its reason for being.
In the last 10 years all sorts of new languages have come out and been successful. We're not nearly at the end game of language development.
I have nothing against patterns, but it made it harder to follow, so I stopped reading it.
If you do the whole process in one pass, as in C, you can evaluate during parsing, but even a simple feature like lambdas requires more than one pass.
So either you use a Visitor, or you develop pattern matching in your language and bootstrap it quickly enough to be able to use your language to write the backend of your language.
Visitor is the only way I know of dealing with it in C++. Once you do it enough, Visitor starts to feel like a pretty natural way to extend a class without modifying the class.
Are you talking about the traditional OOP visitor pattern? It's pretty terrible compared to std::variant; imagine, for instance, that you want a way to draw the rect that contains a given shape without changing your shape class: https://godbolt.org/g/34KyMw
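To sketch the std::variant alternative (C++17; these types are illustrative, not from the linked snippet): the set of shapes becomes a closed sum type, and "visiting" is just std::visit, so the containing-rect operation lives entirely outside the shape types, with no Accept/Visit boilerplate on them:

```cpp
#include <variant>

struct Circle { double x, y, r; };     // center and radius
struct Rect   { double x, y, w, h; };  // top-left corner plus size
using Shape = std::variant<Circle, Rect>;

// The operation the comment above describes: compute the rect that
// contains a given shape, without touching Circle or Rect themselves.
Rect bounding_rect(const Shape& s) {
    struct {
        Rect operator()(const Circle& c) const {
            return {c.x - c.r, c.y - c.r, 2 * c.r, 2 * c.r};
        }
        Rect operator()(const Rect& r) const { return r; }
    } visitor;
    return std::visit(visitor, s);
}
```

If you later add a shape to the variant, every std::visit over it fails to compile until you handle the new case, which is the same exhaustiveness guarantee the OOP visitor buys, with far less ceremony.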
Visitor is nice when some kind of transformation needs to be applied to the visited objects.
The top comment supported the use of the visitor pattern, but I would say that MOST people in the thread did NOT use visitors.
You only really need a visitor if you have more than one traversal. And even then you don't necessarily need it.
" A handbook for making programming languages.
This book contains everything you need to implement a full-featured, efficient scripting language."
The title makes it look like you would write a compiler/assembler or some plugin for an existing one, and then the first sentence dives down to the actual content.
This is obvious from the title here; it just bugs me when these kinds of words (programming language vs. scripting language) are used like that.
Other than that, always interested to read about both topics :D