
Compiler Construction by Niklaus Wirth [pdf] - vector_spaces
https://inf.ethz.ch/personal/wirth/CompilerConstruction/CompilerConstruction1.pdf
======
NonEUCitizen
OP's link points to a .pdf covering Chapters 1-8. Web page below has a link to
Chapters 9-16 as well:

[https://inf.ethz.ch/personal/wirth/CompilerConstruction/](https://inf.ethz.ch/personal/wirth/CompilerConstruction/)

------
ainar-g
Wirthian languages, and especially Oberon-07, are definitely a topic I want to
familiarise myself with, but there is so little information about them on the
Web, and even fewer codebases to learn from. And a lot of the compilers and
tooling seem to be weirdly Windows-oriented. The questions I still have very
few answers to are:

* How is error handling done in Oberon-07?

* How does one do generic programming in Oberon-07?

* What is Oberon-07's async story?

I will be thankful for any pointers (heh).

~~~
Rochus
Here ([https://github.com/rochus-keller/Oberon](https://github.com/rochus-keller/Oberon))
is a Qt/C++ based viewer/cross-referencer for Oberon-07 code
bases as well as an Oberon to C++/Lua/LuaJIT bytecode transpiler/compiler; an
IDE is work in progress.

> How is error handling done in Oberon-07?

There is an Oberon programming language with compiler written in Oberon, and
there is an operating system called Oberon as well; you can have a look at the
source code, but there are also a couple of free books about it, see
[http://www.projectoberon.com/](http://www.projectoberon.com/)

> How does one do generic programming in Oberon-07?

No support for generics yet; there was a proposal in the nineties, but Wirth
didn't integrate it. The Go language, which is a descendant of Oberon-2,
doesn't have generics (yet) either.

> What is Oberon-07's async story?

Oberon is single-threaded; there is a coroutine library that is part of the
Oakwood standard, but nothing comparable to, e.g., Go.

------
chrisseaton
People should note that this isn't usually how we construct compilers in
industry today.

(This comment is posted every time but it's true.)

~~~
hasbot
It really wasn't how compilers were constructed in '96 either. I was very into
compilers in the late 80's and even then we didn't handwrite lexers or
parsers. I was using bison and flex around '89.

~~~
patrec
Basically no serious programming language uses bison or flex, and everything
that does (awk is YACC based) has terrible error messages. People totally do
handwrite lexers and parsers; if you use some language in earnest, the chances
that it has a handwritten parser or lexer or both are far greater than that it
was developed with bison/flex.

~~~
judofyr
> Basically no serious programming language uses bison or flex,

[https://en.wikipedia.org/wiki/GNU_Bison#Use](https://en.wikipedia.org/wiki/GNU_Bison#Use):
Ruby, PHP, GCC until 2004, Go, Bash, PostgreSQL, MySQL, Perl.

Can we stop with this "duh, real programmers write their own handwritten
parsers"? Parser generators are _awesome_! They force you to write a
declarative grammar which a human can understand. If you're _designing_ a
language this is very useful. With a handwritten parser you're very likely to
encode some unintended behaviour in edge cases; using a parser generator early
in this process will very quickly reveal the ambiguities.

That doesn't mean that parser generators are always the best. There are many
use cases where a handwritten parser is better. But the picture is way more
nuanced than "no serious programming language uses bison or flex".

~~~
dbcurtis
The problem with LALR parser generators is that you end up with an LALR
parser. Which is fine as long as the input is error-free, but trying to get a
reasonable error message out of _any_ LALR parser is about as much fun as
repeatedly poking yourself in the eye with a sharp stick.
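
For contrast, here is a minimal sketch of the handwritten alternative, a recursive-descent parser of the kind Wirth's book builds. The toy grammar, the names `lex`, `Parser`, `parse` and `ParseError`, and the message wording are all made up for illustration. The point is only that each parsing function knows exactly what it was expecting when it gets stuck, so a specific message falls out almost for free:

```python
DIGITS = "0123456789"


class ParseError(Exception):
    """Raised with an offset and a description of what was expected."""


def lex(text):
    # Split the input into (value, offset) pairs and append an end marker.
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in DIGITS:
            start = i
            while i < len(text) and text[i] in DIGITS:
                i += 1
            tokens.append((text[start:i], start))
        elif ch in "+-*()":
            tokens.append((ch, i))
            i += 1
        else:
            raise ParseError(f"offset {i}: unexpected character {ch!r}")
    tokens.append(("<end>", len(text)))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = lex(text)
        self.index = 0

    def peek(self):
        return self.tokens[self.index]

    def fail(self, expected):
        # The caller says what it wanted; the current token says what it got.
        value, offset = self.peek()
        raise ParseError(f"offset {offset}: expected {expected}, found {value!r}")

    def expr(self):
        # expr = term { ("+" | "-") term }
        result = self.term()
        while self.peek()[0] in ("+", "-"):
            op = self.peek()[0]
            self.index += 1
            rhs = self.term()
            result = result + rhs if op == "+" else result - rhs
        return result

    def term(self):
        # term = factor { "*" factor }
        result = self.factor()
        while self.peek()[0] == "*":
            self.index += 1
            result *= self.factor()
        return result

    def factor(self):
        # factor = number | "(" expr ")"
        value, _ = self.peek()
        if value[0] in DIGITS:
            self.index += 1
            return int(value)
        if value == "(":
            self.index += 1
            result = self.expr()
            if self.peek()[0] != ")":
                self.fail("')'")
            self.index += 1
            return result
        self.fail("a number or '('")


def parse(text):
    # Parse and evaluate a whole expression, rejecting trailing input.
    parser = Parser(text)
    result = parser.expr()
    if parser.peek()[0] != "<end>":
        parser.fail("an operator or end of input")
    return result
```

With this sketch, `parse("2 + 3 * (4 - 1)")` returns 11, while `parse("2 + * 3")` raises `ParseError` with the message `offset 4: expected a number or '(', found '*'`.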

FLEX, on the other hand, is perfect for the job. I can't imagine why anyone
would hand-write a lexer.

~~~
jcranmer
> FLEX, on the other hand, is perfect for the job. I can't imagine why anyone
> would hand-write a lexer.

Flex introduces a new dependency to your project, which involves generating
source files at build time. This is a lot of complexity in your build system
if this is the first time you need it. And it also by default uses a style
that isn't really modern--it's more annoying to use if you want to use it in a
multithreaded program, or if you want to have multiple lexers in one
program. (Or you're not even using C/C++ in the first place!).

Furthermore, lexers are actually really easy to write. Most tokens [1] are
recognizable within a character or two, and usually amount to a regex on the
order of [a-z][a-z0-9_]+ or "([^\"]|\\.)*" -- the sorts of regexes that have
very trivial implementations to handcode. The lexer I wrote for a JSON-ish
file format is 128 lines of code. Sure, it'd be more like 30 lines of code if
I used an automatic generator, but running into any issues integrating the
automatic generator into my workflow would toss out the benefits of saving 100
lines of code.

[1] Excluding keywords, but keywords are usually implemented by lexing them as
identifiers and looking each identifier up in a hashtable of known keywords.
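
To make that concrete, here is a rough sketch of such a hand-coded lexer. It is not the 128-line JSON-ish one mentioned above. The keyword set and the token kinds are invented for the example, and a Python `set` stands in for the hashtable of known keywords:

```python
# Hypothetical keyword set; a real language would list its own reserved words.
KEYWORDS = {"if", "then", "else", "while", "do", "end"}


def tokenize(text):
    """Return a list of (kind, value) pairs for a tiny illustrative language."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isalpha() or ch == "_":
            # Identifier: a letter or underscore, then letters, digits, underscores.
            start = i
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            word = text[start:i]
            # Keywords are lexed as identifiers, then looked up in the table.
            kind = "KEYWORD" if word in KEYWORDS else "IDENT"
            tokens.append((kind, word))
        elif ch.isdigit():
            start = i
            while i < n and text[i].isdigit():
                i += 1
            tokens.append(("NUMBER", text[start:i]))
        elif ch == '"':
            # String literal: a backslash escapes the character that follows it.
            start = i
            i += 1
            while i < n and text[i] != '"':
                i += 2 if text[i] == "\\" else 1
            if i >= n:
                raise SyntaxError(f"unterminated string starting at offset {start}")
            i += 1  # consume the closing quote
            tokens.append(("STRING", text[start:i]))
        else:
            # Anything else is treated as a one-character punctuation token.
            tokens.append(("PUNCT", ch))
            i += 1
    return tokens
```

Every branch decides what kind of token it is looking at from the first character, which is why the whole thing stays this short.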

------
dang
Previous versions were discussed:

2015:
[https://news.ycombinator.com/item?id=10764672](https://news.ycombinator.com/item?id=10764672)

2010:
[https://news.ycombinator.com/item?id=1529288](https://news.ycombinator.com/item?id=1529288)

2008:
[https://news.ycombinator.com/item?id=267916](https://news.ycombinator.com/item?id=267916)

A related thread from 2018:
[https://news.ycombinator.com/item?id=16609360](https://news.ycombinator.com/item?id=16609360)

------
sbmthakur
Just discovered Wirth's book _The School of Niklaus Wirth: The Art of
Simplicity_. Is it worth the time investment?

