
Just a note for anyone reading the Crenshaw book, in chapter 9 he notes:

    The C language is quite another matter, as you'll see.   Texts on
    C  rarely  include  a BNF definition of  the  language.  Probably
    that's because the language is quite hard to write BNF for.
There is now an excellent reference that includes a full BNF grammar for C (C99): http://www.amazon.com/Reference-Manual-Samuel-P-Harbison/dp/...

(This book didn't exist when the tutorial was written)

IIRC C was not context-free, which BNF requires. The problem was in variable declarations, probably structures, but I don't remember the details. (It may have been fixed by C99.) It is often possible to turn a context-sensitive grammar into a context-free one by trickery in the lexer. Because of this I don't recommend C as the source language for a first compiler, although it makes an excellent target language.

The issue, as I understand it, is that you need to know whether an identifier in certain contexts is a type name or not. C99 increases the number of contexts. Consider:

    x * y;
If x is a type name, this declares y as a pointer to that type, provided you're at the beginning of a block (or anywhere in a block in C99, which allows declarations after statements). If x is a variable, this is a multiplication in void context, which is only valid inside a function body, not at the top level of the translation unit. A somewhat more interesting example is

    x *f();
which may declare f or call it, depending on what x is. Of course, since C gives you no way to overload the multiplication operator, there's no reason you'd ever do a multiplication in void context, but it is legal.

If your lexer can distinguish type names from other identifiers, the problem is solved, and you can use BNF from there.
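
For what it's worth, here is a minimal sketch of that lexer trick in C. Every name in it (register_typedef, classify_identifier, describe_statement, the token kinds) is made up for illustration, and it assumes a single flat table of typedef names. A real compiler would also have to track scopes, since an inner declaration can hide a typedef name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token kinds; the names are illustrative only. */
enum token_kind { TOK_IDENTIFIER, TOK_TYPE_NAME };

#define MAX_TYPEDEFS 64

/* Flat table of names introduced by typedef (no scoping in this sketch). */
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count = 0;

/* The parser would call this after it finishes parsing a typedef.
 * The string is stored by pointer, so the caller must keep it alive.
 * Returns 1 on success, 0 if the table is full. */
int register_typedef(const char *name) {
    assert(name != NULL);
    if (typedef_count >= MAX_TYPEDEFS)
        return 0;
    typedef_names[typedef_count++] = name;
    return 1;
}

/* The lexer would call this for every identifier it scans, so the
 * grammar sees two distinct terminals instead of one. */
enum token_kind classify_identifier(const char *name) {
    for (int i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0)
            return TOK_TYPE_NAME;
    }
    return TOK_IDENTIFIER;
}

/* How a parser could then tell the two readings of "x * y;" apart
 * by looking only at the kind of the first token. */
const char *describe_statement(const char *first_identifier) {
    if (classify_identifier(first_identifier) == TOK_TYPE_NAME)
        return "declaration";
    return "expression";
}
```

Before register_typedef("x") is called, describe_statement("x") reports "expression"; afterwards it reports "declaration". The point is just that the parser has to feed information back to the lexer as it goes, which is why the grammar on its own is not enough.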
