
As others have said, yes, that's mostly the case.

Even parsing C and elaborating it to a correctly typed AST is not quite as simple as others are making it out to be, though. Getting all the implicit type conversions correct is not trivial, and it is a common source of bugs (see some examples in http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf for instance).
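To make the conversion point concrete, here is a small sketch of three cases a type checker has to get right. The function names are made up for illustration, and the comments assume a typical platform where `int` is wider than `unsigned char` and uses two's complement.

```c
/* Usual arithmetic conversions: the int operand is converted to
   unsigned int before the comparison, so -1 becomes UINT_MAX. */
int minus_one_less_than_one_unsigned(void) {
    int i = -1;
    unsigned int u = 1u;
    return i < u;            /* 0, not 1 */
}

/* Integer promotions: both unsigned char operands are promoted to int
   before the addition, so the sum does not wrap around at 256. */
int sum_of_unsigned_chars(void) {
    unsigned char a = 200, b = 100;
    return a + b;            /* 300 */
}

/* The operand of ~ is promoted too, so the result has type int. */
int complement_is_negative(void) {
    unsigned char c = 0;
    return ~c < 0;           /* 1: ~0 as an int is -1 */
}
```

A compiler that types `a + b` as `unsigned char`, or that compares `i < u` as signed, would produce different answers for all three.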

There are also some annoying ambiguities in the grammar and in name resolution rules that mean that sometimes it's not easy to tell whether something is supposed to be a type name or a variable name: https://jhjourdan.mketjh.fr/pdf/jourdan2017simple.pdf
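A minimal sketch of that ambiguity, with hypothetical names: the same tokens `a * b` are a declaration in one function and a multiplication in the other, depending only on what `a` names in the current scope.

```c
typedef int a;               /* at file scope, `a` names a type */

int declares_pointer(void) {
    a *b = 0;                /* declaration: b is a pointer to int */
    return b == 0;           /* 1 */
}

int multiplies(void) {
    int a = 6, b = 7;        /* this `a` is a variable that shadows the typedef */
    return a * b;            /* expression: 42 */
}
```

So the parser cannot classify the statement from the tokens alone; it needs to know which identifiers are currently typedef names.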

C is a "simple" language in many senses of the word, but it has a lot of complex details. It's fine to gloss over them when writing a compiler for a language very similar to C for learning purposes, but getting everything just right for actual C is tough.




Yes, stuff like the `a * b` ambiguity (declaration or expression?), typedef being just another storage class specifier (thus, `static int var` and `typedef int mytype` have exactly the same syntactic form), the complex rules for implicit promotions and conversions, and the mess of mixing signed and unsigned types.
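Here is a rough illustration of the "typedef is just a storage class specifier" point. Apart from `var` and `mytype`, the names are invented for the example. Because `typedef` sits in the same grammatical slot as `static`, it can follow the type specifier, and one declaration can introduce several names with different declarators, exactly like a variable declaration.

```c
static int var;              /* specifier `static`,  type `int`, declarator `var` */
typedef int mytype;          /* specifier `typedef`, type `int`, declarator `mytype` */

/* Legal, though putting a storage class specifier anywhere but first
   is marked obsolescent in the standard. */
int typedef mytype2;

/* One typedef, four names: int, pointer to int, array of 3 int,
   and function (void) returning int. */
typedef int scalar_t, *pointer_t, array_t[3], func_t(void);

func_t typedef_demo;         /* declares: int typedef_demo(void); */

int typedef_demo(void) {
    mytype a = 1;
    mytype2 b = 2;
    scalar_t c = 3;
    array_t arr = {a, b, c};
    pointer_t p = arr;
    var = p[0] + p[1] + p[2];
    return var;              /* 6 */
}
```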

All of these design warts of C show up clearly when attempting to write a compiler and are not very obvious to most users of the language.


Function pointer declarations are also a fun one, especially when they appear in a function declaration as the return type or as parameters.
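For anyone who has not run into these, a short sketch with made-up function names. The last one takes a function pointer and returns one, the same shape as the standard `signal` declaration.

```c
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* No typedef: pick() takes an int and returns a pointer to a function
   taking int and returning int. The declarator reads inside-out. */
int (*pick(int want_twice))(int) {
    return want_twice ? twice : negate;
}

/* The same thing with a typedef for the pointer type. */
typedef int (*int_fn)(int);
int_fn pick_typedef(int want_twice) {
    return want_twice ? twice : negate;
}

/* A function pointer as a parameter. */
int apply(int (*f)(int), int x) {
    return f(x);
}

/* Both at once: installs new_fn and returns the previously installed one. */
int (*swap_handler(int (*new_fn)(int)))(int) {
    static int (*current)(int) = negate;
    int (*old)(int) = current;
    current = new_fn;
    return old;
}
```

A parser has to build the right type for each of these from the nested declarator, which is where hand-written front ends tend to slip.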


Typedefs are a little trickier to handle (particularly the "when do they actually take effect" part), but "the `a * b` ambiguity" is not hard, because if 'a' is a typedef or qualifier/specifier or basically anything that a declaration can start with, it's a declaration.
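A small sketch of the "when do they actually take effect" part, using a hypothetical typedef `T`. Whether `T` is a type depends on exactly where the current scope starts and ends, so the parser has to track that while it reads declarations.

```c
typedef int T;

int shadow_then_multiply(void) {
    T T = 5;                 /* first T is the typedef, second T declares a variable */
    return T * 2;            /* T is a variable from here on: 10 */
}

int parameter_shadows(int T) {
    return T * 3;            /* the parameter hides the typedef in the body */
}

int typedef_restored_after_block(void) {
    {
        int T = 1;           /* shadows the typedef inside this block only */
        (void)T;
    }
    T x = 4;                 /* outside the block, T is a type name again */
    return x;
}
```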

C++, on the other hand, is definitely far harder to parse, especially if you include things like templates.



