
Dave Prosser's C Preprocessing Algorithm (2006) - pcr910303
https://www.spinellis.gr/blog/20060626/
======
foundry27
For anyone interested in examples of software where true preprocessor standard
conformance is essential, check out
[https://github.com/rofl0r/order-pp](https://github.com/rofl0r/order-pp).
It’s a functional programming
language built on the C preprocessor that can essentially output any sequence
of preprocessing tokens, with high-level language features like closures,
lexical scoping, first-class functions, reflection, an eval primitive, etc. It
also provides data structures like sequences and tuples with many functions
for operating on them, and has arbitrary-precision arithmetic support. Pretty
neat stuff.

~~~
RossBencina
Interesting. Would you happen to have a link to some human-readable
documentation and/or examples? I couldn't make much sense of the grammar in
[1] without usage examples.

[1] [https://github.com/rofl0r/order-pp/blob/master/doc/notes.txt](https://github.com/rofl0r/order-pp/blob/master/doc/notes.txt)

~~~
lifthrasiir
There are some annotated examples [1].

[1] [https://github.com/rofl0r/order-pp/tree/master/example](https://github.com/rofl0r/order-pp/tree/master/example)

------
anonsivalley652
Note that this operates on a stream of preprocessing tokens, transforming the
source file into output with no macro invocations left to expand. It's not the
same lexer as the one in the C compiler proper, which doesn't know anything
about macros.

Also, on the C compiler side, people might be interested in the lexer hack,
which exists because C's grammar isn't context-free.

[https://en.wikipedia.org/wiki/Lexer_hack](https://en.wikipedia.org/wiki/Lexer_hack)
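
For a concrete feel of the ambiguity behind the lexer hack, here is a minimal sketch. The names `T`, `p`, `as_declaration` and `as_expression` are made up for illustration. The same three tokens `T * p` parse as a declaration or as a multiplication depending on whether `T` currently names a type:

```c
typedef int T;

/* Here T names a type, so `T * p` declares p as a pointer to int. */
int as_declaration(void) {
    T * p = 0;
    return p == 0;
}

/* Here a local variable shadows the typedef, so the same tokens
   `T * p` form a multiplication expression instead. */
int as_expression(void) {
    int T = 3, p = 4;
    return T * p;
}
```

The parser can only pick the right reading if the lexer (or something next to it) knows which identifiers are typedef names in the current scope, and that feedback is the "hack".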

------
userbinator
It is left as an exercise to the reader to implement this algorithm
efficiently, but I really like the elegant use of recursion.

For those who can't be bothered diffing, the fix in the corrected version is
inside the 3rd condition of subst().

------
Koshkin
Looks like it’s the correct handling of tokens’ hide-sets that is the key. (A
hide-set is used for preventing the infinite recursion in the macro-expansion
process when a token appears somewhere inside its own definition.)
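
A small sketch of the behaviour the hide-set produces, as seen from ordinary C code. The identifiers `hits`, `ping`, `pong` and the two helper functions are hypothetical, not taken from Prosser's write-up; the point is only that a macro name is not expanded again inside its own expansion:

```c
/* Illustrative only: all names here are made up for the example. */
int hits = 10;
/* Self-referential macro: while `hits` is being expanded, the name is in
   the hide-set, so the inner `hits` is left alone and refers to the
   variable declared above. */
#define hits (hits + 1)

int ping = 1, pong = 2;
/* Mutually recursive macros: ping -> pong -> ping, and that last `ping`
   is hidden, so expansion stops there instead of looping forever. */
#define ping pong
#define pong ping

/* Expands to (hits + 1), evaluated against the variable. */
int self_reference(void) { return hits; }

/* Expands ping -> pong -> ping, ending at the variable ping. */
int mutual_reference(void) { return ping; }
```

Calling these from a `main` and printing the results should show the variable plus one for the first and the plain `ping` variable for the second, assuming a conforming preprocessor.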

