
Scandalous weird old things about the C preprocessor (2015) - fanf2
https://blog.robertelder.org/7-weird-old-things-about-the-c-preprocessor/
======
cafard
Hmm. The _Intermediate Greek Lexicon_ gives the original meaning of
"skandalon" as "a trap or a snare laid for an enemy":
[http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%...](http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0058%3Aalphabetic+letter%3D*s%3Aentry+group%3D9%3Aentry%3Dska%2Fndalon)

------
mehrdadn
One thing not mentioned here: it seems to me (though I'm not 100% sure) C++11
kind of wrecked macro preprocessing. Until it came along, I _think_ you could
ignore C++ tokens and just pay attention to the preprocessor directives when
figuring out e.g. what to #include, because strings couldn't span multiple
lines, so you could easily identify the directives with a # in the beginning
of the line. But C++11 introduced raw string literals, and it seems to me
now you have to actually tokenize C++ constructs just to figure out how to do
macro preprocessing. This means C++ preprocessing actually is a lexically
different language than C preprocessing, which is kind of bewildering to me
(notwithstanding more obvious differences like __cplusplus and such) and which
means you can no longer e.g. just process the #include's without regard to the
C++ tokens.

EDIT: Never mind, I'm wrong. Totally missed that line continuations can occur
in quotes :(

~~~
user982
_> I think you could ignore C++ tokens and just pay attention to the
preprocessor directives when figuring out e.g. what to #include, because
strings couldn't span multiple lines, so you could easily identify the
directives with a # in the beginning of the line._

    
    
      char *cstr = "This \
      #is a valid "
      "C string";
    

If I understand you correctly, is this a counterexample?

~~~
mehrdadn
Aw shoot, yeah it is. :( Thanks!
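
For anyone who wants to check the counterexample, here is a minimal sketch in C. The function name `continued_string` is made up for illustration. The backslash-newline is spliced away before directives are recognized, so the physical line that starts with `#` ends up inside the string literal rather than being read as a directive.

```c
/* Hypothetical helper name, used only for this illustration. */
static const char *continued_string(void)
{
    /* The backslash at the end of the next line joins it with the line
       after it, so that line is part of the literal even though it
       starts with '#'. Adjacent literals are then concatenated. */
    return "This \
#is a valid "
           "C string";
}
```

Calling `continued_string()` should give back the single string `This #is a valid C string`, which is why a tool that only looks for `#` at the start of a physical line can be fooled.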

------
ScottBurson
While I love to hate on the preprocessor -- and did so publicly once in _The
UNIX-HATERS Handbook_ (p. 211) -- item 4 is silly. The _whole point_ of many
macros is to circumvent referential transparency! This is true even in a
language with real macros, like Lisp. ("Real macros" operate on a syntax tree
representation, not on character strings.)

------
Ididntdothis
All I can say is that I really miss the preprocessor. There are so many things
that could easily be done with the preprocessor but instead have to be done
with reflection or code generation in a more complex way.

Sure, it can be abused/misused, but I am not sure that's a good reason to
take it away completely, as C# does, for example.

~~~
ygra
C# has other mechanisms to achieve what macros are commonly used for, though.
In a bit, with Roslyn's source generators there will be an even better way for
some of them.

~~~
rwmj
One place where preprocessors shine is where the language itself has changed
in an incompatible way, where "language" can be interpreted broadly including
calls to external libraries. For example some function you need to call added
an extra parameter:

    
    
      #if LIB_VERSION_GE_3
        lib_f (1, NULL);
      #else
        lib_f (1);
      #endif
    

It can be difficult to do this with other mechanisms where the compiler is
actually compiling and therefore type checking both branches.

OCaml is a pretty rich language with many features, but we fall back on
preprocessors like cppo (or even cpp) to deal with these kinds of cases.
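
A self-contained sketch of that pattern, under loud assumptions: `LIB_VERSION_GE_3` is set by hand here (a real library header would provide it), and `lib_f` is a stand-in defined locally so the example compiles on its own. The wrapper name `call_lib_f` is also invented.

```c
#include <stddef.h>

/* Assumption: a real library header would define this; set by hand here. */
#define LIB_VERSION_GE_3 1

/* Stand-in for the library: only one of the two signatures ever exists. */
#if LIB_VERSION_GE_3
static int lib_f(int n, const void *options) { (void)options; return n; }
#else
static int lib_f(int n) { return n; }
#endif

/* Hypothetical wrapper that hides the signature change from callers. */
static int call_lib_f(void)
{
#if LIB_VERSION_GE_3
    return lib_f(1, NULL);   /* newer API: extra parameter */
#else
    return lib_f(1);         /* older API: single parameter */
#endif
}
```

The branch that is not selected is discarded before the compiler proper sees it, so the call with the wrong number of arguments is never type checked. That is the property that is hard to get from mechanisms that compile both branches.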

~~~
pjmlp
You can do that with .NET conditional compilation, no need for the C style
macros.

~~~
rwmj
Assuming you mean this: [https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/preprocessor-directives/preprocessor-if)
it looks like a restricted form of the C preprocessor. (We may quibble about
whether it is separately "pre" processed or not, but even cpp doesn't run as a
separate stage in modern C compilers).

~~~
Ididntdothis
The C# preprocessor is unnecessarily restricted. It has a few features but it
feels very arbitrary what they include or exclude. I think they didn’t like
the idea but couldn’t avoid it totally, so they did something half-assed. I bet
a lot of frameworks that need reflection could be replaced with macros if the
preprocessor had a few more features.

~~~
pjmlp
It is not half-assed; rather, it is what has been proven since the '60s to work
properly in all languages that are neither C nor copy-paste compatible with C.

Even C++ is reducing the need for the preprocessor with each ISO revision. In
an ideal modern C++ world without C legacy baggage, the preprocessor would be
as half-assed as the one in C#. C++20 is already almost there.

------
bla3
(2015)

------
coliveira
The author tries to compare the preprocessor with programming languages,
but he seems to forget that the preprocessor is, above all, not a language!
There is no parse tree for a preprocessor, since it is just putting text
strings together, with little regard for the resulting syntax. That's why it
is stupid to try to use the Cpp as a programming language, it won't work.
Think of it as a text replacement engine, like an integrated sed, for example.
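
A small illustration of the "no parse tree" point, as a sketch only. The macro `DOUBLE` and the function name are hypothetical, and the macro is deliberately written without parentheses to show that expansion pastes tokens in place with no notion of a subexpression.

```c
/* Hypothetical macro, deliberately written without parentheses. */
#define DOUBLE(x) x + x

static int doubled_then_scaled(void)
{
    /* Expands to 3 + 3 * 2, which is 9, not (3 + 3) * 2 == 12. */
    return DOUBLE(3) * 2;
}
```

The usual defence is to parenthesize both the parameter and the whole body, which is a workaround for the lack of structure rather than a sign that the preprocessor understands expressions.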

~~~
raverbashing
Ahem
[https://www.ioccc.org/years.html#2001_herrmann1](https://www.ioccc.org/years.html#2001_herrmann1)

------
WalterBright
He missed one important thing - the C preprocessor is completely unaware of C
types. Its expressions follow different rules.

A major goal of D was to make the language expressive enough to not leave room
for the preprocessor:

1\. modules

2\. manifest constants

3\. nested functions

4\. lambdas

5\. static if

6\. compile time function execution

7\. string mixins

~~~
TheSoftwareGuy
I don't think he did miss that, rather it simply wasn't on-topic for the
article he was writing. Anybody who has used the C preprocessor knows it works
at the level of text/tokens and no further. The fact that it knows nothing of
types would hardly be "scandalous"

But you can go ahead and use this thread to promote D anyways.

~~~
WalterBright
The #if does more than just text and tokens, it has an expression grammar
(constant-expression). It's implementation-defined whether the following two
produce the same value:

    
    
        #if 'z' - 'a' == 25
        if ('z' - 'a' == 25)
    

Another difference is all constants are typed as intmax_t and uintmax_t, not
the usual C types. Yes, it does have types.

This difference in behavior from C expressions is not necessary.
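
One way to see the intmax_t/uintmax_t rule in action, as a hedged sketch: the macro and function names below are invented for illustration. The same expression, `0u - 1`, wraps at a different width depending on whether `#if` or the compiler proper evaluates it.

```c
#include <limits.h>
#include <stdint.h>

/* In #if, 0u behaves as uintmax_t, so 0u - 1 wraps to UINTMAX_MAX. */
#if (0u - 1) == UINTMAX_MAX
#define PP_WRAPS_AT_UINTMAX 1
#else
#define PP_WRAPS_AT_UINTMAX 0
#endif

/* In ordinary C, 0u is unsigned int, so 0u - 1 wraps to UINT_MAX. */
static int c_wraps_at_uint(void)
{
    return (0u - 1) == UINT_MAX;
}

/* True only where unsigned int happens to be as wide as uintmax_t. */
static int c_wraps_at_uintmax(void)
{
    return (0u - 1) == UINTMAX_MAX;
}
```

On a typical platform with a 32-bit `unsigned int` and a 64-bit `uintmax_t`, `PP_WRAPS_AT_UINTMAX` is 1 while `c_wraps_at_uintmax()` returns 0, so the two contexts disagree about the very same tokens.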

------
joosters
Which of these are scandalous, exactly?

If the OP is aggrieved by the operation of cpp, perhaps they could have
explained in each of their points what cpp is doing wrong, and how they would
improve the standard. Instead, they just list 'odd' behaviour, with no comment
about why it is wrong and how it could be better.

~~~
klyrs
This document is immensely valuable in enumerating pitfalls in the spec and in
how the spec is interpreted by various implementations. I would wager that
most good-faith* C programmers are wary of overusing cpp, and use a limited
subset of its capabilities. This is a great resource to throw at a junior who
gets too excited about macros, for example.

* where I'm defining "bad-faith" C programmers as using it for sadistic purposes like IOCCC, golf, etc

> perhaps they could have explained in each of their points what cpp is doing
> wrong, and how they would improve the standard.

This is a fairly toxic attitude, and possibly the reason that your comment is
unpopular. A child can correctly observe that an emperor is not wearing
clothes, and bring that up without possessing the skill to make the emperor a
suitable outfit.

~~~
joosters
I just found the article odd; it starts off by declaring the behaviour of cpp
to be objectionable, but then it just lists different aspects of its actions.
At no point does the OP complain that cpp is doing the _wrong_ thing - indeed,
they point out in several places that it couldn't really be changed to do
anything else because either it would break existing code, or would be just as
odd if it behaved in another way. So I'm wondering just what it was they were
scandalized about? I'm not demanding that they fix cpp themselves!

At least the child told people that the emperor was wearing no clothes; a more
apt analogy with this article would be as if the child merely said "that's
scandalous!" without telling anyone about the missing clothes...

~~~
klyrs
> 1) No Comprehensive Standard

Here the "scandal" is that the "standard" is poorly specified, and therefore
isn't much of a standard. This seems to be the root of the scandal, and all
else appears to flow from here.

> 2) Context Free, Just Kidding!

Here the "scandal" concerns context sensitivity -- the author
vacillates about the status of the "scandal" but still, it seems clear that
they're identifying context-sensitivity as scandalous.

> 3) Whitespace Insensitive, Just Kidding!

Here the author considers irregularities in how "whitespace sensitive" cpp is
-- both in the "spec" and in implementations, with an example of non-
portability between gcc and clang.

et cetera

