
Some Obscure C Features - Morhaus
https://multun.net/obscure-c-features.html
======
conistonwater
In numerical analysis, "Hexadecimal float with an exponent" is not an obscure
feature, it's a really nice one! If you want to communicate to somebody an
exact number that your program has output, you need to either tell them the
decimal number to enough digits + the number of digits (i.e., "Float32(0.1)",
which is distinct from "Float64(0.1)"), or you can tell them the same number
in full precision in binary, in which case the floating-point standard
guarantees that that number is exactly correct and does not depend on how you
interpret it. It's really nice for testing numerical code, especially with
automated reproducible tests. Completely unambiguous, and I wish more
languages had that (I saw the feature in Julia first).
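
For readers who have not used the feature, here is a minimal sketch in C99. The names `tenth_f64`, `tenth_f32` and `hex_round_trip` are made up for illustration; the literals are the nearest double and float to 0.1 written as hexadecimal floats, and the helper prints a value with `%a` and parses it back with `strtod`.

```c
#include <stdio.h>
#include <stdlib.h>

/* The decimal literal 0.1 rounds to a different value in each format,
   whereas the hexadecimal forms spell out the exact bits. */
static const double tenth_f64 = 0x1.999999999999ap-4; /* nearest double to 0.1 */
static const float  tenth_f32 = 0x1.99999ap-4f;       /* nearest float to 0.1 */

/* Print x with "%a" into buf and parse it back with strtod.
   The parsed value should compare equal to x. */
double hex_round_trip(double x, char *buf, size_t size)
{
    snprintf(buf, size, "%a", x);
    return strtod(buf, NULL);
}
```

The exact text `%a` produces can differ between C libraries, but the value it denotes should not, which is what makes it convenient for reproducible tests.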

~~~
jacobolus
I wish Javascript, etc. had hexadecimal floats. It’s annoying to worry about
whether different implementations might parse your numbers differently,
worrying about whether you need to write down 15 or 17 decimal digits, ...

Often the numbers (e.g. coefficients of some degree 10 polynomial
approximation of a special function) are not intended to be human-readable
anyway. Such values were typically computed in binary in the first place, and
the only place they ever need to be decimal is when written into the code,
where they will be immediately reconverted to binary numbers for use.

~~~
microcolonel
I mean, it's not too difficult to add them. You can parse, shove your result
into a data view, Uint32Array, or whatever, then turn that into a
Float64Array.

~~~
jacobolus
Fair enough. You can also just base64-encode (or whatever) a big block of
binary data. But having built-in support for hex floats would be nicer.

------
elteto
The preprocessor trick of passing function macros as parameters is not that
obscure. I have seen it used and I've used it myself. It is very useful when
you have a list of static "things" that you need to operate on.

Say I have a static list of names and I would like to declare some struct type
for each name. I also would like to create variables of these structs at some
point, and I would always do so for the entire block of names. You could do
something like this:

    
    
        #define apply(fn) \
            fn(name1) \
            fn(name2) \
            fn(name3) \
            ...
            fn(nameN)
    
        #define make_struct(name) struct name##_t { ... };
        #define make_variable_of(name) struct name##_t name;
        
        ...
    
        apply(make_struct);  // This defines all the structs.
     
        void some_function(...) {
            apply(make_variable_of);  // And this defines one variable of each type.
        }
    

Yes, it is not pretty (it is the C preprocessor after all), but it can be very
useful and clean.
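
A compilable sketch of the same idea, with a hypothetical list of three names (`red`, `green`, `blue`) and a placeholder `int value` member standing in for the elided struct body. It also reuses the list to build a string table and a count, which is where the pattern tends to pay off.

```c
/* Hypothetical name list; each entry is handed to whichever macro is passed in. */
#define FOR_EACH_COLOR(fn) \
    fn(red)                \
    fn(green)              \
    fn(blue)

#define MAKE_STRUCT(name)   struct name##_t { int value; };
#define MAKE_VARIABLE(name) struct name##_t name = { 0 };
#define MAKE_STRING(name)   #name,
#define COUNT_ONE(name)     + 1

FOR_EACH_COLOR(MAKE_STRUCT)   /* defines struct red_t, green_t and blue_t */

/* The same list, reused to build a string table and a count. */
static const char *const color_names[] = { FOR_EACH_COLOR(MAKE_STRING) };
enum { COLOR_COUNT = 0 FOR_EACH_COLOR(COUNT_ONE) };

int sum_of_colors(void)
{
    FOR_EACH_COLOR(MAKE_VARIABLE)   /* one local variable per name */
    red.value = 1;
    green.value = 2;
    blue.value = 3;
    return red.value + green.value + blue.value;
}
```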

~~~
kzrdude
I would call that the X macro pattern, but the wiki article doesn't agree that
it should pass the `fn` as the argument. Not sure if that's important..
[https://en.wikipedia.org/wiki/X_Macro](https://en.wikipedia.org/wiki/X_Macro)

~~~
elteto
Maybe at some point macros could not be passed as arguments? I honestly don’t
know. Passing it as a parameter avoids all that define/undefine business.

------
kazinator
Most of these are due to the cruft added in C99 and later.

Compile-time trees are possible without compound literals.

More than twenty years ago, I made a hyper-linked help screen system for a GUI app
whose content was all statically declared C structures with pointers to each
other.

At file scope, you can make circular structures, thanks to tentative
definitions, which can forward-declare the existence of a name, whose
initializer can be given later.

Here is a compile-time circular list

    
    
       struct foo { struct foo *prev, *next; };
    
       struct foo n1, n2; /* C90 "tentative definition" */
    
       struct foo circ_head = { &n2, &n1 };
    
       struct foo n1 = { &circ_head, &n2 };
    
       struct foo n2 = { &n1, &circ_head };
    

You can't do this purely declaratively in a block scope, because the tentative
definition mechanism is lacking.
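
A self-contained sketch of that list with a traversal helper, so the links can be checked at run time. The `label` member and the `ring_length` function are additions for illustration, and `head` stands in for `circ_head`.

```c
struct node { struct node *prev, *next; int label; };

struct node n1, n2;   /* tentative definitions: the names exist from here on */

struct node head = { &n2, &n1, 0 };
struct node n1   = { &head, &n2, 1 };
struct node n2   = { &n1, &head, 2 };

/* Follow the next pointers until we are back at the starting node. */
int ring_length(const struct node *start)
{
    int count = 0;
    const struct node *p = start;
    do {
        ++count;
        p = p->next;
    } while (p != start);
    return count;
}
```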

About macros used for include headers, those can be evil. A few years ago I
had this:

    
    
       #include ALLOCA_H  /* config system decides header name */
    

Broke on Musl. Why? ALLOCA_H expanded to <alloca.h>. But unlike a hard-coded
#include <alloca.h>, this <alloca.h> is just a token sequence that is itself
scanned for more macro replacements: it consists of the tokens
{<}{alloca}{.}{h}{>}. The <stdlib.h> on Musl defines an alloca macro (an
object-like one, such as #define alloca __builtin_alloca, rather than a function-like one),
and that got replaced inside <alloca.h>, resulting in a garbage header name.
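
The rescanning can be observed without touching any real header by stringifying the expanded tokens. In this sketch every name is a hypothetical stand-in: `WIDGET_H` plays the role of ALLOCA_H, and the object-like macro `widget` plays the role of the alloca macro.

```c
#include <string.h>

#define STRINGIFY(x) #x
#define EXPAND_AND_STRINGIFY(x) STRINGIFY(x)

/* Hypothetical stand-ins for the config macro and the libc macro. */
#define WIDGET_H <widget.h>
#define widget gadget

/* WIDGET_H expands to the tokens < widget . h >, and those tokens are
   scanned again, so the identifier "widget" is replaced as well. */
static const char expanded_header[] = EXPAND_AND_STRINGIFY(WIDGET_H);

int header_name_was_rewritten(void)
{
    return strstr(expanded_header, "gadget") != NULL
        && strstr(expanded_header, "widget") == NULL;
}
```

A hard-coded `#include <widget.h>` would not be affected, because there the header name is a single token that is never macro-expanded.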

~~~
quietbritishjim
I haven't heard of "tentative definitions" before. Couldn't you just replace
it with a regular declaration i.e.

    
    
        extern struct foo n1, n2;
    

Is there any benefit of tentative definitions over this?

~~~
kazinator
Yes; the "extern" is potentially confusing when the definition is located in
the same file below. They usually inform the reader of the code to look in
some other file for the definition, and usually appear in header files.

------
compi
I was reading the source code for a NES assembler written in pre-C99 C, and
there was an odd C feature used in it that I haven't really seen anywhere
else.

It was before C had built-in booleans and the author had defined their own,
but true was:

    
    
        void * true_ptr  = &true_ptr;
    

true_ptr is a pointer to itself. So however many times you dereference it:

    
    
        printf("%p\n", true_ptr);
        printf("%p\n", &true_ptr);
        printf("%p\n", *((void**)true_ptr));
        printf("%p\n", *((void**)*((void**)true_ptr)));
    

You get the same pointer:

    
    
        0x5555fefe715b
        0x5555fefe715b
        0x5555fefe715b
        0x5555fefe715b
    

I still think that it's neat that, even with ASLR, you have an address at
compile time that you know won't collide with address space of malloc results,
or the address space of your stack.

Also you can declare the pointer as const and the value it points to as const
and, if your kernel faults on writing to readonly memory pages, you get a
buggier version of a NULL pointer that only segfaults on write.

Also it takes a second to figure out why the position of the const matters
even though the pointer's value is the value of the pointer, and why only one
of these segfaults on write:

    
    
        const void * const_pointer = &const_pointer;
        void * const const_value   = &const_value;

~~~
davelee
I've used this feature to create constants that are unique IDs.

~~~
shultays
Wouldn't a count macro work as well?

~~~
davelee
I think that would be per file/translation unit. These values are unique
across a linked binary.
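
A rough sketch of the idea, with hypothetical names throughout. Each tag object exists only so that its address can serve as the ID; distinct objects have distinct addresses, so no numbering scheme is needed. The tags are `static` here to keep the example in one file; sharing them between files would need `extern` declarations in a header.

```c
#include <stddef.h>

/* Each tag object exists only to have an address; its value is irrelevant. */
static const char tag_circle = 0;
static const char tag_square = 0;

typedef const void *shape_id;

#define SHAPE_CIRCLE ((shape_id)&tag_circle)
#define SHAPE_SQUARE ((shape_id)&tag_square)

/* Map an ID back to a printable name, or NULL for an unknown ID. */
const char *shape_name(shape_id id)
{
    if (id == SHAPE_CIRCLE) return "circle";
    if (id == SHAPE_SQUARE) return "square";
    return NULL;
}
```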

------
vq
No mention of trigraphs? They are one of my favourite obscure C language
features that I've never used.

Excerpt from GCC man page:

    
    
      Trigraph:       ??(  ??)  ??<  ??>  ??=  ??/  ??'  ??!  ??-
      Replacement:      [    ]    {    }    #    \    ^    |    ~
    

Missing backslash on your keyboard? No problem, just type ??/ instead.

~~~
cpeterso
Related to trigraphs are the alternative logical operator keywords like `and`
and `or`. I'm surprised people don't use them more often because they're nicer
to read than && and ||. In C you must #include <iso646.h>, which defines them
as macros; in C++ they're standard keywords.

C++ code example on Godbolt:
[https://godbolt.org/z/ED6tXK](https://godbolt.org/z/ED6tXK)

[https://en.cppreference.com/w/cpp/language/operator_alternat...](https://en.cppreference.com/w/cpp/language/operator_alternative)
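
A small C sketch, with made-up function names, showing that the logical and bitwise operators have separate spellings in <iso646.h>:

```c
#include <iso646.h>

/* "and", "or" and "not" are the logical operators &&, || and !. */
int in_range(int x, int lo, int hi)
{
    return x >= lo and x <= hi;
}

int outside_range(int x, int lo, int hi)
{
    return not in_range(x, lo, hi);
}

/* The bitwise operators are spelled bitand, bitor, xor and compl. */
unsigned low_nibble(unsigned x)
{
    return x bitand 0xFu;
}
```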

~~~
paulrpotts
Hmmm... I would not prefer using "and" and "or" because that syntactic sugar
obscures whether the bitwise or logical operations are intended. You get
really used to reading && and || as "and" and "or" in your head after the
first two decades of C programming : )

------
piadodjanho
A more obscure feature is the uncommon usage of comma operator. We often use
the comma operator in variable declaration and in for loops. But it can also
be used in any expression.

For instance, the next line has a valid C construct:

    
    
       return a, b, c;
    

This is particularly useful for setting a variable when returning after an error.

    
    
       if (ret = io(x))
           return errno = 10, -1;
    

The possibilities are endless. Another example:

    
    
        if (x = 1, y = 2, x < 3)
         ...
    

But the comma operator really shines when used in conjunction with macros.
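
One possible illustration, using a hypothetical `FAIL_WITH` macro and `digit_value` helper: the macro sets errno and yields -1 as a single expression, so it can sit directly after `return`.

```c
#include <errno.h>

/* Set errno and yield -1 in one expression. */
#define FAIL_WITH(code) (errno = (code), -1)

/* Hypothetical helper: converts one decimal digit character to its value,
   or fails with EINVAL for anything else. */
int digit_value(int ch)
{
    if (ch < '0' || ch > '9')
        return FAIL_WITH(EINVAL);
    return ch - '0';
}
```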

~~~
sfoley
The comma character as a token in variable declarations is not the same
thing as the comma operator.

~~~
piadodjanho
You are correct. My mistake.

------
nothis
All craziness but then:

>a[b] is literally equivalent to *(a + b).

Is this obscure? I thought that's pretty much the first thing you learn about
arrays in C? It's pointers, all the way down.

~~~
atq2119
Not really. The way it's usually introduced is that you get the same
"reference" both ways. The fact that it's _literally_ equivalent, and
especially that there's no pointer type requirement on the left-hand-side,
with the consequence of allowing ridiculous code like 2[array], is pretty
obscure. Even more so because the equivalence _doesn't_ work that way in C++
-- in general, features of C that _aren't_ available in C++ tend to be not as
widely known.
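
To make the equivalence concrete, here is a short sketch in C (the function names are invented for the example). All four functions read the same kind of element, just spelled differently.

```c
int by_index(const int *a, int i)   { return a[i]; }
int by_pointer(const int *a, int i) { return *(a + i); }
int by_swapped(const int *a, int i) { return i[a]; }       /* *(i + a) */
int by_offset_base(const int *a)    { return 1[a + 1]; }   /* same as a[2] */
```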

~~~
eMSF
>Even more so because the equivalence doesn't work that way in C++

What do you mean? As far as I remember, C++ is very similar in this respect
when it comes to array and pointer types.

~~~
atq2119
Only for the most primitive types, though. For anything else, operator[] is
used in C++ instead, without an operator+ fallback. So for example, if your
array is a std::array instead of a C-style array, saying 1[array] will not
work.

------
jimmoores
These examples actually were more obscure than the usual list.

~~~
dkersten
Yeah, I did know 3 of them[1] but the rest were completely new to me! Not that
I’m an expert or anything, but I tend to enjoy reading about obscure C
features. That sizeof can have side effects is... a bit crazy although the
multiple compatible function declarations are horrifying.

[1] Array designators, Preprocessor is a functional language and a[b] is a
syntactic sugar.

------
needs
A little off-topic but here is a cool piece of code about a special case in C:

    
    
        void (*foo)() = 0;
        void (*bar)() = (void *)0;
        void (*baz)() = (void *)(void *)0; // Error
    

Can you guess why compilers reject the last line?

~~~
eMSF
Only the first two expressions on the right are null pointer constants
(integral constant expression with a value of 0, optionally cast as a void *),
that can be used to initialize all pointer variables, including function
pointers. The last one is merely a null pointer (to void), that can't be
implicitly converted to a pointer to a function.

C++ has stricter rules for null pointer constants, and thus only the first
version is valid C++.

------
bsder
Although calling the preprocessor "functional" is being too pleasant. The C
preprocessor was always a text substitution system, so macro as parameter is
not that "obscure". Of course, I may have missed something subtle in the
example.

It's also not clear how to use that preprocessor example.

~~~
augusto-moura
see [http://conal.net/blog/posts/the-c-language-is-purely-functio...](http://conal.net/blog/posts/the-c-language-is-purely-functional)

old but gold

~~~
juped
>The C ADT is implemented simply as String (or char *, for you type theorists,
using a notation from Kleene)

Still makes me laugh

------
jcranmer
sizeof actually doesn't evaluate its expression for side effects most of the
time; only if the operand is a variable-length array is it evaluated.
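
A sketch of the difference, assuming a compiler with VLA support (GCC and Clang behave this way as far as I know). The function names are made up.

```c
#include <stddef.h>

/* The operand has type int, so it is not evaluated: *counter is unchanged. */
size_t size_of_plain_operand(int *counter)
{
    return sizeof((*counter)++);
}

/* The operand is a variable length array type, so its size expression is
   evaluated: *counter is incremented. *counter must be positive. */
size_t size_of_vla_operand(int *counter)
{
    return sizeof(int[(*counter)++]);
}
```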

~~~
_kst_
And it's not even entirely clear what that means.

The standard says:

"If the type of the operand is a variable length array type, the operand is
evaluated; otherwise, the operand is not evaluated and the result is an
integer constant."

But what does it mean to evaluate the operand?

If the operand is the name of an object:

    
    
        int vla[n];
        sizeof n;
    

what does it mean to evaluate `n`? Logically, evaluating it should access the
values of its elements (since there's no array-to-pointer conversion in this
context), but that's obviously not what was intended.

And what about this:

    
    
       sizeof (int[n])
    

What does it mean to "evaluate" a type name?

It's not much of a problem in practice, but it's difficult to come up with a
consistent interpretation of the wording in the standard.

~~~
jcranmer
> If the operand is the name of an object [...] what does it mean to evaluate
> `n`?

When you say "n", syntactically, you have a primary expression that is an
identifier. So you follow the rules for evaluating an identifier, which will
produce the value. C doesn't describe it very well, but the value of the
expression is the value of the object. In terms of how it is implemented in
actual compilers, this would mean issuing a load of the memory location, which
is dead unless `n` is a volatile variable.

~~~
_kst_
Sorry, in the first example I meant to write (adding a declaration and
initialization for n):

    
    
        int n = 42;
        int vla[n];
        sizeof vla;
    

(not `sizeof n`). (It doesn't look like I can edit a comment.)

Logically, evaluating the expression `vla` would mean reading the contents of
the array object, which means reading the value of each of its elements. But
there's clearly no need to do that to determine its size -- and if you
actually did that, you'd have undefined behavior since the elements are
uninitialized. (There are very few cases where the value of an array object is
evaluated, since in most cases an array expression is implicitly converted to
a pointer expression.)

In fact the declaration `int vla[n];` will cause the compiler to create an
anonymous object, associated with the array type, initialized to `n` or `n *
sizeof (int)`. Evaluating `sizeof vla` only requires reading that anonymous
object, not reading the array object. The problem is that the standard doesn't
express this clearly or correctly.

~~~
enriquto
Your example is unnecessarily tame. The value of n need not be known at
compile time, e.g. it may be input by the user at runtime.

------
shift_reset
The article has examples of array parameters like

    
    
        int b[const 42][24][*]
    

but you can get more fun with variably modified array parameters instead of
just constants like 42. For example,

    
    
        double sum_a_weird_shaped_matrix(int n, double array[n][3*n]) {
            double total = 0;
            for (int x = 0; x < n; ++x) {
                for (int y = 0; y < 3 * n; ++y) {
                    total += array[x][y];
                }
            }
            return total;
        }
    

has a variable and a more complicated expression in those positions.
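
For completeness, a sketch of what a call site might look like. `sum_matrix` is the same function under a shorter name, and `sum_example` is a hypothetical caller that fills a 2 by 6 array with 1 through 12.

```c
double sum_matrix(int n, double array[n][3 * n])
{
    double total = 0;
    for (int x = 0; x < n; ++x) {
        for (int y = 0; y < 3 * n; ++y) {
            total += array[x][y];
        }
    }
    return total;
}

/* With n = 2 the parameter has type double (*)[6], so a double[2][6] fits. */
double sum_example(void)
{
    double grid[2][6];
    int next = 1;
    for (int x = 0; x < 2; ++x) {
        for (int y = 0; y < 6; ++y) {
            grid[x][y] = next++;
        }
    }
    return sum_matrix(2, grid);   /* 1 + 2 + ... + 12 */
}
```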

But those variably modified parameters can have arbitrary expressions in them,
like

    
    
        int last(size_t len, int array[restrict static (printf("getting the last element of an array of %zu ints\n", len), len--)]) {
            return array[len];
        }
    

C++ denies us this particular joy which could have made function overload
resolution even more fun.

~~~
stevenhuang
Another neat note IIRC is that array parameter sizes don't actually do
anything, they just are there for semantic purposes and get treated as raw
pointers.

So if you do

    
    
        void func(int x[10]);
    

You're free to call it like

    
    
        int k[5];
        func(k);
    

And you won't get any warnings. Unsettling!

~~~
shift_reset
That's what the static keyword means in those array declarators.

    
    
        void func(int x[static 10]);
    

must be called with an argument that is a pointer to the start of a big enough
array of int. I can't get recent GCC or Clang to warn on violations of this,
though.

~~~
spc476
There are cases where the compiler _can't_ enforce it:

    
    
        void foo(int *p)
        {
          func(p);
        }
    

How can the compiler know if `p` points to space for 10 integers?

------
stefan_
A nice but I guess more of a linker feature is if you declare a function as
__weak__, you can check it at runtime for == NULL to determine if the
application was built with the function defined.
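
Roughly like this, assuming GCC or Clang on an ELF platform such as Linux, where the attribute is spelled `__attribute__((weak))`. The hook name is hypothetical, and other toolchains may treat undefined weak symbols differently.

```c
#include <stddef.h>

/* Declared weak and never defined in this file: if nothing else in the link
   provides it, its address compares equal to a null pointer. */
extern void optional_hook(void) __attribute__((weak));

int hook_is_available(void)
{
    return optional_hook != NULL;
}

void call_hook_if_present(void)
{
    if (optional_hook != NULL)
        optional_hook();
}
```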

~~~
multun
I would say most interesting linker features are non standard, so outside of
the scope of this article :/

~~~
cryptonector
True, however, it is precisely the stupid linker tricks that make C such an
interesting and powerful language nowadays. Weak symbols. Interposition
(LD_PRELOAD). dlopen() and friends. Filters. Direct binding / versioned
symbols. ELF semantics in general (which make the use of one flat symbol
namespace safer).

------
Keyframe
One of the more obscure, yet often employed, "features" is to use something
else as a C preprocessor.

------
tasty_freeze
This was C++ and not C, but it is a preprocessor pitfall.

I needed to compare an older and newer version of some file from the RCS, so
I saved temporary copies named "new" and "old". diff told me what I needed to
know, but I failed to delete those temp files.

Hours later I typed "make" to build my program and got all sorts of errors
deeply nested in some library function. Did someone misconfigure the server I
was on? OK, maybe it is an incremental build problem? etc. It took too long
to figure out the problem.

It turns out that during compilation, as one of the library .h files was being
scanned, it contained #include <new>, which picked up the junk file in my
working directory instead of using the C library.

------
etaioinshrdlu
I found the most interesting one to be compile time trees.

Does anyone have any good use cases for it?

~~~
kccqzy
I don't like to think of it as compile-time trees. It basically only allows
you to construct complicated structures but doesn't allow you to examine them
at compile time (that would require constexpr functions in C++; can't be done
in C). It's honestly not a very impressive feature.

------
higherkinded
Effectful sizeof, what a delightful _feature_!

Though the compile-time magic with structs and functional macros is so
tempting that I feel like it's high time to do some C.

------
deckar01
> VLA typedef ... I have no clue how this could ever be useful.

I used this feature recently. I had several arrays of the same size and type,
and the size was determined at runtime. The VLA typedef let me avoid duplicate
type signatures which I find more readable.

    
    
        int N = atoi(argv[1]);
        typedef int grid[N][N];
        grid board;
        grid best;
        grid cache;
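
A compilable sketch along the same lines, with hypothetical function names. It assumes VLA support and a small positive `n`, since the arrays live on the stack.

```c
#include <stddef.h>
#include <string.h>

/* The size of the typedef is fixed at the point the typedef is reached. */
size_t grid_bytes(int n)
{
    typedef int grid[n][n];
    return sizeof(grid);
}

/* Two arrays share one type signature; sizeof(grid) covers the whole array. */
int copy_grid_demo(int n)
{
    typedef int grid[n][n];
    grid board, best;

    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            board[i][j] = i * n + j;

    memcpy(best, board, sizeof(grid));
    return best[n - 1][n - 1];
}
```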

------
JeromeLon
No mention of the downto (-->) operator?

    
    
      int x = 10;
      while (x --> 0) {
        printf("%d ", x);
      }
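
The loop condition is tokenized as `x--` followed by `> 0`. A sketch that records the values instead of printing them (`count_down` is a made-up helper):

```c
/* Stores start-1, start-2, ..., 0 in out and returns how many were written. */
int count_down(int start, int *out)
{
    int written = 0;
    int x = start;
    while (x --> 0)          /* parsed as (x--) > 0 */
        out[written++] = x;
    return written;
}
```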

~~~
inquist
That's just the decrement operator and inequality operator!

~~~
cvs268
not inequality, rather the "greater-than" operator.

"!=" would be the inequality operator... :-)

------
haolez
> a[b] is literally equivalent to *(a + b). You can thus write some absolute
> madness such as 41[yourarray + 1].

Wow. This has to be the best C obscurity that I've ever seen.

~~~
multun
Come on, that might not be obscure for a C ninja master like you, but I'm
pretty sure many people had no idea ;-)

Notice how the list is sorted from actually obscure to less interesting.

I even took the time to write a pseudo disclaimer above this one :D

~~~
haolez
I wasn't being ironic :)

------
inlined
The WTF part seems useful. E.g.

type quaternion float[4];

quaternion SLERP = {1.0, 0, 0, 0};

~~~
kccqzy
Did you mean

    
    
        typedef float quaternion[4];
    

?

That's not a VLA.

------
megiddo
Hell, C11 has ghetto generics.

------
radamadah
Are we calling these features, now?

~~~
aspaceman
Most of the examples listed (hex floats especially) are very much features.

------
EGreg
Here is an obscure C feature:

    
    
      int main()
      {
         int a = 8;
         {
           int a = 4;
           /* a is only scoped to this block */
         }
         printf("%d", a); /* prints 8 */
      }
    

It is also why C++ is not a strict superset of C

~~~
jcranmer
If you want to tell if your compiler is C or C++, run this program:

    
    
       #include <stdio.h>
       int main() {
         printf("%zu\n", sizeof('a'));
       }
    

It's the second thing C++ mentions in its list of incompatibility with C (the
first is "new keywords").
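
The same check can be written as a function. This sketch (hypothetical names) only behaves as described when compiled as C; it also includes a repeated tentative definition, which a C++ compiler would reject.

```c
/* In C a character constant has type int; in C++ it has type char. */
int char_constant_is_int(void)
{
    return sizeof('a') == sizeof(int);
}

int tentative_counter;   /* tentative definition */
int tentative_counter;   /* repeated: fine in C, a redefinition error in C++ */
```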

A more obscure difference is this program:

    
    
       int i;
       int i;
       int main() {
         return i;
       }
    

It's legal C but not legal C++.

~~~
ksherlock

        int main() {
          return 4//**/2
          ;
        }

~~~
eMSF
Line comments have been part of the C language for a long, long time (added in
C99); so much so that, especially when discussing the subject on the internet,
more and more often they predate some of the younger participants.
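
Wrapped in a function (the name is made up), the snippet above can be used as a probe. In C99 and later, `//` starts a line comment and the function returns 4; a strict C89 compiler would instead see `4 / 2` with an empty block comment in between.

```c
int comment_probe(void)
{
    return 4//**/2
    ;
}
```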

~~~
saagarjha
Personally, I didn't know that C lacked line comments until I ran into a
project that compiled with -std=c89 -pedantic.

~~~
_kst_
C doesn't lack line comments. The obsolete 1989/1990/1995 version(s) of C did
lack line comments.

