
Some obscure C features - mort96
https://mort.coffee/home/obscure-c-features/
======
mrgriffin
The author tried using __COUNTER__ and a static array of function pointers,
and narrowly missed a workable idea—a static linked list! Forward declarations
pun definitions, and can be used roughly as follows:

    
    
      struct Element {
        int value;
        struct Element *next;
      };
    
      #define ELEMENT(value) _ELEMENT(value, __COUNTER__)
      #define _ELEMENT(value, n) \
        struct Element CAT(e, SUCC(n)); \
        struct Element CAT(e, n) = { (value), &CAT(e, SUCC(n)) }
    

You can then walk from e0 through the linked list, but because of how the
initialization works, you're done when the current element has `next == 0`.

    
    
      int main(void)
      {
        struct Element *e = &e0;
        while (e->next) {
          printf("%d\n", e->value);
          e = e->next;
        }
      }
    

Here's a repl.it:
[https://repl.it/repls/DetailedYoungAsianconstablebutterfly](https://repl.it/repls/DetailedYoungAsianconstablebutterfly)
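
For anyone who can't open the link, here is a minimal self-contained sketch of the same idea. `CAT` and `SUCC` are not defined in the snippet above, so the definitions below (and the small `SUCC_n` table) are assumptions, as is the helper `sum_elements`. It relies on `__COUNTER__`, a compiler extension, starting at 0 in this translation unit.

```c
#include <stdio.h>

struct Element {
    int value;
    struct Element *next;
};

/* Two-level paste so the arguments are macro-expanded before ## is applied. */
#define CAT(a, b) CAT_(a, b)
#define CAT_(a, b) a##b

/* Hypothetical successor table; extend it to cover as many elements as needed. */
#define SUCC(n) CAT(SUCC_, n)
#define SUCC_0 1
#define SUCC_1 2
#define SUCC_2 3
#define SUCC_3 4

/* __COUNTER__ is expanded once and passed down as n. */
#define ELEMENT(value) ELEMENT_(value, __COUNTER__)
#define ELEMENT_(value, n) \
    struct Element CAT(e, SUCC(n)); \
    struct Element CAT(e, n) = { (value), &CAT(e, SUCC(n)) }

/* Assumes nothing earlier in this file has used __COUNTER__, so these are e0..e2. */
ELEMENT(10);
ELEMENT(20);
ELEMENT(30);

/* e3 is only ever tentatively defined, so it is zero-initialized and
   serves as the end-of-list sentinel. */
static int sum_elements(void)
{
    int sum = 0;
    for (struct Element *e = &e0; e->next; e = e->next)
        sum += e->value;
    return sum;
}
```

Each `ELEMENT` tentatively defines its successor and then fully defines itself, which is why the last element is left with `next == 0`.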

~~~
sounds
The Linux Kernel also used another trick - placing data in a special named
"section."

The linker's job is to combine all the separate pieces in a "section" (special
linker term) and write out one section.

Using a special named section lets you create an array using macros, though
using it for function pointers is probably safe and using it for other data
types might be pretty fragile.
[http://www.compsoc.man.ac.uk/~moz/kernelnewbies/documents/in...](http://www.compsoc.man.ac.uk/~moz/kernelnewbies/documents/initcall/kernel.html)
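
A rough sketch of the section approach with function pointers, assuming GCC or Clang on an ELF platform linked with GNU ld or lld. The section name `initcalls` and the names `REGISTER_INIT` and `run_initcalls` are made up for illustration; this is not the kernel's actual mechanism, which uses its own linker script. The `__start_`/`__stop_` symbols are generated by the linker for sections whose names are valid C identifiers.

```c
#include <stddef.h>

typedef int (*init_fn)(void);

/* Place a pointer to fn in the "initcalls" section (GCC/Clang attributes).
   "used" keeps the otherwise unreferenced variable from being discarded. */
#define REGISTER_INIT(fn) \
    static init_fn init_ptr_##fn \
        __attribute__((used, section("initcalls"))) = fn

/* Provided by the linker, not defined anywhere in C. */
extern init_fn __start_initcalls[];
extern init_fn __stop_initcalls[];

static int init_a(void) { return 1; }
static int init_b(void) { return 2; }

REGISTER_INIT(init_a);
REGISTER_INIT(init_b);

/* Call every registered function and add up the results. */
static int run_initcalls(void)
{
    int total = 0;
    for (init_fn *p = __start_initcalls; p < __stop_initcalls; p++)
        total += (*p)();
    return total;
}
```

Storing plain pointers keeps every entry the same size and alignment, which is part of why this is less fragile than placing arbitrary structs in the section.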

~~~
AceJohnny2
I was impressed when I discovered our codebase was using this trick to
implement new CLI commands. Each C file would implement the functions and
helptext etc for implementing a command, but I couldn't find where each
command was being registered in the central parser. Because they _had_ to be
registered in some central list, right?...

Turned out the macros I overlooked at the end of each file placed the struct
defining the command in a special linker section, and the parser init would go
and load all the commands from that section.

Interesting to know that this trick already existed in the Linux kernel.

I'm (perhaps perversely) disappointed we aren't using more of the linker's
power in many C codebases.

~~~
BeeOnRope
Yeah, the linker offers a lot of magic waiting to be discovered.

A primary problem with linker magic is that it is usually very non-portable.
Hacks you do with macros may be ugly but at least they work everywhere, unless
you used a non-standard behavior.

Linker tricks tend to be platform specific, and even more so than other tricks
(eg POSIX platforms share a lot of common behavior, but still tend to have
quite different linkers).

~~~
openasocket
I had a minor epiphany a few months ago where I realized the linker can be
seen as sort of doing dependency injection for a bunch of object files. In
theory, you could have a program that depends on some set of symbols, two
object files that each include the symbols, and a linker script that at load
time checks some configuration parameter before determining which subsystem to
link with. You could even get really fancy and dynamically generate another
object file with those symbols that acts as a proxy. And from there you could
basically implement Spring for C, which obviously is what everyone wants to do
;)

~~~
Someone
In theory? LD_PRELOAD is specifically designed for it. For examples, see
[https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker...](https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/)

------
WalterBright
My not-so-humble-opinion is that if you find yourself doing metaprogramming
with the C preprocessor, it's time to upgrade to a more powerful language.

~~~
vardump
Sometimes you just don't have other options than C.

Like kernel drivers, embedded (microcontroller) development.

Or perhaps you just need to develop a small loadable library without _any_
runtime (no malloc, etc.). Or perhaps writing Python, Lua etc. C-module for
better performance.

You just need to treat C with proper respect, not be too arrogant about your
skills.

I hope over time we can move more and more to Rust or some other language
which gives as many errors at compile time as possible and helps with not
shooting at our feet.

Shouldn't have too many C-macros, of course. Some macro messes I've seen can
be nearly impossible to debug...

~~~
pcwalton
> You just need to treat C with proper respect, not be too arrogant about your
> skills.

We've been hearing "C isn't the problem, bad programmers are the problem" for
40 years. This has continued to perpetuate the problems with the language.

I mean, I certainly agree that there are no alternatives to C _now_ for
certain domains, but there are real problems with it that aren't solved by
just "treating it with proper respect".

~~~
vardump
Having programmed C for soon 30 years, I absolutely agree with you that C _is_
the problem.

But often you just need to get the job done, and have limited options
available.

Perhaps in 50-100 years, people looking back at software from this era won't
be able to understand how we managed to make it work at all.

I haven't met or know anyone mastering C/C++ ( _especially_ C++), and doubt
I'll ever hear about one.

~~~
jstimpfle
> I haven't met or know anyone mastering C/C++ (especially C++), and doubt
> I'll ever hear about one.

C can definitively be mastered. C at its core is pragmatic minimalism. The
basic ideas are very simple. If you want to be a language lawyer and know the
standard completely (including all historical accidents), it can get complex,
but still manageable. But anyways, you shouldn't be a language lawyer. That's
not C. Make maintainable programs instead, stay in the middle of the road.

C gets out of the programmer's way. It's not about mastering _C_ , but about
mastering _programming_. And while I agree there's much bad C software out
there: I haven't seen one maintainable non-bloated Java project. Here is one
master of C that I'm sure you know: Linus Torvalds. Of course there are many,
many more. Just remember it's not about C, but programming machines in
general.

~~~
vardump
> C can definitively be mastered.

If that's true, how come there are almost no widely used network facing C
programs that didn't have plenty of security vulnerabilities?

(Anything written by Daniel J. Bernstein doesn't count, because there's no way
he's a mere mortal. :))

~~~
hedora
OpenBSD has done a great job, especially if you look at what they achieved,
how fast they got there, and what resources were at their disposal.

------
cperciva
Only the first two of these are available in C. Computed gotos and local
labels are gcc extensions; and the dynamic linker is POSIX but not C.

(Incidentally, the dynamic linker hack is precisely how both FreeBSD and Linux
perform all their kernel initialization: Individual modules define symbols
with a particular naming pattern, and then the kernel linker enumerates those
symbols.)

~~~
loeg
I think you're missing the point of the article. The author isn't limited to
standard C, and neither is anyone with a unix and GCC.

Are you just nitpicking the author's description of extensions as "C?"

~~~
kbenson
It was a clarifying comment, didn't denigrate anyone, and added a useful and
interesting tidbit of information. I see nothing wrong with this.

Perhaps the comment author should receive the benefit of the doubt as to his
intentions?

~~~
loeg
Sorry, you're just incorrect about both the content of Colin's comment and
authorial intent. Colin clarifies he is expressing an objection, not a
clarification, and that it is in fact the nitpick I guessed it was:
[https://news.ycombinator.com/item?id=16236015](https://news.ycombinator.com/item?id=16236015)

Yes, it doesn't denigrate anyone, but that's a really low bar.

I do give authors the benefit of the doubt when something isn't clear. That's
why I couched my message in "I think" and asked a question instead of just
asserting facts.

~~~
kbenson
> Sorry, you're just incorrect about both the content of Colin's comment

You seem to have misinterpreted me. I wasn't saying Colin was attempting to
clarify (I didn't know his intent), I was saying that his comment was
_factually clarifying_. I didn't have any evidence as to his intent at the
time, but I didn't see a reason to assume anything other than he was being
helpful.

> Yes, it doesn't denigrate anyone, but that's a really low bar.

That's not even half of the criteria I listed, so it's not really the bar I
set at all, is it?

The comment contributed by clarifying the submission name for those that had
not yet read the article. It _additionally_ contributed by adding some factual
information about some of the referenced features and how they are used in the
kernel.

I don't think anything he said takes away from the article, so I don't think
he's necessarily missing the point.

> I do give authors the benefit of the doubt when something isn't clear.
> That's why I couched my message in "I think" and asked a question instead of
> just asserting facts.

I think you also misunderstood what I was trying to accomplish, how much of my
statement was meant to be a condemnation and correction, and how much was
leveled at you instead of the general readership. The only existing reply to
you at the time I posted was referencing people on HN that like to nitpick. It
was meant as a "this isn't necessarily negative, so let's just take it for
what it provides, and it provides some usefulness" and not as "shame on you
for assuming the worst".

Edit: Blah, there was some weird wording in that last paragraph

------
epilk
I also encountered the last problem, it can be handled pretty easily with
__attribute__((constructor)):

    
    
      void (*_test_functions[NTEST_FUNCTIONS])(void);
      int _test_functions_idx;
    
      #define TEST_FUNCTION(fname) \
        void fname(void); \
        __attribute__((constructor)) static void _init_ ## fname (void) { \
          _test_functions[_test_functions_idx++] = fname; \
        } \
        void fname()
    
      TEST_FUNCTION(test_foo_bar) {
        assert(foo_bar == 0);
      }

~~~
apaprocki
A quite fun, related feature of the AIX linker, is that it will automatically
make any function name prefixed with "__sinit_" statically invoked the same as
this attribute does. So you name the function a certain way to take care of
AIX/xlc, add pragmas to take care of other UNIX vendor compilers, add the
attribute for any GCC-based (or compatible) compiler, and they are actually
pretty cross-platform! Not... that... I would use these all the time or
anything like that :)

~~~
pjmlp
The fun was even better in the old days, where each UNIX had a different
concept of what a shared object should be.

AIX used to implement shared objects just like on Windows, using an export
definitions file and import libraries, for example.

~~~
apaprocki
Still does to this day. And it has a machine-wide cache of libraries, so
simply updating the library and restarting the processes isn't enough to get
the new version -- you have to be aware it could be in the cache and might
need to be manually flushed by someone with elevated access. _shudder_

~~~
pjmlp
Thanks for the clarification.

Last time I used it was about 2003.

------
webkike
I love taking the address of labels. Back in highschool I wrote a couple of
articles on using them to create JIT compilers:

[http://maplant.com/jit_bf.html](http://maplant.com/jit_bf.html)

[http://dginasa.blogspot.com/2012/05/jit-compilation-without-...](http://dginasa.blogspot.com/2012/05/jit-compilation-without-much-assembler.html?m=1)

~~~
coliveira
This is similar to the technique used to implement gforth. The interpreter is
all contained in one function, and the instructions are accessed using local
labels and goto.
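
A toy sketch of that dispatch style, assuming GCC or Clang (labels as values are an extension, not standard C). The three-instruction bytecode and the names `run` and `demo_program` are invented for illustration and have nothing to do with gforth's real instruction set.

```c
/* Hypothetical bytecode: one accumulator, three instructions. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

static const unsigned char demo_program[] = {
    OP_INC, OP_INC, OP_DOUBLE, OP_INC, OP_HALT
};

static int run(const unsigned char *code)
{
    /* GCC/Clang extension: unary && takes the address of a label. */
    static void *const dispatch[] = { &&op_inc, &&op_double, &&op_halt };
    int acc = 0;

/* Fetch the next opcode and jump straight to its handler. */
#define NEXT() goto *dispatch[*code++]
    NEXT();

op_inc:
    acc += 1;
    NEXT();
op_double:
    acc *= 2;
    NEXT();
op_halt:
    return acc;
#undef NEXT
}
```

Every handler ends with its own indirect jump instead of returning to a central `switch`, which is the point of the technique.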

~~~
loeg
I believe CPython does something similar.

------
dfox
Instead of the tricks with __COUNTER__ it is better to define the macro such
that it also defines initializer function which registers whatever you want to
register dynamically on program startup. The way of doing that is somewhat
non-portable but essentially every interesting compiler-platform combination
has straightforward way to do that as it is needed for both C++ (to call
constructors of global variables) and ObjectiveC (as part of the code that
@implementation expands into) implementations.

Edit: also, syntax-wise it seems to me that it is better to define the macro
such that it can be used as

    
    
        define_foo(blabla...){ ... code ...}
    

In this case this would involve code like

    
    
        #define define_foo(name, ...) \
        static void foo_ ## name(void); \
        ... initializer function ... \
        static void foo_ ## name(void)
    

which also solves the first problem mentioned in the article.
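
One possible way to fill in the initializer part, assuming a compiler that supports `__attribute__((constructor))` (GCC, Clang). The registry layout and the names `foo_entry`, `foo_list`, `run_all_foo` and `calls` are all hypothetical, chosen only to make the sketch complete.

```c
#include <stddef.h>

struct foo_entry {
    const char *name;
    void (*fn)(void);
    struct foo_entry *next;
};

static struct foo_entry *foo_list;  /* head of the registry, filled before main */

/* Declares the function, registers it from a constructor, then leaves the
   function header open so the caller supplies the body in braces. */
#define define_foo(name_) \
    static void foo_##name_(void); \
    static struct foo_entry foo_entry_##name_ = { #name_, foo_##name_, NULL }; \
    __attribute__((constructor)) static void foo_register_##name_(void) { \
        foo_entry_##name_.next = foo_list; \
        foo_list = &foo_entry_##name_; \
    } \
    static void foo_##name_(void)

static int calls;

define_foo(first)  { calls += 1; }
define_foo(second) { calls += 10; }

/* Run everything that registered itself; returns how many entries ran. */
static int run_all_foo(void)
{
    int n = 0;
    for (struct foo_entry *e = foo_list; e; e = e->next) {
        e->fn();
        n++;
    }
    return n;
}
```

A linked list avoids having to size an array up front, at the cost of not guaranteeing any particular registration order.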

~~~
mort96
I agree with your point about syntax; it would have been nice if the syntax
was `define(blah) { ... }` and `it("whatever") { ... }`. However, I don't
think that's possible, because I need to include code after the end of the
block.

That might not be obvious from the simplified macros I used in my example, but
it's pretty clear by looking at the actual definitions. The describe and
subdesc-macros
([https://github.com/mortie/snow/blob/a9ad850df456f78bcf96e1aa...](https://github.com/mortie/snow/blob/a9ad850df456f78bcf96e1aaeb5f42081f243f63/snow/snow.h#L377))
need to increment counters and print their status after the block, and the
`it` macro
([https://github.com/mortie/snow/blob/a9ad850df456f78bcf96e1aa...](https://github.com/mortie/snow/blob/a9ad850df456f78bcf96e1aaeb5f42081f243f63/snow/snow.h#L336))
needs to run deferred expressions.

I think it would be possible to implement the `describe` macro such that it
can be used as `describe(blah) { ... }`, because that defines a function and
we can both give the function argument and expect return values from it, but I
can't think of any way to do it with the other macros which just create
regular `do { ... } while(0)` blocks.

If I'm wrong, please show me how; the `foo(blah) { ... }` syntax would make
the __LINE__ macro work, it would play better with auto indenters and syntax
highlighters, and it would give prettier error messages if you have a syntax
error in a test case. I just can't see any way it would be possible.

~~~
fanf2

        #include <stdio.h>
        
        #define before_after(before, after)					\
        	for(int before_after_##__LINE__ = 0;				\
        	    before_after_##__LINE__ < 3;				\
        	    before_after_##__LINE__++)					\
        		if(before_after_##__LINE__ == 0) {			\
        			printf("%s\n", before);				\
        		} else if(before_after_##__LINE__ == 2) {		\
        			printf("%s\n", after);				\
        		} else
        
        int
        main(void) {
        	before_after("hello", "world") {
        		printf("to all the\n");
        	}
        	return(0);
        }

~~~
kreco
You should combine "__COUNTER__" with "before_after_##__LINE__" to make a
unique identifier more robust.

Your snippet won't work with this inline example:

    
    
      before_after("hello", "world") { before_after("hello", "world") { }}
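
A sketch of what a counter-based variant might look like, assuming a compiler that provides `__COUNTER__`. Operands of `##` are not macro-expanded, so the name has to be built through a helper macro, and the counter has to be expanded once and passed down so every use in the loop refers to the same variable. `BA_CAT`, `BEFORE_AFTER_`, `emit`, `log_buf` and `nested_demo` are names made up here, and `emit` stands in for the `printf` calls so the order of events is easy to inspect.

```c
#include <string.h>

#define BA_CAT_(a, b) a##b
#define BA_CAT(a, b) BA_CAT_(a, b)

static char log_buf[128];

/* Hypothetical stand-in for printf: append s and a space to log_buf. */
static void emit(const char *s)
{
    strcat(log_buf, s);
    strcat(log_buf, " ");
}

/* __COUNTER__ is expanded once here; the resulting identifier is passed as id. */
#define before_after(before, after) \
    BEFORE_AFTER_(before, after, BA_CAT(before_after_, __COUNTER__))

#define BEFORE_AFTER_(before, after, id) \
    for (int id = 0; id < 3; id++) \
        if (id == 0) { \
            emit(before); \
        } else if (id == 2) { \
            emit(after); \
        } else

/* Two invocations on the same source line, one nested inside the other. */
static const char *nested_demo(void)
{
    log_buf[0] = '\0';
    before_after("a", "d") { before_after("b", "c") { emit("x"); } }
    return log_buf;
}
```

Each invocation gets its own loop variable, so nesting on one line no longer depends on the inner variable shadowing the outer one.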

------
loeg
@mort96, for the latter example, the BSDs provide this in the sys/linker_set.h
header, with macros like SET_BEGIN, SET_FOREACH, etc. Unfortunately I don't
see the same functionality in Linux headers, but there's no reason the same
technique cannot be used there.

------
ggm
Enormously powerful, and powerfully unreadable for mere mortals. You can take
the assembler out of the loop but you can't take the failure to comprehend out
of the programmer.

I don't do C for a living any more (barely did) but the number of times I
asked smarter people "why did you use this idiom with this terrifically
confusing side-effect" and they said "that's not a side effect" comes to
mind...

------
faragon
More examples (C89/90 preprocessor compatible):

Function building using templates:

    
    
         https://github.com/faragon/libsrt/blob/master/src/saux/ssort.c
    

Passing macros to macros (meta-templates):

    
    
         https://github.com/faragon/libsrt/blob/master/src/saux/ssearch.c
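
For readers who don't want to click through, the general shape of passing a macro to a macro can be sketched as below. This is a generic illustration and is not taken from libsrt; `COLOR_LIST`, `AS_ENUM`, `AS_STRING` and `color_name` are invented names.

```c
/* Hypothetical list macro: applies whatever macro X it is given to each entry. */
#define COLOR_LIST(X) \
    X(RED) \
    X(GREEN) \
    X(BLUE)

#define AS_ENUM(name) COLOR_##name,
#define AS_STRING(name) #name,

/* Expands to: COLOR_RED, COLOR_GREEN, COLOR_BLUE, COLOR_COUNT */
enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };

/* Expands to: "RED", "GREEN", "BLUE", */
static const char *const color_names[] = { COLOR_LIST(AS_STRING) };

static const char *color_name(enum color c)
{
    return (unsigned)c < COLOR_COUNT ? color_names[c] : "?";
}
```

The enum and the string table are generated from one list, so they cannot drift out of sync. This works with a C89 preprocessor.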

------
0xcde4c3db
My favorite (somewhat) obscure C feature:

    
    
        puts("Line one\nLine two\nLine three");
    

and

    
    
        puts("Line one\n"
             "Line two\n"
             "Line three");
    

mean the exact same thing. This can be a godsend for making complex debug/log
output more presentable in the source.

------
Animats
Or, "Things you can do with macros in C, but probably shouldn't."

~~~
cryptonector
Why not? I'm serious. Please don't say "don't use C". We all agree on _that_.
I'm asking why one should not use clever macro tricks -- I know many codebases
that do, and I'd not shy away from it when I have to use C anyways.

~~~
kbwt
> Please don't say "don't use C". We all agree on _that_.

I wouldn't be so sure John Nagle agrees with such a blanket statement.

------
d33
> I don't know how that works, but the important part is that it does.

Coding in C is already super risky, as proven by the huge number of projects
that have exploitable memory errors in there. Adding complexity using features
one doesn't really understand is at best going to complicate debugging and
make things work magically and, in the worst case, lead to the risk that
someone is actually going to understand how your code works and exploit it.

Not worth it.

~~~
mort96
I might not have expressed myself clearly enough. It's not risky, it's not a
feature I don't understand, and I'm using it exactly as it's
described in GNU's documentation on labels as values and computed gotos:
[https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html)

The "I don't know how that works, but the important part is that it does."
sentence was meant to highlight the apparent absurdity of dereferencing a
void*, not to say that I don't understand exactly how to use the feature.

------
dmm
Google Cache:
[https://webcache.googleusercontent.com/search?q=cache:XqXHF0...](https://webcache.googleusercontent.com/search?q=cache:XqXHF0raAVEJ:https://mort.coffee/home/obscure-c-features/+&cd=1&hl=en&ct=clnk&gl=us)

------
jstimpfle
Sending blocks as arguments to macros: Just wrap them in another set of
parentheses. For parentheses-balanced input (such as valid C blocks) this
should always work (I think).

    
    
        $ echo '#define x(a,b,c) a b
        x(1, (2, 3), 5)' | cpp

~~~
mort96
That isn't really a viable solution, because C doesn't allow parentheses
around blocks.

Calling `describe(foo, ({ int a, b; }))` would make the preprocessor happy,
but it would also produce invalid syntax:

    
    
        void test_foo() ({ int a, b; })

~~~
cryptonector
Tsk tsk, if you're going to know about label values and other GCC C
extensions, you might as well know about statement expressions (another GCC C
extension). `({ <statements> })` is an _expression_.

Naturally, Clang also supports statement expressions...

EDIT: I highly recommend spending some time reading about all the various GCC
C extensions, and what compilers support them (typically Clang, but also Sun
Studio, if that's still around..., and maybe others). There are quite a few
very useful ones.
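
As a small taste, here is a sketch of a statement expression used to write a `max` that evaluates each argument only once. It assumes GCC or Clang; `MAX_ONCE` and `max_once_demo` are made-up names, and `__typeof__` is another extension.

```c
/* GCC/Clang extension: the value of ({ ... }) is that of its last expression. */
#define MAX_ONCE(a, b) ({ \
    __typeof__(a) a_ = (a); \
    __typeof__(b) b_ = (b); \
    a_ > b_ ? a_ : b_; \
})

static int max_once_demo(void)
{
    int i = 1, j = 5;
    int m = MAX_ONCE(i++, j++);   /* each argument is evaluated exactly once */
    return m * 100 + i * 10 + j;  /* m == 5, i == 2, j == 6 */
}
```

With the classic `((a) > (b) ? (a) : (b))` macro, the larger argument would be incremented twice.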

------
varjag
Interesting as highlights of what C preprocessor and compiler extensions can
do. But I'd hate seeing it in actual production code.

~~~
WalterBright
I used to be enamored with what I could do with the preprocessor back in the
80's. I've since reverted all of that (in C code I still use) back to mundane
uses of the preprocessor.

Eventually one tires of no support in the symbolic debugger, program analysis
tools, compiler error messages that are based on the post-expansion code,
nobody else understands the code, hygiene problems, etc.

~~~
varjag
Exactly my sentiment, especially the collaboration part. If you take it too
far, you effectively invent a new language. Forcing it upon others is more
about hubris than skills.

~~~
MaxBarraclough
> you effectively invent a new language

Perfect. Can't tell if that's deliberate.

------
rxaxm
In college my buddy made a defer function in C that is a little simpler (but
only works with clang)

[https://github.com/asubiotto/cdefer/blob/master/defer.h](https://github.com/asubiotto/cdefer/blob/master/defer.h)

~~~
cryptonector
It's not just that it only works with Clang, but that it only works with
_Blocks_ (Apple's extension to C which basically adds closures of indefinite
extent, which is pretty cool).

------
ziotom78
Really interesting read. However, I fear that if other C library maintainers
start using these kinds of techniques, creating bindings for other
languages will become increasingly difficult.

