C99 tricks (noctua-software.com)
158 points by guillaumec on Feb 13, 2015 | 96 comments

There's nothing "C99" about that ARRAY_SIZE macro, of course.

That said, I'm rather sceptical towards it, I think it's better written as-is.

Otherwise, the question "does SIZE mean bytes, or number of elements?" always arises to haunt me. Once you need to keep looking stuff like that up, the macro isn't helping.

In actual code, the name of the array never needs to be enclosed in parentheses so it's often clearer too. E.g. something like:

    int fourk[4096];

    for(size_t i = 0; i < sizeof fourk / sizeof *fourk; ++i)
I realize that it's not 100% DRY to repeat the array name, but I think it's a small price to pay for not having to use a non-standard macro that requires learning. It's all about friction.

I use the convention "size = number of bytes" (as in sizeof), "count = number of elements" and so my sizeof(x)/sizeof(x[0]) macro is actually called COUNT.

In the BSDs (and some other code bases) the macro is spelled nitems and is common enough to be considered idiomatic.

Thank you. I did not know about this, I've been using the long version. nitems is defined in sys/param.h in OpenBSD and FreeBSD in case anyone is interested.

I always call my version lengthof() to match sizeof.

I use lengthof for lengths of string literals.

I always call mine "countof" to match the "sizeof" naming convention.

https://msdn.microsoft.com/en-us/library/ms175773.aspx - although I'm not sure it's a macro. The nice thing is it's detecting that you actually give it an array.

I use that macro frequently but warily. The fact that it only works with static arrays and (AFAIK) the compiler won't warn you if you accidentally use it on dynamic arrays/pointers/array parameters is a big risk.

Come to think of it, I'm afraid I'm leaving a time bomb for the next inexperienced maintainer who doesn't know the limitations.

See the one I wrote for the Linux kernel; it will error at build time on a non-array (using typeof).

Even better, grab it from CCAN: http://ccodearchive.net/info/array_size.html

Cheers, Rusty.


I see you use __builtin_types_compatible_p, which looks very useful but also appears to be GCC-specific.
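For readers who want to see the idea, here is a minimal sketch of an element-count macro that refuses to compile when handed a pointer. It is not the kernel's or CCAN's actual definition; the names BUILD_ASSERT_OR_ZERO, MUST_BE_ARRAY and ARRAY_COUNT are made up for illustration, and it assumes a compiler with the GNU extensions __typeof__ and __builtin_types_compatible_p (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Evaluates to 0 when cond is true; otherwise declares char[-1],
 * which is a compile-time error. (Illustrative name.) */
#define BUILD_ASSERT_OR_ZERO(cond) (sizeof(char[1 - 2 * !(cond)]) - 1)

/* GNU C only: an array's type differs from the type of a pointer to its
 * first element, while a pointer's type does not. (Illustrative name.) */
#define MUST_BE_ARRAY(a) \
    BUILD_ASSERT_OR_ZERO(!__builtin_types_compatible_p(__typeof__(a), \
                                                       __typeof__(&(a)[0])))

/* Element count, plus zero from the array check. (Illustrative name.) */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]) + MUST_BE_ARRAY(a))

static size_t count_of_local_table(void)
{
    int table[16] = {0};
    size_t n = ARRAY_COUNT(table);

    /* The check adds nothing to the plain sizeof division. */
    assert(n == sizeof table / sizeof table[0]);
    return n;
}
```

Passing an `int *` to ARRAY_COUNT in this sketch should stop the build with a negative-array-size error instead of silently producing a wrong count.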

The PHP language uses count() for array lengths. Maybe copying that and using ARRAY_COUNT() or such might be less ambiguous?

Read the first comment in the comments section and get your mind blown away... What can export restrictions do!

Ah, javascript and bad design breaking the web again. Disable JS and it'll be fine.

(I had to enable JS to see the comments, and that caused me to see the very same effect reported in the first comment)

> due to US sanctions

If that's true, blaming fabric of the web for this is ridiculous. Blame idiots who went ahead with 'sanctions' on information.

These are GNU C tricks! Fine anyway, I think I especially like those that are ISO C. Trick number one is a C11 feature -- anonymous struct members in structs and unions, though I will not vouch for its validity without consulting with my C11 standard handbook.

Oh, and by the way, why not use X macros in the form where the macro to apply is passed as the argument? #define SPRITES(S) S(1, 2, 3) S(2, 3, 4) etc.

About the X: Yeah, that is stupid actually... I will update the article to change it to X.

That's not what I meant! :-)
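To make the suggestion concrete, here is a small sketch of the X-macro form where the macro to apply is passed in as an argument. The sprite names and sizes, and the helper names AS_ENUM, AS_NAME, AS_WIDTH, sprite_name and sprite_width, are invented for illustration and are not taken from the article.

```c
#include <assert.h>

/* Hypothetical sprite table: each entry is S(name, width, height). */
#define SPRITES(S)      \
    S(PLAYER, 16, 24)   \
    S(ENEMY,  16, 16)   \
    S(BULLET,  4,  4)

/* Different "views" of the same list, chosen by the macro passed in. */
#define AS_ENUM(name, w, h)  SPRITE_##name,
#define AS_NAME(name, w, h)  [SPRITE_##name] = #name,
#define AS_WIDTH(name, w, h) [SPRITE_##name] = (w),

enum sprite_id { SPRITES(AS_ENUM) SPRITE_COUNT };

static const char *const sprite_names[SPRITE_COUNT] = { SPRITES(AS_NAME) };
static const int sprite_widths[SPRITE_COUNT] = { SPRITES(AS_WIDTH) };

static const char *sprite_name(enum sprite_id id)
{
    assert((unsigned)id < SPRITE_COUNT);
    return sprite_names[id];
}

static int sprite_width(enum sprite_id id)
{
    assert((unsigned)id < SPRITE_COUNT);
    return sprite_widths[id];
}
```

Compared with the form that defines and undefines a global X, this variant needs no #undef between uses, since each expansion names its own macro.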

The first one (?:) is not standard but only a GNU extension, so be careful with that one.

Thanks for letting me know. I thought this was in the standard. I will fix the post.

As a note to anyone interested in what's in the standards: the C committee openly publishes its work in progress, including documents that are functionally identical to the official standards except for the wording at the front that says ‘this is an official standard’.

C99: http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf (post-publication ‘draft’ including technical corrigenda 1–3)

C11: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf (final draft as voted on)

Similarly for C++:

C++11: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n333...

C++14: http://open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pd...

One can try to catch use of gcc extensions with:

    gcc -std=c99 -pedantic

I usually don't use those flags because some extensions are well supported (as long as I stay away from MSVC), and so I feel free to use them if they help me. One example is the ##__VA_ARGS__ GNU extension that is not strictly C99, but not using it is just too painful for variadic macros.

C11 standardizes variadic macros — see §6.10.3 Macro replacement http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

Yes, even c99 has them IIRC.

The issue that was called out, however, was the use of "FOO(x, ## __VA_ARGS__)".

GNU C has an extension that causes the ',' to be removed if __VA_ARGS__ is empty. AFAIK, it is not standard.
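A minimal sketch of that comma-swallowing behaviour, assuming GCC or Clang in a GNU mode. LOG_TO and log_lines_match are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* GNU extension: "##" before __VA_ARGS__ removes the preceding comma
 * when the macro is invoked with no variadic arguments. */
#define LOG_TO(buf, fmt, ...) \
    snprintf((buf), sizeof(buf), "[log] " fmt, ##__VA_ARGS__)

static int log_lines_match(void)
{
    char plain[32];
    char with_arg[32];

    /* No variadic arguments: without "##" this would leave a stray comma. */
    int n = LOG_TO(plain, "started");
    assert(n > 0 && (size_t)n < sizeof plain);

    LOG_TO(with_arg, "value=%d", 42);

    return strcmp(plain, "[log] started") == 0 &&
           strcmp(with_arg, "[log] value=42") == 0;
}
```

Building this with -std=c99 -pedantic should produce a diagnostic on the macro, which is the portability concern raised above.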

It is also supported in clang.

Yep, it's very nice for a nil/default assignment in Objective-C, like ||= in Ruby.

Out of curiosity:

Who has written C99 in the last 24h within a larger project and, if so, what kind of project?

All the code I write is C99. What counts as a "large project" is open to interpretation though.

I sure have.

I work on embedded software for industrial assembly tools.

I wrote 14.62 hours worth of C89 yesterday, and 2.33 hours so far today. For PyParallel. (https://bitbucket.org/tpn/pyparallel)

Various bits of numerical analysis. The overall framework and plumbing is written in Python, as are all the algorithms initially. Then for those that are too slow, or need to be run lots of times, algorithms are reimplemented in C as Python extensions.

The existing Python implementation then helps with debugging and testing.

The Linux kernel uses various C language features, including both C99 (such as designated initializers) and GCC extensions.

I have. The project is an event based http server and client. Mostly I work with libevent, redis, and openssl. We don't build our software on Windows so we don't have to worry about the limitations on that platform (non-POSIX mostly).

All my code is C99. Computational chemistry program that ships on Linux, Mac, and Windows.

what do you compile with for windows?

You can use GCC and Clang directly on Windows, both implement 100% C99. Alternatively, Visual Studio 2013 implements most of the C99 standard (not a complete implementation though).

GCC: only as Mingw or MSYS? You can use intel too, I wanted to know what this person was using, not what options there were.

You don't need to install MinGW or MSYS in order to use GCC (I assume you are interested only in C99 here and not in POSIX programming). Here is an example of a standalone GCC (contains C, C++ and Fortran compilers):


I'm currently writing a game in C99.

Compilation times, no need for inheritance and the fact that most libraries I'm using are written in C were the reasons I chose C over C++ or something else.

I have, and most of the linux libraries that I work with are C.

Yes, I have and I found this was useful. Would like more sites like this or a central location for standard tricks which are tagged as being C99 or GNU specific.

I'm working on a music player client (Currently only terminal based, but that could change).

I write C code (not always C99) for automotive/racing.

I was working on an assembler/linker in C last night.

SIP phone software. Display and self-test code today.

I have also. DSP code in an embedded environment.

Me; a high-security network crypto device.

Everyday for a router operating system.

Waves hand. Multiple OS libraries.

Here's one.

An MPI library.

My favorite C99 trick:

    const char *lookup[] = {
      [0] = "ZERO",
      [1] = "ONE",
      [4] = "FOUR"
    };
    assert(strcmp(lookup[0], "ZERO") == 0);
(not available in C++ or pre-C99)

I see this used a lot with enums to make a kind of 'homogeneous struct' where all the members have the same type:

    enum fields { x, y, z };
    int point[] = {
      [x] = 1,
      [y] = 2,
      [z] = 3
    };
This is used pretty extensively in the QEMU codebase, which is where I first learned it.

How does this even work?!?!

x, y, z being enum values, are constants.

Can someone explain what the purpose of the safe min macro is?

What is the advantage of this:

  #define min(a, b) ({ \
      __typeof__ (a) _a = (a); \
      __typeof__ (b) _b = (b); \
      _a < _b ? _a : _b; \
  })
over the naive

  #define min(a, b) (((a) < (b)) ? (a) : (b))


With the naive macro, one of the two arguments (a or b) gets evaluated twice. E.g.:

  min(a++, b++);
a or b would really be incremented twice.

Nice thanks. Makes perfect sense and I can see that making for a hair pulling debug session if you weren't aware of it.

It's the kind of thing that's made me very familiar with the -E option for GCC, which makes it spit out the preprocessed code...

It evaluates its arguments once. That means min(expensive_calculation(x), expensive_calculation(y)) works how it reads.

A compiler might be able to avoid this if it can prove expensive_calculation is pure, but this is usually not done unless WPO is in play.

The linux kernel goes one step further and includes…

        (void) (&_a == &_b);      \
… which makes sure that the types are actually compatible, and not just having one converted to the other.

The purpose is to evaluate `a` and `b` only once; consider `min(i++, j)`.
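The difference can be counted directly. The sketch below assumes GNU C (statement expressions and __typeof__); NAIVE_MIN, SAFE_MIN, counted and evaluations are illustrative names, not the article's.

```c
#include <assert.h>

#define NAIVE_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* GNU C statement expression: each argument is evaluated exactly once. */
#define SAFE_MIN(a, b) ({       \
    __typeof__(a) _a = (a);     \
    __typeof__(b) _b = (b);     \
    _a < _b ? _a : _b;          \
})

static int eval_count;

/* Returns v unchanged, but records that it was called. */
static int counted(int v)
{
    eval_count++;
    return v;
}

/* Returns how many times counted() ran while computing min(1, 2). */
static int evaluations(int use_safe)
{
    int result;

    eval_count = 0;
    result = use_safe ? SAFE_MIN(counted(1), counted(2))
                      : NAIVE_MIN(counted(1), counted(2));
    assert(result == 1);   /* both macros agree on the value */
    return eval_count;
}
```

With the naive macro the winning argument runs once for the comparison and once more for the result, so three calls are counted instead of two.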

Many of those "tricks" have nothing to do with C99.

If someone uses this switch macro in a codebase I am working on, I'll most probably punch them in the face.

Wow! I like the GL macro. I think that would be very helpful to debug some problems that I have on OS X related to the core profile.

It's very useful; I've long done the same thing. Though - you do need to do something so you can examine the error value. Even though OpenGL's error enum is terribly vague, most functions can produce more than one type of error. It's nice to be able to see what the value was, even if only to verify your assumption that what's obviously the case is indeed actually happening. Store the value somewhere global so you can examine it in the debugger when the program's stopped, or (if you have such a thing) use some fancier assert macro that prints out the problem values.

Also look up GL_ARB_debug_output - https://www.opengl.org/registry/specs/ARB/debug_output.txt. I don't remember this ever telling me anything terribly useful for debugging, but I did get a couple of perf hints from it... anyway, it's vendor-specific, so on OS X, maybe it will help.

(BTW - if you ever end up using Direct3D on Windows, be sure to activate the debug runtime. Compare and contrast.)

I put a simplified version in the article to keep it simple, but the macro I actually use in my games is slightly more advanced, and will output more information about the error and its context.

Ah, right! - my first version was just the assert, and it was working fine until one day I ended up on a wild goose chase because I was certain it had to be GL_INVALID_OPERATION. But in fact... it was GL_INVALID_ENUM :) So I just thought I'd save somebody a couple of minutes.

A common pattern in embedded C is to use something like "goto fail" instead of assert, wrap your function calls in this sort of macro, and then do error handling in one place at the end of the function.
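A rough sketch of that pattern, with plain memory allocation standing in for real device or API calls. CHECK and make_pair are hypothetical names, and the macro assumes the enclosing function defines a `fail` label.

```c
#include <assert.h>
#include <stdlib.h>

/* Jump to the function's single cleanup label when a step fails. */
#define CHECK(expr) do { if (!(expr)) goto fail; } while (0)

/* Allocates two zeroed buffers of n elements each.
 * Returns 0 on success, -1 on failure; cleanup lives in one place. */
static int make_pair(size_t n, int **a_out, double **b_out)
{
    int *a = NULL;
    double *b = NULL;

    assert(a_out != NULL && b_out != NULL);

    CHECK(n > 0);
    a = calloc(n, sizeof *a);
    CHECK(a != NULL);
    b = calloc(n, sizeof *b);
    CHECK(b != NULL);

    *a_out = a;
    *b_out = b;
    return 0;

fail:
    free(b);   /* free(NULL) is a no-op, so partial failures are fine */
    free(a);
    return -1;
}
```

Because every resource pointer starts as NULL, the cleanup code at the label does not need to know which step failed.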

Apparently Apple is unaware of this technique...

I use that style too, but I've not found it useful for OpenGL. When you're developing your product, you want every OpenGL error to explode in your face, so you know it happened. In your final build, you probably won't ever check, because... well, what will you do if something happens? OpenGL errors are much more like ENOMEM than they are like ENOENT, and if you get one in production, you're basically stuffed. The best thing you can do is just carry on and hope that the driver ignores it correctly! (One advantage to working on code whose primary purpose is merely to display something on the screen: you can do stuff like that, and it's OK.)

There are exceptions to this general rule, and sometimes you do need to use glGetError in the normal run of things, and take action based on the result. But they are very much exceptions.

Okay, so as someone who is ramping up on C where would I go to learn all these common patterns? I could start going through github repos and start reading code but this seems very inefficient and I might pick up something that is actually a bad technique.

I like Zed Shaw's Debug Macros. Actually his whole book is great.


Another useful OpenGL macro, at least for me, is this one:

    #define GLSL(str) (char*)"#version 330\n" #str
It basically enables you to embed quick snippets of GLSL inside your C code, instead of using concatenated strings. Example:

    char* fragShader = GLSL(
      in vec3 uv;
      out vec4 color;
      uniform sampler2DArray tex;
      void main() {
        // Comments work too
        color = texture(tex, uv);
      }
    );
In Sublime Text you even get code completion and color highlighting, since GLSL and C look so alike.

I believe it might work for OpenCL Kernels too.

The only downside is that it doesn't add line breaks by itself (hence the "#version 330\n", which requires a line break), so GLSL Compilation errors aren't as useful.

(Source: https://open.gl/geometry )

If you use C++ you might be interested in glbinding (https://github.com/hpicgs/glbinding). It allows you to define an after callback which can be defined as a function that checks for errors. It also has a few other improvements over GLEW.

Why is it wrapped in a do{}while(0); ?
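The usual reason, as far as I know, is that do { ... } while (0) makes a multi-statement macro behave like a single statement, so it is safe after an unbraced if and before an else. A small sketch with made-up macros (RESET_UNWRAPPED, RESET_WRAPPED), not the article's GL macro; compilers may warn about the unwrapped use, which is the point:

```c
#include <assert.h>

static int resets;

/* Two statements with no wrapper: only the first is governed by an "if". */
#define RESET_UNWRAPPED(x) (x) = 0; resets++

/* Wrapped: the whole body is one statement and still takes a trailing ';'. */
#define RESET_WRAPPED(x) do { (x) = 0; resets++; } while (0)

static int run_unwrapped(int cond)
{
    int v = 5;

    resets = 0;
    if (cond)
        RESET_UNWRAPPED(v);   /* expands to: if (cond) (v) = 0; resets++; */
    (void)v;
    return resets;            /* the increment runs even when cond is 0 */
}

static int run_wrapped(int cond)
{
    int v = 5;

    resets = 0;
    if (cond)
        RESET_WRAPPED(v);     /* one statement, so the else still attaches */
    else
        v = -1;
    assert(v == (cond ? 0 : -1));
    return resets;
}
```

A bare { ... } block would also group the statements, but the semicolon after the macro call would then end the if and leave the else orphaned, which is why the do/while form is preferred.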

Any reason why they don't use trick #1 inside trick #5?

    // Instead of
    x = x ? x : 10;

    // We can use the shorter form:
    x = x ?: 10;

    #define min(a, b) ({ \
        __typeof__ (a) _a = (a); \
        __typeof__ (b) _b = (b); \
        _a < _b ? _a : _b; \
    })

Wouldn't that return _a < _b instead of _a?

indeed it would.

Isn't there some C11 version of the language that would add those kinds of features ? I guess they might break earlier C code though, but I'm not sure.

These really aren't the features you'd add in a language version though honestly. Maybe the standard library, but even then these are so simple that IMO it's not worth adding them. For a lot of these, a big reason they're not in the language is because they work if you understand what they're doing, but they don't work in every context and you get strange errors in those cases.

Ex. ARRAY_SIZE works by simply using 'sizeof' to find the size of the array. This only works if the array was defined earlier in the same section of code though, it doesn't work on say, arrays passed to functions. So the name 'ARRAY_SIZE' for this macro is misleading because it only works on some arrays in some instances. Anybody who's programmed in C would know this, but they could probably write their own macro to do this anyway. It's beginners who wouldn't know how to write this macro, and would also be confused by how it works.

The thing you might consider standardizing is the GNU extensions that they used in this article, namely ({ }) and typeof. Both can be handy honestly, but C11 doesn't standardize either IIRC.

Yes, there is a C11 standard. See for instance http://en.wikipedia.org/wiki/C11_%28C_standard_revision%29 for a list of new features in C11. But no, it doesn't implement these things.

Here's a handy table showing GCC's support: https://gcc.gnu.org/wiki/C11Status.

I'm pretty sure Clang supports C11 well too, didn't find a corresponding table though. It says "By default, Clang builds C code in GNU C11 mode [...]" on this page: http://clang.llvm.org/compatibility.html.

“Clang strives to both conform to current language standards (up to C11 and C++11)”. If anything's missing, it's a bug. clang also kindly defaults to C11 so I don't have to curse and go add a command line flag every time I write “for (int i = ...”.

GCC 5 will default to -std=gnu11 (C11 with GNU extensions)

Here's a readily digestible summary of what's new in C11:


What happens if I call ARRAY_SIZE on an empty array?

C doesn't allow empty arrays.

How about a dangling pointer that decayed from an array?

In that case, it'd evaluate to

    sizeof(mytype *) / sizeof(mytype)
which will usually be 0 or 1 (most of my arrays hold at least words.)

Sorry, I was misremembering the edge case here, which indeed is still around pointer decay:

  #include <stdio.h>

  #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

  void p (int* a) {
    printf("ARRAY_SIZE of arr = %zu, %zu, %zu\n", ARRAY_SIZE(a), sizeof(a), sizeof((a)[0]));
  }

  int main () {
    int arr [] = {0, 1, 2, 3, 4};
    printf("ARRAY_SIZE of arr = %zu, %zu, %zu\n", ARRAY_SIZE(arr), sizeof(arr), sizeof((arr)[0]));
    p(arr);
  }

ARRAY_SIZE of arr = 5, 20, 4

ARRAY_SIZE of arr = 2, 8, 4

In the function, arr has decayed into a pointer and lost the additional size information (sizeof now gives the size of a pointer: 8 bytes, or 64 bits, on my host). That is why you usually get the length of the array before invoking the function and pass it in as additional information, or wrap the complete array in a struct to preserve the sizeof info (http://spin.atomicobject.com/2014/03/25/c-single-member-stru...).
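Both workarounds can be sketched in a few lines. The names sum_ints, five_ints and count_in_struct are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Option 1: pass the element count explicitly alongside the pointer. */
static int sum_ints(const int *a, size_t n)
{
    int total = 0;

    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Option 2: wrap the array in a struct so its type, and therefore its
 * size, is still visible inside the callee. */
struct five_ints {
    int items[5];
};

static size_t count_in_struct(const struct five_ints *p)
{
    return ARRAY_SIZE(p->items);   /* items is still an array here */
}
```

The caller computes ARRAY_SIZE(arr) where arr is still an array and hands the result to sum_ints; the struct version trades that extra parameter for a fixed element count baked into the type.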

What is the difference between x ?: y and x || y?

|| gives 1 or 0 only?

I see, thanks.
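A tiny sketch of the difference, assuming a compiler that accepts the GNU `?:` extension (GCC or Clang); the function names are illustrative.

```c
#include <assert.h>

/* GNU extension: yields x itself when x is non-zero, otherwise fallback. */
static int default_if_zero(int x, int fallback)
{
    int result = x ?: fallback;

    /* Same value as the standard spelling, with x written only once. */
    assert(result == (x ? x : fallback));
    return result;
}

/* Standard C: || always yields exactly 0 or 1. */
static int logical_or(int x, int y)
{
    return x || y;
}
```

So for x = 7 and y = 10 the first form gives 7, while `x || y` gives 1.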

"The C Companion" gives logical identities like

    (A && B) || (A && !B) == A,
but what he means by A on the RHS, I think, is that you must take into account the fact that A_LHS might be zero. You can't really write this identity and give a constant on the right, so he wrote the next best thing.

Well, the real next best thing would be !(!A).

#define ARRAY_SIZE(x) ((&(x))[1] - (x)) :-)
