What is “:-!!” in C code? (stackoverflow.com)
242 points by frontfor on Nov 21, 2016 | 83 comments

That code seems needlessly arcane. You can get the same results without resorting to anonymous bitfields:

  #define BUILD_BUG_ON_ZERO(e) (sizeof(char[(e) ? -1 : 0]))
which, when it fails, results in

  error: size of unnamed array is negative
which is no worse than the provided code which results in

  error: negative width in bit-field '<anonymous>'
It's worth noting that declaring an array with 0 elements is not allowed in C99. Likewise, a struct with no named members has undefined behavior in C99 [1].

[1] http://stackoverflow.com/a/12918937/959866

Edit: You can get around having 0 elements by using

  #define BUILD_BUG_ON_ZERO(e) (sizeof(char[(e) ? -1 : 1]) - 1)
but you're starting to lose clarity again.

Because of C99 variable-length arrays [1], some compilers will happily accept your macro even when it is passed a value that is not a compile-time constant.

For example, this compiles but segfaults at runtime (GCC 4.9.2):

  int main(int argc, char * argv[]) {
    return (sizeof(char[(argc) ? -1 : 1]) - 1);
  }
[1] https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html

Ah, good point. One more reason for me to hate VLAs. ;)

The negative-length array trick is used in the Linux kernel source code [1]. I expect it's compiled with -Wvla though.

[1] https://github.com/torvalds/linux/blob/v4.5/arch/x86/boot/bo...

Looks exactly like the macro Jan Beulich replaced with the one discussed in OP: https://github.com/torvalds/linux/commit/8c87df457cb58fe75b9...

They are so good that they became optional in C11[0] and thankfully were not adopted by ANSI C++.

[0] ISO/IEC 9899:2011, Programming Languages - C, clause 4

    -#define BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
    +#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
    +#define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:-!!(e); }))
The commit that introduced the notation, actually used something similar to your second approach: https://github.com/torvalds/linux/commit/8c87df457cb58fe75b9...

And this is why I hated learning C. You went through a 500-page C programming book, did all the exercises, had a good grasp of the language, but you were still useless because you didn't know anything about those weird macros, Makefiles, automake, gcc flags, etc.

Bad C code is bad C code. There's equally bad JavaScript, Ruby, and Python code too; none of those languages have particularly simple semantics.

Moreover, this particular example is due to compatibility issues across the multitude of platforms that Linux supports, necessitating support for versions of C older than any language you might like. Were Linux only ever to be compiled with a C11 compiler, this could be replaced with simply `static_assert(x)`.

And speaking as someone who's coded C for 16 years and makes a living doing so: it is rare that I write a macro; my Makefiles are about 5 lines long; I've never touched automake; and the only GCC flags I ever use (and use consistently) are `-std=gnu11 -Wall -Werror -O3`. Heck I wrote a cross-platform NES emulator last month following these rules.

There's a lot of bad C out there to find. Don't let that turn you off from writing good C.

People often regard C code as inherently ugly because of nasty coding styles that were prevalent in the 80s and 90s – all local variables declared at the top of functions, lots of global variables, "space saving" indentation and often no whitespace between operators, bad separation of concerns. C can actually be a decent "high-level" language if you conform to a more modern style of programming. Codebases like Git and Nginx demonstrate that this is possible.

I generally agree with you, but C even at its nicest is not "high level" except in the trivial sense of having a recursive grammar (which assembly doesn't have).

The whole advantage of C is that it is a powerful, nice, and portable way to express roughly the same things you would otherwise express in assembly. Most of the really nasty undefined behaviour of modern compilers comes from forgetting this.

C is actually "high-level" in most aspects and we tend to forget that because of newer languages with lots of bells and whistles. C gives you a lot over assembly: functions with (mostly) no need to worry about calling conventions, automatic variables instead of dealing with registers and stack frame offsets, expression-oriented syntax that facilitates nesting of many operations within a single statement, structured programming support instead of scattered jumps and a rudimentary type system around values, pointers, arrays and structs.

Nothing fancy, but it means that the language can be compiled in a straightforward manner without any runtime support and a programmer can easily have a complete mental model of it.

Compared with languages like Ada, Modula-2, Turbo Pascal, Mesa/Cedar, Algol, PL/I, C is what we used to call a middle level language in those days.

And Algol, PL/I and their respective variants are around 10 years older than C.

Indeed. I don't want to give the impression that C is just assembly. It gives all the advantages you name, and they are very important.

But I still say that C is a way of talking about same things you want to talk about in assembly, albeit while automating some tedious but important things like register allocation. You are still commanding the computer at a low level: still telling it which byte to put on which IO port or memory location.

Of course you can build higher level abstractions on this -- but only to a point. C compilers go wrong when they imagine a C program lives in such a higher level of abstraction -- whereas the other languages thrive on defining such abstract machines.

But you can't actually directly address I/O ports in C, and directly addressing memory (via pointer arithmetic which oversteps the bounds of an object, or which puns the type of an object) invokes undefined behavior. Both those things depend on details of the underlying architecture, ABI, and operating system, and are abstracted away in C.

What is wrong with declaring local variables at the top of functions ?

Because you generally want to put declarations close to their uses to minimize their scope? Reduce cognitive load and improve local reasoning?

But in general functions shouldn't really be all that long anyway (if they are, break 'em up!). I kinda like declaring my variables all at the top — I think it looks nice & clean.

Even in short functions, "declare it when it's needed" makes your program follow a data-flow style and enables you to use const more, since in many cases the initialization is the only assignment that you do to that variable.

While the parent technically said "at the start of functions", at least C89 (I don't remember about K&R) allowed declarations at the start of any block. This is not incompatible with minimizing scope - just introduce more blocks. And in fact, introducing a block to bound the scope of a variable allows a smaller scope than simply declaring it halfway through a function, as you can also end it early.

If you declare a variable as `const`, you have to initialize it during declaration. So you have to place the declaration exactly where the value expression becomes available.

Also, it acts as a declaration of intent and makes the data dependencies in a function more explicit.

If I declare a variable in the middle of a function, I am declaring intent that this variable shouldn't be used in the first half of the function – maybe the data it is supposed to hold cannot be available at that point.

> People often regard C code as inherently ugly because of nasty coding styles that were prevalent in the 80s and 90s

Those code styles are still prevalent in 2016, speaking from the experience of occasionally having to look into how people write C code to integrate into Java, .NET, Python, Ruby at the enterprise level, which just stresses my point of view about the language.

Especially since, outside the HN bubble, many don't even know what a static analyser or code review is all about.

Saw Hungarian notation used by two distinct developers in 2016. One of them freshly graduated and otherwise a halfway decent programmer. As far as I could tell both used it just as wrong as most developers did almost two decades ago. Bad styles find a way to live on.

Automake is nice if you know how to use it.

It provides a lot of things on top of a simple makefile...

You don't need it to write C, but everything that uses it is not "bad C code".

I wonder at what point even Linux will have to abandon the older dialects of C.

Why should it "have to"? (Honest question, I'm certainly not advocating keeping old styles for the heck of it.)

Old styles have a cost, and old compilers are not used very often.

At some point, even for a compatibility focused project like Linux, that cost must get too big for the benefit it provides.

>>Bad C code is bad C code. There's equally bad JavaScript, Ruby, and Python code too; none of those languages have particularly simple semantics.

The problem is that C is in everything. OK, not literally, but you know what I mean.

Bad JavaScript is easy to avoid. You just navigate to another website. Bad C though, not so much. Often times it's in the kernel or somewhere deep like that.

As a side effect of UNIX adoption in the industry.

Like it happened with the browser and JavaScript, C got adopted thanks to UNIX.

Before UNIX's adoption, many of us had a pleasant coding life using Turbo/Quick/Apple Pascal, Modula-2, Basic compilers.

No idea why you got so badly downvoted, your comment is pretty much spot on, bad C is hard to avoid.

I don't know either. It happens sometimes. I think it's just the bandwagon effect, and depends on the initial votes and their timing.

I think that is the case, people see a faded grey and just hit down.

I agree that C is complicated, but this one has nothing to do with macros, flags or anything external. All you need to know to understand it is in the language itself.

(Similar misunderstanding to the "down to operator" --> https://stackoverflow.com/questions/1642028/what-is-the-name...)

I've never understood the feeling that C is complex. Simple syntax, simple rules, a few hidden gotchas for the unwary.

"down to operator" = unhelpful use of whitespace. postdecrement x, check if >0

I still find C one of the easiest to pick apart. Obfuscated C entries excepted!

C is easy, computers are complex. Sometimes these are confused. Like in this case, the syntax is easy to human-parse, but it's harder to reason about the result of the expression on different architectures and with different compilers, yet the complexity is caused by the computer, not the language.

Quite. I started out in a world of C99, endianness and differing bit packing. My early day to day involved many more unions and a lot more time wondering if SunOS cc was going to behave the same as Dynix or MSC on DOS.

It's easy to forget how much of that mindless preprocessor and make conditionality we've left behind. Sadly in exchange for hardware that's much blander now.

> simple rules

The rules are simple for simple projects. Then you get into things like undefined behaviours, implementation-specific behaviours, etc. Which compilers will abuse heavily for optimisations without telling you about it - for example like the case of silently removing NULL checks. Also any undefined behaviour at all in your source code allows the compiler to throw away all code after it. Without telling you about it.

Maybe the book I read was not that good (Deitel&Deitel) but the section on macros was pretty tiny and I don't think I'd have understood that macro on my own. Anyways, I think C is great, just that most books don't seem to cover the important stuff to work on serious C projects.

The macro isn't the complicated part in the OP (`e` just gets substituted with whatever the argument is). The definition the macro generates is (for most cases, needlessly) complicated, but that has nothing to do with macros or the preprocessor.

The general rules of macros are:

1. Don't use them; prefer static functions.

2. Use them as symbolic constants only.

3. Use parameterized macros only when you must use # or ## (i.e., for code abstraction), and then use them sparingly.

4. If you really insist, wrap every use of the arguments in parentheses, and be careful not to write the name of an argument somewhere where you don't mean it.

You'll get pretty far knowing next to nothing about macros (e.g. expansion phases and tokenization rules) by following the above rules.

Deitel&Deitel make some amazing books as far as I'm concerned. They were my source material through college and a very enjoyable introduction to C, C++, and Java.

There's nothing particularly weird about this macro, and it has nothing to do with makefiles, automake, gcc flags, etc.

It is insane, though, and should have been replaced with a compiler feature to do static assertions twenty years ago.

It was, five years ago. (C11 provides static_assert.) C was probably one of the first widely-used languages to support such a thing, beside Ada and C++.

C++ added static_assert with C++11.

Though, upon further examination C++11 was published 1 month before C11 (September and October, respectively). I guess your point still stands.

Eiffel did it first.

No, Boost preprocessor is insane.

Why not both?

While this construct looks pretty weird if you decompose it element by element using the rules of the language it becomes rather clear IMO. The most arcane feature here is probably the bitfield syntax that you might not encounter very often in the wild.

The problem with C IMO is that the compilers are very permissive by default and it's easy to trigger undefined behaviour with seemingly harmless code if you're not careful. Things like promotion rules make it difficult to guess at a glance how the code is going to behave if you don't have a very good understanding of these (rather quirky) rules. There's also the whole mess of the pointer vs. array distinction, which sometimes matters and sometimes doesn't, etc...

By comparison these BUILD_BUG_ON macros are relatively straightforward IMO. The naming is a bit misleading unfortunately but at least it's in full CAPS so you know it's a macro...

Oh man, wait until you find out about the Javascript ecosystem.

Funnily enough, I learned JavaScript about at the same time I learned C (~13 years ago) and at that time, JavaScript was dead simple to learn in comparison. I couldn't imagine myself trying to learn JavaScript today though, I'd probably be banging my head on walls.

FYI, this is valid js (equivalent to alert(1)):


Ha! That's epic.

Where does the alert come from?

  function range_error_on_zero(x) {return Array(-!!x).length}

This is not related to the OP's question on Stack Overflow. Your function fails at runtime, but the C code above fails at compile time.

But only slightly less ugly was my point.

And then you look 10 characters to the left of the code and hey, there's the explanation to what it does!

That's why Go is so popular. It's the tooling and user experience that matters.

I'm not sure that macro in this submission shows that Go is better.

Neither Go nor C has static assertions, the macro in question uses clever techniques to implement one in C, but you probably can't do that in Go. So C is more powerful here.

Another example of clever C macro use is implementation of foreach loop (C doesn't have foreach) in linux kernel, here's question about it:


> Neither Go nor C has static assertions
C11 has:

    _Static_assert(const_expr, fail_string)

Ha, looks like my C knowledge is a bit outdated.

I haven't really used C after I graduated and I remember my professor showing us how to implement static assertion :)

I don't agree that Go's tooling is what makes it popular. Mainly because its tooling is not very good. All of the vendoring tooling is broken in one way or another, the linters and vet-checkers have questionable advice in some cases, not to mention that the standard library has so many quirks that come from the fact that Go is trying to be a systems programming language that hides details about the system from you.

[ Disclaimer: I'm a maintainer of runC and have been programming in Go for many years. But that doesn't mean I have to like the language. Give me C any day. ]

Comparing vendoring tooling for Go and C in C's favor seems odd.

Personally I think Go actually makes a pretty bold statement about dependency management: people don't know what they want. I also have a sneaking suspicion that Git and Github's interface is in no small part to blame.

Golang made me come to one important realization: if I can't build a project by just cloning the git repository, then why the hell not?

Or related: If I cannot build a project by just cloning the repo at any arbitrary location, then why the hell not?

If you have not experienced this before, try cloning a repo that uses Go-1.6-style vendoring anywhere outside the GOPATH.

Or rather, it's why almost every popular tool is popular. Take Rails for instance. It's just that Go competes with C and boy howdy does the comparison look good to anyone who can make the switch.

Note that C11 has a more friendly static assertion feature:

    _Static_assert(condition, "error message if condition evaluates to false");
(You can also use "static_assert" as an alias for the ugly keyword after including <assert.h>.)

I wonder what kernel developers think about migrating toward C11.

I think there's a better chance for a static_assert proposal in C, especially since Torvalds isn't particularly fond of C++.

Edit: whoops, apparently static_assert is in C11 as well as in C++11. Ignore this comment.

C11 isn't C++

Ah, neat. A version of static_assert for C. What kinds of expressions can it actually evaluate, though?

Any constant expression. In the kernel it's mostly non-zero flags, sizes of things, etc. See https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=BU...

There's also a fun compile time check for arrays: https://github.com/torvalds/linux/blob/1001354ca34179f3db924...

Here's another one.


  #include <stdio.h>

  int main() {
    int x = 10;
    while (x --> 0) {  // x goes to 0
      printf("%d ", x);
    }
  }

Every time I feel pretty good about my knowledge of C, something like this comes along and reminds me I don't know anything at all.

Semi related to the "is C ugly" debate.

I know Fortran 90 and use it for some simulations -- since it has a matrix syntax it's not that hard to port matlab code. But I know Fortran is going where the wild roses grow, and wonder if I should spend the time to learn C for high-dimensional numerics.

nah, FORTRAN is alive and well in high-performance mathematics. I wouldn't worry.

The question adds confusion to it, should have put the whole macro to give it context.

Rust discussion of a similar feature: https://github.com/rust-lang/rust/issues/6676

It doesn't look like anything came out of it?

It was implemented but then removed again. Fortunately, it can be implemented as a library in Rust (https://github.com/oli-obk/rust-sa).

It's a person misreading syntax :-/

The next time developers cry about JavaScript's incompetence because of potential hacky type coercion stupidity I will shove this in their face.

And why would that justify JS issues? If anything, it shows we have ignored the history and we are repeating it.
