Hacker News new | past | comments | ask | show | jobs | submit login
Weekend projects: getting silly with C (lcamtuf.substack.com)
238 points by nothacking_ 4 months ago | hide | past | favorite | 113 comments



> The above example will print the value of a, but it won’t be initialized to 123!

It certainly could do though. In C, using an uninitialised variable does not mean "whatever that memory happened to have in it before" (although that is a potential result). Instead, it's undefined behaviour, so the compiler can do what it likes.

For example, it could well unconditionally initialise that memory to 123. Alternatively, it could notice that the whole snippet has undefined behaviour so simply replace it with no instructions, so it doesn't print anything at all. It could even optimise away the return that presumably follows that code in a function, so it ends up crashing or doing something random. It could even optimise away the instructions before that snippet, if it can prove that they would only be executed if followed by undefined behaviour – essentially the undefined behaviour can travel back in time!


UB can not travel back in time in C. Although it is true that it can affect previous instructions, but that code is reordered or transformed in complicated ways is true even without UB.


The time-travelling UB interpretation was popularized by this blog post about 10 years ago [1].

I'm not enough of a specification lawyer to say that this is definitely true, but the reasoning and example given there seems sound to me.

[1] https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=63...


Yes, random blog posts did a lot of damage here. Also broken compilers [1]. Note that blog post is correct about C++ but incorrectly assumes this is true for C as well.

[1]. https://developercommunity.visualstudio.com/t/Invalid-optimi...


I'm inclined to trust Raymond Chen and John Regehr on these matters, so if you assert that they're incorrect here then a source to back up your assertion would help your argument.


I am a member of WG14. You should check the C standard. I do not see how "time-travel" is a possible reading of the definition of UB in C. We added another footnote to C23 to counter this idea:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf "Any other behavior during execution of a program is only affected as a direct consequence of the concrete behavior that occurs when encountering the erroneous or non portable program construct or data. In particular, all observable behavior (5.1.2.4) appears as specified in this document when it happens before an operation with undefined behavior in the execution of the program."

I should point out that compilers also generally do not do true time-travel: Consider this example: https://godbolt.org/z/rPG14rrbj


So maybe we have different definitions of "time travel". But I recall that

- if a compiler finds that condition A would lead to UB, it can assume that A is never true - that fact can "backpropagate" to, for example, eliminate comparisons long before the UB.

Here is an older discussion: https://softwareengineering.stackexchange.com/q/291548

Is that / will that no longer be true for C23? Or does "time-travel" mean something else in this context?


There may be different definitions, but also a lot of incorrect information. Nothing changes with C23 except that we added a note that clarifies that UB can not time-travel. The semantic model in C only requires that observable effects are preserved. Everything else can be changed by the optimizer as long as it does not change those observable effects (known as the "as if" principle). This is generally the basis of most optimizations. Thus, I call time-travel only when it would affect previous observable effects, and this what is allowed for UB in C++ but not in C. Earlier non-observable effects can be changed in any case and is nothing speicifc to UB. So if you call time-travel also certain optimization that do not affect earlier observable behavior, then this was and is still allowed. But the often repeated statement that a compiler can assume that "A is never true" does not follow (or only in very limited sense) from the definition of UB in ISO C (and never did), so one has to be more careful here. In particular it is not possible to remove I/O before UB. The following code has to print 0 when called with zero and a compiler which would remove the I/O would not be conforming.

int foo(int x)

{

  printf("%d\n", x);

  fflush(stdout);

  return 1 / x;
}

In the following example

int foo(int x)

{

  if (x) bar(x);

  return 1 / x;
}

the compiler could indeed remove the "if" but not because it were allowed to assume that x can never be zero, but because 1 / 0 can have arbitrary behavior, so could also call "bar()" and then it is called for zero and non-zero x and the if condition could be removed (not that compilers would do this)


I think the clarification is good, probably the amount of optimizations that are prevented by treating volatile and atomics as UB barriers is limited, but as your example show, a lot of very surprising transformations are still allowed.

Unfortunately I don't think there is a good fix for that.


E.g. this godbolt: https://godbolt.org/z/eMYWzv8P8

There is unconditional use of a pointer b, which is UB if b is null. However, there is an earlier branch that checks if b is null. If we expected the UB to "backpropagate", the compiler would eliminate that branch, but both gcc and clang at O3 keep the branch.

However, both gcc and clang have rearranged the side effects of that branch to become visible at the end of the function. I.e. if b is null, it's as if that initial branch never ran. You could observe the difference if you trapped SIGSEGV. So even though the compiler didn't attempt to "time-travel" the UB, in combination with other allowed optimizations (reordering memory accesses), it ended up with the same effect.


While interactions with volatile and interactive streams cannot time-travel, anything else is free to - the standard only imposes requirements on a conforming implementation in terms of the contents of files at program termination, and programs with undefined behaviour are not required to terminate, so there are approximately no requirements on a program that invokes undefined behaviour.


> Also broken compilers [1].

The issue you linked to is not a counter example because, as the poster said, g may terminate the program in which case that snippet does not have undefined behaviour even if b is zero. The fact that they bothered to mention that g may terminate the program seems like an acknowledgement that it would be valid to do that time travelling if it didn't.

> Note that blog post is correct about C++ but incorrectly assumes this is true for C as well.

Presumably you're referring to this line of the C++ standard, which does not appear in the C standard:

> However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

I looked at every instance of the word "undefined" in the C standard and, granted, it definitely didn't have anything quite so clear about time travel as that. But it also didn't make any counter claims that operations before are valid. It pretty much just said that undefined behaviour causes behaviour that is undefined! So, without strong evidence, it seem presumptuous to assume that operations provably before undefined behaviour are well defined.


The poster is me. You are right that this is not an example for time-travel. There aren't really good examples for true time travel because compilers generally do not do this. But my point is that with compilers behaving like this, people might confuse this for time-traveling UB. I have certainly met some who did and the blog posts seems to have similar examples (but I haven't looked closely now).

Note that I am a member of WG14. We added more clarification to C23 to make clear that this is not a valid interpretation of UB, see here: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf


I didn't notice that section when I last read through C23, but I'm very glad to see it. Reining in UB is one of the hardest problems I've had to deal with, and being able to say operations are defined up to the point of UB makes my job so much easier.

The lack of clarity in earlier standards made it impossible to deal with code incrementally, since all the unknown execution paths could potentially breach back in time and smash your semantics.


Thank you. This was my motivation. It is only a small step... much more work to do.


Ok, fair enough. I must admit I was looking at C99 as I thought that was most generally relevant, I don't follow recent C standards (as much as I do those for C++) and C23 hasn't been ratified yet. I've found your new snippet:

> In particular, all observable behavior (5.1.2.4) appears as specified in this document when it happens before an operation with undefined behavior in the execution of the program.

I consider that a change in the standard but, of course, that's allowed, especially as it's backwards compatible for well defined programs.

The wording is a little odd: it makes it sound a like you need some undefined behaviour in order to make the operations beforehand work, and, taken very literally, that operations between two undefined behaviours will work (because they're still "before an operation with undefined behavior"). But I suppose the intention is clear.


The definition of UB (which hasn't changed) is: "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirement."

Note that the "for which" IMHO already makes this clear that this can not travel in time. When everything could be affected these words ("for which") would be meaningless.


> but that code is reordered or transformed in complicated ways is true even without UB.

Without undefined behavior, the compiler emits code that has the behavior defined by the code —- the ordering may be altered, but not the behavior.


Yes, and with undefined behavior, the compiler has to emit code that has the behavior defined by the code up to the operation that has undefined behavior.


That is false. If a compiler determines that some statement has undefined behavior, it can treat it as unreachable, and, transitively, other code before it as unreachable.

  printf("hello\n"); // this doesn't have to print
  x = x / 0;         // because this is effectively a notreached() assertion


This is in direct contradiction to what uecker says. Can you back up your claim -- for both C and C++? Putting your code in godbolt with -O3 did not remove the print statement for me in either C or C++. But I didn't experiment with different compilers or compiler flags, or more complicated program constructions.

https://godbolt.org/z/8nbbd3jPW

I've often said that I've never noticed any surprising consequences from UB personally. I know I'm on thin ice here and running risk of looking very ignorant. There are a lot of blogposts and comments that spread what seems like FUD from my tiny personal lookout. It just seems hard to come across measureable evidence of actual miscompilations happening in the wild that show crazy unpredictable behaviour -- I would really like to have some of it to even be able to start tallying the practical impact.

And disregarding whatever formulations there are in the standard -- I think we can all agree that insofar compilers don't already do this, they should be fixed to reject programs with an error message should they be able to prove UB statically -- instead of silently producing something else or acting like the code wouldn't exist.

Is there an error in my logic -- is there a reason why this shouldn't be practically possible for compilers to do, just based on how UB is defined? With all the flaws that C has, UB seems like a relatively minor one to me in practice.

Another example: https://godbolt.org/z/b5j99enTn

This is an adaption from the Raymond Chen post, and it seems to actually compile to a "return 1" when compiling with C++ (not with C), at least with the settings I tried. And even the "return 1" for me is understandable given that we actually hit a bug and there are no observeable side-effects before the UB happens. (But again, the compiler should instead be so friendly and emit a diagnostic about what it's doing here, or better return an error).

Un-comment the printf statement and you'll see that the code totally changes. The printf actually happens now. So again, what uecker says about observable effects seems to apply.


In this [1] example GCC hoists, even in C mode, a potentially trapping division above a volatile store. If c=0 you get one less side effect than expected before UB (i.e. the division by zero trap). This is arguably a GCC bug if we agree on the new standard interpretation, but it does show that compilers do some unsafe time travelling transformations.

Hoisting the loop invariant div is an important optimization, but in this case I think the compiler could preserve both the optimization and the ordering of the side effects by loop-peeling.

[1] https://godbolt.org/z/ecsdrPa94


Thanks for the example. But again I can't see a problem. The compiler does not actually prove UB in this case, so I suppose this doesn't qualify as applying (mis-) optimizations silently based on UB. Or what did I miss?


Compilers don't prove UB; they assume absence of UB.

That, plus a modicum of reasoning like "if this were to be evaluated, it would be UB" (therefore, let's assume that is not evaluated).


Let's not get pedantic about what "proving UB" actually means -- that might lead to philosophic discussions about sentient compilers.

Fact is that in this instance, the compiler did not remove a basic black of code (including or excluding "observeable side-effects" leading up to the point of UB happening). It would not be valid for the compiler to assume that the path is never taken in this case, even assuming that UB never happens, because depending on the the value of the variables, there are possible paths through the code that do not exhibit UB. In other words, "the compiler wasn't able to prove UB".

So this is not an instance of the situation that we are discussing. The emitted code is just fine, unless a division by zero occurs. Handling division by zero is responsibility of the programmer.

Nobody is arguing that UB can lead to weird runtime effects -- just dereference an invalid pointer or whatever.

The issue discussed is that based on assumptions about UB, the compiler emits code that does not correspond to the source in an intuitive way, for example a branch of code is entirely removed, including any observeable side-effects that logically happened before the UB.

Now the point of the GGP poster is probably that the observeable side-effect (the volatile access) does not happen at all because the UB happens first. But I would classify this case differently -- the volatile access is not elided from the branch.

Further more, it might well be that (and let me assume so) the order of the volatile access and the division operation that causes the UB are probably not defined as happening in a strict sequence (because, I'm assuming again as any reasonable standards layman would, UB is not considered a side-effect (that would kinda defeat the point, disallowing optimizations)). So it's entirely valid for the compiler to order the operation that causes the (potential) UB before the volatile access.


> The issue discussed is that based on assumptions about UB, the compiler emits code that does not correspond to the source in an intuitive way, for example a branch of code is entirely removed, including any observeable side-effects that logically happened before the UB.

That's literally what happens in my example: the div is hoisted above the volatile read which is an observable side effect. The practical effect is that the expected side effect is not executed even if it should have happened-before the SIGFPE.

uecker claims that the UB should still respect happens-before, and I'm inclined to agree that's an useful property to preserve.

And I don't see any significant difference between my example and what you are arguing.


See my other comments. I don't even think your example has an observeable side-effect.


Btw. you literally said "If a compiler determines that some statement has undefined behavior, it can treat it as unreachable".


The compiler is moving a potentially UB operation above a side effect. This contradicts uecker non-time-traveling-ub and it is potentially a GCC bug.

If you want an example of GCC removing a side effect that happens-before provable subsequent UB: https://godbolt.org/z/PfoT8E8PP but I don't find it terribly interesting as the compiler warns here.


In your godbolt

  extern volatile int x;
  int ub() {
      int r = x;
      r += 1/0;
      return r;
  }
and the output is

  ub:
        mov     eax, DWORD PTR x[rip]
        ud2
I don't see what is the side effect that you say is removed here?

As for the earlier example (hoisting the division out of the loop), I was going to write a wall of text explaining why I find the behaviour totally intuitive and in line with what I'd expect.

But we can make it simpler: The code doesn't even have any observeable side effect (at least I think so), because it only reads the volatile, never writes it! The observeable behaviour is exactly the same as if the hoist hadn't happened. I believe it's a totally valid transformation, at least I don't have any concerns with it.


I think this one better illutrates the point you were making: https://godbolt.org/z/qbfhb6dKo

Here I've inserted an increment of the volatile (i.e. a write access) at the start of the loop. If the divisor is 0, in the optimized version with the division hoisted out of the loop, the increment will never actually happen, not even once. Whereas it should in fact happen 1x at the beginning of the first loop iteration with "unoptimized" code.

I don't find this offputting: First, the incrementing code is still in the output binary. I think what is understood by "time travel", and what would be offputting to most programmers, is if the compiler was making static inferences and was removing entire code branches based on that -- without telling the user. If that was the case, I would consider it a compiler usability bug. But that's not what's happening here.

Second, I think everybody can agree that the compiler should be able to reorder a division operation before a write access, especially when hoisting the division out of a loop. So while maybe an interesting study, I think the behaviour here is entirely reasonable -- irrespective of standards. (But again, I don't think uecker, nor anyone else, said that the compiler may never reorder divisions around side-effecting operations just because the division "could" be UB).


Well, that hoisting is contrary to what uecker says is the standard intent.

I think that discussing about omitting branches is a red herring, there is no expectation that the compiler should emit branches or basic blocks that match the source code even in the boring, non-ub case.

The only constraint to compiler optimizations is the as-if rule and the as-if rule only requires that side effects and their order be preserved for conforming programs. Uecker says that in addition to conforming programs, side effects and their ordering also need to be preserved up to the UB.

I do of course also find it unsurprising that the idiv is hoisted, but, as the only way that the standard can constraint compilers is through observable behaviour, I don't see how you can standardize rules where that form of hoisting is allowed while the other are not.

In fact the compiler could easily optimize that loop while preserving the ordering by transforming it to this:

  extern volatile int x;
  int ub(int d, int c) {
    int r;
    x += 3;
    r += x;
    int _div = d / c;
    r += _div;

    for (int 2 = 0; i < 100; ++i) {
        x += 3;
        r += x;
        r += _div;
    }
    return r;
  }
This version preserves ordering while still optimizing away the div. In fact this would also work if you replaced the volatile with a function call, which currently GCC doesn't optimize at all.


Thanks for clarifying, I understand much better now.

And I think I can agree that under a strict interpretation of the rule that UB doesn't get reordered with observable behaviour the GCC output in the godbolt is wrong.

Maybe it has something to do with the fact that it's volatiles? I've hardly used volatiles, but as far as I know their semantics have traditionally been somewhat wacky -- poorly understood by programmers and having inconsistent implementations in compilers. I think I've once read that a sequence of volatile accesses can't be reordered, but other memory accesses can very well be reordered around memory accesses. Something like that -- maybe the rules in the compiler are too complicated leading to an optimization like that, which seems erroneous.

But look at this, where I've replaced the volatile access with a printf() call as you describe: https://godbolt.org/z/Ec8aYnc3d . It _does_ get optimized if the division comes before the printf. The compiler seems to be able to do the hoisting (or maybe that can be called "peeling" too?). But not if you swap the two lines such that the printf comes before the division. Maybe the compiler does in fact see that to keep ordering of observable effects, it would have to duplicate both lines, effectively duplicating the entire loop body for a single loop iteration. In any case, it's keeping both the printf() and the div in the loop body.


I do believe that by a strict reading of the standard GCC is non conforming here. This reading of the standard is not agreed by the GCC developers though: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104800

If the first div happens before the first printf, then it can be CSE out of the loop as any trap would have happened before the printf anyway, so no reordering, and if it didn't trap the first time it wouldn't have trapped later either. In this case CSE is fine and there are no reordering.

If the div happens after printf, then reordering is prohibited not only to preserve side effects before UB (which we have seen GCC doesn't necessarily respects), but because for the most part printf is treated as an opaque function: it could legitimately exit, or longjump out of the function or never return, so on the abstract machine the UB might not happen at all. So it is not safe to hoist trapping instruction like div above opaque functions (but it is safe to sink them).

Still the modification I showed for volatile can be applied as well: peel the first iteration out of the loop so that the first printf can be done before computing the div to be CSEd out of the loop. But GCC doesn't do it although it seems desirable.


I'm very sorry, yes, you are right of course the load is still there. I was so fixated in producing a minimal test case that I failed to interpret the result.

Now I'm not able to reproduce the issue with a guaranteed UB. I still think the loop variant shows the same problem though.

In any case, yes, according to the C standard a volatile read counts as an observable side effect.


The implementation can assume that the program does not perpetrate undefined behavior (other than undefined behavior which the implementation itself defines as a documented extension).

The only way the program can avoid perpetrating undefined behavior in the statement "x = x / 0" is if it does not execute that statement.

Thus, to assume that the program does not invoke undefined behavior is tantamount to assuming that the program does not execute "x = x / 0".

But "x = x / 0" follows printf("hello\n") unconditionally. If the printf is executed, then x = x / 0 will be executed. Therefore if the program does not invoke undefined behavior, it does not execute printf("hello\n") either.

If the program can be assumed not to execute printf("hello\n"), there is no need to generate code for it.

Look at the documentation for GCC's __builtin_unreachable:

> Built-in Function: void __builtin_unreachable (void)

> If control flow reaches the point of the __builtin_unreachable, the program is undefined. It is useful in situations where the compiler cannot deduce the unreachability of the code.

The unreachable code assertion works by invoking undefined behavior!


x/0 is not reached if the printf blocks forever, exits or return via an exceptional path (longjmp in C, exceptions in C++). Now specifically standard printf won't longjmp or exit (but glibc one can), but it still can block forever, so the compiler in practice can't hoist UB over opaque function calls.

edit: this is in addition to the guarantees with regard to side effects that uecker says the C standard provides.


But does `printf();` return to the caller unconditionally?

This is far from obvious -- especially once SIGPIPE comes into play, it's quite possible that printf will terminate the program and prevent the undefined behavior from occurring. Which means the compiler is not allowed to optimize it out.


`for(;;);` does not terminate; yet it can be removed if it precedes an unreachability assertion.

The only issue is that writing to a stream is visible behavior. I believe that it would still be okay to eliminate visible behavior if the program asserts that it's unreachable. The only reason you might not be able to coax the elimination out of compilers is that they are being careful around visible behavior. (Or, more weakly, around external function calls).


Yeah but do you have an actual instance of "time travel" happening? Without one the issue is merely theoretic discussion of how to understand or implement the standards. If you provide a real instance, the practical impact and possible remedies could be discussed.


Mmmh, how about

    #include <stdio.h>


    int f(int y, int a) {
        int x, z;
        printf("hello ");
        x = y / a;
        printf("world!");
        z = y / a;
        return x+y;
    }
In godbolt, it seems the compiler tends to combine the two printfs together. So if a=0, it leads to UB between the printfs, but that wont happen until after the two printfs. Here the UB is delayed. But will the compiler actually make sure that in some other case, the x/a won't be moved earlier somehow? Does the compiler take any potentially undefined behavior and force ordering constraints around them? ...The whole point of UB is to be able to optimize the code as if it doesn't have undefined behavior, so that we all get the maximum optimization and correct behavior as long as there's no UB in the code.


If a compiler can determine that some statement is UB, it can treat that as an assertion that the code is unreachable. All other statements which reach only that code and no other are also unreachable.

A compiler's analysis can go backward in time. That is to say, the compiler can build a model of what happens in some section of code over time, and analyze it whichever way it wants.

You cannot go back in time from execution time to translation time, but the translator can follow the code as if it were executing it at translation time.


In C, it is only undefined behavior to access an automatic object that has not been initialized.

Static objects are always initialized, so the situation cannot arise.

That leaves dynamic ones, like uninitialized struct members in a malloced structure.

Accessing uninitialized dynamic memory means isn't undefined behavior in C. It results in whatever value is implied by the uninitialized bits. If the type in question has no trap representations, then it cannot fail.


This features the construct

  switch(k) {
    if (0) case 0: x = 1;
    if (0) case 1: x = 2;
    if (0) default: x = 3;
  }
which is a switch where you don't have to write break at the end of every clause.

  #define brkcase if (0) case
That might be worth using. Compilers won't love the control flow but they'll probably delete it effectively.


Surely the following would work just as well?

  #define brkcase break;case
kinda defeats the purpose of the macro even.


That strikes me as better. The original macro presumably misbehaves if there's more than one statement in a sequence, as the if will only affect the first statement.


I think the behavior is slightly different since this one breaks the above case, and the other one only omits its case from fallthrough

Incidentally, what happens if you use your brkcase as the first case?

I don't find either particularly exciting - a macro that would append break to the current case feels better


Both version of the macro makes this fall through from 0:

  switch (a) {
    brkcase 0: foo();
    case 1: bar();
  }
so in a sense the `if (0) case` trick also affects the previous case, not the current one. But that one also falls apart when there are multiple statements under the brkcase.


I think it is super unclear how this works, and I would prefer the same control flow using goto, rather than the duffs device style switch abuses.


It only works if the case label body is a single line or is enclosed in brackets.

I'll confess, I've used this construct to mean "omit the first line of the next case label but otherwise fall through".

If you think of the case label as merely a label and not a delimiter between statements all of this makes sense.


This can be used to implement coroutines in C. https://stackoverflow.com/questions/24202890/switch-based-co...


uIP (TCP/IP stack for tiny microcontrollers) is a another fun real-world example for these types of coroutines: https://github.com/adamdunkels/uip/blob/master/uip/lc-switch...


Why did I not know that this:

    case 1 ... 10:
Is valid C? I have been programming in C for years, what standard is this from?


Unless it has been recently standardized it's not valid C, it's a GNU extension.


It appears to be a GNU C extension: https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html but I couldn't find the history of the extension. I believe it is not in standard C (not sure about clang).


I just tried it, and it works with clang version 17.0.6.


Clang supports almost all GNU C extensions. Maybe not nested functions because they need executable stacks.


This reminds me of some silly C code I once wrote for fun, which counts down from 10 to 1:

    #include <stdio.h> // compile & run: gcc -Wall countdown.c -o countdown && ./countdown
    int n = 10; int main(int argc, char *argv[]) { printf("%d\n", n) && --n && main(n, NULL); }
Python version:

    import sys # run: python3 countdown.py 10
    def main(n:int): sys.stdout.write(f"{n}\n") and n-1 and main(n-1)
    main(int(sys.argv[1]))
Shell version:

    # run ./countdown.sh 10
    echo $1 && (($1-1)) && $0 $(($1-1))


Nitpick: you could replace sys.stdout.write(f"{n}\n") with print(n). The current code looks very much like it was written for Python 2 (apart from the f string!), where print was a statement. As of Python 3, print is just a regular function. It returns None, which is falsey, so you'd also need to change your first "and" to an "or".


Thanks for this suggestion - it works great.

    import sys # run: python3 countdown.py 10
    def main(n:int): print(n) or n-1 and main(n-1)
    main(int(sys.argv[1]))
This also works and is definitely more Pythonic:

    _ = [print(n) for n in range(10,0,-1)]


I don't think I've ever thought of explicitly calling main(). Made me chuckle.


I think it is UB

Edit: actually looks like it is UB in C++ but not C


Why would calling main be UB!? How is crt0 supposed to work?


crt0 generally isn't C and isn't subject to C's rules


malloc() can't be implemented in C either because it's defined as doing things (creating new memory objects) there are no lower level mechanisms in C to do.


all malloc is defined to do is to return a pointer to storage of appropriate size and alignment, which can easily be done in pure standard C by defining a static array and chopping it up as needed. that's not a brilliant way of doing that, but achievable without leaving standard C


AFAIK that can break because of the strict aliasing rules (although it might work in practice). Even if char can alias anything, the reverse is not true and you can't legally store other types in a static array of char type. You should be able to use anonymous memory though, so for example if you get your storeag via mmap or some other allocator it should be fine.


There isn't anything called mmap in the C standard. That's what I mean by it not being possible to implement in standard C. It is possible in some implementations of C.


> The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate.

> The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object.

A static array is "an object" already. A pointer to the middle of it is not a new object.


As the other person pointed out, anything that happens before main is strictly not covered by the C standard.

Even things like “printf” can’t be implemented purely in standard C. Even making a syscall is outside of the scope of the C standard.


Another source of surprise:

  4[arr] // same as arr[4]


Thanks to array decay to pointer, we basically have `*(array_label+offset)` which in this case of yours we have `*(offset+array_label)`; in other words, `*(arr+4)` is the same as `*(4+arr)`...that's it, really!


By the same principle, these are exactly the same:

  arr[i][j]
  j[i[arr]]
These are the simplifications you'd do. You only need to know that a[x][y] is equivalent to (a[x])[y], and that a[x] is the same as x[a].

  arr[i][j]
  (arr[i])[j]
  (i[arr])[j]
  j[i[arr]]


The final obfuscated code snippet in the article brought to light another GCC extension:

https://stackoverflow.com/questions/34559705/ternary-conditi...


Found these silly tricks by the author of this blog on twitter first. Switch statement can do loops too https://twitter.com/lcamtuf/status/1807129116980007037


Also on the actually social network https://infosec.exchange/@lcamtuf/112701486085621844


aren't the switch shenanigans important to the duff's device?


Duff is relying on the fact you're allowed to intermingle the switch block and the loop in K&R C's syntax, the (common at the time but now generally frowned on or even prohibited in new languages) choice to drop-through cases if you don't explicitly break, and the related fact that C lets your loop jump back inside the switch.

Duff is trying to optimise MMIO, you wouldn't do anything close to this today even in C, not least because your MMIO is no longer similarly fast to your CPU instruction pace and for non-trivial amounts of data you have DMA (which Duff's hardware did not). In a modern language you also wouldn't treat "MMIO" as just pointer indirection, to make this stay working in C they have kept adding hacks to the type system rather than say OK, apparently this is an intrinsic, we should bake it into the freestanding mode of the stdlib.

Edited to add:

For my money the successor to Tom Duff's "Device" is WUFFS' "iterate loops" mechanism where you may specify how to partially unroll N steps of the loop, promising that this has equivalent results to running the main loop body N times but potentially faster. This makes it really easy for vectorisation to see what you're trying to do, while still handling those annoying corner cases where M % N != 0 correctly because that's the job of the tool, not the human.


> Duff is relying on the fact you're allowed to intermingle the switch block and the loop

That's just a special case of being able to intermingle switch with arbitrary syntax, which is what TFA does, before it jumps to computed gotos.


The overarching point appears to be getting rid of angle brackets, which is not something that Duff is doing. Further, Duff's device keeps case labels on the left of its control structure; moving ifs to the left is the other "innovation" here.

I think you really have to squint your eyes to see the similarities, beyond the general theme of exploiting the counterintuitive properties of switch statements.


To me the duff's device is just a mechanism to unroll a loop without having to duplicate the code for the trailing case.

While you can't use SIMD you can still benefit from instruction-level parallelism.

It's potentially better in some scenarios where you want to minimize instruction cache usage and there are few iterations of the loop.


Not sure what you mean by "hacks to the type system". All modern computing essentially converged to unified memory, which is exactly C's model.


While it's convenient technically to have unified memory and so it makes a lot of sense for your machine code, in fact the MMIO isn't just memory, and so to make this work anyway in the C abstract machine they invented the "volatile" qualifier. (I assume you weren't involved back then?)

This should be a suite of intrinsics. It's the same mistake as "register" storage, a layer violation, the actual mechanics bleeding through into the abstract machine and making an unholy mess.

If you had intrinsics it's obvious where the platform specific behaviour lives. Can we "just" do unaligned 32-bit stores to MMIO? Can we "just" write one bit of a hardware register? It depends on your platform and so as an intrinsic it's obvious how to reflect this, whereas for a type qualifier we have no idea what the compiler did and the ISO document of course has to be vague to be inclusive of everybody.


I wasn't involved back then, but I know the history. I thought you were talking about something more recent.

But this is all opinions and terms such as "unholy mess" etc do not impress me. In my opinion "volatile" is just fine as is "register. Neither are layer violations nor a type system problem. That the exact semantics of a volatile access are implementation defined seem natural. How is this better with an intrinsic? What I would call a mess are the atomics intrinsics, which - despite being intrinsics - are entirely unsafe and dangerous and indeed mess (just saw a couple of new bugs in our bug tracker).


Sure, it's just an opinion. I think the consequences speak very well for themselves.


What consequences?


Because MMIO is made to look like it's really just memory (rather than a technical convenience) C programmers use the MMIO the same way they would the heap memory in the abstract machine. Sometimes the compiler will correctly intuit what needs to actually be emitted, sometimes the hardware they're actually talking to will compensate for what actually happens - other times it just "misbehaves" because this is not memory and so it doesn't behave like memory.


Due to the way lifetimes work in C (they begin with the block, not the declaration), the following is legal:

  #include <stdio.h>
  #include <stddef.h>

  int main()
  {
      {
          int *p = NULL;
          if (p)
          {
          what:
              printf("a = %d\n", *p);
              return 0;
          }
          int a = 123;
          p = &a;
          goto what;
      }
  }


> switch (i) case 1: puts("i = 1");

I've seen this in the wild, particularly with macros.

    #define assert(c) if (!c) ...
    
    if (foo) assert(...);
    else bar(); // oops!


Fun at parties alert:

Let's stop getting silly with C, too many CVEs!

---

Serious comment:

It's a rather cool article actually. Not something I'd do daily but it's kind of sort of useful to know these techniques.


I am so lost at the final block of code. Does every C developer have to deal with this everyday?


Certainly not. That's the purpose of the article where they say in the final sentence that it's entirely possible to write readable, yet totally befuddling code in C that stands a chance in the IOCCC.


Not even close. If any C developer ever has to deal with that ever, something somewhere went horribly wrong.


see also https://www.chiark.greenend.org.uk/~sgtatham/mp/

Metaprogramming custom control structures in C by Simon Tatham


Discussed in July 2021 (43 comments):

https://news.ycombinator.com/item?id=27781784


If only there was a way of using setjmp/longjmp-style contexts instead of goto, un/winding the stack as required. So we could travel around in time... unfortunately you can't work with a setjmp buffer before it's actually created, unlike gotos.


sigaltstack tricks to the rescue! (Although POSIX only, not ISO C)


My undergrad was entirely in the C language and I’m very glad for it. Sometimes more modern languages can throw me for a loop, no pun intended, but the beauty (and horror) of C is that you are pretty close to the metal, it’s not very abstracted at all, and it allows you a lot of freedom (which is why it’s so foot gunny).

I will never love anything as much as I love C, but C development jobs lie in really weird fields I’m not interested in, and I’m fairly certain I am not talented enough. I have seen C wizardry up close that I know I simply cannot do. However, one of the more useful exercises I ever did was implement basic things like a file system, command line utilities like ls/mkdir etc. Sometimes they are surprisingly complex, sometimes no.

After you program in C for a while certain conventions meant to be extra careful kind of bubble up in languages in a way that seems weird to other people. for example I knew a guy that’d auto reject C PR’s if they didn’t use the syntax if (1==x) rather than if (x==1). The former will not compile if you accidentally use variable assignment instead of equality operator (which everyone has done at some point).

This tendency bites me a lot in some programming cultures, people (ime) tend to find this style of programming as overly defensive.


> if they didn’t use the syntax if (1==x) rather than if (x==1). The former will not compile if you accidentally use variable assignment instead of equality operator

No need for Yoda notation. clang will warn of this by default and gcc will do so if you compile with -Wall, which should also be your default.


> for example I knew a guy that’d auto reject C PR’s if they didn’t use the syntax if (1==x) rather than if (x==1). The former will not compile if you accidentally use variable assignment instead of equality operator

I've seen that one and personally dislike that mindset: Making the code less readable to compensate for a disinterest in using actual static analysis tooling.


These days GCC and Clang will both give you warnings for this if you have -Wall, which everyone should.


Less readable? I agree these are less readable:

  if (987654321987654321==x)

  if (find_optimal(frobnicator(x)*8)==1)
while these are subjectively more readable:

if (x==987654321987654321)

  if (1==find_optimal(frobnicator(x)*8))


I force my students to do C development. And it turns out that it is not that hard if you approach it with modern tools which catch a lot of problems. The lack of abstraction is fixed with good libraries.

C evolved a lot and many foot guns are not a problem anymore. For example for

if (x = 1)

you nowaday get a warning. https://godbolt.org/z/79acPPro6

Implicit int, calling functions without prototypes, etc. are hard errors. And so on.


The warning says to add parentheses, which sure enough silences the warning, your foot, however, still has a bullet hole in it.


> The warning says to add parentheses, which sure enough silences the warning, your foot, however, still has a bullet hole in it.

The warning also says that it's an assignment. It's a pretty clear warning meant to force the programmer to do extra work to get the error.


The warning is very clear. If you did intend to use the result of an assignment as truth value, you would notice. In any case, did not have a single problem with this type of error in the last decades, working with programmers of various skill levels including beginners.


The libglib-dev with gcc is very handy for toy projects, but only _after_ students try to write their own versions:

https://docs.gtk.org/glib/data-structures.html

It could be fun to do a lab summary after the lists and hashes introduction.

Have a wonderful day, =)


I absolutely I agree that learning to create you own abstractions is an incredible useful skill. It depends though. For a programming course this makes absolutely sense. But for applied problems in, say, biomedical engineering, this does not work. Many students know only a bit of Python, and then it is too much and "too inconvenient" to start from scratch in C. With Python they have a lot of things more easily available, so they make quick progress. This does not lead to good results though! For most of the Python projects, we end of throwing away the code later. Another problem is that students often do not know what they are doing, e.g. the use some statistical package or visualization package and get nicely looking results, but they do not know what it means and often it is entirely wrong. For machine learning projects it is even worse. So much nonsense and errors from copying other people Python code....


Python like Basic abstracted far to many details away from students, and trying to convince people they need to know how a CPU works later is nearly impossible.

In general, digging deep enough down a stack, and it drops back into the gsl:

https://www.gnu.org/software/gsl/

Indeed, first month attrition rates for interns at some companies is over 46%. =3


> I have seen C wizardry up close that I know I simply cannot do.

I have written C at least a few times per year for over 30 years. About ten years of that was OS development on Solaris and its derivatives.

Articles like this show crazy things you can do in C. I’ve never found the need to do things like this and have never seen them in the wild.

The places that wizardry is required are places like integer and buffer overflow, locking, overall structure of large codebases, build infrastructure, algorithms, etc. Many of these are concerns in most languages.

> auto reject C PR’s if they didn’t use the syntax if (1==x) rather than if (x==1)

When I was a student in the 90s advice like this would have been helpful. Compiler warnings and static analyzers are so much better now that tricks like this are not needed.


> I knew a guy that’d auto reject C PR’s if they didn’t use the syntax if (1==x) rather than if (x==1). The former will not compile if you accidentally use variable assignment instead of equality operator (which everyone has done at some point).

That's not so much of a footgun anymore - the common C compilers will warn you about that so there's not much point in defending against it.

Same with literal format string parameters to printf functions: the compiler is very good at warning about mismatched types.


In an embedded environment, overly defensive is an asset


That’s precisely where my little professional C experience was. I then switched to a python shop and was initially horrified at some conventions, took some getting used to.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: