Hacker News

No, I'm aware of examples like these; I just asked for clarification on what they mean by "define it to not travel backwards in time". To me this sounds nonsensical.

I'm not deep into compiler construction, but to me these examples seem like a logical consequence of what UB is -- a (runtime) situation that the compiler is not required to take into consideration. It can opt not to emit code handling these situations at all, effectively assuming they don't happen. The point is to allow the compiler to blindly dereference a pointer even when it can't prove that the pointer is valid, or to let it implement arithmetic in a register of a bigger size (assuming the computation doesn't overflow), and so on.

Now, depending on how optimizers are written, the compiler may end up inadvertently detecting UB and optimizing out entire branches of code, just by virtue of how the optimizer works internally. You can bet that the compiler doesn't think much about what is earlier or later in time when doing optimizing transformations.

Of course a "miscompilation" (of code that is buggy in the first place) is an unfortunate situation and a diagnostic would be better. Compilers should improve (and they probably do). Compilers should be friendly and give unsurprising results and good diagnostics as much as possible.

But to "define it to not travel backwards in time" right in the spec would probably be very hard and might negate the point of UB in the first place. It would require doing the work of compiler authors, who are the people responsible for figuring out how to make _their_ compiler solid and ergonomic while also offering the optimizations people want. This is already a hard task for the authors of a specific compiler, and probably not something that you can easily define in a language spec!

And for balance: I've never consciously had to deal with a miscompilation like this, and I write C and C++ in a professional capacity almost every day. Instead, most bugs I deal with are of the most trivial kind: you hit a segmentation fault, quickly navigate to the piece of code where some initialization is still missing, and fill it in. Or there is a logic bug that is entirely unrelated to UB; those are in fact typically more difficult to find and fix.

Note that I'm by no means an exceptional programmer (not that I think you think that of me); I simply want to solve a problem. And while developing I introduce bugs and even UB sometimes (even though it seems to be quite rare, if I can trust sanitizers). I am, however, sophisticated enough to develop in debug mode, with most optimizations turned off, and this might be one explanation for why I've never hit an annoying situation like this.

To me, these stories are fascinating, and I think they should be taken seriously. But their effect on online forums is mostly to heat up discussions.




This comes up now because SG21 (the Contracts Study Group) has a proposal for C++ 26. Proponents of this work would like to portray it as a crucial safety improvement: you can now write a pre-condition contract and, hypothetically, this could be enforced to deliver a meaningful safety improvement over just documenting the same requirement on a web page nobody reads.

But of course the proposed C++ 26 Contracts rely on C++ expressions, and in C++ the expressions are themselves full of potential UB footguns, including signed overflow and illegal pointer dereference. Thanks to time travel, this means adding the "safety" pre-condition may actually make your software much more dangerous, not safer.

One proposed way to defuse this somewhat is to prohibit that time travel. Your contract expressions might still be UB, but the idea is to promise by fiat that, if so, this doesn't actually time travel and destroy previously correct parts of the software.

I genuinely don't know what will happen there and can offer no predictions. In terms of what would be amusing as a spectator I hope either SG23 (Safety) explicitly says this is a terrible idea but WG21 ships it anyway or, equally funny, SG23 endorses the current unsafe nonsense as safe and then a subsequent committee has to establish a "Safety but really this time" Study Group to replace SG23 in a few years when it's thoroughly discredited.

> most bugs I deal with are of the most trivial kind, you hit a segmentation fault, quickly navigate to the piece of code where there is still some initialization stuff missing, and fill it in

Sure. C++ is such a bad language that most of your bug fixing is stuff which wouldn't even happen in a better language. Rust's std::mem::uninitialized<T>() is ludicrously dangerous, so it's deprecated (as well as unsafe) and yet C++ not only does this, it's silently the default for the built-in types. Hilarious. My sense is that a correct fix for this won't land for C++ 26 although maybe Barry can get the stars to align and prove me wrong.


See, I don't mind UB on signed integer overflow for example. You make it sound like a terrible terrible thing. I know it's not defined (and there is a rationale for keeping it undefined even assuming 2's complement). So I don't rely on it.

Quite honestly, I don't recall signed overflow ever happening to me. It has probably happened at some point, but I really don't recall. I'm not trying to make it happen because I don't have a use for it. It's not useful to have a number wrap around from e.g. 2^31-1 to -2^31. It is useful, however, to wrap from UINT_MAX to 0 (modular arithmetic), and this is in fact defined.

Of course, if you write "if (x < x + 20)" and crank the optimizer up to -O7, then the compiler will run the body unconditionally, even though with wrapping semantics the test would fail whenever x is within 20 of INT_MAX. Woah, I'm crushed. That condition is exactly what I needed to write.

> Sure. C++ is such a bad language that most of your bug fixing is stuff which wouldn't even happen in a better language. Rust's std::mem::uninitialized<T>() is ludicrously dangerous, so it's deprecated (as well as unsafe) and yet C++ not only does this, it's silently the default for the built-in types. Hilarious. My sense is that a correct fix for this won't land for C++ 26 although maybe Barry can get the stars to align and prove me wrong.

I mean I could just write "#error Unimplemented" to get a compile time error but I'm not bothering. It seems what you describe as a terrible memory safety bug is simply my way of browsing to the next piece of code that I need to work on. Go figure...

Are you still developing C/C++ code? I get the impression you've given up on it and have jumped on the Rust train a hundred percent. At least there is a huge disconnect between the pictures you paint and my own development experience from daily practice.

But to make it clear again, I'm obviously not opposed to having the compiler issue an error whenever it's able to detect UB statically. In fact, this is how it should be.


> Of course, if you write "if (x < x + 20)" and turn the optimizer to -O7, then the compiler will run the body unconditionally

You seem very confident how the compiler will react to UB, I wouldn't be. You also seem unduly confident that you can spot such a footgun and wouldn't pull the trigger.

> It seems what you describe as a terrible memory safety bug is simply my way of browsing to the next piece of code that I need to work on. Go figure...

It's Undefined Behaviour, and you're just quietly confident that it'll be fine. Which it will until it isn't one day (and maybe that day was yesterday).

> I mean I could just write "#error Unimplemented" to get a compile time error but I'm not bothering.

A compile time error seems like a weird choice. Why write such an error only to immediately have to fix it? In Rust I'd write todo!() when I need to come back and actually provide a value or write some more code here later, that way it only blows up if this code actually executes.

> Are you still developing C/C++ code?

Not in anger for several years. I write Godbolt-sized samples to make a point sometimes.

> But to make it clear again, I'm obviously not opposed to having the compiler issue an error whenever it's able to detect UB statically. In fact, this is how it should be.

All the popular C and C++ compilers provide a great many flags you can set to get more of these diagnostics you're "obviously not opposed to". How many are you using today? How many did you try and then turn back off because of all the "false positive" diagnostics about things you knew were a bad idea but have preferred not to think about because hey, it seems like it works, right?


> A compile time error seems like a weird choice. Why write such an error only to immediately have to fix it? In Rust I'd write todo!() when I need to come back and actually provide a value or write some more code here later, that way it only blows up if this code actually executes.

Well, that's exactly what I get by doing nothing and noticing the segfault when running my debug build. Sure, I get it, it's UB and there could be "time travel" and whatnot. But in practice I seem to get my segfault, so that's just how I end up developing. If that didn't work, I could write my own todo() macro; nothing magical about it, right?

> All the popular C and C++ compilers provide a great many flags you can set to get more of these diagnostics you're "obviously not opposed to". How many are you using today? How many did you try and then turn back off because of all the "false positive" diagnostics about things you knew were a bad idea but have preferred not to think about because hey, it seems like it works, right?

I compile with -Wall on Linux and /W4 on MSVC. If I'm not seeing bugs in the integration tests, there is for most domains very little economic incentive to set up various static analyzers etc., so I rarely do that. I run -fsanitize on some of my stuff from time to time just for kicks, but haven't gotten enough value out of it, which is why it's not a habit for me.

But since you mentioned it, I went ahead and ran -fsanitize=undefined -fsanitize=address on a test program of the multi-threaded queue I'm working on, which is somewhat performance-oriented: on my older desktop computer it persists > 2M individual messages/sec (600MB/s) to a single disk, with to-memory message submission latencies of < 300nsecs at the 99th percentile, < 2usecs at the 99.9th percentile, and < 30usecs at the 99.99th percentile. The test program runs for ~6 seconds, submitting 16M messages (4GB of data), with 4 concurrent readers receiving the messages as soon as they come in; 178 fsync() calls were made by the enqueuer threads or the dedicated flusher threads. There are various internal buffers (a couple MB), multiple internal message stores (one optimized for fast submission, one for dense storage), a couple of low-contention mutexes, and also some wait-free stuff.

-fsanitize didn't find a single instance of UB (I double-checked that the detection does work in principle by introducing a signed-overflow bug, a null-pointer dereference, and an OOB memory dereference). It found 3 leaks of 1 byte each, which seem to be false positives: all relate to structures (each larger than 1 byte) that I allocated and freed correctly. That's all it reported.

I then went on to test using valgrind, which notably reported 0 leaked bytes, and otherwise only reported tons of spam exclusively related to printf-family calls. IIRC these are common false positives due to library mismatches or something like that. You can get rid of them, but I won't bother now.

This is the first time I've tried static and runtime analyzers on this project, other than -Wall. In other words, it seems that just by fixing bugs and adding code until it worked, I produced ~5K lines of C code that performs quite well, with zero bugs or instances of UB uncovered in the good hour of work I put in.


The sort of thing your parent is talking about is being presented to the committee as an example of things that need to be considered, so it appears to be an issue serious enough to at least merit discussion.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p32...


There appears to exist a guy with a formal background who is interested in submitting a paper about formal verification and static analysis and such. Impressive work, but I really don't know what to take home from the existence of this, or what argument it supports.

In my sibling post I merely want to illustrate that all these concerns have little bearing on my day-to-day work (which mostly doesn't need to be certified, and is not related to the defense industry or similar). Some of these concerns I perceive as FUD; as I said, I know that you can provoke these situations, but I've never personally encountered nasal demons in practice, and I feel quite productive and am not spending a lot of time on bugs, so why bother?


Gabriel Dos Reis is specifically an old colleague of Bjarne Stroustrup (they worked at the same University, Texas A&M) who is now working for Microsoft on C++ tooling and so on.

So, one way to understand these papers is that Microsoft (at least some parts of it) thinks unsafe Contracts are worse than no Contracts. Now, would that mean they just won't implement an unsafe Contracts feature shipped in a C++ 26 document? Maybe. Would these fixes get it over the line? Maybe.


I'm taking a closer look, but from the looks of it I'm not a fan of adding yet another sub-language with differing syntax and semantics. That leads to complexity; it's a path to madness.

Without being involved, I have no intention of using any of these Contracts in whatever form. I will say, though, that I wouldn't care if there were UB in the contract language (just as there is in the normal language). I would prefer the variant with UB if it is simpler and more aligned with the language core. Removing the UB here is an academic exercise. Safety absolutists are uncompromising about the goal of correctness and provability; they are blind to the pragmatic issues created by that idealism. Contracts in either form could probably improve correctness by a lot, like 99% or whatever. So why should I care about the paper which could in theory bring the remaining 1%? It doesn't affect me pragmatically.

The flaw with either approach is that this holds only in theory. In practice, I will never create enough formal contracts to significantly improve correctness, whatever the system. Why? The costs are just too damn high; the only way to achieve 100% correctness, when considering pragmatic concerns as well, is to just not write any code.

My approach of just coming up with a simple design (not in code), trying to implement it in the most straightforward way, and fixing the code until it works, as described in my other comment, seems to have achieved something very close to correctness (maybe even 100%? Probably not).

Again, I'm not saying that UB is good or should be tolerated. I don't want it in my programs and if I find an instance of UB I'll try hard to get rid of it. However there is a reason why UB exists in C/C++ (as well as many other languages that may not have as much of it, but still have a lot of it even when not defined explicitly). And alternative approaches, trying to prevent UB mechanically, come with a cost that may not be worth it depending on what you're working on. I feel strongly like it isn't worth it for me. If you're building a fully verified or certified product, tradeoffs are likely different.

If we're citing big names, here is a well-known person describing their view, which I find myself agreeing with a lot.

https://www.youtube.com/watch?v=EJRdXxS_jqo


Gabi is the Visual C++ compiler maintainer, not just tooling. The only sane person on the C++ ISO committee (_besides the sdcc maintainer, who has no power at all_).



