Hacker News | _kst_'s comments

If the behavior is undefined, there is no wrong result.

Except that ++a means increment first and a++ means increment after. And that's well-documented and thoroughly understood. Implying that this is undefined behavior is a cop out and cope. Preposterous and juvenile. If not implemented, it should be. Case closed.

Are you volunteering to update all the code that would be broken?

Your code:

    int a = 5;
    int b = a++;
has well-defined behavior. The first line initializes a to 5. The second initializes b to 5 and sets a to 6. (The language doesn't specify the order of the two operations of assigning a value to b and incrementing a, but in this case it doesn't matter.)

Giving 13 for a++ + ++a is not a bug in the compiler. It's a bug in the code.

The correct answer to "what does a++ + ++a do" is "it gets rejected in code review and replaced with code that expresses the actual intent."


"Test Your C Skills" is a published book by Yashavant Kanetkar, apparently published in 2005, and still available in paperback. The document you linked to appears to be a scan of a printed copy of that book, and is almost certainly in violation of copyright. The cover and the title and copyright pages are notably missing.

That's worth being aware of, though somewhere around 20 years is where I start caring a lot less about copyright from a moral or practical point of view. Yeah, that book stays out of the public domain until the next century, but it shouldn't.

> apparently published in 2005

No. It was published in the late 90s. This copy on Archive.org is dated 1997:

https://archive.org/details/testyourcskills00yash


What can be optimized out depends on the context.

If you write:

    int i = 0;
    i = i++;
and never use the value of i, the declaration and assignment are likely to be optimized out. (The behavior of the assignment is undefined, so this is a valid choice.)

If you print the value of i, the compiler can still optimize away the computation, but is perhaps less likely to do so.

The solution, of course, is not to write code like that. Decide what you want to do, and write code that does that. "i = i++" will never be the answer to "how do I do this?", and wouldn't be even if the behavior were well defined. If you want i to be 1, write "int i = 1;".


Agreed.

As a programmer, the solution to "int a = 5; a = a++ + ++a;" is to decide what result you wanted, and write code that will produce that result, and probably to pass options to the compiler that tell it to detect this kind of problem and print a warning. (On my system, the result happens to be 12; if that's what I want, I'll write "int a = 12;".)
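A minimal sketch of what "write code that will produce that result" could look like. The function name and the left-to-right reading it encodes are assumptions chosen for illustration; the original expression has no defined meaning to preserve. For the warning part, GCC's -Wsequence-point (enabled by -Wall) and Clang's -Wunsequenced are options that flag expressions like the original.

```c
/* Hypothetical intent: add the old value of a to the value a has
   after two separate increments. Every step is its own statement,
   so each modification of a is sequenced. */
int sum_old_and_twice_incremented(int a)
{
    int old = a;    /* the value "a++" was presumably meant to yield */
    a += 1;         /* the increment from a++ */
    a += 1;         /* the increment from ++a */
    return old + a; /* 5 + 7 when a starts at 5 */
}
```

Calling sum_old_and_twice_incremented(5) gives 12 on any conforming implementation, not just on one particular system.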

But if you have an existing program that includes that code, it can be useful to look into the actual behavior (for all the compilers that might be used to compile the code, with all possible options, on all possible target systems). Fixing the code should be part of that process, but you might still have running systems with the old bad code, and you need to understand the risks.

But producing some numeric result is not the only possible behavior, even in real life. Compilers can assume that the code being compiled does not have undefined behavior, and generate code based on that assumption. The results can be surprising.
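One commonly cited illustration of that assumption, sketched with hypothetical function names: an overflow check that itself depends on signed overflow. A compiler is allowed to assume x + 1 never overflows, and may reduce the first function to "return false".

```c
#include <limits.h>
#include <stdbool.h>

/* Depends on undefined behavior: when x == INT_MAX, x + 1 overflows.
   An optimizer may assume that never happens and fold this to false. */
bool next_overflows_ub(int x)
{
    return x + 1 < x;
}

/* Well defined: compares against the limit without doing the addition. */
bool next_overflows(int x)
{
    return x == INT_MAX;
}
```

The second version asks the question directly and gives the same answer at every optimization level.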

As for formatting your disk, that's not just a theoretical risk. If a program has enough privileges that it can format your disk deliberately, it's possible that it could do so accidentally due to undefined behavior (for example, if a function pointer is corrupted).


I don't see the name "thaumasiotes" at that link, nor do I see anything relevant to the code in the title.

The behavior of "int a = 5; a = a++ + ++a;" is undefined. There is no guarantee of a numeric result, because there is no guarantee of anything.


I believe they were referring to thaumasiotes's thread here: https://news.ycombinator.com/item?id=48141294

I think the objection thaumasiotes has raised there is valid and I have made an attempt to answer it as well in the same thread.


It's only the order of evaluation that is undefined.

No, the behavior is undefined. That means, quoting the ISO C standard, "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements".

A conforming implementation could reject it at compile time, or generate code that traps, or generate code that sets a to 137, or, in principle, generate code that reformats your hard drive. Some of these behaviors are unlikely, but none are forbidden by the language standard.


I was wrong.

I was looking at this:

https://en.cppreference.com/cpp/language/eval_order

I'm not sure where precisely this sequencing exception to the default "eval order undefined" rule is given, but after the 24(!) sequencing rules they do give this "++i + i++" as an explicit example of undefined behavior.

Interestingly that page says that since C++17 f(++i, ++i) is "unspecified" rather than "undefined", whatever that means, and presumably plus(++i, i++) would be too, which seems a bit inconsistent.


Nope, there is no sequence point in the middle and modifying an object more than once between sequence points is undefined behavior.

This reminds me of a passage from the book "Pro Git".

<https://git-scm.com/book/en/v2>

"Here’s an example to give you an idea of what it would take to get a SHA-1 collision. If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (6.5 million Git objects) and pushing it into one enormous Git repository, it would take roughly 2 years until that repository contained enough objects to have a 50% probability of a single SHA-1 object collision. Thus, an organic SHA-1 collision is less likely than every member of your programming team being attacked and killed by wolves in unrelated incidents on the same night."

Deliberate collisions are addressed in the following paragraph.

SHA-1 hashes are not random, so the issue of poor pseudo-random number generation doesn't apply as it does to uuidv4. And SHA-1 hashes are 160 bits, vs. 128 for uuidv4.

But I love the idea of unrelated wolf attacks.
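For a rough sense of the numbers, here is a back-of-the-envelope sketch using the birthday approximation. It treats hash outputs as uniformly distributed, which is an idealization, and the function names are made up for this example. The rate is the hypothetical one from the quoted passage.

```c
/* Birthday approximation: over N equally likely values, about
   sqrt(2 ln 2) * sqrt(N) ~= 1.1774 * sqrt(N) random draws give a
   50% chance of at least one collision. */

/* Returns 2^e as a double (exact for the exponents used here). */
static double pow2(int e)
{
    double r = 1.0;
    while (e-- > 0)
        r *= 2.0;
    return r;
}

/* Draws for ~50% collision odds over 2^bits values.
   bits must be even, so that sqrt(2^bits) is 2^(bits/2). */
double draws_for_half(int bits)
{
    return 1.1774 * pow2(bits / 2);
}

/* Years to produce that many objects at the quoted rate:
   6.5 billion people, 6.5 million objects per person per second. */
double years_at_book_rate(double draws)
{
    double per_second = 6.5e9 * 6.5e6;
    double seconds_per_year = 365.25 * 24.0 * 60.0 * 60.0;
    return draws / per_second / seconds_per_year;
}
```

Under these assumptions draws_for_half(160) is about 1.4e24, and years_at_book_rate of that comes out near one year, the same order of magnitude as the book's "roughly 2 years". For comparison, draws_for_half(122) (uuidv4 has 122 random bits) is about 2.7e18.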


Reminds me of this page with an example for understanding how many permutations there are for a shuffled deck of cards: https://czep.net/weblog/52cards.html

> So, just how large is it? Let's try to wrap our puny human brains around the magnitude of this number with a fun little theoretical exercise. Start a timer that will count down the number of seconds from 52! to 0. We're going to see how much fun we can have before the timer counts down all the way. Shall we play a game?

> Start by picking your favorite spot on the equator. You're going to walk around the world along the equator, but take a very leisurely pace of one step every billion years. The equatorial circumference of the Earth is 40,075,017 meters. Make sure to pack a deck of playing cards, so you can get in a few trillion hands of solitaire between steps. After you complete your round the world trip, remove one drop of water from the Pacific Ocean. Now do the same thing again: walk around the world at one billion years per step, removing one drop of water from the Pacific Ocean each time you circle the globe. The Pacific Ocean contains 707.6 million cubic kilometers of water. Continue until the ocean is empty. When it is, take one sheet of paper and place it flat on the ground. Now, fill the ocean back up and start the entire process all over again, adding a sheet of paper to the stack each time you’ve emptied the ocean. Do this until the stack of paper reaches from the Earth to the Sun. Take a glance at the timer, you will see that the three left-most digits haven’t even changed. You still have 8.063e67 more seconds to go. 1 Astronomical Unit, the distance from the Earth to the Sun, is defined as 149,597,870.691 kilometers. So, take the stack of papers down and do it all over again. One thousand times more. Unfortunately, that still won’t do it. There are still more than 5.385e67 seconds remaining. You’re just about a third of the way done.


Damn, I got the paper stack wet with all that ocean water. Guess I'm starting again from scratch...


On the other hand, it turns out that pre-image attacks are quite feasible, and as several people who have thoughtlessly committed the pre-image attack test case files to git can attest… quite problematic


This idea of everyone producing absurd amounts of git objects is less fantastic now [1]. We're still far from these numbers, but an order of magnitude less far than last year [2].

Also an interesting bit of history here: apparently there was a time when people were already writing books on Git but "one enormous Git repository" wasn't yet the most common mode of using it.

[1] https://news.ycombinator.com/item?id=47932422

[2] https://x.com/kdaigle/status/2040164759836778878


Hasn't the Git team been hard at work to optionally offer other hashes, like SHA256, in addition to SHA-1?


They have not been at work on that.


UUID v7 relies on knowing what time it is.

Speculation: The most likely scenario for a UUID v7 collision is if UUIDs are generated during a system boot sequence, before the system clock is set to the current time. It's always 1970 somewhere. There are still 62 random bits, and optionally another 12 random bits, but those too could be problematic if the system hasn't generated enough entropy yet.
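To make the bit budget concrete, here is a sketch of packing a UUID v7 by hand, assuming the layout as I understand it from RFC 9562: 48 bits of Unix time in milliseconds, a 4-bit version, 12 random bits, a 2-bit variant, and 62 random bits. The function name and the 10-byte randomness buffer are hypothetical; a real generator would get both inputs from the system.

```c
#include <stdint.h>

/* Packs a UUID v7 into out[16]. unix_ms is the clock reading in
   milliseconds; rnd supplies 10 bytes from some random source. */
void uuid7_pack(uint64_t unix_ms, const uint8_t rnd[10], uint8_t out[16])
{
    /* Bytes 0..5: low 48 bits of the timestamp, big-endian. */
    for (int i = 0; i < 6; i++)
        out[i] = (uint8_t)(unix_ms >> (8 * (5 - i)));

    out[6] = (uint8_t)(0x70 | (rnd[0] & 0x0F)); /* version 7 + 4 random bits */
    out[7] = rnd[1];                            /* 8 more random bits (12 total) */
    out[8] = (uint8_t)(0x80 | (rnd[2] & 0x3F)); /* variant 10 + 6 random bits */

    /* Bytes 9..15: the remaining 56 random bits (62 total). */
    for (int i = 9; i < 16; i++)
        out[i] = rnd[i - 6];
}
```

If the clock still reads zero at boot, the first six bytes are all zero, and uniqueness rests entirely on the contents of rnd. If that buffer is also predictable because entropy is low, two machines can produce the same value.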


It's not even possible to pass too few arguments to a function in C unless you go out of your way to write bad code.

You can write a function declaration that's inconsistent with its definition in another translation unit. Declaring the function in a shared header file avoids this.

You can use an old-style declaration that doesn't specify what parameters a function expects. Don't do that. Use prototypes.

You can use a cast to convert a function pointer to an incompatible type, and call through the resulting pointer. Don't do that.

You can call a function with no visible declaration if your compiler is overly permissive or is operating in pre-C99 mode. Don't do that.
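A small sketch of the prototype case, with hypothetical names. In a real project the prototype would live in a shared header included by both the caller and the definition; it is inlined here to keep the example self-contained.

```c
/* Prototype, as it would appear in a shared header. */
int add(int a, int b);

/* Definition. It sees the same prototype, so any mismatch between
   the two is diagnosed at compile time. */
int add(int a, int b)
{
    return a + b;
}

int call_add(void)
{
    /* add(2);  With the prototype visible this is a constraint
       violation, and the compiler must issue a diagnostic. */
    return add(2, 3);
}
```

With the prototype in scope, the too-few-arguments call cannot get through the compiler silently. It takes one of the workarounds listed above to get it through.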


> It's not even possible to pass too few arguments to a function in C unless you go out of your way to write bad code.

This article is exclusively about undefined behaviour. "Bad code" is already baked into the assumptions of the article.


This is a site for intellectual curiosity, not pedantic dismissal.


Wait. Isn't the thing you're doing right here _exactly_ "pedantic dismissal?"

Looking at your comment history that seems to be _your_ mode.


Seriously?

I discussed some of the technical issues behind the article. If you disagree with anything I wrote, please say so.

I'm not even saying that the issues discussed in the article aren't useful, just going into how likely they are to be encountered in practice.


Hacker News does not like actual hackers.


You could also use inline assembly.

