Date: 2015-03-02
This is not guaranteed to work. C guarantees that casting a pointer
to a uintptr_t and back results in a pointer which compares equal
to the original; but it does not guarantee that (uintptr_t)p + 1 ==
(uintptr_t)((char *)p + 1), that p1 < p2 is equivalent to
(uintptr_t)(p1) < (uintptr_t)(p2), or even that (uintptr_t)(p) ==
(uintptr_t)(p); an evil but standard-compliant compiler could
implement casting from pointer to uintptr_t as "stash the pointer
in a table and return the table index" and casting back as "look
up the index in the table".
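To make the guarantee concrete, here is a minimal sketch of the one property the standard does promise where uintptr_t exists: a valid pointer to void converted to uintptr_t and back compares equal to the original. The helper name round_trips is made up for illustration, and nothing in it relies on the numeric value of the integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if p survives a round trip through
 * uintptr_t. For a valid pointer to void this is what the standard
 * guarantees; the integer value itself is implementation-defined. */
bool round_trips(void *p) {
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer */
    void *back = (void *)bits;       /* integer -> pointer */
    return back == p;
}
```

Anything beyond this, such as comparing or doing arithmetic on the integer values, depends on how a particular implementation maps pointers to integers.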
The compiler for such a machine does not even have to be evil.
(That said, as mentioned elsewhere, we're talking about a debug check in a non-security-sensitive code path...)
IIRC it's a bit more complicated than that...? Isn't it more along the lines that no object may occupy the NULL/nullptr 'address' and that 0 (literal) in a pointer context must be interpreted as NULL/nullptr? IIRC it doesn't specifically say anything about the address 0x00000000 (add bits to taste).
Also, I'd be very interested in hearing a suggestion as to how this should be implemented. Of course you might have mentioned that in the comment I can't find, so don't repeat it if a link suffices of course.
I'd look into working explicitly with uint64_t, and stop doing pointer subtractions.
EDIT: Here's the technically correct code from the bottom of the advisory, that I failed to see: https://github.com/sandstorm-io/capnproto/commit/2ca8e41140e....
word* target = segmentStart + farPointer.offset;
if (target < segmentStart || target >= segmentEnd) {
  throwBoundsError();
}
doSomething(*target);

size_t segmentLength = segmentEnd - segmentStart;
if (farPointer.offset >= segmentLength) {
  throwBoundsError();
}
word* target = segmentStart + farPointer.offset;
doSomething(*target);
> Also, I'd be very interested in hearing a suggestion as to how this should be implemented
I don't know enough about the problem space to say for certain, but in general I'd say that "work with unsigned integers and convert them to pointers only after performing all necessary sanitization" is good advice.
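A minimal sketch of that advice, using made-up names (word_t, resolve_offset) rather than the actual Cap'n Proto code: the untrusted offset is compared against the segment length as an unsigned integer, and a pointer is formed only once the offset is known to be in bounds.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;  /* stand-in for the `word` type; an assumption */

/* Returns a pointer to segment[offset], or NULL if the offset is out of
 * bounds. No out-of-range pointer is ever computed, so there is no
 * undefined pointer arithmetic for an optimizer to reason from. */
word_t *resolve_offset(word_t *segment, size_t segment_len, uint64_t offset) {
    if (offset >= segment_len) {
        return NULL;              /* reject before doing pointer arithmetic */
    }
    return segment + offset;      /* in bounds by the check above */
}
```

The comparison is between two unsigned values, so it is well defined for every possible offset, including very large ones.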
I don't even know if these things are the same in C and C++ any more (the code in question is in C++).
/me runs back to C.
This line you've commented on is a debug assert, meant to catch a bug that used to exist in the code but doesn't anymore. This check is not required for security and could just as well have been deleted altogether. It is compiled out of opt builds.
> This is not guaranteed to work. C guarantees that casting a pointer to a uintptr_t and back results in a pointer which compares equal to the original; but it does not guarantee that (uintptr_t)p + 1 == (uintptr_t)((char *)p + 1), that p1 < p2 is equivalent to (uintptr_t)(p1) < (uintptr_t)(p2), or even that (uintptr_t)(p) == (uintptr_t)(p); an evil but standard-compliant compiler could implement casting from pointer to uintptr_t as "stash the pointer in a table and return the table index" and casting back as "look up the index in the table".
I am no expert, but what I understood is that the C standard defines some situations (such as an overflow error) which result in "undefined behaviour". In this situation, the compiler is free to do whatever it wants. This is in fact what happens here.
This is clearly annoying to programmers. In this case, it is hard to avoid undefined behaviour even when its source (a pointer overflow) is known.
Why is this done? This way the compiler can do some aggressive optimizations which are only valid if there is no undefined behaviour (e.g. no overflow, no null pointer access...).
There are various good articles about this.
To make the situation more complex for programmers, it is possible to write programs which exhibit undefined behaviour but work fine in practice. Until the compiler tries to do a specific optimization.
This is basically why you shouldn't use C. Long-time C programmers know a list of operations which are undefined behaviour, and are usually able to point out a few of these in a program written by a newbie (ironically, these programs may run and pass tests just fine).
For those interested, good places to start learning more about this are John Regehr's blog [0], or the llvm blog [1].
[0] http://blog.regehr.org/archives/213
[1] http://blog.llvm.org/2011/05/what-every-c-programmer-should-...
A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
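As a small illustration of that rule (function names are made up for this sketch): because unsigned addition is defined to wrap, it is legitimate to perform the addition first and test for wraparound afterwards.

```c
#include <limits.h>

/* Unsigned addition is defined to wrap modulo UINT_MAX + 1. */
unsigned int wrapping_add(unsigned int a, unsigned int b) {
    return a + b;
}

/* Because wrapping is defined, it can be detected after the fact:
 * the sum wrapped if and only if it is smaller than an operand. */
int unsigned_add_wrapped(unsigned int a, unsigned int b) {
    return a + b < a;
}
```

The same after-the-fact test is not valid for signed integers or pointers, where the overflowing operation itself is undefined.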
Tradition notwithstanding, if a compiler deletes all my files because of the position of a comma in nested brackets, then I'm going to simply not use that compiler. I don't give a fuck if it's technically correct.
"Oh it's not a bug". Good for you. I don't care if you want to call it a feature or a bug. It's wrong. Press the issue and I'll find someone smarter.
However, this is not the only reason I'm pointing out that this is not a compiler bug. The other reason is that there are certain implications:
1. The compiler developers are likely to declare that this is not their problem.
2. The problem may exist in other compilers as well: even a compiler that complies with the standard may do this. Even worse, it may only show up in a new version of the compiler, or in specific situations.
I am not saying that this is a good situation and programmers should just be more careful. Everybody makes mistakes.
The root of the problem is in the specification. Ideally, there would be no undefined behavior. From an optimizer's point of view it is very sensible to assume that the programmer will not invoke undefined behavior and use optimizations based on this principle. People surely love a compiler which produces fast code, so it may not be desirable to totally eradicate undefined behavior (and end these kinds of optimizations).
The most practical remedy I see is to add a debug flag which crashes or somehow indicates undefined behavior. Indeed, GCC has done this. So ultimately, we seem to agree that the compilers should change to improve this situation.
It would seem particularly unfortunate if a compiler were to do static analysis for undefined behavior, but only use it in optimization. From a cursory search, I found this blog post [1] explaining why LLVM does not warn about such things (at least in 2011, when it was written - attitudes may have changed, and the Clang project now has the UndefinedBehaviorSanitizer [2]). One of the arguments is that it is difficult to explain what the problem is, but in my view, that is an argument for making doing so a priority.
[1] http://blog.llvm.org/2011/05/what-every-c-programmer-should-...
[2] http://releases.llvm.org/3.8.0/tools/clang/docs/UndefinedBeh...
I think there is a sort of analogy with a large black hole here [1]: you can slip across the "event horizon" into undefined behavior without noticing, but, especially if you are using an optimizing compiler, there may be no escape.
[1] Maybe not, if the black-hole firewall hypothesis is correct.
Any chance you could briefly compare it to Rust or another systems level language trying to remove undefined behavior?
There's always -O0.
"These people simply don't understand what C programmers want": https://groups.google.com/forum/#!msg/boring-crypto/48qa1kWi...
"please don't do this, you're not producing value": http://blog.metaobject.com/2014/04/cc-osmartass.html
"Everyone is fired": http://web.archive.org/web/20160309163927/http://robertoconc... (EDIT: this one just gets better every time I read it...)
I also found a new one thanks to the comments here, which you can find elsewhere in the comments - but I'll add a link to it here anyway, for good measure:
"No sane compiler writer would ever assume it allowed the compiler to 'do
anything' with your code": http://article.gmane.org/gmane.os.plan9.general/76989
I should note that this plan, throwing away gcc and clang in favor of a
boring C compiler, isn't the only possible response to these types of
security holes. Here are several other responses that I've seen:
* Attack the messenger. "This code that you've written is undefined,
so you're not allowed to comment on compiler behavior!" The most
recent time I saw this, another language lawyer then jumped in to
argue that the code in question _wasn't_ undefined---as if this
side discussion had any relevance to the real issue.
The undefined behaviour of C is deliberate so that compilers can
make optimizations. They assume you write code that only has
defined meaning and generate code for that defined meaning. That
means for instance that if you add 2 signed integers they're going
to assume it doesn't overflow and then for instance make
assumptions based on that on whether other code is ever going to be
executed or not.
I'm definitely favoriting your comment so I can find it easily later.
There were C compilers before there was a C standard. They had bugs.
You're not wrong - but the dividing line between bug and feature request is subjective, and just how much (meaningful) difference between the two there is depends on the authors of the compiler.
If you compile with -fno-strict-aliasing and the optimizer breaks your code based solely on strict aliasing violations anyways, by all means, report that as a bug.
If you use a compiler that has no -fno-strict-aliasing equivalent, by all means, switch to a better compiler when they WONTFIX your feature request.
This is exactly what the C standard says. And yes, that is problematic, especially since there is no way to detect undefined behaviour. I consider this to be the main problem. If C compilers simply checked for undefined behaviour in debug builds, there would be far fewer problems.
The problem is that the standard was written as a minimum that programmers could rely on even the worst compilers to implement, with the intention that compiler writers would come up with improvements that would then be standardised (the same way that happens with e.g. web standards). Instead compilers regressed to doing the minimum permitted by the standard.
It would also be allowed to follow the principle of least surprise. Most instances of UB have a specific expected outcome; for signed overflow, one would probably expect the architecture-specific signed overflow handling to happen.
It's a bit paradoxical that some hard errors for which no such expected outcome exists, such as an address violation, can easily be caught by the programmer (SIGSEGV), but errors (UB) for which a specific, sensible reaction exists are not only silently tolerated, but actively exploited when searching for optimizations.
But as we can see in this vuln, there's functionality that works in the normal case but where important checks are elided, and a compiler might leave a piece of undefined behaviour alone in one version but exploit it in the next. That a piece of code works at -O2 doesn't prove it doesn't contain undefined behaviour.
If you think you do, you can usually rewrite your code to avoid it, or rely on implementation defined behavior for specific compilers.
And if that is not low level enough, you should be using assembly.
From the 2007 draft for the C99 standard[1]:
"undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"
[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
Overflow is an example of something that's not UB in Rust, for example. (It's a "program error" and well-defined as two's complement wrapping.)
Checking for overflow (and pointer-out-of-object-bounds, which is a special case of overflow) in C is like defusing a bomb: you can't cut the red wire and then see if it exploded, you have to check that it won't explode before you cut the wire.
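A minimal sketch of "check before you cut the wire" for signed addition; checked_add is a hypothetical helper written for this illustration. The test is done with subtractions that cannot themselves overflow, so the overflowing addition is never evaluated.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true if the sum fits in an int;
 * returns false, leaving *out untouched, if it would overflow. */
bool checked_add(int a, int b, int *out) {
    if (b > 0 && a > INT_MAX - b) return false;  /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return false;  /* would go below INT_MIN */
    *out = a + b;
    return true;
}
```

By contrast, a test such as a + b < a performs the addition first, and a compiler may assume that it never overflows.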
Except the proper answer to this scenario is to refuse to compile with an appropriate error, not to eliminate your code.
Take a function that returns the square of a signed integer: As long as it's called with arguments that are small enough, the behaviour is perfectly defined. But the compiler cannot necessarily know which values might get passed to it at runtime.
Now, it could in principle add runtime checks that do something defined, like abort the program, when the preconditions for defined behaviour are not met--but that often makes the generated code slow.
That is why the standard says that the compiler can do anything it wants in the case of undefined behaviour: It's just a different way of saying that it's the responsibility of the programmer to make sure that the preconditions for defined behaviour are never violated, so the compiler can generate code based on the assumption that that is the case, instead of littering the code with tons of overhead for the case that the programmer did something wrong.
Compilers don't detect undefined behaviour and then use those detected instances of UB in order to mess up your software, they simply operate on the assumption that there is no UB in your code--which means that the code they generate is correct if there is indeed no UB, but there are no guarantees as to what happens if you don't hold up your implicit promise to not invoke UB.
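The squaring example might look like this as a sketch (names are illustrative). The plain version carries an implicit precondition; the checked version is roughly what a hand-written runtime check costs: extra comparisons and a division on every call.

```c
#include <limits.h>
#include <stdbool.h>

/* Plain version: defined only when x * x fits in an int. */
int square(int x) {
    return x * x;
}

/* Checked version: verifies the precondition at run time.
 * Dividing INT_MAX by |x| avoids evaluating an overflowing product. */
bool checked_square(int x, int *out) {
    if (x == INT_MIN) return false;   /* -x could itself overflow */
    int mag = x < 0 ? -x : x;
    if (mag != 0 && mag > INT_MAX / mag) return false;
    *out = x * x;
    return true;
}
```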
The obvious interpretation of the request to have the compiler error is to have the compiler error in any case it would elide statements. That may be impractical in many or most cases, but it's a different problem than what you are talking about. Presumably there would be some cases where the compiler could warn or abort (if a flag was used) if branches were elided based on undefined behavior and those same branches also included an abort of some type shortly thereafter. Or maybe some better heuristic that I'm not thinking of.
Hard is not the same as impossible, and imperfect is often better than nothing.
No, you have it all backwards, and that's not what's happening. The compiler doesn't elide code because it is undefined. The compiler elides code because it would never be executed unless it happens to be called with arguments that would produce undefined behaviour anyway.
Take this, for example:
void foo(int *x) {
  *x = 5;
  if (x == NULL) abort();
}
And there is nothing necessarily wrong with that code: It could be (and such code is common) that it is actually never called with a NULL pointer, in which case the behaviour is perfectly defined.
If a compiler actually determines that your code will always exhibit undefined behaviour, then chances are it will indeed warn you, but that just isn't what usually happens, and it's not the cause of such bugs. The compiler doesn't say "this is undefined, therefore, let's screw it up", it's exactly the opposite: It says "if we assume that this code is never called with arguments that would cause undefined behaviour, what is the most efficient machine code that we could map it to?"
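For contrast, here is a sketch of the same function with the test placed before the store (foo_checked is a made-up name, and returning a status instead of calling abort() is a choice made for this example). In this order the test can be reached with a NULL argument without any undefined behaviour having happened first, so it cannot be treated as unreachable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Variant of foo() that tests the pointer before dereferencing it. */
bool foo_checked(int *x) {
    if (x == NULL) {
        return false;   /* reached for NULL without any prior dereference */
    }
    *x = 5;
    return true;
}
```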
> The obvious interpretation of the request to have the compiler error is to have the compiler error in any case it would elide statements.
That would be plain idiotic. It is perfectly normal to have tons of dead code, erroring out in that case would just make it impossible to compile anything.
You have to consider that the compiler isn't interested in the elegant structure of your code, so it mashes it all up to figure out the most efficient machine code for the whole thing. So, you might have code that calls lots of functions on an array, say, where each function does a bounds check. Now, the compiler potentially will inline it all into one big spaghetti function. And then it will figure out that many of the bounds checks are actually redundant (again, under the assumption that the code doesn't invoke undefined behaviour), so it will remove any checks and corresponding error handling paths that it can prove to be implied by preceding checks, or it might try and combine multiple checks into one.
That does not mean that there is anything wrong with your code, or that you should manually remove all the redundant checks (which might not be redundant for other call sites, after all). The code is perfectly fine, and the compiler tries its best to remove anything that you don't actually need.
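A small sketch of that kind of redundancy, with made-up function names: the accessor needs its bounds check for arbitrary callers, but at this particular call site the loop condition already implies it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bounds-checked accessor: the check is needed for arbitrary callers. */
bool get_checked(const int *a, size_t n, size_t i, int *out) {
    if (i >= n) return false;
    *out = a[i];
    return true;
}

/* The loop condition already guarantees i < n, so once get_checked is
 * inlined its own test is implied and the failure path is dead code. */
long sum_all(const int *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        int v = 0;
        if (!get_checked(a, n, i, &v)) return -1;  /* never taken here */
        total += v;
    }
    return total;
}
```

Removing the check inside the loop changes nothing observable, and nobody would want a warning for it.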
> Hard is not the same as impossible, and imperfect is often better than nothing.
True, but also beside the point. Compilers do issue warnings for lots of stuff, and more are added regularly, but much of what you are suggesting is actually not in any way even a coherent idea, and as such, is indeed not just hard, but impossible.
You're right, it's the interaction between the undefined behavior and the ability to reason about what's possible that's the problem. It's also not entirely pertinent to the overall point, which is that the compiler does know when it elides code, and optimizing out instructions is specifically what this particular issue is about.
> That would be plain idiotic. It is perfectly normal to have tons of dead code, erroring out in that case would just make it impossible to compile anything.
The compiler doesn't need to error on all instances of this, but a flag to have the compiler warn or error if it would remove a unique branch and that branch may exit prior to returning control, that could trigger the desired behavior. If the compiler has enough knowledge to determine redundant code, then it has enough knowledge to know whether it is removing code that is not due to duplication.
C compilers have historically chosen to err on the side of speed, not on the side of safety, and we've built ourselves into a corner. Any new language introduced today that said "well, there's some interesting interactions sometimes if you don't pay close attention, and the compiler/VM might remove statements you write because they are testing those same weird interactions[1] as it thinks they can't happen" would be laughed out of town.
That the optimizations that C compilers do are complex and have many stages of optimization is not a suitable counter for the criticism that those same optimizations sometimes cause non-obvious interactions with safety checks meant to test the same edge cases that those optimizations take advantage of. That's like someone saying for safety reasons you need to see at least 20 feet of road in front of you at all times per 10 MPH on the highway, and people complaining about how that's not feasible because it would force you to slow down 10-20 MPH occasionally as you went around some turns. Yes. Yes it would. Just because you can do an optimization, doesn't mean you should.
> much of what you are suggesting is actually not in any way even a coherent idea
I'm not the original commenter, but if you're referring to my suggestion of at least warning when entire unique branches of the original code are removed, that's entirely possible. That it might require reworking or even removing portions of the current optimization pipelines, or cause compilation speed to slow considerably is irrelevant to this particular aspect, because right now we aren't having a discussion of whether it's worth it, but whether it's even possible.
1: As happened here in this case.
I wrote "redundant", not "duplicate", intentionally. A check is redundant if other code implies that it cannot possibly ever end up true. It being a duplicate check is not the only way for that to happen, and it's actually the exception. Also, no, the compiler most likely doesn't have that knowledge. A compiler doesn't work the way you seem to think.
> Any new language introduced today that said "well, there's some interesting interactions sometimes if you don't pay close attention, and the compiler/VM might remove statements you write because they are testing those same weird interactions[1] as it thinks they can't happen" would be laughed out of town.
Or, more likely, it wouldn't. These are good reasons to not write high-level software in C, but there are also good reasons why people who actually need high speed still do use C. And it's not that we like the fact that writing correct C is hard.
You seem to imply that there is no real reason for C behaving the way it does, and that those rules for undefined behaviour only exist to make the life of programmers miserable. Those things are undefined because making them defined would actually be expensive in terms of performance.
> That the optimizations that C compilers do are complex and have many stages of optimization is not a suitable counter for the criticism that those same optimizations sometimes cause non-obvious interactions with safety checks meant to test the same edge cases that those optimizations take advantage of.
The way you phrase things suggests that you might be confused about how the compiler "reasons". The compiler doesn't read your program, see what you mean, and then try to find loopholes in order to misunderstand you. The compiler reads your program, and only understands what your program means according to the formal specification of the language that you claim it is written in. In the example I gave above, you seem to think that there is a call to abort() that the compiler "removes". There isn't. According to the C spec, that call is unreachable, and as such the semantics of that statement is a noop, which is what the compiler will correctly map to machine code, somehow. Explicit dead code removal on some intermediate representation of the AST is just one way that could happen, and it's an implementation detail of the compiler.
> That's like someone saying for safety reasons you need to see at least 20 feet of road in front of you at all times per 10 MPH on the highway, and people complaining about how that's not feasible because it would force you to slow down 10-20 MPH occasionally as you went around some turns. Yes. Yes it would.
Yep, that's a perfect analogy for people complaining that when they talk to a C compiler, they maybe should be writing C, and not a language that they themselves made up, if they expect the compiler to understand them. Even if it's sometimes difficult.
> Just because you can do an optimization, doesn't mean you should.
I agree. But for the most part, that's not what's happening. For the most part, optimizations are not intended to break your code, but rather it so happens that, in order to optimize some code, you have to rely on all code having certain correctness properties (that it should have if it is C code, according to the C spec), which then, unfortunately, happens to break some code that doesn't have those properties. Often it's somewhere between infeasible and impossible to distinguish those cases that are correct and can thus be optimized without introducing unwanted behaviour from those that are not correct and thus break as a result.
> I'm not the original commenter, but if you're referring to my suggestion of at least warning when entire unique branches of the original code are removed, that's entirely possible. That it might require reworking or even removing portions of the current optimization pipelines, or cause compilation speed to slow considerably is irrelevant to this particular aspect, because right now we aren't having a discussion of whether it's worth it, but whether it's even possible.
First, equivalence between pieces of code is undecidable, and second, "unique branches" is not in any way a useful concept anyway.
Sure, you can try to make a compiler detect certain instances of what seems like a safety check that doesn't ever trigger. But that will either be completely ineffective (as it only detects a small minority of cases), or it will produce tons of bogus warnings (because there are tons of cases where it is perfectly sensible to have "unique branches" in your code that are provably never taken, even ones that abort the program, and it is logically impossible to distinguish those from ones that were written with the intent to catch a runtime exception that the programmer expects to actually happen at runtime).
In this specific case, the compiler inferred that since an offset was added to a pointer, and pointer overflow is undefined, the programmer could not have specified undefined behavior, so there must have been no overflow. Since there was "no overflow", the compiler decided that the conditional statement testing for overflow could never be true, and removed that branch.
In the specific example submitted to HN, the compiler has a few options:
First, it can assume the programmer is infallible and will never make a mistake, and use that to actually change the code as defined for performance reasons.
Second, it can assume the programmer is fallible, and that without additional information, it is unsafe to alter the code based on assumptions of programmer infallibility for safety reasons.
Finally, it can assume the programmer is fallible, but try to keep the information present in some manner (tag as "maybe true") so that it can be reported on later, while, for safety, not using it for additional optimizations. It would be trivial to note optimizations that it could do, but will not because it cannot reason adequately about one or more required assumptions. It would also be trivial to note, using that same data, occurrences of statements where it knows undefined behavior may be encountered if the programmer is not vigilant about the inputs (that is, it need not report everything, just what it can definitively find). That is, it's trivial because the hard work of detecting the problem cases is already being done.
Currently many C compilers default to optimizing for performance, the first option above. They could, and many think should, at a minimum default to safety instead. We know people aren't infallible, so acting like they are has no basis in reality.
That's not to say every optimization has to be thrown out. There should be a distinction between something that can be assumed because of mathematical properties and constraints the compiler enforces compared to constraints it's assumed the programmer will correctly follow. There are clearly cases where the compiler can optimize based on knowledge it has. If you cast an unsigned char to an int, and add another unsigned char to it prior to doing any other operation, you can assume there will be no overflow. You can assume the same of short on platforms where int is twice as big as a short.
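The unsigned char case can be sketched as follows (add_bytes is a made-up name, and the static assertion spells out the assumption that int is comfortably wider than unsigned char, which holds on ordinary platforms but is not guaranteed everywhere):

```c
#include <limits.h>

/* Assumption made explicit: twice the largest unsigned char fits in int. */
_Static_assert(UCHAR_MAX <= INT_MAX / 2, "int too narrow for this sketch");

/* Both operands are at most UCHAR_MAX, so the sum is at most
 * 2 * UCHAR_MAX and cannot overflow an int under the assertion above. */
int add_bytes(unsigned char a, unsigned char b) {
    return (int)a + (int)b;
}
```

Here the absence of overflow follows from the types alone, not from any promise the programmer makes about the inputs.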
The bottom line is that compilers are assuming truthfulness of expressions that aren't necessarily true, and that require extraordinary effort from programmers to make sure they avoid, as evidenced by their continual discovery in enterprise grade libraries and applications.
You keep saying that it changed the code. It didn't. It compiled exactly what the code said. You dislike C, and that's OK, but that doesn't mean that a C compiler compiling C code into machine language that is semantically equivalent to the C code is somehow changing the program. It's not.
> That is, it's trivial because the hard work of detecting the problem cases is already being done.
No, it's not, you are still committing the same fallacy as before. The compiler doesn't "collect knowledge about undefined behaviour in the program", because that is useless knowledge for the compiler.
The compiler collects knowledge that helps it reason about the program. A big part of that is tracking the range of values variables can take on. That is information that is useful in selecting how to express certain code in machine code. Undefined behaviour plays into this because operations that are defined to have undefined behaviour have no results that need to be considered as possible values of the variable that the result is stored in. So, if the compiler sees an if(x>0&&y>0){ x+=y;x/=2; }, with x and y being ints, it can derive that x will be positive, and therefore, for example, the division can be compiled to a right-shift.
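Written out as complete functions (names are illustrative), the example looks like this. The second function is the same code with the shift spelled out by hand, which is only equivalent because the value is known to be positive at that point.

```c
/* For x > 0 and y > 0 the sum is positive whenever it is defined,
 * so dividing by 2 and shifting right by 1 give the same result. */
int halve_sum(int x, int y) {
    if (x > 0 && y > 0) {
        x += y;
        x /= 2;
    }
    return x;
}

/* Same function with the shift written out by hand. */
int halve_sum_shift(int x, int y) {
    if (x > 0 && y > 0) {
        x += y;
        x >>= 1;
    }
    return x;
}
```

For negative values the two operations differ: division truncates toward zero, so -7 / 2 is -3, while right-shifting a negative int is implementation-defined and typically yields -4 for -7 >> 1. That is why the range information matters.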
There is no code in the compiler that goes "well, there is this pointer dereference, so let's remove the NULL-check branch". It's rather that the dereference limits the range of possible values of the pointer to non-NULL values, which a later stage then uses to determine that the branch cannot ever be taken, and thus can be eliminated. The same "knowledge" could be inferred from an assignment of a constant, for example, or from a preceding check ... the compiler tracks the value, not whether undefined behaviour could happen.
> Currently many C compilers default to optimizing for performance, the first option above. They could, and many think should, at a minimum default to safety instead. We know people aren't infallible, so acting like they are has no basis in reality.
That's completely tautological. Every programming language assumes the programmer to be infallible. Every compiler and interpreter does what the program means according to the language spec, and if the programmer fails to express what they mean in the language, the program will do the wrong thing.
Now, there is an argument to be had over what kind of semantics of a language are easier to reason about than others, and to construct languages that make reasoning about the code as easy as possible, and some of that can be applied to adding additional restraints in a C compiler on top of the language specification, to make the language as understood by the compiler easier to reason about (while still staying within what the C spec defines, so as to stay compatible with existing, correct C code).
But there are two major problems with your reasoning here:
(1) Distinguishing programming mistakes and legitimately optimizable code is far harder than you think. You are just handwaving through that part, but that's actually the hard part. You will either miss a lot of optimization opportunity, or you will catch close to none of the relevant mistakes. If you think you have the solution to tell those cases apart much better than current compilers do, please write a paper about it, compiler writers certainly will be interested.
(2) Performance is actually kind of central to C. If you don't need performance, you probably should just not be writing C in the first place. And if you actually need performance, just erring on the side of safety isn't necessarily gonna cut it. The question is not whether you could add all of the safety features of, I dunno, python, to C. The question is what you would expect the result to look like? There is a reason why C code tends to be faster than python, and part of that is the lack of safety.
To maybe get an idea of why compilers do assume signed overflow to be undefined behaviour, this article seems to give a good overview: https://kristerw.blogspot.de/2016/02/how-undefined-signed-ov...
From the submitted article: "Thus, the compiler removes this part of the check." It did not compile what the code said, it compiled what it determined it needed to compile. That determination included an assumption of what values a variable could be based on whether they could overflow, which would be undefined. That's the whole point.
The code specified a test for whether "target < segmentStart", and the compiler determined that this could never be true and removed the check. We have in this bug report direct evidence that the compiler was too aggressive in its assumptions, as it is indeed possible. It was too aggressive for exactly the reasons I have been going over, which is to say that the condition is guaranteed false only as long as the programmer keeps the actual values from being large enough to cause an overflow in the prior statement.
> It's rather that the dereference limits the range of possible values of the pointer to non-NULL values, which a later stage then uses to determine that the branch cannot ever be taken, and thus can be eliminated.
The check in question (in the compiler in question) specifically uses an assumption that the programmer will prevent an overflow which would be undefined. That is the assumption of infallibility I'm referring to.
> That's completely tautological. Every programming language assumes the programmer to be infallible.
No, they don't. If they did, Java, Rust and just about every dynamic language would never do bounds checks. One of the main reasons for a type system is to force the programmer to follow rules to prevent mistakes.
> Distinguishing programming mistakes and legitimately optimizable code is far harder than you think.
At this point I'm referring to a specific type of optimization that they are doing that I think rests on shaky presumptions. Removing that optimization is not hard work. It may be hard for the community to stomach, depending on how much performance impact it has.
> Performance is actually kind of central to C. If you don't need performance, you probably should just not be writing C in the first place. And if you actually need performance, just erring on the side of safety isn't necessarily gonna cut it.
I gave a specific example for how to get the same performance if this particular type of optimization was made more conservative. Performance is important, but when it comes to performance or correctness, correctness should win. Full stop.
Very little of what I'm referring to at this point is theoretical. I'm referring to real world situations, mostly the one these comments are in response to. Your comments seem to indicate you think this situation isn't possible. Can you clarify on whether you think the bug report is wrong, or whether I'm incorrect in my assessment of what the bug report is saying, or whether I'm misinterpreting your point? At this point, I'm under the impression that much of what I'm stating is fact, so I'm not sure how to interpret statements such as "It compiled exactly what the code said." as anything but wrong, but that's not getting us anywhere.
Yes, it did--that seems to be your fundamental confusion.
What that code means is determined by the specification of the C language, and only the specification of the C language. You constantly keep implying stuff that you think, or hope, or would prefer the code means, but that is completely irrelevant for the question of what the code actually means. Just because some intuitive reading of the characters that make up the code makes you assume that it should mean a certain thing, does not make it so.
That code does not mean "check for overflow", no matter how much you wish it did. And because it doesn't mean that, the compiler didn't translate it as that either.
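A check that does express the intended bound has to avoid forming the out-of-range pointer in the first place. The following is only a minimal sketch of that idea, not Cap'n Proto's actual code: `word_t` and `checked_target` are hypothetical names made up for illustration. The untrusted offset is compared against the segment length as an integer, and the pointer addition happens only after the comparison has passed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t; /* hypothetical stand-in for an 8-byte "word" type */

/* Returns a pointer to segment_start[offset], or NULL if offset is out of
 * bounds. The comparison is done on integers, so no out-of-range pointer is
 * ever computed and the compiler has no undefined case to reason from. */
const word_t *checked_target(const word_t *segment_start,
                             const word_t *segment_end,
                             uint64_t offset)
{
    assert(segment_start <= segment_end);
    size_t segment_length = (size_t)(segment_end - segment_start);
    if (offset >= segment_length) {
        return NULL; /* caller is expected to report the bounds error */
    }
    return segment_start + offset;
}
```

A caller would treat a NULL result the way the original code treated the bounds error, and dereference the result only when it is non-NULL.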
> No, they don't. If they did, Java, Rust and just about every dynamic language would never do bounds checks. One of the main reasons for a type system is to force the programmer to follow rules to prevent mistakes.
You are completely missing the point, essentially due to the same confusion as above. I didn't say that those languages didn't have bounds checks. I said that they assume that the programmer is infallible. Every programming language specifies exactly what each syntactic construct means, and which syntactic constructs don't mean anything, and what the runtime behaviour is, and where it is undefined. That is what makes a programming language a programming language. It is the job of the programmer to translate what they mean into the syntax of the respective programming language. If the programmer makes a mistake in this translation, the program will be wrong, and it will not do what the programmer meant it to do, no matter which programming language they are using--in that sense, every programming language expects the programmer to be infallible.
The difference between programming languages is not whether they prevent you from making mistakes (none does or ever will), but how difficult it is (mentally) to avoid making mistakes.
> I gave a specific example for how to get the same performance if this particular type of optimization was made more conservative. Performance is important, but when it comes to performance or correctness, correctness should win. Full stop.
That's completely beside the point. Nobody is saying we should have incorrect code (well, ok, some misguided people probably do, but they aren't really part of this discussion). The question is how we are going to achieve that, and that is ultimately a question of economics: What is the easiest/cheapest way to get the greatest amount of software into a state where its execution matches what the programmer intended? Just claiming that we should throw infinite resources at the problem doesn't actually help the problem disappear.
> Can you clarify on whether you think the bug report is wrong, or whether I'm incorrect in my assessment of what the bug report is saying, or whether I'm misinterpreting your point?
Really none of those, I think. I think the way you think about the problem is just confused, which makes it difficult to nail down why exactly your suggested solutions aren't really solutions.
> I'm under the impression that much of what I'm stating is fact, so I'm not sure how to interpret statements such as "It compiled exactly what the code said." as anything but wrong, but that's not getting us anywhere.
I hope I maybe managed to explain that above? I think that's really at the core of your confusion: You are mixing up what you intuitively think things mean and what things mean according to the appropriate formal definition in the respective context. But code in particular does not mean anything, except for what the formal specification of the respective language defines, and that can deviate arbitrarily far from your intuitive understanding.
It's a bit like false friends in natural languages: Just because you know a word from one language, doesn't mean the same word cannot mean something completely different in another language, and it's just confused to use the vocabulary of one language to determine the meaning of a sentence in a different language.
No, it compiled what it determined it had to, based on the C standard. There is a difference. The code, as written, specified a certain set of actions to be taken. The compiler determined some of those directions need not be translated to machine code, and thus did not, but they were specified nonetheless.
To say that the compiler did not remove any code, or directions to be carried out, when translating to machine code, is to subscribe to a torturous and unuseful definition of the terms we have been using.
Of the actions specified by the programmer in the source file, one was optimized out in the translation of that source specification to machine code. This change alters the execution path of the program when it is present, and to such a degree that without the optimization the program would halt almost immediately, but with the optimization it allows an out of bounds memory access.
We are not arguing whether the C standard allows this. We are arguing whether the C compilers should do this. There is a distinct difference. Stating that no code was removed has been extremely unhelpful to this conversation, regardless of whether you think it is a technically correct statement. In the generated machine code, a condition of a branch statement does not exist in the version with optimization, but does without it.
The fact that this particular optimization relied on a case where the programmer specified a statement that depending on values not knowable to the compiler at the time of compilation may have resulted in undefined behavior or not makes this a poor optimization to carry out.
> Just because you know a word from one language, doesn't mean the same word cannot mean something completely different in another language, and it's just confused to use the vocabulary of one language to determine the meaning of a sentence in a different language.
Perhaps you could actually address a point I've made instead of arguing over the words used. You are arguing over a technicality instead of the topic at hand.
Feel free to reply, I'll read it, but I'm done with this conversation beyond that.
That's a nonsensical statement. "The code, as written" doesn't have any meaning, other than perhaps what you make up in your mind, which is not a useful reference for discussion, unless you also explain what you interpret it to mean.
I understand that maybe you do not actually mean this literally, and that you maybe are just using somewhat imprecise language to get the idea across--the problem is that exactly in the details that you are not spelling out are the problems that this discussion is all about.
> To say that the compiler did not remove any code, or directions to be carried out, when translating to machine code, is to subscribe to a torturous and unuseful definition of the terms we have been using.
No, quite to the contrary. Those definitions might not be useful for day-to-day programming work, but they are exactly the definitions that you need to clearly discuss compiler behaviour, because those are the definitions that the compiler is using, and the compiler is using those definitions because they match the concepts of how you build a compiler.
> Of the actions specified by the programmer in the source file, one was optimized out in the translation of that source specification to machine code. This change alters the execution path of the program when it is present
No, there is no "change", that's just confused language. There is a difference between compilation results, but neither of those is in any way the "real" thing, while the other is "changed", they are both equally valid mappings from C to machine code, with one arguably being closer to the intention of the programmer and thus maybe more useful in this specific case.
> We are not arguing whether the C standard allows this. We are arguing whether the C compilers should do this.
The problem is that those are inextricably interlinked, because the compiler must still stay within the bounds of the standard, and still produce code with reasonably good performance.
> Stating that no code was removed has been extremely unhelpful to this conversation, regardless of whether you think it is a technically correct statement.
The point is not that it's a technically correct statement, the point is that that's not necessarily how the compiler "thinks", so it's often unhelpful in discussing compiler behaviour to talk about "removing code".
> In the generated machine code, a condition of a branch statement does not exist in the version with optimization, but does without it.
It just so happens that in this case, the compilation result without optimization was closer to the programmer's intention than with optimization. But the usefulness of this observation is severely limited because in other cases the exact opposite could be true. The programmer wrote something different than what they meant, and the compiler in some situation produced code that still matched the intention of the programmer ...
> The fact that this particular optimization relied on a case where the programmer specified a statement that depending on values not knowable to the compiler at the time of compilation may have resulted in undefined behavior or not makes this a poor optimization to carry out.
Except that if a C compiler avoided all optimizations for which this is true, a lot of code would be a lot slower. You seem to only be seeing some specific cases for which the performance difference is negligible, and the risk of the optimization is obvious to you, and your imprecise use of language doesn't make discussing this any easier. What you don't seem to realize is how much optimization a C compiler does that is perfectly safe that the compiler cannot easily, if at all, distinguish from this arguably dangerous case, which is why the compiler could only choose to either in many cases produce unnecessarily slow code, or use the current strategy and occasionally produce code that does something else than what the programmer had in mind.
> Perhaps you could actually address a point I've made instead of arguing over the words used. You are arguing over a technicality instead of the topic at hand.
Your point is incoherent because you are using imprecise language, which makes it difficult to address. That's why I am addressing your imprecise use of language first.
    int *x = some_call();
    ...
    *x = 5;
    ...
    if (x == NULL)
        abort();
People get rather riled up about undefined behavior. There is a good subset of people who rather like fast C code (optimizing out paths with UB can make code much faster), and a good subset of people who think C should just be safe and predictable. If you want safe and predictable, C is a tough sell.
Except the compiler clearly knows that it does happen since it elides the code because of it.
> The compiler will optimize out the abort(), which is a good optimization since obviously x can't be NULL otherwise *x = 5; would be wrong.
This isn't an optimization based on undefined behaviour, it's actually a perfectly standard dataflow analysis that even languages with fully defined behaviour would implement.
That's definitely the opposite of what's happening here. The compiler believes that it does not happen and therefore removes the code.
I think that's why people are so caught up on this. Technically, neither action is a problem, but together they obviously caused a problem. Either one in isolation doesn't look all that horrible (even if the undefined behavior reasoning is pushing it).
In fact TFA specifically points that out multiple times.
Edit: mannykannot noted somewhere else that clang has the same flag.
https://people.csail.mit.edu/nickolai/papers/wang-undef-2012...
http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_201...
You could make a case that it's a bug in the standard though :-)
A standard is needed in some cases where there is no 'obvious' behavior.
I think this is the way that the 'Ruby standard' is defined, by a reference implementation which is assumed to be bug-free.
About the interpretation of the standard: this is probably true, but this kind of technical document is meant to be mono-interpretable, if you know what I mean. Ultimately there might be a few limitations which you don't encounter in practice (if you give your variables names of more than 4k characters, I doubt if it will compile), so you might argue this is an 'interpretation of the standard'.
I would argue that in such a case, technically, the compiler doesn't comply with the standard, but for all intents and purposes, it does (and so, in practice, no one would doubt that the compiler complies with the standard).
Erm ... nope. If code is unreachable, it can, by definition, not exhibit undefined behaviour.
> The reason it doesn't simply abort is because compiler authors don't go out of their way to do extra work for no reason.
Well, true, but the primary reason is that there are parts of the input domain for which the behaviour of the code is actually defined, and it's not defined to mean a call to abort(). The mere possibility to call some code at runtime with values for which the behaviour is undefined does not make the code's semantics completely undefined.
I don't know why we expect phones and apps to have modern great UX, solving all kinds of problems, but our text editors should remain essentially the same as in the 70s.
Taking that information into any sort of "yeah, this still represents the original intent of the author" highlight/lack of highlight/summary as it sounds like azinman2 wants might be a little more difficult. (After all, if you can implement that, why not have your compiler only emit 'sane' code in the first place?)
There are some tools that go in that direction. For example clang can help you help the optimizer auto vectorize loops: http://llvm.org/docs/Vectorizers.html#diagnostics
Not everybody expects that:
I use sublime text for smaller editing tasks (although vscode and notepad++ are kind of ok too) and Visual studio or in my case preferably Netbeans for anything bigger (although I accept that this might sound weird).
Only if I connect over ssh or work on a console I prefer vim.
Edit: also my preferred IDE already does something similar by telling me whenever it thinks something can obviously be simplified or written in a more idiomatic way.
Nearly all serious compilers have at least one representation which is not isomorphic for all programs. (It is homomorphic, which is required for the compiler to be correct.) As a consequence, at least some programs will be mangled beyond recognition.
Just because it's hard doesn't mean it isn't worthwhile to do, or that everything the compiler does must be reserved and shown in the IDE. I'm also not suggesting that this compiler feedback is the only IDE improvement we make. I'm observing that the entire way we develop software hasn't fundamentally changed in decades, yet every other aspect of computing has. Even something that roughly seems the same like word processing is now collaborative and in the cloud with real-time updates and social threads.
Light table is one IDE that is trying to question some assumptions -- like that files are important and that everything must be 1 line spaces apart in mono font. This is the kind of progress I'd like to see.
It does seem reasonable that there be a way, even if difficult, to highlight a bounds check that gets optimized away. If we start from the problem we're trying to solve (show eliminated code that poses security risks... such as bounds checks or erasing memory) rather than the general problem (reverse all compiler changes), then things become more tractable.
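The "erasing memory" case mentioned above is a classic dead-store elimination: a `memset` of a buffer that is never read again may legally be dropped. One common workaround, sketched here with a hypothetical helper name `secure_zero`, is to write through a volatile-qualified pointer, which compilers in practice treat as stores they must keep. Where available, purpose-built functions such as `explicit_bzero` or the optional C11 Annex K `memset_s` are probably the better choice; this sketch only illustrates the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrites len bytes at buf with zeros. The writes go through a
 * volatile-qualified pointer so that the optimizer does not treat them
 * as dead stores, even if buf is never read afterwards. */
void secure_zero(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    volatile unsigned char *p = buf;
    while (len--) {
        *p++ = 0;
    }
}
```

A typical use would be calling `secure_zero(key, sizeof key)` right before a key buffer goes out of scope.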
The problem is that you probably won't be able to use any optimising compiler if you do that. As a result of a bunch of optimisations (mainly inlining) the compiler "uncovers" lots of UBs which it can use to pare down the code; that's why inlining is one of the most important intermediate/early optimisations.
Here's an old example by mikeash:
    int ComputeStuff(int *value) {
        if (value == NULL) {
            /* long and complex computation for a NULL value */
            return result;
        } else {
            /* long and complex computation using the data pointed to by value */
            return result;
        }
    }

    void DoStuff(int *value) {
        int pointedTo = *value; // value *must* be non-NULL
        // do some work with pointedTo
        int computedResult = ComputeStuff(value);
        // do some more work with whatever
    }
Would -Wundefined-behaviour forbid assuming a pointer is non-null ever, requiring (and outputting) a check before each use of a pointer in a given scope/lifetime?
Either emit the if, or throw an error that the code is internally inconsistent. I mean, it IS inconsistent - first it dereferences the pointer, then it checks if it's NULL or not! This is pretty much obviously an error! I'd much rather add a bunch of assert(value != NULL); to my code to signal to the compiler that yes, this pointer is really not NULL within this scope, than have to deal with the current UB-hell.
Another acceptable alternative would be to rewrite the if(value == NULL) branch to call terminate() since if we get there, then the program has executed UB and cannot continue. Since UB is rare, we can even arrange things such that the branch doesn't slow down the CPU in the general case.
No, in most cases that's the result of defensive programming and macros or inlining, where there is a perfectly useful sanity check in some inlined function, but the compiler can see that in some specific context where it was inlined, the check is actually redundant, and thus removes it.
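A small sketch of that situation, using made-up names (`struct buffer`, `buffer_used`, `buffer_free_space`): the helper carries a sensible NULL check for its general callers, but once it is inlined into a caller that has already dereferenced the pointer, the check can never fire on any execution with defined behaviour, so an optimizer may drop it.

```c
#include <assert.h>
#include <stddef.h>

struct buffer {
    size_t used;
    size_t capacity;
};

/* Defensive helper: other callers may legitimately pass NULL. */
static inline size_t buffer_used(const struct buffer *b)
{
    if (b == NULL) {
        return 0; /* sanity check, useful in general */
    }
    return b->used;
}

size_t buffer_free_space(const struct buffer *b)
{
    size_t capacity = b->capacity; /* dereference: b must be non-NULL here */
    assert(b->used <= capacity);
    /* After inlining buffer_used(), its NULL check is redundant in this
     * context, because b was already dereferenced on the line above. */
    return capacity - buffer_used(b);
}
```

Nothing here is a programming mistake; the check is simply unnecessary at this particular call site, which is the case the optimizer cannot easily tell apart from the dangerous one.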
> Since UB is rare, we can even arrange things such that the branch doesn't slow down the CPU in the general case.
It's an additional instruction, so it always has the potential to slow the code down, if only because it makes the code bigger than the cache. Also, it's usually not one such branch, but lots and lots of them throughout the program. That's why the optimizer tries to eliminate them.
If the compiler can know that the pointer is non-null, why would it need to rely on UB to optimize it?
Problem I guess is the ever-present possibility of pointer aliasing and such nearly allowing anything to change anything else at any time... Not easy making things more sound on such a foundation.
You've got this backwards. The compiler "knows" the pointer is non-null because you dereferenced it, and dereferencing null would have been undefined behavior. That's what allows the compiler to assume that didn't happen.
There are many operations which will produce undefined behavior for some inputs (e.g., every pointer dereference, every signed integer arithmetic operation). Figuring out if it is possible for a program to encounter such inputs at runtime is equivalent to the halting problem.
This is why any attempt to craft a -Wundefined-behavior that works on non-trivial programs is doomed to failure. It may be possible to prove a program will definitely invoke undefined behavior in some useful subset of cases. Compilers already do this in some of them, even. But those cases (and many more, besides) would also be caught with ubsan and a unit test.
Technically the UB is in the original code, it's the dereferencing of a pointer without checking it. The compiler can merely use that to assert dead code (checking for a null pointer after having already deref'd the pointer is nonsensical) and optimise it away.
UB is not a property of the local code. It's a property of the code and the values you feed to it. Without explicit annotations of contracts you won't be able to statically detect this for C/C++.
At runtime UBSan exists to detect this.
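To make the "code plus values" point concrete, here is a minimal sketch with hypothetical function names. `next_id` is perfectly well defined for almost every input and undefined for exactly one, so nothing about the source text alone says whether a given program run has UB. Building with `-fsanitize=undefined` (supported by GCC and Clang) should make a call with the bad input report the overflow at run time; the second function shows the contract written out as an explicit check instead.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Defined for every n < INT_MAX; signed overflow (undefined behaviour)
 * only when n == INT_MAX. The code is the same in both cases; only the
 * runtime value differs. */
int next_id(int n)
{
    return n + 1;
}

/* Same operation with the precondition checked explicitly.
 * Returns 1 and stores the result in *out on success, 0 otherwise. */
int next_id_checked(int n, int *out)
{
    assert(out != NULL);
    if (n == INT_MAX) {
        return 0; /* would overflow: refuse instead of invoking UB */
    }
    *out = n + 1;
    return 1;
}
```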
Deciding that a statement must be true because otherwise it would contain undefined behavior (which is assuming the developer won't cause undefined behavior), and then using its truthiness to elide the check the developer included to attempt to make sure they weren't causing undefined behavior is some contorted reasoning that I'm confident isn't something that compiler authors decided specifically should be included, but instead is an unfortunate interaction between two separate aspects of compilation.
In this case, the optimizer made an assumption that required the programmer to have ensured that an addition could not overflow, otherwise it would be undefined behavior. The compiler assuming that holds as true when it isn't necessarily so (and in this case was not so), is a problem. Optimizations that assume a programmer was diligent enough to follow those rules should not be attempted, period, if they result in the removal of some code (or possibly ever, just to be safe).
That's not to say those optimizations are forever lost to us. It just requires the programmer be more careful about the actions they take such that no assumptions need be made. For example, if you are adding two types which could cause an overflow, and overflows are undefined, you can cast and use larger types, or rely on compiler knowledge. Examples of that might be casting in the comparison or casting to larger types earlier on and not modifying it prior to comparison, in which case a sufficiently smart compiler might infer that since it was an unsigned char originally and you haven't modified it since you cast it to an unsigned int, while its type is actually an unsigned int, for comparisons of overflow detection for optimization you can treat it as an unsigned char.
That's not to say programmers can't take steps to make it provable for the compiler.
Yes, this could help construct a "warn on assumption made" mode, but you'd likely be inundated with a lot of mundane assumptions as well. You could filter by kind of assumption, but scary-ub-triggering assumptions like the one in the post happen all the time too, and they usually depend on the runtime properties of the program, which are hard to statically figure out. Basically, this is a very nontrivial problem, and quite likely intractable to solve in a way that doesn't produce a deluge of unnecessary info.
(In the presence of annotations to help the compiler -- like the ISOC++ core guidelines -- the problem becomes significantly easier because you have local information on the runtime properties)
My argument is that if an optimization requires the compiler to infer intent rather than make concrete decisions based on known facts, and this results in a change in the output of the program, then it is an optimization that is irresponsible to apply. The only part hard to know about this is how it affects execution, as the rest is already done currently. If that's too hard to determine, the correct stance is to disallow the optimization in that instance.
It is not irresponsible to have undefined behavior that the programmer can leverage.
It is not irresponsible to optimize out instructions that do not affect the result.
It is irresponsible to allow undefined behavior, and then change the output based on how that undefined behavior is interpreted but only if a specific optimization is applied. Optimizations should never change deterministic output. Code that is non-deterministic purely because of undefined behavior needs to be noted.
That people have gotten used to some speed improvements at the expense of consistency, but not necessarily to their knowledge, is no excuse not to fix it. In many cases the code could be changed to once again take advantage of the same optimizations which are more strict, or to avoid undefined behavior. In this case, that would mean either casting to a type with defined overflow prior to addition or using a large data type to accumulate the values and test if it's too large. Neither of those allows the optimization, because that optimization is actually wrong in this case. In the cases where the optimization would be correct, correct use of types and casting should yield the same result.
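The two alternatives named above might look roughly like this for plain `int` and `unsigned` addition. This is a sketch with hypothetical function names, and it assumes `int` is narrower than 64 bits (checked at compile time below). In both variants the overflow test is ordinary defined arithmetic, so there is no undefined case for the compiler to reason from.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

_Static_assert(sizeof(int) < sizeof(int64_t),
               "sketch assumes int is narrower than 64 bits");

/* Option 1: add in a wider type, then range-check the result.
 * Returns 1 and stores the sum on success, 0 if it does not fit in int. */
int add_int_wide(int a, int b, int *sum)
{
    assert(sum != NULL);
    int64_t wide = (int64_t)a + (int64_t)b; /* cannot overflow in 64 bits */
    if (wide < INT_MIN || wide > INT_MAX) {
        return 0;
    }
    *sum = (int)wide;
    return 1;
}

/* Option 2: use an unsigned type, whose wraparound is defined, and
 * detect the wrap after the fact. */
int add_unsigned_checked(unsigned a, unsigned b, unsigned *sum)
{
    assert(sum != NULL);
    unsigned r = a + b; /* wraps modulo UINT_MAX + 1, by definition */
    if (r < a) {
        return 0; /* wrapped around */
    }
    *sum = r;
    return 1;
}
```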
I would love to see a specific counter-example where this would be unworkable. I do not consider code having to be changed from what was previously a possible undefined behavior to definitively not undefined behavior as unworkable. I don't see how C could be considered a systems language without this. I understand how this is an unpopular stance with C programmers, but without it, there's actually a bunch of non-deterministic source code in the wild that's an allowed compiler tweak away from changing how it functions, not just what instructions it uses to achieve that function.
Fuzzers will catch a certain proportion of this type of problem, sure. But for a ground-up project like this there's really no excuse not to use a better language.
Nearly every language is vulnerable to integer overflow. C++ is one of the few languages (possibly the only popular language) where you can reasonably check for overflows at compile time, as Cap'n Proto now does: https://capnproto.org/news/2015-03-02-security-advisory-and-...
So, I don't accept the assertion that C++ is inherently a security problem.
In any case, Sandstorm's low-level container management bits pretty much had to be written in C/C++ since they interact closely with the operating system. Or if we were starting over today, Rust might now be an option, but it wasn't when we started.
I don't advocate dynamic languages. Indeed I regard this class of vulnerabilities as evidence of a lack of type safety. Still, C++'s undefined behaviour semantics promote almost all bugs into security bugs, which is not a great property to have in your language.
> C++ is one of the few languages (possibly the only popular language) where you can reasonably check for overflows at compile time, as Cap'n Proto now does: https://capnproto.org/news/2015-03-02-security-advisory-and-....
C++ using that kind of template metaprogramming technique has nowhere near the overall popularity of C++, so that's not really a popular approach either (in the sense that e.g. few tools understand it, people who can work on it are hard to hire...). And any language with a reasonably advanced type system (or macros) could do the same thing.
The problem is that almost no one is thinking about this class of bugs. Furthermore, few languages offer tools to detect and handle integer overflows; I read that Rust for example only has run-time checks in the debug version, which to me is disappointing.
Only by default; you can turn them on in release builds too if you want. (And Rust 1.17 will have a stable `-C overflow-checks=y` flag which allows you to turn on the overflow checks without other debug assertions.)
If that's all you want it's easy to do that in any language? What's so special about C++? The reason this gets talked about more in C++ is that the consequences of integer overflow in C++ are dreadful (undefined behaviour i.e. instant security bug) whereas in most languages adding two integers will evaluate to an integer.
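For reference, run-time overflow detection in C itself can be sketched with the `__builtin_add_overflow` builtin that GCC and Clang provide (C23 standardizes a similar facility in `<stdckdint.h>`). The function name `sum_checked` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints. Returns 1 and stores the sum in *total on success,
 * or 0 as soon as an addition would overflow int. */
int sum_checked(const int *values, size_t n, int *total)
{
    assert(total != NULL);
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* The builtin returns nonzero if the mathematically exact result
         * does not fit in acc; no undefined behaviour is involved. */
        if (__builtin_add_overflow(acc, values[i], &acc)) {
            return 0;
        }
    }
    *total = acc;
    return 1;
}
```

The caller decides what an overflow means for the program (reject the input, saturate, abort), which is the part no language can decide on the programmer's behalf.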
So it does seem to be the case that they are using the rust version here. That's nice.
> I would say, having looked at hundreds of thousands of lines of Ada, that I have had a disappointingly frequent experience in finding the UNCHECKED_CONVERSION generic used in production Ada code.
> I agree that this is often a mark of poor craftsmanship and unhesitatingly discourage its use, but it is there
[0] https://www.cs.york.ac.uk/hise/safety-critical-archive/2011/...
2. that's not Ada's only UB, the 2005 spec has 35 or 36, which granted is an improvement over C's circa 200, but a far cry from being UB free
This is so idiotic, and no, it's not a bug in Cap'n'Proto, it is very definitely a bug in the compiler (and/or the current version of the C spec. if it actually allows this).
I had a C compiler before there was a C standard. So by your definition, this compiler could not have bugs?
As an exercise, try to express the difference between "standards-compliant" and "bug-free".
Different people may have different expectations, so bugs become subjective.
Ideally, there is a water-tight standard. In this case every deviation from the standard is a bug and vice versa.
Yes. However, we are talking specifically about the parts where the standard is undefined. The standard most emphatically does not require a compiler to do these crazy optimizations that remove safety critical code.
Not sure that's a very productive approach. The spec is what it is, and it's of course rather deliberately done the way it is in order to make it possible for compilers to optimize and extract more performance.
I still assert that at the very least if logical operations are eliminated they should be verbosely enumerated in the output of the compiler.
It might also be nice if there were a way of marking a section as /security/ rather than /speed/ critical, and thus transformations on the highlighted section would be far more limited.
Well, the code is right there, there is no magic "do what I mean" needed. Just don't magically remove code that I wrote.
> spec is [..] rather deliberately done the way it is in order to make it possible for compilers to optimize and extract more performance
Yes, and it is wrong. Breaking existing programs with magic that simply removes code that is clearly there in order to eke out a bit of performance is wrong.
Compile with O0.
> Yes, and it is wrong. Breaking existing programs with magic that simply removes code that is clearly there in order to eke out a bit of performance is wrong.
Well welcome to writing C? It's been this way for a long time and that won't change - UB is critical to performance and doesn't always lead to unsafety. If you're programming in C, don't blame the compiler when your UB leads to remote code exec, you should have known what you were getting into.
I've been writing C for 30 years.
> UB is critical to performance
Not really, no. See "What every compiler writer should know about programmers or “Optimization” based on undefined behaviour hurts performance"
(referenced elsewhere in this thread)
Thanks for the paper. It's very interesting but I think it has no real impact on my initial statement - these optimizations aren't going away, you can continue to expect problems like this vuln in the future for the same reasons.
You can already do that: use a non-optimising compiler (tcc and the like) or compile without optimisations.
The standards I'm recalling were in-house standards, but a quick Google shows that Google have published their standard and they warn against unsigned too: https://google.github.io/styleguide/cppguide.html#Integer_Ty...
Bottom line: avoid unsigned for numbers. Use unsigned only for bags of bits.
The fix commit changes it to use an unsigned integer type instead, so it seems like a concrete example for the exact opposite of your suggestion.
> Since farPointer.offset is an unsigned number, the compiler is able to conclude that target < segmentStart always evaluates false. Thus, the compiler removes this part of the check. Unfortunately, in the case of overflow, this is exactly the part of the check that we need.
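A minimal sketch in C of the alternative ordering, where the offset is compared against the segment length before any pointer arithmetic happens, so the check never depends on an overflowed pointer. The `word` typedef and the function name `checked_target` are assumptions made for illustration; they are not taken from the Cap'n Proto source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 8-byte word type discussed in the thread. */
typedef uint64_t word;

/* Returns a pointer to segment_start[offset], or NULL if offset is out of
   bounds. The offset is checked against the segment length first, so an
   out-of-range pointer is never formed. */
static const word *checked_target(const word *segment_start,
                                  const word *segment_end,
                                  uint64_t offset)
{
    assert(segment_start <= segment_end);
    size_t segment_length = (size_t)(segment_end - segment_start);
    if (offset >= segment_length) {
        return NULL;  /* caller reports the bounds error */
    }
    return segment_start + offset;
}
```

Because the comparison is between two unsigned integers, there is nothing here the compiler can assume away: a huge offset simply fails the `>=` test.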
This certainly seems like a compiler bug!
I would accept that pointer overflow is implementation dependent. Allowing it to be "undefined behaviour" is sincerely BS.
Every processor has a "pointer register". Increment it until it overflows. What happens then? Most of the time it will roll over. I suspect this happens even in early RISC architectures that faint if you look at them too hard. There's your answer.
It's right, it's not a compiler bug. C tries to solve all problems and cops out on the real issues. The language is broken.
Now, if you want to talk about changes that should perhaps be made to the standards, that's a different discussion altogether (and also has nothing to do with intelligence, since the UB in the standard today is there to facilitate portability and optimizations, both of which are highly important to C and C++).
You then proceeded to call out a small subset of the comment and yes, you were technically correct. You also ignored the actual point of the comment and didn't really contribute anything to the discussion because of it, thus doing exactly what the comment was referring to.
> The person I replied to basically said (paraphrasing) "compilers which optimize based on undefined behavior and follow the standards are written by idiots"
That's not what was said. What was said is that if someone continues to hide behind technicalities rather than addressing the problem, then that person is not worth talking to. If you view your original reply in that light...
That's a highly charitable reading of what was written.
I responded to only part of the comment because only part of the comment offended me: "I'll find someone smarter."
Sure, if one ignores the hyperbole (like a compiler deleting all of your files while compiling your UB source code, as if that ever happened), some of the idea has merit. But the expression of that core idea was not done well. But I ignored all of that.
The idea that a compiler author must be mentally deficient in some way for following the language standard is absurd.
Technical correctness should not be the final word in any discussion (and I never said it was), but calling someone "not smart" for being technically correct in a situation where technical correctness has a lot of value (even if you are focused on the drawbacks) in a comment doesn't really deserve a complete and thoughtful reply.
It is. I also think a case could be made that if someone can't get past a technicality when repeatedly asked to, they are lacking in intelligence in one aspect or another, or are purposefully being obstructive. Note that "find someone smarter" doesn't necessarily mean "you are stupid". It does imply that further discussion with this person on this topic will be pointless though.
> I responded to only part of the comment because only part of the comment offended me: "I'll find someone smarter."
I submit that perhaps because of the greater context and your own views, you might have read more into the words used than was strictly called for.
I think the original comment is essentially saying "If I keep saying I have a real problem and you keep deferring that it doesn't matter because of some technicality, talking to you is a waste of my time," albeit in slightly more colorful language.
It also implies "Someone smarter will see it my way" which implies "I'm smarter than you" which is getting to part of the root of what's offensive about that wording.
> "talking to you is a waste of my time"
That's slightly less offensive than what was said because it no longer implies a "smarter" person would be less of a "waste of my time", but it's still fairly offensive.
Yes, but in this case, "my way" is referring to the understanding that real problems need real solutions and that continuous deflection based on technicalities doesn't help with that. I would count that as an indicator that one person was smarter than the other in that specific context. To be clear, I don't believe intelligence can be measured along a single axis in any useful way (I've met plenty of "smart" people who acted very stupidly, and plenty of "stupid" people who showed an amazing amount of competence and intelligence about certain things), so I didn't read the initial statement as indicating the overall intelligence of an individual.
> That's slightly less offensive than what was said because it no longer implies a "smarter" person would be less of a "waste of my time", but it's still fairly offensive.
It's also fairly subjective. It's your right to be offended at what you want, but I would caution against being offended without confirming the intent and/or meaning behind the statements. I've shown how I interpreted it somewhat differently than you, so there's at least some ambiguity.
Which is why I merely replied with "I don't know why you think this has anything to do with lack of intelligence."
I could have said "you are wrong and it has nothing to do with lack of intelligence" (which is what I believe), or I could have gone further and said or implied I'm smarter than he is (which I didn't do, but which the original commenter said I did for some reason), but I didn't.
So I believe you are preaching to the choir here.
It's also not much of a stretch to say that invoking undefined behavior can do things like delete everything on your hard drive. In fact, I've seen it happen many times. Any memory corruption exploit fundamentally relies on the fact that the author invoked undefined behavior. Although it may be undefined, it's still deterministic, so an attacker can construct an input that abuses the failure modes of your particular implementation to run arbitrary code of their choosing, like spawning a shell process and binding it to a port, or fetching some malware.
C guarantees that none of this can happen, but when you don't follow the rules of C, they really do mean it when they say anything can happen. It's not because the compiler authors are being trolls and feel entitled to delete your files or launch a rocket out of spite. It's because your machine is now executing who-knows-what.
void*x = trk->x;
if(!trk) return 0;
return doit(x);
return doit(trk->x);
Compiler writers can argue that this kind of optimization is useful because it makes other kinds of optimizations easier to implement, but the argument isn't that it's "technically correct" so they must. They have to argue on the merits of the decision, and if they fail to be convincing, people will find other solutions.
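To make the snippet above concrete, here is a sketch in which the NULL test precedes the first dereference, so there is no earlier `trk->x` from which a compiler could infer that `trk` is non-NULL. The `struct track` layout, the body of `doit`, and the name `process` are assumptions for illustration; only `trk`, `x`, and `doit` appear in the original snippet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: only the member x appears in the snippet above. */
struct track {
    void *x;
};

/* Hypothetical worker; it asserts its own precondition that x is non-NULL
   and treats x as a pointer to int purely for demonstration. */
static int doit(void *x)
{
    assert(x != NULL);
    return *(int *)x;
}

/* The NULL test comes before any dereference of trk, so the check
   cannot be treated as dead code. */
static int process(struct track *trk)
{
    if (!trk) {
        return 0;
    }
    return doit(trk->x);
}
```

With this ordering, `process(NULL)` returns 0 under any optimization level, because no undefined behaviour occurs before the test.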
The DNS specification required clients use source port 53. I think this is dumb because it made it easy to spoof DNS requests -- simply generate 30k UDP packets in a short period of time for a domain name that is commonly requested (like yahoo.com) on a DNS server that's popular (like opendns).
If only clients would use a random source port, the number of packets goes into the billions.
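The arithmetic behind those two figures can be sketched as follows, assuming the attacker has to match the 16-bit DNS transaction ID and, when ports are randomized, the source port as well. Treating all 65,536 port values as usable is an assumption that gives an upper bound; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of (transaction ID, source port) combinations a blind spoofer
   has to cover. Both parameters are counts of possible values. */
static uint64_t spoof_search_space(uint64_t id_values, uint64_t port_values)
{
    /* Guard against overflow of the 64-bit product. */
    assert(port_values != 0 && id_values <= UINT64_MAX / port_values);
    return id_values * port_values;
}
```

With a fixed source port, `spoof_search_space(65536, 1)` is 65,536, so an attacker expects to succeed after roughly half that many packets. With a random port, `spoof_search_space(65536, 65536)` is 4,294,967,296, which is where "billions" comes from.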
A few DNS clients ignored the specification because it was wrong, but a few DNS clients stood their ground and insisted that it was specified, so it was right. I remember the smear campaign referring to one of those secure DNS clients as "not standard compliant". Brutish, but it was effective: some people remained insecure for a very long time simply because they trusted the standardization process as right and true.