
Unfortunately, having this kind of function doesn't fix the real problem, which is those certain kinds of programmers who won't let a single machine cycle be wasted.

Here's just a completely random example from the Linux kernel: https://lkml.org/lkml/2014/8/19/194

    > You are implying that such an overflow could arise in this code.
    > This is false.
    [...]
    > bNumEndpoints is an unsigned 8-bit integer.
So instead of changing the code to be obviously correct (wrt. multiplication overflow), we keep the old code, whose correctness becomes clear only after the programmer has looked up the types of the variables involved.

This mindset is exactly why vulnerabilities in C code are still common. It really needs to stop.
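
For context, the function in question is reallocarray(3). A minimal sketch of how it guards the multiplication, modeled on OpenBSD's implementation (my_reallocarray is a stand-in name):

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* If both operands are below 2^(bits/2) the product can't overflow,
     * so the (comparatively expensive) division is usually skipped. */
    #define MUL_NO_OVERFLOW ((size_t)1 << (sizeof(size_t) * 4))

    void *
    my_reallocarray(void *optr, size_t nmemb, size_t size)
    {
        if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) &&
            nmemb > 0 && SIZE_MAX / nmemb < size) {
            errno = ENOMEM;
            return NULL;
        }
        return realloc(optr, size * nmemb);
    }

The point is that the overflow check lives in one audited place instead of being re-derived (or skipped) at every call site.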




One of the main criticisms I hear about OpenBSD is that its performance isn't as good as Linux's. I've seen a couple of articles quoting benchmark comparisons, so I'll assume for the sake of argument that it's true.

The thing is that I don't care. I'm still an OpenBSD advocate and run it whenever I get the chance. I don't actually care about performance at that level. What matters more to me is consistency and good man pages.

I imagine many people are like me, but the thing about performance is that it's quantifiable: it can be easily measured. We all probably over-optimize to some degree, whether we're C developers or not.

There's just something satisfying about getting a quantifiably better score. This is why we get code that sacrifices almost everything for runtime speed.


The slowness generally isn't impacted much by stuff like this. It's more things like OpenBSD's poor multicore performance that are the issue. Just compare the performance of FreeBSD's multithreaded PF to OpenBSD's single-threaded PF. Still, none of this matters, because there's one major guarantee that the OpenBSD devs try to fulfil: correctness.


I suppose there are probably large groups of people who care about multicore performance on the metal, but my dominant use case is herds of single-core VMs, each doing one thing (well!). In this case, host CPU and memory consumption matter, and OpenBSD is very hard to beat on this metric. My OpenBSD VMs normally consume 1/4 of the CPU time of my FreeBSD, Linux and Windows VMs, and run happily in half the RAM. I don't know if there is a significant performance difference on a single CPU, but I certainly don't have any complaints about performance.


The thing is, the "slowness" in question is different every time someone brings it up, because people don't benchmark it, don't actually know what is slower or by how much, and just go on the "common knowledge" that OpenBSD is slow. That meme was already going strong in the 1990s, when OpenBSD didn't support SMP at all and 99% of the people talking about it were using a single CPU.


It's different because the slowness is across the board.

FreeBSD and, especially, Linux have optimizations all over the place. A call to time(2) or gettimeofday(2) is hundreds of times faster on Linux than on OpenBSD. In most scenarios the difference doesn't matter, but it does matter in, e.g., an asynchronous network service with lots of events with timeouts being constantly registered.
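
This is easy to measure yourself; a rough sketch (absolute numbers depend heavily on hardware and kernel version, so treat the output as illustrative only):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        enum { N = 10000000 };
        struct timespec start, now, end;

        /* On Linux this is typically serviced by the vDSO with no
         * kernel entry; on OpenBSD each call is a full syscall. */
        clock_gettime(CLOCK_MONOTONIC, &start);
        for (int i = 0; i < N; i++)
            clock_gettime(CLOCK_MONOTONIC, &now);
        clock_gettime(CLOCK_MONOTONIC, &end);

        double secs = (end.tv_sec - start.tv_sec)
            + (end.tv_nsec - start.tv_nsec) / 1e9;
        printf("%.1f ns per call\n", secs / N * 1e9);
        return 0;
    }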

I prefer OpenBSD, and I like that they focus on correctness, bugs, and more high-level features. I run all my personal services on OpenBSD. But commercially and at scale I run Linux, and treat each box as a throw-away in terms of security.


>A call to time(2) or gettimeofday(2) is hundreds of times faster in Linux than on OpenBSD

Got a link to those benchmarks? Did anyone file a bug report?

>But commercially and at scale I run Linux

Wouldn't netbsd be a better option?


I doubt it's a bug. That sounds like the overhead of a syscall vs. a memory-mapped page.


How does that make it not a bug? They've repeatedly responded to the "openbsd is slow" meme by pointing out that they do not have the time to benchmark anything, and if people do find slow things they should submit bug reports so they can be fixed. Nobody has ever submitted such a bug report.


I mentioned it on the mailing list a long time ago, when I was more naive about these things. They said I should submit a patch. However, adding virtual syscalls is a non-trivial undertaking, and I'm not experienced enough with kernel internals; it would require wide-ranging changes, including adding special mapping support, hacking the program loader, and tweaking the clock subsystem.

Plus you also have to hack userspace. Most importantly, you need to add VDSO support to the linker. Refactoring time(2) would be easy, but for higher-granularity clocks--gettimeofday(2) and clock_gettime(2)--you need to make careful use of RDTSC and similar instructions for each architecture. _And_ you need to be able to test it well because of synchronization issues on older platforms. On x86, for example, RDTSC is synchronized, but for a long time it varied across cores, and when Intel and AMD fixed that there was still a small product window where it could vary across CPU packages.
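
For the curious, the raw counter read itself is the easy part (x86-64, GCC/Clang); everything above is about what has to surround it:

    #include <stdint.h>

    /* Reading the counter is a one-liner; turning ticks into wall time,
     * and trusting the counter across cores and packages, is where the
     * wide-ranging changes come in. */
    static inline uint64_t
    rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }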

It really is an intrusive change.


It's as fast as any other syscall, it's just lacking in special optimizations.


And the part where the developers explicitly say being slower than necessary is a bug and should be reported as such means it is a bug. This honestly is not complicated.


If slower than absolutely necessary is a bug I can probably report 99% of lines in the codebase. Perfection is astonishingly difficult.


I genuinely don't see how I can take your continued trolling at face value and pretend you are honestly trying to have a conversation.


My favorite thing with the OpenBSD folks isn't that they add APIs that avoid classes of security bugs but rather that they then ruthlessly update their code base to rely on the safer version.


Won't the wrapper be inlined and the condition branch-predicted? In that case, you literally aren't letting a single machine cycle be wasted.


Well, you might waste a cycle on the first run (or any run after the branch has been flushed from the branch prediction table), as well as wasting cycles elsewhere by clobbering the table.

One solution to both of these problems that I have thought about, though I don't work at a low enough level to know if it is already a thing, is an opcode for unlikely branches. Semantically, this would behave like a normal jmp instruction, but the CPU branch predictor would always assume that the jump is not taken, and not bother keeping track of it.

In this particular case, it would also involve a way of signalling to the compiler (or the compiler inferring it), but for loops it could be useful.

Of course, in this case it is also possible that the compiler can infer that the branch can never be taken and remove it, so you really do not lose anything.


> One solution to both of these problems I have thought about, but do not work at a low enough level to know if it is already a thing, is an opcode for unlikely branches.

x86 actually does have jump instructions with branch prediction hints, but, surprisingly, modern CPUs completely ignore those hints. Perhaps they never caused large enough performance improvements to be worthwhile, or something.

> In this particular case, it would also involve a way of signalling to the compiler (or the compiler inferring), [...]

This functionality does exist, and Linux makes use of it. See the likely() and unlikely() macros at http://lxr.free-electrons.com/source/include/linux/compiler....
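
They boil down to __builtin_expect(); a sketch of the definitions and a made-up use (checked_mul is a hypothetical example function):

    #include <stddef.h>
    #include <stdint.h>

    /* Mirrors the kernel's definitions; __builtin_expect() is the
     * GCC/Clang primitive underneath. */
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    /* Mark the overflow branch cold so the hot path falls straight
     * through. Returns 0 on overflow. */
    static size_t
    checked_mul(size_t nmemb, size_t size)
    {
        if (unlikely(nmemb != 0 && SIZE_MAX / nmemb < size))
            return 0;
        return nmemb * size;
    }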


This solution was often part of http://en.wikipedia.org/wiki/Very_long_instruction_word architectures.

Itanium had the exact instructions you propose - look up ".dpxx" and ".spxx" for more.


I suspect nobody playing with the USB code would have to look up the type of bNumEndpoints. The `b' prefix tells you it's a byte (this is a common thing in the USB spec), and you'll probably know it's a byte anyway due to familiarity...

Another way of looking at the matter is that you shouldn't change code that's already correct.


> I suspect nobody playing with the USB code would have to look up the type of bNumEndpoints.

I could very well imagine a security auditor (who is not a USB expert) reading this code.

> Another way of looking at the matter is that you shouldn't change code that's already correct.

I personally find this kind of attitude extremely worrying. Perceptions about what is bad/good code or secure/insecure code have definitely changed since the first days of Unix. We shouldn't let code slip through the cracks just because it's old.


Overflows can crop up all over the place in C: you have to instrument every single +, - and * operator instance with overflow checks, each one of which needs a sane recovery strategy.

Some overflows indicate internal errors that should panic the kernel. Others are configuration or resource issues that can bubble up some errno value to user space. Some can have a fallback strategy: this calculation overflowed, try another one. Some overflows can be handled by "saturation" (a.k.a clamping): common in digital signal processing and supported natively in some DSP machine languages.
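
For example, the saturation strategy is a one-liner for unsigned addition (a sketch):

    #include <stdint.h>

    /* Saturating ("clamping") add: on wrap, return the maximum instead.
     * Unsigned overflow is well-defined, so s < a reliably detects it. */
    static uint32_t
    sat_add_u32(uint32_t a, uint32_t b)
    {
        uint32_t s = a + b;
        return s < a ? UINT32_MAX : s;
    }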

In other words, a silly patch that fixes a non-problem in one tiny place in a huge codebase doesn't accomplish anything. It's a big, wide-reaching task.


You don't have to instrument every single arithmetic operation. Multiplicative overflow can only happen if the value of one or the other operand is >= 2^(N/2), where N is the number of value bits.

On 64-bit machines, this means if you use 32-bit values and properly promote them, or one 32-bit value and one 64-bit value, you don't have to instrument.
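
A sketch of that promotion trick (hypothetical helper name):

    #include <stdint.h>

    /* Both operands are below 2^32, so the product is below 2^64 and
     * the widened multiply cannot overflow; check the result once,
     * afterwards, if it must fit a narrower type. */
    static uint64_t
    mul_u32(uint32_t a, uint32_t b)
    {
        return (uint64_t)a * b;
    }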

Also, oftentimes the better strategy is to design your code to be immune to overflow. Garbage in, garbage out: if somebody gives you garbage, they can expect to get garbage out. All that matters is that you don't allow that garbage to taint the state of your program. I usually use unsigned arithmetic, because then I don't need to worry about undefined behavior in C. And for general bounds-checking tasks my only condition is whether something is >= object-size.
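
Concretely, the kind of single check I mean (a sketch; range_ok is a made-up name):

    #include <stdbool.h>
    #include <stddef.h>

    /* One bounds check against the object size instead of instrumenting
     * every operator. Both comparisons happen before any arithmetic
     * that could wrap. */
    static bool
    range_ok(size_t off, size_t len, size_t objsize)
    {
        return off <= objsize && len <= objsize - off;
    }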

Some of us rigorously analyze our code for overflow problems.

Those of you who don't are like the programmers from 20 years ago who thought that it was too tedious to do bounds checking on buffers, and that because of all the existing code it wouldn't improve security.

Well it did matter. Like with buffer overflows, it's a collective action problem. The dilemma you point out doesn't exist if you stop being so resistant to change.


Your comments basically state the same thing: "we avoid these problems in C by conscientiously choosing the data representations and defending against the problems while creating and implementing the original design" which is true. But this is different from going back into a large and complex C program whose design and implementation neglected the issues.

Fixing some realloc calls is possible only because the function has an error return code, so when the overflow occurs in the multiplication, it can "piggyback" on that null return. The hope is that the caller is already handling null properly and so everything is cool. (Is it? Was a test case ever executed in which that particular realloc call returned null?)

In general, multiplications in C take place in situations where there is no expectation of an error, and so no error handling strategy in place in the surrounding code. It may be difficult to introduce it, and even more difficult to test all the cases.

> I usually use unsigned arithmetic, because then I don't need to worry about undefined behavior in C.

That is wrongheaded, I'm afraid. The unsigned types behave poorly under arithmetic, because they have a large discontinuity immediately to the left of zero. And because their wrapping behavior is well-defined, it basically means that unintended wrapping is not diagnosable as an error (without generating reams of false positives).
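
The discontinuity in question:

    #include <stdio.h>

    int main(void)
    {
        unsigned int n = 0;
        n = n - 1;          /* well-defined wrap: n is now UINT_MAX */
        printf("%u\n", n);  /* 4294967295 on typical platforms; no
                             * diagnostic is possible, by definition */
        return 0;
    }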

It is unfortunate that size_t is an unsigned type. This is for historic reasons. When C ran on machines where size_t was 16 bit, making it signed would have meant limiting the size of an allocation request to 32767, rather than allowing up to 65535. That would have hurt, and induced people into workarounds such as writing their own allocators. (Which, ironically, they do anyway, to this day.)

Unsigned types are best used for things like bit fields, and simulating the arithmetic of particular machines. They are also useful for doing clock arithmetic on a wrapping interrupt timer register. (This is best done with macros or functions that safely wrap it, like "time_is_before(t1, t2)".)
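
A sketch of such a wrapper, using the name from above:

    #include <stdbool.h>
    #include <stdint.h>

    /* Correct across counter wrap-around, provided the two timestamps
     * are within half the counter's range of each other. (The signed
     * conversion is implementation-defined in ISO C but two's
     * complement in practice; the Linux kernel uses the same idiom in
     * <linux/jiffies.h>.) */
    static bool
    time_is_before(uint32_t t1, uint32_t t2)
    {
        return (int32_t)(t1 - t2) < 0;
    }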

"unsigned char" is useful, but note that values of this type undergo default promotion (on most compilers, by and large) to a signed value of type int.


Signed overflow is undefined in C. Period. That means that detecting overflow for signed types is really messy.

For addition, one method is to check the operands against the maximum possible positive value minus one of them (e.g., a > INT_MAX - b) before the operation. The only way to get that value is a predefined macro, which means the logic for your check is divorced from the actual operation--in other words, if the type of the operands changes, your check becomes _silently_ worthless.
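
That method looks like this (a sketch):

    #include <limits.h>
    #include <stdbool.h>

    /* Pre-check before a signed add; the overflowing operation itself
     * is never performed. Note the problem described above: if a and b
     * ever become longs, this still compiles and silently checks
     * against the wrong limits. */
    static bool
    add_overflows(int a, int b)
    {
        return (b > 0 && a > INT_MAX - b) ||
               (b < 0 && a < INT_MIN - b);
    }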

The alternative exhibits at least _two_ bad security practices: 1) an independent constraint on input divorced from the risky operation is a recipe for bugs because you have no warning when the conditions change such that your input verification fails to properly constrain the values; and more generally, 2) relying on assumptions about a type, rather than relying on the actual behavior of a type or information derived directly from the type, is especially bad form in C because of its loose typing.

I think I'll stick with my strategy, which has worked out well, doesn't rely on undefined behavior, and is less susceptible to bit rot.

Regarding your objection of making changes to existing code, I would just say that we shouldn't let the perfect be the enemy of the good. I also like tedunangst's description of that predisposition: whataboutism.


> In other words, a silly patch that fixes a non-problem in one tiny place in a huge codebase doesn't accomplish anything.

I don't understand this line of thinking at all.

"reallocarray()/kmalloc_array() doesn't fix all the possible integer overflow issues in C, so why bother using it at all."

Several people in this discussion have mentioned exact vulnerabilities that would have been prevented by reallocarray():

meowface: "Here is one of many examples of an integer overflow resulting in an improper size being passed to a memory allocation function further leading to remote code execution: https://docs.google.com/document/d/1tHElG04AJR5OR2Ex-m_Jsmc8... "

tedunangst: "You've never heard of the one openssh exploit?"


I took reallocarray() and applied it to a program I'm working on which contains a bunch of realloc calls. It was completely useless; I had to invent a different interface.

It was useless because several calculations take place in every instance to produce the size, whereas reallocarray() only checks one multiplication.
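
To illustrate with a made-up example (struct hdr, grow and payload are hypothetical): the final multiply is checked, but a preceding addition in the size expression can already have wrapped before reallocarray() ever sees it:

    #include <stdlib.h>   /* reallocarray(3): native on OpenBSD;
                           * needs _DEFAULT_SOURCE on glibc */

    struct hdr { size_t len; };

    /* reallocarray() checks n * elemsize, but if payload is attacker-
     * controlled, sizeof(struct hdr) + payload can wrap first, and the
     * check then passes with a tiny element size. */
    void *
    grow(void *p, size_t n, size_t payload)
    {
        return reallocarray(p, n, sizeof(struct hdr) + payload);
    }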


The discussion has changed; it is about the pointless patch to some USB code where an 8-bit endpoint count is multiplied by the size of some structure.



