
A little C tip I learned from hard experience

NEVER, EVER, NOT IN A MILLION YEARS use a signed int/char etc, unless you are 200% certain you're doing the right thing (that is, it's for something specific you need it)

You WILL have problems, period.

"Oh it's just a matter of knowing the C spec" then please go ahead as I grab the popcorn.

"For something specific you need it" meaning ... a negative number, like an array or memory address offset? I mean, sure, I agree that you should only be doing anything sensitive to 2's complement behavior on unsigned quantities. And if you know the values are positive-definite, unsigned has much cleaner behavior. And I'd even entertain the argument that unsigned should have been the default in the language spec and that signed quantities are the ones that should have specially declared types.

But... your advice as written is just insane. Signed values are real, and required routinely. You can't just apply this as a "for dummies" rule without essentially ruling out half of the code they'll need to write.

"... a negative number, like an array or memory address offset?"

A negative number, yes, but not really a memory offset (you shouldn't mix negative numbers and memory offsets, really)

But yeah, if you're doing "math" go for it, but it's only on rare occasions that you need negative numbers (subtraction, yes)

The most common case I remember may be sound samples, where you have signed chars.

For all other cases you would be using floating point or decimal numbers

> For all other cases you would be using floating point or decimal numbers

If I'm trying to avoid mathematical anomalies, floating point is not what I would run to... "Equal" is a matter of degree, you have to be careful with anything near zero and you can't carelessly mix together numbers that are a few orders of magnitude different than each other.


But for most of "math" you would go for floating point. You won't reinvent some fixed point math using integers just because...

signed usually promotes to unsigned in nice ways, such that if you really want to store -1 in your unsigned, it will just work. I've found using all unsigned types, with the occasional signed value jammed into them, is less error prone than mixing and matching types throughout the program. ymmv, and of course, here be the odd dragon or two.

My goodness, no, this is terrible advice. Never do this. Go fix all the code you wrote immediately. It is full of security vulnerabilities. I'm not kidding, this is so bad.

I'm with the OpenBSD developer on this one.

I'm pretty confused. I didn't know who Ted was, but a quick google search shows he is an OpenBSD dev and worked for Coverity. Coverity itself will flag this error. Now you are backing up that position. Historically, this exact thing has been the cause of many security vulnerabilities. It's especially precarious with the POSIX API due to many functions returning negative for error. I recall OpenSSH making sweeping changes to rid the code of signed ints being used for error, for this reason.

Can you explain why you would advocate this? Am I misunderstanding you, or missing something?

I replied to the other comment in this thread with an openbsd vulnerability caused by doing what is being advocated (I did choose openbsd to be funny).

Looks to me like the vulnerability you linked to demonstrates the exact opposite of what you think it demonstrates: The problem is that an unsigned value (a length) was being stored in a signed integer type, allowing it to pass a validation check by masquerading as a negative value.

Well, no, select() takes a signed value for the length (it is actually not a length, but the number of descriptors, later used to derive a length), and there is no changing that interface obviously. This is the source of the "negative value" in this example. The problem arises because internally, openbsd "jammed a negative value into an unsigned int", as Ted put it, and made it a very large positive value, leading to an overflow.

If the bounds check was performed after casting to unsigned, there would have been no problem. The vulnerability occurred because a bounds check was incorrectly performed on a signed value.

Can you give some examples of why you think this is more prone to security vulnerabilities than using signed types?

Ah, thanks. From reading the summary, it seems that case would have been prevented by using signed integers throughout, but would also have been prevented by using unsigned integers throughout?

I've seen more problems from unsigned ints than signed ints (in particular, people doing things like comparing with n-1 in loop boundary conditions). There's a reason Java, C# etc. default to signed integers. Unsigned chars, I have no quibble (and Java, C#, use an unsigned byte here).

Unsigned integer overflow has defined behaviour in C, while signed overflow doesn't. Is it really better to protect people from a simple logical error by exposing them to possible undefined behaviour?

With signed integers, you'll run into the same problem with comparing to n+1 at INT_MAX or n-1 at INT_MIN.

0 is a really common value for an integer variable in programs. INT_MAX and INT_MIN are not.

It's just my experience. Don't get too wound up about it ;)

"in particular, people doing things like comparing with n-1 in loop boundary conditions"

There's your problem: that's like saying cars are more dangerous than motorcycles because your finger can get squeezed by the door.

"There's a reason Java, C# etc. default to signed integers"

Legacy? And in Java/C# you usually use ints, not so much chars, shorts, etc., and casts are probably more picky

I stand by my point, you should only use signed if you know what you're doing and for a specific use only (like math)

Signed arithmetic is generally only problematic when you hit the upper or lower bound. The right answer is almost never to use unsigned; instead, it's to use a wider signed type.

It's far too easy to get things wrong when you add unsigned integers into the mix; ever compare a size_t with a ptrdiff_t? Comes up all the time when you're working with resizable buffers, arrays, etc.

And no, Java did not choose signed by default because of legacy. http://www.gotw.ca/publications/c_family_interview.htm:

"Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up."

Unsigned is useful in a handful of situations: when on a 32-bit machine dealing with >2GB of address space; bit twiddling where you don't want any sign extension interference; and hardware / network / file protocols and formats where things are defined as unsigned quantities. But most of the time, it's more trouble than it's worth.

"The right answer is almost never to use unsigned; instead, it's to use a wider signed type."

Depends on the case of course, but yes, you'll rarely hit the limits in an int (in a char, all the time)

"It's far too easy to get things wrong when you add unsigned integers"

I disagree, it's very easy to get things wrong when dealing with signed

Why? For example, (x+1) < x is never true on unsigned ints. Now, think x may be a user provided value. See where this is going? Integer overflow exploit

Edit: stupid me, of course x+1 < x can be true on unsigned. But unsigned makes it easier (because you don't need to test for x < 0)

"what unsigned arithmetic is"

This is computing 101, really (well, signed arithmetic as well). Then you have people who don't know what signed or unsigned is developing code. Sure, signed is more natural, but the limits are there, and then you end up with people who don't get why the sum of two positive numbers is a negative one.

As you admit, you just made a mistake in an assertion about unsigned arithmetic. You're not very convincing! ;)

As I said, if I'm not convincing please go ahead and used signed numbers as I get the popcorn ;)

Here's something you can try: resize a picture (a pure bitmap), with antialiasing, on a very slow machine (think 300MHz VIA x86). Difficulty: without libraries.

Java actually uses a signed byte type, which I guess was to keep a common theme with all the other types, but in practice it leads to a lot of b & 0xFF masking, upcasting to int, etc. when dealing with bytes coming from streams.

There's a certain aspect of "functions have domains, not just ranges" at work here as well -- e.g. restricting the (math) tan() function to the domain of -90 to 90 degrees (exclusive), unless you really get off on watching it cycle over madness. If you are going to be playing around the edges of something, it behooves you to put some kind of pre-condition in with an assert or similar mechanism.

In fairness, I guess a function like this is a good example of why you should put in preconditions, as well as a good demonstration that "not all the world is a VAX" (nor MS C 7, nor GCC version N) :-)

Actually, a colleague recently convinced me to start using signed ints for, e.g., for loops instead of unsigned ints. His reasoning was that if you're overflowing with signed integers, you'll probably overflow with unsigned integers too (we work with very large numbers), but it's easier to notice if you have a negative number rather than silently wrapping around to a still-valid value.

99% of the time, you should not be using any sort of int for a loop variable. Loop variables are almost always array indexes, and array indexes should be size_t.

Google's C++ style guide strongly recommends using signed ints over unsigned ints:


It's not just the C spec you've got to watch. I saw a wonderful bug last week where the author hadn't spotted that write(2) returned a ssize_t, not a size_t (or didn't appreciate the difference), so was gleefully checking an unsigned variable for a -1 error result.

How did the bug manifest itself? You can store 0xffffffff(ffffffff) in a 32(64)-bit unsigned int, or a 32(64)-bit signed int. In the one case you'll get UINT_MAX, -1 in the other, but they should compare as equal. If you have -Wextra turned on, gcc will give a warning, though.

Here's some sample C code tested on a 32-bit Linux system:

  #include <stdio.h>

  int main(int argc, char *argv[])
  {
          unsigned int val1 = 0xffffffff;
          printf("val1 == -1: %d\n", val1 == -1);
          return 0;
  }
The result:

  val1 == -1: 1

It works for == -1, but not for < 0, which is a common way to check for error returns from UNIX calls.

Any decent compiler should warn that such a check is always false, but people don't always pay attention to that stuff....

C is my absolute favorite language, and as such, I learned a long time ago to pay very close attention to compiler warnings and Valgrind memory errors.

Sadly, a lot of people have never learned this valuable lesson, and happily build their code with hundreds or even thousands of warnings.

Got it in one :-)

I'm rusty on C, and confused. What should I use instead of signed int?

unsigned int :)

and for negative numbers?

In that case, you can be pretty certain that you need a signed integer.

What if you only need to store negative integers?

Isn't that the same situation as only storing positive integers? The arithmetic is just a little different.

No, you get room for one more negative int. x = x < 0 ? -x : x; is a common buggy abs(), e.g. when printing an integer. One should test if it's negative, perhaps print the '-', and then ensure it's negative for the rest of the per-digit code.
