Issues with porting C++ code to a 64-bit platform (viva64.com)
35 points by AndreyKarpov on April 1, 2013 | 44 comments



I had issue #2 (varargs) bite me once on a C function where a NULL pointer was used as a sentinel.

In C++ we normally use '0' as the name of the null pointer, since C's NULL macro ((void*)0) normally gives compilation errors in C++ (though it would have worked here). And of course, who knows what the NULL macro is defined to in C++ mode; it could be plain '0' or a compiler-specific type.

With varargs '0' simply translates to int 0. That is, 32-bit int 0, whereas the function was expecting a 64-bit null pointer. So on 32-bit mode it happened to work fine, but crashed when compiled for 64-bit platforms.
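
A minimal sketch of the failure mode, with made-up names; the real function just used a null pointer as the end-of-arguments sentinel:

    #include <cstdarg>
    #include <cstdio>

    // Counts the char* arguments up to a null-pointer sentinel.
    static int count_strings(const char *first, ...)
    {
        int n = 0;
        va_list ap;
        va_start(ap, first);
        for (const char *p = first; p != 0; p = va_arg(ap, const char *))
            ++n;
        va_end(ap);
        return n;
    }

    int main()
    {
        // The literal 0 is passed as a 32-bit int, but va_arg reads back a
        // full 64-bit pointer, so the upper 32 bits are whatever happens to
        // be lying around. This "worked" on 32-bit builds and broke on 64-bit.
        std::printf("%d\n", count_strings("a", "b", 0));                 // undefined behaviour
        std::printf("%d\n", count_strings("a", "b", (const char *)0));   // OK: a real null pointer
        return 0;
    }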


Huh. I'd always thought C++ shared C's definition of NULL, but looking I see that it doesn't. What a weird thing to change.

At least they've added some safety with nullptr_t.


C's definition would be incredibly unpleasant to use with no implicit conversions from void *. If you had to write things like int * x = (int *)NULL I'm pretty sure everyone would just use 0.


I'm not convinced that the explosion of specialized types benefits anyone but language lawyers. Some of us actually have work to do, and NULL should mean NULL.

Using '0' for null pointers as mpyne suggests is less desirable because it robs the reader of context ("Is this a pointer being compared to 0, or a numeric type?"). That could be worked around with Hungarian notation, but some would argue that the cure is worse than the disease.


But before nullptr, NULL didn't portably mean "a null pointer". It meant "0", just as if you'd typed in "0" rather than "NULL". The language doesn't really give the library writer much choice in the matter! (Defining NULL as "((void *)0)" would at least give it an inherently pointer type, but it would be terribly inconvenient to use.)

This is the whole reason for nullptr in the first place.


Right, I'm saying that as long as they were going to redefine NULL as part of the C++ standard anyway, they should have redefined it as ((void *) 0). That would be portable, at the expense of breaking code that used NULL in integer comparisons.


C++ doesn't have an automatic conversion from void* to T* so having NULL as ((void*)0) wouldn't have worked.
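
For illustration, here is roughly what would break (MY_NULL is a stand-in macro so as not to fight the real one):

    #define MY_NULL ((void *) 0)    // hypothetical C-style definition of NULL

    int main()
    {
        // int *p = MY_NULL;        // fine in C, but a compile error in C++:
        //                          // no implicit conversion from void* to int*
        int *p = (int *) MY_NULL;   // C++ forces the cast, which defeats the point
        return p != 0;
    }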


True, but again, the whole problem is all of these weird, random changes and exceptions that only a small subset of programmers can keep in their heads. If you're going to go down that path, you should pick safe weird, random changes and exceptions.

In this case, they should have turned NULL into a first-class keyword that would parse as an invalid pointer of any given type, and called it "done." In the original C spec NULL wasn't supposed to be a pointer at all -- it was just a more polite way to write "zero." As endless Internet debates have shown, it turns out that it would have been better to reserve it for use in pointer expressions.


Turning a macro defined in probably hundreds of header files into a keyword would be a very bad idea for backward compatibility. It would have been nice if C had reserved it for use as a pointer and C++ had been coded from the beginning to allow NULL to be defined to any pointer type, but that wasn't how it ended up.

But what library vendors can do now is to simply

    #ifdef __cplusplus
    #if /* version check for C++11 */
    #undef NULL
    #define NULL nullptr
    #endif
    #endif
And that way there's less to memorize, but it still requires the code itself to opt in, instead of the language having to figure out what insanity NULL may have been defined to.


Heh. Sometime around '94 I helped Southwestern Bell Telephone with that during the 16 bit to 32 bit transition of a Mac client.

This means that sometime around the mid 2030's some poor programmers will have to take a break from worrying about Y0x80000000 to get bit by the x86-128 varargs bugs.


I'm not sure what you mean by the 16-to-32 transition on the Mac. Are you referring to the 68k-to-PowerPC transition?


Initially, I thought it was the transition to 32-bit-clean code in the 68k line, but that was more of a 24-to-32 transition, and it only affected pointers:

http://lowendmac.com/trouble/32bit.shtml


And this is why you use nullptr, not NULL.


Yes, this is a great reason to have nullptr, but when this occurred there wasn't even such a thing as -std=c++0x. ;)


Good things to keep in mind. Especially for a generation of programmers who didn't go through the 16 bit -> 32 bit transition. (Trust me, you do not want to write C code on an 80286)


holy shit far pointers

I just flashed back, man


Don't worry, it's just PSASS [1]

[1] Post Segmented Architecture Stress Syndrome


Shit just got unreal... mode!


and DOS Extenders, namely DOS/4G(W).


> But while trying to store a 64-bit integer in double the exact value can be lost (see picture 1).

I've seen that first hand. In fact, I wrote code to fix it. The guy who wrote the code still did not get it... 64 bits is 64 bits he said, right... well yes, but that's not the issue here. When you have an integer value that is, say, 56 bits in size and you put it in a double that is 64 bits... see what happens:

    #include <iostream>
    #include <boost/integer.hpp>

    int main()
    {
        boost::uint64_t too_big = 72057594037927936;
        double wont_fit = too_big;
        std::cout << too_big << "\n";
        std::cout << wont_fit << "\n";
        return 0;
    }

    ./a.out
    72057594037927936
    7.20576e+16

The maddening part about this is that it's hit or miss. Smaller numbers fit just fine:

    #include <iostream>
    #include <boost/integer.hpp>

    int main()
    {
        boost::uint64_t not_too_big = 281234;
        double will_fit = not_too_big;
        std::cout << not_too_big << "\n";
        std::cout << will_fit << "\n";
        return 0;
    }

    ./a.out
    281234
    281234

Finding and fixing bugs like this will cause ulcers.


Your first example works just fine with increased output precision. https://ideone.com/GtMDit A slightly larger number shows the discrepancy: https://ideone.com/WPtpX4
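
Something like this shows both cases side by side (the values are mine, picked to sit on either side of the 53-bit mantissa limit):

    #include <iostream>
    #include <iomanip>
    #include <cstdint>

    int main()
    {
        std::uint64_t exact   = 72057594037927936ULL;   // 2^56: a power of two, exactly representable
        std::uint64_t inexact = 72057594037927937ULL;   // 2^56 + 1: needs 57 significant bits

        std::cout << std::setprecision(20)
                  << static_cast<double>(exact)   << "\n"   // prints 72057594037927936
                  << static_cast<double>(inexact) << "\n";  // prints 72057594037927936 again: the +1 is lost
        return 0;
    }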


> 64 bits is 64 bits he said, right... well yes

That's the problem right there. 64 bits is not always 64 bits. While a 64-bit int gives you a 64-bit number (actually a 63-bit number and a 1-bit sign flag), an IEEE double-precision float only gives you a 52-bit number, plus a separate 11-bit number as an exponent. So declaring that 64 bits is 64 bits is precisely the issue here.


> Actually a 63 bit number and a 1 bit sign flag

Not true. Modern architectures use 2s-complement rather than an explicit sign bit to represent integers. If you use a sign bit, there are two different zeros (positive and negative) but the range of positive numbers is exactly as large as the range of negative numbers. With 2s-complement, there's only one zero and the range of negative numbers is larger (by one) than the range for positive numbers.
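
A concrete way to see that asymmetry (assuming the usual 32-bit int):

    #include <climits>
    #include <cstdio>

    int main()
    {
        // Two's complement: a single zero, and one more negative value than positive.
        std::printf("INT_MIN = %d\n", INT_MIN);   // -2147483648
        std::printf("INT_MAX = %d\n", INT_MAX);   //  2147483647
        return 0;
    }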


In 2s complement, the most significant bit is the sign bit. In fact all valid C integer representations have a sign bit.


Signed integer representations. Unsigned ints do not, and you're given the full range of the integer as valid data.


Yes you're right. I should have been more specific that all signed integers have a sign bit.


This is wrong.

Take an integer value x. Flip the sign bit. Do you now have the value -x? On a 2s complement architecture, you do not. In 2s complement, you have to flip all the bits and add one to get -x.


Sign bit means one bit represents the sign of the integer which is the case with the most significant bit in 2s complement.

If you flip the sign bit, the sign changes. It does not mean that if you flip the sign, the value is negated.


I think his point is that changing a sign bit doesn't affect the absolute value, just its sign. A sign bit has no value itself - it's just a flag. If the representation has a sign bit, you'd have a negative zero.

But the top bit in 2s complement has a value - it's just a negative one (that's large enough to make any value with it set negative). That's not a sign bit! If you change it, the absolute value most definitely changes, and quite substantially.


This is a technicality. The leftmost bit still denotes a sign, and the magnitude of the number is still capped at n-1 bits.


The differences between the representations matter. In 2s complement, -INT_MIN is not INT_MAX; in fact, -INT_MIN is undefined and a C compiler is justified in deleting all your code if you ever cause it to calculate -INT_MIN.

I tell you what: if you think modern architectures use an explicit sign bit to represent integers, how do you explain the fact that INT_MAX+1==-INT_MIN? If there's a sign bit, how could you possibly represent more negative integers than positive integers?


Are you arguing that given such, I have more than 63 bits for the number, or do you just have a problem with the usage of the words "sign flag"?


One way to avoid loss of precision in the conversion is to check whether the number of digits in the integer is larger than the constant DBL_DIG (defined in <cfloat>) -- numbers with no more digits than that constant can be safely converted to the 'double' type and back.
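
Roughly, that check could look like this (the helper name is mine):

    #include <cfloat>
    #include <cstdint>
    #include <cstdio>

    // True if v has no more decimal digits than DBL_DIG, i.e. it is
    // guaranteed to survive a round trip through double unchanged.
    static bool round_trips_through_double(std::uint64_t v)
    {
        int digits = 1;
        while (v >= 10) { v /= 10; ++digits; }
        return digits <= DBL_DIG;   // DBL_DIG is 15 for IEEE doubles
    }

    int main()
    {
        std::printf("%d\n", round_trips_through_double(281234));                // 1
        std::printf("%d\n", round_trips_through_double(72057594037927936ULL));  // 0 (17 digits)
        return 0;
    }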


Note that DBL_DIG is the number of decimal digits where every digit can be a 9. If the maximum integral value storable in a double were 4,000,000, DBL_DIG would only be 6. It is a conservative limit.


I happen to be working on some 32/64 bit code right now, and this article has some very useful tips. Pretty sure they mixed up arguments on their memset() call in section 3 though:

  memset(values, ARRAY_SIZE * sizeof(size_t), 0);
should be:

  memset(values, 0, ARRAY_SIZE * sizeof(size_t));
That would not be a fun bug to track down.


Thank you. I will correct it.


Every C++ developer should read this. I've done research on integer arithmetic, which includes some of the pointer arithmetic issues discussed here. This is one of the best articles I've seen on the subject so far. Thanks for sharing.

This is a small detail, but in C99 you can use the "%zu" format specifier for size_t-typed arguments of the printf/scanf functions; is this also true for C++? Another tip is to use the PRI* and SCN* macros defined in the C header <inttypes.h> (<cinttypes> for C++):

http://pubs.opengroup.org/onlinepubs/009604599/basedefs/intt... http://en.cppreference.com/w/cpp/header/cinttypes
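
As far as I know C++11 pulls in the C99 specifiers, so "%zu" works there too (older MSVC runtimes being the usual caveat). A quick sketch of both approaches:

    #include <cinttypes>   // PRIu64 / SCNu64 format macros
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        std::size_t   n = sizeof(void *);
        std::uint64_t x = 72057594037927936ULL;

        std::printf("%zu\n", n);            // size_t, via the C99/C++11 "z" length modifier
        std::printf("%" PRIu64 "\n", x);    // uint64_t, via the <cinttypes> macro
        return 0;
    }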


One thing he doesn't mention is "-fPIC"; I've been bitten by this bug many times: "please re-compile with PIC enabled and try again"!


Is there any reason that compilers don't somehow fix this thing by default?


They can't - you get this error when trying to link other objects, which are already compiled without position-independent code. So the linker can't really fix this.

Also, if you're not doing dynamic linking, code without -fPIC is generally faster and for use-cases where C (and/or C++) is used that can make a difference.


Item #7 is incorrect. You can't make a union of a pointer and an integer type or a union between a number of chars and a larger int type and expect valid results ever -- the optimizer will rip your head off.
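
The usual escape hatch, if you really do need to look at the representation, is memcpy rather than a union pun; a sketch:

    #include <cstdint>
    #include <cstring>

    // Copy the bytes of a 64-bit value out explicitly instead of reading them
    // back through a union member you didn't write last. memcpy is well-defined
    // for this, and compilers optimize the copy away.
    static void bytes_of(std::uint64_t v, unsigned char out[8])
    {
        std::memcpy(out, &v, sizeof v);
    }

    int main()
    {
        unsigned char b[8];
        bytes_of(0x0102030405060708ULL, b);
        return b[0];   // 0x08 on a little-endian machine, 0x01 on a big-endian one
    }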


This is great information for anyone making the 32-bit to 64-bit transition, and at this point you really should be viewing 64-bit as your primary platform if you're writing code for x86(amd64) platforms. Just falling back to 32-bit mode with WoW or similar for your target platform is just inexcusable at this point unless you have very specific legacy support issues.

As someone who has been stepping away from C/C++ code towards Go, I'll add this for anyone in a similar boat:

When interfacing C (via CGO) or C++ (via SWIG) code with Go code the #1 thing to keep in mind with 64-bitness is that int in Go may be either 32-bit or 64-bit depending upon the GOARCH the compiler is targeting. int on the C/C++ side is virtually always 32-bit, even on a 64-bit compiler. Use int32 on the Go side to match int on the C/C++ side if you need your structs to align correctly when passing data back and forth.
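
On the C/C++ side I'd also pin the layout down with fixed-width types and a guard along these lines (the struct is made up):

    #include <cstdint>

    // Shared with the Go side: int32_t pairs with Go's int32, int64_t with
    // int64, so the layout doesn't depend on what "int" means on either side.
    struct Message {
        std::int32_t id;
        std::int64_t length;
    };

    // Catch an accidental 'int' or 'long' creeping in before the struct ever
    // crosses the cgo/SWIG boundary (4 bytes of padding after 'id' is expected).
    static_assert(sizeof(Message) == 16, "Message layout changed");

    int main() { return 0; }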


> This is great information for anyone making the 32-bit to 64-bit transition, and at this point you really should be viewing 64-bit as your primary platform if you're writing code for x86(amd64) platforms. Just falling back to 32-bit mode with WoW or similar for your target platform is just inexcusable at this point unless you have very specific legacy support issues.

Eh, unless the software absolutely needs the addressable memory, there's no real reason to move a legacy codebase to native 64-bit.

The general takeaway is that you shouldn't be relying on primitives that aren't underlying OS/implementation safe. Using things like longs (especially if you're targeting both Linux and Windows) and assumed pointer lengths can bite you in the ass if you're not careful (and there is a special circle of hell reserved for people who think bitshifting on a pointer type is fine). A few minutes doing some preventative #defines can save you a world of hurt, especially when porting to the next largest system.


> Eh, unless the software absolutely needs the addressable memory, there's no real reason to move a legacy codebase to native 64-bit.

Unless it can take advantage of the extra opcodes that are only available on x86-64 hardware. I think AVX is the best example here.

http://en.wikipedia.org/wiki/Advanced_Vector_Extensions

> Using things like longs (especially if you're targeting both Linux and Windows) and assumed pointer lengths can bite you in the ass if you're not careful (and there is a special circle of hell reserved for people who think bitshifting on a pointer type is fine). A few minutes doing some preventative #defines can save you a world of hurt, especially when porting to the next largest system.

Or investigate the types in stddef.h, such as ptrdiff_t:

> It is a type able to represent the result of any valid pointer subtraction operation.

> A pointer subtraction is only guaranteed to have a valid defined value for pointers to elements of the same array (or for the element just past the last in the array).

http://www.cplusplus.com/reference/cstddef/ptrdiff_t/
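
In the same spirit, a tiny sketch of reaching for those types instead of raw long:

    #include <cstddef>    // std::size_t, std::ptrdiff_t
    #include <cstdint>    // std::uintptr_t
    #include <cstdio>

    int main()
    {
        int arr[8] = {0};

        std::ptrdiff_t count = (arr + 8) - arr;                        // result of pointer subtraction
        std::size_t    bytes = sizeof arr;                             // object / allocation sizes
        std::uintptr_t bits  = reinterpret_cast<std::uintptr_t>(arr);  // a pointer carried as an integer

        std::printf("elements: %td, bytes: %zu, 8-byte aligned: %d\n",
                    count, bytes, bits % 8 == 0);
        return 0;
    }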

Trying to mess with #define stuff is only a good idea if you're targeting older or somewhat nonconformant compilers. The includes were written by the compiler authors; trust them to know more about their compiler than you do.



