
C pointers are not hardware pointers - ingve
http://kristerw.blogspot.com/2016/03/c-pointers-are-not-hardware-pointers.html
======
jepler
My "favorite" bit of code that gcc eventually started treating as undefined
(after working for probably literally 20 years up to that point): A 3x4 matrix
type declared as

    
    
        typedef float xform_3d[4][3];
    

was routinely serialized by (I'm taking liberties here, this is certainly not
how the 20-year-old code read!)

    
    
        float *f = &x[0][0];
        for(int i=0; i<12; i++) write_float(f[i]);
    

i.e., the code treated the same storage as a 2D-array of floats and then as a
1D-array of floats in different places.

In the "pointers are just CPU registers" world, this worked fine. But more
clever compilers know that the code dereferences past the end of the array
(e.g., memory at f[4] is always accessed) so the whole loop is undefined
behavior.

We fixed it once we started seeing the compiler warning, so I don't know
exactly how we would have been punished for the undefined behavior in this
case. So hooray for compiler warnings!
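
One way to serialize without indexing past the end of a row, as a minimal sketch. `xform_flatten` is a hypothetical helper name, and this is not necessarily the fix that went into the original code base:

```c
#include <assert.h>
#include <stddef.h>

typedef float xform_3d[4][3];

/* Hypothetical helper: copy the matrix into a flat buffer row by row,
   so every index stays inside the bounds of its own 3-element row.
   Returns the number of floats written. */
static size_t xform_flatten(xform_3d x, float out[12])
{
    size_t n = 0;
    for (size_t row = 0; row < 4; row++)
        for (size_t col = 0; col < 3; col++)
            out[n++] = x[row][col];
    assert(n == 12);
    return n;
}
```

The flat buffer can then be handed to write_float one element at a time, and no subscript ever exceeds the dimension it indexes.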

~~~
dzdt
Seriously? I have seen this code many places, even written it.

~~~
jepler

        #include <stdio.h>
    
        typedef float xform_3d[4][3];
    
        int main() {
            xform_3d x = {
                {1.f, 0.f, 0.f},
                {0.f, 1.f, 0.f},
                {0.f, 0.f, 1.f},
                {0.f, 0.f, 0.f} };
    
            for(int i=0; i<12; i++) printf("%f\n", x[0][i]);
        }
    

with g++ 4.9 (debian jessie -O3 -Wall):

    
    
        xf.c: In function ‘int main()’:
        xf.c:12:50: warning: iteration 3u invokes undefined
                  behavior [-Waggressive-loop-optimizations]
             for(int i=0; i<12; i++) printf("%f\n", x[0][i]);
                                                          ^
        xf.c:12:5: note: containing loop
             for(int i=0; i<12; i++) printf("%f\n", x[0][i]);
             ^
    

(yes, this is slightly different than the code I posted first---"laundering"
&x[0][0] through an intermediate pointer variable f makes the warning go away
in the compiler I tested, but my suspicion is that if one is undefined
behavior the other probably is too)

Edited to add: and when built and run, it printed all x[0][i] but x[0][0] as
0.0, so gcc did emit code that doesn't do what the programmer wants.

~~~
slavik81
> my suspicion is that if one is undefined behavior the other probably is too

I would suspect otherwise. I have no particularly good reason for that, but
the cases with and without warnings match my expectations of what's valid
and what's not. Intuitively, I'd expect pointers can point to any valid
memory, while array accesses can only be done within the dimensions of the
array.

~~~
jepler
You may be right. I haven't gone and read the standard yet, but I also fed
related programs to cbmc 4.9, a C-program verifier, using the --bounds-check
flag. It tags x[0][3] as an out-of-bounds access, but not f = &x[0][0]; f[3];
(also *(x[0]+i) is OK according to cbmc, despite what we all learned about
*(a+b) and a[b] being the same!)

Neither variant is picked up as illegal at runtime by -fsanitize=address and
gcc-4.9 or clang-3.5.

------
asveikau
> The pointer p has indeterminate value after free(p), so the comparison
> invokes undefined behavior.

Is there a citation in the standard for this? Obviously all bets are off with
being able to dereference p, but it makes no sense that free() could also
modify the pointer itself, and I have a hard time justifying how any
implementation would reasonably make comparing dangling pointers invalid.

I can even see a legit use case for it being valid.

    
    
        void *p, *q;
    
        p = q = malloc(n);
        /* ... */
        if (foo)
           q = malloc(n);
        /* ... */
    
        free(p);
        if (p != q)
           free(q);
    

Here we free q only if it was a different allocation from p... The fact that p
was dangling at the free(q) line is immaterial. I guess it's not so bad to
move the free(p) to the end of this code, but it's surprising to me that a
sane language implementation would consider the above invalid.
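
For what it's worth, the same intent can be written without ever comparing a pointer after it has been freed. A minimal sketch, where `free_pair` is a made-up helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free p, and free q too unless it is the same
   allocation. The comparison happens while both pointers are still
   valid, so no freed pointer value is ever read.
   Returns 1 if two separate frees were done, 0 otherwise. */
static int free_pair(void *p, void *q)
{
    int distinct = (p != q);  /* compare before any free() */
    free(p);
    if (distinct)
        free(q);
    return distinct;
}
```

This only reorders the comparison ahead of the first free(); the behavior for the two cases (same allocation, different allocation) is the same as in the snippet above.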

~~~
dave2000
I think you're right. I also take issue with the "pointers point at an object,
one byte into it or are null" statement. A pointer is a piece of
memory/register that contains an address, and an address is just a bit
pattern. 0 is an address, as is 1,2,3.... 0xffd9303e. It doesn't even have to
be accessible memory; you can get all sorts of exciting effects if you write
to or even read from addresses mapped to hardware but the datatype is still a
pointer.

~~~
dllthomas
My vague recollection of the relevant bits of the C standard has it that there
are a few differences between how pointers to void, pointers to char, and
pointers to anything else are treated in some respects. This might be one of
them.

 _" A pointer is a piece of memory/register that contains an address, and an
address is just a bit pattern."_

This is true on virtually every modern system, but (if memory serves) not
literally every system ever, and the C standard leaves room for some
surprising things in some places. It would have been very nice if the author
had actually included some references, though.

~~~
jibsen
Pointers to character types get some special treatment. They can be used to
address the individual bytes of other objects (C11 6.3.2.3p7), and the strict
aliasing rule allows character types to be used to access those bytes (6.5p7)
(one interesting note is that uint8_t is not strictly required to be a
character type [1]).

The problem with assuming things like a pointer working like an integer, is
that compiler vendors may choose to start taking advantage of any undefined
behavior and point to the standard if you complain.

[1]:
[https://gist.github.com/jibsen/da6be27cde4d526ee564](https://gist.github.com/jibsen/da6be27cde4d526ee564)
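
A minimal sketch of the character-pointer case: reading the bytes of a float through an unsigned char pointer. `float_bytes` is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of a float out
   through an unsigned char pointer, one byte at a time. Accessing any
   object's bytes through a character type is permitted. */
static void float_bytes(const float *f, unsigned char out[sizeof(float)])
{
    const unsigned char *p = (const unsigned char *)f;
    for (size_t i = 0; i < sizeof(float); i++)
        out[i] = p[i];
}
```

Doing the same walk through, say, a `const unsigned short *` would not be covered by that exception.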

------
bitwize
I once read about pointer implementations in C compilers for Lisp machines
(Zeta-C, I believe). As I recall, they were implemented as a pair consisting
of a vector of objects and an index into the vector. Casting between pointer
types must have been a royal bitch, if it were implemented at all.

So yeah, major lesson learned: C pointers are not necessarily equal to
hardware addresses.

but you should really be using c++ and iterators or smart pointers anyway

~~~
Grishnakh
>but you should really be using c++ and iterators or smart pointers anyway

Not if you're writing an OS kernel, or any low-resource embedded code.

------
dwarman
One problem is, in C at least, the compiler has no way to know that the
contents pointed to by p have reached their end of life. All these examples
use malloc() and free(), but these two names are purely convention, and any
names and memory management methods can be used with different names. And
implications for the contents.

But there is another problem too - the free() might (should:) use a mutex
while it does its work, but another thread could be woken up during the free()
postamble, after the mutex has been released but before the caller resumes
execution - or even between the == comparison and the second free() - call
malloc(), and the implementation might be one that re-uses the most recent
free(). Which would mess up the other thread wonderfully.

Neither of the above rely on any kind of mapping between the C variable p and
the underlying hardware. This is deliberate. C is defined as an abstract
grammar and semantics set, primarily because by that era it was painfully
recognised that inventing new HLL's for each new hardware architecture was a
losing proposition. There were so many of them, and those languages were not
transportable. The essential beauty of C was that it is simple enough to be
implemented on any level from direct hardware out to as many layers of VM as
one could wish for. And really only required the execution machine model to be
Turing complete. Its early use was all done to the metal, hence the inherited
impression that C pointers are hardware pointers, but really only when used at
the metal is this likely to be true. And typically, these days, one finds
architectural specific extensions, such as intrinsics, that allow metal access
from C to the custom parts of the hardware.

The still painful part of C though is where its architectural independence
intrudes. Such as, sizeof(int) being the native hardware size of an integer.
8, 16, 24, 32, 36, 48, 64, I've suffered from them all. Similarly,
sizeof(enum) is implementation-defined. The language only states it must be large enough to
represent all its values. I have found the compiler definition differences of
bool to also be very frustrating. I grew into software from a hardware design
background, and to me

(a & true) == a

is always true. Or should be. The language defines boolean values as 0 for
false and any other value for true. Most compilers choose 1 for true - enum
style. So then

((a & true) == a) && ((a == 0) || (a == 1))

Useless when trying to deal with metal and bitflag registers (clearly not the
literal true, rather when somewhere else a variable has been set to true and
later is used as a mask).
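
A small sketch of the masking problem, assuming C99 `<stdbool.h>` (where `true` is 1). `FLAG_READY` and both function names are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_READY 0x04u  /* hypothetical bit in a status register */

/* Masking with a bool only ever tests bit 0, because true is 1. */
static unsigned mask_with_bool(unsigned reg, bool enabled)
{
    return reg & enabled;
}

/* Masking with an explicit bit pattern tests the bit that was meant. */
static unsigned mask_with_flag(unsigned reg)
{
    return reg & FLAG_READY;
}
```

With reg equal to 0x04, mask_with_bool(reg, true) yields 0 while mask_with_flag(reg) yields 0x04, which is the mismatch described above.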

Some compilers have compile command options or #pragmas for forcing the issue,
but there is no standard for these. Still. These indefinites can cause serious
and sometimes difficult to debug problems when the binary structures
containing these things have to be shared between different architectures. Not
to mention endian-ness.
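
One common way around both the size and the endian-ness problem is to define the wire format in bytes and convert explicitly, rather than sharing the in-memory struct. A minimal sketch with hypothetical helper names, assuming a big-endian 32-bit field:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store v as four bytes, most significant first,
   independent of the host's native byte order. */
static void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Hypothetical helper: rebuild the value from its big-endian bytes. */
static uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

The shifts operate on values, not on memory, so the result is the same on little-endian and big-endian hosts.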

All told though, C is still easier than ASM, which usually _is_ specific to
the hardware architecture.

~~~
dllthomas
_" One problem is, in C at least, the compiler has no way to know that the
contents pointed to by p have reached their end of life. All these examples
use malloc() and free(), but these two names are purely convention, and any
names and memory management methods can be used with different names."_

This seems wrong, unless by "purely convention" you mean in some sense that
the entire standard is "purely convention". The standard clearly defines the lifetime
of an allocated object to be from allocation to deallocation, and clearly
labels free (and realloc, in some circumstances) as deallocating. I am not
sure whether a conforming implementation can introduce new functions that
also, in the sense of the standard, deallocate, but in either case the
provided code is still technically incorrect per the standard.

 _" Such as, sizeof(int) being the native hardware size of an integer. 8, 16,
24, 32, 36, 48, 64, I've suffered from them all."_

Nit, and I'm sure you're aware, but perhaps worth noting for others: the
values you're listing are (I infer) number of bits, which is not the same as
what's returned by sizeof. sizeof gives number of _chars_ , which on some
architectures has actually been other than 8 bits!
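
To make the distinction concrete, a minimal sketch; `int_storage_bits` is a made-up name. sizeof counts chars, and `CHAR_BIT` from `<limits.h>` gives the number of bits in a char:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage width of int in bits: sizeof counts chars, and CHAR_BIT is
   the number of bits per char (at least 8, but not necessarily 8). */
static size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

Note this is the storage width, which may include padding bits; the range of values is what `INT_MAX` and `INT_MIN` describe.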

~~~
dwarman
But does not define them in terms of hardware.

My numbers referred to the native widths of the hardware words. sizeof(int)
would of course return those numbers divided by 8, but machines are almost
universally referred to by bit width, not byte width. Intel 64 bit chips, not
Intel 8 byte chips, for example. So sizeof(int) is still a function of
hardware width, and hardware width is defined in terms of bits.

Perhaps I should have been a bit more specific, instead of leaping over fully
internalized transforms without explanation? Here? In a discussion about
hardware? And note that 36 bits is not a multiple of 8, which actually breaks
sizeof() anyway.

~~~
dllthomas
_" Perhaps I should have been a bit more specific, instead of leaping over
fully internalized transforms without explanation?"_

I don't know about "should" - my post wasn't really meant as criticism. I
just thought we had an opportunity to dig in a little deeper and do so a
little clearer.

 _" Here? In a discussion about hardware?"_

There are a whole lot of people here who have only done web dev or high-level
x86 application development, and many who are only passingly familiar with C
or HW.

 _" And note that 36 bits is not a multiple of 8, which actually breaks
sizeof() anyway."_

Oh, a 36 bit architecture without a char size of 9? Interesting! How did that
work? Can that be standard conforming?

~~~
dwarman
"should": a problem with lengthy experience and age is just how many steps
between A and Z get chunked into merely A to B since it is "obvious". Which,
in my case, is why I do not teach. I've forgotten too many derivations. So
digging deeper would be a good exercise for me. Thanks.

36 bits was a custom DSP architecture. Could not use C to get at the extra 4
bits, had to use ASM.

Storage unit size is also inconsistent. I've had fun with another DSP that used 16
bit instead of 8, could not address bytes at all. No C at all so I don't know
what it would have done with sizeof(int). Had to exchange stream data with a
conventional architecture via DMA. Fun times.

In some ways I am a little disappointed in the arc of our technology. It may
have become constrained overly by what compiler tech can do easily and can not
do so easily. I really should not be able to understand how a quad-core
architecture is put together and works, not after some 50 years of
development, but I do. Not because I'm a genius, but because it is essentially
the same stuff I worked with in the 60's with a lot of embellishments, short
cuts, and optimizations, but not fundamental change.

