For more details on printf() formats, see:
```c
int x = 4711;
printf("x is at %p\n", (void *) &x);
```
Yes, this is the case, e.g. in microcontrollers, where the program is stored in EEPROM. You can store data in program memory (e.g. constant strings), but you need to access it in a special manner.
> Has there ever been an architecture where data pointers could have a different representation?
At least conceptually, this was the case in early GPUs, which had different memories for different access patterns: one area for constants, another for global read-only memory, a third for thread-specific read-write memory, and a fourth shared between groups of threads.
Whether they were actually different physical memories and whether they had different address spaces is another issue (they at least partially did, some stuff was on chip, some was external DRAM), but this was how it was conceptually seen by the programmer and the way early GPGPU APIs worked (early versions of CUDA and OpenCL).
References in dynamically typed languages, however, often make me nervous, as I'm never entirely sure when different types in different languages get copied or not. If the function argument p gets modified here, did the original variable in the caller get modified? Do I have to write an elaborate check for its mutability?
Stuff like pointer aliasing, dangling pointers, pointer arithmetic (it's UB even to form a pointer more than one past the end of an object, for instance), NULL pointer dereference, and things like that. Debugging those kinds of bugs can be super tricky and time-consuming.
Not that pointers are the only source of UB in C, but they're probably some of the most easily encountered.
There's also the whole array-to-pointer decay thing, which can be a bit tricky to handle in certain cases (I think sizeof applied to a parameter declared as an array is probably a surprise to every C coder the first time they encounter it).
I think this is mostly an artifact of C's type system. I wish pointers were taught in a language with similar runtime semantics as C with a more expressive type system. Drawing it out helps a ton, but the way C forces you to embed that structure into code doesn't help.
Nowadays things are more complicated, with MMUs and virtual memory, but I think the examples in the article are a good basis for anyone wanting to know what happens under the hood.
But I did have a problem with remembering to clean up memory properly.
Comment below article:
> Marin Todinov said...
> Dear Programming Guru
> You are an absolute legend, ive been programming for 4 years and i have a masters in computer science, your explanation of pointers has helped me increase my efficiency in recursive functions and made a map in my breain of how these basic fundamental structers.
Lol, what the heck? Troll or astroturfing? Where can you get a CS master's degree with only four years of programming?
To the extent the operations are well-defined, both the compiler and the OS conspire to make this look true; if there's no OS, the compiler typically works harder to make it work, because the alternative would make programs too difficult to port to or away from the system in question.
The "big array of bytes, each with its own unique address" mental model is a useful lie which most programmers who know better don't pry into most of the time. Going beyond that would involve knowing about system-specific things that C is explicitly designed to abstract away, to make programs more portable.
So, no, the struct hack isn't valid C, but you can make huge arrays and, within those arrays, simple increments and decrements do work reliably, because the standard says so and the OS and the compiler will together contain enough code to make it work if they're any good at all.
However, with operating systems and multiple programs running at the same time, memory is no longer contiguous: instead, programs can request "pages" (blocks of memory). This is (more or less) what `malloc` does under the hood, if you've come across it. That's the key difference: in a modern operating system, you can't expect memory to be one big array, since your program might have requested more than one page of memory. In that sense, it's more like a collection of smaller arrays.
We have to do it this way so we can have memory protection (similar to file permissions: a program can decide whether other programs can read one of its pages, write to it, etc.) and swapping (i.e. writing unused pages to non-volatile storage, like a hard drive, to free up memory).
(All of this is to say nothing of NUMA.)
However, one of the responsibilities of the OS is to hide all that messy detail from the bare-metal programmer or compiler writer and provide a simple(r) abstraction over the hardware. Thus, "(physical) memory is a big array".
In essence the situation is pretty much the same as for a user-space program: you get a big address space and a list of memory regions that are mapped and usable.
Somewhat more complex:
Other posts have more information, but that should get you going.
My experience (from interviewing people with various educational backgrounds) is that a lot of people who have a Masters in CS have very very little practical experience actually programming. People who go into the field to ascend the ivory tower often just don't do a lot of it, really.
Which is, I think, not really very different from a lot of fields. There's a distinct academic track to a lot of fields.
No, no, memory is in fact one big array of bytes. Everything else is just really nice syntactic sugar over that fact.
Now, it may well be that attempts to access that memory result in page faults or weird interrupts or IO behavior or what have you, but the computer really does only see a big array.
Define "the computer" in this context.
Certainly not the x86 chip itself - it sees memory as a series of caches (L1, L2, L3) and eventually the memory bus, which it manages through various lookup tables (TLB etc.) more closely resembling a series of hash tables on steroids than an array - and that's ignoring per-processor caches on multiproc systems and all the invalidation logic that needs to occur as a result.
What about processes? One flat memory space! ...except when you communicate with another process, say by sharing memory. Then you realize you can't share your 'indices' without translating them into the other process's indices, because even if the physical memory is the same, each process has its own 'array' for indexing into that memory (and yours doesn't even contain everything in theirs). That's at least 68 arrays on my computer at the time of writing, not one.
The kernel's the one managing this mess of arrays, pinning pages needed for interrupt handlers and software TLB support (since not even the kernel is addressing pure physical memory most of the time).
I guess you could argue that because your chip supports DMA, you can do all your array indexing through that to get to your 'one true' physical memory addressing scheme, label that as what your computer 'really sees', and ignore the 99.99% of instructions executing and making up the bulk of your computation, which have nothing to do with that addressing scheme, but that seems a bit disingenuous.
The fact that certain accesses may cause memory layout to change or other strange things is something better left to the computer engineers. :)
(that's the first line of the 3rd paragraph, maybe a dozen whole lines before the part you quoted...)
Yet that fact is not very relevant for his explanation, sorry.
> Lol, what the heck? Troll or astroturfing? Where can you get a CS master's degree with only four years of programming?
In lots of places.
In some cases a "CS master's degree" can mean that you took 4 years of Economics or Physics and then a CS postgraduate course afterwards -- not a continuous BSc-plus-MSc in CS.
Other countries require 3 years for the BSc and 1 year for a master's degree.
... and still don't totally grasp pointers apparently.
But seriously, what's so confusing about pointers?
As with any language construct, you can make it confusing. Just do pointer arithmetic with mixed types and you get yourself a confusing mess.
Introduced to computer science / programming halfway through undergrad 2 years ago. Going to finish up a masters next fall. (3 and some change years after I started)
(I think that's the parent comment's point: if you have a bachelor's + master's in CS, you'd have had to code for 6 years or so.)