
I'm sure they're out there. The closest that comes to mind is the MSP430, but not quite -- although it has 20-bit pointers (with sizeof ptr being... 4, since they're padded to 4 bytes for storage) and a 16-bit size_t, my recollection is that ptrdiff_t is also defined at 16 bits (which I think violates the C spec, which requires at least 17 bits?). I haven't worked with many other segmented architectures recently, but either there or capability machines are where I'd look.
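
Roughly the shape I mean, as a sketch against TI's large-memory-model toolchain -- the sizes in the comments are my recollection, not verified compiler output:

  /* Sketch: MSP430X type sizes in the large memory model, as I remember
     them. Treat the commented values as assumptions, not verified output. */
  #include <stddef.h>
  #include <stdio.h>

  int main(void) {
      printf("void *    : %u\n", (unsigned)sizeof(void *));    /* 4: a 20-bit address padded to 4 bytes    */
      printf("size_t    : %u\n", (unsigned)sizeof(size_t));    /* 2: single objects stay under 64 KB       */
      printf("ptrdiff_t : %u\n", (unsigned)sizeof(ptrdiff_t)); /* 2: the part that arguably breaks C99's minimum */
      return 0;
  }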



> my recollection is that ptrdiff_t is also defined at 16 bits (which I think violates the C spec, which requires at least 17 bits?)

C2x adopted a proposal that permits the pre-C99 limits:

  N2808    Allow 16-bit ptrdiff_t
N2808: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2808.htm

Draft C2x: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3047.pdf
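
A compile-time sketch of which regime an implementation falls into (the 65535/32767 minimums are the ones N2808 discusses):

  /* Pre-C2x, PTRDIFF_MAX had to be at least 65535 (so 17 bits including
     the sign); N2808 relaxes the minimum back to 32767, legalizing a
     16-bit ptrdiff_t. */
  #include <stdint.h>

  #if PTRDIFF_MAX >= 65535
    /* fine even under the C99/C11/C17 minimum */
  #elif PTRDIFF_MAX >= 32767
    /* only conforming once the C2x relaxation (N2808) applies */
  #else
    #error "ptrdiff_t too small for any edition of the standard"
  #endif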


The entire point of my question was that "I'm sure they're out there" is as close to an answer as I've ever found, so I was looking for an actual example, not a hypothetical one.


IBM i (formerly known as AS/400) has two types of pointers: fat 128-bit pointers and thin pointers (which are either 32-bit or 64-bit, depending on the addressing mode). The 128-bit pointers contain embedded information about the type of the object they point to, and security capabilities for accessing it – there are actually several different types of fat pointers, which constrain which type of object they point to, but there is a generic pointer type ("open pointer") which can contain any other type of pointer, and hence point to anything. By contrast, the thin pointers are just memory addresses.

IBM's C compiler defines extension keywords to declare whether a given pointer type is fat or thin. However, they chose to define size_t, ptrdiff_t, etc., in terms of the thin pointers only. So even this isn't a case of what you are looking for. But if IBM had made some slightly different choices (permitted by the standard) in the design of their C compiler, it would have been. Also, back in the late 1980s / early 1990s there was at least one third-party C compiler for the AS/400 (and its System/38 predecessor), and I'm not sure what choices that compiler made.
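
To make the fat/thin split concrete, here is a sketch -- the __ptr128/__ptr64 qualifier spellings are from my memory of IBM's ILE C documentation, so treat them as assumptions:

  /* Sketch only; qualifier spellings recalled from IBM's ILE C docs. */
  char * __ptr128 fat;   /* 16-byte tagged pointer: object type + capability info */
  char * __ptr64  thin;  /* 8-byte address (teraspace / 64-bit addressing mode)   */

  /* size_t and ptrdiff_t are defined against the thin flavour, so they
     track sizeof(thin), not sizeof(fat). */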

If people are looking for examples, I'm wondering about C compilers for Burroughs Large Systems. Or C compilers for Lisp machines (Symbolics had one). Those are the kind of weird architectures on which you'd do this, if anyone ever did. Indeed, it is rather obvious that the C standards committee gave compiler developers these unusual options with those weird architectures in mind. But it can't force them to make use of them, even on a platform where they might make sense.


I mean, a spec-compliant C compiler for the MSP430 would be an example (see paragraph 7.20.3); TI has just decided to deviate from the language standard here. It's an example of a platform that would meet the requirements, but whose vendor chooses not to offer a (compliant) C compiler.


But that's not a counterargument; if anything, it's kind of my point. The standard seems to be catering to a hypothetical machine that lacks real-world demand/usage/market/utility/etc.

If an abstraction is placed into a standard, its answer to "how many people are benefiting from this headache we're giving everybody" really ought to be noticeably greater than zero.


Sure. The C standard has always erred on the side of supporting odd platforms -- see ones complement, decimal floating point, sizeof function pointer != sizeof data pointer, etc. Most platforms today have converged on an approach that doesn't require these escape hatches. But if you're trying to, e.g., bring up C on CHERI [1] or other platforms that don't assume memory is all one big flat address space, it's not only nice to have flexibility in the standard, it's nice that LLVM and other tools carry that flexibility through into their implementations.

[1] https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf
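
As one concrete example of the flexibility being defended here: ISO C never promises that a function pointer survives a round trip through void * (POSIX adds that guarantee, for dlsym's sake), so portable code has to convert between function pointer types instead. A sketch:

  #include <stdio.h>

  static int f(void) { return 42; }

  int main(void) {
      int (*fp)(void) = f;

      /* Portable: convert to another function pointer type and back
         (C11 6.3.2.3p8 guarantees the round trip). */
      void (*gp)(void) = (void (*)(void))fp;
      int (*back)(void) = (int (*)(void))gp;
      printf("%d\n", back());          /* 42 */

      /* Not guaranteed by ISO C: void *p = (void *)fp; -- fine under
         POSIX, but exactly the assumption a capability machine may break. */
      return 0;
  }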


> see ones complement

I'm glad you brought this up, because it's kind of my point. C++ realized this was useless baggage and finally left it behind. [1] I don't see why ptrdiff_t is much different here. C just doesn't want to let things go, I guess. Literally any feature you put into a language will end up being used (or abused) by someone for something. That "it's nice" for someone, at some point in the future, to be able to pick up a random shiny thing once in a while and twirl it around doesn't seem like a reason to keep it in the standard for decades and burden everyone else with it the whole time. (Not to mention there are much nicer things that C and C++ lack, and that would make people's lives easier rather than harder...)

[1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p09...


It's a philosophy thing. There are plenty of languages that just solve for flat Von Neumann memory models. In the past, that did not describe all machines -- e.g. the pre-standard Borland C/C++ compilers for x86 real mode that used 32-bit ptrdiff_t and 16-bit size_t (which doesn't count as an answer because they were pre-standard in so many ways). In the future, it may or may not describe all machines -- it's possible that RISC wins forever and we continue to push all complexity into the software, or it's possible that something capabilities-based comes forward instead. Similarly with primitive types: perhaps some day we'll have unums.

If the future is still flat, then... we've paid the cost of an extra few paragraphs in the standard? Folks interested in writing non-portable code can ignore this and use implementation-defined behaviors; those who want to be fully portable to future machines can be more careful (although ptrdiff_t is basically a cursed type anyway, so I don't see this particular overhead mattering). Yes, getting rid of ones complement makes sense these days -- but if you were worried about the compatibility issues around supporting it properly and avoiding undefined behavior, you're probably spending exactly as much effort today dealing with the fact that INT_MIN and friends are cursed mathematically on twos complement machines.
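
For what I mean by "cursed", a small sketch of the traps that already exist on perfectly ordinary hardware:

  #include <limits.h>
  #include <stddef.h>

  int main(void) {
      /* Pointer subtraction yields ptrdiff_t, so subtracting pointers into
         a single object larger than PTRDIFF_MAX bytes (e.g. a 3 GB array
         on a 32-bit target) is undefined behaviour, even though size_t
         can describe the object's size. */

      /* INT_MIN has no positive counterpart in two's complement, so
         negating it -- or calling abs() on it -- is undefined: */
      int x = INT_MIN;
      if (x != INT_MIN)
          x = -x;            /* only safe once INT_MIN is excluded */
      (void)x;
      return 0;
  }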

Meanwhile, suppose that we actually break out of this local minimum. That's an interesting world, and an even more interesting one if we can carry most of our software forward with us.

We're not even a century into designing computers yet. I can't even begin to predict what architectures will look like in another five or ten centuries -- even assuming that transistor budgets continue to taper off. But I'll say that of the languages I use daily, C is one of the few that I'd still expect to be around and functional in that future, even if only as an archeological curiosity; it allows a decently high fidelity description of how an algorithm should be implemented across more than half a century of hardware.


C23 follows C++ in requiring signed integers to have two's complement representation. I think by this point it's pretty much settled that it's the "optimal" way to implement signed integers. Now if we change to non-binary architectures (or something radically different) things might change, but at that point quite a lot of the C standard will have to be thrown out as well.


I /mostly/ agree -- it's hard for me to think of a modern or future platform that would use ones complement integers... with one possible exception. Integers represented on top of IEEE floating point are inherently ones-complement (sign+magnitude), and I can definitely imagine potential future platforms that use 64-bit floats as their primary primitive, restricting them to integer representations for certain tasks. Think of designing something like a DSP-focused microcontroller in a world where onboard SRAM can significantly exceed 4 GB, where a desire for C compatibility and occasional tasks makes it worthwhile to support function and data pointers, but where supporting 53-bit pointers in a float ends up simpler than adding a 64-bit integer ALU that would rarely be used. In that case the associated types (ptrdiff_t, for example) might end up as 53-bit ones-complement integers stored in floats.
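
To put the 53-bit figure in concrete terms (this part is just ordinary IEEE-754 double behaviour, nothing hypothetical):

  #include <stdio.h>

  int main(void) {
      double a = 9007199254740992.0;   /* 2^53: exactly representable       */
      double b = a + 1.0;              /* 2^53 + 1 rounds back down to 2^53 */
      printf("%d\n", a == b);          /* prints 1: precision has run out   */
      printf("%d\n", a - 1.0 == a);    /* prints 0: 2^53 - 1 is still exact */
      return 0;
  }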


I know I'm being picky, but sign+magnitude and ones' complement are a good bit different from each other despite both having negative zero.


I don't understand how it's an example. It sounds like ssize_t and ptrdiff_t need the same number of bits on this architecture. And if ptrdiff_t is wrong, they'd probably make ssize_t wrong too. Also, I don't think "we need more distinct types, because then if a compiler author deliberately makes one wrong they might leave the other one alone" is enough justification to have two.


ssize_t isn't standard C. It's just POSIX, which defines behavior the C standard leaves undefined in order to create an almost-but-not-quite-C language used on all Unix systems. POSIX C is not strictly conforming C, but it is conforming C. The same goes for GNU C, Visual C, etc.
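
Concretely, the headers tell the story (a sketch; the [-1, SSIZE_MAX] guarantee is POSIX's wording, not ISO C's):

  #include <stddef.h>     /* ISO C: size_t, ptrdiff_t                 */
  #include <sys/types.h>  /* POSIX: ssize_t -- no such type in ISO C  */
  #include <limits.h>     /* POSIX adds SSIZE_MAX here                */

  /* POSIX only requires ssize_t to hold values in [-1, SSIZE_MAX]; it
     exists for return values like read()'s, not for pointer arithmetic. */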



