
C Portability Lessons from Weird Machines - begriffs
https://begriffs.com/posts/2018-11-15-c-portability.html
======
userbinator
A lot of the "weird machines" for C are microcontrollers and the like; the
8051 is mentioned, but another big "C-hostile" MCU family is the Microchip
PIC, which still has its own C compiler. DSPs are another category where
unusual word sizes are common (24 bits for char/short/int is often
encountered).

 _It’s amazing that, by carefully writing portable ANSI C code and sticking to
standard library functions, you can create a program that will compile and
work without modification on almost any of these weird systems._

The big question, which I ask whenever someone harps on about portability, is
_does it even make sense_? Is your average PC/smartphone application
realistically ever going to be ported to a 6502 or a Unisys mainframe? Keep in
mind that I/O in particular is going to differ significantly, and something
like standard input might not even exist (think of a fixed-function device
like an oven controller with only a few buttons and LEDs for I/O.) I don't
think it's particularly amazing, because the "core" of C essentially
represents all the functionality of a stored-program digital computer; so if
you completely ignore things like I/O, it's not hard to see how the same piece
of code can express the same concepts on _any_ such computer.

It should also be noted that these "weird" environments are often not
"strictly conforming" either, because that is either impossible or doesn't
really make sense. Besides the "omissions", they also have "extensions" that
help the programmer more fully utilise the platform's capabilities.

~~~
int_19h
Agreed. I think there's a lot of overhead that comes from the use of C and
C++, and all the idiosyncrasies that stem from their specs needing to be vague
enough to cover all those exotic machines, for software that's realistically
never actually going to run on them.

It's not that the ability to write code that is so portable isn't useful. For
something like a library of algorithms (e.g. compression - think zlib),
there's actual value to be derived from having a single ultra-portable
implementation that can run everywhere. But does something like Evolution or
LibreOffice really need to allow for CHAR_BIT not being 8, or for int being
narrower than 32 bits, or for int64_t not being defined at all? I
would say that of all the C and C++ code that's running on modern devices
today, the vast majority could safely assume flat memory addressing, 8-bit
chars, 32-bit ints, two's complement, IEEE floating point etc - and nobody
would even notice. In fact, a lot of it likely already does assume some or all
of that, just not explicitly.

It would be nice to have an ISO-standard superset of C that catalogs such
assumptions. Basically, a "non-DSP, non-mainframe" version of C, that's
portable across all modern non-exotic platforms, and provides definitions for
as many things that are UB or implementation-defined in standard C as
possible.
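
To make that concrete: a minimal sketch (mine, not any standard's) of pinning
such assumptions down today with C11's _Static_assert; the two's-complement
probe is just one way to do it:

    #include <limits.h>
    #include <stdint.h>
    
    /* Pin the "non-exotic platform" assumptions down at compile time.
       If int64_t doesn't exist, the last line fails to compile, which
       is exactly the point. */
    _Static_assert(CHAR_BIT == 8, "8-bit chars assumed");
    _Static_assert(sizeof(int) == 4, "32-bit int assumed");
    _Static_assert((-1 & 3) == 3, "two's complement assumed");
    _Static_assert(sizeof(int64_t) == 8, "64-bit int64_t assumed");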

~~~
setr
>Basically, a "non-DSP, non-mainframe" version of C

I feel like if you start going down that road, towards “commonly portable”,
you naturally end up with one of the not-C variants (Zig, Nim, etc). ...and of
course, if you start writing a new language, you might as well toss in
features from the last thirty years of language design.

C is C largely because it's so stupidly (and impressively) portable; if you
aren't drawing that benefit, it's likely a case of the wrong tool for the job.

~~~
carlmr
>C is C largely because it's so stupidly (and impressively) portable; if you
aren't drawing that benefit, it's likely a case of the wrong tool for the job.

I'm working on medium-sized embedded systems (not embedded Linux, but also not
the smallest embedded systems). I've only rarely worked on 8- or even 16-bit
microcontrollers. Most of what I work on is normal microcontrollers: 8-bit
chars, 32-bit, little-endian.

I really don't need the portability, the targets are all the same, however
they only have C/C++ compilers. And we can't use GC. Rust would be a very nice
choice at least for the application code. I'm not sure how to write low level
drivers in Rust, but with FFI I would just push that to the boundaries, and
make at least 90% of the applications Rust. But there's no LLVM backend for
most embedded processors I work with, and even if there is, the maturity might
not be there, and no company would switch to that considering that the next
target might not have a backend.

We have to use C, because it's the only thing that we are offered. But for
desktop applications I see no real reason to start something in C nowadays.

~~~
pjmlp
Some microcontrollers do enjoy Basic and Pascal compilers as alternatives to
C.

[https://www.mikroe.com/compilers](https://www.mikroe.com/compilers)

But as you say, the company needs to give the option to developers.

------
chadaustin
Compiling C++ to asm.js with Emscripten is another weird architecture you
might actually use these days.

Unaligned accesses don't trap with a SIGBUS or anything, they just round the
pointer value down to an aligned address and read whatever that is.
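
Roughly (my sketch, not Emscripten's actual codegen), a 32-bit asm.js load is
HEAP32[p >> 2], so in C terms it behaves like:

    #include <stdint.h>
    #include <string.h>
    
    /* The >> 2 drops the low two address bits, so a misaligned pointer is
       silently rounded down to the previous 4-byte boundary instead of
       trapping. */
    uint32_t load32(const uint8_t *heap, uint32_t p) {
        uint32_t v;
        memcpy(&v, heap + (p & ~3u), sizeof v);
        return v;
    }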

Reading from and writing to NULL will generally succeed (just as on SGI).

Function pointers are just indices into tables of functions, one table per
"function type" (number of arguments, whether the arguments are int or float,
whether it returns a value or not). Thus, two different function pointers may
have the same bit pattern.
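
Hypothetically, under such an ABI something like this could print "equal", if
both functions land at the same index in their respective tables:

    #include <stdio.h>
    
    static int  f(int x) { return x; } /* some index in the int(int) table   */
    static void g(void)  { }           /* same index in the void(void) table */
    
    int main(void) {
        /* Casting a function pointer to void * is a POSIX extension. */
        if ((void *)f == (void *)g)
            puts("equal bit patterns");
        return 0;
    }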

~~~
gpderetta
Re function pointers: how does Emscripten guarantee the POSIX behavior that
you can round-trip function pointers through void*?

~~~
rcxdude
So long as you cast it back to the correct type, there isn't any issue.
Casting back to a different type is undefined behaviour anyway.

~~~
gpderetta
But this means that two distinct function pointers might compare equal when
converted to void*, unless the type is somehow encoded.

~~~
chadaustin
This is the first I'm hearing that arbitrary function pointers converted to
void* should be distinct on POSIX - do you have a reference?

~~~
gpderetta
Pointers to different objects should not compare equal in C (or at least in
C++, can't remember if this is also a rule in C). Then again, functions are
not objects.

------
int_19h
One thing that they didn't list there that probably deserves a mention is
SHARC with its 32-bit word architecture, which they decided to manifest
directly in C type sizes:

    CHAR_BIT == 32
    sizeof(char) == 1
    sizeof(int) == 1
    sizeof(short) == 1
    sizeof(float) == 1
    sizeof(double) == 2

I suppose the alternative would be to use an addressing scheme encoding bit
offset in the pointer, like some of the other machines in this story. But
that's also much more expensive, and this is a DSP architecture, so they went
with something more straightforward. Curiously, this set-up is still fully
accommodated by the ISO C standard.

~~~
hamiltonkibbe
16- or 32-bit chars are pretty common on DSPs from any vendor. I’ve found that
it’s usually only an issue with some cross-platform serialization libraries
that make invalid assumptions about how an array of octets looks in memory.
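
The usual defensive pattern (a sketch, not from any particular library) is to
serialize through shifts and masks rather than assuming unsigned char is
exactly an octet:

    #include <stdint.h>
    
    /* Write a 32-bit value big-endian into four array elements. On a DSP
       with 32-bit chars, each element simply carries one octet in its low
       8 bits. */
    void put_u32_be(unsigned char *out, uint32_t v) {
        out[0] = (v >> 24) & 0xFF;
        out[1] = (v >> 16) & 0xFF;
        out[2] = (v >>  8) & 0xFF;
        out[3] =  v        & 0xFF;
    }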

------
jcranmer
Seeing this article reminds me that there's a good litmus test in C (at least
in older versions) for determining whether a feature is undefined behavior or
merely unspecified or implementation-defined behavior. Undefined behavior
means that there is some processor that will throw an exception if you do it;
if no such processor exists, then the behavior is implementation-defined. So
signed
and unsigned overflow is not because that feature is not present. The part
about undefined signed overflow being useful for optimizations only came
decades later.
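
A standard illustration of that optimization angle (my example, not from the
article):

    /* Because signed overflow is undefined, a compiler may assume x + 1 > x
       for every int x and fold this whole function to "return 1;". The
       unsigned equivalent must honor wraparound and cannot be folded. */
    int always_true(int x) {
        return x + 1 > x;
    }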

That said, I'm at a loss to explain why i = i++; is undefined and not merely
unspecified.

~~~
anticensor
Because subexpression evaluation order is unspecified in C. We really need
-fprecedence-left-to-right and -fprecedence-right-to-left to specify
evaluation strategies.

~~~
yuubi
It is an exception to the apparent pattern in which behaviors that trap on one
of the early C targets were later declared undefined.

Given the standard, it's obvious why i=i++ is undefined: it modifies i twice
between sequence points. The question is why it was "undefined" (in which the
implementation is allowed to make demons fly out your nose) and not something
like, say, the "unpredictable" that occurs in many architecture specs (in
which the program behaves as if i had been set to something in particular, but
with no requirement on which value it is: unchanged is ok, incremented is ok,
12345 is ok if i is wide enough to hold it, but silently failing to generate
any code for statements that occur after the i=i++ isn't).

~~~
anticensor
Subexpression evaluation order has nothing to do with traps and everything to
do with optimisation. It is not like your CPU requires mandatory exclusive
lock to access a particular memory location.

~~~
yuubi
I think we're talking past each other. I'm responding to the comment about the
heuristic where for the most part trap (somewhere) means UB.

Maybe it's UB only because implementation-specified is too onerous and it
didn't seem to be worth defining a bounded "unpredictable result" behavior.

------
drewg123
I remember that back in the early 90s, the DEC Alpha was pretty "weird", as it
was one of the first common LP64 unix machines. I fixed so many issues due to
sizeof(int) != sizeof(char *) when building open-source packages for DEC OSF/1
in the early 90s.
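
The classic one, as I remember it (sketch; the real bugs varied):

    /* Pre-C99, with no <stdlib.h> in scope, the compiler assumed
       "int malloc()" and the returned 64-bit pointer was truncated to
       32 bits on Alpha; the crash showed up far from this line. */
    char *get_buffer(void) {
        return (char *)malloc(4096); /* implicit declaration of malloc */
    }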

Later, the alpha was FreeBSD's first 64-bit platform, and when working on the
port to alpha, we hit a lot of the same issues in the FreeBSD kernel. As alpha
was also FreeBSD's first RISC machine with strict alignment constraints, we
also hit a lot of unaligned access issues in the kernel.

Ah, those were the days. I now find myself grumbling about having to check to
ensure my code is portable to 32-bit platforms.

~~~
octorian
Alignment is one of those interesting things, because x86 doesn't care so much
about it... and everyone starts out assuming everything is x86.

Meanwhile, ARM does care about alignment, and it's now the most popular
architecture for anything that's "not a PC".

My first experience with this was writing some smartphone code that died with
a SIGBUS when trying to make a function call, where the reason was totally
non-obvious from simply looking at the code.

~~~
burfog
If you are accessing RAM, modern ARM chips don't care about alignment.

A couple decades ago, sure, ARM was different. Had it stayed that way, ARM
would not be so popular today.

~~~
kayamon
You think a chip will be a financial success based on whether it supports
unaligned reads?

Almost zero software out there actually needs unaligned reads.

~~~
burfog
Yes.

Most of the troubles related to -fstrict-aliasing involve unaligned reads. All
sorts of file formats, TIFF for example, are most easily handled with
unaligned reads.

~~~
kayamon
Most of the troubles related to -fstrict-aliasing involve -fstrict-aliasing.

~~~
burfog
If the processor supports unaligned reads, sure, you could say that, though it
still technically violates the C standard. Otherwise, no.

A typical issue would have code like this:

    /* baz is a char pointer into a binary blob; bar is a type that
       needs alignment */
    foo = *(bar *)baz;

If the data were all properly aligned, then most likely there would be no
desire to write such code. The correct types would be used.

Aside from gcc abusively optimizing, the above works fine on x86, PowerPC, and
modern ARM. It does not work on older ARM.

That sort of code is everywhere.
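
For what it's worth, the portable rewrite (a sketch, with uint32_t standing in
for bar) avoids both the alignment and the aliasing problem, and compilers
turn it into a single load where the hardware allows:

    #include <stdint.h>
    #include <string.h>
    
    /* Instead of foo = *(bar *)baz: copy the representation. No misaligned
       pointer is ever formed and no aliasing rule is violated. */
    uint32_t read_u32(const char *baz) {
        uint32_t foo;
        memcpy(&foo, baz, sizeof foo);
        return foo;
    }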

------
tasty_freeze
One machine I came across early in my career was the BTI-8000, designed in the
mid to late 70s. Only a few dozen were sold. It was a multiprocessor
mainframe, and the user memory space was 512KB. So what does an architecture
do with all those extra bits after using up the first 19 as a linear space
mapping all 512KB? Why, encode other address modes. On that machine, it was
possible to have a pointer to a particular bitfield in memory, something like
[16:0] = 32b word of memory, [21:17] = bit offset, [26:22] = bit width,
[31:27] addressing mode. Other modes provided the ability to point to a
register, or a bitfield within a register. There were many other encodings,
such as register plus immediate offset, base+offset, etc.
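
In C terms, a hypothetical decoder for such a pointer (field layout taken from
the description above, names made up):

    #include <stdint.h>
    
    typedef struct {
        uint32_t word;  /* bits [16:0]  : word address in the 512KB space */
        uint32_t off;   /* bits [21:17] : bit offset within the word      */
        uint32_t width; /* bits [26:22] : width of the bitfield           */
        uint32_t mode;  /* bits [31:27] : addressing mode                 */
    } bti_pointer;
    
    bti_pointer decode(uint32_t p) {
        bti_pointer d;
        d.word  =  p        & 0x1FFFF;
        d.off   = (p >> 17) & 0x1F;
        d.width = (p >> 22) & 0x1F;
        d.mode  = (p >> 27) & 0x1F;
        return d;
    }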

If the instruction was something like "ADD @R1, @R2, 5", it would fetch the
word, register, or bitfield pointed at by R2, add immediate 5, then save it to
the word, register, or bitfield pointed to by R1.

The machine didn't have shift/rotate instructions, but they could be effected
by saving to a bitfield and then reading from a bitfield offset by n bits.

They had a working (but not polished) C compiler but that project got shut
down when they realized the system was not going to take off.

[http://btihistory.org/bti8000.html#isa](http://btihistory.org/bti8000.html#isa)

------
dmitrygr
The article is wrong in a few ways. One is about the R3000, where it says that
integer overflow traps. There are actually two separate addition instructions
that operate the same way except for one difference: one will trap on signed
overflow and one will not. They both produce the same result. No C compiler I
know of uses the trapping version of the instruction.

~~~
saagarjha
Does -ftrapv generate add instructions?

------
classichasclass
"Accessing any address [on 6502] higher than the zero page (0x0 to 0xFF)
causes a performance penalty."

I never thought of it that way, but that's true. However, he didn't mention
the biggest issue with C on the 6502, i.e., the extremely constrained 256-byte
hardware stack. Doing anything practical requires some sort of
software-maintained stack to get stack frames of any decent size or quantity
(either in parallel with the hardware stack, or replacing it completely).
"Downright hostile to C compilers," indeed.
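
The usual shape of that workaround, as an illustrative sketch (not cc65's
actual internals):

    /* The hardware stack at $0100-$01FF is left for return addresses;
       C frames live on a software stack in main memory, tracked by a
       pointer that would ideally sit in the zero page. */
    static unsigned char swstack[2048];
    static unsigned char *sp = swstack + sizeof swstack;
    
    void *push_frame(unsigned size) { return sp -= size; }
    void  pop_frame(unsigned size)  { sp += size; }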

~~~
sehugg
It'd be nice if compilers were smarter about using the stack (looking at you,
cc65) -- you could have "big" stack frames and little stack frames, pass
variables in registers, push values as 8-bit instead of upcast to 16-bit,
convert locals to static, etc.

I'm going to take a look at yet another alternative 6502 language called C02
now: [https://github.com/RevCurtisP/C02](https://github.com/RevCurtisP/C02)

~~~
classichasclass
The big problem with passing in registers is you only have three, and only one
of them can do most of the work. Maybe zero page could help there for a fast
call convention, but either way you're probably going to have to spill to
memory, which fortunately is not a big deal on the 65xx.

------
Tor3
"[3B2] Fairly normal architecture, except it is big endianian, unlike most
computers nowadays. The char datatype is unsigned by default."

That sounds like any SGI or Sun, and although those are mostly gone, there's
still the Power series from IBM (runs AIX). The only reason to use the
expression ".. unlike most computers nowadays" is by counting the sheer number
of Intel, AMD and ARM chips in use. Of course those numbers are overwhelming -
ARM alone sells billions - but it's not like big endian is some obscure old
concept in a dusty corner. (The irony is that ARM can be used in both BE and
LE modes, by setup.) Anyway, at work I have to write all my code so that it
runs on BE as well as LE architectures. BE is alive and well.

~~~
masklinn
> That sounds like any SGI, or Sun, and although they're mostly gone there's
> still the Power series from IBM (runs AIX)

According to Wikipedia, there's also IBM's z/Architecture and the AVR32 µC.
PPC looks to be switchable like ARM.

> The irony is that ARM can be used in both BE and LE modes, by setup

I've always wondered how common it is for ARM CPUs to run in BE mode. Does
anyone have info?

> BE is alive and well.

If only because network protocols are generally BE.

~~~
pm215
Big-endian arm is pretty rare -- it's basically a niche requirement for
markets like embedded network hardware where there is a lot of pre-existing
code that assumes big-endian because it used to be run on m68k or mips. Most
network protocols are big-endian on the wire, so on a BE host if you forget a
htons() or ntohs() call it all still works; auditing a legacy codebase for
that kind of bug in order to port it to a LE host is painful. But any general-
purpose-ish Arm system (intended for running Linux or Android or Windows) will
be LE.
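
Concretely (my example): on a big-endian host htons() is the identity, so the
buggy and correct versions produce the same bytes there.

    #include <arpa/inet.h>
    #include <stdint.h>
    
    /* On a big-endian host htons() is the identity, so forgetting the call
       is invisible there and only bites once the code moves to a
       little-endian machine. */
    uint16_t port_to_wire(uint16_t port) {
        return htons(port); /* omit this on BE and nothing changes */
    }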

Fun fact: the optimized string library for the original ARM1176 raspberry pi
had a bonkers implementation of memcmp() which used the SETEND insn to
temporarily switch to bigendian, because on that core setend is only 1 cycle
and it saved an insn or two later. (On newer cores setend is a lot more
expensive and the trick doesn't work.)

------
mises
Sorry to be that guy, but "weird machine" is a defined computer science term.
I got the wrong impression from the title, as, I'm guessing, did others.
[https://en.wikipedia.org/wiki/Weird_machine](https://en.wikipedia.org/wiki/Weird_machine)

~~~
simen
The title makes no sense if you interpret it that way. Also, it seems to be a
very niche term (most of the top Google results are not computer-related at
all). I think 99% of people who saw the headline understood it the correct
way.

~~~
gwern
> The title makes no sense if you interpret it that way

The lesson of weird machines is that weird machines can lurk anywhere; that's
why they're 'weird'. Something to do with CPP or bit endianness, perhaps, or
exploiting undefined behavior, is what I expected when clicking on it.

~~~
wemdyjreichert
Ditto for the endianness. Or one of those embedded boards with really weird
24-bit stuff or something similar.

~~~
carlmr
When you get into DSPs, 24-bit isn't too weird. But then you also expect that
going in.

------
gumby
Weird? The 68000 is hardly weird -- the original Sun machines were 68Ks. The
MIPS CPUs were designed to run C from the get-go.

And regarding the reference in the manual to the Unisys machine not having
byte pointers: the PDP-6 (the original 36-bit machine, as far as I know) had
byte pointers that let you specify bytes anywhere from 1 to 63 bits wide. It
was common to have six-bit characters (you could pack six to a word) as well
as 7-bit ASCII characters (you could pack five to a word).
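
In C terms (a sketch, with uint64_t standing in for the 36-bit word), six-bit
characters unpack like this:

    #include <stdint.h>
    
    /* Extract character i (0 = leftmost) of the six 6-bit characters
       packed into a 36-bit word. */
    unsigned sixbit_at(uint64_t word36, int i) {
        return (word36 >> (30 - 6 * i)) & 077;
    }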

~~~
sehugg
If unaligned access traps are weird, then so is ARM:
[https://medium.com/@iLevex/the-curious-case-of-unaligned-acc...](https://medium.com/@iLevex/the-curious-case-of-unaligned-access-on-arm-5dd0ebe24965)

~~~
vardump
Also, x86 can trap on unaligned access, if "Alignment Check" (EFLAGS bit 18)
is 1 _and_ CR0 "Alignment Mask" (coincidentally also bit 18) is 1. That can be
useful for emulating architectures that don't support misaligned access.
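
For the curious, on x86-64 with GCC or Clang the flag can be flipped from user
space (a sketch; whether misaligned accesses then actually fault still depends
on the kernel having set CR0.AM):

    /* Set EFLAGS.AC (bit 18); with CR0.AM also set, subsequent misaligned
       accesses in user mode raise SIGBUS. */
    static void enable_alignment_check(void) {
        __asm__ volatile ("pushfq\n\t"
                          "orq $0x40000, (%%rsp)\n\t"
                          "popfq" ::: "cc", "memory");
    }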

------
edoo
I do bare-metal C on modern ARM chips, and nowadays it is near identical to
anything you'd do on a desktop. We don't even have to write specially portable
code to be able to compile most of the same libraries on real computers for
proper unit testing.

It is hilarious to think about the possibility of having to make your code
portable to a 9-bit big-endian system too.

~~~
jandrese
I was thinking what a headache it must be to read a bitstream from one of
those machines on an 8-bit-byte machine.

~~~
SlowRobotAhead
I have a 32-bit system that has a LIN connection to an 8-bit system (or a few
of them). On the 8-bit side I can still handle a 32-bit address or hash, for
example. It’s slower, but works fine in terms of what the user experiences.

Before the great serialization options we have now (MessageBuffers, ProtoBuf,
etc) there was a bit more ambiguity about what your data stream was, TLV
(type-length-value) packing being pretty common... but I guess I’ve written
plenty of domain-specific ‘protocols’.
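
For context, the minimal shape of a TLV record (a made-up layout with one-byte
type and length fields):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    
    /* Append one record: [type][length][value...]; returns bytes written. */
    size_t tlv_put(uint8_t *out, uint8_t type, const void *val, uint8_t len) {
        out[0] = type;
        out[1] = len;
        memcpy(out + 2, val, len);
        return (size_t)len + 2;
    }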

If you mean a constant bitstream of never-ending data: if you can use an
established format like I2C or SPI, the hardware on both sides takes care of
most of the gritty stuff, giving you a nice interrupt on each side. It doesn’t
really matter that one side has the “preference” to look at it in 32-bit
chunks and the other in 8-bit. Besides, even on ARM the physical transfer is
typically in 8-bit units anyhow. SPI can send 16s or 32s, but can usually also
stop at 8s. It’s the ASIC/drivers/devices that are more rigid in their
streaming requirements (this device MUST accept 8-bit transfers, etc).

~~~
jandrese
I meant consuming a 9-bit/byte bytestream on an 8-bit/byte box.
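
Something like this, presumably (a sketch): buffer bits from the 8-bit stream
and peel off nine at a time.

    #include <stdint.h>
    
    typedef struct { uint32_t acc; int nbits; } bits9;
    
    /* Return the next 9-bit unit from an 8-bit-byte stream, advancing *p. */
    unsigned next9(bits9 *r, const uint8_t **p) {
        while (r->nbits < 9) {
            r->acc = (r->acc << 8) | *(*p)++;
            r->nbits += 8;
        }
        r->nbits -= 9;
        return (r->acc >> r->nbits) & 0x1FF;
    }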

------
nbsd4life
Is it weird to support some of these today? We just dropped acorn26, although
it had been dead for a few years.

I think MIPS r3k has a better-behaved add ("addu"), or is that only on some?
If your compiler outputs these, you don't have to worry about the special
behavior.

I'd say a bigger concern for VAX is the lack of IEEE 754: it's noticeable when
people pick unsuitable float constants, or when arithmetic traps by default.
Or the many GCC bugs now.

For MIPS r3k, the complete lack of atomics. And the load delays.

~~~
monocasa
The r3k didn't really need atomics. It wasn't SMP, and on a braindead-simple
5-stage RISC, a syscall to the atomic primitives was roughly as cheap as
atomic instructions are today. Cheaper if you were already in a KSEG.

And load delays are annoying when writing asm manually, but not for the C
compiler.

And yeah, addu was in MIPS-I.

------
imglorp
I've owned or used about half the machines on that list. Good times. We had to
push our bits uphill both ways back then.

------
roywiggins
The weird machine my Computer Architecture class had us learn was the PDP-8,
which is almost comically constrained. 12-bit, with exactly one register, no
hardware stack, no arithmetic beyond addition and negation.

~~~
jhallenworld
I bet you used The Art of Digital Design:

[http://bitsavers.informatik.uni-stuttgart.de/pdf/dec/pdp8/pd...](http://bitsavers.informatik.uni-stuttgart.de/pdf/dec/pdp8/pdp8i/Prosser_The_Art_of_Digital_Design_2ed_1987.pdf)

I started with this design for my relay computer, but evolved it to be even
simpler: no accumulator.

[http://relaysbc.sourceforge.net/arch.html](http://relaysbc.sourceforge.net/arch.html)

------
jhallenworld
TOPS-20 on the DECSYSTEM-20 had 7-bit characters, but this was a 36-bit
machine... so sizeof(int) would need to return a fraction or something...

(There was a C compiler for it, but it cheated and used 9-bit chars.)

------
saifsadiq1995
Ah ha, great! It reminds me of my granny's stories from childhood: huge memory
disks, floppies, and room-sized mainframes.

Worth a read.

