Part of the reason register widths are powers of 2 bits is the runaway popularity of C and Unix. The PDP-11 had 16-bit registers, and some of the programming conventions we still use today originated with a humble PDP-11 at Bell Labs.
I would credit the IBM System/360 (1964), rather than the PDP-11 (1970), for most of the popularity of power-of-2 word lengths. The enormous popularity of the System/360 is what made computers standardize on bytes, along with words that were multiples of bytes. Prior to that, word sizes were all over the map: 6, 12, 36, as well as really odd values such as 23, 27, 29, or 35 bits.
This isn't to deny the popularity of the PDP-11, of course, but its use of 16-bit words was more of a consequence of the System/360. DEC's use of 12-bit words goes back to the LINC (1962) but by 1970, bytes were obviously the way to go.
We had a few pretty cool courses at university touching on this. It depends a bit on the kind of work you want to do, but:
- Pick a language that's simple enough. A subset of ML would be good, but if you actually want to finish, I'd recommend a simple LISP. This is your new language. (In the C story, this is C.)
- Use a language you know and like to implement a compiler for this new language. This is your bootstrap language. Have it compile into, for example, C--, ASM, or LLVM IR, depending on what you know; that is your target language. As a recommendation, keep this compiler as simple as possible so you have a reference for the next step. For C, both the bootstrap and the target language were ASM.
- And now iterate on extending the stdlib your language has until you can implement a compiler for your new language in your new language. Again, keep this compiler simple, without optimization or passes; just generate the most trivial machine code possible (see the sketch after this list). This usually takes a bit of back-and-forth. You'll need some expression evaluation and some function calls first (this is where a lisp can be an advantage, as those are the same), then function definitions, then filesystem interaction, and so on. You kinda discover what you need as you implement.
- Once you have all of that: (i) compile the compiler for the new language with your bootstrap compiler, then (ii) compile the compiler for the new language using the result of (i). If you want to verify the results, (iii) compile the compiler once more with the output of (ii) and check whether the outputs of (ii) and (iii) differ; they should be identical.
- Your new language is now self-hosted.
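To give an idea of what "the most trivial machine code possible" can look like, here's a rough sketch of my own (not from any course): it compiles a tiny prefix expression straight to stack-machine-style x86-64 assembly, with no register allocation and no separate passes. The grammar and names are purely illustrative.

```c
#include <stdio.h>

/* Toy code generator: reads a prefix expression like "(+ 2 (* 3 4))" and
   prints x86-64 assembly (AT&T syntax) that evaluates it on the hardware
   stack. No AST, no optimization, no register allocation. */

static const char *p;                   /* cursor into the source text */

static void skip_spaces(void) { while (*p == ' ') p++; }

static void compile_expr(void)
{
    skip_spaces();
    if (*p == '(') {                    /* (op lhs rhs) */
        p++;
        skip_spaces();
        char op = *p++;                 /* '+' or '*' */
        compile_expr();
        compile_expr();
        skip_spaces();
        p++;                            /* consume ')' */
        puts("    pop %rdi");
        puts("    pop %rax");
        puts(op == '+' ? "    add %rdi, %rax" : "    imul %rdi, %rax");
        puts("    push %rax");
    } else {                            /* integer literal */
        int n = 0;
        while (*p >= '0' && *p <= '9')
            n = n * 10 + (*p++ - '0');
        printf("    push $%d\n", n);
    }
}

int main(void)
{
    p = "(+ 2 (* 3 4))";
    puts(".globl main");
    puts("main:");
    compile_expr();
    puts("    pop %rax");               /* the result becomes the exit status */
    puts("    ret");
    return 0;
}
```

Assembling the printed output with cc on an x86-64 Linux box and running it should give exit status 14, i.e. 2 + 3 * 4.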
This was fun, because it was accompanied by other courses on how processor microcode implements processor instructions, how different kinds of assembly map onto processor instructions, and then how higher-level languages are compiled down into assembly. All of this across 4-6 semesters resulted in a pretty good understanding of how Java ends up as flip-flop operations.
EDIT - got target & bootstrap mixed up in first part.
A lot of people build toy languages too -- to the point that some become self-hosting.
But there is also just an absolute ton of parsing tools out there for defining a grammar and parsing the language into an AST (Abstract Syntax Tree).
Once you have an AST, you can build an interpreter that runs the code directly in a JVM, say, or you can build a compiler that translates the AST into LLVM IR.
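For a concrete (if toy) picture of what "an AST plus an interpreter" means, here's a minimal sketch in C; the node shapes and names are my own illustration, not any particular parser library's output.

```c
#include <stdio.h>
#include <stdlib.h>

/* A toy AST for arithmetic expressions, plus a tree-walking interpreter. */
typedef enum { NODE_NUM, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    double value;                 /* used when kind == NODE_NUM */
    struct Node *lhs, *rhs;       /* used for the binary operators */
} Node;

static Node *num(double v)
{
    Node *n = malloc(sizeof *n);
    n->kind = NODE_NUM; n->value = v; n->lhs = n->rhs = NULL;
    return n;
}

static Node *bin(NodeKind k, Node *l, Node *r)
{
    Node *n = malloc(sizeof *n);
    n->kind = k; n->value = 0; n->lhs = l; n->rhs = r;
    return n;
}

/* The "interpreter" half: evaluate the tree directly. */
static double eval(const Node *n)
{
    switch (n->kind) {
    case NODE_NUM: return n->value;
    case NODE_ADD: return eval(n->lhs) + eval(n->rhs);
    case NODE_MUL: return eval(n->lhs) * eval(n->rhs);
    }
    return 0;
}

int main(void)
{
    /* AST for (2 + 3) * 4, as a parser might have produced it */
    Node *tree = bin(NODE_MUL, bin(NODE_ADD, num(2), num(3)), num(4));
    printf("%g\n", eval(tree));   /* prints 20 */
    return 0;
}
```

A compiler would walk the same tree but emit code for each node instead of computing the value on the spot.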
The start of C and the start of Unix came at approximately the same time, and getting there was a stepwise process. The first Unix (IIRC, someone can correct me) was written in assembly. Also written at that time was a simple compiled language. That compiled language was used to write a more complicated compiler. There was another such cycle (I think) before there was a C compiler - and even then, it wasn't full C, but a subset. Unix was then re-written in C.
There is some irony that for UNIX workloads we are stuck with the evolution of a language whose main purpose was only to bootstrap a compiler and be done with it.
First Unix was written in asm. Then Ken came up with B to write some userland stuff more easily. B was not portable. B was BCPL-like, but smaller and simpler.
Then Dennis came up with C. C was just portable B (it added types to get around the fact that word sizes differed between architectures). The B code was then recompiled with C and modified as needed. C remained somewhat backward-compatible with B (hence why C has the rule that a missing type implies int).
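For anyone who hasn't seen that rule in action, here's a small illustration of my own. This is old-style C: it compiles under C89 (e.g. gcc -std=c89), but implicit int was removed in C99.

```c
/* "Implicit int": any declaration with no type specifier defaults to int.
   Legal in K&R C and C89, an error in C99 and later. */
foo(x)              /* old-style definition: return type and parameter x both default to int */
{
    return x + 1;
}

main()              /* return type defaults to int here too */
{
    return foo(41) - 42;   /* exit status 0 */
}
```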
Bootstrapping is a fun thing. You already have some good answers.
Now imagine how most things around you were made, how higher tech was made with lower tech. How did they make high-precision tools when only lower-precision tools were available? For example: how do you make a caliper accurate to 0.001 mm when all you have is one accurate to 0.1 mm? There were a lot of challenges like that, and we still run into new ones. I just wonder what general term is used for things like that.
> Nothing about C requires Unix (or any other OS).
That is not quite true: the C specification defines the standard headers, and many of their facilities don't make much sense outside of an OS (environment variables, the filesystem, dynamic memory allocation, threads, ...).
There is a somewhat restricted flavor of C called "freestanding", for implementations that are not required to accept every "strictly conforming program"; specifically, the standard only requires a small subset of the standard headers to be provided: <float.h>, <iso646.h>, <limits.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, <stdint.h>, and <stdnoreturn.h>. Everything else is optional, and may be implementation-defined rather than standard-conforming.
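As a rough sketch of what code targeting a freestanding implementation tends to look like (the register address, entry-point name, and compiler flags below are my own illustrative assumptions, not anything from the thread):

```c
/* Freestanding-style C: only headers guaranteed without an OS, no libc calls.
   Built with something like: gcc -ffreestanding -nostdlib -c blink.c */
#include <stdint.h>
#include <stddef.h>

/* Hypothetical memory-mapped GPIO register on some microcontroller. */
#define GPIO_OUT ((volatile uint32_t *)0x40020014u)

void start(void)   /* hypothetical entry point, normally wired up by a linker script */
{
    for (;;) {
        *GPIO_OUT ^= 1u;                            /* toggle a pin: no OS, no libc */
        for (volatile uint32_t i = 0; i < 100000u; ++i)
            ;                                       /* crude busy-wait delay */
    }
}
```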
Support for operating system facilities in no way implies those facilities are necessary.
> filesystem, dynamic memory allocation, threads
All of which can be (and have been) done without an OS. Besides, there is no native C language support for any of those things; they are usually (but certainly not always) provided by library calls, not language primitives.
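To make the "library calls, not language primitives" point concrete, here's a minimal bump-allocator sketch of my own, the kind of thing bare-metal projects use in place of malloc (names and sizes are illustrative):

```c
#include <stddef.h>
#include <stdint.h>

/* "Dynamic allocation" with no OS: hand out chunks from a static arena.
   No system calls, no free(), no threads -- just an ordinary library function. */
static _Alignas(max_align_t) uint8_t arena[16 * 1024];
static size_t next_free = 0;

void *arena_alloc(size_t n)
{
    size_t a = _Alignof(max_align_t);           /* keep every chunk suitably aligned */
    size_t rounded = (n + a - 1) & ~(a - 1);    /* alignment is a power of two, so this rounds up */
    if (rounded > sizeof(arena) - next_free)
        return NULL;                            /* arena exhausted */
    void *p = &arena[next_free];
    next_free += rounded;
    return p;
}
```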
My underlying point is that the premise of the question is mistaken. It's assuming that an operating system is required in order to have a programming language. That is simply untrue, and is particularly untrue for C, which is very commonly used to program microcontrollers that have no operating system. I'm working on one such system right now.
C existed as a cross-platform language long before there was a C standard specification or standard headers. Specifically, C was introduced around 1972 and the first standard arrived in 1989 (Wikipedia). There were C compilers for non-Unix systems during that time (e.g., Borland Turbo C for MS-DOS around 1987) (Wikipedia).
I was using C compilers for micros well before those upstarts like Borland and MS (or even DOS) came around. The first micro I used C on was a Cromemco System III, running CP/M. That machine was built in 1977, I think, although I was using it a few years later than that.
Any self-hosting language compiler has to start from somewhere.
Generally it's an incremental process where the compiler for an early/subset version of the language is written in another, existing language (or, absent one, in assembly code).
Once it's possible to rewrite the compiler in its own subset language, it becomes self-hosting. Then you can add a feature to the language, and once it works, enhance the compiler to use it, and so on.
Eventually the language and compiler go hand in hand: the only way to compile the language is with the compiler, and the only way to compile the compiler is with itself. This leads to interesting thought experiments, such as Ken Thompson's "Reflections on Trusting Trust".
The question is malformed because there is no need for Unix to run C programs. In theory they could have created C without an OS. In practice, however, C was created in order to rewrite Unix in a high-level language (it had previously been written in assembly).
A related question: was an Operating System first known as a Time-Sharing System? I've seen most of what I'd call an OS from that era termed that, and I've also read that the term OS was coined for CP/M.
IBM had OS/360 in 1966, about a decade before CP/M. Not sure where "OS" first appeared but it was before that.
Related: early OSes were not timesharing systems. They were batch oriented - you submit your stack of punch cards to the operator, and pick up your output later once they've had a chance to run it.
Operating systems abstracted the hardware (and other things) rather than having everything embedded in the program (as is still typical in many embedded applications today). They don't have to offer time-sharing (i.e., a scheduler), but without one they are constrained to running a single process to completion before beginning the next. There were OSes before there were time-sharing OSes.
It would be useful to see whether the software architecture of Whirlwind I [1] had a piece that was like an OS. It is more than a decade earlier than the IBM System/360, and it was a real-time system that tracked incoming airborne objects, which means that paradigms like "submit a batch job" were probably not an important part of using Whirlwind.
V2 (PDP-11 Unix): Kernel is written in assembly. C compiler is written in assembly.
V3: Kernel is written in assembly. C compiler is written in C.
V4: Kernel is written in C. C compiler is written in C.
https://www.tuhs.org/cgi-bin/utree.pl?file=PDP7-Unix
https://www.tuhs.org/cgi-bin/utree.pl?file=V2
https://www.tuhs.org/cgi-bin/utree.pl?file=V3
https://www.tuhs.org/cgi-bin/utree.pl?file=V4