I've been reading up lately on the earlier OSes that influenced Unix - the Compatible Time-Sharing System, the Berkeley Time-Sharing System (Project Genie), and Multics. Bitsavers.org and Multicians.org both have a lot of information on that early era (and Bitsavers has a whole lot more).
Lots of the ideas in Unix came from those earlier systems. What Thompson and Ritchie contributed was synthesizing these ideas into a more coherent whole, and demonstrating that they could be made a whole lot smaller. (Both in terms of the PDP-7 and 11/20 being smaller computers than the mainframes previous systems were written for, and in that descriptive command names were reduced to cryptic abbreviations. CTSS's LISTF was shortened to ls, ARCHIV to ar, RUNOFF to roff and then nroff/troff...) And of course Unix ran on the most popular computer of the 1970s and got rewritten in a portable language to let it run on every popular computer since then, while just about every other OS was tied to its specific hardware platform and died off as the hardware did.
Having used Multics and a farrago of other OSes (not CTSS or the Berkeley timesharing system) as part of my retrocomputing hobby, I think the single biggest thing Unix brought to the day-to-day experience of using an OS was the pipe, and the concomitant transformation of the command line from just being a way to enter program names with command line options (on OSes which even have command line options as we know them, of course) to being a programming language in its own right suitable for rapid prototyping and the creation of glue code.
Glue code isn't glamorous. It isn't something which seems to get a lot of research put into it. It is, however, important to get right, and part of getting it right is foregrounding the right thing: The stuff you're gluing together, as opposed to the glue itself. This is something the "replace shell with a Real Programming Language" projects get wrong, in that the Unix shell defaults to treating unknown barewords as external programs as opposed to syntax errors. This plays Hell with any kind of automated analysis, but it's essential for a language primarily intended to glue those external programs together. Typing isn't just about the type system, after all.
TL;DR: OSes prior to Unix had surprisingly weak scripting facilities, and attempts to "improve" scripting tend to miss the point.
I know approximately zero PDP-7 assembly, but this looks like it might be “print directory”, equivalent to pwd today? It seems to open its parent “dotdot” directory and write it out.
I don't know PDP-7 assembly either, but I don't see how it could stand for "previous directory" (and be equivalent to today's `cd -`). Any command to change directory needs to be done in-process by the shell, so it would need to be implemented as a builtin, not a standalone executable.
So "print directory" or (print something about the) "parent directory" seem more likely to me.
In the earliest versions of Unix, the shell didn't fork to run programs. It would exec the program, and then the program would exec the shell instead of exiting. Makes sense, right? That's what you'd do if you didn't have an operating system at all, and it's how, for example, CP/M worked. So you could write "cd" as a non-built-in command.
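For contrast, a minimal sketch of the modern fork-and-wait round trip (function and variable names here are illustrative, not from any historical shell). Because today's shell forks first, a `chdir()` in the child dies with the child - which is exactly why cd must be a builtin now, whereas in the exec-chain scheme a standalone cd ran in the *same* process and its directory change persisted into the re-exec'd shell:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Modern shells fork, so the shell process survives the command. */
int run_command(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid == 0) {               /* child: become the command */
        execv(path, argv);
        _exit(127);               /* only reached if exec failed */
    }
    int status;
    waitpid(pid, &status, 0);     /* parent (the shell) lives on */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```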
But it sounds like the other commenters have figured out that it's "pack directory".
After staring at the listing for a while, I think it is a kind of garbage collector for removing unlinked files from directory listings ("pack directory"?). As best as I can work out, the code does:
open ..
loop
    read a directory entry into tbuf
    if we read 0 bytes (EOF, presumably), break loop
    if tbuf[0] == '\0', go back to start of loop
    append tbuf to dir
done
close ..
reopen .. with creat
write the stuff we built up in dir to ..
close ..
exit
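The same pass can be modeled in C. Note the 8-word entry size and the zeroed-first-word convention are my guesses read off the listing, not documented fact:

```c
#include <string.h>

#define ENTW 8  /* assumed: directory entries are 8 words long */

/* Model of the "pack directory" pass: copy every live entry (first
 * word nonzero) from src to dst, dropping zeroed (unlinked) entries.
 * Returns the number of words written to dst. */
int pack_dir(const int *src, int n_entries, int *dst) {
    int kept = 0;
    for (int i = 0; i < n_entries; i++) {
        const int *entry = src + i * ENTW;
        if (entry[0] == 0)
            continue;                       /* skip unlinked entry */
        memcpy(dst + kept, entry, ENTW * sizeof *entry);
        kept += ENTW;
    }
    return kept;
}
```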
Looking at this again, your pseudocode seems to be pretty reasonable. However, I'm still left wondering why this command would be necessary: doesn't it just read the directory entries into memory and write them back out again? I'm also curious why "sys write" takes .. directly when all the other calls seem to need dotdot.
If my reading is right, it filters out any with a leading null-byte before writing back out. My guess is that that is how they implemented unlink/rm - just write zeroes over the entry.
As for the sys write: my current hypothesis is that .. there is a placeholder for an argument which is written to by the preceding dac .+4. ('"If a program can't rewrite its own code", he asked, "what good is it?"'). But I can't make sense of what it's actually putting there - I'd guess length, but it looks like ~(dir - 2) + 8.
That took me a while. The ‘8’ in ‘tad 8’ is the contents of memory location 8, i.e. the destination pointer in the memory copy, so it's ~(dir - 2) + dst. And since -x = ~x + 1 (there being no negation instruction), that works out to the length, dst - dir + 1.
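A quick check of that identity - in C's two's-complement arithmetic, the odd-looking expression really is the length:

```c
/* ~x + 1 == -x in two's complement, so
 * ~(dir - 2) + dst == -(dir - 2) - 1 + dst == dst - dir + 1. */
int length_from(int dir, int dst) {
    return ~(dir - 2) + dst;
}
```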
dir appears to be copied into from tbuf at some point. It puts dir-1 into memory location 8 at the start, and then the code around the 2: section I'm pretty sure is a memcpy-ish loop like:
c1 = -8;
*9 = tbuf - 1;
do {
    *(++(*8)) = *(++(*9));
} while (++c1 != 0);
goto 1b; // b for backwards?
It seems that memory locations 8-15 are auto-indexing[0], so the lac/dac i increments the pointed-at location before use, and isz is "increment and skip next instruction if zero".
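In plain C, with ordinary pointers standing in for the auto-index locations, my reading of that loop is (a sketch, not a verified translation):

```c
/* Locations 8 and 9 behave as pointers that are bumped *before* each
 * indirect access, so they are seeded one word low; the counter runs
 * from -8 toward zero, and isz ends the loop when it hits 0. */
void autoindex_copy(int *dst, const int *src) {
    int *p8 = dst - 1;          /* dac 8: dir - 1    */
    const int *p9 = src - 1;    /* dac 9: tbuf - 1   */
    int c1 = -8;                /* 8-word entry      */
    do {
        *++p8 = *++p9;          /* lac i 9 / dac i 8 */
    } while (++c1 != 0);        /* isz c1            */
}
```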
I think ‘dotdot’ is what is called ‘dd’ there (the corresponding symbol in ls is still named ‘dd’). I don't think the concept of a parent directory existed yet. Maybe ‘pd’ is something like ‘prepare directory’, constructing the dd/dotdot entry.
But the program appears to fail if dotdot can't be opened for reading, and it appears to scan through it doing reads first. Then it closes and does a creat, which is expected to succeed also. I'm suspecting that at this stage of development, open may have been used for reading and creat for writing (including to an existing file).
From TFS¹, creat() on an existing file truncates it (i.e. to length zero, as O_TRUNC), and only the superuser can do this to a directory. open() has read/write flag bits as now.
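That truncate-on-creat behavior survives to this day for regular files; a small demonstration with the modern calls (the scratch path is arbitrary):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write a few bytes, then creat() the same path again: the second
 * creat truncates the existing file to length zero, the same as
 * open(path, O_WRONLY | O_CREAT | O_TRUNC, mode). */
long size_after_recreat(const char *path) {
    int fd = creat(path, 0644);
    write(fd, "hello", 5);
    close(fd);
    fd = creat(path, 0644);     /* existing file: truncated */
    close(fd);
    struct stat st;
    stat(path, &st);
    unlink(path);               /* clean up the scratch file */
    return (long)st.st_size;
}
```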
I am leaning toward Pete_D's ‘pack directory’ idea.
Nobody!
There were no copyright notices, and this is a work made before the US adopted the Berne Convention. Prior copyright law required copyright notices.
We learned from the AT&T vs. Regents case that a judge ruled there was a large likelihood that AT&T couldn't establish a valid copyright on 32V because they never marked it properly.
> Nobody! There were no copyright notices, and this is a work made before the US adopted the Berne Convention. Prior copyright law required copyright notices.
...in the U.S.
This does not apply to other countries, especially continental European ones. They'll happily apply copyright to software retroactively, even when it wasn't explicitly protected at the time, on the grounds that their jurisdictions categorized software as a work even before the convention. No license, no luck over there.
If it's not deemed a work for hire (and given UNIX was a rogue operation at the time, that's not entirely unreasonable to question), then the copyright probably remains with Thompson and Ritchie themselves, or rather Thompson and Ritchie's families (or whichever way the inheritance process went). If it is, then probably Micro Focus via Attachmate via Novell via USL. Special considerations may also apply because it wasn't "published" in any sense of the word until past Ritchie's death.
Either Micro Focus (through its acquisition of Attachmate/Novell) or Nokia (through its ownership of Bell Labs), depending on whether or not the Research Unix copyrights were part of the Unix business that AT&T sold to Novell. Bell Labs and USL were distinct business units within AT&T.
That's assuming it's deemed a work-for-hire (which if it isn't makes it the property of Ken Thompson and the Ritchie estate) and that it wasn't somehow "published" before 1989 (which if it was without a copyright notice, makes it public domain - but works written before 1989 but not published until after are under copyright regardless). It's confusing stuff.
45 pages in before I saw the first comment. Whoever wrote section 8, "what may be a simulation or game for billiards or pool", put actual one-line descriptions before each section. Later on he gets real chatty, e.g.
fsin: 0203 " sine of the fine rotation angle
mfsin: -0203 " negative of fsin
Seriously, I bet there is a lot to be learned figuring out how he was approximating trigonometry using 12-bit int constants.
It looks like there are symbols named "sin" and "cos" not declared in the program. These are possibly system provided tables.
Accesses to "sin" and "cos" are preceded by what looks like a self modifying store ("dac .+3", interpreted to mean "deposit accumulator at current program counter + 3"), possibly to modify the "lac sin" and "lac cos" instructions to index the tables, but I'm not familiar with the instruction encoding.
If so, this is a relatively straightforward, non-magic way of implementing trigonometric functions.
Is this written in assembly? I thought Unix was written in C, and you only had to write a C compiler to port it to other platforms. I remember that was the story I was told.
UNIX was written in assembly at first. The C compiler was introduced in the Third Edition, and by the Fourth Edition the kernel had been rewritten in an early version of C (one that won't go through a modern C compiler anymore); both editions date to 1973. Parts of userland stayed in assembly as late as the Seventh Edition (1979), such as roff(1) (nroff/troff were in C), parts of as(1), and chess(6).
Not at all; UNIX had a couple of releases in assembly before the C rewrite came to be.
Likewise, other high-level systems programming languages had been in use since 1961.
The most notable one being ESPOL, replaced a couple of years later by NEWP, both Algol derivatives for systems programming. The OS (MCP, on the Burroughs B5000) used compiler intrinsics with zero assembly, and is still being sold today by Unisys as ClearPath MCP.
There are other notable examples to discover; there was a decade's worth of other OSes and systems languages before Unix.
> Lots of the ideas in Unix came from those earlier systems. What Thompson and Ritchie contributed was synthesizing these ideas into a more coherent whole, and demonstrating that they could be made a whole lot smaller. (Both in terms of the PDP-7 and 11/20 being smaller computers than the mainframes previous systems were written for, and in that descriptive command names were reduced to cryptic abbreviations. CTSS's LISTF was shortened to ls, ARCHIV to ar, RUNOFF to roff and then nroff/troff...) And of course Unix ran on the most popular computer of the 1970s and got rewritten in a portable language to let it run on every popular computer since then, while just about every other OS was tied to its specific hardware platform and died off as the hardware did.
All really fascinating stuff.