Restoration of First Edition Unix Kernel Sources (github.com/c3x04)
99 points by nxnfufunezn on Sept 13, 2015 | 12 comments



Boy, that brings back real memories!

Harvard got the first Unix v7 outside of Bell Labs in, I think, '75, and my office mate in the old Harvard graduate computing center (a medieval beast compared to the new Gates/Ballmer cheese wedge building), Geoff Steckel, had some kernel listings on his desk which I of course devoured without really understanding their provenance.

Having cut my teeth (after some IBM mainframe systems hacking) on PDP-10 assembler, TOPS-10, TENEX and Lisp (Harvard's ECL was a Dylan-like algebraic syntax on top of a Lisp internally), I found reading the C kernel listings a revelation: here was an operating system written in a high-level language!

The line printer listings were printed on an upper-case-only printer (not many lower-case-capable printers existed in the DEC world at the time) with strike-throughs (overprinted) used for upper-case letters.

In any case, it was immediately clear that this was something special. I didn't actually get involved with Unix until a few years later in grad school at Columbia, and then ended up as a Unix (BSD) kernel hacker at Multiflow, a Yale spin-off, some years later.

(Sorry for waxing "old duffer...")


Pretty sure this is an export of a Google Code project[1]; at least the date and message of the latest commit match the GitHub version.

Previously on HN for the Google code project: https://news.ycombinator.com/item?id=1132682 and https://news.ycombinator.com/item?id=2698685

Two days ago on HN: The init system from the same source code tree: https://news.ycombinator.com/item?id=10206309

[1] https://code.google.com/p/unix-jun72/


Can anyone explain this from src/c/c10.c?

    ospace() {}	/* fake */

    waste()		/* waste space */
    {
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
        waste(waste(waste),waste(waste),waste(waste));
    }


I think so...

It's creating space in the code section right after ospace. If you look at ospace, it's being used as a buffer. So we have a buffer in the code section instead of the data section. So the question is why? Remember that this is a PDP-11/20 and the maximum memory is only 56 KB (maybe the one at AT&T had less). The data section must be nearly out of space.

There is more: notice that the variables are always allocated at the end of each file, and that "extern" is being used within each function as a forward reference. I think the purpose of this is to keep the symbol table usage within the compiler low: at the end of each function you get back the symbol table space used by the externs. The only symbols that remain until the end of the file are the function names themselves.
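To make the extern-inside-the-function pattern concrete, here is a minimal sketch in modern C. The names (label_count, next_label, labels_issued) are made up for illustration and are not from c10.c, and a modern compiler gains no memory benefit from this layout; it only shows the scoping being described.

```c
#include <assert.h>

/* Hypothetical names throughout; none of this is taken from c10.c. */

/* Each function declares the variable it needs with a block-scope extern,
   so the declaration is only in scope inside that function body. */
int next_label(void)
{
    extern int label_count;   /* forward reference to the definition below */
    return ++label_count;
}

int labels_issued(void)
{
    extern int label_count;   /* redeclared: the earlier scope has ended */
    assert(label_count >= 0);
    return label_count;
}

/* The definition sits at the end of the file, as in the old sources. */
int label_count = 0;
```

Calling next_label() twice and then labels_issued() returns 1, 2 and 2. Between the two functions the compiler has no declaration of label_count in scope, which is the property a small 1972 compiler could presumably exploit to reclaim symbol table space.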

I never used UNIX on a PDP-11 (except simh), but I did use DEC's RSX-11 (on an 11/34). In that operating system a lot of effort was dedicated toward making a good overlay linker. Overlays were in the form of a tree: you had to carefully structure your code to maximize the efficiency of this, so that the most commonly referenced things were closer to the root. UNIX didn't have any of this.


  > The data section must be nearly out of space.

It is for space, but the 11/20 didn't have split I/D. The specific answer¹ is ‘worse’:

  A second, less noticeable, but astonishing peculiarity is
  the space allocation: temporary storage is allocated that
  deliberately overwrites the beginning of the program,
  smashing its initialization code to save space.

That is, use of the ospace() ‘array’ overlaps the beginning of main().

¹ https://www.bell-labs.com/usr/dmr/www/primevalC.html
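Modern C has no portable way to write data over the start of main(), so the following is only an analogy, not what the primeval compiler did: a union lets a scratch buffer share storage with data that is needed once at start-up. All names here (overlay, run_init, get_scratch) are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch: init-only data and a scratch buffer share storage. */
union overlay {
    char init_banner[64];   /* needed only during start-up */
    char scratch[64];       /* used for the rest of the run */
};

static union overlay area = { "setup: opening files" };
static int init_done = 0;

/* Consume the start-up data once; afterwards its bytes are free for reuse. */
size_t run_init(void)
{
    size_t n = strlen(area.init_banner);
    init_done = 1;
    return n;
}

/* Hand out the same bytes as scratch space, but only after init has run. */
char *get_scratch(void)
{
    assert(init_done);
    memset(area.scratch, 0, sizeof area.scratch);
    return area.scratch;
}
```

Once get_scratch() has been called the banner is gone for good, just as the original's initialization code could never run a second time. The difference is that here the reuse is checked and confined to data, whereas the original smashed executable code.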


So it is a kind of "overlay" in which the data buffer overlaps a piece of the code. I think jhallenworld's explanation is correct.


In a sense, but ‘overlay’ generally means more than just re-using memory — it's reading in different sections of (usually) code in place of others; in modern terms, it's sort of like paging, but structured explicitly and (usually) manually at build/link time.

PDP-11 Unix did eventually support overlays, but I don't think they were widely used. The ‘Unix way’ would be to have (as this C compiler does) separate programs run consecutively with intermediate state in temporary files.


> it's sort of like paging, but structured explicitly and (usually) manually at build/link time.

And this is "a poor man's" overlay of the data buffer over a piece of the code, specified explicitly and manually at the time of writing the program (the "bigger" overlay mechanisms I know of are also specified when the program is written, not at build or link time). Because only a "scratch" area is overlaid, nothing has to be loaded; that is the main difference from the typical overlay, where more often some code would be loaded over other code. The other difference from the "classical," "bigger" overlay is that this was done exactly once over the life of the program.



Have a look at v7 sh(1) and you'll feel better.

e.g. http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/...


For those who don't know C, what Bourne did here was to define some macros that allowed him to get rid of C's curly brackets. E.g.,

    IF ...
    THEN
       ...
    FI

translates to

    if ( ...
    ) {
       ...
    }

I remember last seeing this code around 1980-82 when I was working with 7th Edition Unix and wondering why someone would want to do that, since it would have made the program hard for another C programmer to read and maintain. (If I remember correctly, this programming style is unique to the shell code.)
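For anyone who wants to try the trick, here is a small compilable sketch in that style. The macro bodies are illustrative definitions that produce the expansion shown above; the exact text in the V7 shell's header may differ, and clamp_nonneg and check_clamp are made-up names.

```c
#include <assert.h>

/* Illustrative definitions in the style of the V7 shell's macros; the exact
   text in the original header may differ. */
#define IF    if (
#define THEN  ) {
#define ELSE  } else {
#define FI    }

/* Hypothetical function written in the macro style. */
int clamp_nonneg(int x)
{
    IF x < 0
    THEN
        return 0;
    ELSE
        return x;
    FI
}

/* Small self-check exercising both branches. */
void check_clamp(void)
{
    assert(clamp_nonneg(-5) == 0);
    assert(clamp_nonneg(7) == 7);
}
```

After preprocessing, the body of clamp_nonneg is an ordinary if/else with braces, so the compiler never sees anything unusual. Only the human reader has to learn the extra vocabulary.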


A thing of beauty is a joy forever.



