
Restoration of First Edition Unix Kernel Sources - nxnfufunezn
https://github.com/c3x04/Unix-1st-Edition-jun72
======
cpr
Boy, that brings back real memories!

Harvard got the first Unix v7 outside of Bell Labs in, I think, '75, and my
office mate in the old Harvard graduate computing center (a medieval beast
compared to the new Gates/Ballmer cheese wedge building), Geoff Steckel, had
some kernel listings on his desk which I of course devoured without really
understanding their provenance.

Having cut my teeth (after IBM mainframe systems hacking) on PDP-10 assembler,
TOPS-10, TENEX and Lisp (Harvard's ECL was a Dylan-like algebraic syntax on
top of a Lisp internally), reading the C kernel listings was like a
revelation: here was an operating system written in a high-level language!

The line printer listings were printed on an upper-case-only printer (not many
lower-case-capable printers existed in the DEC world at the time) with strike-
throughs (overprinted) used for upper-case letters.

In any case, it was immediately clear that this was something special. I
didn't actually get involved with Unix until a few years later in grad school
at Columbia, and then ended up as a Unix (BSD) kernel hacker at Multiflow, a
Yale spin-off, some years later.

(Sorry for waxing "old duffer...")

------
beefhash
Pretty sure this is an export of a Google Code project[1]; at least the date
and message of the latest commit match the GitHub version.

Previously on HN for the Google code project:
[https://news.ycombinator.com/item?id=1132682](https://news.ycombinator.com/item?id=1132682)
and
[https://news.ycombinator.com/item?id=2698685](https://news.ycombinator.com/item?id=2698685)

Two days ago on HN: The init system from the same source code tree:
[https://news.ycombinator.com/item?id=10206309](https://news.ycombinator.com/item?id=10206309)

[1] [https://code.google.com/p/unix-jun72/](https://code.google.com/p/unix-jun72/)

------
nightcracker
Can anyone explain this from src/c/c10.c?

    
    
        ospace() {}	/* fake */
    
        waste()		/* waste space */
        {
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
            waste(waste(waste),waste(waste),waste(waste));
        }

~~~
jhallenworld
I think so...

It's creating space in the code section right after ospace. If you look at
ospace, it's being used as a buffer. So we have a buffer in the code section
instead of the data section. So the question is why? Remember that this is a
PDP-11/20 and the maximum memory is only 56 KB (maybe the one at AT&T had
less). The data section must be nearly out of space.

There is more: notice that the variables are always allocated at the end of
each file, and that "extern" is being used within each function as a forward
reference. I think the purpose of this is to keep the symbol table usage
within the compiler low: at the end of each function you get the symbol table
space used by the externs back. The only symbols which remain are the function
names themselves until the end of the file.
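
To make that pattern concrete, here is a minimal sketch in modern C. The names `counter` and `bump` are made up for illustration and are not from the source tree; the point is only that a block-scope `extern` lets a function use a variable whose definition appears later in the file.

```c
/* Hypothetical example: `bump` and `counter` are illustrative names only. */

int bump(void)
{
    /* Forward reference: this declaration is visible only inside the
       function, so a compiler can discard it when the function ends. */
    extern int counter;

    return ++counter;
}

/* The actual definition sits at the end of the file, as in the old sources. */
int counter = 0;
```

Calling `bump()` twice returns 1 and then 2, since both calls refer to the single `counter` defined at the bottom of the file.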

I never used UNIX on a PDP-11 (except simh), but I did use DEC's RSX-11 (on an
11/34). In that operating system a lot of effort was dedicated toward making a
good overlay linker. Overlays were in the form of a tree: You had to carefully
structure your code to maximize the efficiency of this, so that the most
commonly referenced things were closer to the root. UNIX didn't have any of this.

~~~
kps

      > The data section must be nearly out of space.
    

It is for space, but the 11/20 didn't have split I/D. The specific answer¹ is
‘worse’:

    
    
      A second, less noticeable, but astonishing peculiarity is
      the space allocation: temporary storage is allocated that
      deliberately overwrites the beginning of the program,
      smashing its initialization code to save space.  
    

That is, use of the ospace() ‘array’ overlaps the beginning of main().
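
Writing over a function's code is not something portable C allows today, but the underlying idea (storage needed only at startup gets reused as scratch space afterwards) can be sketched with a union. This is a rough modern analogue under that assumption, not the 1972 technique itself, and every name in it is hypothetical.

```c
#include <string.h>

/* Hypothetical analogue: both members occupy the same bytes. */
union arena {
    char startup[64];   /* needed only while initialising */
    char scratch[64];   /* working buffer for the rest of the run */
};

static union arena arena;

/* Phase 1: the storage holds one-time startup data. */
size_t init_phase(void)
{
    strcpy(arena.startup, "init");
    return strlen(arena.startup);
}

/* Phase 2: the same bytes are reused as a scratch buffer,
   destroying whatever the startup phase left there. */
char *scratch_phase(const char *s)
{
    strncpy(arena.scratch, s, sizeof arena.scratch - 1);
    arena.scratch[sizeof arena.scratch - 1] = '\0';
    return arena.scratch;
}
```

Once `scratch_phase` has run, the startup data is gone for good, which mirrors the one-way nature of the trick described in the quote.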

¹ [https://www.bell-labs.com/usr/dmr/www/primevalC.html](https://www.bell-labs.com/usr/dmr/www/primevalC.html)

~~~
acqq
So it _is_ a kind of "overlay" where the data buffer overlaps the piece of the
code. I find jhallenworld's explanation correct.

~~~
kps
In a sense, but ‘overlay’ generally means more than just re-using memory —
it's reading in different sections of (usually) code in place of others; in
modern terms, it's sort of like paging, but structured explicitly and
(usually) manually at build/link time.

PDP-11 Unix did eventually support overlays, but I don't think they were
widely used. The ‘Unix way’ would be to have (as this C compiler does)
separate programs run consecutively with intermediate state in temporary
files.

~~~
acqq
> it's sort of like paging, but structured explicitly and (usually) manually
> at build/link time.

And this is "a poor man's" overlay of the data buffer over the piece of the
code, specified explicitly and manually at the time of writing the program
(the "bigger" overlay mechanisms I know of are also specified at the time of
writing the program, not at the time of build or link). As the "scratch" area
was overlaid, nothing has to be loaded; that's the only difference from the
typical overlay, where more often some code would also be written over some
other code. The other difference from the "classical," "bigger" overlay is that
this was done exactly once over the life of the program.

------
9fb29947
My eyes!

[https://github.com/c3x04/Unix-1st-Edition-jun72/blob/master/...](https://github.com/c3x04/Unix-1st-Edition-jun72/blob/master/src/cmd/unknown.c)

~~~
kps
Have a look at v7 sh(1) and you'll feel better.

e.g. [http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/...](http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/blok.c)

~~~
greenyoda
For those who don't know C, what Bourne did here was to define some macros
that allowed him to get rid of C's curly brackets. E.g.,

    
    
        IF ...
        THEN
           ...
        FI
    

translates to

    
    
        if ( ...
        ) {
           ...
        }
    

I remember last seeing this code around 1980-82 when I was working with 7th
Edition Unix and wondering why someone would want to do that, since it would
have made the program hard for another C programmer to read and maintain. (If
I remember correctly, this programming style is unique to the shell code.)
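
For the curious, a minimal sketch of how such macros can be written. The definitions below are modelled on my recollection of the shell's `mac.h` and should be treated as an assumption rather than a verbatim copy; the `sign` function is a made-up example, not shell code.

```c
/* Macro definitions in the style of the V7 shell (assumed, not verbatim). */
#define IF   if (
#define THEN ) {
#define ELSE } else {
#define FI   ; }

/* Hypothetical function written in the resulting ALGOL-like style. */
int sign(int n)
{
    IF n < 0
    THEN return -1;
    ELSE
        IF n > 0
        THEN return 1;
        FI
    FI
    return 0;
}
```

After preprocessing, the body is ordinary nested `if`/`else` with braces, so the compiler sees nothing unusual; only the human reader does.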

------
petegrif
A thing of beauty is a joy forever.

