Of possible (mostly historical) interest at this point: a similar commentary on a fairly early version of the original Unix kernel, by an Australian CS prof named John Lions, which was very widely circulated among CS students in the '80s, despite this being technically in violation of AT&T's copyright on the book.
Note that the code is written in a very archaic dialect of C, and for hardware that didn't support paging in any form (just swapping). Nevertheless, it was an important introduction for a lot of people at the time, not just to the basics of OS implementation details, but also, how to find your way around a nontrivial sized codebase.
"Lions' Commentary on UNIX 6th Edition, with Source Code by John Lions (1976) ...Despite its age, it is still considered an excellent commentary on simple but high quality code.
For many years, the Lions book was the only Unix kernel documentation available outside Bell Labs. Although the license of 6th Edition allowed classroom use of the source code, the license of 7th Edition specifically excluded such use, so the book spread through illegal copy machine reproductions (a kind of samizdat). It was commonly held to be the most copied book in computer science."[0]
I had a look a couple of years ago; it's a joy to read. It's about UNIX on the PDP-11. A 1977 version PDF is available here[1]. That page has a cover with an endorsement from Ken Thompson and a foreword by Dennis Ritchie, but that seems to be from a later edition. The end of Lions' preface is funny:
"The co-operation of the "nroff" program must also be mentioned. Without it, these notes could never have been produced in this form. However it has yielded some of its more enigmatic secrets so reluctantly that the author's gratitude is indeed mixed. Certainly "nroff" itself must provide a fertile field for future practitioners of the program documenter's art."
The Use of these notes section:
"These notes, which are intended to supplement the comments already present in the source code, are not essential for understanding the UNIX operating system. It is perfectly possible to proceed without them, and you should attempt to do so as long as you can.
The notes are a crutch, to aid you when the going becomes difficult. If you attempt to read each file or procedure on your own first, your initial progress is likely to be slower, but your ultimate progress much faster. Reading other people's programs is an art which should be learnt and practised because it is useful!"
The end of the Introduction:
"...on the whole you will find that the authors of UNIX, Ken Thompson and Dennis Ritchie, have created a program of great strength, integrity and effectiveness, which you should admire and seek to emulate."
That is probably less relevant today in terms of directly understanding the implementation, but it's an interesting and enlightening read. Things were much simpler and more fundamental back in the 1980s, and it's easier to understand that way. Then layer on top.
Slightly more recent: The Design and Implementation of the 4.4 BSD Operating System, which includes some of the Berkeley additions to the kernel, such as TCP/IP and sockets.
And then there's xv6 [1], a small Unix from MIT for teaching purposes that runs on x86, full of comments, and available as a booklet that is directly inspired by Lions' commentary on the 6th edition of Unix.
I actually agree, though: to really appreciate the classics you should start with (Maurice) Bach. :-P
Hah, this is amazing! This reminds me of how I used to (and still do, sometimes) read third-party code.
For an OS class in college, we had to modify fork (and re-build the kernel) to track how many times a particular process had been forked (and probably some other statistics I'm forgetting at the moment).
I remember going through a very similar process for the first time - injecting white space below chunks of code, writing out my own comments, and then using that to figure out how to modify fork. Looking at the author's fork.c comments gave me a feeling of nostalgia.
The useful part of course is going through yourself and writing your own comments, but it can be really helpful to start with something like this (and then write your own version of the comments).
To get a quick sense of just how thorough this book is in providing all of the necessary background information and context: the chapter that actually matches the book title (Kernel Code) is chapter 8, and it starts on page 319.
> The main goal of this book is to use a minimal amount of space or within a limited space to dissect the complete Linux kernel source code in order to obtain a full understanding of the basic functions and actual implementation of the operating system. To achieve a complete and profound understanding of the Linux kernel, a true understanding and introduction of the basic operating principles of the Linux operating system. This book's readership is positioned to know the general use of Linux systems or has a certain programming basis, but it lacks the basic knowledge to read the current new kernel code and is eager to understand the working principle and actual code of the UNIX operating system kernel as soon as possible. Realize the lovers.
> The current Linux kernel source code amount is in the number of millions of lines, the 2.6.0 version of the kernel code line is about 5.92 million lines, and the 4.18.X version of the kernel code is extremely large, and it has exceeded 25 million lines! So it is almost impossible to fully annotate and elaborate on these kernels. The 0.12 version of the kernel does not exceed 20,000 lines of code, so it can be explained and commented clearly in a book
Chinese reader here. When I was in college about 11 or 12 years ago, a previous version of it was used as one of our textbooks for the Operating System course. Most assignments and homework involved adding or modifying modules in kernel 0.11.
> At present, people in China are already organizing human annotations to publish books similar to this article.
Maybe Chinese programmers will herald an increase in literate programming? Seems like a lot of effort could be saved in back-annotating by just starting the program as a literate one in the first place...
I think the prevalence of books of commented sources in China partly comes from the way that big Chinese companies interview people: they ask about a lot of implementation details (especially for DBs), even though in most cases that is useless in daily work (similar to Leetcode questions in US interviews). IMO, literate programming sounds more like software development in Japan, where engineers at big companies write a high-level specification, then the first outsourcing company models the class hierarchy, the second outsourcing company writes the function declarations and comments, and the third outsourcing company finally implements them.
Here's my favorite example of a literate program: http://www.pbr-book.org/ I don't see how such a thing could be constructed using that Japanese way. Though I'm not sure how any software could be constructed in that Japanese way. ;)
Just went to Vancouver a few months ago and I immediately recognized the aquabus and the background. I was right where the picture was taken, eating the market's beef jerky! Crazy small world.
Many of my friends read the Chinese/original version of it more than a decade ago. It's a dictionary-style book. Unfortunately I never had the patience to read it.
As other commenters pointed out, this seems like an excellent piece! It's not often that a thousand-page book catches my attention, and then after a while I notice I've been reading the first few pages intensely and wanting more, and so far it has only been some paragraphs about the people involved at the very beginning of Linux!
Can someone point me to the section of the book where they explain how exactly the kernel manages sockets, binds a port, and keeps track of bound/allocated ports?
An easy way to do this is to figure out the related syscalls. In this case that would be open (the VFS side of sockets), accept, etc. Grep for those and then you'll see the data structures and the code they use.
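If plain grep isn't handy, the "grep for the syscalls" approach can be roughed out in a few lines of Python. This is only a sketch: `grep_tree` is a made-up helper, and identifiers such as `sys_bind` and `sys_accept` are placeholders I'm assuming for illustration, since the actual entry-point names depend on the kernel version you are reading.

```python
import os
import re


def grep_tree(root, names, suffixes=(".c", ".h")):
    """Return (path, line_number, text) for every source line under root
    that mentions one of the given identifiers as a whole word."""
    # \b keeps "sys_bind" from matching inside longer identifiers.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for filename in sorted(filenames):
            if not filename.endswith(suffixes):
                continue  # only look at C sources and headers
            path = os.path.join(dirpath, filename)
            # Old source trees are not always valid UTF-8, so be lenient.
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if pattern.search(line):
                        hits.append((path, lineno, line.rstrip()))
    return hits
```

Calling something like `grep_tree("path/to/kernel", ["sys_bind", "sys_accept"])` gives you a list of files and line numbers to start reading from; from there you follow the structs those functions touch.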
It's available online here: http://www.lemis.com/grog/Documentation/Lions/index.php