I mean, I guess I can just dive in, and maybe that's the best approach, but is there a strategy anyone would recommend for reading the Linux source in terms of it making sense as a combined unit of code (as opposed to a collection of algorithms, if that makes sense)?
(1) Learn basic OS concepts through xv6.
xv6 is a reimplementation of an early version of Unix, designed to be as simple as possible and accompanied by a whole book of commentary. Get the book and the source code listing printed and bound: http://pdos.csail.mit.edu/6.828/2012/xv6.html. Work through the book and exercises. Use the lecture videos from 6.828 from 2011 if you need extra material to understand it: http://pdos.csail.mit.edu/6.828/2011/schedule.html.
(2) Pick a part of OSes that you are interested in. Contribute to that part in Linux.
Figure out where that part is in the Linux kernel. Find a bug in the bug tracker and submit a patch. I found filesystems interesting, so I fixed a small bug in one of the filesystems. Use a cross-referencer; it will save you a lot of time: http://lxr.free-electrons.com/source/include/linux/cpu.h. Also feel free to subscribe to the Linux kernel subreddit: http://www.reddit.com/r/kernel. I've set up the sidebar with a lot of useful links.
The Linux kernel is large and complex. You need to equip yourself with a mental model of an OS through xv6 and then pick one small, specific part to attack in Linux. Be tactical! Otherwise you will be overwhelmed.
As an aside, I'm actually currently working on a tool that parses the Linux source code to find symbol definitions and then works its way back through the Git history to find the commit message for when the symbol was first defined. These commit messages usually contain really useful information about the original intent of the symbol and implementation details. Currently fighting with a few bugs in my C grammar, but should be able to work through those soon. Please feel free to email me at firstname.lastname@example.org if you want to be pinged when the tool is released.
Are you just doing something similar to cscope to find the definition of a symbol, then running git blame on that line? Or are you actually checking earlier revisions as well, to see if the symbol was moved or changed types?
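For comparison, a rough approximation of the "first commit that introduced this symbol" idea is possible with git's pickaxe search (git log -S), without any C parsing. This is only a minimal sketch and not the parent's tool: the function names are made up for illustration, and -S matches any commit that changes the number of occurrences of the string, so it does not distinguish a definition from a use, and it will not tell you whether the symbol was moved or changed types.

```python
import subprocess


def pickaxe_command(symbol):
    """Build a git command listing commits that added or removed
    occurrences of `symbol`, oldest first, as '<hash><TAB><subject>'."""
    return ["git", "log", "--reverse", "--format=%H%x09%s", "-S" + symbol]


def parse_oldest(log_output):
    """Return (commit_hash, subject) from the first line of the log
    output, or None if no commit matched."""
    lines = log_output.splitlines()
    if not lines:
        return None
    commit, _, subject = lines[0].partition("\t")
    return commit, subject


def first_commit_touching(repo_path, symbol):
    """Run the pickaxe search inside `repo_path` (a local git checkout,
    assumed to exist) and return the oldest matching commit."""
    result = subprocess.run(
        pickaxe_command(symbol),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_oldest(result.stdout)
```

Calling first_commit_touching("/path/to/linux", "some_symbol") on a kernel checkout would return the oldest commit in that repository's history that touched the string; the path and symbol here are placeholders. Once you have the hash, git show <hash> gives you the full commit message.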
A good "gentle introduction" book is the Love book (440 pages): http://www.amazon.com/Linux-Kernel-Development-Robert-Love/d...
Also, it's not kernel-specific, but this book covers a lot of system programming concepts (expensive though):
If you just want to learn more about how the kernel fits together, reading http://lwn.net/Kernel/LDD3/ (Linux Device Drivers, freely downloadable) is a fine start.
The instructor was excellent.
Well, that's kinda expensive...
One thing I found a little different was that the OS has its own libraries for everything (string.h, etc.), which makes sense if you think about it.
If you want to browse the source code online, there's a website called LXR (Linux Cross Reference). It's got a good search tool and linked headers. Clicking on a function name shows you where that function is used.
You can install it yourself, and I think it's much faster that way.
There is Linux Weekly News too, which was a decent site back when I was still in the Linux porting world. Like many Linux sites, it seems to lack in style but makes up for it in content.
have "Understanding the Linux Kernel" as your reference manual.
by now, you should be comfortable reading and understanding the kernel source; download the linux kernel source and start browsing through the code.
simply reading books won't get you anywhere - you need to play around with the kernel source in order to understand the linux kernel's behavior and the different problems you may come across. write simple kernel modules to get the hang of how you can interact with and modify the kernel.
join some open source project and start fixing bugs you're comfortable with, or just play around with your local linux kernel source - make changes, build, deploy and observe what happens.
if you have no prior knowledge of OS theory and fundamentals, then you should start here first - read either of the following books:
1) Operating System Concepts by Silberschatz, Galvin OR 2) Modern Operating Systems by Tanenbaum
for the programming side - system calls and stuff - read:
1) Advanced Programming in the UNIX Environment by Richard Stevens
From that point you can look at yearly diffs to find subsystems of interest that have changed. (Are 4 rewrites of USB interesting?)
One thing I just did was build the first version of git (just sync backwards in the git/git repo and type "make"). I ran the commands manually, and looked at the data file formats, and at the source code, and it greatly improved my understanding of git. It's all still relevant.
The first version of git was shockingly small, like 500 LOC of plain C code or so, but it does a surprising amount of the core work. I also gained a lot of respect for Linus' coding style.
People have said that Linux itself has too many hands in it -- e.g. the system call interface is a huge mess. So I wasn't sure if I would like Linus' code, but I definitely do after reading it.
I think he doesn't care about consistency when merging, because git's interface is a huge mess, just like the Linux syscall interface. But his code is consistent and good for sure.
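On the data file formats mentioned above: here is a minimal sketch of how current git names and stores a blob object, in case it helps anyone poking at the object directory by hand. The function names are my own, and this reflects the format used by present-day git (SHA-1 repositories); the very first revision may differ in details, such as whether the hash is taken before or after compression.

```python
import hashlib
import zlib


def git_blob_payload(content: bytes) -> bytes:
    """Prefix the raw file content with git's object header:
    the type, a space, the decimal length, and a NUL byte."""
    return b"blob %d\x00" % len(content) + content


def git_blob_id(content: bytes) -> str:
    """Object name: the SHA-1 of the header plus the content."""
    return hashlib.sha1(git_blob_payload(content)).hexdigest()


def git_blob_file(content: bytes) -> bytes:
    """Bytes stored on disk for a loose object: the same
    header-plus-content payload, zlib-compressed."""
    return zlib.compress(git_blob_payload(content))
```

git_blob_id(b"hello\n") gives ce013625030ba8dba906f756967f9e9ca394464a, the same value that echo hello | git hash-object --stdin prints. The first two hex digits become the directory name under .git/objects and the remaining 38 the file name.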