
Xv6, a simple Unix-like teaching operating system - cturner
http://pdos.csail.mit.edu/6.828/2012/xv6.html
======
bane
Really cool project. Older, simpler, unixes are surprisingly relevant for
undergrad-level learning.

It reminds me of my OS class in my undergrad days. We skipped a few bits by
using the JVM to get a few things for free, but we still had to write (as
groups of 3-5 people) an entire unix-like, complete with user-space programs,
messaging systems, etc.

Definitely one of the most valuable classes I took for my degree.

------
btian
There's also the xv6 book, which is meant to accompany the source code:
[http://pdos.csail.mit.edu/6.828/2012/xv6/book-rev7.pdf](http://pdos.csail.mit.edu/6.828/2012/xv6/book-rev7.pdf)

~~~
csense
The line numbers in the book reference a 92-page printout of the source code
here:
[http://pdos.csail.mit.edu/6.828/2012/xv6/xv6-rev7.pdf](http://pdos.csail.mit.edu/6.828/2012/xv6/xv6-rev7.pdf)

------
kirbyk
Purdue has a similar OS, Xinu, that was created for teaching. It is actually
used in industry around the world. -
[http://www.xinu.cs.purdue.edu/](http://www.xinu.cs.purdue.edu/)

~~~
luckydude
+1 I learned a boatload by reading the xinu books.

Also met Doug, he's a great guy, old school C hacker.

------
adamnemecek
That reminds me, I stumbled upon the website of the Unix Heritage Society
([http://www.tuhs.org/](http://www.tuhs.org/)) which has the source code for
the early versions of Unix.

Here is the first version

[http://minnie.tuhs.org/cgi-bin/utree.pl?file=V1](http://minnie.tuhs.org/cgi-bin/utree.pl?file=V1)

I did not really spend enough time reading the source to understand what's
going on but it's interesting regardless.

~~~
derekp7
There's also a port of V7 Unix to x86, by
[http://www.nordier.com/v7x86/](http://www.nordier.com/v7x86/)

I only wish that there was an add-on tcp/ip stack for that version, then one
could really have fun with it. Yes, I know, the ip stack add on is called BSD.

~~~
rst
You might want to look for 2.9BSD, a PDP-11 Unix variant which was more or
less v7 with the TCP stack from 4.xBSD grafted in. Patching this into an x86
port would still be a bit of a project, but it's probably easier than starting
with 4.xBSD directly, or starting from scratch.

~~~
ChuckMcM
This is great advice. 2.9BSD was the first UNIX I got to play with in depth.
It has all the pieces you need, and it can do networking.

------
swetland
Because splatting object files and intermediates all over the same directories
as the source drives me nuts, I shuffled things so that it's a bit easier to
find things (splitting source into kernel/, user/, ulib/, tools/, and include/
directories) and generated files are collected into kobj/, uobj/, out/, and
fs/ directories.

[https://github.com/swetland/xv6](https://github.com/swetland/xv6)

I also tweaked the scheduler to halt if there's nothing to schedule, per
tankenmate's suggestion.

A side-effect of this change is the generate-source-code-printout targets were
rather severely broken, so they're simply removed for the moment.

------
tankenmate
If you are running this on your laptop and want to save your battery then use
this patch...

[http://pastebin.com/BxXDDtTh](http://pastebin.com/BxXDDtTh)

~~~
akkartik
Interesting. How does the processor restart after _hlt_?

~~~
tankenmate
Any interrupt will take the processor out of the hlt state; e.g. hit a key on
the keyboard and it will fire an interrupt, and the processor will start
processing again. The clock tick will also fire an interrupt. Once an
interrupt fires, the scheduler starts searching for runnable processes again;
e.g. a process can return from a read() on its STDIN tty now that you have
typed a key.
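
The halt-then-wake behaviour described here can be sketched as a small user-space simulation. This is not xv6's scheduler and not the pastebin patch: `pick_runnable`, `cpu_halt`, and `schedule_once` are hypothetical names, and `cpu_halt` only pretends an interrupt arrived (a keypress waking a sleeping process), since the real `hlt` instruction is privileged and cannot run in an ordinary program.

```c
#include <assert.h>
#include <stddef.h>

enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING };

struct proc {
    enum procstate state;
    int pid;
};

/* Scan the table and return the first runnable process, or NULL. */
static struct proc *pick_runnable(struct proc *table, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (table[i].state == RUNNABLE)
            return &table[i];
    return NULL;
}

/* Stand-in for `sti; hlt`. On real hardware the CPU would stop here until
 * any interrupt arrives. In this simulation the "interrupt" is a keypress
 * whose handler wakes the first sleeping process.
 * Returns 1 if something was woken, 0 if nothing was sleeping. */
static int cpu_halt(struct proc *table, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (table[i].state == SLEEPING) {
            table[i].state = RUNNABLE;
            return 1;
        }
    }
    return 0;
}

/* Pick a process to run, halting whenever nothing is runnable.
 * Counts halts in *halts and returns the chosen pid, or -1 if the
 * simulation has no process that could ever wake up. */
static int schedule_once(struct proc *table, size_t n, int *halts) {
    assert(halts != NULL);
    for (;;) {
        struct proc *p = pick_runnable(table, n);
        if (p != NULL) {
            p->state = RUNNING;
            return p->pid;
        }
        (*halts)++;                 /* nothing to do: idle instead of spinning */
        if (!cpu_halt(table, n))
            return -1;              /* simulation only: avoid looping forever */
    }
}
```

With a table holding one sleeping process, `schedule_once` halts once, the simulated interrupt makes the process runnable, and the next scan picks it up. A real kernel would rescan after every interrupt, whether or not that interrupt woke anything.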

~~~
swetland
There's some inefficiency here in that wakeup() doesn't cause the other cpu to
wake immediately from hlt, so it'll remain halted until the next timer tick or
whatnot.

It'd be fun to add some scheduler stats and poke around with things a bit.
Some really trivial cprintf() tracing in scheduler() implies that single
processes can ping-pong between CPUs, but the printing itself is slow and
invasive and could possibly be the cause.
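
One less invasive way to check the ping-pong observation would be to count migrations in memory and read the totals afterwards, rather than printing on every switch. A minimal sketch, assuming hypothetical names (`sched_stats`, `stats_init`, `record_run`) that are not part of xv6, and indexing by process-table slot rather than pid:

```c
#include <assert.h>

#define NSLOT 64   /* assumed process-table size for this sketch */

struct sched_stats {
    int last_cpu[NSLOT];          /* CPU each slot last ran on, -1 if never */
    unsigned runs[NSLOT];         /* how many times each slot was scheduled */
    unsigned migrations[NSLOT];   /* how many times it changed CPU */
};

static void stats_init(struct sched_stats *s) {
    for (int i = 0; i < NSLOT; i++) {
        s->last_cpu[i] = -1;
        s->runs[i] = 0;
        s->migrations[i] = 0;
    }
}

/* Intended to be called by the scheduler just before it switches to the
 * process in `slot` on `cpu`. In a real kernel this update would have to
 * happen under whatever lock protects the process table. */
static void record_run(struct sched_stats *s, int slot, int cpu) {
    assert(slot >= 0 && slot < NSLOT);
    s->runs[slot]++;
    if (s->last_cpu[slot] != -1 && s->last_cpu[slot] != cpu)
        s->migrations[slot]++;
    s->last_cpu[slot] = cpu;
}
```

A handful of integer updates per context switch should perturb timing far less than console output, so a high `migrations` count here would suggest the ping-ponging is real rather than an artifact of the tracing.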

------
oscargrouch
>Russ Cox (rsc@swtch.com)

>Frans Kaashoek (kaashoek@mit.edu)

>Robert Morris (rtm@mit.edu)

sh$%t, this has got to be good!

Cloning the repo and browsing the source code, just for fun, right now :)

------
Jonanin
xv6 is a pleasure to hack on and learn from! It's got everything you'd expect
from a unix-like OS (multi-processor support, simple scheduler, unix-like FS,
many of the standard syscalls, fork/exec based shell, etc).

Linux and *BSD may be the modern standard for unix-like OSes, but they're
often too complex and opaque for beginners to learn from. It's super easy to
read and toy with xv6's code if you want to learn more about OS internals,
though. For example, I was able to implement a simple log structured file
system:
[https://github.com/Jonanin/xv6-lfs](https://github.com/Jonanin/xv6-lfs)

Note: if you're looking for companion reading material, check out this OS
book:
[http://pages.cs.wisc.edu/~remzi/OSTEP/](http://pages.cs.wisc.edu/~remzi/OSTEP/)
(IMO it's super approachable and understandable for undergrad-level learning)

------
kps
I just did a quick grep through the code, and apparently students _are_
expected to understand this.

~~~
bodyfour
Shame this is getting downvoted, probably by people who don't understand the
reference. See the second section ("You are not expected to understand this")
in: [http://cm.bell-labs.com/cm/cs/who/dmr/odd.html](http://cm.bell-labs.com/cm/cs/who/dmr/odd.html)

------
capkutay
I wonder how this compares to pintos
([http://www.stanford.edu/class/cs140/projects/pintos/pintos_1...](http://www.stanford.edu/class/cs140/projects/pintos/pintos_1.html))
which is an i386 Unix-like OS. In my undergrad class we had to build thread
scheduling, syscalls, virtual memory, and some file system functionality on
top of pintos. It was the most challenging and most interesting set of
projects I've ever completed.

------
merkury7
Adding to the list of alternatives:

MINIX ([http://www.minix3.org/](http://www.minix3.org/))

~~~
mwcampbell
At the risk of reigniting an old flame war, it seems to me that a microkernel
in which core OS functions such as process management and the virtual file
system are in separate servers would be _more_ complicated than a monolithic
kernel. Having device drivers and specific file systems in user space may be
useful, though.

~~~
mcintyre1994
The impression I get (university class on OSes) is that the
microkernel/modular approach is more about making it simpler to port than
making a specific instance simple? Similarly it'd be less complicated to
extend since that's just a case of adding a new loadable module. In terms of a
learning OS I'm not sure exactly which should be prioritised -
extendability/portability or a simple structure? If this is a topic with a lot
of depth I'm obviously not getting I'd definitely be interested in a
discussion - my exam is soon :)

~~~
mwcampbell
Have you read the debate between Andy Tanenbaum and Linus Torvalds from about
1992? It's a discussion of the tradeoffs between microkernels (Tanenbaum) and
monolithic kernels (Torvalds). You can find it here:

[http://oreilly.com/catalog/opensources/book/appa.html](http://oreilly.com/catalog/opensources/book/appa.html)

The chapter by Torvalds in the same book is also worth reading:

[http://oreilly.com/openbook/opensources/book/linus.html](http://oreilly.com/openbook/opensources/book/linus.html)

The claim that microkernels improve portability makes no sense to me, since
Linux itself is very portable, as are the BSDs (especially NetBSD and
OpenBSD).

~~~
stplsd
This is a flame war between two men with HUGE egos, not a debate.

------
blt
Awesome! I really hope MIT releases video lectures for 6.828 so I can do the
full course on my own. I'm listening to Berkeley's OS lectures right now but I
would love a course built around xv6.

~~~
ash
This page has links to 6.828 lecture videos (2011):

[http://pdos.csail.mit.edu/6.828/2011/schedule.html](http://pdos.csail.mit.edu/6.828/2011/schedule.html)

------
__mp
Alternatively there's also Barrelfish:
[http://www.barrelfish.org/](http://www.barrelfish.org/) but there the focus
lies on multiple cores (48ish+++).

------
singular
For anyone else who is playing with this in OS X Mavericks, I've written a
blog post describing how to get things working on it (there are a couple steps
that need to be tweaked.) -
[http://blog.ljs.io/post/71424794630/xv6](http://blog.ljs.io/post/71424794630/xv6)

Hopefully this is useful to somebody and isn't too blatantly link whorey :)

------
mjg59
Someone's going to have the fun job of porting this to UEFI at some point in
the not-too-distant future.

~~~
geofft
It's multiboot, so you can just boot it from grub.efi. See sheet 10,
xv6/entry.S.
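
For anyone unfamiliar with what "it's multiboot" means in practice: a Multiboot (version 1) loader scans the first 8192 bytes of the kernel image, at 4-byte-aligned offsets, for a header whose magic, flags, and checksum fields sum to zero modulo 2^32. A rough sketch of that check follows. The struct and function names are made up for illustration, only the three mandatory fields are modelled, and the bytes are read in host order, which assumes a little-endian host to match x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u
#define MULTIBOOT_SEARCH       8192u   /* header must lie within this many bytes */

/* The three mandatory fields of a Multiboot 1 header. */
struct multiboot_header {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum;
};

/* Build a header for the given flags; the checksum is chosen so that
 * magic + flags + checksum wraps around to zero. */
static struct multiboot_header make_header(uint32_t flags) {
    struct multiboot_header h;
    h.magic = MULTIBOOT_HEADER_MAGIC;
    h.flags = flags;
    h.checksum = (uint32_t)0 - (h.magic + h.flags);   /* unsigned wraparound */
    return h;
}

/* Scan an image the way a loader would. Returns the byte offset of the
 * first valid header, or -1 if none is found in the search window. */
static long find_header(const unsigned char *image, size_t len) {
    assert(image != NULL);
    size_t limit = len < MULTIBOOT_SEARCH ? len : MULTIBOOT_SEARCH;
    for (size_t off = 0; off + sizeof(struct multiboot_header) <= limit; off += 4) {
        struct multiboot_header h;
        memcpy(&h, image + off, sizeof h);   /* avoids unaligned access */
        if (h.magic == MULTIBOOT_HEADER_MAGIC &&
            (uint32_t)(h.magic + h.flags + h.checksum) == 0)
            return (long)off;
    }
    return -1;
}
```

With flags of zero the checksum works out to 0xE4524FFE. Any loader that implements this search, grub.efi included, can in principle find the kernel's entry point without BIOS-specific boot code.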

The more annoying thing is likely to be that it's 32-bit only. Are there any
common machines that do x86-32 UEFI boot? Alternatively, does a 64-bit
grub.efi know how to exit long mode and go into protected mode, to multiboot a
32-bit kernel?

(I was actually talking the other day with some folks about whether a 6.828
project involving a UEFI port of JOS, the other teaching OS, would make sense,
but the 64-bit thing seemed to make it less interesting unless you also plan
to do a 64-bit port of JOS. Which has been done before in a final project,
admittedly...)

~~~
mjg59
Ah, true - there's no support for any other BIOS-based services, so that's not
going to be a problem. Being 32-bit only isn't really an issue, there's no
theoretical problem with a UEFI-based bootloader jumping into 32-bit before
executing a multiboot image. You won't be able to make any UEFI runtime calls,
but that's probably not a high priority anyway.

------
chris_wot
The link to the schedule points to:

[http://pdos.csail.mit.edu/6.828/schedule.html](http://pdos.csail.mit.edu/6.828/schedule.html)

It should be:

[http://pdos.csail.mit.edu/6.828/2012/schedule.html](http://pdos.csail.mit.edu/6.828/2012/schedule.html)

------
Moral_
We're using the Xv6 kernel for the OS class here at the University of Utah,
for the first time. In my opinion an OS class that makes you hack on low-level
code is significantly more beneficial than just a theory class.

------
contingencies
Alternatively, historical codebases of various unices at
[http://www.tuhs.org/archive_sites.html](http://www.tuhs.org/archive_sites.html)

------
csense
If "Xv6's use of the x86 [is intended to make] it more relevant to students'
experience," then why use AT&T assembly language syntax rather than Intel?

~~~
rsc
Because the GNU toolchain uses AT&T syntax and most students have the GNU
toolchain readily available?

~~~
csense
> GNU toolchain uses AT&T syntax

But also accepts Intel syntax with a directive, and even admits "almost all
80386 documents use Intel syntax" [1].

As for user acceptance, a poll on the xkcd forums shows that the vast majority
of users prefer Intel syntax [2].

That takes care of the political/network effects arguments. As far as the
actual technical arguments, the position against AT&T syntax is obvious and
overwhelming -- the unnaturalness of src,dest operand order; the laziness of
the parser's authors at the expense of its users by requiring percents or
dollar signs to indicate token types; the usually-redundant size postfix; and
the utter incompatibility with basically all non-GNU x86 documentation.

[1]
[https://sourceware.org/binutils/docs/as/i386_002dVariations....](https://sourceware.org/binutils/docs/as/i386_002dVariations.html#i386_002dVariations)

[2]
[http://forums.xkcd.com/viewtopic.php?f=40&t=19558](http://forums.xkcd.com/viewtopic.php?f=40&t=19558)

~~~
volkadav
Strictly imho, the translation overhead of reading one or the other if you're
used to the opposite syntax is not that high. We're not talking haskell vs.
c++ here, it's operand ordering and maybe a few characters' worth of
difference in the commands (e.g. add vs addl) or typing register names and so
on. The basis of my opinion, fwiw, is that all of my classes involving asm
used GNU tools/AT&T, and I never felt enormous difficulties reading Intel-
inflected docs. It's quite possible that I was just lucky enough to have never
been bitten badly, I don't know.

So while I don't disagree with your technical reasoning, I don't think that
it's a significant pedagogical barrier. The puckish part of me wants to
suggest that having to figure something out when the documents are a) insane
b) backwards c) written in Martian d) all of the above is great preparation
for commercial work. ;)

~~~
swetland
The reality is you just don't need to write a whole lot of assembly day-to-day
so using what the tools support best tends to be the path of least resistance.

Xv6 itself has about 5000 lines of .c and 364 lines of .S in the kernel.
Another 1500 lines in vectors.S but that's machine generated. There's a ton of
stuff one could add to Xv6 (drivers, syscalls, services, etc), almost all of
which one would likely just write in C.

In practice, I've found assembly mostly used for some hand-tuned inner loops
(codecs, memcpy, etc) and little snippets of glue code like entry.S, swtch.S,
etc.

------
samograd
I thought this was pretty awesome when I found it a couple of years ago,
especially since it's multicore.

------
sillysaurus2
Yes! I'm so happy to see this featured on HN. You'll be fascinated by the
topics presented in xv6.

Have you ever:

- ... Wondered how a filesystem can survive a power outage?

- ... Wondered how to organize C code?

- ... Wondered how memory allocation works?

- ... Wondered how memory paging works?

- ... Wondered about the difference between a kernel function and a userspace
function?

- ... Wondered how a shell works? (How it parses your commands, or how to
write your own, etc)

- ... Wondered how a mutex can be implemented? Or how to have multiple
threads executing safely?

- How multiple processes are scheduled by the OS? Priority, etc?

- How permissions are enforced by the OS? Security model? Why Unix won while
Multics didn't (simplicity)?

- How piping works? Stdin/stdout and how to compose them together to build
complicated systems without drowning in complexity?

- So much more!

I credit studying xv6 as being one of the most important decisions I've made;
up there with learning vim or emacs, or touch typing. This is foundational
knowledge which will serve you the rest of your life in a thousand ways, both
subtle and overt. Spend a weekend or two dissecting xv6 and you'll love
yourself for it later. (Be sure to study the book, not just the source code.
It's freely available online. The source code is also distributed as a PDF,
which seems strange till you start reading the book. Both PDFs are meant to be
read simultaneously, rather than each alone.)

~~~
paul_milligram
I'm sold. Based on your experience, would this course be approachable for
self-study by individuals without 3 years of CS education? The MIT page[1]
suggests the intended audience for the class are seniors and graduate students
(M.Eng and PhD).

[1]
[http://pdos.csail.mit.edu/6.828/2012/general.html](http://pdos.csail.mit.edu/6.828/2012/general.html)

~~~
DigitalJack
In my opinion, the 3 years of CS has more to do with age of the student than
actual learning. With a requisite level of maturity, this should not be a
problem.

------
Dewie
I'm not knowledgeable about different philosophies when it comes to OS's, but
it seems that pretty much everyone thinks that being Unix-like is a good
thing. Is that true? Or are there some people who think that some different
philosophy that is perhaps more or less mutually exclusive with regards to
being Unix-like is better? I don't even know if that last question makes
sense.

~~~
cturner
Inspiring comment from Bane early this year,
[https://news.ycombinator.com/item?id=5624912](https://news.ycombinator.com/item?id=5624912)

If you need a yacht that can carry many people on the high seas through bad
weather at speed, unix is solid. But there hasn't been a great racing dinghy
since the 80s.

Maybe there is a demoscene/hacking tradition out there, waiting to be
discovered.

Consider the debates that we don't have:

* _Hardware abstraction_. Casual audio programming is harder now than it was twenty years ago. Also - why doesn't hardware self-describe?

* _High-level languages_. Chuck Moore's ideas of code bloat are on a totally different level to the mainstream. He's onto something.

* _Protected memory_. Does it matter on a machine designed for fun? How much more hackable could your OS be if you got rid of stuff like this?

Stability doesn't matter so much in fun systems. If you're in a dinghy and not
capsizing, you're not sailing hard enough.

~~~
tokenrove
The VPRI (vpri.org) has some interesting ideas with respect to your second
bullet point.

Also, there was something about Arthur Whitney looking to build a stand-alone
OS in/under K recently.

~~~
cturner
Found the reference. [http://kparc.com/os.htm](http://kparc.com/os.htm)

Thanks - looks awesome. I particularly like his approach to HTML.

------
clown_penis
MIT has essentially recreated XINU. BFD.
[http://en.wikipedia.org/wiki/Xinu](http://en.wikipedia.org/wiki/Xinu)

