Xv6 (wikipedia.org)
350 points by jdmoreira on Nov 16, 2015 | 47 comments



A few months ago I went through xv6 on my own as an experiment in getting better at systems programming. I wrote a bunch of posts about it, in case people are interested:

Grokking xv6: http://experiments.oskarth.com/unix00/

What is a shell and how does it work?: http://experiments.oskarth.com/unix01/

What's on the stack?: http://experiments.oskarth.com/unix02/ (video of tracing a system call from user space to kernel space and back: https://www.youtube.com/watch?v=TWksEdn5eoA)

Page tables and virtual memory: http://experiments.oskarth.com/unix03/

Locks and concurrency: http://experiments.oskarth.com/unix04/

A short overview of the file system: http://experiments.oskarth.com/unix05/

Grok LOC? http://experiments.oskarth.com/unix06/

It was a very educational experience for me, and I highly recommend the journey for other people.


Thank you for this. I know what I'm reading for the next two months!

In your original post, you mention one of the goals being able to contribute a patch to a modern OS such as BSD or Linux. After your experiment, do you feel like this is the case?


Glad it was useful and motivating! This one I feel a bit bad about, because I wanted to write a patch and test that hypothesis, but I didn't (yet) make the time for it.

I feel like I could after maybe ~30h (really I don't have enough information to determine beyond it probably being between 10 and 100 hours) of focused effort. I did read a bunch of The Design and Implementation of FreeBSD and I felt like I could understand it reasonably well, using the mental models I picked up studying xv6.

The reasons why I didn't do it (yet) are interesting in themselves. There was nothing in particular that I felt was missing from FreeBSD, and I hadn't used the "advanced" parts of the OS enough to run into any bugs. Because of this I think I lacked a sense of ownership, and the artificiality of a bug hunting activity caught up to me, motivation-wise. To do this I think you need to be motivated to dive into the nitty-gritty of a particular part of the OS, in addition to having a good understanding of the OS as a whole. Instead I ended up exploring some of the more advanced parts of FreeBSD (Jails and ZFS) with http://experiments.oskarth.com/netpowder/ - hopefully I will find something useful that I want to either extend or a bug I want to fix, or perhaps I get to test the hypothesis with something completely different (such as Mirage OS).


I'm a student at Johns Hopkins, and we're using xv6 in our OS course this semester: http://gaming.jhu.edu/~phf/2015/fall/cs318/

xv6 is pretty awesome for learning, and lecture is generally spent reading through the source code, some of which is...cute.

Most of us have the source code printed out. If you're interested in reading the source, this document is well-formatted: https://pdos.csail.mit.edu/6.828/2014/xv6/xv6-rev8.pdf (and can actually be generated by the Makefile!)


I went through my school's version of this class on xv6, and wow, what a trip. After thinking hard and working hard on things I never had a solid grasp on - filesystems, virtual memory, interrupts, multitasking - I now appreciate and understand things going on after a call to fork() or exec().

For my final project I implemented a simple threading library based on the interface that pthreads uses. It's amazing how beautifully simple kernels can be.
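
For anyone who has not seen the fork()/exec() pattern written out, here is a minimal sketch in POSIX C (not xv6 code; xv6's own calls have simpler signatures). The helper name run_and_wait is made up for illustration, and the child's exit code of 127 on a failed exec is just the usual shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments in a child process and return its
 * exit status, or -1 if fork/wait fails or the child died abnormally.
 * run_and_wait is a hypothetical helper name, not part of xv6 or POSIX. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();            /* duplicate the calling process */
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }

    /* Parent: block until the child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv such as {"sh", "-c", "exit 3", NULL} should hand back the child's exit code. Everything interesting (copying the address space, loading the new image, reaping the child) happens inside the kernel, which is the part the course makes you read.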


So the ls and make and all those tools; where do they come from? Are they straight compiles using GNU gcc tools etc?

I only ask because I have no idea how much is the kernel compared to everything else; and if it has its own custom compiler or whatever.


No, they are minimal versions written for the OS. Here is ls[0] and cat[1]. You can compile with GCC.

[0] http://www.ccs.neu.edu/course/cs3650/unix-xv6/HTML/S/64.html [1] http://www.ccs.neu.edu/course/cs3650/unix-xv6/HTML/S/42.html


I wonder if a suitably minimalist C compiler (along the lines of C4[1] or C4x86[2], although perhaps a bit more featured) and/or assembler to go along with it would be a neat idea too - now you can have a complete self-bootstrapping OS which one person can easily understand.

[1] https://news.ycombinator.com/item?id=8558822

[2] https://news.ycombinator.com/item?id=8746054


Reading `cat`, how come error messages are printed on standard output and the exit code is always 0? Are stderr and exit codes not implemented/used in xv6?
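
For comparison, this is roughly what the conventional POSIX behaviour looks like: diagnostics go to file descriptor 2 and failure is reported through a nonzero status. It is a sketch in ordinary POSIX C, not a patch for xv6, and copy_fd is a hypothetical name.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd.
 * Returns 0 on success and 1 on failure, so a caller can pass the result
 * to exit(). Diagnostics go to stderr (fd 2), not stdout (fd 1).
 * copy_fd is a hypothetical name used for illustration. */
int copy_fd(int in_fd, int out_fd)
{
    char buf[512];
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                      /* handle short writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                fprintf(stderr, "cat: write error: %s\n", strerror(errno));
                return 1;
            }
            off += w;
        }
    }
    if (n < 0) {
        fprintf(stderr, "cat: read error: %s\n", strerror(errno));
        return 1;
    }
    return 0;
}
```

A main() built on this would simply return the result of copy_fd(0, 1), which lets a calling shell tell success from failure.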


Northeastern (my school, currently...) uses it as well. A quick search leads me to believe many schools do. Wonderful learning tool. Also fun to try to port to non-x86 :)


OP article lists a lot of schools using it as well.


Source available on GitHub: https://github.com/mit-pdos/xv6-public

Unless I'm missing something, the entire codebase is just 102 files (including README and such), no subdirectories, and 12,000 lines.

What a fantastic tool.


Any project with a TRICKS file is bound to be good.


xv6 is an amazing teaching OS - it is very simple to dive into and play around with. A while ago I wanted to explore some filesystem/permissions stuff and use xv6 to play with some concepts (see http://sarahjamielewis.com/posts/file-system-permissions-and... - still lots I want to play around with there when I find some time) - If you are at all interested in OS dev it is a great gateway.


I'm interested, but where would you suggest to start?


I just skimmed the PDF and it's really a good undergraduate text. I learned off the classic Tanenbaum stuff which is probably still good as a supplementary text, though it might be suffering the Dragon Book Syndrome at this point.

The Linux kernel itself is really well documented. Between the docs you can pull, KernelNewbies.org, the IRC channel, and the O'Reilly text "Understanding the Linux Kernel, 3rd ed", you can get up to speed re: conventions and develop a pretty good top-down view.

Then use strace/ltrace excessively on everything to get a bottom-up view. Strace is a great skill to have in general to help identify bugs in live code by attaching to the PID. Way more granular than a standard GDB attach. It's got a learning curve about on par with reading pre-C++11 template error msgs, but after a few months you'll start to recognize patterns, just as in tmpl error msgs, to the point where you can just skim the last few hundred lines strace piped to stdout and you'll know what sort of bug to deal with.

I haven't played with FreeBSD in nearly a decade but I do remember their documentation was second to none, including the IBM Red Books. So if you want to see another approach to a POSIX implementation (obviously Tanenbaum talks about MINIX which is semi-POSIX, so yeah, read that supplementary text), going through the Handbook from an end-user perspective then the mailing lists + source will give you a really interesting view as to why certain engineering decisions were made. I.e., why ipfw was replaced, the gradual progression of standard file systems from UFS all the way up to modern day ZFS, etc. The list offers a rare view behind the curtain exploring engineering decisions that end-users aren't often privy to, and the caliber of conversation is ridiculously high.


Grab the source, compile it and get it running...after that...whatever you find interesting. Some ideas:

* Trace the execution of one of the user land programs (e.g. ls) to see how it uses the syscalls.

* Add a new syscall - there are a few open-courseware labs online which have this as an introductory exercise (e.g. http://moss.cs.iit.edu/cs450/assign01-xv6-syscall.html)

* In a similar vein - find some other labs for other courses using this - and follow them.

* Read through the companion book (can be compiled from the source or downloaded pre-compiled) and follow it through the source code.
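
To give a feel for the "add a new syscall" exercise, here is a toy user-space model of a table-driven syscall dispatcher. It is only loosely modeled on the general approach; the numbers, handler names and return values are all invented for illustration and none of this is the actual xv6 code.

```c
#include <stddef.h>

/* Hypothetical syscall numbers for this toy model. */
enum { SYS_getpid = 1, SYS_uptime = 2, SYS_hello = 3 };

static int sys_getpid(void) { return 42; }   /* fake pid */
static int sys_uptime(void) { return 1000; } /* fake tick count */
static int sys_hello(void)  { return 7; }    /* the "new" syscall */

/* Table indexed by syscall number; unlisted slots are NULL. */
static int (*const syscalls[])(void) = {
    [SYS_getpid] = sys_getpid,
    [SYS_uptime] = sys_uptime,
    [SYS_hello]  = sys_hello,
};

/* Look up and run a handler; return -1 for unknown numbers. */
int dispatch(int num)
{
    size_t count = sizeof syscalls / sizeof syscalls[0];
    if (num > 0 && (size_t)num < count && syscalls[num] != NULL)
        return syscalls[num]();
    return -1;
}
```

In this model, adding a syscall means picking a number, writing a handler and adding one table entry. In a real kernel you would also need a user-space stub that traps into the kernel with that number.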


Some time ago, when I discovered xv6, I started playing with it and I ended up:

- Refactoring the build system

- Lots of code cleanup

- Adding a real distinction between distribution and kernel

- A clean libc

- Tons of other stuff

You can find it here: https://github.com/NewbiZ/xv6

Considering the original build system, I think having a look at this just for the sake of the cleaner makefiles is worth it.

There is also a mailing list for xv6, where people share their understanding of it: http://www.freelists.org/list/xv6


What did upstream think of it?


[deleted]


Well, they added a feature for this as it's quite useful. Click the "Past" link underneath the title on the page.


There's also OS161, which is Harvard's version of this. It comes with a very simple MIPS emulator, on top of which the actual operating system is built. It's a lot of fun!


The University of Toronto used to use OS161 as well. The OS161 code contains lots of comments, but xv6's textbook [1] covers things in more detail, which is sometimes useful.

[1] https://pdos.csail.mit.edu/6.828/2014/xv6/book-rev8.pdf


I took 828 and really enjoyed it. It was one of the best classes I took at MIT. The other OS for that class is JOS, which is the one you build through the labs. It's similarly simple to understand and not that long to build.


Looks like http://ocw.mit.edu/courses/electrical-engineering-and-comput... is accessible, but I'm not sure how much content is free.


The course material is all on the course website, you can find it here (look under "Labs"): https://pdos.csail.mit.edu/6.828/2014/


> I took 828 and really enjoyed it. It was one of the best classes I took at MIT. The other OS for that class is JOS, which is the one you build through the labs. It's similarly simple to understand and not that long to build.

I'm not sure what I'm supposed to take away from your comment aside from "I went to MIT".


that the course was really enjoyable. why is that not nice to know?

edit: here's an analogy - say the article was about a famous restaurant, and someone chimed in to say "i ate there when i was visiting hong kong, and it was one of the best meals i've ever had". now we already know that it's a good restaurant; that's what the article was all about after all. nevertheless, i like that sort of comment; it adds a human touch to the story because (rightly or wrongly) i perceive a fellow commenter as less remote than a newspaper food writer, and therefore their opinion has a certain anecdotal quality to it that the article lacks.


I spotted a few things.

1. Class 828 was a really good class, best he took. Others might want to try it too!

2. There's another operating system that's used in the class called JOS you might like to check out also if you are into learning about operating systems.

2. b) JOS is similarly quite simple to understand and doesn't take that long to build. Try it out also.

Hope this is helpful!


I didn't get that from his comment at all--silly.


So, is this a minix competitor?


Best thing about this is that it runs in Bochs.

As useful as it is, especially outside of i386, QEMU has grown to be a massive program with many dependencies that requires GB of RAM/swap during compilation.


Is there a Coursera or other MOOC that uses xv6?


https://pdos.csail.mit.edu/6.828/2014/schedule.html The MIT website pretty much has all the information open, which you can use to go through this course at your own pace. The lectures (most if not all) are on YouTube, search for 6.824


Where are the unit tests? Where are the valgrind tests?


If it's only for education purposes, why C? Wouldn't you want to choose a more "clear" language to get the ideas across?


If my school had decided to teach operating systems using some non-C language because it's "more clear," I can only imagine that my development as a programmer would have been delayed by years.

Sometimes education is about pure theory, and sometimes it's about how people do things out in the world. I think this is a case where the second approach is much more valuable.


Because C is the de-facto standard for writing operating systems.

If you want to write an oh-cool blog piece for general programmer audiences about how OSes work, sure, go ahead and write it in Python (seen that done, and pretty well, in fact).

But if you want to prepare people to be able to work on real operating systems, they need to know how to do it in C.


I think an OS class isn't meant to prepare you for working on the operating system. The vast majority of people that take an OS class aren't going to work on OSs, but they should appreciate how everything works. Using a language optimized for communicating "how everything works" clearly should be a priority!


Yes, and that language is C. C is a very simple language, and it's very clear. If you're writing anything that interfaces directly with the OS, you'll need to know C.


Perhaps because operating systems are still largely written in C and if a student in the class later goes on to work on OS internals not knowing C would be a major handicap.


> Wouldn't you want to choose a more "clear" language

Are you implying that another language would be more clear?

Quite the opposite, C doesn't hide anything from you. When you want to understand what the computer is doing you can tell directly from the C code. Unlike other languages where you need to understand what the language is doing first, and only then can you understand the computer.

There are times when "hiding" the computer is useful, but not here.


> C doesn't hide anything from you.

Yes, it does.

C hides cache, SIMD, registers, the stack (no multiple return values for you!), the details of the heap (malloc() either succeeds or fails, and you can't know what it's going to do until you call it), SMP, instruction-level parallelism, and the details of atomicity, all of which are relevant to OS programming.

C is a nice language. Don't pretend it's how the hardware really works.


...and, particularly crucially if you want to work with numbers, it also hides overflow and floating point exception behaviour. Plenty of processors support trapping arithmetic which signals on overflow and underflow, but this is practically unusable in C.
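
To make the overflow point concrete: the language gives you no view of the CPU's overflow flag, so you either reach for a compiler extension or test the operands before adding. A small sketch, assuming GCC or Clang for the first function; both function names are made up.

```c
#include <limits.h>
#include <stdbool.h>

/* Add two ints and report whether the mathematical result fits in an int.
 * Returns true on success and stores the sum in *out; returns false on
 * overflow. Relies on the GCC/Clang builtin __builtin_add_overflow, which
 * is a compiler extension, not standard C. */
bool checked_add(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}

/* The same check in portable C has to be done before the addition,
 * because signed overflow itself is undefined behaviour. */
bool checked_add_portable(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

Neither version traps. Each just turns the condition into a return value the caller has to remember to check.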


There's a big difference between hiding something and simply not knowing about it. If you write a kernel you will need to write architecture-specific primitives (and some not-quite-primitive code, since you bring up SMP) to deal with all the categories you mentioned and some more. Regardless of the language. There is no magic language for writing kernels with a fat runtime to hide the gory details.

C is a nice wrapper around assembly. In a few cases, you wish it were a bit better specified to control the actual assembly/ABI binding a bit better. The problem I see is a lack of contract enforcement which makes introducing hard to diagnose bugs really easy. Maybe Rust will fill the gap.


C is about as "clear" as one can get, other than assembly.

Most other languages hide away implementation details that are critical to writing an operating system--how would you handle, say, interrupts?


In Ada you just annotate a procedure as being an interrupt handler and it Just Works.

In addition, you can annotate a protected object (essentially, a group of shared procedures protected by an implicit mutex) as being interrupt-safe, and then any access to that object will be automatically protected by the appropriate instructions.

Plus, if you're in a Posix environment, you can use the exact same mechanism for interrupt handling. It's all remarkably elegant.

Alas, like everything Ada, the documentation is opaque in the extreme, but:

https://www2.adacore.com/gap-static/GNAT_Book/html/aarm/AA-C...

Note that at the bottom they're defining a parameterised interrupt handler structure and then instantiating it multiple times on multiple IRQs, each of which is in its own isolation domain...


What language did you have in mind? It has to be suitable for writing an OS, so Java, Python and whatnot are out. The project has been around for more than a decade so the language has to have had a stable v1 release more than 10 years ago. Also the original code you are working with is in C so you will need to translate it. C seems like a perfectly reasonable choice.



