
MINIX 3: a Modular, Self-Healing POSIX-compatible Operating System  - axman6
http://www.youtube.com/watch?v=bx3KuE7UjGA
======
pedrocr
This 6k vs 6M LoC comparison is pretty dubious. If your disk driver has a bug
and overwrites some data you're just as screwed up, be it running in userspace
or kernel space. The argument these bugs are less powerful because they are
now in userspace "can't do very much" is at best limited. Drivers can screw up
hardware just as easily, filesystems can screw up your data just as easily,
etc. The memory grant stuff seems interesting though and I see how that can
protect you from some forms of memory corruption.
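
To make the memory grant idea concrete, here is a minimal sketch in Python. It is an illustration only, not the MINIX 3 interface: `Owner`, `Grant`, `safe_write`, and the process name `disk_driver` are all hypothetical. The assumption is that a process hands out a grant for a byte window of its own buffer, and that every copy is checked against that grant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    grantee: str    # which process may use the grant
    start: int      # first byte of the granted window
    length: int     # size of the window in bytes
    writable: bool  # whether the grantee may write into the window


class GrantError(Exception):
    """Raised when an access is not covered by a grant."""


class Owner:
    """A process that owns a buffer and hands out grants to parts of it."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.grants = {}
        self._next_id = 0

    def grant(self, grantee, start, length, writable=False):
        if start < 0 or length < 0 or start + length > len(self.memory):
            raise ValueError("grant outside the owner's buffer")
        gid = self._next_id
        self._next_id += 1
        self.grants[gid] = Grant(grantee, start, length, writable)
        return gid

    def revoke(self, gid):
        self.grants.pop(gid, None)


def safe_write(owner, gid, caller, offset, data):
    """Checked copy: write into the owner's buffer only if the grant allows it."""
    g = owner.grants.get(gid)
    if g is None or g.grantee != caller:
        raise GrantError("no such grant for this caller")
    if not g.writable:
        raise GrantError("grant is read-only")
    if offset < 0 or offset + len(data) > g.length:
        raise GrantError("access outside the granted window")
    begin = g.start + offset
    owner.memory[begin:begin + len(data)] = data
```

A buggy driver holding a grant for bytes 4..7 can still write garbage into those four bytes, but a stray write anywhere else is rejected. That matches the point above: this guards against some forms of memory corruption, not against a driver that writes wrong data to the right place.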

Running all these processes in userspace seems to gain you some capabilities
to more easily respawn/reset drivers. Live upgrade seems exciting. I wonder if
you are paying for that in added complexity that is hard to debug. The tests
they did basically simulate hardware errors, and the system does seem to be
quite resilient to those. In practice I fear the software bugs more. The bugs
I actually see in Linux tend to be oopses generated by bad code that simply
disable the driver. The bad ones screw up the hardware they are controlling
and require a reboot. It is unclear in these situations that the driver can be
correctly reset without resetting the hardware too.

I wonder how much of this respawn/update functionality you can apply easily to
a Linux kernel right now. Most drivers and filesystems can already be built as
modules and loaded/unloaded in a live kernel. I wonder how much that can be
extended.

~~~
0bfusct3
>This 6k vs 6M LoC comparison is pretty dubious. If your disk driver has a bug
and overwrites some data you're just as screwed up, be it running in userspace
or kernel space. The argument these bugs are less powerful because they are
now in userspace "can't do very much" is at best limited

They are less powerful in user-space. He states this about 25 minutes into the
talk: "moving bugs to user-space will do less damage"... roughly. This is true:
instead of getting full ring0 access to anything, I can only do what the driver
is allowed to do if I exploit a bug in the driver.

>Running all these processes in userspace seems to gain you some capabilities
to more easily respawn/reset drivers. Live upgrade seems exciting. I wonder if
you are paying for that in added complexity that is hard to debug

Why would there be added complexity? Linux has an API as well, but one that is
less well defined than simple IPC, and even more complex. Hard to debug? You do
understand that having parts of the kernel in userspace makes it _easier_ to
debug?

>The bugs I actually see in Linux tend to be oopses generated by bad code that
simply disable the driver.

Read LWN; there's roughly a root exploit every 2 weeks.

An ideal operating system would be a sort of exokernel but proven and a
'relaxed' api that would allow distributed computing. Unverified applications
would run under a vm with proven parts being compiled as well as compiling
heuristically verified parts and jitting the other needed parts.

~~~
pedrocr
>They are less powerful in user-space. He states this about 25 minutes into the
talk: "moving bugs to user-space will do less damage"... roughly. This is true:
instead of getting full ring0 access to anything, I can only do what the driver
is allowed to do if I exploit a bug in the driver.

I did listen to the talk and that justification. That's why I said it was
pretty limited. That may be true for security bugs in drivers. For a network
driver that may even be very important. In practice what I see is that the
actual bugs I care about in Linux drivers are code bugs that disable the
device, or in a filesystem cause disk corruption. None of those are solved by
a microkernel. Microkernels give you a bunch of provable advantages in areas
that monolithic kernels don't seem to do too badly at.

>Why would there be added complexity? Linux has an api just as well but less
defined than simple ipc - even more complex.

This is anything but simple IPC. You're sending async messages around and
wanting to handle restart of whole pieces and reissuing of commands. It is
much more complex, with many more edge cases than the equivalent Linux call
stack.

>Hard to debug? you do understand that having parts of the kernel in userspace
makes it easier to debug

Because now you're trying to restart a driver for a device that is in an
unknown state and then restarting the operation of the filesystem accessing
the driver that now has to make sure its operations are idempotent otherwise
it will screw up. The number of new edge cases is immense. It could get hairy
really fast. That is even touched upon in the presentation with the async
messaging and deadlock avoidance. That's why it's harder to debug. Because
you're adding a bunch of complex code in error handling paths that get
executed once in a blue moon.
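
A small Python sketch of the idempotency problem described above. Everything here is hypothetical (`FlakyDiskDriver`, `call_with_restart`, the dict standing in for a disk); it assumes the driver crashes after touching the device but before replying, so the caller cannot tell whether the operation happened and has to reissue it after the restart.

```python
class DriverCrash(Exception):
    """The driver died before it could send a reply."""


class FlakyDiskDriver:
    """Hypothetical driver that may crash after the write but before replying."""

    def __init__(self, disk, crash_after_write=False):
        self.disk = disk  # device state; it survives driver restarts
        self.crash_after_write = crash_after_write

    def write_block(self, block, data):
        # Absolute write: repeating it leaves the disk in the same state.
        self.disk["blocks"][block] = data
        return self._reply()

    def append_log(self, record):
        # Relative write: repeating it duplicates the record.
        self.disk["log"].append(record)
        return self._reply()

    def _reply(self):
        if self.crash_after_write:
            raise DriverCrash("died before sending the reply")
        return "ok"


def call_with_restart(disk, request, max_restarts=1):
    """Issue request(driver); on a crash, start a fresh driver and reissue it."""
    driver = FlakyDiskDriver(disk, crash_after_write=True)
    for _ in range(max_restarts + 1):
        try:
            return request(driver)
        except DriverCrash:
            # Stand-in for the restart; the new instance does not crash.
            driver = FlakyDiskDriver(disk)
    raise RuntimeError("driver kept crashing")
```

Reissuing `write_block(7, data)` after the restart is harmless, but reissuing `append_log(record)` leaves the record in the log twice. The caller has to know which of its operations are safe to repeat, and that knowledge lives in exactly the rarely executed error path.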

>Read LWN; there's roughly a root exploit every 2 weeks.

I read LWN every week, there are some local root exploits once in a while. The
memory protection stuff could be good for that and you could implement it in
Linux if you wanted. I specifically stated that part is interesting for this.
My point is that the non-security bugs I care about wouldn't be prevented by
this technique.

------
CitizenKane
I find it fascinating that the actor model seems to be _the_ model for
creating reliable software. You can see it in Scala or Erlang at the user
level, get it in MINIX3 at the kernel level. I think if you combined the two
you could potentially create systems which stay up for years.
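
As a rough illustration of the pattern, here is a toy supervised actor in Python. The names `Counter` and `Supervisor` are made up for this sketch, and it is single-threaded, so it only shows the structure: private state, a mailbox, and a supervisor that replaces the actor when its handler fails.

```python
from collections import deque


class Counter:
    """Actor whose private state is only touched by its own message handler."""

    def __init__(self):
        self.total = 0

    def receive(self, msg):
        if not isinstance(msg, int):
            raise TypeError("bad message")  # stands in for a bug in the actor
        self.total += msg


class Supervisor:
    """Delivers messages one at a time and replaces the actor if it fails."""

    def __init__(self, factory):
        self.factory = factory
        self.actor = factory()
        self.mailbox = deque()
        self.restarts = 0

    def send(self, msg):
        self.mailbox.append(msg)

    def run(self):
        while self.mailbox:
            msg = self.mailbox.popleft()
            try:
                self.actor.receive(msg)
            except Exception:
                # Restart with fresh state; the offending message is dropped.
                self.actor = self.factory()
                self.restarts += 1
```

Sending `1, 2, "boom", 5` ends with one restart and a total of 5: the service stays up, but the state accumulated before the crash is gone. Whether that trade is acceptable depends on what the actor was holding.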

~~~
rbanffy
I have met a couple of QNX-based, also microkernel-based, systems that had
uptime measured in years.

Lovely OS. I learned C on it.

MINIX reminds me of GNU/HURD. Since they are all Unix-ish from the
application's point-of-view, I have hope they someday can more or less compete
on equal footing with more traditional monolithic (it would be fair to call
OSX "duolithic") OSs like Linux, *BSD, Solaris.

Very usable desktops like the Gnome I am using to write this don't care much
about what is under the libraries they link against. They would be fine on top
of any OS as long as libraries provided the environment they expect. Oddly
enough, the first time I ran Gnome was on top of IRIX.

------
yan
MINIX is a fantastic kernel to practice kernel hacking on and see how
operating systems function without the real world cruft of performance and
portability hacks.

~~~
ori_b
Personally, I found the code relatively painful to follow through. I'm not
even talking about how the parts communicated - the code was difficult for me
to trace through in the small. Even when compared to Linux or the BSDs, let
alone Plan 9.

K&R prototypes, overly decorative comments that just echo the name of the
function, defining macros like 'PUBLIC' and FUNCTION_PROTOTYPE(name, args)
which are effectively no-ops, and a host of other little annoyances that make
the code - at least in my opinion - rather unidiomatic and painful to read C.

------
phaedrus
The self-healing thing in Minix 3 really works.

Some years ago I installed Minix 3 on a very old Thinkpad that had an esoteric
video hardware quirk that would crash most OS's. It was "designed for Windows
95" and by God they meant it because you couldn't run anything else on it.
Neither Linux nor Windows 98 would run for very long before the hardware would
cause a driver crash and a kernel panic/bluescreen (even in text mode). But
with Minix 3 the video driver would just transparently restart after each
crash, so smoothly that I wouldn't notice it unless I was watching the logs.

~~~
axman6
That's awesome, and I'm sure it's the sort of story the developers would love
to hear about.

------
subwindow
Not knowing much about OS design, is there any inherent reason why operating
systems built on microkernels seem to be so unsuccessful? They always seem
really elegant, but are never widely adopted (with the possible exception of
Mach -> OS X).

~~~
rwmj
I tend to think that Linus was right and microkernels are in reality a bad
design.

Case in point: Minix 1.5 (which I hacked on before Linux came along).

Minix 1.5 has two "daemons" called mm and fs which run the memory management
and filesystem respectively. Now consider process creation and loading (fork
and exec). Creating and loading a process intimately involves both mm and fs,
so in Minix 1.5 the program sends a message to mm [IIRC] which sends a message
to fs and both daemons have to coordinate with each other. This makes it a lot
more complex than if there was just one daemon (ie. a monolithic kernel).
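
A rough Python sketch of that coordination, going only by the description above. The classes `MM` and `FS`, the message fields, and the rollback step are assumptions made for illustration and are not taken from the Minix source; plain method calls stand in for message passing.

```python
class FS:
    """Filesystem server: keeps a per-process table of open files."""

    def __init__(self):
        self.open_files = {1: ["stdin", "stdout"]}

    def handle(self, msg):
        if msg["type"] == "FORK" and msg["parent"] in self.open_files:
            # The child inherits a copy of the parent's open files.
            self.open_files[msg["child"]] = list(self.open_files[msg["parent"]])
            return {"type": "OK"}
        return {"type": "ERROR"}


class MM:
    """Memory manager: keeps per-process memory maps and allocates pids."""

    def __init__(self, fs):
        self.fs = fs
        self.memory_maps = {1: "init image"}
        self.next_pid = 2

    def handle(self, msg):
        if msg["type"] != "FORK" or msg["parent"] not in self.memory_maps:
            return {"type": "ERROR"}
        parent = msg["parent"]
        child = self.next_pid
        self.next_pid += 1
        self.memory_maps[child] = self.memory_maps[parent]
        # Second message: FS must learn about the child as well.
        reply = self.fs.handle({"type": "FORK", "parent": parent, "child": child})
        if reply["type"] != "OK":
            del self.memory_maps[child]  # roll back our half of the fork
            return {"type": "ERROR"}
        return {"type": "OK", "child": child}
```

Even in this toy version a single fork touches two process tables, needs two messages, and needs a rollback path for the case where the second server refuses. In a monolithic kernel the same work is one function updating one table.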

Another example is that if mm or fs die, your OS dies. You can't restart one
or the other because there's so much process state spread across the two
daemons. So the claim that microkernels are more resilient because you can
restart daemons seems to be nonsense (but I should say that QNX can apparently
restart some(?) components transparently).

Nevertheless it's not all roses for monolithic kernels either. There's no
process protection and they're usually written in deeply unsafe languages like
C. Exokernels might be the answer to this because they have monolithic
qualities (fast calls and shared state) but keep virtually everything running
in userspace so you can use sane, safe programming techniques.

~~~
0bfusct3
This shows nothing more than a badly implemented api. POSIX is a bad api for
anything modern (distributed).

------
vital101
In my undergraduate operating systems class we hacked away at an early version
of Minix. The level of detailed code documentation always just blew me away.

------
rbanffy
Absolutely fascinating lecture. Very much worth watching.

------
switch007
His book, "Modern Operating Systems", is a fascinating read for any geek. He
discusses MINIX a fair bit (unsurprisingly!)

~~~
michael_dorfman
Or, just get his "Operating Systems: Design and Implementation", which is a
great read, and contains the full MINIX3 source code (printed and on CD).

~~~
rbanffy
Wow... I remember having read that in 88 or 89... It's a bit sad MINIX 1 would
still be considered pretty modern.

------
strlen
After watching the lecture, just download Minix, set it up in
VMWare/Virtualbox/Bochs and hack something in it. Really, do it. It's an
amazingly cool OS and fun to play with.

