
IN/MSX: Running 4 Copies of an Operating System at Once (2008) - MindTwister
http://jeff-barr.com/2008/01/08/inmsx-running-4-copies-of-an-operating-system-at-once/
======
cpr
Yep, sounds familiar.

At Imagen in the early '80s, we used the same Sun board (Andy Bechtolsheim,
the designer, was a consultant for us while he started up Sun). Imagen was a
Stanford TeX project spin-off started by Knuth's sidekick Luis Trabb Pardo,
building the first typesetting-capable commercial laser printers using, at
first, wet-process Canon imaging engines (LBP-10).

I wrote our own "real-time OS" on the bare 68K Sun hardware (first time I'd
ever written a full (if simple) OS from scratch), and remember fairly vividly
the hard-knocks learning experience about race conditions just like the one he
describes here. Running for hours or days without error and then crashing
randomly--nightmare time.

Luckily, we also had an ace hardware guy, Kok Chen, from the Stanford SETI
project, and he and I and the logic analyzer would run a test setup and lie in
wait for the condition to show up, then look back in time at all the
(Multi)bus transactions to see what actually happened. (Kok later moved to
Apple and became a distinguished engineer, one of very few folks who could
work on whatever they wanted.)

------
jeffbarr
I had a lot of fun doing this project as a young developer who had no idea
what could and could not be done with access to a raw machine.

~~~
Zenst
Indeed, as we get older we add more realistic limits to our lives, having
already defined our boundary markers. There are also more layers between you
and the metal now. But many have things they did in their youth that they
would avoid today. That said, we often put in a level of commitment to the
work that today you would only see in a company founder. That was the norm
back then; today it is the exception.

------
Morgawr
This reminds me of a (much, much simpler) assignment for my Operating Systems
class during my Bachelor's. We had to develop our own microkernel with
multitasking on a MIPS virtual machine. Setting up the scheduler was easy, but
handling interrupts and message passing so that no task would be
context-switched while a message was not yet fully acknowledged or replied to
was hell.

I was 19 back then and didn't know much about OS design and concurrency (I
learned a lot during that assignment). I remember spending a few nights
just wondering why my code wasn't working at times. I mean, after all it's
just two instructions one next to the other, right? It's too unlikely that a
context switch would happen right between these two lines of code, right?
How wrong I was.
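
For anyone who wants to see that "two instructions" trap in miniature, here is
a rough sketch. It is not the original MIPS assignment code: the `increment`
task, the `run` helper and the shared counter are all hypothetical, and a
generator `yield` stands in for the point where a context switch could occur.

```python
def increment(shared):
    """Hypothetical task: a non-atomic increment split into two steps."""
    value = shared["counter"]       # instruction 1: load the shared value
    yield                           # a context switch may happen right here
    shared["counter"] = value + 1   # instruction 2: store the new value


def run(schedule, shared):
    """Resume one step of the named task for each entry in the schedule."""
    tasks = {name: increment(shared) for name in set(schedule)}
    for name in schedule:
        next(tasks[name], None)     # None default swallows task completion
    return shared["counter"]


# Each task runs both of its steps without interruption: both updates land.
safe = run(["A", "A", "B", "B"], {"counter": 0})

# The switch falls between load and store: one update is silently lost.
racy = run(["A", "B", "A", "B"], {"counter": 0})

print(safe, racy)
```

The second schedule differs from the first only in where the switch lands,
which is why a bug like this can hide for hours before the unlucky
interleaving finally shows up.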

Awesome read, by the way.

------
userbinator
For a second I thought this would be about running 4 OSs on the
[http://en.wikipedia.org/wiki/MSX](http://en.wikipedia.org/wiki/MSX) . Those
interrupt-related race conditions can definitely be really subtle - reminds me
that the earliest 8088s had an erratum where an interrupt could occur
during a stack switch, corrupting memory in a similar way (see
[http://www.malinov.com/Home/sergeys-projects/sergey-s-xt/his...](http://www.malinov.com/Home/sergeys-projects/sergey-s-xt/historical-notes) ).

------
chillingeffect
"When you are young and naive, anything seems possible with technology."

Oh boy, I hope to use this quote at every possible opportunity from now on.

~~~
jeffbarr
Enjoy!

------
mwcampbell
So if it was possible to implement a hypervisor, why wasn't it possible, or
feasible, to run Unix and ROS on the same machine?

~~~
jeffbarr
Well, for one thing, I didn't have access to the Unix source code. The CEO
handed me a drive and said "Here's Unix."

Also, ROS worked within a flat address space and had no awareness of the
memory management features of the SUN board. Unix, on the other hand, took
full advantage of the MMU and I had no way to control what it did.

~~~
vidarh
I'm curious about the hardware. The 68000 is impossible to fully virtualize
directly - e.g. if you tried to write to memory the current process shouldn't
be able to access, there'd be no way to signal it without triggering an
exception/trap that didn't leave you with enough information to restart or
continue the instruction. I think the 68010 fixed this.

I also seem to remember that there were 68000 boards that "solved" this by
actually including two 68000s on the board and running them in lock-step 2
cycles apart, or something, and then triggering a trap on both of them if the
leading one triggered the MMU - that way the trailing CPU would be halted at a
point where sufficient state was available to reset state on both of them.

Do you remember if you had to deal with anything like this? Or did it have
simple enough requirements to get away without it (or actually run a 68010)?

I love the 68k family - after my C64, I got into Amigas, and I'm still
disappointed Motorola didn't manage to keep up. It was so much cleaner to work
with than the x86...

~~~
jeffbarr
There was no virtual memory and hence no page faults and no need to recover
from them. Each OS ran in a fixed portion of the available RAM (I want to say
512K per OS).
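
As a rough illustration of what fixed partitioning like this looks like, here
is a minimal sketch. The layout is assumed, not recalled: four guests, 512K
each, packed contiguously from address 0, with `partition` and `owner` as
made-up helper names. The real board's memory map may well have differed.

```python
PARTITION_SIZE = 512 * 1024   # assumed: 512K per OS, as loosely remembered
NUM_GUESTS = 4                # four copies of the OS


def partition(guest):
    """Return the (first, last) address of a guest's fixed RAM region."""
    if not 0 <= guest < NUM_GUESTS:
        raise ValueError("no such guest")
    base = guest * PARTITION_SIZE          # assumed contiguous from address 0
    return base, base + PARTITION_SIZE - 1


def owner(address):
    """Return which guest a physical address belongs to."""
    guest = address // PARTITION_SIZE
    if not 0 <= guest < NUM_GUESTS:
        raise ValueError("address outside guest RAM")
    return guest


for g in range(NUM_GUESTS):
    first, last = partition(g)
    print(f"guest {g}: {first:#08x} - {last:#08x}")
```

With nothing paged in or out, the mapping from guest to memory is pure
arithmetic, so there is no fault to catch and nothing to restart.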

The 68K ISA was a joy to program. With some prior experience on the 6502, I
was productive on the 68K within days.

