
Why was Tanenbaum wrong in the Tanenbaum-Torvalds debates? - nkurz
http://programmers.stackexchange.com/q/140925/25936
======
redthrowaway
In short, he wasn't wrong. His arguments were correct, but a combination of
happenstance, intransigence, and a vast infusion of research dollars kept
Wintel dominant on the desktop, while the world moved away from it.

The micro- v. macro-kernel debate is moot; all of the interesting development
is happening elsewhere in kernel-land. On that, he was wrong. But RISC will
dominate CISC unless Intel manages to pull a miracle out of their ass, <s>and
GNU will dominate, at least in terms of #installs, through Android.</s> nope,
I stand corrected.

So Linus was right about what mattered in the 90's, and Tanenbaum had his
finger on the pulse of history. Both knew what they were talking about, and
both were right in their own way.

Reading the Usenet postings, though, Linus does come across as more of an
arrogant upstart and less of a dickish master than usual. That alone is worth
the read.

~~~
InclinedPlane
He was "right" in that he made a prediction that did not come anywhere near
true?

Sorry, he was wrong.

As to the 2nd point about x86 vs "RISC" processors, it turns out he was so
massively wrong that the very basis of his understanding was incorrect. CISC
processors are dead today; they stopped being a substantial part of the market
in the late 90s. I'm sure they still exist somewhere in new devices
(RAD-hardened Pentiums, perhaps) but for the most part the CISC vs RISC battle
is over, and RISC won overwhelmingly. But Tanenbaum thought that a necessary
consequence of that would be that the x86 architecture would die due to the
weaknesses of CISC processor designs. What actually happened is that Intel
(and later AMD and others of course) started making processors with a RISC
core that supports the x86 (IA32) instruction set through transparent op-code
translation. Every Intel CPU since the Pentium Pro has worked that way (and
every AMD CPU since the Athlon).
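
A rough way to picture that op-code translation (a conceptual sketch only, not
how any real decoder is built; the struct and helper below are invented for
illustration) is that a single memory-to-register x86 instruction gets cracked
into simple load/compute/store micro-ops, which are what the core actually
executes:

    /* Conceptual sketch only -- not how any real x86 decoder is built.
     * It just illustrates the idea of op-code translation: one CISC-style
     * instruction such as  add [addr], reg  gets cracked into simple,
     * RISC-like micro-ops that the out-of-order core actually executes. */
    #include <stdio.h>

    typedef struct {
        const char *text;   /* human-readable micro-op, for illustration */
    } uop;

    /* Hypothetical helper: crack "add [addr], reg" into three micro-ops. */
    static int crack_add_mem_reg(uop out[3])
    {
        out[0] = (uop){ "tmp <- load [addr]" };
        out[1] = (uop){ "tmp <- tmp + reg" };
        out[2] = (uop){ "store [addr] <- tmp" };
        return 3;
    }

    int main(void)
    {
        uop uops[3];
        int n = crack_add_mem_reg(uops);
        for (int i = 0; i < n; i++)
            puts(uops[i].text);
        return 0;
    }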

You can't just wave your hands and say "yeah, but he was _fundamentally_ right
in some ways though there were some things he couldn't have foreseen". That's
part of the deal, there's always something that you can't foresee. Imagining
that CISC's weaknesses are identical to the weaknesses of the x86 architecture
is just the sort of naivety and shallow reasoning that can lead you to make
woefully wrong predictions.

~~~
zurn
"CISC processors are dead today [...] Intel (and later AMD and others of
course) started making processors with a RISC core that support the x86"

That's a pretty far-fetched argument. RISC/CISC is about the instruction set,
and the x86 instruction set is CISC.

Of course since the big RISC/CISC battle the implementations have converged a
lot on the microarchitecture level, primarily because the transistor budget
sweet spot targeted by RISC melted away and the amount of chip area saved in
instruction decode and ISA simplicity was later dwarfed by out-of-order
machinery, caches etc.

So an equally valid argument (as "CISC is dead") is "RISC is dead", since RISC
chips today have brainiac instructions and pipelines: divide/multiply,
unaligned access, variable-length instructions (Thumb on ARM), out-of-order
execution, etc.

~~~
Arelius
That's sort of like saying that any RISC machine becomes a CISC machine as
soon as you install the JVM on it (with the caveat that I don't know if the
JVM has a CISC instruction set).

~~~
burgerbrain
If all the end users use the CISC layer and no person uses the RISC layer,
then I would feel comfortable calling it a CISC machine.

------
jacquesm
Tanenbaum wasn't wrong and neither was Torvalds. The fact is that these are
complicated matters that you can't make black-and-white.

    
    
        Microkernels are the future
    

And they are, the future just hasn't arrived yet. But microkernels see more
and more adoption every day. They offer a degree of reliability that is
unprecedented. But they also come with a performance penalty that, for a lot
of people, is enough of a drawback that they would rather have 'good enough'
than 'perfect'.

For software that needs to be 'perfect', microkernels are the way to go, and
in fact in the embedded world there are more microkernel varieties to choose
from now than ever before. Once performance penalties are no longer important
and people start to demand software that does not crash with every change of
the weather, I believe microkernels will see another wave of increased
adoption. As far as I'm concerned this can't come soon enough. Userland
drivers are _so_ much better than a monolithic kernel.
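
To make the userland-driver point concrete, here is a minimal, self-contained
sketch of the idea. All the names (msg_receive, msg_reply, the message layout)
are made up for illustration, and the IPC and "hardware" are faked in-process
so it runs stand-alone; real systems such as QNX or MINIX 3 each have their
own message-passing API. The point is only the shape: the driver is an
ordinary process serving requests over messages, so if it crashes the kernel
can restart it without taking the rest of the system down.

    /* Hypothetical userland driver: an ordinary process serving requests
     * over message passing.  The IPC and "hardware" are faked in-process
     * here so the sketch compiles and runs stand-alone. */
    #include <stdio.h>
    #include <string.h>

    enum { OP_READ, OP_WRITE };

    typedef struct {
        int  sender;        /* who asked */
        int  op;            /* OP_READ or OP_WRITE */
        int  block;         /* which block */
        char data[16];      /* payload */
    } message;

    static char disk[4][16];                    /* fake block device  */
    static message inbox[] = {                  /* fake request queue */
        { .sender = 1, .op = OP_WRITE, .block = 0, .data = "hello" },
        { .sender = 2, .op = OP_READ,  .block = 0 },
    };
    static size_t next_msg;

    static int msg_receive(message *m)          /* stand-in for real IPC */
    {
        if (next_msg >= sizeof inbox / sizeof inbox[0])
            return -1;                          /* no more requests */
        *m = inbox[next_msg++];
        return 0;
    }

    static void msg_reply(const message *m, int status)
    {
        printf("reply to %d: status=%d data=\"%s\"\n",
               m->sender, status, m->data);
    }

    int main(void)                              /* the driver's service loop */
    {
        message m;
        while (msg_receive(&m) == 0) {
            if (m.op == OP_WRITE)
                memcpy(disk[m.block], m.data, sizeof m.data);
            else
                memcpy(m.data, disk[m.block], sizeof m.data);
            msg_reply(&m, 0);
        }
        return 0;
    }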

    
    
        x86 will die out and RISC architectures will dominate the market
    

And in fact, in the mobile arena this has already come true. And the way Apple
is moving I would not be surprised to see an Arm powering an Apple laptop one
day.

    
    
        (5 years from then) everyone will be running a free GNU OS
    

I think both parties underestimated the strength of the windows lock-in here.
And many people _still_ underestimate the strength of this lock-in, even here
on HN the demise of Microsoft is announced with some regularity.

~~~
cperciva
_the future just hasn't arrived yet_

As far as microkernels go, I'd say that the future has arrived. We don't call
them microkernels, of course -- we call them hypervisors. But they're
fundamentally the same thing.

~~~
jacquesm
True, but the things that run inside the hypervisors are usually still the
same old monolithic operating systems. I think that once those are also
microkernel-based there will be a real shift in perception.

~~~
ajross
You say potato, I say potato. What's the meaningful difference? Inside even
the purest microkernel you're still running "processes" with unified address
space subject to awful crashes and memory corruption. The process is itself an
abstracted machine, after all. How far down the abstraction hole do we have to
go before we reach purity?

~~~
jacquesm
The difference between the Linux kernel and a microkernel (such as, for
instance, QNX, but there are plenty of others) is that everything is a
process, and everything but that tiny kernel runs in userland.

It's the difference between 'potatoes' and 'mashed potatoes' ;)

~~~
ajross
No, you miss the point. I understand very well what a microkernel is. I'm
asking you what the conceptual difference is between running a bunch of
"macrokernel" systems inside a hypervisor and running a single microkernel
with a bunch of processes. There is none: they are the same technology. The
difference is in the label you stick on it. Which is a very poor thing to
start an argument about.

( _edit: I should clarify "same technology" to mean "same use of address space
separation". Microkernels don't need to use virtualization technology like
VT-d instructions because their separated modules don't need to think they're
running on unadulterated hardware._ )

~~~
Arelius
The difference is the purpose of the system. In the former, the purpose is
simply to multiplex the hardware into multiple logical systems performing
different tasks. In the latter, the purpose is to build a single unified
system. It has more communication between the systems, and duplication of
work is minimized: only one process has any FS drivers in it, another only
worries about display. And more importantly, it's more fault tolerant. If the
display process goes down, all the other processes are generally built to wait
for it to come back up, whereas in the hypervisor case you cannot have a guest
kernel go down and not take an application process, file system process, and a
network process with it.

~~~
ajross
I don't buy that at all, it's just semantics. Why can't multiple OS images be
a "unified system"? That's what a web app is, after all.

And the fault tolerance argument applies both ways. That's generally the
reason behind VM sharing too. One simply separates processes along lines
visible to the application (e.g. memcached vs. nginx) or to the hardware (FS
process vs. display process).

Potato, potato. This simply isn't something worth arguing over. And it's silly
anyway, because there are _no_ microkernels in common use that meet that kind
of definition. Find me a consumer device anywhere with a separate "display
server", or one in which the filesystem is separated from the block device
drivers. They don't exist.

( _edit rather than continue the thread: X stopped being a userspace display
server when DRM got merged years ago. The kernel is intimately involved in
video hardware management on modern systems. I can't speak to RIM products
though._ )

~~~
jacquesm
> Find my a consumer device anywhere with a separate "display server", or one
> in which the filesystem is separated from the block device drivers.

Blackberry, every computer running 'X'.

------
spiralpolitik
Tanenbaum wasn't that far off.

1. In terms of pure microkernels he was off. It was tried, the benefits didn't
outweigh the drawbacks, so people moved to hybrid microkernels (Windows NT,
OS X, iOS et al.); from that perspective he was about 50% right.

2. Given the way ARM is trouncing everybody in the mobile space, unless Intel
manages the biggest comeback since Lazarus, the future is almost certainly
RISC. Whether this feeds back to the desktop space remains to be seen.

3. Unlikely to happen, although the future is most likely Open Source in some
form or other. GPL v3 has largely ruled out GNU dominating, as vendors that
previously shipped GNU components replace them with components under other
Open Source licenses because they find the new terms a bit much.

~~~
huggyface
_the future is almost certainly RISC_

What does RISC even mean anymore? Seriously, I remember the debates during the
early 90s (back when MIPS and friends were going to destroy Intel), and the
RISC of then is _very_ different from the RISC of today. Then the merit of
RISC was that you literally reduced the instruction set to the minimum
possible, putting the demand on the compiler to gang instructions together to
do even rudimentary work. The idea was that the simpler silicon would be
easier to scale up (frequency scaling was a major problem), and the compiler
would have more insight into the operations of a program, giving such a design
a performance advantage.

The MIPS of the 90s had about 45 instructions, total, and a corresponding
simplicity of implementation. The 8086 had 114, providing higher-level
operations and much more complex silicon, and the count has grown since then.

How many instructions does ARMv7-A provide (this is actually a hard question
to answer)? It has floating-point operations, SIMD / NEON, virtualization
support, and on and on and on. I do know that while ARM once featured just
25,000 transistors (ARM2), a modern Cortex-A9 design like the Tegra 2 hosts 26
_million_ transistors for just the cores (not the GPU).

I realize that I'm stepping into a linguistic landmine, and various contrived
"this is the differentiator" definitions will appear, but the original intent
of RISC versus CISC was exactly what I described above. Today the meanings are
absolutely nothing like that.

~~~
spiralpolitik
While one of the characteristics of a RISC processor was a simpler instruction
set, it's not the only one. There is also uniform instruction length, to make
instruction decoding logic simpler and quicker, and the idea that a single
instruction doesn't take longer than a single clock cycle.

But you are correct that the water today is somewhat muddy especially as CISC
processors borrowed stuff from the RISC processors and vice versa. I think
someone in the late 90s coined the term CRISP (Complex Reduced Instruction Set
Processor) to describe these beasts, although I haven't seen the term
mentioned in recent years.

------
Symmetry
Just a few quick things to say on RISC vs. CISC.

Back when this debate was happening CPU design teams were a lot smaller,
meaning that any given feature hadn't had enough effort put into it to get as
far into the realm of diminishing returns, so there was a much bigger payoff
to be had in reducing the number of features you implemented.

You also weren't devoting most of your die to huge arrays of cache, so adding
- say - more addressing modes would tend to mean you couldn't have as many
pipeline stages. Any given feature will still make the overall design more
complicated and so will make it more difficult to add any other feature you
want, but the issue isn't as pressing as it used to be.

One area where RISC does still have a big advantage is instruction decode.
When you run into an x86 instruction you have to read a lot of bits to figure
out how long it is, and it's not self-synchronizing, so you could read an
instruction stream one way if you start at byte FOO, but if you start at byte
FOO+1 you can find an entirely different but equally valid sequence of
instructions.[1] So decoding N bytes of x86 instructions grows in complexity
faster than linearly. In fact, I suspect that modern processors have to use
some sort of "guess the three most likely decodings and throw out the results
if we're wrong" approach to get the performance needed.

If I were to design an ISA I'd probably want some sort of UTF-8 style variable
length scheme, where you can always tell where an instruction boundary is
without reading from the beginning but with the space savings from having the
most common instructions be shorter than the least common ones.

[1] This apparently also annoys my security researcher friend.
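
A toy illustration of that UTF-8 trick (not a real ISA; the byte stream and
helpers are invented for the example): lead bytes encode the instruction
length in their top bits and continuation bytes are tagged 10xxxxxx, so a
decoder can resynchronize from any offset instead of having to decode from the
start of the stream, which is exactly what x86 can't do.

    /* Toy, UTF-8-style instruction length scheme -- not a real ISA. */
    #include <stdio.h>

    /* Length of an "instruction" from its first byte alone. */
    static int insn_len(unsigned char b)
    {
        if ((b & 0x80) == 0x00) return 1;   /* 0xxxxxxx */
        if ((b & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
        if ((b & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
        return 4;                           /* 11110xxx */
    }

    static int is_continuation(unsigned char b)
    {
        return (b & 0xC0) == 0x80;          /* 10xxxxxx */
    }

    int main(void)
    {
        /* Made-up stream: a 1-byte, a 3-byte and a 2-byte instruction. */
        unsigned char code[] = { 0x01, 0xE2, 0x80, 0x81, 0xC3, 0x82 };
        size_t pos = 2;                     /* start mid-instruction ... */

        while (pos < sizeof code && is_continuation(code[pos]))
            pos++;                          /* ... and resynchronize */

        while (pos < sizeof code) {         /* decode from the boundary on */
            printf("instruction at offset %zu, length %d\n",
                   pos, insn_len(code[pos]));
            pos += insn_len(code[pos]);
        }
        return 0;
    }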

EDIT: Found the link to that really good explanation Mashey had on RISC vs.
CISC: <http://userpages.umbc.edu/~vijay/mashey.on.risc.html>

~~~
anamax
> In fact, I suspect that modern processors have to use some sort of "guess
> the three most likely decodings and throw out the results if we're wrong"
> approach to get the performance needed.

IIRC, it's way more sophisticated than that.

As I understand Intel's trace caches, their guess is basically the result of
decoding the next N instructions, accounting for branch prediction.

And yes, it includes detection/recovery for writes into the instruction memory
that would invalidate that guess.

------
huherto
I thought this comment was cute. This person is in a conversation with two of
the biggest authorities in OS design and he asks for a reference.

    
    
      *Can you recommend any (unbiased) literature that points out the strengths and weaknesses of the two approaches?*

------
zurn
Both RISC vs x86 and monolithic vs microkernel were just dichotomies of their
times; after that, the disadvantages and advantages melted away through
changing constraints and cross-pollination. Other market factors have since
dwarfed these technical arguments.

------
oneweekwonder
I find it very strange to see no reference to Minix3 <http://www.minix3.org/>

Tanenbaum is putting his money where his mouth is with the project (or at
least other people's money he acquired), and yes, the traction is slow and it
might fail, but I would really enjoy the day I'm running an operating system
based on a microkernel that does everything Tanenbaum promises with Minix3.

------
pjmlp
Well, currently there are quite a few successful microkernel OSs, like QNX and
VxWorks.

And Mac OS X and Windows are actually hybrid kernels.
<http://en.wikipedia.org/wiki/Hybrid_kernel>

------
caycep
Aren't micro- or nano-kernels living on as hypervisors? If so, they are the
backbone of all this "cloud" hullabaloo.

------
tbsdy
Interesting to see Torvalds' attitude when he wasn't a rock star developer...

Linus "my first, and hopefully last flamefest" Torvalds

:-)

~~~
zalew
"PS. I apologise for sometimes sounding too harsh"

:D

------
exim
Who said Tanenbaum was wrong?

The fact that he doesn't swear, doesn't make his arguments weak.

