
Linus Torvalds: Debugging hell - mqt
http://torvalds-family.blogspot.com/2008/12/debugging-hell.html
======
danielh
Despite the risk of being voted down by the fanboys, I have to ask: how could
this receive 12 votes within 40 minutes?

What's so newsworthy about Linus being frustrated about debugging?

~~~
abstractbill
I found it somewhat interesting to learn that on a modern architecture you can
isolate the cpu from its peripherals to the extent that debugging is
impossible.

~~~
danielh
There are some aspects that might be worth voting this article up. Like the
one you noted, or the mere fact that even a genius like Linus gets frustrated
about debugging.

Still, I can't help suspecting that many voters wouldn't care if some John Doe
had written this story.

~~~
JesseAldridge
True. But the name Linus Torvalds helps the article pass through our
"crap-filter". We invest more effort in looking for value because we have a
reasonable assurance that value is there.

~~~
Herring
sounds.. bayesian.

------
jodrellblank
Does it unnerve anybody else that there are "Linus or nobody" sections of
code, when GNU/Linux is often linked with the "open source / many eyes"
security defense?

~~~
JesseAldridge
Good point. When code is too hard to understand, open source is an illusion.

~~~
eru
Open Source is an 'illusion' most of the time. What open source gives you is
the possibility that someone could understand and take over the source if it's
needed badly enough.

~~~
elai
Open source is like open laws: so bloody vast that you have to pay people vast
sums of money to spend the time interpreting it and advising you on it, so
that you don't have to spend that vast amount of time yourself just to use it.

~~~
eru
Interesting analogy. And still better than closed laws.

------
Hoff
Hardware (driver- or OS-level) debug sucks.

Commodity hardware debug sucks more.

When working at this level, access to hardware probes and external monitoring
and manufacturing taps can be invaluable.

Boundary and edge conditions and timing races and part steppings and errata
rule, at least in those cases where the errata were written down, or where the
vendor deigned to describe what changed between the steppings.

There are a number of device-level drivers and operating systems in use where
only a very few folks really know the code well enough to debug at this level,
too.

------
elai
I've always wondered why linux has a hard time with suspend and hibernate,
while on windows (and osx) there usually aren't any problems at all.

~~~
kaens
Because hardware vendors work directly with windows to make suspend /
hibernate work correctly.

Because for a long time, apple only had one architecture to worry about.

Because linux has only gotten "big enough to pay attention to" in the last few
years.

------
Haskell
Linus once said that debuggers are for sissies. I knew he would regret saying
that sometime.

Here is another of his rants against debuggers. Just replace 'kernel
debugger' with 'simple chipset debugging facilities' (whatever that means) in
that email and basically he has his response for why Intel isn't adding it:
<http://linuxmafia.com/faq/Kernel/linus-im-a-bastard-speech.html>

Intel engineers are saying, 'simple chipset debugging facilities' are for
sissies!

~~~
Agathos
Well I know he said real men use printf, but it sounds like he's putting the
system in a state where it can't print to anything.

~~~
Haskell
Therefore, he is putting himself in a state where he is not a real man.

This is his logic, not mine.

~~~
Agathos
He put himself in a state where the only options were more difficult than
printf. Therefore he exceeded his own standard for real manliness, and remains
a real man.

It seems like a simple transitive relation to me; are you saying his logic
neglects transitivity?

------
ars
See update: <http://news.ycombinator.com/item?id=388251>

------
jacquesm
this is what you get for going macro kernel. in a micro kernel you'd just hook
the debugger to the driver process.

~~~
blasdel
That's not true at all. Have you ever actually done kernel development,
especially dealing with bad hardware? A kernel panic is what it is no matter
how many boxes there are in your flowchart.

~~~
jacquesm
Yes, and yes, and you're wrong about that. Anything else :) ?

To elaborate: yes, I've actually written an OS, and yes, bad hardware was my
'standard' in those days (a simple lack of money). Imagine a PC built out of
parts bolted to an old print-file trolley, flakier than I care to remember.

A kernel panic in a microkernel presumes a memory or a cpu error. The kernel
simply does not deal with anything else, so any other fault leads to driver
process issues, not kernel panics.

A faulty memory controller or cpu would lead to a kernel panic because of
(apparent) datastructure inconsistencies.

One of the hardest drivers to write (even in a microkernel environment) was
for an X.25 board that a friend of mine had designed around some comm chip.
For one, the X.25 spec is pretty convoluted, and there were a lot of layers of
the protocol to be implemented in a single driver. That thing was an absolute
nightmare to debug. Other than that, most drivers (hard disk controller,
network cards, graphics boards) were a walk in the park compared to doing the
same under a macro kernel.

Simply telnet in to the machine, start up the vga driver process and run it
(under the debugger) until you manage to crash it. Most of the time a simple
'where' and close inspection of the source would be enough to solve the
problem, recompile and run for the next iteration. No kernel panics.

------
river_styx
Wow, there's an enormous ego on that guy. Well deserved, but still...

~~~
kirubakaran
Ego? Sounds like a simple statement of fact. If someone else _can_ debug the
issue, can he say what he did and not get called out?

Also, it is funny that people who succeed in the first place due to what you
call ego are expected to magically lose it and become "nice" once they are
above a certain level of success.

[edit: +1. Downvote is not from me.]

~~~
river_styx
So it's seriously a statement of fact that he's the only person in the entire
community who could possibly debug that problem? I find that incredibly hard
to believe.

~~~
duhprey
If there were someone else, it sure sounds like he would be happy to have
sloughed off the work. It doesn't sound glamorous. It sounds like it sucks.
So everyone else would rather work on something they think is cool, and Linus
is stuck with the crap work that in the end nobody is willing to do. That is
why he's the guy in charge. That is why he deserves the credit for Linux.

------
gaius
It's interesting that OSX is basically FreeBSD with bells on, and it can
suspend/resume without too much difficulty. _That_ is what needs to be fixed
at the source (haha), not one-off hacks.

~~~
davidw
How many hardware platforms does OSX do suspend/resume on without too much
difficulty?

~~~
davidw
(We used to sort of have a tradition of not voting people down that much for
comments that were just "wrong", rather than actively offensive or something)

