Linus Torvalds: Debugging hell (torvalds-family.blogspot.com)
78 points by mqt on Dec 5, 2008 | 44 comments



Despite the risk of being voted down by the fanboys, I have to ask: how could this receive 12 votes within 40 minutes?

What's so newsworthy about Linus being frustrated about debugging?


I found it somewhat interesting to learn that on a modern architecture you can isolate the cpu from its peripherals to the extent that debugging is impossible.


There are some aspects that might make this article worth voting up, like the one you noted, or the mere fact that even a genius like Linus gets frustrated by debugging.

Still, I can't help suspecting that many voters wouldn't care if some John Doe had written this story.


True. But the name Linus Torvalds helps the article pass through our "crap-filter". We invest more effort in looking for value because we have a reasonable assurance that value is there.


sounds.. bayesian.


Personally I found it refreshing. Most of the front page is VC, food, money & politics.


for me it was nice to see that even the world's most famous hackers deal with the same aggravating nonsense i do.


I think some of us are simply scanning through the new section, tagging things we'd like to read at a later time. :)


www.instapaper.com


Reassurance that even Linus is human.


Does it unnerve anybody else that there are "Linus or nobody" sections of code, when GNU/Linux is often linked with the "open source / many eyes" security defense?


It's also unnerving that there are huge projects (not just Linux) and huge operating systems that have sections that are "Subsystem maintainer/author or nobody." Open source projects get volunteers because volunteers find the work interesting and some esoteric parts can have very, very few people working on them. I don't really have a solution to this, but it's to be expected.


I don't think it means that if Torvalds got shot tomorrow the Linux kernel would die. I think it means there are parts that only he is knowledgeable about, or invested in, enough to fix certain bugs. That is, if he were to die tomorrow, others would be able to take his place technically, but they would have to spend a while reverse engineering the code and might even make significant changes to suit their tastes.

This happens all the time with commercial products.


Good point. When code is too hard to understand, open source is an illusion.


Open Source is an 'illusion' most of the time. What open source gives you is the possibility that someone could understand and take over the source if it's needed badly enough.


Open source is like open laws: so bloody vast that you have to pay people vast sums of money to spend the time interpreting it and advising you on it, so that you don't have to spend that time yourself.


Interesting analogy. And still better than closed laws.


Hardware (driver- or OS-level) debug sucks.

Commodity hardware debug sucks more.

When working at this level, access to hardware probes and external monitoring and manufacturing taps can be invaluable.

Boundary and edge conditions and timing races and part steppings and errata rule. For those cases where the errata were written down, that is, or where the vendor deigned to describe what changed between the steppings.

There are a number of device-level drivers and operating systems in use where only a very few folks really know the code well enough to debug at this level, too.


I've always wondered why linux has a hard time with suspend and hibernate, while on windows (and osx) there usually aren't any problems at all.


Because hardware vendors work directly with windows to make suspend / hibernate work correctly.

Because for a long time, apple only had one architecture to worry about.

Because linux has only gotten "big enough to pay attention to" in the last few years.


Linus once said that debuggers are for sissies. I knew he would regret saying that sometime.

Here is another of his rants against debuggers. Just substitute 'simple chipset debugging facilities' (whatever that means) for 'kernel debugger' in that email and you basically have his response for why Intel isn't adding it. http://linuxmafia.com/faq/Kernel/linus-im-a-bastard-speech.h...

Intel engineers are saying, 'simple chipset debugging facilities' are for sissies!


Well I know he said real men use printf, but it sounds like he's putting the system in a state where it can't print to anything.
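When printf itself has nowhere to go, one classic fallback is logging into a reserved chunk of RAM and recovering it out-of-band (after a warm reboot, or via a hardware probe). A minimal sketch with invented names, broadly similar in spirit to what Linux later grew as ramoops/pstore:

```c
#include <string.h>

/* Hypothetical last-resort logger: append messages into a static buffer
 * at a known address, to be fished out after a warm reboot or through a
 * hardware probe when neither console nor disk is usable. */
#define LOG_SIZE 256

static char log_buf[LOG_SIZE];
static size_t log_pos;

static void ramlog(const char *msg)
{
    size_t n = strlen(msg);

    if (n > LOG_SIZE - 1 - log_pos)   /* truncate what doesn't fit */
        n = LOG_SIZE - 1 - log_pos;
    memcpy(log_buf + log_pos, msg, n);
    log_pos += n;
    log_buf[log_pos] = '\0';          /* keep buffer readable as a string */
}
```

The catch, of course, is that this only helps if the machine comes back far enough to let you read the buffer; in the situation Linus describes, even that may not hold.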


Therefore, he is putting himself in a state where he is not a real man.

This is his logic, not mine.


He put himself in a state where the only options were more difficult than printf. Therefore he exceeded his own standard for real manliness, and remains a real man.

It seems like a simple transitive relation to me; are you saying his logic neglects transitivity?



this is what you get for going macro kernel. in a micro kernel you'd just hook the debugger to the driver process.


That's not true at all, have you ever actually done kernel development, especially dealing with bad hardware? A kernel panic is what it is no matter how many boxes there are in your flowchart.


Yes, and yes, and you're wrong about that. Anything else :) ?

To elaborate: Yes, I've actually written an os, yes, bad hardware was my 'standard' in those days (a simple lack of money), imagine a pc built out of parts bolted to an old print-file trolley, more flakey than I care to remember.

A kernel panic in a microkernel presumes a memory or CPU error; anything else the kernel simply does not deal with, so it would lead to driver-process issues, not kernel panics.

A faulty memory controller or cpu would lead to a kernel panic because of (apparent) datastructure inconsistencies.

One of the hardest drivers to write (even in a microkernel environment) was for an X.25 board that a friend of mine had designed around some comm chip; for one thing, the X.25 spec is pretty convoluted, and there were a lot of layers of the protocol to be implemented in a single driver. That thing was an absolute nightmare to debug, but other than that, most drivers (harddisk controller, network cards, graphic boards) were a walk in the park compared to doing the same under a macro kernel.

Simply telnet in to the machine, start up the vga driver process and run it (under the debugger) until you manage to crash it. Most of the time a simple 'where' and a close inspection of the source would be enough to solve the problem; recompile and run for the next iteration. No kernel panics.
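The structure being described can be sketched as a driver that is just an ordinary process blocking on a request queue, so a fault takes down the process, not the kernel, and a debugger attaches to it like to any other program. All names and the message format below are invented for illustration:

```c
#include <stddef.h>

/* Toy model of a microkernel driver process: the kernel only passes
 * messages; all device logic lives in user space, so faults stay inside
 * this process's address space. */
enum req_type { REQ_READ, REQ_WRITE };

struct msg {
    enum req_type type;
    int reg;        /* device register index (illustrative) */
    int value;      /* payload for writes, reply slot for reads */
    int status;     /* 0 = ok, -1 = bad request */
};

static int regs[16];   /* stand-in for memory-mapped device registers */

/* One iteration of the driver's receive-handle-reply loop. */
static void handle_request(struct msg *m)
{
    if (m->reg < 0 || m->reg >= 16) {
        m->status = -1;            /* reject instead of panicking */
        return;
    }
    if (m->type == REQ_WRITE)
        regs[m->reg] = m->value;
    else
        m->value = regs[m->reg];
    m->status = 0;
}
```

In a real microkernel the message would arrive over a kernel IPC port; the point is that a wild pointer in the device logic corrupts only this process, which is why you get to run 'where' in a debugger instead of staring at a panic.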


Wow, there's an enormous ego on that guy. Well deserved, but still...


Ego? Sounds like a simple statement of fact. If someone else can debug the issue, can he say what he did and not get called out?

Also, it is funny that people who succeed in the first place due to what you call ego are expected to magically lose it and become "nice" once they are above a certain level of success.

[edit: +1. Downvote is not from me.]


So it's seriously a statement of fact that he's the only person in the entire community who could possibly debug that problem? I find that incredibly hard to believe.


If there were someone else, it sure sounds like he would be happy to have sloughed off the work. It doesn't sound glamourous. It sounds like it sucks. So everyone else would rather work on something they think is cool, and Linus is stuck with the crap work that in the end nobody is willing to do. That is why he's the guy in charge. That is why he deserves the credit for Linux.


Not "could" - "will". It's not the same thing.


He even calls it "rare"... So I didn't get the ego angle. He's probably understating it if anything.


"I have an ego the size of a small planet..."

http://en.wikiquote.org/wiki/Linus_Torvalds


Pluto?


He said planet ... that probably means Mercury now.


No, he wouldn't want to be close to Sun :-)


I took small planet to mean dwarf planet.


It's interesting that OSX is basically FreeBSD with bells on, and it can suspend/resume without too much difficulty. That is what needs to be fixed at the source (haha), not one-off hacks.


Saying that Mac OS X is basically FreeBSD is inaccurate and I wish that people would stop repeating it. Darwin is built around XNU, a hybrid kernel that's based on Mach, FreeBSD, and 4.3/4.4BSD code. It's true that a lot of code came from BSD, but Mac OS X is not FreeBSD.

http://www.kernelthread.com/mac/osx/arch.html


How many hardware platforms does OSX do suspend/resume on without too much difficulty?


(We used to sort of have a tradition of not voting people down that much for comments that were just "wrong", rather than actively offensive or something)


booya!



