Hacker News new | comments | show | ask | jobs | submit login

I've done the oscilloscope thing, though it was only a vanilla couple-of-hundred Mhz scope on some pins that were bugging me, and not the full deal: "Gang way, we're cracking the lid of that thing and going in." That sounds exciting, I'd love to see it.

Once, I used a chunk of ice to cool down a chip, and that made it work. The hardware guys were unimpressed. But hey, they've got cans of Chill and they use them a lot, and this software guy took a while to realize the reason the board worked in the morning and was dead by lunch, and worked for a little while again after lunch, was temperature related.

There were some devs who tracked down a nasty bug in a processor's TLB. I only heard about that one, wish I had been there. I only had to deal with the fallout in the hypervisor. Note: If you have to spend 20ms hunting down and killing lies with all interrupts turned off and everything basically stopped in its tracks, you are no longer a real-time operating system.

Heh. It could be telling that I had to look up the expansion for TLB. CPU cache implementation... holy crap.

My ex-coworker has done the vanilla scope thing too and has a 400MHz scope at home. For some reason people like this are not too uncommon in Finnish oldskool[tm] IT scene. I remember how he isolated a latency and concurrency bug to an expensive interrupt handler. Rewriting isolated parts of core kernel code to make a really tricky problem go away was one of his more hardcore skills.

I'm not even near his level. My own experience is limited to slightly nibbling the edges of file system and block cache behaviour. It's a brave person who dares dive into that code. Not me.

But I do know one person who regularly works with decapped chips. He works for a company who do extremely low-level hardware investigations. Now that's hardcore.

Cache bugs are one of the fun ones. You think you're losing your mind and the people around you would probably agree. A couple of weeks go by, your spouse is ready to fire you, your boss wants to divorce you, and every waking moment is full of race conditions. Four-way stops on the drive to work are a source of stress and you punch buttons in the elevator and worry about firmware bugs. Then you get to your desk and there's the setup, a laughably small board for all the trouble it's made, and it's time for single combat, Sherlock Holmes style.

When you find the problem it's usually a blinding flash of realization that illuminates a tiny, eensy bit of code that you tweak and make right in a couple of minutes. Invariably the mistake was pretty stupid. The glory moment is over quickly because you know all the test cases will pass and that you've just nailed another one.

You've got bragging rights during one lunch, but that's it. It's off to more mundane bugs in the mortal world, and you feel a little sad.

I need to do hardware again.

I remember a 3G network signalling simulation I worked on back in about 2002. We ran it on a rack of custom servers. The CPU load was pretty hellish, and the only way we could get it to run reliably without segfaults was to install gaming cooling systems and underclock the CPUs ... ran like a charm then!

At the late 90's and earlier 2000's it was commonplace to fix OS panics by opening the computer and pointing a fan at it.

I would try it even before going for some harder software problems, because it's so easy.

Ever squeaked the chips on an Amiga?

People couldn't import computers into my country at the Amiga's time.

I had a local-made spectrum clone, it didn't overheat, but I lost a multimeter on its power supply.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact