
As someone who found four compiler bugs in three weeks - in a five-nines fault-tolerant OS, yet! - and who found a PostgreSQL optimizer bug within weeks of learning SQL, I think the key to being "that guy" is playing five-whys with every single bug you encounter.

I work with some very talented developers who, when they try something and it doesn't work, try something else. I am fundamentally incapable of that. If it doesn't work, I MUST KNOW WHY. Even if that requires building a debug version of my entire stack, adding all sorts of traces, and wolf-fence debugging until I have a minimal fail case.

It's a real limitation; if I hit an undebuggable brick wall, I have no ability to attack the problem from a different angle. Luckily, there are few things that are fundamentally undebuggable.

I had to google "wolf-fence debugging".

I found I knew of it under a different name: binary search debugging. Git includes built-in support under the bisect command.

I also looked this up, and found the original paper that coined the term, which starts:

> The "Wolf Fence" method of debugging time-sharing programs in higher languages evolved from the "Lions in South Africa" method that I have taught since the vacuum-tube machine language days. It is a quickly converging iteration that serves to catch run-time errors.

http://dl.acm.org/citation.cfm?id=358695 (if you have access)

Anyone know what the "Lions in South Africa" method is? I couldn't find it via Google; it just kept turning up references to the same paper.

After a quick googling I found this explanation of wolf-fence debugging:


It stipulates that the state of Alaska has exactly one wolf, so you build a fence across the middle of the state and listen for which side the wolf howls on, then subdivide that side the same way, and so on...
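
For anyone who wants it spelled out, here is a rough sketch of that halving in Python. It assumes the "wolf" is a single first-bad position in an ordered sequence (commits, trace checkpoints, whatever), that everything before it is good, and that everything from it onward is bad. The names `find_wolf` and `is_bad` are made up for illustration; this is not how git bisect is implemented, just the same idea.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def find_wolf(items: Sequence[T], is_bad: Callable[[T], bool]) -> int:
    """Return the index of the first item for which is_bad() is true.

    Assumption: items are ordered so that once one is bad, all later
    ones are bad too, and the last item is known to be bad.
    """
    if not items or not is_bad(items[-1]):
        raise ValueError("no wolf: the last item is not bad")

    lo, hi = 0, len(items) - 1  # invariant: the first bad index is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2    # build the fence across the middle
        if is_bad(items[mid]):
            hi = mid            # howl from the near side: wolf is at mid or earlier
        else:
            lo = mid + 1        # silence: wolf is strictly after mid
    return lo


# Hypothetical usage: eight revisions, the bug was introduced in revision 5.
revisions = list(range(8))
first_bad = find_wolf(revisions, lambda rev: rev >= 5)
print(first_bad)
```

Each pass halves the search range, so the number of checks grows with the logarithm of the history length rather than the length itself. The catch is the monotonicity assumption: if the failure is intermittent, the fence lies to you.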

I'm assuming in your case the wolf got replaced with a lion and Alaska with South Africa.

Personally I always preferred the mathematician's method for catching a lion in the Sahara...

First, place a lion in Egypt so you know your algorithm will terminate...

Typical global-warming propaganda.. :)

My experience with "fault-tolerant" OSes and their toolchains is that they are buggier than their off-the-shelf counterparts, probably due to limited use ...

Well... "fault tolerant" doesn't mean "correct". It only means it can tolerate faults and, presumably, recover from them. And since they are used in very limited contexts, the odds of having bugs that nobody noticed are much higher.


You ask why five times. The internets are full of commentary.

Luckily, you can use mostly open source software these days...
