

Programmer Gore - jgrahamc
http://blog.jgc.org/2010/07/programmer-gore.html

======
jacquesm
Sometimes, usually when in a hurry, I skip the 'getting to the root cause' step,
and this has bitten me badly on more than one occasion.

So now, if I can afford it, I really want to know where things went wrong.
Usually that means a longer time-to-fix but what's fixed that way usually
stays fixed. The 'band-aid' type fixes tend to lead to subtler problems that
are harder to fix later on.

If it breaks, please let it break now and in as spectacular a fashion as
possible, without any band-aids. That way we can stay away from the kind of
bugs that only happen during a full moon and an easterly wind.

Which reminds me, I really should have a look at the guts of some software
that currently runs in a wrapper script because it crashes once every 3 months
or so without any apparent cause.

~~~
Robin_Message
There's some good advice in chapter 8 of Code Complete on Defensive
programming. One example given there is the C function _realloc_ , which
resizes a block of memory, which can sometimes mean moving the whole block to
a new, larger block. Since intermittent bugs are indeed the worst kind, Steve
suggests making the debug compile memory allocator _always_ move the block, so
as to exercise that code path everywhere in testing.

Edit: Wrong Steve and wrong book -- it's a running example in "Writing Solid
Code" by Steve Maguire.
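
A minimal sketch of the always-move idea described above, not the code from the book. The name `debug_realloc` and the explicit `old_size` parameter are assumptions for illustration: standard C has no portable way to ask for the size of an existing block, so the caller passes it in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical debug-build stand-in for realloc: it never resizes in
   place. The old block is still live when the new one is allocated, so
   the returned address always differs from the old one. */
void *debug_realloc(void *old, size_t old_size, size_t new_size)
{
    assert(new_size > 0);

    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;                 /* as with realloc, old stays valid */

    if (old != NULL) {
        size_t keep = old_size < new_size ? old_size : new_size;
        memcpy(fresh, old, keep);    /* carry over the surviving bytes */
        memset(old, 0xA5, old_size); /* poison so stale pointers read garbage */
        free(old);
    }
    return fresh;
}
```

Any caller that keeps using the old pointer after a resize should then fail on every run in testing, rather than only on the rare occasions when the real allocator happens to move the block.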

~~~
jacquesm
That's a good trick, regardless of which book it came from.

In the software I wrote about above I suspect a very subtle resource leak.

A nice example of such a leak is for instance forgetting to close an opened
file descriptor if some other rare error condition occurs elsewhere in the
code (not that that's it, but that's how you can get to the point where
something will run for months on end without crashing and then suddenly it
does).

File descriptor leaks can be relatively easily traced using lsof by the way
(one of the step-child utilities that really should be in every coder's
toolbox, right next to gprof and make).
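
To make the error-path leak described above concrete, here is a hedged sketch for a POSIX system. The function names and the `rare_error` flag are invented for illustration and merely stand in for whatever unusual failure happens elsewhere in the code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Leaky version: the early return skips close(), so each rare error
   costs one descriptor until the process eventually runs out. */
int process_file_leaky(const char *path, int rare_error)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (rare_error)
        return -1;          /* BUG: fd is never closed on this path */
    close(fd);
    return 0;
}

/* Fixed version: every exit after a successful open() goes through
   one cleanup label. */
int process_file_fixed(const char *path, int rare_error)
{
    assert(path != NULL);
    int rc = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (rare_error) {
        rc = -1;
        goto out;
    }
out:
    close(fd);
    return rc;
}
```

With the leaky version, running `lsof -p <pid>` against the process at intervals should show the list of open files slowly growing.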

------
fierarul
Finding the root cause of failure is essential, especially when you are
working with large codebases.

For my biggest running project, I customize about 1GB of source code not
written by me. Every bug needs to be chased until one actually understands why
it happened; otherwise it's too risky to just make a patch that "seems" to fix
it.

Plus, in the process you usually learn a new and interesting thing about a
previously unknown part of the codebase.

Of course, very few customers actually understand the importance of this and
have the budget to allow you this "luxury".

Fixing the bug for most of the servers and leaving a small fraction with the
old codebase to investigate the bug some more sounds interesting, but I
couldn't do that.

~~~
gaius
A _gigabyte_ of source code!? That's the entire depot not just one tag, right?

~~~
pmjordan
The Linux kernel currently is around 450MB of source code. I've worked on
individual projects with 1-2MLOC, which I suspect is about 100MB of code. A
couple of those don't seem all that implausible.

~~~
gaius
Hmm

    yulia:/usr/src# bzcat linux-source-2.6.32.tar.bz2 |wc -c
    382382080

382M. Tho' is it really one "thing"? A lot of that is device drivers, loadable
modules etc that no single install will ever have.

~~~
pmjordan
I guess my 450MB are from 2.6.34 and are the disk usage of the actual untarred
sources that I'm working from, so my figure includes cluster off-cuts (the
bytes needed to round up to the nearest 4096 for each file).
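
The rounding described above can be sketched as a small helper. The name `allocated_bytes` is made up for illustration, and the model is simplified: a real filesystem may use a different block size or store tiny files inline.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes a file occupies on disk when space is handed out only in whole
   blocks (4096 bytes each in the figure quoted above). */
uint64_t allocated_bytes(uint64_t file_size, uint64_t block_size)
{
    assert(block_size > 0);
    uint64_t blocks = file_size / block_size
                    + (file_size % block_size != 0 ? 1 : 0);
    return blocks * block_size;
}
```

Under this model a 1-byte file takes 4096 bytes and a 4097-byte file takes 8192, which is why a tree of many small source files can use noticeably more disk than its byte count.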

It's certainly one thing in that it all uses the same build script; and any of
the pieces are pretty worthless on their own.

------
Mongoose
Is there a bash.org-style site for programmer horror stories? They pop up once
in a while on HN, but it would be nice to have a single-service site for these
kinds of stories.

~~~
gaius
<http://thedailywtf.com/Default.aspx>

------
sh1mmer
Great article, but for some reason when I saw the headline I thought it was
going to be about Al Gore.

------
gcheong
I thought this was going to be about how Al Gore programs his own climate
change models.

