
My 8 hour journey to a single character - lbrandy
http://lbrandy.com/blog/2009/11/the-8-hour-journey-to-a-single-character/
======
staunch
The real lesson here is the power of asking the experts. Over the years there
have been countless times when I've replaced a huge or messy workaround with a
tiny fix suggested by someone on IRC or a mailing list.

Googling is great, but it only helps when you know what you're looking for.
Telling expert humans _what you're trying to achieve_ is far more useful.

~~~
DarkShikari
_Telling expert humans what you're trying to achieve_

This is critical. All sorts of "dumb questions" turn out to have been created
by faulty assumptions or bad solutions to a larger problem. Often it's like
traveling to the top of a mountain to boil water instead of fixing the broken
stove. When asking for help, users should always ask the big picture question
("I need help setting up a server to do X, I'm getting error message Y when
doing Z" rather than just the specific problem "I'm getting error message Y
when doing Z").

~~~
trafficlight
An excuse to share my favorite bash.org quote:

<glyph> For example - if you came in here asking "how do I use a jackhammer"
we might ask "why do you need to use a jackhammer"

<glyph> If the answer to the latter question is "to knock my grandmother's
head off to let out the evil spirits that gave her cancer", then maybe the
problem is actually unrelated to jackhammers

<http://bash.org/?866112>

------
mquander
I had the pleasure once of debugging an issue in C# that boiled down to the
difference between a property "Percent", represented as e.g. 45.6, and a field
"percent", stored as a fraction, e.g. 0.456. After correcting a reference to
the field into a reference to the property, I was proud to claim that my fix
changed a single bit - the difference between ASCII 'P' and ASCII 'p'.

Can't get smaller than that!
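
Not the original C#, but a rough Python sketch of the same trap (the class and
the names here are hypothetical):

    class Task:
        def __init__(self, fraction):
            self.percent = fraction      # field: stored as a fraction, e.g. 0.456

        @property
        def Percent(self):               # property: exposed as a percentage, e.g. ~45.6
            return self.percent * 100

    t = Task(0.456)
    print(t.percent)             # 0.456 -- the field
    print(t.Percent)             # ~45.6 -- the property
    print(ord('p') ^ ord('P'))   # 32: the two names differ by exactly one bit (0x20)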

~~~
DougBTX
I was at a pompously titled Software Craftsmanship conference, and one of the
questions going around was: "what is the minimum commit size?"

My first guess was one character, since you can fix a bug by changing one
character. But in the 'P' vs 'p' case, the minimum size could be larger -
fixing the calling code is fine, but is there a change that can be made to
stop this bug appearing in other calling code? E.g., simplify the interface so
that one of the 'p's is protected, so that the wrong Percent can't be set by
accident.
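
A sketch of that idea in Python, where a leading underscore has to stand in
for "protected" (names again hypothetical):

    class Task:
        def __init__(self, fraction):
            self._fraction = fraction    # "protected": callers shouldn't touch this directly

        @property
        def percent(self):               # the only public spelling left
            return self._fraction * 100

With a single public name, the calling-code mix-up can't come back.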

------
jrockway
_One character. One friggin’ “J”. 8 hours. I hope no one is keeping track of
“lines of code per hour”._

Sounds like he / you acquired some essential domain knowledge in those eight
hours. Your code is more efficient, and you understand the problem space. That
is what programming is about, not how many lines of code you write.

(When working on a large existing code-base, my goal is to _remove_ lines of
code from the project. Writing code means you are writing bugs. Removing code
means you are removing bugs :)

~~~
req2
Whoosh.

~~~
spatulon
Can you explain what apparently obvious point you think jrockway has missed?

~~~
req2
jrockway bromidically responded to a throwaway joke that in itself already
acknowledged the content of the response.

~~~
jrockway
Judging from various codebases I've had to maintain, most programmers don't
think this is a joke. They really think this guy wasted 8 hours learning how
his code is supposed to work. They would have just reverted to last night's
version and moved on to adding some more technical debt. (It's OK because it's
test-driven development. If the tests pass, the code is fine!)

~~~
req2
I'm sure most of the people here feel the same way. It just seemed like an
audience mismatch, and if you acknowledged it (either the joke nature or the
audience mismatch), I wouldn't have felt the need to "Whoosh" you.

------
DarkShikari
I had a similar case; a regression that lasted 3 weeks and took half a day of
grovelling through code to fix... and turned out to be two one-character typos
in a #define name:

<http://git.videolan.org/?p=x264.git;a=commitdiff;h=2389de25ff7fe6f84c9c885578c0fbaa6b656f4a>

------
chipsy
I just fixed a "should be ||, was &&" logic error that manifested itself
fairly mysteriously in a one-dimensional partitioning data structure I had
made (meant for indexing things like collision rectangles that have a
"canonical value" and a "represented value" which need to be synchronized).

When it removed elements, it was matching both the represented value (an
integer) and the node (an object reference). Using && meant that it removed
too greedily. Because updates always went "remove then add," this error went
undetected until I hit the case where two or more objects shared an identical
reprval. Then it removed all of the shared objects, leaving only one behind
when I went to query the partition. After going through "does my query
function work right, does my list implementation work right," I finally
narrowed it down to the add/delete functions.
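
The shape of the bug, sketched in Python (the real structure isn't shown here,
so this is just the pattern):

    def remove(entries, reprval, obj):
        """Return entries with the (reprval, obj) pair removed."""
        kept = []
        for entry in entries:
            # Correct: keep an entry unless BOTH its represented value and its
            # object reference match the one being removed.
            if entry.reprval != reprval or entry.obj is not obj:
                kept.append(entry)
            # The buggy version used `and` here, so an entry sharing just the
            # reprval (or just the object) with the target was dropped too --
            # the "removed too greedily" behaviour above.
        return kept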

~~~
mleonhard
That sounds like an error that would have been detected by a unit test.

~~~
holygoat
Being a little snide... that sounds like a unit test that occurs to you after
encountering that bug.

~~~
chipsy
It's a conceptual error with the problem, which means that it indeed WAS
missed by unit tests. I wrote some of those and checked the "overlapping
numbers" case; the difference was that I did so in a naive situation where the
values were inserted once and never updated.

Now, it would have gone from a "mysterious" error to an obvious one had I
built an academic-style, step-by-step visualization applet of the entire
structure's processes before integration. But when the code was first written
I was confident that, because this structure was being used with extremely
high frequency, the only thing I had to write an explicit test for off the bat
was off-by-one ranges causing edge-case "near misses"; anything else would
make itself known at runtime after integration.

Time it would have taken for a visualization applet: ~3-6 hours. Time it took
to solve the bug: ~3 hours.

It's placing a bet - and the bet is that the bugs are limited to a specific
segment of code that can be narrowed down easily; if the problem were
architectural in nature I'd be in far deeper shit because that would make the
same class of bug appear all over, with any number of different symptoms.

------
stuff4ben
I love stories from the trenches like this! I'm currently diving into the
shallow end of Lucene trying to accommodate ridiculous customer requirements.
I'm sure in a couple hours there will be a "hidden flag" story somewhere in
here.

~~~
delano
The part of my code soul that dealt with search requirements goes out to you.

------
mhartl
Something like this happened to me today. I noticed a floating menu that
worked fine in Firefox and Safari didn't work in (wait for it) IE. After about
three hours, I discovered that the JavaScript for the menu needed a CSS top
property to work with; FF/Safari set it to 0 by default, but IE apparently
needs an explicit value. Three hours, one line:

      #navtool {
        top: 0;
      }

But it works, and that's what really matters.

------
apgwoz
I now realize why people don't always share their failures in the way Feynman
said we should in "Cargo Cult Science."

Comments like this:

> wow, you are an idiot. i feel bad for you. you work on image processing for
> a living?

------
hxa7241
I would think there would be another cause of different results: chrominance
being encoded at lower resolution, dynamically and spatially. So converting to
RGB then to gray would make it more quantised and blurrier . . . well, I
haven't thought this through in detail, so I am not certain, but it seems a
possible consideration . . .

(since each channel of RGB is a weighted combination of all three YUV, each of
RGB must be somewhat lower resolution, and that can't be recovered, so when
converted to gray, that lower resolution must remain . . . so it seems it
_would_ be a deficiency . . . )
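
For concreteness, one common flavour of the conversion (my sketch; BT.601
coefficients, full range, which may not be what the actual pipeline used):

    # YCbCr -> RGB (BT.601, full range): every output channel mixes in the
    # chroma planes, so subsampled/blurred chroma touches R, G and B alike.
    def ycbcr_to_rgb(y, cb, cr):
        r = y + 1.402 * (cr - 128)
        g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
        b = y + 1.772 * (cb - 128)
        return r, g, b

    # If gray is then taken with the matching luma weights
    # (0.299*R + 0.587*G + 0.114*B), the chroma terms cancel out exactly,
    # so any remaining difference comes from rounding/clipping of the
    # intermediate 8-bit RGB rather than from chroma resolution as such.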

~~~
DarkShikari
That's not lower resolution, that's lower precision. It's the difference
between the number of samples and the accuracy of those samples.

This has nothing to do with luminance range though. Even if the output samples
only had 4-bit precision, it wouldn't make black into gray.
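
On the range point, a quick sketch (assuming 8-bit limited/"studio" range with
black at Y=16 and white at Y=235):

    # If limited-range data is read as full range (0-255), black shows up as
    # a dim gray unless it is rescaled first.
    def studio_to_full(y):
        return round((y - 16) * 255 / 219)

    print(studio_to_full(16))    # 0   -> true black
    print(studio_to_full(235))   # 255 -> true white
    # Skipping this step leaves "black" at 16/255: visibly lifted toward gray.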

~~~
hxa7241
I understand the difference, which is why I said 'lower resolution, spatially
and dynamically'.

Are you saying definitively that chrominance is not encoded at lower spatial
resolution?

I also said it might be _another_ cause of difference, not _the_ cause.

Anyone might want to look at <http://en.wikipedia.org/wiki/4:2:2> before
simply assuming I am talking nonsense.

------
arad
A little off-topic, but I was under the impression that using ffmpeg
commercially was problematic. Did you find you had to address that issue at
all?

~~~
DarkShikari
ffmpeg is LGPL; it is used by hundreds of commercial applications around the
world. For that matter, it has _no commercial competitors_ for many of its
use-cases.

Even On2, a _competing encoder company_, includes instructions with its
encoder on how to set up mplayer (which uses ffmpeg) for decoding input
videos.

~~~
NikkiA
Besides which, it sounds like they're only using it in-house for processing,
not distributing anything based on it. So even if they're not compiling it in
LGPL mode, they're probably perfectly OK.

