
Perfect timing: a bug that only shows up in the first 250ms of a day - gregschlom
http://labs.qt.nokia.com/2011/04/09/perfect-timing/
======
Stormbringer
The architecture/design part of this article sounded a bit odd. For the sake
of 'purity of design' or something like that he decided to eschew
constructors, but then it turns out that an un-initialized timer is not
useful... which to me is the classic example of when you _do_ provide a
no-args constructor with a smart default...?
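
A minimal sketch of what a no-args constructor with a smart default could look like. This is not Qt's API: `ElapsedTimer`, its `clock` parameter, and `elapsed_ms` are hypothetical names used only for illustration. The assumption is that "smart default" means the timer starts counting as soon as it is constructed, so it can never be read in an uninitialized state.

```python
import time


class ElapsedTimer:
    """Hypothetical timer that is always valid after construction."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so tests can control time; the default
        # is a monotonic clock, which is not affected by wall-clock changes.
        self._clock = clock
        # Smart default: start timing immediately instead of leaving the
        # start value uninitialized.
        self._start = clock()

    def restart(self) -> None:
        """Reset the reference point to the current clock reading."""
        self._start = self._clock()

    def elapsed_ms(self) -> float:
        """Milliseconds since construction or the last restart."""
        return (self._clock() - self._start) * 1000.0
```

With this shape there is no way to ask for the elapsed time of a timer that was never started, which is the trap the comment is describing.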

These sorts of designers annoy me. They've disappeared too far up their own
backsides for the sake of some arbitrary aesthetic. Moreover, by leaving traps
like this in their own code they always get out of their depth, and it's the
pragmatists like me who have to go in after them and rescue them.

My rule of thumb when choosing between great art and great pragmatism is to
make life easier for 'the next guy'. Because 9 times out of 10 _you're_ the
next guy.

~~~
cheez
When I read that part, I thought: there is your problem.

------
arethuza
Reminds me of the time we had a data encoding bug that only happened when the
length of the data was a multiple of 57.

At first it appeared to be random. Then a colleague noticed the commonality in
the lengths of the data that had the problem, and after that replicating it and
fixing it were relatively straightforward.

~~~
saurik
One of the many interesting bugs that Apache 2 has accumulated workarounds for
is one that involves HTTP headers that end with a newline on the 256th or
257th characters.

<http://hi.baidu.com/blog/blog/item/bd01213fd850dfe954e72300.html>

------
asymptotic
Time-based bugs are a nightmare to debug but in my opinion the far more
insidious side to them is that, even once you've found the bug, there's an
easy workaround - just reboot the box! Papering over problems is asking for
disaster, as the US armed forces found out:

Patriot Missile Software Problem
<http://sydney.edu.au/engineering/it/~alum/patriot_bug.html>

"Ironically, Israeli forces had noticed the anomaly in the Patriot’s range
gate’s predictions in early February 1991, and informed the U.S. Army of the
problem. They told the Army that the Patriots suffered a 20% targeting
inaccuracy after continuous operation for 8 hours.

Army officials presumed that Patriot users were not running the systems for
longer than 8 hours at a time. They suggested if they would be running for
continuous periods, they were rebooted regularly (which took around 1 minute
and would reset the system clock to zero)."

This is where I part company with the author, when he says:

"If a particular failure happens twice a year in thousands or even millions of
runs in that period, you’d be excused if you attributed that to 'the alignment
of the planets' and simply went on your merry way."

No. No, a thousand times. Whenever you find a problem, any problem, you owe it
to yourself to dig until you find the actual cause. If you don't find the
cause, I guarantee the cause isn't magically going to become "cosmic rays",
and rest assured the problem will come back to haunt you when it combines with
other "random" issues in a cataclysmic failure; fate is funny that way.
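
To put rough numbers on the Patriot clock drift quoted above, here is a sketch under assumptions that are not stated in this thread: the widely cited explanation is that the system counted time in tenths of a second and converted to seconds with a truncated binary approximation of 0.1. The 23 fractional bits below are an assumption picked to match the commonly quoted per-tick error of roughly 0.000000095 s; the real register layout may differ.

```python
from fractions import Fraction

FRACTION_BITS = 23     # assumption: chosen to match the commonly quoted error
TICKS_PER_SECOND = 10  # assumption: clock counts tenths of a second


def truncated_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 0.1 chopped (not rounded) to `bits` binary fractional digits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours_running: float) -> float:
    """Accumulated timing error after running continuously for the given hours."""
    error_per_tick = Fraction(1, 10) - truncated_tenth()
    ticks = Fraction(hours_running) * 3600 * TICKS_PER_SECOND
    return float(ticks * error_per_tick)


if __name__ == "__main__":
    for hours in (8, 100):
        print(f"{hours:>3} h uptime: drift of {clock_drift_seconds(hours):.4f} s")
```

Under these assumptions the drift is about 0.027 s after 8 hours and about 0.34 s after 100 hours. A reboot resets the tick count to zero, which hides the drift without touching the conversion that causes it.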

------
pdaviesa
We had a strange bug that only surfaced on the last day of every month. One of
our directors asked what we were doing to troubleshoot this bug. I responded
that we were a bit more focused on troubleshooting bugs which occurred every
day.

------
qntm
Bugs in time-based code are great. The most amusing example I've seen was an
intermittently-failing automated test case which, I discovered after careful
scrutiny of the results database, failed only on Sundays.

There's also the classic time zone unit test "convert this Europe/London
timestamp to America/New_York time, verify that it is now 5 hours behind",
which, due to Bush's pointless Daylight Saving shift, now fails for two weeks
twice a year. Of course, two weeks is _just_ short enough and the bug is
_just_ inconsequential enough (the time zone code itself is working correctly,
after all, it's the test case that's broken) that the regression goes away
before anybody fixes it.
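
The failing assumption can be reproduced with a short sketch. It assumes Python 3.9+ with the standard `zoneinfo` module and an available time zone database; `hours_new_york_is_behind` is a hypothetical helper, not code from any of the test suites mentioned here.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")
NEW_YORK = ZoneInfo("America/New_York")


def hours_new_york_is_behind(london_wall_time: datetime) -> float:
    """How many hours New York's wall clock trails London's at that instant."""
    london = london_wall_time.replace(tzinfo=LONDON)
    new_york = london.astimezone(NEW_YORK)
    gap = london.utcoffset() - new_york.utcoffset()
    return gap / timedelta(hours=1)


if __name__ == "__main__":
    samples = {
        "mid-winter": datetime(2011, 1, 15, 12, 0),
        "US on DST, UK not yet": datetime(2011, 3, 20, 12, 0),
        "mid-summer": datetime(2011, 7, 1, 12, 0),
        "UK off DST, US not yet": datetime(2011, 11, 2, 12, 0),
    }
    for label, moment in samples.items():
        print(f"{label}: {hours_new_york_is_behind(moment)} hours behind")
```

A test that hard-codes "5 hours behind" and uses the current date as input passes for most of the year and fails only inside the windows where one side has changed its clocks and the other has not. Pinning the input date, as the samples above do, makes the behaviour reproducible.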

I can only imagine how much otherwise robust code flips its lid when presented
with, for example, a leap second. It's the number one argument for the use of
mock objects in testing.
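
One way to read the mock-object argument, sketched with a hand-rolled fake clock rather than any particular mocking library. `is_business_day` and `fixed_clock` are hypothetical names; the point is only that code which asks for the current time through a parameter can be tested on any day of the week, including Sunday.

```python
from datetime import datetime


def is_business_day(now=datetime.now) -> bool:
    """Hypothetical code under test: True from Monday to Friday."""
    # `now` is a callable so tests can substitute a fixed clock.
    return now().weekday() < 5


def fixed_clock(year: int, month: int, day: int):
    """Return a fake clock that always reports noon on the given date."""
    return lambda: datetime(year, month, day, 12, 0)


if __name__ == "__main__":
    # 10 April 2011 was a Sunday, 11 April 2011 a Monday.
    print(is_business_day(fixed_clock(2011, 4, 10)))
    print(is_business_day(fixed_clock(2011, 4, 11)))
```

The same test then gives the same answer whichever day the build server happens to run it.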

~~~
yahelc
"Bush's pointless Daylight Saving shift"

?

Staunch Democrat though I may be, how is the Daylight Savings shift Bush's
fault?

~~~
xinsight
He signed the Energy Policy Act of 2005, which from 2007 starts US DST about
three weeks earlier and ends it a week later. Canada followed along and now
North America is out of sync with most other countries.

~~~
burgerbrain
So just use UTC and get on with your life?

~~~
xinsight
I'd prefer to just get rid of DST.

------
mrspeaker
Fantastic work! But bug finds like this make me cringe. You can guarantee that
your client, or your client's boss, only tests your app out when they get home
- at midnight. And you're reduced to a mumbling, sobbing mess as you tear down
and rebuild your code yet again.

