
We had a unit test which only failed on Sundays - ColinWright
http://qntm.org/unit
======
11thEarlOfMar
Once upon a time in a Fab far away...

We had received three reports over as many months of a system that failed
mysteriously in a manufacturing setting. We sent a field tech out each time.
He examined the system, examined logs, found that an error had happened, but
found no explanation for it.

On the fourth report, while examining the system, the field tech noticed that
a single thread of sunlight was passing through a high window and into the
viewport of the system. He immediately realized that the optical sensor on the
automated arm would be spoiled by that beam had the system been operating.

We had the manufacturer cover the window, and filed it as a bug that only
happens at a specific time of day in a specific season when the sun is
shining.

------
amelius
This reminds me of the case of the "500 mile email" [1]

[1]
[http://www.ibiblio.org/harris/500milemail.html](http://www.ibiblio.org/harris/500milemail.html)

------
lordnacho
Corner cases involving time are endless (you can use this joke for free):

\- Can your code handle a datetime if the logic expects midnight times?

\- What happens if you have a duration that starts on November 1 and ends on
November 2, and the user decides to change the end time by an hour? Daylight
savings?

\- If I fly from Denver to Arizona in December, how does it deal with the time
zones? Oh great, you need a table.

\- What's the next business day? Another table, this time with holidays.

\- When does an all day event start? Midnight? Midnight where?

\- Pre-gregorian time? Warhammer 40K time?
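To make the November 1 to November 2 case above concrete, here is a rough sketch. The two fixed offsets are my own hard-coded assumptions for US Mountain Time around the 2015 changeover; real code would get them from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for US Mountain Time in 2015 (hard-coded for
# illustration only; real code would look these up in a tz database).
MDT = timezone(timedelta(hours=-6), "MDT")  # daylight time, before the change
MST = timezone(timedelta(hours=-7), "MST")  # standard time, after the change

# Midnight to midnight across the 1 November 2015 changeover.
start = datetime(2015, 11, 1, 0, 0, tzinfo=MDT)
end = datetime(2015, 11, 2, 0, 0, tzinfo=MST)

# Aware datetimes with different offsets are compared on the UTC timeline.
elapsed = end - start

# Dropping the offsets gives what naive wall-clock arithmetic would report.
wall_clock = end.replace(tzinfo=None) - start.replace(tzinfo=None)

print(elapsed)
print(wall_clock)
```

The "one day" event is 25 hours long in elapsed time and 24 hours long on the wall clock, so "change the end time by an hour" depends on which of the two the code stores.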

Then there's platform based problems:

\- Is there an easy way to do time arithmetic?

\- Is there an easy way to extract time parts? (iOS I'm looking at you!)

\- Are time zone aware times different to UTC times?

~~~
greggyb
Because of questions like these, the widest table in every dimensional model
I've ever built has been the date dimension.

~~~
rjbwork
Yeeeep. Can be tens to hundreds of columns depending on the requirements.
Fiscal years, UTC+14, holidays, etc, etc, etc.

~~~
greggyb
Work days, index field for every granularity of time (for fiscal calendars
that don't align to calendar granularities), day number of every time span,
display fields stored in text ("I don't care about the intricacies of date
formatting in locales, just make it always look like this "), abbreviations of
display fields.

How about the retail requirement of capturing store open/close status for year
on year comparisons? Date dimension becomes Store_Date dimension, with
indicators for what to include/exclude in comparison totals?

I give trainings on dimensional modeling and query design/optimization. The
vast majority of my examples are for dates, for two reasons.

1\. Everyone needs some form of date dimension.

2\. If you can solve date problems, you can apply any other logic you want
trivially.

2 is not 100% true, but sometimes it seems that way.
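For anyone who hasn't seen one, a toy version of a date dimension might look like the sketch below. The column names and the July fiscal year start are made up for illustration; a real one has many more columns, as described above.

```python
from datetime import date, timedelta

FISCAL_YEAR_START_MONTH = 7  # hypothetical: fiscal year begins on 1 July
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def date_dimension_row(d):
    """Build one row of a toy date dimension for calendar date d."""
    # Name the fiscal year after the calendar year in which it ends.
    fiscal_year = d.year + 1 if d.month >= FISCAL_YEAR_START_MONTH else d.year
    return {
        "date_key": d.year * 10000 + d.month * 100 + d.day,  # e.g. 20150629
        "iso_date": d.isoformat(),
        "day_of_week": d.isoweekday(),           # 1 = Monday ... 7 = Sunday
        "day_name": DAY_NAMES[d.weekday()],      # display text stored as data
        "day_of_year": d.timetuple().tm_yday,
        "is_weekend": d.isoweekday() >= 6,
        "fiscal_year": fiscal_year,
    }

def build_date_dimension(start, end):
    """Return one row per day from start to end, inclusive."""
    days = (end - start).days + 1
    return [date_dimension_row(start + timedelta(days=i)) for i in range(days)]

rows = build_date_dimension(date(2015, 6, 29), date(2015, 7, 2))
for row in rows:
    print(row["date_key"], row["day_name"], row["fiscal_year"])
```

Precomputing the columns means queries join on `date_key` instead of repeating calendar logic in every report.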

------
vinceguidry
I've come to simply expect that any code that deals with time is going to be
hard to build abstractions on top of, whether it be test cases or workflows.
The number of ways in which we use time is combinatorially large.

When introducing awareness of time into a system, I slow development down by
an order of magnitude until I have workflows built up to deal with all the
edge cases. The workflows have to be iterated on manually and slowly so the
domain model can emerge.

------
Corvus
I remember a similar bug in the early 1980s when a line-of-business system
went haywire every Wednesday. Every other day it worked fine.

We eventually found a maintenance programmer had changed "Wensday" in a
database to "Wednesday". Unfortunately this was passed to a C char day[9], so
the string had no trailing '\0'. Hilarity ensued.

------
0x0
Related: OpenOffice won't print on Tuesdays:
[https://bugs.launchpad.net/ubuntu/+source/file/+bug/248619](https://bugs.launchpad.net/ubuntu/+source/file/+bug/248619)

------
lacker
Is this really a bad thing? If there's a bug on only one day of the week, then
writing your unit test to work on the current day will actually catch that
sort of bug. Whereas if you hardcode a day then it will never catch that bug.
So it seems there is at least one advantage to this sort of testing in
practice.

~~~
mason55
A better way to test it would be to deterministically generate the days that
you want to test with. That might be one date for each day of the week, or one
date for each day of the year, or some set of dates that include some normal
dates and edge cases like leap days.

The problem with having non-deterministic tests is that when they break you
don't know if it's because the test broke or the code broke and even if it was
the code you don't know when it broke.

What if someone checked in a change that broke how things were handled on
Sundays but the tests were only run on Sundays once/year? You'd have to run
each build through the test suite, one by one, and you'd have to do it on that
day, just to figure out which check-in broke it.
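A minimal sketch of what I mean, assuming the code under test takes the date as an argument. `is_weekend` is a stand-in I made up, not anything from the article.

```python
from datetime import date, timedelta

def is_weekend(d):
    """Hypothetical code under test. It takes the date as an argument
    rather than reading the system clock."""
    return d.isoweekday() >= 6

# 1 June 2015 is a Monday, so these seven dates cover each weekday once.
WEEK = [date(2015, 6, 1) + timedelta(days=i) for i in range(7)]
EXPECTED = [False, False, False, False, False, True, True]

# A few fixed edge cases, including a leap day.
EDGE_CASES = {
    date(2016, 2, 29): False,   # leap day, a Monday
    date(2015, 12, 31): False,  # last day of the year, a Thursday
    date(2016, 1, 2): True,     # first Saturday of 2016
}

def test_is_weekend():
    for d, expected in zip(WEEK, EXPECTED):
        assert is_weekend(d) == expected, d
    for d, expected in EDGE_CASES.items():
        assert is_weekend(d) == expected, d

test_is_weekend()
```

Every run checks every weekday, so a Sunday-only regression shows up on the first build after the bad check-in, whatever day that build happens to run.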

------
lmm
Much less interesting than I was expecting. The program under test behaved
consistently; the test itself contained logic to perform a different assertion
on Sundays.

~~~
protomyth
but the discussion with all the other, more interesting, bugs makes up for it

------
stygiansonic
Date/Time is a fine example of something that initially seems simple, but
turns out to be exceedingly complex at times. For this reason, I usually use a
proper library (like Joda Time for Java, Moment.js for JavaScript, though Java
8 has improved things a lot with the built-in APIs) when dealing with any
date/time data, especially when manipulating it, i.e. finding out a duration
between two dates, adding a duration to a date, etc.

Also, it's helpful if you have some sort of DateService that you can mock
rather than using something like "new Date()..." in your code. The article
touches on this; in general you should not have tests that rely on or deal
with the current datetime as that's just asking for non-deterministic
behaviour. By having your code call out to a DateService to get the current
date-time, rather than creating it on its own, unit testing becomes easier.

My favourite date/time story:
[http://stackoverflow.com/questions/6841333/why-is-subtractin...](http://stackoverflow.com/questions/6841333/why-is-subtracting-these-two-times-in-1927-giving-a-strange-result)

~~~
zwerdlds
Using a library is definitely the way to go when you can. For serialization,
ALWAYS use 8601.
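For example, a round trip through an ISO 8601 string with Python's standard library (just a sketch; the timestamp is arbitrary):

```python
from datetime import datetime, timezone

# An arbitrary timestamp with an explicit UTC offset.
moment = datetime(2015, 6, 28, 13, 45, 0, tzinfo=timezone.utc)

# Serialize to ISO 8601 text, then parse it back.
text = moment.isoformat()
restored = datetime.fromisoformat(text)

print(text)
print(restored == moment)
```

The offset travels with the string, so the reader doesn't have to guess which zone the writer meant.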

------
swanson
Date formatting is the source of so many of these headscratchers :(

I'm reminded of the difference between 'yyyy' and 'YYYY'
[https://news.ycombinator.com/item?id=8810157](https://news.ycombinator.com/item?id=8810157)
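A quick way to see the difference, sketched with Python's ISO calendar as a stand-in for the week-based year (the week-based year in a given formatter may follow locale rules rather than ISO, so treat this as an approximation):

```python
from datetime import date

d = date(2014, 12, 29)  # a Monday in the last week of December

# Calendar year versus ISO week-based year for the same date.
iso_year, iso_week, iso_weekday = d.isocalendar()

print(d.year)    # calendar year, what 'yyyy' is meant to give
print(iso_year)  # week-based year, roughly what 'YYYY' gives
print(iso_week)
```

The two values only disagree for a few days around New Year, which is why the wrong pattern survives testing for most of the year.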

------
ergothus
If I'm parsing the article correctly (and I'm not sure I am), the test was
written to care about the current day, because the code under test defaulted
to the current day when no day was specified.

Someone please tell me what "should" have been done. While I recognize that
unrepeatable tests can be bad, I don't see how one can test that defaulting
code works correctly without messing with whatever that defaulting code relies
on (in this case, the system date). I'd think I'd prefer to have some
conditional logic in the test than screw with the system clock.

Then again, enough doesn't make sense here (why was this test failing?) that
I'm probably misunderstanding the basis.

~~~
edejong
Our systems generally rely on a (dependency injected)
'currentDateTimeService'. In unit-tests, the currentDateTimeService is
dependency-injected with a mock service, always returning the same date.

But here, I guess the better solution would be to fix the tested code: don't
gracefully fall-back, but fail fast. I've seen too many places where graceful
defaulting code ended up corrupting important data.
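Roughly like this. It's a sketch with invented names (`SystemClock`, `FixedClock`, `default_day_name`), not our actual service:

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

class SystemClock:
    """Production clock: reads the real system time."""
    def now(self):
        return datetime.now()

class FixedClock:
    """Test clock: always returns the same moment."""
    def __init__(self, fixed):
        self._fixed = fixed

    def now(self):
        return self._fixed

def default_day_name(clock, day=None):
    """Hypothetical code under test: fall back to today's weekday name
    when no day is given."""
    if day is not None:
        return day
    return DAY_NAMES[clock.now().weekday()]

# In a test, inject a clock pinned to a known Sunday.
sunday_clock = FixedClock(datetime(2015, 6, 28, 12, 0))
print(default_day_name(sunday_clock))
print(default_day_name(sunday_clock, day="Tuesday"))
```

The defaulting path gets tested on a Sunday every single run, and nobody has to touch the system clock.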

------
protomyth
The funner bug is when your testers find a bug that happens when they work
late (after 6pm CST) that you cannot duplicate the next morning.

~~~
manicdee
I've fixed a bug that broke a web app any time the user did something after
10pm. Turns out someone had written their own date/time parser and validated
the time using a regex along the lines of "^[01]?\d:[0-5]\d:[0-5]\d".
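Here's a sketch of that pattern next to one possible fix (the `FIXED` pattern is my own suggestion, not what ended up in the app). Taken literally, the pattern above rejects every hour from 20 to 23, which covers the after-10pm case.

```python
import re

# The pattern as quoted: the hour may only start with 0 or 1.
BUGGY = re.compile(r"^[01]?\d:[0-5]\d:[0-5]\d")

# One possible fix: allow hours 20-23 explicitly and anchor the end.
FIXED = re.compile(r"^(?:[01]?\d|2[0-3]):[0-5]\d:[0-5]\d$")

for text in ["9:05:00", "19:59:59", "22:15:00", "24:00:00"]:
    print(text, bool(BUGGY.match(text)), bool(FIXED.match(text)))
```

A test suite that only ever ran during office hours would never feed it a failing input.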

------
tzakrajs
Wouldn't this be an integration test if it was testing a dependency? I believe
unit tests should have all dependencies mocked out to tightly control the
scope of that which you are actually testing.

------
kbart
Two things I hate dealing with most while developing software: graphics and
date/time management. Every time a new project starts, you have to deal with
the same problems over and over again..

------
SEMW
A fun one that I came across was a test (in an old Rails app I was
maintaining) that failed only on 29th January in years preceding a leap
year.

The test was:

    
    
      Subscription.new(starts_at: 1.month.from_now).ends_at.should == 13.months.from_now
    

On 29th Jan 2015, 13.months.from_now is 29th Feb 2016; but 1.month.from_now is
28th Feb 2015, and adding a year to that gives 28th Feb 2016.

(Possible conclusion: when using activesupport-style magic date helpers, look
up their exact semantics and be sure that what they're doing is what you mean
them to do...)
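The arithmetic can be reproduced outside Rails. This is a sketch with a hand-rolled `add_months` helper that clamps to the end of the month; it imitates the behaviour described above and is not ActiveSupport's implementation.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Add calendar months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))

today = date(2015, 1, 29)

# 13 months in one step: the 29th exists in February 2016.
direct = add_months(today, 13)

# 1 month, then 12 months: the first step clamps to 28 Feb 2015,
# and the 28th is carried forward into 2016.
stepwise = add_months(add_months(today, 1), 12)

print(direct)
print(stepwise)
```

Once the day has been clamped it stays clamped, so adding months in two steps is not the same as adding them in one.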

~~~
markburns
Another possible conclusion: fix this several years from now, the first time
it causes a test failure and still probably before it causes any production
problems.

~~~
lmm
If you have time bugs that only manifest at the DST changeover then everyone
knows why your bugs have happened and you look incompetent. I suspect this
applies if subscriptions handle leap years poorly too.

~~~
markburns
My point is more that if it costs more to fix it now, than it does to fix it
later (including the opportunity cost), then there is little point in fixing
an extreme edge case now. Especially if it means having to understand every
intricacy of every library you work with and spend a long time thinking of
permutations that have little effect on the business.

It depends on the situation you are in as to whether that cost calculation
works out in your favour.

It is not always the case that the code has to work in 100% of edge cases, with
everything thought through up front and no possible bugs arising from not fully
understanding every line of code in every library. Sometimes it makes more
sense to fix a bug that happens every four years when it causes a problem. In
this case the bug may manifest itself as a user seeing an off-by-one-day error
on a subscription details page, in which case it may never make sense to fix
that page: the user may never look at it, and even if they did, they may never
care.

~~~
lmm
> Sometimes it makes more sense to fix a bug that happens every four years
> when it causes a problem.

Sometimes yes, but you need to actually perform the risk assessment/cost-
benefit analysis. Many bugs are costly enough that it's worth fixing them pre-
emptively rather than always waiting for a problem to happen before you do
anything about it.

------
brudgers
Related:
[http://stackoverflow.com/questions/4608470/why-dec-31-2010-r...](http://stackoverflow.com/questions/4608470/why-dec-31-2010-returns-1-as-week-of-year)

------
VLM
The article was surprisingly complicated. I was expecting something like
"yesterday is DAYOFWEEK(NOW())-1", which works every day except Sunday.
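Sketched in Python, assuming the MySQL numbering where DAYOFWEEK returns 1 for Sunday through 7 for Saturday (the helper names are mine):

```python
from datetime import date

def dayofweek(d):
    """1 = Sunday ... 7 = Saturday (assumed convention, as in MySQL)."""
    return d.isoweekday() % 7 + 1

def yesterday_naive(d):
    """The buggy version: plain subtraction, no wrap-around."""
    return dayofweek(d) - 1

def yesterday_wrapped(d):
    """Wrap around so that the day before Sunday is Saturday (7)."""
    return (dayofweek(d) - 2) % 7 + 1

sunday = date(2015, 6, 28)
monday = date(2015, 6, 29)

print(yesterday_naive(monday), yesterday_wrapped(monday))
print(yesterday_naive(sunday), yesterday_wrapped(sunday))
```

On Sunday the naive version returns 0, which is not a valid day number in that scheme.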

------
lectrick
tl;dr

Badly-written test, nothing to see here.

------
klodolph
ISO weeks run Monday-Sunday, not Sunday-Saturday.

------
gregmac
We had a test that was calculating a total value of a counter (so based on
differences between counter values) for "yesterday" (as well as "today", "this
week", etc). I committed code and within a few minutes get notified that my
code caused a failure in this test -- unexpected as my change didn't even
touch anything remotely related to this functionality.

As I start asking if anyone has any idea, someone else mentions they've seen
that before too, many weeks ago, but then it "fixed itself".. Today was the
1st, and when I looked at the previous failure for this test, it was also on
the 1st, but it was a couple months ago. Last month the tests were passing on
the 1st. So I start to dig into it.

To put this in perspective, we had dozens of other tests that were checking
the same calculation for various time spans across months (including
explicitly for months with both 30 and 31 days, and February-to-March for both
leap and non-leap years), and all sorts of combinations of different values,
missing data, etc, that had mostly been there for well over a year. We had
actually spent a fairly significant amount of time thinking about how to test
this for all the different combinations of dates and data.

As it was the 1st, one of the other tests was actually running with the exact
same start and end date/times as this "yesterday" test, but was passing. So I
started looking at the mock data each was using.

Turns out it was in fact a legitimate bug, but only happened if it was
currently the 1st, AND yesterday was not the 31st, AND there were no values at
all for the current month (or anytime later).

The mock data for the "yesterday" test didn't have _any_ values after whatever
yesterday was. The explicit date test for 31st-to-1st happened to have a 0
value on the 1st, which meant this bug didn't happen.

Mostly out of curiosity, I looked further back in test history. There were in
fact 3 or 4 separate times this test failed, all of which were on the 1st of
either March, May, July, October or December. But not _all_ of those dates --
because sometimes the 1st was on a weekend, or just no code was pushed, and no
build was run.

This was also in production, but probably was never seen by any customers
(none had reported it) because the "yesterday" value was only ever displayed
in the UI, and most of the time data is added hourly, so by the time a user
logged in on the 1st (say, 8 am), it was almost certain there was some piece
of data added (even if it was 0).

We added an explicit test with the data for this situation, and of course
fixed the bug.

However, this will live on as by far the most obscure time-based test failure
I've ever had to deal with.

~~~
pc86
> _I committed code and within a few minutes get notified that my code caused
> a failure in this test_

Forgive me as I'm coming from a .NET background where everything is tightly
integrated into Visual Studio, but are you not able to run your tests before
committing? We are strongly encouraged to run the full test suite prior to
committing any code for exactly this reason.

~~~
gregmac
Yeah, and this is also .NET. I can't remember if I ran the full test suite or
not (I have to admit, I don't always -- even though it's not best practice --
if it's a fairly isolated/minor change).. but in this case, due to the
problem, if I had done the code change on the 30th, it would have passed at
that time anyway. The nightly build (in the morning hours of the 1st) would
have failed.

