
Your test suite is trying to tell you something - jgrahamc
http://blog.jgc.org/2013/07/your-test-suite-is-trying-to-tell-you.html
======
jcampbell1
The real problem is `duration * time.Millisecond` makes no sense. The time api
seems a bit obtuse. Why not `duration.milliseconds`?

~~~
jgrahamc
Agree. Part of the problem here is that Go's time.Duration is just an int64,
so you can do arithmetic on it. It would be better if there was something
like NewDurationMilliseconds(). Thus the type here allowed me to shoot myself
in the foot.

Having said that it doesn't change the fact that looking into the 'flaky' test
was the right thing to do. Even if the type system were different I'm sure I
would have shot myself in the foot in some other way.

~~~
barrkel
It seems it would be better to have a .NET TimeSpan / Joda-Time
Duration type, or better yet, a type system that understood units.

------
smoyer
"By me." - I've been there many times!

I hate that SCM systems commonly describe who made changes as "blame"
(couldn't it be "praise" for when you want to find out who wrote exceptional
code?) and yet it's wonderful to be able to look back into history and see
"who" was responsible for a block of code. It can be a real learning
experience!

~~~
seren
'svn praise' is an alias for 'svn blame' (or the other way round). The more
neutral wording is usually 'annotate'. To be honest, I have rarely used these
commands to check who had written such beautiful code...

~~~
sneak
Someone once said that the only real measure of code quality was the number of
WTFs/minute emitted while reading it:

[http://www.osnews.com/story/19266/WTFs_m](http://www.osnews.com/story/19266/WTFs_m)

~~~
nathas
Might be from Joel Spolsky/Bill Gates:

"...a person who came along from my team whose whole job during the meeting
was to keep an accurate count of how many times Bill said the F word. The
lower the f***-count, the better."

[http://www.joelonsoftware.com/items/2006/06/16.html](http://www.joelonsoftware.com/items/2006/06/16.html)

------
rachelbythebay
Without knowing more about the source, it's hard to say for sure, but would an
explicit test for a failure to connect to memcached have uncovered this? That
is, if there is common code to connect to memcached, and a test to purposely
generate a timeout, it seems like it would have jumped out as taking far too
long.

If there's no common code to connect, and every user of it has its own
implementation, there might be more bugs like this lurking. Every user of
memcache in this particular code base would need to have its own "what if
memcached is slow or down?" type test. It seems such a test did not already
exist.

Following the events in this post, is there now a test to explicitly make sure
the delays are reasonable, or was the extra multiplication just removed?

~~~
jgrahamc
The big change was to write a mock memcached server where I could control
everything. The extra multiplication was removed and a bunch of new tests
written using the mock server.

------
thwest
Ah, this reminds me to go use the new std::chrono classes to strongly type my
time unit variables.

------
programminggeek
If you listen to the pain or failings of your test suite, you will likely end
up with better code. The problem is that when testing gets hard, most
developers don't look at their code and ask "why does my code make this so
difficult?"; they say "testing sucks" and delete their tests.

It's amazing what you will learn if you pay attention to the pain.

~~~
avelis
I agree. This example is a perfect candidate to further abstract the source to
remove ambiguities and better test the expectations of the source.

------
wldlyinaccurate
Great anecdote. I've never really agreed with having massive system-level
automated tests, because more often than not the random failures like this
_do_ just get ignored.

I guess it depends on the culture within your team, how much you all care
about your test suites.

------
colanderman
It amazes me how often I see otherwise competent engineers ignore spurious
failures – they don't even open a bug report!

Software doesn't fail spuriously – there's _always_ a cause.

------
elsurudo
Interesting analysis, but yup, that's what tests are for! If you've got a
sporadically failing test somewhere, that's probably a sign of a bug.

~~~
lmm
The question is what's the cost/benefit on fixing such things? Not everyone is
in a position to stop working on features for two days to track down something
like this.

This particular bug is also a strong argument for using more advanced type
systems, even if you have good test coverage.

~~~
jgrahamc
_This particular bug is also a strong argument for using more advanced type
systems, even if you have good test coverage._

Yes, it's a great pity that the compiler didn't say "You're multiplying a
time.Duration by a time.Duration and expecting it to be a time.Duration".

~~~
dllthomas
I'm not sure whether you're serious or not (and I see a few different jokes
you could be getting at otherwise), but multiplying time by time should give
you time squared, not time again. Unfortunately, even Haskell doesn't catch
errors of _that_ type if you want to use the ordinary mathematical operators,
because (*) :: Num a => a -> a -> a.

In this case, that's not what's going on, though. It's multiplying time by a
unitless scaling factor, which doesn't present the same problem. However,
there are a few ways you can get at the issue with the help of the language;
you just have to distinguish between the different units you might be using to
represent time. This doesn't really take an advanced type system: the example
of having a .milliseconds method is along these lines.

~~~
njs12345
Have you seen the dimensional library? It's a slightly abusive use of the type
system, but will check units at compile time. See e.g.
[http://code.google.com/p/dimensional/wiki/GMExample](http://code.google.com/p/dimensional/wiki/GMExample)

~~~
dllthomas
I think so; I've seen similar things, for sure. Obviously Haskell does have
the _ability_ to make this distinction; just (unfortunately) it's somewhat
incompatible with the standard prelude, and thus common practice.

------
willvarfar
Makes me want to go zap a long-standing occasional test failure I've got.

It seems that sometimes a netty HashedWheelTimer executes a timeout
immediately after you add it, instead of after 2 seconds. And it's the
simplest code, and the 2 seconds we are trying to set are literals, so it
really surprises me. I find it hard to imagine a bug in netty, and it always
affects the same test, despite all our requests setting an aggressive 2 second
timeout...

------
sopooneo
Why do they call it a "memcached" server rather than a "memcache" server? I
realize that the daemon process for a server application is usually the normal
name followed by a 'd', but that is not usually how we refer to the server or
application. For instance, we say "mysql" server, not the "mysqld" server,
even though that is the name of the long running process.

~~~
emmett
Because that's how memcached brands itself.

[http://memcached.org/](http://memcached.org/) -- the URL has a d (unlike
mysql)

"What is Memcached?" -- it's how they talk about themselves.

It's even in the logo...

------
samwillis
I think this is a really good example of where languages should have built-in
support for units. I was just having a look around and found the units module
for Python
([https://pypi.python.org/pypi/units/](https://pypi.python.org/pypi/units/)).
I really like the example at the end where they have modified pypy for a
native unit syntax.

~~~
gizmo686
I think built-in support for units is a bit too specific. Support for cheap
type aliases would work, and be far more general. What I mean is something
like:

    type Inch = int
    type Centimeter = int

And then have the compiler enforce these as different types (as opposed to
aliases), so if you want to use an Inch where the function declares that it
expects a Centimeter, you would be forced to cast it (or have the compiler
auto-cast using your conversion function).

This doesn't get you support for multidimensional units (i.e., defining a
function that goes from an arbitrary unit u to u^2), which would require a
much more complicated addition to the type system.

~~~
dragonwriter
> I think built-in support for units is a bit too specific. Support for cheap
> type aliases would work, and be far more general

Go supports cheap type aliases of this kind: named types declared with the
same underlying type are distinct types that are not mutually assignable. In
fact, Duration is such a type, with int64 as its underlying type.
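
A small sketch of that behaviour using the Inch/Centimeter example from the parent comment. The underlying type is float64 rather than int so the conversion is meaningful, and `ToCentimeter` and `frameWidth` are hypothetical names.

```go
package main

import "fmt"

// Inch and Centimeter share an underlying type but are distinct types.
type Inch float64
type Centimeter float64

// ToCentimeter is the explicit conversion between the two units.
func (in Inch) ToCentimeter() Centimeter {
	return Centimeter(in * 2.54)
}

// frameWidth only accepts a Centimeter.
func frameWidth(w Centimeter) string {
	return fmt.Sprintf("%.2f cm", float64(w))
}

func main() {
	var w Inch = 10
	// frameWidth(w) // compile error: an Inch is not assignable to Centimeter
	fmt.Println(frameWidth(w.ToCentimeter())) // 25.40 cm
}
```

Distinct named types stop one unit from being passed where another is expected. They do not stop two values of the same type from being multiplied together, which is why a Duration times a Duration still compiles.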

