
Testing like the TSA - hawke
http://37signals.com/svn/posts/3159-testing-like-the-tsa
======
ejames
This is why I left Microsoft. Automated testing was a separate discipline -
meaning there were "dev devs" and "test devs". Automated tests were written
based on what the "test devs" had time for, not on the need or usefulness of
such tests for the actual code. I was hired as a "test dev" - I had no
industry experience at the time and figured I would give it an unprejudiced
try to see if I liked it.

I quickly realized that my job was futile - many of the "good" tests had
already been written, while in other places, "bad" tests were entrenched and
the "test dev" had the job of manning the scanner to check for nail clippers,
or upgrading the scanner to find as many nail clippers as possible.

Here's a useful rule on the subject that I picked up from an Artificial
Intelligence course back in the day: The value of a piece of information is
proportional to the chance that you will act on it times the benefit of acting
on it. We all realize there is no benefit in testing if you ignore failures
rather than acting to fix the bugs, but in much the same way that doing
nothing when tests fail has no benefit, doing nothing when tests pass also has
no benefit - so tests which always pass are just as useless as failing tests
you ignore, as are tests which only turn up corner-case bugs that you would
have been comfortable with shipping.

If you're doing the right amount of testing, there should be a good chance,
whenever you kick off a test run, that your actions for the next hour will
change depending on the results of the run. If you typically don't change your
actions based on the information from the tests, then the effort spent to
write tests gathering that information was wasted.
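
To put rough numbers on that rule - all of them made up, purely to show the shape of the trade-off - it's just an expected-value calculation (Ruby):

    # Value of a test ~ P(a failure you'd act on) x benefit of acting,
    # minus what the test costs. Every number here is hypothetical.
    p_actionable_failure = 0.10   # chance this test ever fails and you act
    benefit_of_acting    = 40.0   # hours saved by catching the bug early
    cost_of_test         = 2.0    # hours to write and maintain it

    value = p_actionable_failure * benefit_of_acting - cost_of_test
    puts value   # => 2.0, worth writing
    # A test that always passes has p ~ 0, so its value is just -cost.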

~~~
SoftwareMaven
I disagree that there is no benefit in passing tests that don't change your
behavior. Those tests are markers to prevent you from unknowingly doing
something that should have changed your behavior. That is where the nuance
enters: is this a marker I want to lay down or not? Some markers should be
there; others absolutely should not and just introduce noise.

~~~
ejames
I don't understand what you mean by "prevent you from unknowingly doing
something that should have changed your behavior". If you do something without
knowing it, how could it change your behavior? If it's a case where you should
have changed your behavior, why would you prevent it?

~~~
xianshou
I believe the above comment refers to regression testing. For instance, if I
write a test for an invariant that is fairly unlikely to change, then the
chance that my behavior will change in the next hour based on the test run is
small. However, if and when the invariant is mistakenly changed, even though
negative side effects might not be immediately visible, it could be immensely
valuable to me to see the flaw and restore that invariant.

~~~
ejames
Yes - but the test would fail when the invariant is mistakenly changed. On the
test run after the invariant was changed, you would get new information (the
test does not always pass) and change your behavior (revert the commit which
altered the invariant).

That is the point of the "changing behavior" rule - you do not gather the
benefit of running a test until it has failed at least once, and the benefit
gathered is proportionate to the benefit of the action you take upon seeing
the failure. The tricky part of the rule is that you must predict your actions
in the future, since a test that might have a very important failure later
could pass all the time right now. Knowing your own weaknesses and strengths
is important, as is knowing the risks of your project.

There are possible design benefits to writing tests, since you must write code
that is testable, and testable code tends to also be modular. However, once
you have written testable code, you still gain those design benefits even if
you never run your test suite, or even delete your tests entirely!

~~~
SoftwareMaven
Your comment reads like you can know when a test will fail in the future (how
else can you know the difference between a test that "always passes" and a
test that will fail in the future to identify a regression?). You may have a
test that passes for ten years. When do you know it's OK to nuke the test?

Based on your follow-up, it is clear that my reading was not what you
intended.

~~~
ejames
You can't know, but you can guess, based on past experience or logic. The
simplest way to estimate the future is to guess that it will be similar to the
past.

For example, if you personally tend to write off-by-one errors a lot, it's a
good idea to write tests which check that. On the other hand, if you almost
never write off-by-one errors, you can skip those tests. If a test is cheap to
write, easy to investigate, and covers a piece of code that would cause
catastrophic problems if it failed, it's worthwhile to write the test even if
you can barely imagine a possible situation where it would fail - the degree
of the cost matters as much as the degree of the benefit.

You don't "know" when it's OK to nuke a test just as you don't really "know"
when it's safe to launch a product - you decide what you're doing based on
experience, knowledge, and logic. The important step many don't take is
developing the ability to distinguish between good tests and bad tests, rather
than simply having an opinion on testing in general.

~~~
xianshou
Re: "The simplest way to estimate the future is to guess that it will be
similar to the past."

When we say that the future will be similar to the past, for code, we really
mean that the probability of certain events occurring in the future will be
similar to their prior probability of occurring in the past.

In my hypothetical example of testing an invariant that is unlikely to fail
but damaging if it does, it might be valuable to keep that test around for
five years even if it never fails. Imagine that the expected frequency of
failure was initially less than once per ten years, and that the test hasn't failed
after five years. If the expected frequency of failure, cost of failure, and
gain from fixing a failure remain the same, we should keep the test even if
it's never failed: the expected benefit is constant.

Not to say that we should test for every possible bug, but if something is
important enough in the first place to test for it, and that doesn't change
(as calculated by expected benefit minus expected cost of maintenance), we
should keep the test whether or not it changes our behavior.

Thus, if we could estimate probabilities correctly, we really would know when
it's OK to nuke a test.
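
As a sketch (with hypothetical numbers), the keep-or-nuke decision is just this comparison - and five uneventful years don't change any of its inputs:

    # Keep a test while expected benefit exceeds expected maintenance cost.
    p_failure_per_year = 0.1     # "less than once per ten years"
    cost_of_failure    = 500.0   # hours lost if the invariant breaks unseen
    maintenance_cost   = 1.0     # hours per year to keep the test alive

    expected_benefit = p_failure_per_year * cost_of_failure   # => 50.0
    keep_the_test    = expected_benefit > maintenance_cost    # => true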

------
jashkenas
A thought that folks reading this post might have an opinion on:

"Libraries should be mostly unit tested. Applications should be mostly (and
lightly) integration tested. Naturally, some parts of a complex app will
behave like a library..."

Agree or disagree?

~~~
raganwald
Strongly agree, but I also think the interesting thing is why this might be
so and what it tells us about application architecture.

The basic thing about tests is that once you have them passing, they represent
statements about constraints on the program. In other words, they express your
opinion of things that should not change. Unit tests are a bet that certain
aspects of the implementation will not change. Integration tests are a bet
that certain aspects of the externally visible behaviour will not change.

Libraries tend to be smaller, with a well-defined responsibility.
Applications tend to be bigger and have many responsibilities. In general, I
think it’s true that the requirements for libraries change less often than the
requirements for applications. I think this leads us to expect that
applications may need to be rewired “under the hood” and have their
implementations changed as responsibilities are added, removed, or changed.

This, I believe, leads us to want to unit test applications less, because a
unit test expresses implementation semantics, and we expect application
implementations to change. Now what about integration tests? Well, if we're
unit testing less in the application, we need to make up for it by integration
testing more, otherwise where do we get our confidence?

Now if we throw the words “library” and “application” away, this suggests to
me that those parts of the code that are small and tight and with a single,
clear responsibility should be unit tested, while those parts that involve a
lot of what the AOP people call ‘scattering and tangling,’ should be
integration tested.
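
As a hypothetical Rails-flavoured illustration of the two kinds of bet (the Tax class, route, and totals are all invented):

    require "minitest/autorun"

    # The unit test bets on an implementation detail (a Tax class with a
    # #rate method); the integration test bets only on externally visible
    # behaviour. Rewire the internals and only the first one breaks.
    class TaxTest < Minitest::Test
      def test_uk_tax_rate                  # unit: implementation bet
        assert_equal 0.2, Tax.new("UK").rate
      end
    end

    class CheckoutTest < ActionDispatch::IntegrationTest
      def test_receipt_shows_taxed_total    # integration: behaviour bet
        post "/orders", item: "book", region: "UK"
        assert_includes response.body, "Total: £12.00"
      end
    end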

Thoughts?

~~~
tel
Unit testing didn't fully make sense to me until I played around with
quickcheck (and eventually theorem proving in Coq). Unit tests vanish nicely
to theorems and (empirical) proofs if your code expresses a succinct API. This
is one end of the testing continuum.

I use this sort of stuff extensively when doing mathematical computing and
statistics because there's usually a clear mathematical boundary. Once you're
inside it, it's relatively easy to write down global properties (theorems) of
your code's API.
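
For flavour, here is a hand-rolled quickcheck-style property check in Ruby, with no framework - real tools generate, shrink, and report the cases for you:

    # Property: for ANY array, sorting is idempotent and preserves length.
    # Random sampling stands in for quickcheck's generators here.
    100.times do
      xs = Array.new(rand(0..50)) { rand(-1000..1000) }
      raise "sort not idempotent: #{xs.inspect}" unless xs.sort.sort == xs.sort
      raise "sort changed length: #{xs.inspect}" unless xs.sort.length == xs.length
    end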

The moment you cross that boundary your testing apparatuses have to get more
complex and your tested properties less well-defined. Unit tests are hazier
than quickcheck properties and integration tests hazier still.

This continuum seems to be precisely the same as the code reuse continuum.
Highly abstracted, testable code with a shapely API is a highly reusable
library whether you like it or not. Maybe it's being called by other code,
maybe it's being called by your UI, maybe it's being called by the user
themselves.

~~~
Dn_Ab
I view unit tests as a kind of proof by counter-example. You have a logical
structure your program embodies. This structure is very hard to specify
mathematically and prove deductively so you come up with key statements that
must at least evaluate to true for this structure/theory. The tests are a
bunch of attempted counter-examples, each of which should come out false (the
test passes).

If a random testing framework is available in your language, it really should
be integrated, as such tools are able to come up with some pathological
examples.

~~~
eru
For Haskell you can do even better than random testing: There's Lazy
SmallCheck for exhaustive testing. (For some values of `better' and
`exhaustive'.)

------
damoncali
_Don’t use Cucumber..._

Thank God someone with a bullhorn finally said this. I was beginning to think
I was alone in my hatred of Cucumber. (And my love of Test::Unit/Minitest.)

~~~
typicalrunt
I'm in the same boat as you, but I don't tend to voice my opinions on
Test::Unit because there is such a strong opinion _for_ RSpec and Cucumber.
IMHO, I tend to like my tools to be tried and true, and not do any fancy
magic.

~~~
equalarrow
Learned my lesson the hard way(s) with RSpec and Cucumber a while ago - total
waste of my time.

There's a group of devs that have fooled themselves into thinking anyone
outside their group understands how they are testing. I've seen this
firsthand. DHH has always been right, imo, in this regard; test what you think
is important, use simple tools.

Test::Unit still serves me well, and it perfectly fits my need to do more
with less.

------
pbiggar
I largely agree - there is a certain dogma that goes into testing that this
article dispels nicely. Of course, it comes with its own dogma, though I
guess that's a bit tongue in cheek considering the author says: "let me
firebomb the debate with the following list of nuance-less opinions".

So let me add some nuances:

1) DO aim high though, just recognize that the work in getting there is
probably better spent elsewhere in your app.

3) BUT ignore this advice if you don't write tests yet. When you learn to
test, or start working on a new feature that you may not know how to test, it
will take you as long to test it as to code it. From there on though, the
cost of testing is pretty cheap, so the 1/2 or 1/3 ratios start to make sense.

4) Do test that you are correctly using features and libraries (yes, standard
activerecord stuff is probably going overboard).

5) But don't forget that many bugs occur at the boundaries of functional units.

6) Do what works for you, and what makes sense for your code base and business
priorities. I don't love cucumber myself, but when others swear by it I can
see why they like it.

Kent Beck's quote at the end is lovely. The first and only book on TDD I read
was Beck's, and it's good to know that he's not actually as dogmatic as the
book makes you think.

------
astral303
Brilliant article. Testing for testing's sake is wrong. Testing for 100%
coverage's sake is wrong. Write just enough tests at the level where they catch
most of your regressions. Drill down into unit tests for complex logic,
because you can test that more extensively and much faster than an integration
test. Then leave a case or two for an integration test to make sure things are
hooked up right.

Don't be afraid to unit test little complex things here and there. Are you
writing a function to parse a string in a certain way? Pick that function,
elevate its visibility if need be, write a simple unit test to make sure you
didn't make a stupid off-by-one mistake. Does the rest of the class otherwise
not lend itself to unit testing? That's OK, move on.
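
For instance, a throwaway parsing helper and the thirty-second Minitest case that pins down its boundaries (both the helper and the test are hypothetical):

    require "minitest/autorun"

    # Hypothetical helper: grab the text before the first comma.
    def first_csv_field(line)
      line[0...line.index(",")]   # classic spot for an off-by-one slip
    end

    class TestFirstCsvField < Minitest::Test
      def test_returns_text_before_first_comma
        assert_equal "abc", first_csv_field("abc,def")
      end

      def test_empty_first_field
        assert_equal "", first_csv_field(",def")
      end
    end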

We've learned that each line of code is a liability, even if it's a
configuration file, which is why we have come to appreciate things like DRY,
convention over configuration, less verbose languages, less verbose APIs.
Likewise, each line of test code is a liability, so each line better justify
itself.

~~~
stcredzero
_Write just enough tests at the level where it catches most of your
regressions...make sure you didn't make a stupid off-by-one mistake._

One thing that I've seen in inexperienced coders (including myself in the
past) is that they tend to think of every bug as a fluke one-off mistake in an
otherwise mostly flawless and awesome record. New coders tend to want to just
fix a bug, then pretend it didn't happen.

This is exactly the wrong attitude to take. As a discipline, we programmers
should be studying our mistakes and taking steps to prevent them in the
future. As a craftsperson striving to improve, each of us should be studying
our own mistakes and taking steps to prevent them in the future.

~~~
astral303
I think you might be misreading what I wrote. While your points are correct, I
was specifically referring to "off-by-one mistake", which is a common "silly"
error (since many indices are zero-based, it's often easy to request one too
many elements, or chop off the first item).

Also, the way you quoted me above, "make sure you didn't make a stupid off-by-
one mistake" looks like it's talking about writing just enough tests. However,
in context, I'm actually referring to writing unit tests for small items where
you might make a stupid off-by-one mistake.

So I was never referring to "one-off mistakes", as in mistakes that are
flukes.

Your points are all good otherwise! Never rest on your laurels and always
think about what you can do to catch your mistakes.

~~~
stcredzero
I did misread what you wrote, but I thought the point was a good one to make,
so I posted it anyhow. Not disagreeing, just adding.

------
skrebbel
Nice read.

I'm no Rails dev, so I'm curious about this one point from DHH:

> _6\. Don't use Cucumber unless you live in the magic kingdom of non-
> programmers-writing-tests (and send me a bottle of fairy dust if you're
> there!)_

I mostly do C#, and teams I've recently been on have found SpecFlow tests to
be an _excellent_ time saver in communicating requirements and acceptance test
criteria with customers. Has Cucumber not been designed for the same purpose?

I might guess that David included the point because a product business such as
37signals has no non-programming stakeholders to communicate about
requirements and acceptance criteria with.

Using BDD for having non-programmers _write_ tests sounds far-fetched to me
indeed. It's excellent to have them able to _read and understand_ the tests,
though. Any opinions? Is BDD a dead horse, or is DHH a little narrow-minded
here?

~~~
damoncali
My own observation is that some ruby/rails developers get so enthralled with
testing that it becomes an obsession that eclipses the product that they are
building. The result (strong opinion coming...) is libraries like rspec and
cucumber. They're complex and burdensome, and tend to attempt to mimic
English, but often do so poorly and are totally unsuitable for non-coders. You
spend a lot of time learning "the way" to do things and wind up with code that
is unintelligible to less experienced developers.

I use test::unit and minitest and it gets the job done without having to keep
up with the latest trends. It's simple and can be written in a way that
correlates well to actual English requirements. It takes all of an hour to
digest minitest from zero.

That said, Cucumber and rspec are very popular, so I may be the weird one.

~~~
100k
For a long time, I had a deep hatred for RSpec because we used it on a project
before the API stabilized (and before it was easy to maintain an environment
where all developers had the same gems).

We got stuck on some particular revision in the RSpec Subversion repository.
The choice was re-write all the specs, or stick with that ancient version. We
re-wrote all the specs -- to test/unit.

Several years later, and I have never picked up RSpec for my own use. However,
I am working on another project that chose RSpec and it is working out pretty
well. I have turned on render_views so I don't have to test those separately
and am only using mocking for external services.

Cucumber, on the other hand, I do not understand at all. Why write tests in
English when you have Ruby?

~~~
homosaur
There's a few solid reasons I've heard for using Cucumber, although in my life
I've not found a need for it yet.

1) Although it's a Ruby tool, it works with a ton of languages. Someone can
write code based off Cuke tests in Ruby or .NET or whatever without much
trouble.

2) It makes web workflow testing cake.

3) It keeps people strongly out of the "implementation" zone when they are
thinking about how a program should be properly executed.

4) It works with many spoken languages, so if you're collaborating with an
international team it could be useful there.

5) It has a whole bunch of report formats built in.

If you don't find any of those features incredibly useful, I'm not sure you're
going to ever see a need for it. I've played with it but for me it seems more
hassle than anything. I do LIKE it but that's not enough to justify the time
spent messing with it.

------
adrianhoward
_You’re probably doing it wrong if testing is taking more than 1/3 of your
time. You’re definitely doing it wrong if it’s taking up more than half._

"No generalization is wholly true - not even this one" said Oliver Wendell
Holmes.

What proportion of your time do you spend writing tests for a one-off bash
script to fix some filenames? What proportion of your time do you spend if
you're writing the fly-by-wire code in a 777? You should spend "enough" time
writing tests - where "enough" is extremely context dependent.

I also worry about this sort of advice because the reason many spend too long
testing is that they're _really bad at testing_. I fear this group will use the
quote as an excuse to do less testing, rather than get better at testing.

Also, to be honest, I'd be hard put to tell anybody how much time I spend
writing tests vs. writing code. I've been practicing TDD for about ten years
now - and I just don't think about "testing" vs "coding". I'm developing -
which involves writing tests and writing code. Trying to separate them out and
time them makes about as much sense to me as worrying about whether I type
more with my left or right hand.

~~~
matwood
_Also, to be honest, I'd be hard put to tell anybody how much time I spend
writing tests vs. writing code._

That's a good point. Test writing versus code writing is not black and white.
Spending time writing tests is also time spent thinking about what's going to
be coded. Writing tests should make the coding process go quicker, so there
is a lot of overlap between test writing and coding.

------
typicalrunt
I completely agree with David's assertion that there is too little attention
paid to how to test properly or what over-testing looks like.

Here's a question to the HN community... For a library you write yourself to,
say, access a Web service, how do you go about testing it? (For example, if
you want to write tests against an Akismet gem)

I tend to write both unit tests and integration tests for it. My unit tests
mock out the HTTP calls made by the library and only test that the library is
able to handle both good and bad inputs.
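
Concretely, the unit-test half looks something like this - AkismetChecker and its API are hypothetical; the point is injecting a fake in place of the real HTTP layer:

    require "minitest/autorun"

    # A canned fake stands in for the real HTTP client, so the unit test
    # never touches the network - it only tests the library's own handling.
    class FakeHttpClient
      def initialize(canned_body)
        @canned_body = canned_body
      end

      def post(_path, _params)
        @canned_body
      end
    end

    class AkismetCheckerTest < Minitest::Test
      def test_spam_response_is_parsed
        checker = AkismetChecker.new(http: FakeHttpClient.new("true"))
        assert checker.spam?("Buy cheap pills")
      end

      def test_garbage_response_raises
        checker = AkismetChecker.new(http: FakeHttpClient.new("wat?"))
        assert_raises(AkismetChecker::BadResponse) { checker.spam?("hi") }
      end
    end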

My integration tests allow the library to speak directly to the web service in
question, to test that the correct connection is being made and the service is
providing the correct data back to the library.

Is this overkill?

~~~
edwinnathaniel
I'd have to lean toward yes and here's why:

1) Config, connections, or whatnot are one-time only. You're not going to
screw up many times.

2) Not all WebServices have their own test-server so count yourself lucky if
the ones you use have them :)

3) You're testing their code instead of yours if you have to make sure correct
data etc comes back.

I usually only write unit tests when the service came back with bad data that
we did not anticipate (an edge-case situation).

------
viraptor
I don't get the "don't aim for 100%" point. What are you aiming for then? 50%?
What happens when you reach 50% and remove some code - do you remove enough
tests to match the ratio?

We may just phrase the same idea in 2 different ways, but I'd go with "don't
force yourself to do 100% coverage if it's not that relevant" / "don't add a
test for a simple getter if you have more important things to do". If you can
do 100% and have time for it - you should definitely do that, for example in
case someone later rewrites your getter and it's no longer that simple.

~~~
adrianhoward
_I don't get the "don't aim for 100%" point_

For me the point of that one is that test coverage isn't the goal - good code
is.

I've seen folk disappearing down a rabbit hole focusing on getting that last
2% of branch coverage using some baroque mock object monkey patched into the
system. Their focus was on test coverage. What people who focus on test
coverage get is an evil complex test suite that's very brittle in the face of
change.

Other, smarter, folk go "damn - I can't test that easily - this code sucks",
factor out unrelated functionality into appropriate classes, add some code
that makes some duplication between branches obvious, factor out the
duplication and end up with something that's better code, with simpler tests -
and better test coverage too. Their focus is on the code - not the test
coverage.

[Edited for... erm... English]

------
aslakhellesoy
This reminds me of the old joke about the man who went to his doctor and said:
"Sir, those deposiwhatever pills you gave me were no good. I might just as
well have stuck them up my ass".

Using a tool (or a medicine) to do something other than what it was designed
for rarely has a positive effect. Cucumber was not designed to be a testing
tool. Most people don't realise this, and end up with a verbose,
unmaintainable mess. They blame the mess on the tool without realising that
they used the tool wrong.

I don't use Cucumber to test my code. I use it to discover what code I need to
write. This is a subtle, but important difference. This approach can work
whether there are non-technical people involved or not.

I typically start out with a single Cucumber scenario that describes from
10,000 feet how I want a certain feature to work - without getting bogged down
in details.

This allows me to reason and think about the domain in a way that helps me
write the simplest possible code. What usually happens (to me at least) is
that I end up with a simple design that reflects the domain.

Last week I wrote a small application like this. It is 1000 lines of Java
code. I have 6 Cucumber scenarios - a total of 100 lines of Cucumber "code"
(Gherkin) and about 200 lines of Step Definitions code (the glue code that
sits between the Cucumber scenarios and the app).
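
To make the altitude concrete, here is what a scenario at that level looks like, plus one step definition - a made-up example in Cucumber's Ruby flavour, not from the app above:

    # checkout.feature (Gherkin) - hypothetical example
    Feature: Checkout
      Scenario: Customer pays for the items in the cart
        Given a cart containing 2 books
        When the customer checks out
        Then a receipt is emailed to the customer

    # checkout_steps.rb - the glue code
    Given(/^a cart containing (\d+) books$/) do |count|
      @cart = Cart.new(books: count.to_i)
    end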

I could have used JUnit instead - or I could have just written the code
without any tests at all. For me, this would have made it harder to start
coding. I would have started off with a much more muddy picture about how the
app needs to behave and how it should be designed internally. I would have
spent more time experimenting.

If you use Cucumber as a starting point to discover the code you need to write
- and keep your scenarios few and high level - then you're more likely to reap
benefits instead of pain.

And never fall for the temptation to use Cucumber to test details. That's what
unit testing tools are for.

------
gfodor
Hey, while we're slaughtering sacred cows, let's kill mocking, endotesting,
expectation based testing, and the whole nine yards. It's a horrible practice
that causes you to write too many tests, too many assertions, and results in
tests that stay green even if you _delete entire files from your codebase_.

~~~
aneth
How would you test, say, payment processing, in any sane way, without mocking?

How do you test failure-handling code when the failure cannot be reproduced
in a known way by the current code base?

I agree these tools are often overused; however, they do have their roles.

~~~
gfodor
Yeah I'm not referring to mocking external services, which is both smart and
useful. I'm referring to the practice of endotesting whereby you mock your own
objects and interfaces as a means to test interactions between them.
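
A hypothetical example of that failure mode (OrderProcessor and Gateway are invented names; Minitest::Mock is real): this test stays green even if the real Gateway class is deleted, because only the mock is ever exercised:

    require "minitest/autorun"
    require "minitest/mock"

    # Endotest: we mock our own collaborator and assert the interaction.
    # Delete gateway.rb entirely and this still passes.
    class TestOrderProcessor < Minitest::Test
      def test_charges_the_gateway
        gateway = Minitest::Mock.new
        gateway.expect(:charge, true, [100])

        OrderProcessor.new(gateway).process(100)

        gateway.verify   # checks the conversation, not any real behaviour
      end
    end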

------
zmoazeni
@dhh was talking about this on Twitter before he made this post. And I think
@dchelimsky has a very fair point that @dhh ignores: testing is all about
effort to eliminate some amount of risk.

 _Test x but don't test y can't be universal. Risks are (often radically)
different in every app/team._

(from <https://twitter.com/#!/dchelimsky/status/190096365794230274> )

@dhh and 37Signals have a different perspective on acceptable risk for their
own product (which they maintain every day) vs developers writing software for
someone else/handing it off to someone else. 1 dev to 1 project vs 1 dev to x
projects. As a consultant, my acceptable risk level is very different from
that of an entrepreneur trying to push out an MVP, and I think testing will
reflect that.

~~~
bryanlarsen
I think dhh tried to address that with his comment _yes, yes, if you were
working on an airport control system for launching rockets to Mars and the
rockets would hit the White House if they weren’t scheduled with a name, you
can test it—but you aren’t, so forget it_

~~~
zmoazeni
That's a hyperbolic comparison. Yes, focusing on outliers makes all the other
data points look the same. My point is that in the context of each team/app,
those differences in acceptable risk matter.

------
adrianhoward
_Code-to-test ratios above 1:2 is a smell, above 1:3 is a stink_

I think this depends on the domain. Some code ends up with a metric shed load
of important edge cases because it's modelling something with a metric shed
load of important edge cases - not just because your code sucks.

For example one project I worked on involved a lot of code to manage building
estimates. It involved a _stack_ of special cases related to building codes,
different environments, the heuristics that the human estimators used, etc.

There wasn't any sane way to remove the special cases - the domain caused
them. There wasn't a way to sensibly avoid writing tests - since we didn't
really want estimates for the amount of drywall in a multi-million pound
skyscraper to be wrong :-)

------
samd
Testing isn't just about preventing bugs, it also encourages better design,
and good design makes extending and enhancing your application much easier and
faster.

...in general, of course, there's always exceptions, don't be dogmatic.

~~~
romaniv
I think this aspect of unit testing is largely overrated. True, it does stop
people from writing something ridiculous, like a 1000-line method that handles
a dozen different things. But it also encourages developers to go overboard in
the other direction: creating endless, needless layers of abstraction,
chopping logic into bits that are too small to represent anything in reality
and doing dependency injection when it's not needed. It's important to keep in
mind that ease of unit testing doesn't automatically ensure ease of use of
your library in real code.

In short, it's better to think of unit testing as a bug-prevention mechanism.
Thinking of it as an ultimate design mechanism doesn't end up well.

------
hkarthik
Discussions like this arise mostly because developers today think less about
whether a feature should be built and more about how it should be architected
and tested.

For decades, we've been conditioned to think that by the time a feature is
written down on paper, someone else has crossed all the Ts and dotted all the
Is to determine that the feature has value and should be built.

It's far easier to focus on low value testing issues than attempt the harder
work of convincing the customer/product owner that certain features should not
be built.

------
sudonim
If you're looking to manage code smell, check out Code Climate -
<http://codeclimate.com>. We use it as an awesome gut check on our code
coverage as we write more code. It helps you see at a macro level: "Are you
writing better or worse code over time?". Pretty cool stuff.

------
tnorthcutt
_Code-to-test ratios above 1:2 is a smell, above 1:3 is a stink._

I think this is stated backwards. A code-to-test ratio of 1:3 is _lower_ than
1:2, and based on the wording (smell, stink), it sounds like David is saying
it's _higher_.

~~~
awj
It is. Most of the time this is expressed as test-to-code, and I suspect it
was originally written that way before being reworded.

------
100k
I am glad DHH has put up this piece. This is one of my favorite topics to be
contrarian about. I am a strong believer in the importance of developer
testing to ensure code quality, but some people take it way too far. I gave a
talk about this at RubyFringe:
<http://railspikes.com/2008/7/11/testing-is-overrated>

I also am utterly opposed to the Cucumber-style programming-in-English-via-
regexp testing approach. Unless someone who doesn't know how to code is
writing those step files, why subject yourself to that?!

------
orblivion
My first unit testing experience (Django) wasn't 1:2 or 1:3; it was more like
2:1. As for time spent, I don't even know if I could separate the two - I
spent so much time in tests and production code at the same time. I actually
want to scrap that code and restart, because it's not worth changing all those
damn tests every time something else changes. I tend to go to extremes; I'm
actually thinking of switching to Yesod next to see if I can mitigate a lot of
the reasons for testing.

------
lucian1900
I disagree about coverage.

First, I've found tests to be an excellent way to debug, much better than
trying to manually get the app into the state I want and then stepping into
the debugger.

Second, having good coverage lets you refactor with a lot more confidence. I
often open some file to fix a bug/add a feature and notice some piece of code
that is horrible. I can comfortably fix that code too and if the tests pass
(including the one that failed before fixing the initial bug/adding the
feature) I'm done.

------
rondon1
There are too many managers who just want metrics. For example: Bill created
100 tests this month; Sam has 95% automated test coverage.

These metrics don't mean anything.

------
codereview
Great points. What not to test is just as important as what to test. If you're
interested in learning how to start TDD, Typemock is hosting a free webinar
next week introducing TDD: <http://j.mp/IVmQNi>

------
joelhooks
I've always enjoyed Testivus on test coverage metrics[1]

[1] <http://googletesting.blogspot.com/2010/07/code-coverage-goal-80-and-no-less.html>

------
wtracy
This seems to be almost the opposite attitude of the people behind Jester:
<http://jester.sourceforge.net/>

------
tptacek
STOP. BREATHE. THINK.

Here's the Stack Overflow post that I think triggered this post:

<http://stackoverflow.com/questions/153234/how-deep-are-your-unit-tests/153565#153565>

Yes, he said "fuck". Let's get past that, because it's a super boring
discussion to have on HN.

Also: pointing out specific 37signals or Rails bugs as evidence that this view
is flawed? Also a super super boring way to argue. Argue with the idea, not
the proponent of the idea.

Ok, feel better? _Now_ comment on this story.

------
namidark
What a bunch of hogwash; 1000 lines of code to test validates_presence_of?
What testing framework is he using? x86 assembler?

The only part I agree about is the view testing and, partially, Cucumber.

Then again, I don't think 37s has a large amount of business logic to test
(compared to some of the other commenters in here); their testing is mostly
front end, and testing front-end ajax-y stuff with Cucumber does indeed suck.

------
maeon3
It is not fun to pull out the power tools on a huge piece of software with 50
thousand unit tests and see that 600 of them fail because you changed an
important part of the code.

It means the unit tests will have to be deleted and/or updated. If each takes
10 minutes, that is 6000 minutes.

On the other hand, seeing one unit test fail in another module you didn't
expect to fail can save you a week of work, so it's all in how brittle and
organized the tests are.

~~~
astral303
In my experience, it means that you have to treat your tests with nearly as
much software engineering respect and prowess as your real code. You have to
factor things out and stay DRY (particularly tough sometimes in testing) as
much as possible.

If there is one "concept" that you're asserting in your test suite, you want
that concept to only be repeated once or as close to once as you can get.
Oftentimes, people copy-paste sets of assertions that mix concepts.
Then, when a requirement changes, hundreds of tests end up having to be
updated.

This is why many large companies that contract out test automation
development, or train manual QA people with enough programming skills to write
tests, end up with brittle test suites. One look at those test suites by an
experienced developer and it's no wonder: there's conceptual/semantic
duplication everywhere.

------
aneth
The phrase “Don’t test standard Active Record association” jumped out at me,
since I’ve discovered a major bug in the most basic function of Active Record
associations in 3.0.x.

<https://github.com/rails/rails/issues/5744>

That said, I generally agree with DHH that many developers write too many of
the wrong kinds of tests and this can make products brittle.

I share his disdain for Cucumber and rspec. I prefer clarity over syntactic
sugar and non-programmer readability.

~~~
edwinnathaniel
The point is not to test the library you're using. Test your own code and
assume the library has been taken care of (even if there are bugs there).

------
nirvana
One line in this article may have given me an epiphany:

> _If I don’t typically make a kind of mistake (like setting the wrong
> variables in a constructor), I don’t test for it._

My primary objection to TDD is that it doesn't seem to work, because when I've
tried it, the tests caught no bugs at all. I believe that tests can be
important for APIs to prevent regressions when you have to work with external
components, but it's frustrating to put time into tests and then never have the
tests fail.

It's not that I'm a perfect programmer; it's the kind of bugs I make. The kind
of bugs I make are caught by the compiler. This may be because, over the years
of my career (many of which occurred long before the idea of "test first" was
widely heard) I've trained myself in a style of programming where I can trust
the compiler to catch my mistakes (most of which are typos, frankly.)

I don't know if others can do this, but for me, it came about by doing things
like:

Old way:

    if (variable == 1) then whatever

New way:

    if (1 == variable) then whatever

Every time I mistype that as "(1 = variable)", the compiler catches it,
because I can't redefine 1.

------
excuse-me
Testing, or at least coding, the way the TSA thinks is remarkably common in
industry.

TSA logic - somebody tried to get on a plane with a bomb in their shoe, so
everyone everywhere has to remove their shoes.

PHB logic - somebody somewhere had a bug because of a misuse of inheritance,
so you aren't allowed to use any inheritance in your C++ code anywhere.

