
An External Replication on the Effects of Test-driven Development [pdf] - joatmon-snoo
http://people.brunel.ac.uk/~csstmms/FucciEtAl_ESEM2016.pdf
======
jdlshore
This study, like most software development studies I've seen, is seriously
flawed. It doesn't justify the sensational title here on HN.

* The sample size was tiny. (20 students)

* The participants were selected by convenience. (They were students in the researcher's class.)

* The majority of participants had no professional experience. (Six students had prior professional experience. Only three had more than two years' experience.)

* The programming problems were trivial. (The Bowling Kata and an 'equivalent complexity' Mars Rover API problem.)

Maybe, _maybe_ you could use this to draw conclusions about how TDD affects
novices working on simple algorithmic problems. Given the tiny sample size and
sampling by convenience, I'm not sure you can even draw that much of a
conclusion.

But it won't tell you anything about whether or not TDD impacts development
time or code quality in real-world development.

~~~
hackuser
It may be unfair to say this in response to the parent comment, but the great
majority of HN discussions start with a comment like this one: It's seriously
flawed, etc. Occasionally it's true, but the noise drowns out the signal.

In a graduate-level engineering class, the students were making similar
statements about all the studies we read. One day the professor said: It's
easy to find flaws in someone else's work; humans are flawed. The real
challenge and benefit is to find the value in their work - find what has
lasting value, learn from it, and carry it forward.

~~~
ericdykstra
The conversation around TDD is tired, and the conclusion is always the same:
"it depends." It depends on the person writing the code, the type of problem
they're tackling, the language they're using, the needs of the business, etc.

This study doesn't bring anything new to the table except: "in this
manufactured environment we found a single point of data that equates to
noise."

I'm guessing the only reason the story was upvoted at all in the first place
is because some people who agree with the title clicked the up arrow without
looking at the article.

~~~
tbrownaw
_the conclusion is always the same: "it depends."_

Er, no. The studies I've read all end up showing that principled testing
helps, but test-first and TDD (strict red/green cycle, code only enough to
pass the new test, etc) provide no additional benefit over anything else that
gets the tests written.
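(For concreteness, the strict cycle being described looks something like this hypothetical Python sketch — the function and tests are made-up examples, not anything from the paper:)

```python
# Step 1 (red): write a failing test before any implementation exists.
def test_fizzbuzz_returns_number_as_string():
    assert fizzbuzz(1) == "1"

# Step 2 (green): write only enough code to make that one test pass.
def fizzbuzz(n):
    return str(n)

# Step 3 (red again): the next test forces the implementation to grow.
def test_fizzbuzz_replaces_multiples_of_three():
    assert fizzbuzz(3) == "Fizz"

# Step 4 (green): generalize just enough; the redefinition below stands in
# for editing the function in place, then you refactor and repeat.
def fizzbuzz(n):
    return "Fizz" if n % 3 == 0 else str(n)
```

The studies' point is that this strict ordering, versus writing the same tests after the code, showed no additional benefit.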

The "it depends" always comes from the echo chamber trying to justify their
desire to believe that TDD isn't completely useless. It actually feels quite
similar to the claims I've seen from practitioners that reiki, faith healing,
etc. aren't complete bunk.

~~~
ericdykstra
I really don't believe any study or meta-study could come close to being able
to suss out the nuance of when TDD may provide an advantage and when it
doesn't.

I'd rather just trust programmers to consider what approach works best for
their problem and mindset and go from there.

I personally don't TDD most things, but it's a tool I have available, and I
bring it out when the situation calls for it.

~~~
raverbashing
Well, with TDD you waste time writing red tests first, then tests that
"just pass", and finally making the thing work as it is supposed to.

I find it surprising that's not slower than Test-Last (though if you really
leave testing for last, you'll need some time to fit your functions to your
tests)

------
geerlingguy
Could it be that TDD vs. tests-after-code is a highly personal thing? I
personally find it easier to write good tests after I've coded something
functional. Beforehand, I have one or two fuzzy ideas of what I want to
accomplish, but I can't list out the concrete, real-world test scenarios until
after I've coded something, poked and prodded it, etc.

But I know some people are wired differently; they'll think a lot more about
scenarios first, then code after they have everything accounted for. For them,
TDD as a philosophy seems more fitting.

I think the chasm exists between _untested_ code and code that has tests. I've
never understood the seemingly-religious zealotry behind TDD as an XP
practice. Just like pair programming... if it works for you and your coding
style, awesome. But don't force it down my throat or act like it's the One
True Path to clean code.

~~~
paulddraper
TDD vs. test-after-code is a small distinction.

80% of software development is designing correct abstractions/interfaces/APIs.
If you have the correct abstractions, everything else is easy by comparison.
And both tests and code are fundamentally founded on these early design
decisions.

So whether I do TDD or tests-after-code, I'm confronted with the 80% first:
designing the interfaces (either in writing or mentally).

Naturally, I never get this part right at first, and so I wind up refactoring
a lot as I go along. ("Build one to throw away", I believe this has been
called.)

Now in my experience, TDD requires me to refactor more during this process
than tests-after-code. But suit yourself. Changing the order within the last
20% won't be earth-shattering one way or the other.

~~~
embwbam
This is why designing the interface in a language with a powerful type system
provides the same oft-quoted benefit as TDD: it helps you think about the
interface before you move to implementation.

In my experience, using a type system to do this requires much less effort and
refactoring.
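As a hypothetical sketch of what "design the interface as types first" can look like (the names here are made up), in Python with `typing.Protocol`:

```python
from dataclasses import dataclass
from typing import Protocol

# The interface is designed first, as types: callers and implementers now
# have a contract to think about before any implementation exists.
@dataclass(frozen=True)
class Order:
    order_id: str
    total_cents: int

class PaymentGateway(Protocol):
    def charge(self, order: Order) -> bool:
        """Return True if the charge succeeded."""
        ...

# Only after the shape is settled do we write an implementation; structural
# typing means this class satisfies PaymentGateway without inheriting it.
class FakeGateway:
    def __init__(self) -> None:
        self.charged: list[str] = []

    def charge(self, order: Order) -> bool:
        self.charged.append(order.order_id)
        return True

def checkout(gateway: PaymentGateway, order: Order) -> str:
    return "paid" if gateway.charge(order) else "declined"
```

A type checker like mypy will then flag any implementation or caller whose signature drifts, which is the refactoring pressure the type system absorbs for you.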

~~~
paulddraper
I am a big fan of "testing via types".

Unit tests can demonstrate that my code is correct; they can say nothing
about how my code is used by others.

In contrast, a type system can extend those protections downstream.

(Yes, yes, unless you go way off the deep end of dependent types, you still
need tests. But a powerful type system can systemically prevent a huge number
of very common bugs.)

~~~
mercurial
My take on it is that it removes most of the boring bugs you would get in a
less strongly-typed language and leaves you open to the interesting bugs
(e.g., logic bugs).

As for dependent types, I'm not sure that even then you would be able to
ensure that, say, your cache is correctly invalidated.

------
tspike
My experience has been that TDD is worthwhile when working with notoriously
slippery whack-a-mole functions like handling time or money. The time saved by
catching regressions vastly outweighs the time taken to implement the tests.

In contrast, TDD has been a waste of time for me for UI-based work, as the
effort needed to properly expose the functionality under test is too great and
the requirements and design change too quickly to be worth it.

In the latter case, writing some deterministic UI tests against mock data
after the requirements and implementation have settled has proven much more
effective in preventing regressions.
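As a rough illustration of that last approach (names and data are hypothetical), a deterministic test renders against fixed mock data instead of a live backend, so the expected output never changes between runs:

```python
# Fixed mock data: the test's input never depends on a backend or the clock.
MOCK_USERS = [
    {"name": "Ada", "active": True},
    {"name": "Grace", "active": False},
]

def render_user_list(users):
    # Stand-in for a real view layer: produce the text the UI would show.
    return "\n".join(
        f"{u['name']} ({'active' if u['active'] else 'inactive'})"
        for u in users
    )

def test_renders_activity_labels():
    output = render_user_list(MOCK_USERS)
    assert "Ada (active)" in output
    assert "Grace (inactive)" in output
```

Because the input is frozen, a failure here means the rendering logic regressed, not that the test data moved.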

~~~
err4nt
Can you elaborate more? I'm a frontend guy and sometimes I write JS plugins
for the browser (that run after page load, and apply based on how things
render on the page, or based on user interaction with the page), and
non-frontend folks tell me I need tests for my code, or don't want to look at
it until I have tests. How would I build a test for this, other than a
functional test that runs in-browser where the test is whether it works or
not?

~~~
zachrose
Here are two strategies. First, you can go "outside" what you've built,
simulating a user's interaction and examining the resulting DOM. In general
this is a tool-centric approach.

Another approach is to partition your code into what Gary Bernhardt calls an
"imperative shell" of code that touches the outside world and a "functional
core" that does not. Then unit test the functional core, which shouldn't
require special testing libraries, and validate that the imperative shell
works by the occasional manual or "outside" test.
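A minimal sketch of that split (all names hypothetical): the core is a pure function you can unit test anywhere, and the shell is the thin layer that would touch the DOM:

```python
# Functional core: pure logic, no DOM or network access, trivially testable.
def next_page(current: int, total: int, direction: str) -> int:
    if direction == "next":
        return min(current + 1, total - 1)
    if direction == "prev":
        return max(current - 1, 0)
    return current

# Imperative shell: the thin layer that touches the outside world (on_render
# stands in for a function that updates the DOM). It is verified by the
# occasional manual or "outside" browser test rather than by unit tests.
class Paginator:
    def __init__(self, total: int, on_render) -> None:
        self.total = total
        self.page = 0
        self.on_render = on_render

    def handle_click(self, direction: str) -> None:
        self.page = next_page(self.page, self.total, direction)
        self.on_render(self.page)
```

The payoff is that all the branching lives in `next_page`, which needs no browser, no mocking library, and no special harness to test.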

------
EdSharkey
The

* fire your QA team,

* dev team is the level 2 production support, and

* get to continuous integration nirvana

management fads have been sweeping through my Scrum enterprise for the last 18
months.

Teams that aren't testing constantly, well, they've got tons of escape defects
on every release. And those devs are constantly in fire-fighting mode; it's
miserable for them. And I see that leading to compressed schedules for them
and more reckless behavior, like asking to push their releases during the
holidays, when there could be severe financial consequences to bugs.

As far as I'm concerned, in an environment like mine, where developers can no
longer hide their incompetence behind bureaucracy like a QA team, it is
official _insanity_ to not spend inordinate amounts of development time
writing automated tests. You should be spending 70% of your dev time writing
tests and doing devops and 30% writing features.

I read in these comments a lot of bellyaching about how much time it takes to
write tests. First, TDD is a skill that you can get good at, and it won't take
as much time as you think once you get good. Second, I just don't think you
have a choice to not test comprehensively when escape defects become a mark of
shame in the organization.

~~~
hirsin
Potentially contentious opinion, but one that's been echoed through our halls
-

Developing software and developing tests and test infrastructure are two
different skills and mindsets that are often inversely coupled.

It's not that "incompetence" is revealed by firing the test team (although it
sometimes is) - it's that "being bad at writing tests" is revealed. A team
with 3 dedicated testers and 7 devs will probably outperform (in code output
and reliability) 10 devs spending 70% of their time on testing.

~~~
kodfodrasz
I have experienced the opposite. After the dedicated QA team was disbanded
and repurposed to development, where every developer had to write tests for
someone else's code, the code and designs started to become more testable,
and the tests eventually became simpler to maintain and understand. (The
project was an embedded system.)

The "We" and "Them" distinctions made people in different teams and different
roles ignore the needs of the others. The change brought a culture change and
also gave each side a view of the other's needs. This made it possible, for
the first time, for the development of tests and features to really be done
in parallel. (Officially that was the methodology before, but it never really
worked out well, because testability considerations were usually ignored at
design time and the docs were lagging, owing to the bad bandwagoning
culture.) With the mixed team, where everyone was treated as an equal, these
problems dissolved surprisingly quickly (in less than half a year).

So my point is that having dedicated testers can lay the groundwork for a bad
culture which hurts the product and the company. Having everyone do the same
job with regards to development and testing is better.

~~~
hvidgaard
There is no silver bullet here. For some teams it makes sense to integrate
development and testing. For some teams it makes sense to have dedicated QA
people. For some teams a different constellation is optimal. I know
developers that are brilliant at the big picture but lacking when it comes to
finishing the implementation. I also know developers that cannot get the
major architecture right, but they can finish tasks and get them shipped.
Pick one of each, plus a devops guy who can write tests, and you have a team
of three that produces top-quality software.

~~~
kodfodrasz
You are probably right with your example for a small project. The one I
referred to was a medium-sized safety-critical piece with a dev/test staff of
hundreds. My example was for such a larger organization.

Small teams almost always worked out well for me if a single leader person was
present to sort out initial problems.

------
donw
Am I correct in reading that they performed this experiment only for two days,
and entirely with graduate students?

If so, they have missed the point of TDD.

In the short term, TDD probably doesn't make a difference, one way or another.

But software as a business is not a short-term game.

I would love to see a study where the participants are, over a period of six
months, given the same series of features (including both incremental
improvements, as well as major changes in direction).

In my experience, teams that don't test at all quickly get buried in technical
debt.

Untested code is nigh impossible to refactor, so nobody ever does, and the end
result is usually piles of hacks upon piles of hacks.

As far as testing after development goes, there are three problems that I see
regularly:

One, tests just don't get written. I have never seen a TLD (Test Later
Development) team that had comprehensive code coverage. If a push to
production on Friday at 6pm sounds scary, then your tests (and/or
infrastructure) aren't good enough.

Two, tests written after code tend to reflect what was implemented, not
necessarily what was requested. This might work for open-source projects,
where the developers are also the users, but not so much when building, say,
software to automate small-scale farm management.

Three, you lose the benefit of tests as a design tool. Code that is hard to
test is probably not well-factored, and it is _much_ easier to fix that when
writing tests than it is to change the code later.

~~~
jcoffland
> Untested code is nigh impossible to refactor, so nobody ever does, and the
> end result is usually piles of hacks upon piles of hacks.

The mistake is in creating unrefactorable code. TDD may be a possible solution
if done correctly but there are other ways to skin that cat.

~~~
EtienneK
Sure, but I have yet to find a better solution than TDD for big teams where
team members come and go constantly.

~~~
Deestan
"Branches aren't merged without peer approval" has sufficed plenty in my
experience.

Whether people code tests before thinking interfaces, before writing a
prototype, before implementing, during the inevitable interface rewriting,
after coding, after manual verification that it seems to work, or right before
submitting the branch to review, doesn't matter as long as someone on the team
looks over and sees that "yup, here are tests and they seem to cover the
important parts, the interface _makes sense_, and the documentation is
useful".

Then people can do TDD, TLD, TWD or whatever they personally feel most
productive with. Developers being happy and feeling in control of their own
work does more for quality than enforcing a shared philosophy.

------
haalcion3
This is a misleading title and conclusion. The study showed a huge benefit of
TDD over Waterfall, and it is only when compared to ITL that it was found to
not be better.

But moreover, I think it's important to understand why Beck pushed for TDD.

TDD is like saying "I'm going to floss before I brush every time, no matter
what."

But, when people don't do TDD they typically aren't all saying "I'm going to
brush and floss afterwards every time, no matter what."

Instead, most say "I'll floss regularly at some point, but I don't have time
now, and it takes too much effort. I'll floss here and there periodically,
maybe before my monthly meeting or big date night."

Another reason Beck pushed for TDD was method and solution complexity
reduction which results in lower time and cost required for maintenance
because code is simpler to read and understand. Again, with ITL, you're still
writing tests for everything, so you'll see those benefits. However, if you
fail to write some or most tests, some developers will write overengineered
solutions to things and overly long, difficult-to-follow methods that will
make maintenance suck up more resources.

If you want to go beyond this study, though, Beck, Fowler, and DHH had a
critical discussion about TDD in 2014 that's worth checking out:

[http://martinfowler.com/articles/is-tdd-dead/](http://martinfowler.com/articles/is-tdd-dead/)

~~~
jblow
Waterfall is a straw man.

~~~
lomnakkus
Not only that... when you test the efficacy of medical interventions the gold
standard to strive for[1] is _not_ whether the new intervention is better than
placebo, it's whether it's better than $CURRENT_BEST_KNOWN_INTERVENTION. I
suggest we should be aiming for a similar standard in testing software
engineering methodology.

I think it would be very hard to argue that Waterfall ~=
$CURRENT_BEST_KNOWN_METHODOLOGY.

[1] Of course, this isn't usually what happens in practice when pharmaceutical
companies are doing their own testing, but it's what _should_ happen if you
actually care about efficacy and not just PR/sales.

------
defenestration
The title suggests that TDD has little or no impact on dev time or code
quality at all.

The research shows no significant difference between TDD and iterative test-
last (ITL) development.

Could the title be updated to show that it is a comparison of TDD vs. ITL/TLD?

~~~
Jyaif
This thread feels like a study showing that HNers don't read the articles.

------
inglor
There is a problem with all these studies - they all use a very small number
of programmers (21 in this case) with no experience (all graduate students in
this case) and presumably no significant experience with TDD or TLD.

I'm not making a stand about TDD here - I just think we need to have much
better computer engineering science studies if we want to have significant
results.

~~~
wpietri
Agreed. I also would question the short time scale of the test. It took me a
year or so to really get good at test-driven development.

I also think of TDD as a sustainability practice. If I'm writing a small thing
that I do not intend to maintain, I won't bother with TDD (or with tests at
all). But I'll definitely TDD something where I expect to come back to it
frequently, especially when I initially don't know the requirements, and I
expect requirements to change over time.

In practice, I suspect a lot of the interesting questions about software are
effectively unanswerable with the budgets available to CS profs. I can't
imagine really answering this question without doing something of the scope of
a substantial medical study.

------
lowbloodsugar
"TDD has little or no impact on development time or code quality _when
compared to the equivalent number of tests implemented afterwards using TLD_."

FTA: In this paper we reported a replication of an experiment in which TDD was
compared to a test-last approach.

Very different title.

------
PaulKeeble
I never really viewed TDD as better at reducing bugs for a short-term
project; it's going to have marginally better chances of getting additional
test cases.

I view it more as important for breaking the growth of testing effort in an
iterative project. With each release, the scope of what should be tested to
fully test a project climbs, and unless a team wishes to linearly increase
the size of its test team, it's all but certain tests will be skipped.

TDD gives us the ability to always run the full regression tests, as it's
just machine time. It's a safety factor in knowing nothing is broken, which
in turn gives us confidence that we can refactor.

------
supersan
Up until now it has mostly been opinions and biases and even though many
popular programmers[1] have been saying this for a very long time, it's great
to see a controlled study done about it.

This makes it a fact and a great counterargument to help the many programmers
who are being forced to practice TDD because of the generally accepted claims
of productivity and code quality associated with doing it.

[1] [http://david.heinemeierhansson.com/2014/tdd-is-dead-long-liv...](http://david.heinemeierhansson.com/2014/tdd-is-dead-long-live-testing.html)

~~~
girvo
Keep in mind though, that the study shows that it's no better than testing
after iterative development. Testing is still required, and I'd wager that the
study participants following the ITL process didn't have external business
pressure to skip the "test-later" bit...

------
namuol
\- Population: A classroom of students, most without professional experience

\- Sample size: 21 students

\- Study duration: 2 days

\- Team size: Individual

Tests are most useful when _refactoring_ someone else's long-forgotten code;
the sort of thing that happens frequently in long-running projects consisting
of large teams. In other words, the "real world".

Show me _that_ study.

------
sayrer
In "Realizing quality improvement through test driven development: results and
experiences of four industrial teams", an MSR researcher found that TDD did
reduce defects in his study, but also came at a large cost in time-to-ship.

[https://www.microsoft.com/en-us/research/exploding-software-...](https://www.microsoft.com/en-us/research/exploding-software-engineering-myths/)

This finding contradicts the headline. TDD impacted both development time and
code quality in that study.

------
shade23
I have spent almost 3 years now writing code (with very few or no tests), and
my current organization stresses agile practices a lot. I encountered TDD
here, so I would like to chip in too.

TDD solved a major problem for me which I have seen a lot of people suffer
with: _Where do I start?_ The thing is, TDD and refactoring go hand in hand.
I cannot imagine doing TDD if I were not using an IDE like IntelliJ or
something. When you normally start by writing code first (typical TLD), you
need to have a plan beforehand. This plan cannot change much, because you
really do not get feedback until you complete major segments of the code. TDD
ensures you keep getting nibble-sized feedback which assures you that what
you are writing works. This, according to me, is the single most beneficial
point of the system. TDD or TLD would allow maintainable code too, and often
while doing TLD, you can still strictly follow TDD practices. It might not
have an impact on code quality for seasoned developers (coding for years on
the same codebase), but it does help the others. It also reduces my inertia
considerably. So while it might not have an impact on development time or
code quality, I tend to sleep well without large UML diagrams floating in my
head, knowing that each unit of my code works independently.

~~~
jcoffland
> When you normally start writing code first(typical TLD) then you need to
> have a plan before hand. This plan cannot change much because you really do
> not get feedback till you complete major segments of the code.

A good plan is modular and therefore flexible. I get lots of feedback as I'm
writing the code. When I find myself writing very similar code over and over
or an API feels unwieldy I take that feedback and refactor without delay.

I think what really happens in big groups of developers is that TDD forces
everyone to delay agreement on the interfaces between code modules until the
process of writing tests has uncovered most of the problems. Without TDD, the
devs who just want to get their part done and go home plow ahead with the
first draft of the API, locking it in stone before it's been vetted, and then
dig in their heels, creating technical debt. TDD appears to be the saviour.

~~~
shade23
>I think what really happens in big groups [...] dig in their heels creating
technical debt.

This is a major factor. The Pareto principle applies in workplaces too; TDD
tries to even the balance a bit.

Regarding feedback: I should have clarified that I am a mobile dev. I need to
build everything from the backend services to the UI to be able to get viable
feedback without tests. If any other mobile dev has another approach to this
problem, please let me know; I've been trying different approaches for a
while now, and none seem viable to me apart from TDD.

------
johan_larson
The questions worth asking about techniques like TDD are "What problems does
it fix?" and "What problems does it introduce?"

I would expect a determined attempt at TDD to solve the "no tests" problem,
because it is so utterly insistent on tests. It should also solve the "don't
know how to start" problem, because it de-emphasizes planning and design in
favor of just jumping in; you write the tests, and then you do the bare
minimum to make them pass.

That said, I would expect a TDD-based project to have the "bad architecture"
problem: messy interfaces and sort of ad-hoc separation of concerns, because
it makes no time for up-front analysis and design. It's always focused on the
current feature and doing whatever it takes to make it work now.

In fairness, it does include a refactoring step, which is supposed to clean up
the mess after the fact. Color me skeptical. Refactoring is hard, and people
tend to do it on a large scale only when they have to.

~~~
pkolaczk
" It should also solve the "don't know how to start" problem, because it de-
emphasizes planning and design in favor of just jumping in; you write the
tests, and then you do the bare minimum to make them pass"

It doesn't SOLVE this problem. It only pushes the problem later, making it
actually worse, because you're wasting time on stupid tests instead of
actively researching a solution. No amount of tests is going to help you find
the right solution if you don't know what you are doing. See
[http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-s...](http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-solvers.html)

~~~
johan_larson
I think TDD deserves a bit more credit than that. Building a simple solution
for part of the problem can be credited as exploring the problem space. The
same can be said for extending it to address more of the problem; that's
exploration too.

But I fear this incremental approach is going to produce a very baroque
solution that will have to be rewritten completely once the bell goes "bing"
and the programmer actually understands the underlying problem well enough to
produce a clean solution.

I think the larger problem with TDD is that there are at least four parts to
software design, and TDD bets it all on two of them. There's requirements
analysis, architectural design, construction, and finally testing. TDD is
really all about the construction and testing bits. It doesn't address
requirements analysis at all, and it doesn't seem to want to do architectural
design, it just constructs and tests with great passion. It's imbalanced.

------
NumberSix
Software development varies enormously. Flight avionics software differs from
video game software differs from a spreadsheet differs from an order-entry
system differs from laboratory analysis software differs from a web browser
and so on. Flight avionics differs from a commercial jet liner to a fighter
plane to a model airplane. Some projects have huge budgets and others have
shoestring budgets. Some projects require extremely high reliability and
quality; cost is not an issue. Other projects can be quite buggy, low quality
but still useful -- cost effective.

Developers vary as well. Some temperamentally find something like TDD useful.
Others do not.

There is no one software development methodology to rule them all.

------
sebringj
I'm not commenting on the TDD studies in terms of its effectiveness but I do
know that a project that takes longer brings more programming hours which
results in larger budgets. If you were a company selling your services, you
would be a bit more motivated to include things that take longer especially if
this tugged at the emotional sense of assurance in your clients. You would
also preach it to your programmers as a core practice and they would happily
be converts. This goes for all the structure surrounding your project as well.
I tend to see more structure in outsourcers these days and a smugness along
with it. I wonder how much of it is bloatware though.

------
varjag
No TDD discussion is complete without a reference to Sudoku debacle.

[https://news.ycombinator.com/item?id=3033446](https://news.ycombinator.com/item?id=3033446)

~~~
hkon
should be top comment

------
rainforest
The use of students in SE research is a hot topic; see, for example,
Feitelson's review:
[https://arxiv.org/abs/1512.08409](https://arxiv.org/abs/1512.08409).

Researchers have a problem recruiting subjects. There is often a tradeoff
between applying a more rigorous experimental design with convenience
sampling (students) and sacrificing controlled environments so that
professionals will actually join the study.

It's easy to condemn work like this, but there's no other option. In this
case the researchers chose to replicate a study (which often risks similar
ire for telling us nothing new) with a commendable level of rigour, and they
have provided more evidence that, for the scope of experiments we can
construct, TDD is probably no different from TLD when using a population of
relatively unqualified developers (students).

As to the problem being trivial, what else can be done? There's a finite time
you can ethically expect participants to give to you, even if you pay them. If
anything the criticism of this work is better directed at the limitations
academics are forced to bear.

~~~
jdlshore
Your argument reminds me of the joke about the man searching for his keys
under the streetlamp.

"A policeman sees a drunk man searching for something under a streetlight and
asks what the drunk has lost. He says he lost his keys and they both look
under the streetlight together. After a few minutes the policeman asks if he
is sure he lost them here, and the drunk replies, no, and that he lost them in
the park. The policeman asks why he is searching here, and the drunk replies,
'this is where the light is.'"
[[https://en.wikipedia.org/wiki/Streetlight_effect](https://en.wikipedia.org/wiki/Streetlight_effect)]

I agree that this is a well-designed study given its constraints. And it's
admirable that it's a replication study.

That doesn't change the fact that it's largely irrelevant to professionals. It
doesn't test the claims made by TDD proponents (TDD leads to better design,
reduces long-term maintenance, allows for team coordination, etc.), nor does
it address any of the interesting questions about TDD:

* Is TDD more effective in a professional setting than commonly-used alternatives?

* Is a mock-heavy approach to TDD more effective than a mock-light approach?

* Do people using TDD refactor their code more or less than people using a different but equally rigorous approach?

* Is the code done with TDD more maintainable than code done rigorously in another way?

* Is TDD easier or harder to sustain than equivalently-effective alternatives?

As a study, it's fine, if only of interest to academics. The problem isn't the
study. It's the credulous response on the part of industry developers who then
turn the false authority of the study into statements like "TDD doesn't lead
to higher quality or productivity."

------
jmadsen
The problem that I have with this article is how people will interpret the
results. The test is comparing (presumably) Comp. Sci. graduate students who
already know good design patterns, best practices, etc at a relatively high
level to see if they are faster and more accurate by testing before vs. after
writing the main code. (TDD vs. TLD)

That's all well and fine, and possibly completely accurate. However, many
people's takeaway is going to be the out-of-context & incorrect title of this
post. (It does not say TDD is worthless - it says it is essentially the same
as TLD)

I've always looked at TDD as a tool to help push less experienced, less
"educated" developers into 1) even using tests at any point of the development
cycle, 2) creating tighter, cleaner and MORE TESTABLE code by the time they've
reached the end of the cycle.

So, if your team is and always will be well-educated, experienced programmers
who already understand how to always do everything correctly from the
beginning, feel free to use either method.

Otherwise, I'd urge you to consider TDD.

~~~
sporkenfang
You clearly think very highly of CS grad students :)

I've met a few who couldn't code, let alone write proper tests, even though
their foci were _not_ policy or, say, the lighter side of user experience.

~~~
jmadsen
Really? Almost all of the post-graduate CS students I've known are experienced
people who went back to school because they were at the point where they
needed to either move into management or become a very high-level expert.

That's only personal experience, however, and mostly out of date.

That said, I would hope any of them would be a step up from the new
programmers coming out of the "churn 'em out" short courses many are forced
to take as the only affordable way to get into the industry, courses that
leave them with little knowledge beyond how to get the requirements
fulfilled.

~~~
DannyBee
Yes. I can say 100% that the major difference I see in hiring interviews
between grads and laterals is coding ability. Even at PhD+1 it's vastly
improved.

------
refulgentis
My anecdata matches the author's - I feel more productive doing TDD.

Perhaps because it's less stressful. You think about system design as you
code, instead of only when you hit a wall and have to rewrite everything, or
when you have to clean up for code review.

Either way, if it has little to no impact on dev time or code quality, I bet
the positive impact TDD has on team morale would make it worthwhile.

~~~
pvg
Extrapolating from that, perhaps it's worth considering the benefits of
thinking about system design _before_ you code.

------
jjp
This is an editorialised title. The blog post has the boring title "Test
Driven Development". The blog post and the paper it fronts conclude that
there is _no significant difference between TDD and iterative test-last (ITL)
development_, which is quite a bit different from _TDD has little or no
impact on development time or code quality_.

------
Rapzid
My default is to not write many tests at all during the experimental, build-
out phase. I'm not looking for exact or bug-free software; I'm trying out
different APIs, aggregates, and architecture in general. Needing to refactor
tests every time I want to make a drastic change is... well, you know. As
somebody else pointed out, this architectural stuff is probably much harder
to nail down than just writing code that works. This is not limited to the
very initial build-out but can apply to big refactors as well.

During and after the experimental phase, it depends. In both cases I may
write tests before, or "test with", for gnarly logic or algorithm-y stuff.
Otherwise, and in addition, I do copious amounts of manual testing. Manual
testing is a must for much of what I do, so I augment or substitute automated
testing as appropriate. Automated testing is great, but sometimes the
overhead is too expensive.

------
rbanffy
After a quick read of the metrics section, it seems quality is measured in
terms of adherence to user stories implemented as a set of behavior tests.
There seems to be no assessment of code maintainability, which looks like a
flaw in the study, as it models a short-lived codebase rather than one that
undergoes several maintenance cycles.

------
jaunkst
TDD is king when refactoring, or proving an algorithm. You have tests to
confirm the output, and near-realtime feedback that your assumptions are
correct. The rest is obvious. Mission-critical component: TDD. Complicated
refactor: TDD. Algorithm you need to validate: TDD. Anything else, write the
code and get a peer review.

~~~
candiodari
No amount of tests can prove that an algorithm is correct. At best, they prove
that an algorithm works in a particular case.

And generally, I would say that the more interesting sorts of tests (fuzz
testing, large-scale system testing) are extremely unpopular with software
engineers because "they suddenly fail without reason". Not quite as unpopular
as actual proof that an algorithm works, like implementing it in coq for
instance, but very unpopular.
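A minimal sketch of the kind of fuzz test the comment means (Python; the `merge_sorted` function and the oracle are hypothetical stand-ins): random inputs are checked against a trusted reference, which can find counterexamples but never proves correctness.

```python
import random

def merge_sorted(a, b):
    # Hypothetical function under test: merge two already-sorted lists.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def fuzz(trials=1000):
    # Compare against a trusted oracle (sorting the concatenation) on
    # random inputs. A pass only means no counterexample was found.
    for _ in range(trials):
        a = sorted(random.randrange(100) for _ in range(random.randrange(20)))
        b = sorted(random.randrange(100) for _ in range(random.randrange(20)))
        assert merge_sorted(a, b) == sorted(a + b), (a, b)

fuzz()
```

Such tests "suddenly fail without reason" precisely because the inputs are random; logging the failing input (as the assert message does here) is what makes the failures reproducible.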

------
Steeeve
I'm a fan of TDD, but I'm a bigger fan of having reliable, repeatable, and
complete tests, period.

I don't think it's productive to argue the merits of the study itself - better
to look at the positive. What the study tells us is that it's not too late to
improve your existing software with tests.

------
Annatar
Long story short: if your coders don't take personal pride in their product,
or are inexperienced, or are mediocre, _no methodology in the world will save
you_. None. I realize my statement is anecdotal, but I'm writing from decades
of experience working with people who did not take any pride in their work,
who still view programming as a trade rather than an art, or who view
programming as an art where "spaghetti code is beautiful". No
methodology, no technology, no management technique, and no programming
language saved them or the company. The builds are still a mess. The code is
still a mess. The bodies of code require endless babysitting and endless
hacking.

------
arcticbull
I've tried TDD numerous times in my professional career; I'm confident it
works for many. I prefer to use white-box testing as my second pass through
my algorithm. It allows me to identify potential weaknesses, write test cases
around them, and correct them in one step. I never feel quite as secure with
TDD as I do with post-hoc testing. I'm also not going to tell other people
that's the one true path. Unit tests? Critical. Before vs. after? Personal.

With respect to this study, I think at best we can say that equal quality
tests yield equal results. I don't think -- based on reviewing the methodology
-- that the headline can clearly be drawn from the study.

------
k__
TDD is basically "writing software for a test". Programming language design
has a similar problem: the first big piece of software many people write in
their new language is a compiler, so many languages are optimized for that.

------
rhizome31
As a TDD advocate, and assuming this study has any scientific validity, this
is actually good news! There's a very common claim that TDD makes you less
productive. It's good to have some study to oppose this claim.

~~~
kriro
I think methodologically you'd need equivalence testing for that. Your
hypothesis would be that productivity is equal (enough), and you could then
discuss the additional benefits of TDD.

I can't remember if I've read a study like this for TDD, but equivalence
testing is fairly underused outside of pharma/medicine (where it's often
called bioequivalence), where the test usually shows similar-enough effects
and the extra benefit is cost savings (for generics).

------
dang
We changed the URL from [http://neverworkintheory.org/2016/10/05/test-driven-
developm...](http://neverworkintheory.org/2016/10/05/test-driven-
development.html), which points to this.

When the topic is controversial and the paper is not so specialized that only
a few people here can understand it, changing the URL to that of the paper
tends to help make a discussion more substantial. Especially when the blog
post is more of a gloss on the paper than an in-depth commentary on it.

------
stevehiehn
If you were to always write tests immediately after writing a few classes, I
don't think it would make a difference. However, from my own experience, I
never write nearly as many tests after the fact.

------
greesil
This is the kind of thing where, in the aggregate, you can't show a
relationship, but I bet if you controlled for type of project you would see
some interesting results. Anecdotally, I know some firmware engineers who
shit out the buggiest code I have ever seen, and test-driven development
would definitely have improved the customer experience. When the engineers
have literally no tests other than trying stuff out with a printf on the
target embedded device, any amount of unit testing will wind up helping.

~~~
Tyr42
I just came out of an embedded system-sy project, and I did have some tests
for my ringbuffs, and sprintf.

But tests for any of the interactions between subsystems are quite
problematic. And testing on the device might be problematic due to space
constraints, but testing in a simulation is also problematic... I don't know
how you'd realistically test it.

------
godmodus
"YOLO"-based dev work, on the other hand, is where it's at, right?

Then again, I can see where students and new learners might falter. TDD
requires you to know a bit about what you're doing, and if you're new to
programming, it just costs more time to compensate for not having a healthy
intuition.

Still, though, if you want maintainable code that's somewhat future-proof and
not disposable - test it and keep it clean.

I mean, it's like arguing that sharpening your katana while you fight is
detrimental to duel survival. Which is true... But...

------
goalieca
Actual studies were never needed to convince managers to switch processes.
Bonus points for blaming old problems on old process while blaming new
problems on "not doing agile right".

------
gaius
It always both amuses and saddens me how people will eagerly write more tests
than actual code, but refuse to use a strongly typed language. The compiler is
my test harness.

~~~
rbanffy
The compiler can test the validity of your code, not its behavior. Only the
most trivial bugs can be caught this way.

------
iUsedToCode
The research seems low quality. Whenever I try creating something more
complex than just a CRUD webapp, I'm always relieved after getting
significant code coverage.

It may be because I'm a mediocre programmer (I mostly do hobby projects), but
the assurance that my 'small change here' didn't mess up anything major in a
distant part of the system is quite relaxing.

Obviously I only test logic and usually write the tests after coding. It
still helps with my flow.

~~~
avarun
Then you're not talking about TDD.

------
Shorel
In a certain way, you always use tests when you are developing something.

Write code, run it, see what happens, repeat.

The 'see what happens' part is what is different in TDD.

It can be very similar to what you do without automated testing (while also
repeating all previous tests), or it can be a scaffold of endless tests, or
too few tests, or anything in between.

I've seen too many mocking tests for my taste. In fact, my tests tend to be in
the 'integration tests, not unit tests' category.

------
gostonethecrows
There are many problems with this study, but for me the most glaring is the
definition of quality that they measured. It was purely whether the program
performed as expected. This is obviously an important part of code quality,
but not the only one. Most proponents of TDD say that its greatest benefit is
creating clean, easily maintained code. So this study didn't even attempt to
test the benefit that TDD claims to provide.

------
mobiuscog
TDD means 'management' can't drop the tests being written due to 'timescales'.
If they're done up-front, they will be there.

It's also one reason that TDD isn't done: being given a few weeks to meet an
impossible deadline means that tests are the first ideal to be dropped.

It's not the correct way to do things, but all of these studies tend to
ignore the 'real world'.

------
Confusion
The comments to that story are pretty good.

An interesting question is: why does TDD fail in such experiments (it does so
surprisingly consistently), even when many developers feel it has benefits
when they practice it?

There is no silver bullet, so there must be circumstances in which TDD does
not work. And conversely, the central question is: under what circumstances
does TDD work? What are the preconditions?

~~~
ivan_gammel
The answer likely lies in the studies that declared TDD an effective
practice, and in the works that specified the TDD approach. Those works and
studies tried to solve specific problems, and I'm not sure they followed a
really scientific process before declaring that TDD is the solution, rather
than a by-product of some solution that passed unnoticed during the research
(e.g. educating developers on software architecture).

------
johnlbevan2
TL;DR

Conclusion: "TDD does not affect testing effort, software external quality,
and developers’ productivity"

However, per jdlshore's comment
([https://news.ycombinator.com/item?id=12740978](https://news.ycombinator.com/item?id=12740978)),
test parameters weren't suitable for any meaningful conclusions to be drawn.

------
xrd
Does this study assess the long-term cost of software? It may be true that
TDD has little benefit when writing code from scratch, and my experience is
that TDD definitely takes longer when writing code than not doing it. But how
does it evaluate the claim that 90% of the cost of code comes in maintenance,
not its initial creation?

------
BurningFrog
TDD, like most of the agile practices, is a learned skill.

Doing it at an expert level is very different from an untrained novice winging
it.

------
mathattack
Replicated with 21 grad students? And then they quote statistics?

Painful to watch people generalize from such small sample sizes.

------
avodonosov
It's great that such studies exist, but there might be many reasons why they
are incorrect (they test on students: perhaps the students don't understand
how to apply TDD, or, the other way around, they are so good that their
coding approach provides all the benefits without TDD; the numeric metrics
used in the study might not adequately reflect the interesting
characteristics of the code base; the payback of TDD might show up in later
stages of the product's life, when we refactor or extend it; etc.).

Probably TDD can speed up people who otherwise aren't used to an iterative
bottom-up approach - TDD encourages a short "change - run and see how it
works" loop. Especially in non-interactive languages like C or Java.

Also, if we write tests after functionality is implemented, how do we know why
our test passes: is it because the functionality is correctly implemented or
it's because the test doesn't catch errors? To ensure test catches errors we
need to run it on a buggy version of code. Implement functionality, write
test, introduce errors in functionality to ensure the test catches them -
that's 3 steps. Run test in the absence of correct code and then implement the
code - 2 steps. That's where "test first" might be efficient.
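The two-step flow can be sketched in Python rather than Lisp (the `interleave` function here is hypothetical; the point is only the red-then-green order):

```python
def interleave(a, b):
    # Not yet implemented: the "red" stage.
    raise NotImplementedError

def test_interleave():
    return interleave(['a', 'b', 'c'], [1, 2, 3]) == ['a', 1, 'b', 2, 'c', 3]

# Step 1: run the test before the code exists. It must fail; that failure
# is the evidence that the test is capable of catching errors at all.
try:
    test_interleave()
    raise SystemExit("test passed with no implementation - the test is useless")
except NotImplementedError:
    pass  # red, as expected

# Step 2: implement the function and run the same test again ("green").
def interleave(a, b):
    return [x for pair in zip(a, b) for x in pair]

assert test_interleave()
```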

But often that can be achieved another way. Suppose I'm writing a function to
merge two lists. I will just do (merge '(a b c) '(1 2 3)) in the REPL and see
with my own eyes that it returns (a 1 b 2 c 3). I will then just wrap it in
an assert: (assert (equal `(a 1 b 2 c 3) (merge '(a b c) '(1 2 3)))). Run
this and see it passes - that's all; I'm sure it's an OK test.

In short, I think there is a certain truth in TDD, but it shouldn't be taken
with fanaticism. And it can even be applied with negative effect (as any
idea).

Suppose I want to develop a class (defclass user () (name password)).

I personally will never write tests for make-instance, (slot-value ... 'name),
and (slot-value ... 'password) before creating the class, then see how the
tests fail, then create the class and see how the tests pass.

Tests take time and effort to write, and then to maintain and rewrite when
you refactor code. If a test catches an error, then the test provides some
"return on investment". Otherwise, writing the test was a waste.

The tests in the above example will never capture anything.

I tend to create automated tests for fragile logic which is relatively easy
to test, so that the effort spent is justified by the expected payback.

But all my code is verified. Write several lines, run and see what doesn't
work, fix that.

------
andy_ppp
I've always made data to test whether my functions work, but now I write that
data down in other programs so the future can keep checking my functions.
What's the big deal? Sure, TDD is about the future, not the time or
development quality today.

------
gedy
Tests before/at/near development time really help your code design - I've
seen how ensuring code is unit-testable simplifies and enforces layering,
etc. I really disagree that this does not help code quality.

------
EugeneOZ
New programmers will read "studies" like this and decide to write tests
"someday later". I really hate the impact this "study" has. And I agree with
every point of @jdlshore's comment.

------
SeriousM
The title of this post is very misleading. It's TDD, as opposed to ITL, that
has little to no impact; the title suggests that testing itself does not have
any impact. This is just clickbait...

------
jrockway
When writing new code I don't usually write the tests first, but when fixing a
bug, I do. There is nothing worse than a test that would have passed without
your supposed bug fix!
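A sketch of that discipline, with a hypothetical `clamp` bug (Python): the new test has to fail before the fix, otherwise it proves nothing about the bug.

```python
# Hypothetical bug: a clamp() that forgets to apply the lower bound.
def clamp(x, lo, hi):
    return min(x, hi)  # bug: lo is never used

def test_clamp_lower_bound():
    return clamp(-5, 0, 10) == 0

# Written BEFORE the fix, the test must fail - this is what shows it
# actually reproduces the reported bug.
assert not test_clamp_lower_bound()

# Apply the fix; the same test now passes, and will catch a regression
# if someone later reintroduces the bug.
def clamp(x, lo, hi):
    return max(lo, min(x, hi))

assert test_clamp_lower_bound()
```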

------
eva1984
Not surprised. Religiously following a principle in the belief that it will
help you bypass the complexity of the problem itself almost never stands the
test of time.

------
known
Experience is the name everyone gives to their mistakes --Oscar Wilde

------
z3t4
I think TDD is most effective for "state machines" like cook([ingredients])
=> dish, which should be avoided if possible as they are very bug-prone.

------
joelthelion
As with many things in software engineering, TDD is just another tool. It's
useful from time to time, but it's no silver bullet.

------
tarkaTheRotter
"TDD disproven by people who have no idea how to practice it, nor the ability
to grok the long-term benefits."

------
copperx
For a split second I thought they were measuring TDD against no tests at all
and I felt a panic-induced adrenaline rush.

------
jyriand
Based on my own experience working in teams using TDD and teams not using TDD,
I cannot agree.

------
_pmf_
The onus of proof is on the TDD pundits to prove anything substantial.

------
roadman
I was eager to read this paper but found little substance in it.

------
micahbright
Somehow, as a Software Engineer, I'm not really surprised.

~~~
djrtwo
Can you elaborate?

~~~
problems
If you write testable software and actually write the tests, the end result is
the same whether you test first or test later. You're designing with testing
in mind and creating useful tests either way.

The real issue is code that lacks tests, and especially code that isn't
designed with testing in mind. I think they overinterpreted the value of TDD
as simply test-first. Test-first can be good, but it's more a motivational
tool and a productivity stimulator (yay, look at those green checkmarks!)
than a real benefit.

And since the end result is identical, this sort of craps on it as a
motivational tool too, at least if you consider that only as a factor in
development time. It'd be interesting to see developer satisfaction included
in some way too.

~~~
wpietri
> If you write testable software and actually write the tests, the end result
> is the same whether you test first or test later.

This is definitely not my experience. As Kent Beck says, TDD is a design
technique. It forces you to always start thinking of the code from the outside
of the unit. If I build the unit first and add tests later, it's more likely
I'll end up with something where the API reflects the implementation. With
test last, I'm also less likely to test everything well; after the
implementation is done, I believe it works.

~~~
zzalpha
_If I build the unit first and add tests later, it's more likely I'll end up
with something where the API reflects the implementation. With test last, I'm
also less likely to test everything well; after the implementation is done, I
believe it works._

Will you?

Are you sure?

Do you have data to back your supposition?

My contention would be that at the end of the day, the requirements of the
interface to support unit testing will result in a very similar set of design
choices, whether you write those tests up-front, as-you-go, or after-the-fact.

The only difference is that if you write them after-the-fact, you may push
some amount of refactoring to the end of the process instead of doing it along
the way.

But I'll bet you have about as much data as I do to back your beliefs. ;)

~~~
wpietri
> Do you have data to back your supposition?

That's my experience. I started doing TDD more than 10 years ago, and it took
me about a year to fully make the switch from test-after to test-first
programming. I regularly try experiments with different personal code bases.

If you're arguing from personal experience, that your code ends up just the
same either way, good for you. But I suspect you're arguing from theory here.

------
ericls
If the claim in the paper is true:

TDD = Same time + same quality + feel better.

------
bigodines
Clickbait. TDD !== tests; the article compares TDD with TLD.

------
blakecallens
It may not boost productivity upfront, but it saves a lot of time down the
line by alerting you when something is out of place.

~~~
zzalpha
No, that's the value of automated regression tests. TDD is just one way to
skin that particular cat.

~~~
blakecallens
And that test winds up in v0.1.1 of your software magically, or is it there
because you put in the work to add it up front?

~~~
zzalpha
Nice work on the false dichotomy. :)

TDD has a _very specific_ meaning. It means you write tests, then write code
that passes those tests. That specific order. If you're not doing that, you're
not actually adhering to the definition of TDD.

TLD could mean, for example, writing a module, class, or function with a
defined interface that implements the contracts for that interface, then
writing the suite of tests to validate that module, after which you move onto
the next module.

Strangely enough, you can do that while you're developing the product without
adhering to the process dictated by TDD.

------
zkhalique
TDD no, regression testing yes.

------
isuckatcoding
As someone who uses unit tests to find bugs in my code, that I would never
otherwise find, this is surprising.

~~~
WalterSear
That not TDD, though. It's more similar to test-after.

~~~
mattschmulen
I think test-after is a reasonable approach. I am constantly dealing with new
code bases, and in the interview process I am often asked about my philosophy
on testing. My response is that testing occurs on two fronts: from the top
down (functional UI tests) and from the bottom up (unit tests). When it comes
to unit testing, my approach is to focus on the hotspots. If something gives
you trouble, or if you find a bug/issue, then wrap it in a unit test. That
way you don't have to worry about it. Bugs tell you where the weak spots in
your code base are. When they speak to you, listen and take some action.
Otherwise, I feel like chasing blanket coverage is not worth the effort in
most products.

------
ben_jones
...when implemented poorly

~~~
skookum
Therein lies the rub with all the faith-based modern development practices.
Getting results? That's the power of The Practice. Not getting results or
seeing negative impact? You're just not doing The Practice right and/or hard
enough.

~~~
bbcbasic
Cough scrum

