
> While it is useful to tease out the contributory causes for why adopting even a half-baked TDD had such a powerful effect, in the meantime, the fact is that even a half-baked TDD had a powerful effect.

You've invented this term "half-baked TDD", but that seems a little unfair. My point was that the groups in the Nagappan study were doing significantly more than just TDD. For example, they also had varying levels of dedicated design activities in addition to anything test-driven they were doing. It also wasn't clear to what extent some of them were doing similar kinds of unit testing in the alternative scenarios from the study. So you have neither a clear baseline nor a clear change.

For example, one alternative possibility that still seems reasonably consistent with the evidence in that study is that unit testing is good at improving quality, that TDD promotes writing more tests, and that any design weaknesses that a TDD style might encourage in isolation were mitigated by the separate design activities those groups were doing beyond the basic fail-pass-refactor cycle required for TDD.
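
(For concreteness, since I keep referring to it: the basic cycle is roughly the following, sketched here in Python with pytest and an invented slugify example; nothing below comes from the study itself.)

    # Red: write a small failing test first.
    def test_slugify_lowercases_and_hyphenates():
        assert slugify("Hello World") == "hello-world"

    # Green: write the simplest code that makes the test pass.
    def slugify(text):
        return "-".join(text.lower().split())

    # Refactor: tidy the code with the test as a safety net,
    # then repeat the cycle for the next small behaviour.
    # Run with: pytest this_file.py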

> The point is "could they have panned out better?"

Erm... Yes. I just made exactly that point, in my very last post, the one you were replying to.

> I present Pivotal as a falsification of the argument that "true" TDD hasn't been tried for a long time in commercial settings, with real clients, with real projects, with real consequences.

Who was making that argument? I certainly wasn't.

> Your argument is "if it's so good, why hasn't everyone switched?"

No, it isn't. Let's be clear about this.

Prominent TDD advocates, Bob Martin among them, claim quite unambiguously that TDD is essential to writing good software, even using patronising and insulting language like "unprofessional" to describe anyone who doesn't do it.

If that were actually true, if TDD is inherently superior to any other development process and anyone not following it is actively doing programming wrong, then the results achieved by organisations using TDD should be clearly and consistently superior to the results achieved by those not using TDD, other factors being reasonably equivalent.

I claim that this is not the case, and I cite as evidence the simple fact that after nearly two decades TDD still represents a tiny part of the industry practice. This is an industry that over the same time period has brought numerous minor programming languages to prominence and mainstream acceptance; brought many ideas from fields like functional programming and distributed systems from niche applications or academic studies into the mainstream along the way; shifted a large proportion of mainstream development from the previous desktop/server model to web apps, cloud hosting, mobile development, and much more sophisticated embedded systems; made dramatic shifts in development processes such as the migration to a DevOps style of integration and deployment in many cases; and seen the rise of Open Source and more generally of collaborative development from a fun pastime for geeks to a strong influence behind much of the software we run every day.

Given that, I'm sorry but I find it patently absurd to argue that the only reasons hardly anyone is doing TDD, even though it is so inherently superior in both quality of results and cost effectiveness, are that it is hard or unfamiliar. Many of those other changes I mentioned above have been adopted across large parts of the industry within much less time, even though the ideas were completely new and/or required completely different mindsets and understanding.

If TDD really were essential to get good results and so clearly superior to other development processes as the evangelists frequently claim, then as I said before, by now we should see a vast pile of evidence, not just the same study with a sample size of four being cited nearly a decade later, and not just the occasional business -- particularly one that by your own admission is "probably the most doctrinaire TDD company in the world" -- that has used TDD and not failed. Many of us worked on software projects that have not failed. Not failing is table stakes for this debate.




> You've invented this term "half-baked TDD", but that seems a little unfair.

I was rolling with your characterisation that the study wasn't about "real" TDD.

> Prominent TDD advocates, Bob Martin among them, claim quite unambiguously that TDD is essential to writing good software, even using patronising and insulting language like "unprofessional" to describe anyone who doesn't do it.

I personally find Bob Martin quite infuriating.

Doubly so, because I am apparently being grouped with him.

> Given that, I'm sorry but I find it patently absurd to argue that the only reasons hardly anyone is doing TDD, even though it is so inherently superior in both quality of results and cost effectiveness, are that it is hard or unfamiliar.

My actual argument is that TDD is a practice that is hard to learn alone. Every anecdote I read about someone trying and rejecting TDD is an individual trying it by themselves.

> Many of us worked on software projects that have not failed. Not failing is table stakes for this debate.

Reducing defects found in production by 40-90% on a first encounter with TDD is more than table stakes, especially considering how many projects utterly fail.

Consider for contrast Fagan-style code inspections. These too boast studies with ~90% bug yields. I don't see many people doing them.

Or formal methods. Again, claims of remarkable bug prevention outcomes on very challenging projects, for long spans of time. Yet they haven't swept the industry.

Some practices are, frankly, harder to learn than others. That the industry is quicker to adopt the more easily learned practices says nothing else about those practices.

We clearly aren't going to agree.

Edit: one more thing. I was struck by your point that people only ever cite the one paper. So I began looking for reviews.

Here are two recent ones of interest:

"The effects of test driven development on internal quality, external quality and productivity: A systematic review"

http://www.sciencedirect.com/science/article/pii/S0950584916...

and

"Considering rigor and relevance when evaluating test driven development: A systematic review"

http://www.sciencedirect.com/science/article/pii/S0950584914...

This second one in particular is of interest: its authors include Munir, who was an author of early research showing equivocal results for TDD.

Unfortunately, both are behind paywalls, so a closer reading may weaken the fairly strong statements in the abstracts.


> My actual argument is that TDD is a practice that is hard to learn alone. Every anecdote I read about someone trying and rejecting TDD is an individual trying it by themselves.

In itself this is a fair point, but I think this kind of argument only stands up for so long. The same could be said of previously relatively obscure programming styles like functional programming, but they have slowly worked their way into the mainstream as more people have learned them. The same could be said of the modern emphasis on DevOps, but again knowledge and tooling for that have evolved rapidly and gained widespread acceptance in an industry where they were mostly alien just a few years ago.

> Consider for contrast Fagan-style code inspections. These too boast studies with ~90% bug yields. I don't see many people doing them.

Fagan-style inspection is too heavyweight to be practical in most software development organisations, and rightly meets resistance as such. However, this is an area where I have considerable personal experience, and I can tell you there are a lot of places that have successfully implemented lighter-weight code reviews and/or broader technical reviews of project assets, with very favourable results. Even major Open Source projects typically have some level of mandatory review, and often super-review, before new code is allowed into the master branch. Almost every project that is serious about software quality has at least some form of code review process today.

> Or formal methods. Again, claims of remarkable bug prevention outcomes on very challenging projects, for long spans of time. Yet they haven't swept the industry.

Formal methods are too expensive for most projects with today's techniques. They have their place, and they can achieve excellent results in the right context. I'm bullish about the future of this field, not because I expect it to take over completely any time soon, but because I expect that some of its ideas will drift into the mainstream and become common practice as they become incorporated into our languages and tools, just as today strong, static type systems can eliminate entire classes of programmer error that are possible in more dynamic environments. However, for now the cost of heavyweight formal methods is so high that you really are into the territory where alternative engineering solutions involving completely redundant systems and the like can actually be more cost-effective.
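
(To illustrate the type-system point with a trivial Python sketch: annotate the nullable return and a checker such as mypy rejects the unsafe call before the program ever runs. The find_user example is invented.)

    from typing import Optional

    def find_user(user_id: int) -> Optional[str]:
        """Return a username, or None if no such user exists."""
        users = {1: "alice", 2: "bob"}
        return users.get(user_id)

    name = find_user(3)
    # print(name.upper())   # a checker like mypy rejects this line:
    #                       # 'name' may be None, so .upper() is unsafe
    if name is not None:
        print(name.upper())  # fine: None has been ruled out first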

> I was struck by your point that people only ever cite the one paper. So I began looking for reviews.

I've only read one of those (the Munir one) but I'm afraid you might be disappointed. For example, of the 41 (mostly primary) sources they considered, just 9 were in their high-rigour, high-relevance quadrant. Of those, they report that 7 did conclude that the external quality of the TDD-based development was significantly better (one of the 7 being the Nagappan paper).

However, when you look at the primary sources, you find that, as with Nagappan, what they were looking at often wasn't really TDD either. For example, one was actually about moving away from TDD at a class/method level and towards testing at a higher level with components, and it was the latter that gave the better results.
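
(Roughly the distinction that paper was drawing, if it helps, as a Python sketch; the cart/checkout names are invented, not taken from the paper.)

    from dataclasses import dataclass

    @dataclass
    class Item:
        name: str
        price: float

    class Cart:
        def __init__(self):
            self.items = []
        def add(self, item):
            self.items.append(item)
        def total(self):
            return sum(i.price for i in self.items)

    class Checkout:
        """The 'component': a public interface over the internals."""
        def process(self, lines):
            cart = Cart()
            for name, price in lines:
                cart.add(Item(name, price))
            return cart.total()

    # Class/method-level test: pins down one method of one class,
    # so refactoring the internals tends to break it.
    def test_cart_total_sums_item_prices():
        cart = Cart()
        cart.add(Item("book", 10.0))
        cart.add(Item("pen", 2.5))
        assert cart.total() == 12.5

    # Component-level test: drives the component only through its
    # public interface, leaving Cart/Item free to change underneath.
    def test_checkout_totals_the_order():
        assert Checkout().process([("book", 10.0), ("pen", 2.5)]) == 12.5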

I might also challenge the classification of some of those papers as rigorous and relevant. For example, one of the key metrics used in the Slyngstad case study is defects per SLOC, which is questionable in itself. The case study compared several releases of the same project, between which the number of SLOC varied widely (notably changing quite dramatically at the very release in which TDD was introduced) but in all cases was quite small by professional development standards (only a few thousand lines). And then the paper does some extremely dubious arithmetic to reach its headline statistic of TDD reducing the mean defect density by around 35%, glossing over things like a sharp rise in defect density in the release when TDD was introduced and the fact that the average for the test-last releases was completely dominated by a much worse score for the very first release.
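
(A toy illustration of that last problem, with entirely made-up numbers rather than the paper's data: one bad early release can dominate a mean defect density.)

    # Invented defect densities (defects per KSLOC) for four
    # test-last releases and three TDD releases.
    test_last = [30.0, 6.0, 5.0, 4.0]   # first release is terrible
    tdd       = [9.0, 7.0, 8.0]

    def mean(xs):
        return sum(xs) / len(xs)

    print(mean(test_last))       # 11.25
    print(mean(tdd))             # 8.0  -> headline "TDD improvement"
    print(mean(test_last[1:]))   # 5.0  -> drop the single outlier and
                                 #         test-last looks *better*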

In at least one case, the Siniaalto paper, the survey appears to have almost completely reversed the position of the original paper, perhaps as a result of scanning for key words and phrases a little too loosely and failing to notice that the paper was actually disputing some of those claims rather than supporting them.

Overall, it's still much the same story here: some of the generalisations presented in the summaries aren't necessarily supported by the primary data when you look at the details. There are lots of examples of the understandable but still real distortions that these kinds of surveys always seem to exhibit.

So while I appreciate the interesting discussion, I'm afraid we might still have to agree to disagree on this one. I'm not saying TDD doesn't or can't work for the right team in the right context, but the idea that it is innately superior to other development methods in general and the evidence typically cited to support such a claim just don't stand up to scrutiny.


Are you a researcher or a practitioner? The last half of your answer was much more interesting than the slogans at twenty paces we exchanged in the early part of the discussion.


I'd say I'm a practitioner, but one who has been around the block a few times and perhaps done more research than most along the way.

Once upon a time I did spend several years doing fairly serious investigations into ways to improve software development processes and what evidence was out there. The majority of that work wasn't primary research, but it was fascinating and sometimes enlightening to separate advocacy from evidence, and I suppose I've maintained the habit ever since.

I find some of the ideas popularised by the Agile movement particularly interesting. Often there is decent evidence of effectiveness to some degree or in some context, the kernel of a good idea, if you like. Unfortunately, there is also the whole dogmatic advocacy thing, where evangelists extrapolate beyond the evidence and the benefits get overstated.

Just to be clear, I'm not suggesting that you've been doing this in our discussions here. I'm happy that TDD seems to work for your organisation, I've no reason to doubt that you find it effective, and if you have any write-ups of what you've found does or doesn't work well then I'd be happy to read about it.

However, I've mentored more than one junior developer who really has told me point blank that we were doing software development wrong just because we weren't following the gospel according to Bob Martin, Joel Spolsky, or whoever it is this week. That gets old, so I tend to comment when discussions get into evidence-based debate, in the hope that it will point others towards information that took me a long time to find and reconcile.


I hear you. I often feel the same way.

Most of what coalesced into agile in the late 90s was already "in the air", just taken further and tied together.

For example, it's normal now to have CI/CD.

In 1996, McConnell listed the "daily build and smoke test" as a best practice, describing it as the "heartbeat" of a project: without it, you're dead. It didn't have a sexy name, and it was slow and fragile, but the concept was there.
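
(The modern descendant is a few lines of script. A minimal sketch in Python, where the make targets are placeholders for whatever a project actually uses:)

    import subprocess
    import sys

    # A minimal "daily build and smoke test" script: build the
    # product, then run a quick end-to-end sanity check.
    STEPS = [
        ["make", "build"],        # compile/package everything
        ["make", "smoke-test"],   # fast end-to-end sanity check
    ]

    def main():
        for step in STEPS:
            result = subprocess.run(step)
            if result.returncode != 0:
                print("heartbeat failed at: " + " ".join(step))
                sys.exit(1)   # the build is broken: fix it today
        print("build is alive")

    if __name__ == "__main__":
        main()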

Or take sprints and iterations: various spiral models existed before Scrum and XP became the talk of the town; it's just that hardly anybody tried them.

This was a good discussion. I continue to disagree with you about the parallels you drew and what I see as a line of argument that adoption of a practice is commensurate solely with its value and effectiveness.

However, I now appreciate why you were forceful. I fit a pattern you recognise.



