Giving Up on TDD (cleancoder.com)
131 points by mxhold on March 19, 2016 | 110 comments



For many paragraphs, I thought this was a parody of a TDD defense, but it turns out it wasn't.

His 'defense' of the point is basically: Look, when you do TDD you have to put a lot more work into the tests than you thought! It is not just a simple thing!

Okay, fine, but ... Before embarking on TDD, the programmer had a picture in his head of what the costs+benefits of this change would be. Now you are telling him the costs are WAY higher. So a successful defense would have to then make the case that the benefits are also WAY higher.

But he doesn't. Because the benefits aren't higher, in fact they are lower (as is the case with every well-intended scheme in the history of anything.)

As usual my advice on this is: look at the people who build things you find highly impressive, and study how they did it. This is much more fruitful than reading the output of people who want to spend all day telling you how to program (which leaves very little time for them to build software that is impressive, i.e. they never even test their own ideas!)


> Because the benefits aren't higher, in fact they are lower

It depends on what you think the benefits are.

> As usual my advice on this is: look at the people who build things you find highly impressive, and study how they did it.

Good idea. Researchers at Microsoft and IBM conducted studies applying TDD to real, commercial projects of varying sizes, languages, team structures and complexity.

They found that defects in production were 40 to 90% lower. Development was 15-35% slower, assuming you don't count post-release rework towards the overall time spent on development.

http://research.microsoft.com/en-us/groups/ese/nagappan_tdd....

These studies were not conducted with experienced TDDers, nor with TDD advocates on staff. Some of the teams abandoned TDD afterwards.


Each chapter in the book 'Making Software' by Greg Wilson and Andy Oram takes one claim commonly made by software developers and examines the body of research surrounding that claim. Chapter 12 is on TDD. The conclusion was that the evidence is inconclusive.


It's an excellent book.

Software engineering research has the same problem as sports science. Small samples, lots of difficult-to-control variables. No way to perform real RCTs. Not enough money or interest to really dig.

That doesn't make it useless, but it does mean exercising judgement.


Can we then conclude that there's no way to balance the 15-35% rise in (one) cost against the 40-90% drop in defects (if those are the measurements)?

That's really interesting if it is the case.


I think it can be rationally traded off, so long as everyone is clear about how to measure and what the decision criteria are.

I think that for products with long release latency (i.e. the kind of shrink-wrapped software Microsoft specialises in), the accounting very much works against TDD on these findings, because the incentives are to front-load features and catch up on bugs with service patches. If you miss your release window, your feature will be bumped by years.

For products with continuous deployment, the economics shift and invert. Blasting out code willy-nilly steadily diverts engineering into firefighting, until forward progress stalls.
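
As a purely hypothetical back-of-envelope (every number below is an assumption, not a figure from the studies), the accounting might look like:

    # Back-of-envelope only; all numbers are assumptions, not study data.
    def total_effort(dev_weeks, rework_fraction, slowdown, defect_reduction):
        """Development time plus post-release rework, in weeks."""
        rework = dev_weeks * rework_fraction
        return dev_weeks * (1 + slowdown) + rework * (1 - defect_reduction)

    # Midpoints of the reported ranges: 25% slower, 60% fewer defects.
    print(total_effort(100, 0.3, 0.00, 0.0))  # baseline: 130.0 weeks
    print(total_effort(100, 0.3, 0.25, 0.6))  # with TDD: 137.0 weeks (loses)

    # If rework dominates (say half of development time), it inverts:
    print(total_effort(100, 0.5, 0.00, 0.0))  # baseline: 150.0 weeks
    print(total_effort(100, 0.5, 0.25, 0.6))  # with TDD: 145.0 weeks (wins)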


This just confirms my understanding that nobody really understands the cost of defects.


In the way we work at Pivotal, feature stories "earn" velocity, but bugs do not. So if attention is not paid to code quality, velocity will eventually trend downwards.

This is intentional, as it gives you a feedback loop on quality and helps to improve predictions of actual feature progress.

From there you can convert to points (or cost) per engineer per week.

Of interest, our average points per engineer per week on Cloud Foundry has remained largely steady over time and scale. I'm wary of that metric because it breaks a fundamental concept of measuring velocity (that points are only meaningful inside a single project).
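
A minimal sketch of that accounting (the story fields here are invented for illustration, not Tracker's actual data model):

    # Velocity counts only accepted feature stories; bugs earn nothing.
    stories = [
        {"kind": "feature", "points": 3, "accepted": True},
        {"kind": "feature", "points": 2, "accepted": True},
        {"kind": "bug",     "points": 0, "accepted": True},   # earns nothing
        {"kind": "feature", "points": 5, "accepted": False},  # not accepted yet
    ]

    # Time sunk into bug-fixing therefore shows up as a visible
    # drop in velocity, which is the quality feedback loop.
    velocity = sum(s["points"] for s in stories
                   if s["kind"] == "feature" and s["accepted"])
    print(velocity)  # 5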


The Nagappan study is one of the few that at least tried to do a decent job, but there ought to be some sort of magic auto-responder every time someone cites it that points out that none of the teams they studied was actually doing "pure TDD" as originally advocated by those who coined the term, so the baseline for all those comparative figures isn't entirely clear.

It's also pertinent that it always seems to be that same one report from nearly a decade ago, studying just a few development groups, that gets cited whenever anyone asks for hard data about the effectiveness of TDD. If TDD is as necessary as its more fervent advocates claim, then surely after all this time we should have a vast body of evidence where real world development teams have tried it and found it clearly superior to what they were doing before, not just a few isolated examples.


> none of the teams they studied was actually doing "pure TDD"

And still showed dramatic improvements in defect prevention.

> surely after all this time we should have a vast body of evidence where real world development teams have tried it and found it clearly superior to what they were doing before, not just a few isolated examples.

The problem is funding. Nobody wants to put their project up for it.

Oh, except at Pivotal, I guess. We've only done a few hundred projects with TDD over several decades. I guess it'll never pan out.


And still showed dramatic improvements in defect prevention.

Right, but because neither the near-TDD process being tested nor the alternative was fully controlled, we can't tell from the cases studied whether it was the full TDD process that would make a difference, or just writing more unit tests, writing those tests first, writing them at all, or various other changes in process that accompanied the TDD-like shifts.

Oh, except at Pivotal, I guess. We've only done a few hundred projects with TDD over several decades. I guess it'll never pan out.

Good for you. Thousands of other organisations with millions of developers working for them have done thousands of times as many projects as you have without TDD over those same decades, and plenty of them panned out too. Unfortunately, none of those anecdotes in isolation tell us anything very useful about whether things would have panned out better following some other development process.


> whether it was the full TDD process that would make a difference, or just writing more unit tests, writing those tests first, writing them at all, or various other changes in process that accompanied the TDD-like shifts.

While it is useful to tease out the contributory causes for why adopting even a half-baked TDD had such a powerful effect, in the meantime, the fact is that even a half-baked TDD had a powerful effect.

Here's what we know: under varying conditions, in multiple teams, with multiple levels of experience, in various problem domains, in multiple languages, in multiple companies in multiple locations, even a half-baked TDD had a powerful effect.

Small sample sizes suck. But when you have a small sample and a very large effect, then the null hypothesis is under serious pressure.

> Thousands of other organisations with millions of developers working for them have done thousands of times as many projects as you have without TDD over those same decades, and plenty of them panned out too.

The point is not "did they pan out?" (modulo the usually utterly abysmal "success" rate in our industry). The point is "could they have panned out better?"

> Unfortunately, none of those anecdotes in isolation tell us anything very useful

I present Pivotal as a falsification of the argument that "true" TDD hasn't been tried for a long time in commercial settings, with real clients, with real projects, with real consequences.

I did so because you said "we should have a vast body of evidence where real world development teams have tried it and found it clearly superior to what they were doing before, not just a few isolated examples."

Then you shifted the goalposts.

Pivotal is probably the most doctrinaire TDD company in the world. If you can find another company that has done TDD for as long, and as consistently, on as many projects, with as many different engineers, but showed a consistent failure rate, I'd love to hear about it.

Your argument is "if it's so good, why hasn't everyone switched?"

Because people are people. TDD is hard and an unfamiliar practice to almost everyone. It requires someone to teach it, which makes it frustrating to the autodidacts who fill the industry.

You might as well ask why we haven't all switched to Dvorak keyboards, why VHS beat Betamax, why we were stuck at 1080p screens for a decade, why we use enriched-uranium pressurised-water reactors, why x86 beat every other ISA for decades, and on and on and on. That a practice or technology is superior on the metrics we are "supposed" to care about is no guarantor of success. Contrariwise, an argument from popularity proves or disproves nothing about the actual empirical properties of a practice.


While it is useful to tease out the contributory causes for why adopting even a half-baked TDD had such a powerful effect, in the meantime, the fact is that even a half-baked TDD had a powerful effect.

You've invented this term "half-baked TDD", but that seems a little unfair. My point was that the groups in the Nagappan study were doing significantly more than just TDD. For example, they also had varying levels of dedicated design activities in addition to anything test-driven they were doing. It also wasn't clear to what extent some of them were doing similar kinds of unit testing in the alternative scenarios from the study. So you have neither a clear baseline nor a clear change.

For example, one alternative possibility that still seems reasonably consistent with the evidence in that study is that unit testing is good at improving quality, that TDD promotes writing more tests, and that any design weaknesses that a TDD style might encourage in isolation were mitigated by the separate design activities those groups were doing beyond the basic fail-pass-refactor cycle required for TDD.

The point is "could they have panned out better?"

Erm... Yes. I just made exactly that point, in my very last post, the one you were replying to.

I present Pivotal as a falsification of the argument that "true" TDD hasn't been tried for a long time in commercial settings, with real clients, with real projects, with real consequences.

Who was making that argument? I certainly wasn't.

Your argument is "if it's so good, why hasn't everyone switched?"

No, it isn't. Let's be clear about this.

Prominent TDD advocates, Bob Martin among them, claim quite unambiguously that TDD is essential to writing good software, even using patronising and insulting language like "unprofessional" to describe anyone who doesn't do it.

If that were actually true, if TDD is inherently superior to any other development process and anyone not following it is actively doing programming wrong, then the results achieved by organisations using TDD should be clearly and consistently superior to the results achieved by those not using TDD, other factors being reasonably equivalent.

I claim that this is not the case, and I cite as evidence the simple fact that after nearly two decades TDD still represents a tiny part of the industry practice. This is an industry that over the same time period has brought numerous minor programming languages to prominence and mainstream acceptance; brought many ideas from fields like functional programming and distributed systems from niche applications or academic studies into the mainstream along the way; shifted a large proportion of mainstream development from the previous desktop/server model to web apps, cloud hosting, mobile development, and much more sophisticated embedded systems; made dramatic shifts in development processes such as the migration to a DevOps style of integration and deployment in many cases; and seen the rise of Open Source and more generally of collaborative development from a fun pastime for geeks to a strong influence behind much of the software we run every day.

Given that, I'm sorry but I find it patently absurd to argue that the only reasons hardly anyone is doing TDD, even though it is so inherently superior in both quality of results and cost effectiveness, are that it is hard or unfamiliar. Many of those other changes I mentioned above have been adopted across large parts of the industry within much less time, even though the ideas were completely new and/or required completely different mindsets and understanding.

If TDD really were essential to get good results and so clearly superior to other development processes as the evangelists frequently claim, then as I said before, by now we should see a vast pile of evidence, not just the same study with a sample size of four being cited nearly a decade later, and not just the occasional business -- particularly one that by your own admission is "probably the most doctrinaire TDD company in the world" -- that has used TDD and not failed. Many of us worked on software projects that have not failed. Not failing is table stakes for this debate.


> You've invented this term "half-baked TDD", but that seems a little unfair.

I was rolling with your characterisation that the study wasn't about "real" TDD.

> Prominent TDD advocates, Bob Martin among them, claim quite unambiguously that TDD is essential to writing good software, even using patronising and insulting language like "unprofessional" to describe anyone who doesn't do it.

I personally find Bob Martin quite infuriating.

Doubly so, because I am apparently being grouped with him.

> Given that, I'm sorry but I find it patently absurd to argue that the only reasons hardly anyone is doing TDD, even though it is so inherently superior in both quality of results and cost effectiveness, are that it is hard or unfamiliar.

My actual argument is that TDD is a practice that is hard to learn alone. Every anecdote I read about someone trying and rejecting TDD is an individual trying it by themselves.

> Many of us worked on software projects that have not failed. Not failing is table stakes for this debate.

Reducing defects found in production by 40-90% on a first encounter with TDD is more than table stakes. Especially considering how many projects utterly fail.

Consider for contrast Fagan-style code inspections. These too boast studies with ~90% bug yields. I don't see many people doing them.

Or formal methods. Again, claims of remarkable bug prevention outcomes on very challenging projects, for long spans of time. Yet it hasn't swept the industry.

Some practices are, frankly, harder to learn than others. That the industry is quicker to adopt more easily-adopted practices says nothing else about the practices.

We clearly aren't going to agree.

Edit: one more thing. I was struck by your point that people only ever cite the one paper. So I began looking for reviews.

Here are two recent ones of interest:

"The effects of test driven development on internal quality, external quality and productivity: A systematic review"

http://www.sciencedirect.com/science/article/pii/S0950584916...

and

"Considering rigor and relevance when evaluating test driven development: A systematic review"

http://www.sciencedirect.com/science/article/pii/S0950584914...

This second one in particular is of interest: the authors include Munir, who was an author of early research showing equivocal results for TDD.

Unfortunately, both are behind paywalls, so a closer reading may weaken the fairly strong statements in the abstracts.


My actual argument is that TDD is a practice that is hard to learn alone. Every anecdote I read about someone trying and rejecting TDD is an individual trying it by themselves.

In itself this is a fair point, but I think this kind of argument only stands up for so long. The same could be said of previously relatively obscure programming styles like functional programming, but they have slowly worked their way into the mainstream as more people have learned them. The same could be said of the modern emphasis on DevOps, but again knowledge and tooling for that have evolved rapidly and gained widespread acceptance in an industry where they were mostly alien just a few years ago.

Consider for contrast Fagan-style code inspections. These too boast studies with ~90% bug yields. I don't see many people doing them.

Fagan-style is too heavyweight to be practical in most software development organisations, and rightly meets resistance as such. However, this is an area where I have considerable personal experience, and I can tell you there are a lot of places that have successfully implemented lighter weight code reviews and/or broader technical reviews of project assets, with very favourable results. Even major Open Source projects typically have some level of mandatory review and often super-review today before new code is allowed into the master branch. Almost every project that is serious about software quality has at least some form of code review process today.

Or formal methods. Again, claims of remarkable bug prevention outcomes on very challenging projects, for long spans of time. Yet it hasn't swept the industry.

Formal methods are too expensive for most projects with today's techniques. They have their place, and they can achieve excellent results in the right context. I'm bullish about the future of this field, not because I expect it to take over completely any time soon, but because I expect that some of its ideas will drift into the mainstream and become common practice as they become incorporated into our languages and tools, just as today strong, static type systems can eliminate entire classes of programmer error that are possible in more dynamic environments. However, for now the cost of heavyweight formal methods is so high that you really are into the territory where alternative engineering solutions involving completely redundant systems and the like can actually be more cost-effective.

I was struck by your point that people only ever cite the one paper. So I began looking for reviews.

I've only read one of those (the Munir one) but I'm afraid you might be disappointed. For example, of the 41 (mostly) primary sources they considered, just 9 were in their high rigour and high relevance quadrant. Of those, they report that 7 did conclude that the external quality of the TDD-based development was significantly better (one of the 7 being the Nagappan paper).

However, when you look at the primary sources, you find that like Nagappan, often what they were looking at wasn't really TDD either. For example, one was actually about moving away from TDD at a class/method level and more towards testing at a higher level with components, and it was the latter that gave the better results.

I might also challenge the classification of some of those papers as being rigorous and relevant. For example, one of the key metrics used in the Slyngstad case study is defects per SLOC, which in itself is questionable. The case study compared several releases of the same project, between which the number of SLOC varied widely (notably changing quite dramatically at the same release the TDD was introduced) but in all cases was quite small by professional development standards (only a few thousand lines). And then the paper does some extremely dubious arithmetic to reach its headline statistic of TDD reducing the mean defect density by around 35%, glossing over things like a sharp rise in the defect density in the release when TDD was introduced and the fact that the average for the test-last releases was completely dominated by a much worse score for the very first release.

In at least one case, the Siniaalto paper, the survey appears to have almost completely reversed the position of the original paper, perhaps as a result of scanning for key words and phrases a little too loosely and failing to notice that the paper was actually disputing some of those claims rather than supporting them.

Overall, it's still much the same story here: some of the generalisations being presented in the summaries aren't necessarily supported by the primary data when you look at the details. There are lots of examples of the understandable but still real distortions that these kinds of surveys always seem to show up.

So while I appreciate the interesting discussion, I'm afraid we might still have to agree to disagree on this one. I'm not saying TDD doesn't or can't work for the right team in the right context, but the idea that it is innately superior to other development methods in general and the evidence typically cited to support such a claim just don't stand up to scrutiny.


Are you a researcher or a practitioner? The last half of your answer was much more interesting than the slogans at twenty paces we exchanged in the early part of the discussion.


I'd say I'm a practitioner, but one who has been around the block a few times and perhaps done more research than most along the way.

Once upon a time I did spend several years doing fairly serious investigations into ways to improve software development processes and what evidence was out there. The majority of that work wasn't primary research, but it was fascinating and sometimes enlightening to separate advocacy from evidence, and I suppose I've maintained the habit ever since.

I find some of the ideas popularised by the Agile movement particularly interesting. Often there is decent evidence of effectiveness to some degree or in some context, the kernel of a good idea, if you like. Unfortunately, there is also the whole dogmatic advocacy thing, where evangelists extrapolate beyond the evidence and the benefits get overstated.

Just to be clear, I'm not suggesting that you've been doing this in our discussions here. I'm happy that TDD seems to work for your organisation, I've no reason to doubt that you find it effective, and if you have any write-ups of what you've found does or doesn't work well then I'd be happy to read about it.

However, I've mentored more than one junior developer who really has told me point blank that we were doing software development wrong just because we weren't following the gospel according to Bob Martin, Joel Spolsky, or whoever it is this week. That gets old, so I tend to comment when discussions get into evidence-based debate, in the hope that it will point others towards information that took me a long time to find and reconcile.


I hear you. I often feel the same way.

Most of what coalesced into agile in the late 90s was already "in the air", just taken further and tied together.

For example, it's normal now to have CI/CD.

In 1996, McConnell listed "daily build and smoke test" as a best practice, describing it as the "heartbeat" of a project. Without it, you're dead. It didn't have a sexy name, and it was slow and fragile, but the concept was there.

Or, variously, sprints or iterations. Spiral models existed before Scrum and XP became the talk of the town; it's just that nobody tried them.

This was a good discussion. I continue to disagree with you about the parallels you drew and what I see as a line of argument that adoption of a practice is commensurate solely with its value and effectiveness.

However I now appreciate why you were forceful. I fit a pattern you recognise.


> As usual my advice on this is: look at the people who build things you find highly impressive, and study how they did it. This is much more fruitful than reading the output of people who want to spend all day telling you how to program (which leaves very little time for them to build software that is impressive, i.e. they never even test their own ideas!)

"Uncle Bob" Martin has built a lot of impressive things. Most notably FitNesse [1] the service level test framework. He tests his ideas out and is a great teacher.

I've been doing TDD for years thanks to Uncle Bob, Martin Fowler, and Roy Osherove (who have all built impressive things) and I've leveled up as a result. In fact my entire team has leveled up with this simple discipline.

I've yet to meet an anti-TDD zealot who has actually spent a month developing the art of TDD. The detractors tend to be people who write really brittle tests that are a pain to maintain.

[1] - http://FitNesse.org


> I've yet to meet an anti-TDD zealot who has actually spent a month developing the art of TDD. The detractors tend to be people who write really brittle tests that are a pain to maintain.

OK, sure. I'm basically pro-TDD, seeing as how everyone's taking sides. Yet there's a problem here, which is that (as Martin says here) bad testing is basically a design problem. To write better tests, learn better software design - which is to say, you don't get there (or at least not most efficiently) by doing more TDD.

This stuff is not in the brochure. And it should be, because without it the claims made by TDD evangelists are somewhat misleading. No, it will not make you a better engineer on its own, at all. No, it is not a substitute for getting your whiteboard marker out and drawing a few boxes, or thorough code reviews, or whatever else.
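
To make that concrete, here's a hypothetical contrast (the code under test is invented) between a brittle test that pins down implementation and one that pins down behaviour:

    import unittest
    from unittest import mock

    # Hypothetical code under test.
    def _lookup_rate(currency):
        return {"USD": 1.0, "EUR": 1.1}[currency]

    def convert(amount, currency):
        return amount * _lookup_rate(currency)

    class BrittleTest(unittest.TestCase):
        # Pins down HOW convert works: breaks if the lookup is inlined
        # or renamed, even though behaviour is unchanged.
        def test_calls_lookup(self):
            with mock.patch(__name__ + "._lookup_rate", return_value=1.1) as spy:
                convert(100, "EUR")
                spy.assert_called_once_with("EUR")

    class BehaviourTest(unittest.TestCase):
        # Pins down WHAT convert does: survives internal refactoring.
        def test_converts_euros(self):
            self.assertAlmostEqual(convert(100, "EUR"), 110.0)

    if __name__ == "__main__":
        unittest.main()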

Sandi Metz makes this point in Practical Object Oriented Design in Ruby[0]; and the chapter on testing is right at the end of that book, presumably for the same reasons (get the design sense first, then you'll get a sane red-green-refactor thing going).

Note that I'm not talking about Ian Sommerville here - I'd never heard of him until this whole contretemps, but I'm given to understand he knows a thing or two about design. Perhaps if he really stuck at TDD, Bob Martin's favourite bit would flip[1]. For less experienced programmers, who have not yet felt the pain of maintaining a ball-of-mud and learned a thing or two about separating concerns, TDD is not going to help without ongoing education about design.

[0] http://www.poodr.com/ . The design principles are different outside of the OO world, of course; but not the principle of design!

[1] http://blog.8thlight.com/uncle-bob/2012/01/11/Flipping-the-B...


Huh. I've seen TDD help people improve their design skills. It certainly has helped me.

It forces a very short feedback loop between acts of design and experiencing the design as a consumer. That gives people immediate feedback on bad design choices, giving them opportunities to see the problems they're creating.

It also forces people to think immediately about the consumer perspective of a design rather than the implementation perspective. Instead of mentally being inside the objects and methods they're building, they have to start thinking about them from the outside.

I agree it's not a substitute for getting up and going to a whiteboard. But, then, going to a whiteboard is not a substitute for TDD. They're mainly working at different levels of design; when I'm at a whiteboard we're generally talking about the relationship of relatively large pieces of a system. Whereas during TDD you're mainly confronting the fine details of design.


> I've yet to meet an anti-TDD zealot who has actually spent a month developing the art of TDD. The detractors tend to be people who write really brittle tests that are a pain to maintain.

I guess I loosely fall into the category of "anti-TDD". I don't generally go around badmouthing it, but I'm not really sold on the idea that it would save me significant time or effort. I certainly encounter things that TDD would likely catch earlier, but those are typically easy fixes and caught in QA or beta testing. The majority of my bugfixing hours are eaten up by things for which I don't even know what it would mean to write tests, such as:

- reproducing rare hardware-involved errors (often race conditions, often based on the variability of random-ish real-world events) in a way that allows getting relevant information logged/dumped

- clarifying product definition/spec, usually to resolve some inconsistency where the behavior isn't invalid per se but implicitly conflicts with some other requirement (i.e. we basically decide which behavior to call a bug and change that one)

- reverse-engineering undocumented design decisions of some third-party product that our product is expected to work with, because we foolishly believed that there was a standard (overlaps somewhat with #1)

So while I don't claim that TDD has no value, I admit to rolling my eyes a bit at the more fervent evangelists.


I've been using FitNesse heavily at work for the past few years, and I've been surprised by how buggy we've found it.

We have over 30,000 tests and we've had to rewrite part of it because it memory-leaked so badly. Unfortunately the company has strict rules against contributing back to Open Source projects.


Relevant: https://www.youtube.com/watch?v=IRTfhkiAqPw

Specifically the Uncle Bob example of OOP run amok.


I just came to post something similar. Typical "Enterprise Java", much like Jenkins.

Free but very buggy.


"I've yet to meet an anti-TDD zealot who has actually spent a month developing the art of TDD. The detractors tend to be people who write really brittle tests that are a pain to maintain."

I'm not an anti-TDD zealot, but I don't personally do it, nor require any of my developers to do it (but they can, if it helps them).

As with everything in software, some things work well for some people, and not so much for others.

Instead of TDD, HDD works best for me. https://www.youtube.com/watch?v=f84n5oFoZBc


Rich also said, "I think we're in this world I'd like to call Guard Rail Programming... 'I can make changes because I have tests!' Who does that? Who drives their car around, banging against the guard rails? Do the guard rails help you get to where you want to go?"

http://patrick.lioi.net/2011/11/23/guard-rail-programming/

TDD has always seemed to me to be another sketchy, cult-like following in the Agile ecosystem. I've never seen anybody be productive at it, even when they did find some kind of solution to a problem - long after they would have found it doing normal development.

But what do I know, I'm not even a fan of unit tests. I'd rather go to the real-deal and run some acceptance/integration test at 2 in the morning.


There are people who go around driving their car into guardrails. They are the people who build cars. Because if you don't test failure modes, you don't know that your product performs to spec under them.


I think what Rich Hickey was saying is that if you change something and the tests still pass, you think it's all good. It's like a programmer who thinks that if it compiles, it's shippable!

Another thing I don't like about tests is the categories of tests that should be automated by the compiler, a tracer, or similar tools.

Like in Python, where you type-check a lot in unit tests; in a statically typed language the compiler does your type tests for you already, so you don't have to write that kind of test any more.
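
For instance (a hypothetical example):

    import unittest

    # Tests a compiler or type checker would make redundant.
    def mean(values):
        return sum(values) / len(values)

    class TypeTests(unittest.TestCase):
        def test_returns_float(self):
            # A static signature like mean(values: list[int]) -> float
            # would already guarantee this.
            self.assertIsInstance(mean([1, 2, 3]), float)

        def test_rejects_none(self):
            # In a statically typed language this wouldn't even compile.
            with self.assertRaises(TypeError):
                mean(None)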

There are also tests I call 'breakpoint equivalent testing', where you put expectations that certain methods get called into a test. I wish I could just set some breakpoints in an IDE and have a recorder automatically write the method-call expectation code for me.
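
Something like this hypothetical unittest.mock sketch is what I mean by a hand-written call expectation:

    import unittest
    from unittest import mock

    class Mailer:
        def send(self, to, body):
            ...  # would talk to an SMTP server in real life

    def notify(mailer, user):
        mailer.send(user, "hello")

    class NotifyTest(unittest.TestCase):
        def test_sends_mail(self):
            mailer = mock.Mock(spec=Mailer)
            notify(mailer, "alice@example.com")
            # Asserts that the method got called, much as you would
            # verify by stepping through in a debugger.
            mailer.send.assert_called_once_with("alice@example.com", "hello")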


He seemed to me to be criticizing people who want to make changes to the code without reasoning about the system, assuming that if they mess something up the existing tests will catch it and tell them. I think he is criticizing a real problem, but the metaphor is all messed up and hence misleading.

Writing software is not like driving a car. It is like engineering a car. Making a change on the assumption that tests will tell you if there is a problem is like deciding to move the gas tank and saying, "If that turns out to be an issue, QA will tell us when they do their collision tests." (Sort of like that. Obviously less expensive.) When you are making a change, you need to think about whether that change will require any new tests and you need to be thinking about how it integrates into the system as a whole. Maybe moving the gas tank means a new kind of collision test needs to be added to the repertoire.

Because the metaphor was bad, the GP seemed to be taking it as a criticism of testing, which isn't reasonable. But I may have been mistaken about his intent.

On the other hand, if the change you are making is just altering the shape of the bumper for cosmetic reasons, it seems pretty reasonable to assume that QA will tell you if it adversely impacts safety in ways that were not obvious.

I agree about static types. Static types are basically a way of having the compiler automatically write and run whole classes of unit tests.


> He seemed to me to be criticizing people who want to make changes to the code without reasoning about the system, assuming that if they mess something up the existing tests will catch it and tell them.

I think where tests really help is when it's not possible to sensibly reason about the system because [looks at current codebase] e.g. it's a horrifying interwoven ball of mud that's never heard the word "no" or met a Perl module or methodology it didn't want to halfheartedly adopt in the last 10+ years.


I don't necessarily think it's even TDD vs. not... I'm not big on TDD, but I always try to design my code so that it is very modular, and as a side effect easier to test. At least beyond some cross-cutting edge cases, which are usually easy enough to isolate.

The act of TDD, or test compliance, only ensures that you've looked at everything twice. It doesn't guarantee quality. Writing modular code means less friction over time, and TDD encourages this, but it's not necessarily a requirement imho.


> I've yet to meet an anti-TDD zealot who has actually spent a month developing the art of TDD.

Do you actually apply that reasoning to everything you do? That's a highly unproductive approach.

At least one paragraph from this essay by PG [1] is very relevant:

Most people have learned to do a similar sort of filtering on new things they hear about. They don't even start paying attention until they've heard about something ten times. They're perfectly justified: the majority of hot new whatevers do turn out to be a waste of time, and eventually go away. By delaying learning VRML, I avoided having to learn it at all.

[1] http://paulgraham.com/popular.html


> Most people have learned to do a similar sort of filtering on new things they hear about. They don't even start paying attention until they've heard about something ten times. They're perfectly justified: the majority of hot new whatevers do turn out to be a waste of time, and eventually go away. By delaying learning VRML, I avoided having to learn it at all.

Great quote, but VRML impacts the actual work of a programmer over his career by about 1%. TDD, on the other hand, can not only impact your work but take you to another level in nearly every task.

I haven't spent time learning Rust or Go yet. My clojure and haskell skills are subpar. I can't know everything. But TDD is a practice that applies regardless of language or framework. Languages come and go, but practices are what help you really hone your craft.


Back when TDD started, it was often referred to as Test Driven DESIGN by the same people you cite.

I do it, and I find it valuable. But I think if you're not using it as a design tool, you're missing most of the benefit.

You don't do TDD (IMO) to prove correctness. You do it to achieve a level of composition and design that makes later requirements changes and refactoring less painful. Proving that your stuff actually works is a convenient side effect. But not the primary goal because tests have costs.

I haven't followed Osherove for years, so maybe his thinking has changed, but it used to be common to think of chasing high test coverage numbers as an anti-pattern. 70 to 80% is the sweet spot. You're designing. You're being productive. That last 10, 20 or 30% of coverage gets exponentially more expensive both to develop and maintain, and it provides no additional design benefit. It's only testing for its own sake.
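
Tooling-wise, that philosophy amounts to setting the bar at the sweet spot and no higher, e.g. with coverage.py (the threshold is the point here, not the particular tool):

    # .coveragerc -- hypothetical config: fail the build below the
    # ~70% sweet spot rather than chasing 100%.
    [report]
    fail_under = 70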

BDD and the culture that sprang up around it (at least in the Ruby community) was such a disappointment after having done TDD for a few years previous. It's like every known anti-pattern was adopted as a core deliverable and the entire point of the exercise (IMO, solving the "blank page" problem in building systems; Design) was forgotten entirely.

It's my experience that Developers who've adopted this style of "TDD" are cargo culting, and incredibly difficult to work with. Reason didn't get them into the belief. They can't be reasoned out of it. It's religion by that point.

Now that I'm in Scala, and TDD my own code, I'm a very happy developer. I never suffer blank-page issues. I never bother writing tests after the fact. I don't often encounter bugs in such code, and they're almost never fundamental design issues when I do. And my coverage proudly hovers around 70%.

That is how you TDD right IMO. Follow the lessons learned a decade or more ago and avoid the anti-patterns.

Chasing test coverage is not only a good way to light piles of somebody else's money on fire by wasting time for little benefit, but it's actively harmful to the quality of your code base over the long term. You're disincentivized to correct design mistakes, and you're encouraged to over design to enable a level of testing granularity that never paid any concrete benefits for itself in the first place.

I'm a solid proponent of TDD then. But it's like saying water is good and you can prove it, when most people are trying to sell you "mineral rich" iron-tainted industrial runoff. TDD absolutely can increase costs and complexity in the wrong hands.

The moral of my rambling story is to be Agile I guess. In the original sense, not the consultation services one. To the inexperienced developer I'd say: Try to solve problems, not implement solutions. And above all, never cargo cult. When someone tells you you need 100% coverage, ask them what benefit it provides. When someone tells you to "measure everything", ask them if they've measured the business value of measuring everything and if it outweighed the opportunity cost and dollars sunk into the effort. Be a constant skeptic, because snake oil is everywhere.

Sorry for the diatribe. You took me down memory lane and I guess I feel pretty strongly about TDD.


> BDD and the culture that sprang up around it (at least in the Ruby community) was such a disappointment after having done TDD for a few years previous. It's like every known anti-pattern was adopted as a core deliverable and the entire point of the exercise (IMO, solving the "blank page" problem in building systems; Design) was forgotten entirely.

Could you elaborate?

Speaking for myself, I thought that the main useful thing to come out of BDD was working outside-in and the given/when/then convention.
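
e.g. the convention maps onto a plain unit test without any BDD framework at all (a hypothetical example):

    import unittest

    class WithdrawTest(unittest.TestCase):
        def test_withdrawal_reduces_balance(self):
            # Given an account with a balance of 100
            account = {"balance": 100}
            # When 30 is withdrawn
            account["balance"] -= 30
            # Then the balance is 70
            self.assertEqual(account["balance"], 70)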

But I certainly didn't get it on my first pass. Looked at Cucumber, soldiered mightily, gave up in disgust.


I like your usual advice! I imagine there are a lot of very ingenious things hidden within the code of people who have never written an article. Or even just contradictions to articles that they may have written.

I find a lot of articles tend to just build upon common assumptions with a unique twist. Makes sense though, people want to read things that validate their assumptions. People want to write things that people want to read.


I think he did state that the benefit is also much higher, but it was a bit subtle. He argues that the cost is higher than expected because you have to design the tests and design the system to be testable. He then states that this isn't wasted effort because, "Something that is hard to test is badly designed." And then hammers the point home with the pacemaker story and a bit of ranting.

I agree with him. It took me a few years to realize that if code is hard to test, it is hard to debug. And if it is hard to debug, it is hard to reason about. And if it is hard to reason about, then it is likely to have hidden bugs.


> But he doesn't. Because the benefits aren't higher, in fact they are lower (as is the case with every well-intended scheme in the history of anything.)

The benefits of X are always lower than any naive adopter thinks? That's quite a claim. Would you care to justify that? Heck, don't even worry about the general case, just explain it for this one.

I've been doing TDD for many years at this point. Right now I'm working on a hobby project where I was intentionally sloppy about testing, just to see how little I could get away with. The answer is: in the long term, I can get away with surprisingly little. But there's a lot of subtlety in exactly how and when the benefits come, and when exactly sloppiness is ok.

Given that I'm still learning about the benefits of TDD despite many years of practice and thought, I'd say there's no reason to think a novice would have a particularly good idea. Maybe their expectations are unrealistically low, maybe high. But more likely it's both and neither. Maybe one's very notion of "benefit" changes over time. TDD has certainly done that for me.


Most software out there has been done without TDD.


Sure, but most software is also pretty bad. Most software projects aren't considered successful. [1] And 99% of people advocating TDD have written software without TDD, so they've tried it both ways.

Heck, when anesthesia was introduced, people thought it was a needless luxury and an interference with the pain God intended us to have. [2] And they were right in the way you were right: billions had just lived with the pain, so anesthesia wasn't really necessary. But something can be unnecessary and still be a good idea.

[1] https://www.google.com/search?site=&tbm=isch&q=software+proj...

[2] http://www.newyorker.com/magazine/2013/07/29/slow-ideas


Sure, but most software is also pretty bad.

That seems rather uncharitable.

Billions of people travel in software-controlled vehicles every day, and most of them will get to their destination safely and reasonably efficiently.

When I check my bank account or credit card statement, it is extremely unlikely that anything on it is incorrect, even though I may have been transacting with other parties all around the world, with several completely different organisations involved along the way.

When I make my dinner, it is a safe bet that all my kitchen equipment will function properly, even though everything from my microwave to my refrigerator is software controlled.

If I pick up my wireless landline handset and place a call, chances are nothing short of severe interference is going to prevent me from connecting to the phone of the person I want to speak to, again even though this might involve intricate negotiations between numerous devices and even different organisations in different countries.

The world is full of software that is actually pretty good considering it was made by fallible humans. We don't notice a lot of the time, because the things depending on that software just work.


I guess if your criterion for "good software" is "mostly works" then that's fine. Mine is better.

But I have seen bank code. It's terrible. I've seen credit card handling code. It's terrible too. Phones used to be reliable devices, but the average uptime of my phones has fallen to something like 3 days. Perhaps 3x/day my phone tells me that some app has stopped and that it would like to send a bug report. The code that runs cars is terrible too: http://www.safetyresearch.net/blog/articles/toyota-unintende...

Most companies I visit have bug databases with hundreds or thousands of open problems. And those are just the ones they know about, the ones where people took the time to report the bug. Things are terrible and people are just used to it. In the aerospace industry, it's called "normalization of deviance", and it's what destroyed the Challenger.

None of this is necessary. It's been at least 15 years that Kent Beck has been talking about the various quality practices that drop bug rates dramatically, including test-driven development, pair programming, and continuous integration. It has been at least 10 years that Martin Fowler has been reporting on teams that have bug rates below 1 per team member per month.

I'm certainly willing to be charitable about the people. A lot of organizations making terrible software are staffed with perfectly nice people who mean well. But I decline to be charitable about the software. These are commercial products, not ash-trays made by second-graders at day camp. The software should stand and fail on its own, with no charity needed to soften the blows.


As a slight digression, I think "mostly works" is a reasonable benchmark for a lot of software, because as much as you or I may dislike it, evidently the market won't pay for something qualitatively better. Your phone crashes every few days because most people are willing to accept junkware that crashes every few days even though it's part of one of the most expensive things they'll buy this year. I can't imagine why so many people would think that's acceptable, other than the industry successfully convincing them that it's the best they can reasonably expect, and personally I use a feature phone that doesn't suffer this sort of madness anyway, but sadly it's clear that I'm in a minority here.

If we're talking about software where reliability actually matters, then I agree that too many projects fall far short of ideal standards. I've heard all the same horror stories that probably you have. The auto industry, in particular, is moderately terrifying. But on critical projects, the kind of heavy reliance on unit tests and ad-hoc specification that is common practice and even considered desirable on a lot of Agile projects is also inappropriate, or at least far from sufficient on its own. So I'm assuming we're not really talking about the software that controls a pacemaker, the emergency shutdown systems for a nuclear reactor, or the safeguards to prevent configuring points so trains moving in opposite directions enter the same section of the track.

For most software, though, the programs that help us to do things day-to-day but if they fail once in a while under some awkward conditions it's not the end of the world, I think it's pretty clear that the industry produces a lot of value and the users would miss it if it were gone. Is it perfect? Of course not. There's plenty of room for improvement. But I think the argument that most software is bad is hyperbole.


> evidently the market won't pay for something qualitatively better

Your assumption here is that quality is more expensive. In my experience, it's substantially cheaper. I've seen "enterprise" shops take reasonably simple apps and blow them up into things requiring large teams and enormous amounts of hardware. And then spend 70% of their time debugging, because they're going too fast to do anything right. This is endemic; some friends of mine do ops consulting, and even at the heart of the tech boom they see clusterfuck after clusterfuck. The apps all mostly work, or the companies would be out of business. But we can do better than just failing to fail.

As an analogy, look at the US car industry in the 70s and 80s. They were producing terrible stuff. Toyota came along and demonstrated you could make better cars for less money. The same opportunity is available here in software. Consider, e.g., WhatsApp, which was serving nearly a billion people on 8 platforms with a team of 50 engineers.

It's an especially appropriate analogy in that a lot of the most effective process improvements come from applying TPS-derived principles to software. See, e.g., Mary Poppendieck's work.

> But I think the argument that most software is bad is hyperbole.

Only if you define bad to mean "worse than average". But I mean it quite literally.

Bug rates, development cost, development cadence, WIP, and project failure rates are all absurdly high compared to well-run projects. This has been true for decades. By "bad" I mean "well below what teams could achieve if they applied best practices".


Your assumption here is that quality is more expensive.

To some extent, I think it is. More specifically, I think there is a balance between spending more on preventing defects up-front and not needing to spend as much on dealing with those defects later, which dominates the issue up to a certain point, and then beyond that point you have to start considering external costs as the dominant factor.

If you have a project that is made of poorly designed spaghetti, doesn't have any sort of serious test or review processes, and is kept afloat by little more than a few hero developers, then of course you're likely to have a relatively high level of defects. Even modest improvements in the development process will likely have a very good ROI in this case. In this sort of scenario I would agree with you that improving quality may be substantially cheaper than neglecting it, because relatively easy changes in development process would probably pay for themselves in reduced maintenance costs even before considering external factors.

However, the kinds of changes that bring really dramatic improvements in quality -- the kind of thing we might hope you would use for a medical device or safety-critical transport control system -- really can significantly increase development costs. Assuming you could fix the easy problems in other ways, you're probably chasing a relatively small number of extra defects already by the time you get to using these methods. To get a big jump in quality at this point, you might need to employ very different development and/or engineering techniques, such as formal verification stages, redundant systems, or much more structured and demanding review processes, and you might need to do this all the way down your tool chain in both software and hardware terms. These measures tend to require more skills, time and/or resources, and all of those are expensive.

Now, we have to be clear on what we mean by "more expensive". So far, I've mainly been talking about the development costs here, what it takes to write and maintain the software. The point of the extreme quality approaches is usually that failures of the system may have some other cost -- in human life, perhaps, or in delaying something important by a very long time -- that is not acceptable, and so extra investment in avoiding that external cost may be justified even though it makes the development itself much more expensive.

In my experience, the development costs associated with those more extreme approaches ("extreme" is a somewhat loaded term, but I can't immediately think of a better word and I hope you understand what I mean) will be prohibitive today for non-critical software, the kind of system that doesn't have a catastrophic failure case with disproportionate external costs to consider. This is what I mean when I say the market won't pay for something qualitatively better: most people won't prefer to pay $20,000 for a word processor that essentially never crashes or corrupts data or has minor incompatibilities when loading files created using its previous version, instead of $200 for a word processor that basically does its job but might crash out every couple of months and lose the five minutes of work done since the last auto-save.

By "bad" I mean "well below what teams could achieve if they applied best practices".

OK, so if we also restrict "best practices" to "things that improve quality at any cost" then I would agree with you that most software is bad by your definition.

However, if best practices also include things like being commercially viable, then I would no longer agree with the claim that most software is bad by your definition. There certainly is plenty of bad software around, but there's also plenty of software developed in ways that already do avoid silly defects reasonably successfully. In the latter case, I come back to my argument above: because of both diminishing returns in the number of failures you might prevent and the need for more fundamental changes in the development strategy that are relatively expensive to implement if you want to significantly reduce the number of remaining defects, most projects won't be able to do these things with the tools and techniques we have available today and still remain commercially viable. I don't think it's really fair to say those projects aren't well-run just because they went with a strategy that the market would accept.


> In my experience [...] will be prohibitive today for non-critical software

And how much time have you spent practicing TDD? Have you worked on a project with 95%+ unit test coverage? Have you worked with a comprehensive test suite that runs in under 30 seconds? Have you worked in a team that practices pair programming and collective code ownership? Have you worked on a team that does continuous deployment with at least one deployment per developer per day? Have you worked on any team that has bug rates below one per developer per month?

Other people have had different experiences than you. I am one of them. I'm telling you that it's perfectly possible to do an order of magnitude better on bug rates than most teams and get a cost decrease. Plenty of other people will tell you the same. People have been writing about their experiences like this for 15 years.

At this point, I have given up expecting J Random Commenter to believe me; normalization of deviance means that most people cannot (or will not, I can't tell which) even conceive that things could be better. It's the same way that American car companies literally could not understand how Japanese manufacturers were producing radically better products at substantially lower costs. They still generally can't, because to do so would mean admitting that they've been screwing up for decades.

So if you'd like your current limitations, carry on arguing for them. But if you would like to see if something can be different, try out something like Extreme Programming.


It's regrettable how often people assume opinions different to their own must have been formed out of ignorance.

Some of the code I've written has to run in places where the cost of failure can be very high (not normally human-life high, but certainly economically prohibitive) and the process to deploy an update, if a bug does need fixing, can be measured in months, with significant costs of its own.

As an example, I wrote a program a while back that implemented somewhat complicated data processing algorithms, took a few months to develop, and has been in service for several years now. To my knowledge it has never had a bug reported against it in production, other than a small number of cases where the program met the spec but the spec turned out to be wrong.

That project was developed and tested using a variety of techniques. A sensible automated test suite was one of them. It was also built on rigorously proven mathematical foundations, among other things.

So yes, I do have experience with building very high quality software. I've made a significant part of my living doing it over the years, and in some cases I have single-handedly outperformed entire teams working for my clients' competitors at the same time. I do know the value of a good test suite, and a lot of other effective development techniques.

It would still be commercially unreasonable to spend the kind of time and money it took to develop a project at that level of robustness if the potential costs of failure were not so high.


One nuance you might have failed to capture in your characterization of the article is that a lot of the cost falls while the practitioner is still a beginner at the technique. The point is that as you gain mastery it becomes easier, thus encouraging people who are interested in the technique not to give up so fast.

In particular one side-effect of being a beginner is that one spends a lot of mental energy on the technique, leaving less room to focus on other important aspects. That gets better with time.


TDD is better than no tests most of the time, I think. So at least that's a gain altogether.


I do TDD rarely. I like it when I'm on pristine projects; I feel it works when working alone on code outside your paying job, or on a library where you don't need to do much exploratory programming.

I add tests after the fact for most of my programming, usually because that's how I feel more comfortable. Sometimes it is a net win to figure out how something works by experimenting with different ideas and getting a feel for the software you're creating, without slowing down to think about correctness every time you write something down.

TDD should not be an "always on" kind of thing. Most of what software has taught me is "use it where it's better suited", which I think is the main reason I differ with his view, which seems to be "use it always". I suspect that's what the people with those "always doing TDD" bracelets are thinking.


I've used TDD both at my paying job and as part of some kinds of exploration. I agree it's more challenging in both places.

The place where I won't use it is when I'm writing throwaway code. If I'm really being exploratory, then I just go write garbage. And when I've done enough exploration to start writing real production code, I'll start fresh with TDD.

I've tried it the other way, where I retrofit tests to existing experimental code. But because the experimental code was to help me learn something, it generally ends up being poorly designed. It's only once I understand the big picture that I know the right way to express my understanding in code. At first I didn't like throwing out the experimental code, but now I prefer it, in that it frees me to be entirely experimental, rather than writing something that's half experiment, half production grade.


Yeah. To be honest the only unit tests I ever wrote... I wrote after I built something that I understood through and through and is likely to change very little.

I really only did it cause people tend to say... "don't use open source unless it has tests". I even did all the code coverage stuff... got it to 90%+

It was fun and amusing... certainly not a validation of TDD though.


It should be noted that TDD specifically means writing tests before you write application code.

For example, at our shop we don't require devs to do TDD; however, we do require test coverage.

I don't think anyone would argue that writing tests at all is a bad idea, I think the debate is whether or not your tests should drive your application code.


I don't think anyone would argue that writing tests at all is a bad idea

If we're talking specifically about the kind of low-level unit tests required for TDD, that is still an assumption, though not an implausible one.

Even if unit testing is effective by some measure, it takes a significant investment of time to create and run those tests. TDD also constrains the software design and the development process.

Maybe we would do better to remove those constraints and instead spend that time on some other activity? Maybe some form of code review or walk-through exercise would be effective. Maybe we should be writing higher-level tests. Maybe we could formally prove some key parts of our code are correct. Learning new programming skills might stop us making some mistakes in the first place. Maybe we should even be adopting a new language or tool that would prevent some errors from being possible at all by design.
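As a concrete instance of that last option: a static type can rule out a whole class of mix-ups that unit tests would otherwise have to catch. A toy Python/mypy sketch (all names invented for illustration):

    from typing import NewType

    # Distinct nominal types over the same runtime representation.
    UserId = NewType("UserId", int)
    OrderId = NewType("OrderId", int)

    def cancel_order(order_id: OrderId) -> None:
        ...  # hypothetical domain logic

    uid = UserId(42)
    cancel_order(uid)  # a type checker such as mypy rejects this mix-up; no test needed

No test suite has to enumerate the ways ids can be confused; the checker excludes them by construction.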

There are many ways we can try to make our code more reliable, and they all have costs, and sometimes they conflict. Even if one of them is better than nothing in isolation, that doesn't necessarily make it the best possible strategy when you consider the alternatives and opportunity cost.


> As usual my advice on this is: look at the people who build things you find highly impressive, and study how they did it.

Are there any big popular open source projects that follow TDD?


Yes: Cloud Foundry. Small teams, pair programming, TDD. I believe it is now north of a million lines, mostly Go, spread across a lot of complex, interacting subsystems. Teams are located on multiple continents, working for multiple companies.

At Pivotal we typically roll out the latest release versions of Cloud Foundry into production (PWS) in less than 48 hours after a release is cut.


> But he doesn't. Because the benefits aren't higher, in fact they are lower

I am genuinely curious, what do you mean by this claim? I mean, yes, the costs are high [1], but when properly paid, do the benefits still end up smaller than without practicing TDD? Which concrete benefits do we miss when doing TDD?

Uncle Bob has stated many of the benefits of TDD elsewhere but did not mention them in this post. Many of his public talks on the subject are available online.

[1] https://news.ycombinator.com/item?id=11311646

Edit: link to my previous comment from yesterday elaborating on the costs


It is important to read the rest of the sentence in order for this comment to really make sense.

I am talking about any scheme of how to do things that is intended to provide benefit. These all start with "wouldn't it be better if X, because Y" and then a plan is made of how to bring this about.

Well, this plan is inevitably imperfect, so either you don't get all of X, or the reasons Y were not correctly understood or accounted for.

Then, there are always some extra drawbacks that creep in that negate some of the benefits. Usually these drawbacks are very subtle, and they can be hard to notice because they are not things that the plan was trying to address.

In the end, the net result is usually negative: the scheme causes more damage than it provides in benefit. But it usually takes a long time to understand this clearly, because the drawbacks can be subtle (though sometimes they aren't; with TDD, for example, consider how much extra code you are writing all the time).


If I read this sentiment correctly, it means that any attempt to improve anything ("provide benefit") usually results in a net negative. But I cannot imagine that this is the case. Can you elaborate?

For example, let's say I want to improve my car factory efficiency. I introduce a way to keep things running without delays on the production line. Would you say that this plan is futile too? Toyota might disagree.

Would you also say that accountants waste half of their time doing double-entry bookkeeping? Half of the transactions are extra in the same sense.

I mean, I understand your argument about the imperfection of any improvement plan, but it does not say much about the actual topic at hand. Should we discuss concrete trade-offs instead of general sentiments?


There is no contradiction.

Yes, I am saying that most plans on how to do things better are not right. Doing things better is often pretty hard.

But there always is some way to do better. The way you find that is you keep trying a lot of things until you build up an experience-based picture of what things are really like. As you get better at this, plans you formulate become more likely to be net-positive.

What I am saying is that TDD strikes me as a pretty terrible plan in the first place, the product of this kind of ideas-untempered-by-serious-experience.

Speaking for myself, I am pretty sure my own productivity would plummet were I to adopt TDD, and in fact I would completely lose the ability to build software as complex as I do; I would drop at least a level or two there. This does not necessarily speak to TDD's suitability for anyone else, which is why I am recommending to judge by output.


In general it seems that people who keep practicing TDD have been programming for more years than people who have not tried it, or who have abandoned the practice. I have yet to meet a TDD practitioner who started programming that way and has not considered any alternatives. The ideas were born out of bad, serious experiences with existing approaches.

As I said in the previous thread yesterday, TDD'ing requires months or years of practice to get really productive with, and has a fairly large set of prerequisites that one has to know in order to remain sane. It took me several years of experimenting (especially with different techniques of writing unit tests) before I found a way to be productive with TDD. I also drew the connection between testability and program architecture (decoupling etc.) fairly recently (some four years ago), and that was one of the last pieces of the puzzle that made everything work.

Yes, my productivity plummeted too, but the benefits were too good, and I slowly found the techniques needed to keep up with my old self in terms of produced features. I dread to think of the pieces of code that would send me deep into debugging sessions due to non-existent test coverage.


Neal Ford (Thoughtworks) said in a recent talk that he can tell which projects were written with TDD vs. non-TDD by looking at one metric. Cyclomatic Complexity.

He said that TDD code typically results in a code base having classes with an average cyclomatic complexity of 1.5 - 2. Non-TDD code typically results in cyclomatic complexity of 15 or so.
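For reference: cyclomatic complexity is roughly one plus the number of branch points in a routine. A hypothetical Python sketch of the two shapes he's describing (domain and numbers invented):

    # Non-TDD shape: one routine, interleaved decisions (complexity ~4).
    def shipping_cost(order):
        if order.total > 100 and not order.express:
            return 0.0
        if order.express:
            return 9.99
        return 4.99

    # TDD-ish shape: tiny, separately testable pieces (complexity 1-2 each).
    def free_shipping(order):
        return order.total > 100 and not order.express

    def base_rate(order):
        return 9.99 if order.express else 4.99

Real non-TDD code reaches 15 by piling many more such decisions into one method.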

I do think you would drop in productivity for a while, but you'd level up in a month or so because it'd force you to write more maintainable code.

> the product of this kind of ideas-untempered-by-serious-experience.

(chuckle) That's a pretty myopic view. I've written some pretty freaking complex software, and I've done TDD and non-TDD. My TDD code is waaaay better and much less brittle than my complex non-TDD code. I hate reopening the projects where the team didn't use TDD... it's scary to change. And I forget what I was thinking when I wrote that 250-line method with all those cute little shortcuts that I thought were so cool at the time. Now I have to spend time reading the code like a man staring at ancient hieroglyphics, trying to decipher the mind of an Egyptian pharaoh.


There are definitely some good points here, but I have a huge issue with this one:

> Something that is hard to test is badly designed.

In my experience, TDD has a natural tendency to favour decoupling at all costs, and TDD zealots will push de-coupling units as an unqualified positive. And if your metric is "good design = easily testable and low coupling" it certainly looks that way.

This mindset fails to take into consideration the fact that tightly coupled things are often simpler to understand and reason about. They're more closed to extension, sure, and harder to test in isolation, but a straightforward process that acts like a "black box" and performs a job simply in a few lines of code is often better than the class explosion that rigid TDD often encourages. And what's the value of keeping things open to extension if the only things that will reasonably extend them are tests?

A good illustration made by DHH as part of one of his anti-TDD tirades is here: https://gist.github.com/dhh/4849a20d2ba89b34b201

Anyone who can honestly say the latter example has a better design has a very different opinion on "good design" than I do.


His point, as phrased, is correct, though: something that is hard to test is badly designed. You need to be able to test your software in some way or another. Even if you've proven its correctness, you still want to put it through some basic sanity tests (as Donald Knuth pointed out).


Yes, but it misses the point of the criticism.

The issue isn't whether the final product is difficult to test. It should be roughly as easy to test the final product no matter the implementation strategy.

The problem is that TDD requires a royal-road of testing. No code can be written without tests for that specific piece of code. So you can only build the whole out of components that are testable in isolation. The claim is that this excludes architectures that don't provide such a step-by-step path of testing.

This interacts badly with the 'make everything an API' idea. You end up with these functionally small units with over-engineered APIs rather than something more complex, more efficient, and easier to refactor as a whole.


> This mindset fails to take into consideration the fact that tightly coupled things are often simpler to understand and reason about

You're right on a small scale. With teams or feature changes, tightly coupled things quickly breed incidental complexity.

I like what you said about decoupling not being an unqualified positive. It's not; it's very good, but it's costly (with a high return). It's all about trade-offs. Testing, and specifically testing first, is expensive by comparison.

If I'm working on a critical piece of my system? Yep, worth it. TDD. Do I need to refactor something so I can reduce complexity with this change? TDD. Am I writing a quick workaround for better UX that will be removed after two days? TDD isn't appropriate.

It's an essential skill to have, but not the only way to practice.


With teams or feature changes, tightly coupled things quickly breed incidental complexity.

That's a big assumption. However, even if it's correct and tighter coupling does introduce some additional complexity in any given case, if the code is also significantly easier to understand as a result then it may still be quicker and more reliable to maintain it. After all, very loose coupling also breeds a kind of incidental complexity, because something still has to connect all the components.


Interesting point. How about dependency injection for oo code and composition for functional code?
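Roughly this kind of thing, in Python (all names invented for illustration):

    # Constructor injection: the collaborator comes in through the seam,
    # so a test can hand in a fake without monkey-patching anything.
    class Signup:
        def __init__(self, mailer):
            self.mailer = mailer  # injected, not constructed inside

        def register(self, email):
            self.mailer.send(email, "Welcome!")

    class FakeMailer:
        def __init__(self):
            self.sent = []

        def send(self, to, body):
            self.sent.append((to, body))

    def test_register_sends_welcome_mail():
        mailer = FakeMailer()
        Signup(mailer).register("a@example.com")
        assert mailer.sent == [("a@example.com", "Welcome!")]

The functional flavour is the same idea with a function passed in instead of an object.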


Just the frequency of certain memes... "interfaces", "design this Design that DESIGN", "composition", "too coupled", "too decoupled", all point me in the direction of excessive use of DI, indirection, and, as you say, an explosion of tiny "single purpose" types.

I've worked on many such code bases. I don't deny the benefits in testability and extensibility, but I think the benefits have always been outweighed by the difficulty of onboarding new developers, the fact that few people manage to develop a big-picture understanding of how the whole thing fits together, and the proliferation of YAGNI-violating code that ends up being written.


It's a great point, and a hard problem. Abstracting too early or too much leads to inefficient maintenance, and not abstracting at all leads to inefficient and error-prone maintenance. I guess the trick is finding the right balance.


> In my experience, TDD has a natural tendency to favour decoupling at all costs, and TDD zealots will push de-coupling units as an unqualified positive. And if your metric is "good design = easily testable and low coupling" it certainly looks that way.

Depends on the kind of TDD. I feel like TDD as unit-test-driven design IS a waste of time (I'm talking about writing unit tests first here). However, integration testing first is useful. Let's say I'm writing a library (an HTTP router). The library will obviously have an API that will be consumed by the user, so for me testing starts with writing out how I think the API should look. For an HTTP router it would be like:

    // Illustrative sketch: Router, someController and httpClientMock are
    // stand-ins for whatever the library and its test kit would provide.
    const router = new Router();
    router.get("/{id}", someController);
    const response = httpClientMock.get("/34943");
    expect(response.body).toBe("Hello 34943");
It costs me very little to write that, provided the language has the facilities to test HTTP servers[1] and to check that the result of the operation is what I expect. I don't need to test the code that parses the path for route variables directly, yet I did write the test first.

[1]: that's often the problem: languages care about syntax but don't care about developer experience and the quality of the tools around the language. It shouldn't just be "a compiler and the rest is your problem"; testing should be an integral part of the language or its standard library. Testing SHOULD be made as easy as possible.


I agree insofar as writing that sort of test prior to writing the code that makes it pass is a good thing. I don't think that matches the popular definition of TDD though. Most TDD advocates will insist that a key facet of TDD is that NO code exists in your application without a test being written first. Given that, you're forced either into writing the highly detailed unit tests, or into having an integration suite that needs to accommodate every potential edge case of the entire system underneath it, which for anything non-trivial will be extremely time-intensive to run.


Would you have an issue with "Something that is hard to prove correct is badly designed"?


Forcing yourself not to write a line of production code without first having a test in place for it produces a very high level of coverage over all your control paths. You wind up with the tests being an ad hoc specification for the production code. Not really a design so much as documentation of the code.

This is a really irksome way for me to work, and I find that level of diligence difficult to muster. Often I will find myself writing the production code first and writing tests to cover after the fact. My coverage is probably not as high as it is when I TDD the code.

The cadence of TDD'd projects I've worked on professionally is painfully slow and predictable. If you're at a workplace that is giving you the time and resources to do it, then you're likely also pairing, which helps impatient souls like me stick to the program. My team has been blessed to have a devops guy who has a scientific mind and acts as informal agile coach while pairing - and he's probably the reason we do a good job on the TDD because he makes us all better at it.

If you're at a workplace that is encouraging TDD and pairing, you should be thankful and honestly try to do a great job at it. Your work weeks will be relaxed and you probably won't need to work much overtime. I wind up with lots of left over energy to burn on my home projects, which I appreciate.


Jobs like that are indeed amazing. Anyone have an effective strategy for evaluating if a job is like that while interviewing? Or perhaps a list of companies that operate this way? I'll start the list with Autodesk, although all projects and teams are different of course.


This is an argument against a straw man, IMO; people who have a problem with TDD don't (or shouldn't) have a problem with testing, or creating a design that is testable.

What TDD is specifically poor at is design. Test-driven design literally means your tests drive the design of the system, rather than any other consideration, like reducing API scope, reducing complexity, or reducing configurability (yes, excess configurability is a bad thing, and TDD tends to encourage it). TDD in particular won't drive insightful designs, because insight is the product of a fertile and well-stocked mind meeting a problem domain; it does not emerge organically out of tests.


Title is slightly misleading, so for anyone reading the comments first, this is Uncle Bob's rebuttal of yesterday's "Giving up on test-first development", in the form of a Q&A.

Original discussion here: https://news.ycombinator.com/item?id=11310711


> but he said he was just using it for some home projects

I think here lies the biggest problem. For small/hackish projects you barely get any benefits from test driven development but still have the higher costs. Test-driven development means that you will spend more effort while developing, but will have to spend less on maintaining in the long run.

The fun part about home projects is that you don't really have to think about maintenance at all and you can just hack things away until they work for you.

PS: you should write out acronyms once in the beginning of an article


My home projects become tragic when they get past the trivial size and I haven't been TDD'ing and designing all the way through.

I think part of my problem is that I cannot devote a consistent level of effort every week to my current home project. Sometimes I'll leave and when I come back to it, I'm lost without a test suite and wind up rewriting huge swaths of it.

Really good tests also serve as up-to-date documentation of your design. If you code in Java or Groovy, check out the Spock Framework. Spock tests read like a story. I wish there was a Spock-like DSL for JavaScript development.


Spock hacks into Groovy's AST to make labelled expressions have special meanings:

  def "adder-test"() {
    given: "a new Adder class is created"
    def adder = new Adder();

    expect: "Adding two numbers to return the sum"
    adder.add(3, 4) == 7
  }
It could all easily break if the Groovy parser or AST is updated. And very clunky, which says something about Groovy's design post v 1.5.


I spent the majority of the article thinking it stood for the "top-down design" that my intro to programming professor was so fond of.


Regardless of whether TDD works, we see this argument pattern all the time: some magical claim of effectiveness is asserted for a methodology (always anecdotally), someone says "didn't work for me", and the response is ALWAYS "well, you didn't do it long enough to get it!"

It's all just noise. There's nothing wrong with having your pet theory; I just wish entire industries wouldn't carelessly hop on board every time some consultant comes up with an acronym and some promises.


For a new project I like to start by writing code that sketches out ideas of how the software should hang together.

TDD at this point is kind of wasteful, because if you decide to backtrack or rework aspects of the design then you also have to rework all of the tests. I find that once the design begins to crystallize, adding unit tests becomes valuable, but not necessarily in a 'test-driven' way.

TDD by itself will not automatically produce a good design, and it can often produce bad designs, with a huge proliferation of classes and interfaces that makes a codebase more complex and difficult to understand.

TDD is not the be all and end all that some commentators seem to believe.


To me, TDD means:

1. No functionality is coded without a failing test.

2. Only code the minimum functionality to make the test pass.

It does not mean 'test more' or 'write tests'.
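Concretely, one turn of the loop is something like this (a toy Python sketch):

    # 1. Red: write a failing test for behaviour that doesn't exist yet.
    def test_slugify_replaces_spaces():
        assert slugify("Giving Up on TDD") == "giving-up-on-tdd"

    # 2. Green: write only the minimum code that makes the test pass.
    def slugify(title):
        return title.lower().replace(" ", "-")

Nothing in that loop says anything about overall coverage or test quality; it only dictates the order of the work.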

So I'm not sure more than 10% of this 'defence' bolsters the method at all. It seems like the usual rhetorical bait and switch: you pretend that your opponent is criticising testing and respond by defending some more or less vague notion of tests, with no substantive defence of TDD beyond "you didn't try hard enough". The example of the pacemaker, for instance, has no relevance to TDD whatsoever.


Don't forget 3. Easily and safely refactor both your production and test code.


I found it strange that Martin suggests that the gaps in TDD-style tests with regard to dealing with bad input should be filled by integration tests, which due to their complexity seem more likely to cover the happy path, rather than by fuzz testing or QuickCheck-style (https://en.wikipedia.org/wiki/QuickCheck) tests.
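For instance, a property-based test in the QuickCheck family (Hypothesis for Python here, with a hypothetical parse_age as the unit under test) feeds generated bad input straight to the unit:

    from hypothesis import given, strategies as st

    # Property: whatever string arrives, parse_age either returns a
    # sensible int or raises ValueError; it must never fail any other way.
    @given(st.text())
    def test_parse_age_tolerates_garbage(s):
        try:
            assert 0 <= parse_age(s) <= 150  # parse_age is the assumed unit
        except ValueError:
            pass  # rejecting bad input is acceptable behaviour

That reaches the sad paths far more cheaply than an end-to-end integration suite would.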


Right, those two tools seem like a natural fit for bad data input testing, and do seem to make more sense at the unit level.


An interesting and simple way to overcome the test-interface design problem is "tracing tests":

http://akkartik.name/post/tracing-tests


And communism would work if only people would do it properly. It seems to me that kind of argument could be made about anything. "It only didn't work because you didn't do it properly".

Doesn't it work for Homeopathy, too? "It would have worked if you hadn't passed it through the x-ray scanner at the airport" or "it would have worked if you had taken the medicine while doing a handstand".

In the end, doing it right becomes so complicated that you need to hire an expert to do it for you. I wonder if that was the point all along: basically marketing.


The communism analogy is better than the homeopathy analogy. The former makes problematic assumptions about human nature, the latter impossible assumptions about physical reality.

That said, I work in a company where TDD is the default. So far everything still works and there are no tanks rolling into any public squares to squash the bourgeois scourge.


Some disciplines are hard to do properly, and easy to do wrong. It's a mistake to dismiss them just because the return on investment isn't immediate. We're mostly programmers here, we're not supposed to be afraid of steep learning curves!


Sure, although my heuristic for trying new Software methodology tends to be "does it make things simpler?". That article made my head hurt...


Moral of the story. Choose simple design, and test your code. Am I missing something?

Btw, this website is great! I really enjoy the writing approach.


TDD is just a tool, not a religion. If the tool works for you, use it. If not, don't. I would encourage people to give it an honest try and see if they find it valuable.


Q: Are you sick of the Socratic method?

A: Oh God yes.

Q: What's wrong with it?

A: It's overly verbose, condescending, and allows you to give the appearance of resolving a debate when all you've actually done is talk down to a sock puppet.


"It's overly verbose, condescending, and allows you to give the appearance of resolving a debate when all you've actually done is talk down to a sock puppet."

I agree with your criticism of the article's style. But to be fair, if you're arguing with yourself (as the author is), it's not really the Socratic Method, which is a dialog with another person. If you don't get to control what questions the other person asks, it makes for a much more interesting conversation.


Agreed - if there are two real people debating, it's much more bearable. Ironically, some of the canonical Socratic dialogs (http://www.gutenberg.org/files/1643/1643-h/1643-h.htm) have the same problem: Meno's slave is too obviously Plato's mouthpiece, IMHO.


Sock-puppeting is rife in Plato's books. And it is just as infuriating as Bob Martin's writing is.

But I agree with Martin more than I do with Plato. Take that as you will.


Q: Why should that be different than any other type of rhetoric?

A: If you want truth, read a scientific study, not a blog.


Q. Are there actually proper studies done by and of proper developers writing proper (ie not trivial crap) code which demonstrate the pros and cons of TDD?


A. This one is probably good; it compares actual software teams: http://www.infoq.com/news/2009/03/TDD-Improves-Quality It shows an increase in quality, but with an increase in development time.

It is my un-studied assessment that the improvements seen from TDD are primarily a result of the programmers thinking more. It pushes you to think of every edge case.

It also gives more experienced programmers a chance to mentor younger ones: for example, you can easily look at the test coverage and say, "look, you need to test these things too". That's harder to do (though still possible) without tests.


Every single time a developer releases a piece of code not covered by tests, he/she is basically taking a bet, hoping that things will work. And very often that is simply not the case.

Releasing untested code is simply unprofessional and wrong in so many different ways.

I honestly don't see any valid argument to not write tests.


Agreed, but writing tests isn't necessarily the same as TDD.


TDD is actual engineering: requirements first, set expectations, plan, estimate, build. Silicon Valley (a generalization) has extreme shortsightedness when it comes to building software; there are a lot of prima donnas, hipsters using whatever is cool, tech decisions made on opinion rather than measurement, and NIH.

Have a spine and push for TDD. The prima donnas will attack you ad hominem and call you slow. But it feels pretty damned good to be right.


    actual engineering

    prima donnas, hipsters

    have a spine
And then...

    The prima donnas will attack you ad hominem
I'm not for or against TDD, but you've gotta see the irony here.


    The prima donnas will attack you ad hominem and call
    you slow.
I don't know whether to laugh at or applaud the irony of that statement.


That's why conservative engineering traditions mandate that whenever one sets out to design a bridge, the very first thing to do is to build a really heavy car.


You also want to make sure your heavy car falls off the leading edge of the bridge before you raise the support structure for that part. So you're sure you're not building unnecessary bridge parts.


> requirements first, set expectations, plan, estimate, build

So, all the process that has been known for decades to mostly not work for software?



