It's OK Not to Write Unit Tests (msdn.com)
103 points by bensummers on Nov 4, 2009 | 69 comments

IMVU did not consistently write unit tests on day one. So why are we all rabid unit testing / test-driven development advocates now?


Every time you ship a bug to customers, ask yourself "How will we prevent this bug from ever reaching customers again?" and "Why did we write this bug in the first place?"

Almost every time, unit or acceptance tests would have caught the bug initially or prevented the regression. That's not to say unit tests are the only solution. Maybe:


* you need better knowledge sharing between engineers

* you should spend more time refactoring your code

* your internal APIs are confusing and inherently cause bugs

Often a follow-up to a bug will involve changes to all of the above.

When you start an experimental project, don't focus on automated tests if you don't need to... I'm all for lazily pulling in the processes you need as you need them. :)

Whether you write unit tests or not is entirely context-sensitive. The author's points might make sense in some context, but they certainly don't apply to every team or even every engineer.

Maybe this guideline is helpful: "I just wrote a new component. What kind of test coverage do I need to allow a new engineer to refactor arbitrarily and still be convinced my code works?"

As far as guidelines go, my personal rule is "test at the highest level of abstraction possible." That could also be phrased as "test those invariants that are the least likely to change." In practice, the two guidelines turn out to be pretty analogous.

What does that mean, exactly? Any code you write is obviously in the service of some goal. Some code, like a method to split a String that's part of a standard library, is its own end: it's a complete API, so it should be tested as such.

A whole lot of other code, though, is in service of some larger component. Maybe you've got 15 classes that all work together to form the e-mail component, and hopefully only one or two of those classes really encapsulate the external API (whether it's an explicit API or just implied) exposed to other components, and the rest are logically part of the implementation. So instead of unit testing every one of those, you get more mileage testing at the level of that API. That lets you change the implementation of that component without having to rewrite a bunch of tests, and the tests themselves then become a much clearer documentation of that API.

But plenty of times, the overall API is so complex that there's a combinatorial explosion if you try to test at that higher level; in that case you have to find smaller chunks that have few enough paths through them that they can be tested thoroughly. Sometimes those chunks are the level of a single class or method; often they're still a collection of things.

Rather than starting from the classes and saying "let's test everything," you look for logical groupings that form logical or explicit APIs, say "let's test this API," and then push the tests down a level only insofar as you can't test that API thoroughly enough to execute all the code paths in the related code.

To put that yet another way: test APIs, not implementation details, and the more things you can treat as implementation details, the less friction your tests will exert on your development process and the more value you'll get from them as a result.
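A minimal sketch of what "test the API, not the implementation" could look like, in Python. Everything here is hypothetical and invented for illustration: `Mailer` stands in for the one class that exposes the component's API, and `_AddressNormalizer` stands in for one of the many classes that are only implementation details. The tests touch `Mailer.send` and `Mailer.outbox` only.

```python
import unittest


class _AddressNormalizer:
    """Implementation detail (hypothetical): free to change or disappear."""

    def normalize(self, address):
        return address.strip().lower()


class Mailer:
    """Hypothetical public API of an e-mail component."""

    def __init__(self):
        self._normalizer = _AddressNormalizer()
        self.outbox = []

    def send(self, to, body):
        # Normalize first, then reject anything that is clearly not an address.
        address = self._normalizer.normalize(to)
        if "@" not in address:
            return False
        self.outbox.append((address, body))
        return True


class MailerApiTest(unittest.TestCase):
    """Exercises only Mailer.send and Mailer.outbox, never the helper class."""

    def test_valid_address_is_normalized_and_queued(self):
        mailer = Mailer()
        self.assertTrue(mailer.send("  Bob@Example.COM ", "hi"))
        self.assertEqual(mailer.outbox, [("bob@example.com", "hi")])

    def test_invalid_address_is_rejected(self):
        mailer = Mailer()
        self.assertFalse(mailer.send("not-an-address", "hi"))
        self.assertEqual(mailer.outbox, [])
```

If `_AddressNormalizer` were later inlined, renamed, or split in two, these tests would not need to change, which is the reduced friction the comment above describes. Saved to a file, the tests can be run with `python -m unittest`.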

Well said.

"How will we prevent this bug from ever reaching customers again?"

I think the cost of this is far higher than it's worth.

Every so often you're right... you certainly should consider the costs of automating the test versus the risk of regressing.

But note that once you have an organization good at writing tests, the incremental cost of writing a test is very, very small.

We like the term "10x" around here. Design systems for 10x the customers of today. Write code that will survive for a year. (Otherwise it may have been wasted.) Only fix bugs once so you can grow the engineering team without rehashing old issues.

Are you a programmer presently? That works on a codebase that's at least in the tens of thousands of lines of code?

Tell me what software you write so that I know to never, ever use it.

I'll play along. Have you ever owned your own business, and had to make the trade off between absolute quality and efficiency?

The one between correct and works-for-the-customer?

lines of code is very dependent on the language. I have one rails app that's in the 5k range, excluding tests, and it's been running for over a year with no restarts and no need for bug fixes. I think that says something about the effectiveness of my process.

I question the design decisions of any code base with tens of thousands of lines of code.

If your app has been running for over a year, you might want to look into this (before someone else does)... http://groups.google.com/group/rubyonrails-security/msg/7f57...

thanks for the pointer. the site wasn't affected by that.

> I question the design decisions of any code base with tens of thousands of lines of code.

What operating system are you running on?

my servers are either debian or ubuntu, and my laptop is a mac. You do realize that the OS isn't a single monolithic code base. It's a bunch of smaller code bases that work together, which is precisely the point I was making.

Want to take a guess at how many lines of code are in the Linux kernel?

that's the exception, not the rule, right? It's also an example of something that I would absolutely write thorough tests for.

I'm also curious how modular it is. Does a kernel hacker have to keep the structure of that whole set of code in his/her head? I'm guessing there are a lot of clean seams in it that in practice reduce the actual LOC related to a project down to a much smaller number.

This is from linus' chapter in "open sources"

"With the 2.0 kernel linux really grew up a lot. This was the point we added loadable kernel modules. This obviously improved modularity by making an explicit structure for writing modules. Programmers could work on different modules without risk of interference."

Meaningless. You made an explicit statement that you thought all code bases with more than 10 k lines have "questionable design", using a piddly little rails app as an example.

When someone pointed out that the linux kernel has millions of lines, you retreat to "but it is modular". Programming languages have had modules since the eighties. If someone points out a 25,000 line module what will you say? Each function has less than 10 k lines?

The point is that there are well designed code bases (the design includes modularization) with sizes well above 10 k lines, which directly contradicts your point. I've worked on well designed enterprise systems with a few million lines of code.

You've flipped my statement which changes its meaning. I said I'd question the design of any code base that large. Some would pass that questioning. I did not say that all large code bases have questionable design.

I won't debate a strawman anymore so I'll move on. Btw, are device drivers counted in that millions of lines of code? If so those are very independent things.

This thread was about test coverage, how many of those millions of lines of kernel code are covered by a unit test?

EDIT: so you agree with my original assertions that large test suites may not be worth the cost?

"how many of those millions of lines of kernel code are covered by a unit test? "

Not many. And yet the kernel is an awesome piece of code. That was the point. I don't have the time to debate test-driven/test-coverage fanatics either.

If I ask the client to choose between shipping a bug now and then and developing twice as slowly, I know what he'll tell me.

(edit: twice as slow is about the same as twice as expensive)

Shipping bugs is not a show stopper. It's not even serious. What is serious is developing a live, production app without any consideration for bugs: backups (at least a couple of very different kinds) and control keys. A control key can be anything which will be visibly not ok when there's a bug somewhere.

And you never have to debug and fix your bugs? That takes zero time?

I do management apps with a web interface: pretty flat architecture, no OOP to speak of, low complexity except the occasional long database query. So bug fixing is not really that time consuming, and definitely not the kind that would be caught by tests. Lately most time has been wasted on configuration issues of one kind or the other.

If you're a Java programmer and want to have a rude awakening, go download Jester. Jester is an automated mutation testing tool—it goes in and replaces “<=” with “<”, “&&” with “||”, “!” with whitespace. And then it re-runs the tests. And then you get to watch in horror as your tests still all pass, regardless of what the product code actually does.
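To make the idea concrete, here is a hand-rolled sketch in Python. It does not use Jester or reproduce how Jester works internally; the "mutant" is written out by hand, and the function and suite names are made up for illustration. It shows a test suite that cannot tell the original from the mutant, and one extra boundary case that can.

```python
def can_vote(age):
    """Product code under test (hypothetical example)."""
    return age >= 18


def can_vote_mutant(age):
    """What a mutation tool might produce: '>=' replaced with '>'."""
    return age > 18


def weak_suite(fn):
    """Passes for both versions because it never probes the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Adds the boundary case, so the mutant is detected ("killed")."""
    return weak_suite(fn) and fn(18) is True


print(weak_suite(can_vote), weak_suite(can_vote_mutant))      # True True
print(strong_suite(can_vote), strong_suite(can_vote_mutant))  # True False
```

A mutant that survives the suite, as with `weak_suite` here, points at behavior the tests never actually pin down.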

I've always wanted to write a unit-test testing library which inverts all of the assertions to catch the false positives. This sounds like it's a lot more sadistic (and useful) though.

A tip for Rubyists.. there's a library that does stuff like this: http://ruby.sadi.st/Heckle.html

Would be interesting to automatically run something like this and find a way to assert that the tests should fail. Almost an automated test of your tests.

Unless I'm misunderstanding something, that is what these types of tools do.

That is indeed a devilishly clever idea.

I'm a relative newcomer to unit testing and man.. has it saved me a lot of time already.

Forgetting the gigantic advantages it gives you over mere bugs, I'm just as impressed at how it helps you resolve higher level issues.

For example, recently I moved a project I was working on months ago from one machine to another. Tests had lots of failures, despite no changes. Turns out I had an unpatched library installed, a few libraries of the wrong version, a slightly different database version, stuff like that. My tests picked all that up and I could vendorize stuff or tighten up my dependencies. Without the tests, I'd have been scratching my head for ages figuring out "what" wasn't working.

>Without the tests, I'd have been scratching my head for ages figuring out "what" wasn't working.

The real question though is how all that time you spent writing the tests compares to the time you'd have spent troubleshooting.

No the "real" question is how much more time it will save him when another incompatible library affects his codebase. Since he spent the time developing tests, he doesn't have to waste time tracking down obscure bugs when something like this happens.

Writing tests adds time up front but almost wipes out the debugging stage for me (which used to make up > 50% of the time).

The number of crazy edge cases that have been picked up by my tests that would have just "got through" before scares me when looking at code without tests.

Of course, I acknowledge it's not for everyone and it's not for every project. If you have a very strong up-front design and speccing process coupled with a final integration testing process, I'd argue unit testing might not be necessary at all. As a lone developer with a lot to keep in his head, though, unit testing has been very useful.

Have you done much unit testing? Done right, writing unit tests often doesn't take any extra time at all.

I disagree. We spend a considerable amount of time writing our behavioural tests (we have around 1,700 for our large web app), but they have saved us countless times from having to track down bugs after deploying code.

There was an article published in the last few weeks which showed that teams who write tests spend 10-15% longer in the development process, but they also save 25-30% of the development process time by not having to fix regression defects. Unfortunately, I don't remember which site it was linked from or what the article's title was. If I can find it, I'll link it here.

I have to agree with Jon. Really, once you get into the cycle, and you write them first, it doesn't tend to add much time. It does, however, really hurt your productivity when you start doing TDD. It's akin to the hit you take when you move to a new programming language. You'll feel faster if you don't do it. But as you get more comfortable and get into a rhythm, it can be fairly insignificant. Just don't get ahead of yourself, and don't let the tests get out of sync.

Rails is a good example where this sync stuff is hard. It's wicked easy to use things like associations and named scopes to query records while forgetting to write a test to ensure that those queries actually return what you specify.

But keep in mind that tests aren't just for you. They're for your team, and the next guy who follows you. You can learn a lot about intent from tests too.

I'm not sure what to take away from this post.

Unit tests just saved my bacon when I accidentally introduced a serious bug in refactoring some network-related code.

The bug wouldn't have shown up outside of production in some very specific cases, but it would have had a serious customer impact.

Unit tests catch these things for me all the time. Every 5 minutes I spend writing tests easily saves me 30+ minutes I'd spend debugging when some code 6 levels down from the apparent point of failure caused the issue.

I wonder why the apparent point of failure is so far removed from the actual point of failure. With my work that is rarely the case, and I wonder which of us is the anomaly.

Don't get me wrong, I write tests, but not nearly as many as a lot of people propose. I do a lot of social sites, so the context may be different, but it's not uncommon for a release to introduce a bug that immediately comes to my attention when a user hits it. I can usually deploy a fix very fast, and I feel this is a more effective approach than maintaining a test set approaching comprehensive coverage. People usually fail to mention the time it takes to maintain such a test set over time, which is higher than the time it takes to write the initial test.

Note, I always thoroughly test high impact pieces of code. i.e. account creation, anything involving money or personal information, and data validation code. Having said that, the bulk of the code is controller and view code which isn't earth shattering if it breaks for 10 minutes.

> I wonder why the apparent point of failure is so far removed from the actual point of failure. With my work that is rarely the case, and I wonder which of us is the anomaly.

Neither of us; it sounds like we just work on different software with varying levels of complexity.

The tradeoffs largely depend on 1) your deployment model and 2) the cost of the bug itself. If you're running a web service, diagnosing and fixing bugs is a whole lot easier than if you're shipping pre-packaged, installed software. If the cost of failure is just "some user action failed, they'll try again when it works" it's far less critical to prevent it than if it's "your data is now corrupted." Something like a social media site is on one end of the scale, where you're running a service and where most bugs are non-fatal. Installed backend financial transactions systems for large companies, for example, are at the other end of the scale, where fixing bugs in deployed software is hard (you might not have the information to really diagnose the failure or the ability to reproduce it in-house, and the fix will require a patch that someone else has to apply and deploy), and the failure can be incredibly costly.

I've noticed that code connectedness has a huge impact on the kind of test coverage needed. For example, if your website back-end is fairly stable and the API rarely changes, then you don't typically need 100% code coverage on each page. As you mentioned, if you fix a bug, it won't randomly break later...

On the other hand, with a bunch of connected objects and events that fire, causing other events... it's very possible for a change in one system to affect others. In this situation, I recommend a ton of unit tests _and_ acceptance tests. In addition, decouple the components in the system and stabilize your base APIs. This will allow you to write a new feature without affecting stability elsewhere.

I also noticed that (at least in the code I write) bugs are much more prevalent in "boring code" than in "smart code"; that is, if I got some elegant but rather advanced piece of code I had put a lot of thought into (like some smart math stuff or something similar), it almost always works right away without any bugs whatsoever, however, boring and tedious-to-write code (such as wiring up GUI, integration glue code etc) tends to have a lot more bugs in it.

Unfortunately, this first type of code is a lot easier to test automatically than the second type. Smart code usually has clean and simple interfaces, while writing tests for "boring code" is also boring and tedious, like writing a test for each textbox on a form etc.

Maybe there's a way to automate the boring code? Write a system that lets you build UI without manually specifying the behavior of every control, etc.

Can you say for sure that a good suite of functional tests wouldn't have saved your bacon as well? And they might have been much easier to write and also caught a much broader set of problems at the same time. The post isn't advocating having NO tests. It's not even saying that unit tests don't work. Just that they're not necessarily the best value in every situation.

Summary: another post saying that "unit tests aren't a panacea".

Which is true, of course. But that doesn't mean that unit testing isn't terribly useful.


He actually makes some subsidiary points that are worthwhile and don't just fit the mold:

* A lot of tests people write aren't really unit tests

* Unit tests aren't the armor that protects you from error during refactoring

* Unit tests don't necessarily improve design; indeed, they can have a bad effect on design

So, there's more to the article than the summary - read the article, I think it has merit.

Being a huge testing advocate, I agree with this post ...for the most part (and it's easy enough to disagree with if you didn't read it all the way). There are just certain parts of code that are more trouble than they're worth to test.

However, I'm surprised at just how small an amount of code that really is. In fact, it's really difficult to say what those portions of code are unless you stick to TDD.

Another thing that I've noticed is that not only do functional languages lead to a more testable design, but testable code tends to be more functional in nature. So perhaps tests are design tools.

Plus, while unit tests really only catch a small number of bugs (I think studies have shown something like 25%), they're the fastest way of catching and preventing the bugs they prevent. Words can't describe how frustrated I get when I deploy to the test server and get an error because I mistyped a variable name in some obscure part of my code.

Sometimes I think programming in a dynamic language without unit tests is like screwing without a condom. Sure it feels good for a while, but eventually it will come back to bite you (no pun intended).

re: pains of unit tests when refactoring --

i.e. my experience with poorly written unit tests

i took over a project once from a crazy person that had a ton of unit tests in it. about 50% of the tests were dumb, so i deleted those (seriously, things like instantiating an object and then testing that it wasn't null).

I tried to maintain the rest of the tests as I worked with the code, but what I found was that there were so many tests, and most of them were testing the most pointless trivial things, that keeping them around was taking more time than it was worth, so I axed those too. the remaining tests that actually tested stuff, were testing parts of the design that were done so ridiculously that they needed unit tests. So I just got rid of all the tests and refactored the hell out of the code. I ran it through the debugger a few times and all was good. the end.

oh, then a few months later the company changed its business model and they didn't need that subsystem anymore. go figure.

The book that changed my life:


"Clean Code" by Robert Martin is one of the best programming books ever written, and it barely has any code in it. However, it is relevant for every programming language. He treats unit tests like mandatory rituals, though, and unfortunately, I have not progressed quite that far yet. Still, it is a must read and gives good insights into this topic.

"one of the best programming books ever written, and it barely has any code in it."

Sounds very fishy, but from Robert Martin, who thinks people who don't do TDD (not unit testing, TDD) are "unprofessional" and "stone age programmers", I wouldn't expect any less.

"He treats unit tests like mandatory rituals,"

cargo cult alert!

fwiw, my conclusion based on my experience: unit tests are (by and large) a good idea. Test Driven Development is dubious, especially the "TDD is a design method" idea. Conflating the two notions isn't very useful but happens surprisingly often.

Even if you don't get into the TDD stuff he evangelizes, it's a pretty darn good read. It has some really thought-provoking things to think about in terms of how to write your code. I wouldn't dismiss it offhandedly.

My mantra: "if you haven't tried it, it doesn't work" Unit test, dry run, big bang, but in the end no significant piece of code works the first time. Ok, once or twice in my life it has, and I still talk about that experience. But that just shows how necessary testing is most of the time.

Unit tests have another advantage - they keep you coding and motivated without an upfront schedule. Sure, you should be motivated already. Why was it you were reading hacker news then?

I got tired of that argument before I even read it. If you're coding non-stop, without distractions, for the whole of your work day, teach us how to do it.

I find unit testing to be a drag sometimes. Creating a class is much easier than creating code to reproduce every situation it can be in. Then there's troubleshooting random failures that turn out to be bugs in the test rather than the code being tested. In what way does it keep you motivated?

Wow, people are still writing more and more words on unit testing with absolutely no new ideas.

Nothing to see here. Move along.

Quite the contrary. This attitude in an MSDN blog, from a Microsoft employee, explains a lot of the problems their users have.

Way to go, Microsoft. Testing is for sissies! Real Men write correct code.

Now I can use Exchange and know why it sucks so bad.

What one random developer writes on his MSDN blog doesn't mean anything more than what any other random developer writes on an MSDN blog. Microsoft is way too big and has way too many developers to draw conclusions like that.

I've done consulting work for Microsoft, and I've never conceptualized them as one company with one voice. Microsoft is more like lots and lots of little companies which run the entire spectrum from total dysfunction to jaw-dropping awesomeness.

This is the opinion of a single engineer on a single team. There are tens of thousands of other engineers and thousands of other teams.

My team (at Microsoft) is religious about unit tests for some parts of the product (math libraries, content serialization logic, etc) and has a grand total of zero "unit tests" for other components (some UI, etc). We do, however, have a full test team who write numerous automated regression tests as well as conduct significant manual testing and routinely use the product.

Furthermore, have you read it? It's not saying don't write tests. It's saying that TDD is not a silver bullet.

I am sorry. I was being cruel.

It may not be a silver bullet, but it's the kind of "don't leave home without it" ammo any self-respecting programmer will keep close all the time.

Without them the risk of regressions is immense.

You're making far too many assumptions about the code to be tested.

The folks who wrote our math libraries (primitives such as 4x4 matrices, vectors, planes, spheres, axis-aligned boxes, etc), would agree "don't leave home without it" for TDD.

The folks who wrote our drawing libraries cheat on the TDD a bit and write the code first, then capture a screenshot and use that screenshot for automated testing to verify nothing breaks in the future.

The folks who wrote the sample code that implements a simple game (involving math and graphics) laugh at the concept of TDD. What would you possibly test? There is no `bool IsFun(Game)` function. How should Mario jump? I dunno, just pick some random numbers and math operations until it feels like it works. Then try not to break it.

The interesting bit here is that each of these groups of folks are on the same team, in fact many are the same people. Use the right tool for the job. Invest your testing efforts where you get the most bang for buck.

Using a screen capture is not exactly cheating. It's automating regression testing and that's a Good Thing. A human ensures the first implementation is correct and then the tests ensure future implementations are correct too. There is also stuff that could (and should) be tested - like a drawing primitive that should not exceed a given region, a pixel that should be of a given color or even an operation that should finish in a given number of cycles for a given pixel count. Erroneous arguments should also be tested as the proper exceptions should be thrown (or, at the very least, proper error codes should be emitted).
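A rough sketch of both ideas in Python: a "golden" capture approved once by a human and compared on every later run, plus direct checks on pixel color and clipping that need no screenshot at all. The `draw_rect` primitive and the in-memory canvas are hypothetical stand-ins, not anyone's real graphics library, and a real setup would store the approved capture in a file rather than recomputing it in the same script.

```python
import hashlib


def draw_rect(width, height, x0, y0, x1, y1, color=1):
    """Hypothetical primitive: fills [x0, x1) x [y0, y1), clipped to the canvas."""
    canvas = [[0] * width for _ in range(height)]
    for y in range(max(0, y0), min(height, y1)):
        for x in range(max(0, x0), min(width, x1)):
            canvas[y][x] = color
    return canvas


def fingerprint(canvas):
    """Stable digest of the pixels, standing in for a stored screenshot."""
    raw = bytes(pixel for row in canvas for pixel in row)
    return hashlib.sha256(raw).hexdigest()


# Step 1 (done once, after a human approves the picture): record the golden digest.
approved = draw_rect(8, 8, 2, 2, 6, 6)
GOLDEN = fingerprint(approved)

# Step 2 (every later run): re-render and compare against the golden digest.
assert fingerprint(draw_rect(8, 8, 2, 2, 6, 6)) == GOLDEN

# Direct checks that need no screenshot: a pixel's color and clipping behavior.
assert approved[3][3] == 1  # inside the rectangle
assert approved[0][0] == 0  # outside the rectangle
clipped = draw_rect(4, 4, -5, -5, 100, 100)  # must not write outside the canvas
assert all(pixel == 1 for row in clipped for pixel in row)
```

An exact digest comparison fails on any single-pixel difference, so it only suits rendering that is expected to be bit-for-bit reproducible.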

For those who write the demo code, there are a lot of tests to do on the gameplay itself - collision detection, rendering tests, physics and a whole lot of stuff that will also serve as nice documentation on how to use the components that are being exercised.

TDD says "no line of product code is written until there is a unit test that fails for it". I'm just pointing out that there is a spectrum of practicality in TDD. Our automated regression tests for graphics routinely require humans to re-verify every single screen capture because a single pixel may render slightly differently with some other change and there is no algorithm to determine "is this artifact objectionable to humans". Tests are good, but they are not a panacea.

As for the "proper exceptions" or error codes or what not. That always drove me NUTS to see that. The first line of the function is `if (foo == null) throw new ArgumentNullException("foo")` -- do I really need to write a unit test for that? Furthermore, static analysis tools tell us when your public interface (including exception types we may throw) changes, that is code reviewed too. Writing `AssertThrows(typeof(ArgumentNullException)...` is just a waste of time.

Again: this is not an argument about whether tests have merit or not. All the author (and I) are saying, at the core of our argument, is: write tests when it makes sense to write tests. He's also going a step further to say "it makes sense to write tests less frequently than TDD nuts would have you believe". I may or may not agree with him, but I'm going to publicly denounce that belief simply because I'd rather developers write too many tests than too few.

> TDD says "no line of product code is written until there is a unit test that fails for it".

If you have tests that completely specify the behavior of the code, it seems a bit redundant to then write the code... why not automatically generate it?

Have I just described, in some vague sense, prolog?

"is this artifact objectionable to humans"

is the same as

"will this incorrect behavior be noticed" and then removing the test that failed.

And testing was always about testing everything that could conceivably break. Unless you are developing math libraries, you should not need (or be expected) to test if 2 + 2 is 4 or that 1.1 * 3 is 3.3 and not 3.30000000000003.
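For reference, the floating-point case is real and easy to reproduce. A small Python illustration, assuming standard IEEE 754 double-precision floats (which is what CPython uses on common platforms):

```python
import math

product = 1.1 * 3

print(product)                     # 3.3000000000000003
print(product == 3.3)              # False: exact comparison fails
print(math.isclose(product, 3.3))  # True: equal within a relative tolerance
```

Code that does assert on computed floats would typically compare with a tolerance, as `math.isclose` does here, rather than with `==`.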

"Testing is for sissies! Real Men write correct code."

I seriously laughed when I read this. I'm willing to sacrifice the extra couple of inches of programmer penis size by writing tests.

But the tests themselves _are_ code. Do you have tests that check the tests? And then tests that check the tests that check the tests?

At one point or another you just have to learn to live with bugs.

And even if all code is provably correct, there is always the chance of a random particle hitting a transistor in the chip and clearing a bit that should be 1.

Just like in math, even if the code is provably correct, there is always the possibility someone screwed up the proof.

In the end it always comes down to writing the least buggy software possible and then bug zapping.

Having debug tools is smart of course, but TDD goes a bit further than debug tools. When I write unit tests I don't think of them as unit tests; I think of them as debugging probes.

Have you noticed that there is an uncanny number of downvotes in this whole topic?

You can also refrain from writing the bugs ;-)
