Unit testing in Coders at Work (gigamonkeys.com)
109 points by mqt 2198 days ago | 85 comments

I sit somewhere in the middle. Unit testing GUI code is going to slow you down. Unit testing your core algorithms is essential: that is the stuff that must never break, and where breakage would not be immediately detectable. The unit tests ensure that you can always be sure that code works.

(Real-world example, testing somewhat hairy parsing code: http://github.com/jrockway/moosex-runnable/blob/master/t/arg... I tested every case I could come up with, so I could be sure it works. Better to spend the time typing this file than to be hit with a weird misparse on the command line, where you are probably not in the mood to fix the parser.)
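(For non-Perl readers, the shape of that file is just an exhaustive pile of cases. A minimal Python sketch of the same idea, with a hypothetical parse_cmdline standing in for the real parser:)

    import unittest
    from myapp.args import parse_cmdline  # hypothetical parser under test

    class ParseCmdlineTest(unittest.TestCase):
        def test_plugin_with_args(self):
            # "Class,arg1,arg2" should split into a class name and its arguments
            self.assertEqual(parse_cmdline(["MyApp,--foo,bar"]),
                             ("MyApp", ["--foo", "bar"]))

        def test_empty_input_is_rejected(self):
            with self.assertRaises(ValueError):
                parse_cmdline([])

    if __name__ == "__main__":
        unittest.main()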

On top of the unit tests, I do integration tests, to make sure sections of my app mostly work together. This is somewhere below "Request -> Response" and somewhere above the individual routines/classes. In the web app case, this is mostly "instantiate an object graph; call the methods that provide data to the rendering functions; make sure the results make sense". (I assume my template renderer and object database work; those have their own test suite, after all.) I also test stuff like typical sequences of actions that share some state (usually "the session"). I neglected that once and it caused me lots of problems.
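(Sketched in Python with hypothetical names, since the shape matters more than the framework, such a test looks roughly like this:)

    import unittest
    from myapp import build_app          # hypothetical: wires up the object graph
    from myapp.session import Session    # hypothetical shared-state object

    class CartFlowTest(unittest.TestCase):
        def setUp(self):
            # Instantiate the object graph against an in-memory test config.
            self.app = build_app(config="test")

        def test_search_then_add_to_cart_shares_session(self):
            session = Session()
            results = self.app.search("widgets", session=session)
            self.assertTrue(results)  # sanity check, not pixel-perfection
            self.app.add_to_cart(results[0].id, session=session)
            # Inspect the data handed to the renderer, not the rendered page.
            data = self.app.cart_view_data(session=session)
            self.assertEqual(len(data["items"]), 1)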

But anyway, GUI testing == waste of time. As long as the buttons you click generate the right events, the GUI works. This is the sort of app Zawinski was working on, and this colors his thoughts on testing accordingly. Your mileage may vary.


Why exactly is unit testing of GUI components a waste of time?


Because it takes a lot of time and effort but does not provide much benefit.


Often, there is benefit, but it's quickly swamped by the maintenance effort.

In the group I'm working with now, we're transitioning to all interactions with the user being abstracted into an AppModel domain object, which will describe user interaction in an abstract way, independent of the GUI or any other specific interface. We will be implementing meta-level and meta-syntax-driven automated coding standards to detect probable violations. In addition, our (currently incomplete) unit test suite will be run under a code coverage tool against the change requests. So any new change request will have to have unit test coverage, which will be detected by a nightly script. (The above tools are based on the Smalltalk Refactoring Browser parser & libraries.)

One benefit -- we will be able to test everything in Unit Tests and leave out GUI specifics!

Another benefit of the AppModel architecture, we can eventually publish our abstracted applications as Web Services, and get out of the business of maintaining nitty-gritty GUI code. This will make that group 10X more valuable to their corporation, and they will eventually be doing only half the work!


I've found that it's often worth unit-testing functionality that depends upon complex combinations of GUI state. For example, "Show the menu only when there is a selected bar and the baz list contains at least two items that can be foozled together." These interactions are very often wrong, and even if you code them right, it's nice to have a record of what the spec was so when someone comes along and says "I can make this code much cleaner if I eliminate the check for foozling" (or worse, they change the definition of foozling), the test breaks and you remember there was a reason you put it there in the first place.
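(A minimal Python sketch of what I mean, with made-up foozle names; the point is that the rule lives outside the GUI where a test can pin it down:)

    from dataclasses import dataclass

    @dataclass
    class BazItem:
        can_foozle: bool = True

    def can_show_foozle_menu(selected_bar, baz_items):
        # The spec, captured as code: a selected bar AND >= 2 foozle-able items.
        foozlable = [b for b in baz_items if b.can_foozle]
        return selected_bar is not None and len(foozlable) >= 2

    def test_menu_hidden_without_selection():
        assert not can_show_foozle_menu(None, [BazItem(), BazItem()])

    def test_menu_requires_two_foozlable_items():
        assert not can_show_foozle_menu("bar", [BazItem(), BazItem(can_foozle=False)])
        assert can_show_foozle_menu("bar", [BazItem(), BazItem()])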

I find that it's a waste to test things like appearance, position, labels - basically anything that should be in CSS or other declarative specifications. But UIs often have quite a bit of actual logic in them (ironically, because real humans are often strikingly illogical), and that should all get tested.


OK, but why?

Is it not useful to know that various operations and navigations through the GUI still work as expected?

And if it's a matter of a poor benefit/effort ratio, why are GUIs so hard to test?

Tools like Swinger are pretty easy to get rolling with for testing Swing UIs. Where do these things break down?

Are these things worth fixing?


They break down because UI testing tends to rely on strings: either labels for controls or embedded ids of them. It's really, really easy to have those change on you, and when they do the tests become difficult to debug: if the button with the id "Search" isn't found, is it because something broke to cause the button not to show up, because the id has changed to "SearchUsers," or what? And if I need to remove the "Search" button and replace it with something else, how do I figure out which tests to change? The high-level nature of the tests inherently makes them much harder to debug, because the test could have broken for any of a hundred reasons.

In other words, with unit tests the linkage between the code being tested and the test tends to be pretty tight; with UI tests it tends to be very, very loose, and that makes the tests correspondingly more fragile and much harder to debug.

We essentially do typesafe metaprogramming on our web UI that generates compile-time-checked constants for all the labels, buttons, etc. so that our tests don't compile if the UI changes, which has gone a long way to keeping the tests stable; it's the best solution we've come up with, but it's a huge investment, and our attempts to do testing of our Swing client have met with less success so far.
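(Our version is compile-time-checked; the weaker, dynamic-language form of the same idea is a single generated module of identifiers that both the UI and the tests import, so a renamed control fails loudly instead of silently not matching. Hypothetical sketch:)

    # ui_ids.py -- hypothetical module, generated from the UI definition;
    # never edited by hand. The UI and the tests both import from here.
    class Buttons:
        SEARCH = "SearchUsers"
        CANCEL = "Cancel"

    # Tests then say e.g. find_element(By.ID, Buttons.SEARCH) instead of
    # scattering the bare string "Search" across dozens of test files.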


In my experience, I have sunk a lot of time making sure that foo div has bar css class when the quux link is clicked... but it has never saved me much time. I have to click through the site regularly (content changes, "does this work in IE", etc.) anyway, and errors are usually noticeable immediately.

Other problems I've noticed are that they are either so specific that they fail for no reason (it was supposed to be red, not green!) or so general that they don't catch issues that would annoy users.

If I could have these tests for free, I'd take 'em. But since they're expensive and don't get me much, I don't bother.

(BTW, if you have complicated algorithms in your JavaScript, refactor so you can test them with Rhino on the command line. Don't do this stuff in the browser!)

Anyway, I would be interested in hearing your UI testing success stories.


"Anyway, I would be interested in hearing your UI testing success stories."

For Web apps, I use Selenium. I use the Firefox plug-in to record scenarios, with some hand-tweaking to replace any odd XPath stuff with more robust references to IDs. I then periodically run suites of UI tests to see that things still behave as they should.

Yes, there are times when page content changes and breaks a test, but it's trivial to see where that is happening, so I've not had a problem keeping them up-to-date.

And having automated integration tests is way faster and more reliable than manually clicking through a site. I've caught numerous bugs this way, mostly in pages that do not typically get much use in real life (but tend to be the first thing a client tries when showing off code. Go figure.)

So, with Web+Selenium there's not much overhead to creating and maintaining a set of tests. It's a big win to be able to kick off a full suite and automatically run through a site far faster and more reliably than I could by hand.
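(For reference, a minimal sketch of such a test against the Selenium WebDriver Python bindings; the URL and element IDs here are hypothetical:)

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get("http://localhost:8000/users")   # hypothetical app URL
        driver.find_element(By.ID, "search-box").send_keys("alice")
        driver.find_element(By.ID, "search-button").click()
        # Stable IDs, not brittle XPath, as noted above.
        rows = driver.find_elements(By.CSS_SELECTOR, "#results tr")
        assert rows, "expected at least one search result"
    finally:
        driver.quit()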

It's not so good for desktop apps. I've been looking at Swinger, which marries Cucumber with Jemmy, for integration testing of Swing apps. As far as I know there are no good tools for recording user actions, so tests need to be constructed by hand. This makes it harder to assemble tests that capture assorted complex interactions.

But, at some point, someone has to actually test the app itself, so while it's time consuming to assemble automated UI tests, it may pay for itself over time since it reduces the effort in manually walking through the app.

So far neither of these would replace unit and functional testing, and neither catches all bugs as experienced by the end user, but they do reduce the number of problems that make it into a release. For Web apps, the effort spent is well worth it. For desktop apps, I've yet to reach that balance. However, things keep improving.


Maybe a better way to put it is that testing GUI components is not a complete waste of time, but it gives you a lot less bang for the buck.


The most important lesson to learn here is that all code, including tests, is a means to an end. A lot of people geek out about new programming languages, object oriented design, automated testing, TDD, XP, and various other methodologies du jour. If those things help you achieve your goal, that's great; but it's important to remember that they're (usually) not your goal in and of themselves.

I don't think jwz would say that unit testing is a bad idea. What he was trying to say when he dismissed them was that they were focused on making a browser, not on making a nice piece of code.

Similarly, Norvig was focused on solving sudokus, not on exploring methodologies that might be used to write the code which solves sudokus.


"TDD, XP, and various other methodologies du jour. If those things help you achieve your goal, that's great; but it's important to remember that they're (usually) not your goal in and of themselves."

When you are an agile "coach", Scrum Trainer, etc., then TDD, XP, Scrum, etc. are your goals!


"One day I hope Joel eventually realizes this. Programmers who say they don't have time to write tests are living in the stone age."

This kind of attitude really pisses me off. It's so confident in its condescension and name-calling that I almost wonder whether they might be right. In fact, it's just dogmatic abuse, unencumbered by any objective factual appraisal of the issue at hand. I should probably just remember that dogmatism tends to be inversely related to wisdom.

For one thing, the purpose of the code makes a tremendous difference: e.g. unit-testing is great for maintainability, but terrible for evolving an API (as in prototyping or exploratory code).

And thank goodness for Knuth.


My position on unit testing is, "Have you tried it?", followed by, "No, seriously, have you actually tried to use it for a period of time, not just played with it for an hour?", which I will then follow by... wishing the programmer well regardless of how they now feel about it, because now they've at least got some experience.

Unit testing is a big deal. Everybody should try it. Ideally, you should try it with the next project you start from scratch (or subproject as the case may be) because trying to retrofit an existing code base does not give accurate impressions. I'd also point out that some things are easier to test than others, but that once you have a bit of experience sometimes even the hard ones turn out to be feasible. (For example, you shouldn't go straight to UI testing, but there are many other good testing starting points, like parsing or network communication.) You really should stick with it at least long enough for it to detect at least one bug that is a total surprise to you, because that will happen, it's only a matter of time. But once you've tried it, I respect your opinion after that.

I've also tried TDD, which I just could not get into, whereas unit tests I love.


It's easier to learn to use the testing tools themselves on an artificial project, but I find that trying to build testing into an existing project teaches you different (and perhaps more useful) things. It really brings in to focus how to write (or rewrite) code to be testable, and you learn much faster when it's actually a huge win vs. when you would just be going through the motions. If a technique only works on (say) blog-post-sized Ruby projects, then that should be a red flag. Testing is a means to an end.


I was amused by his comment on Knuth:

"So Knuth too disagrees with the notion that unit testing always makes you go faster. Maybe he too is living in the stone age."

This follows him describing writing a program, in pencil, in 1977.


I think the list of programmers throughout history who could reliably write something as complex as TeX using nothing but paper and pencil is incredibly short.


It's a lot easier than you think. When I was a kid I wrote longer programs in Basic using paper and pencil, because it was a lot easier for me than typing. I would type in the program after I was pretty sure it was right. The errors were almost always local. It seemed so easy to check the high-level structure of something when I could spread it out on the ground in front of me. Later, in college, I always printed out drafts of my papers and marked them up completely before I started editing.


What I hate most about this article is that they stress the "always". It is impossible that unit testing is faster and better for every possible piece of software, on every possible hardware, in every possible situation in this universe and time. Just plain impossible.

Certainly, it will make you faster in a lot of cases (and a lot of regular business programming is among those cases), but there are certainly cases where it is wrong. Think of a code emitter in a compiler, or handcrafted assembler for some microcontroller, or especially a problem where you already know the solution and it is simple. In those cases, it is faster to just hack down the solution.


What I find most irritating is that the TDD or other methodology zealots usually cannot point to a success of exactly the sort people like jwz or Joel have been influential in shipping. There also seems to be a tendency among these people to confuse rejection of TDD with rejection of unit testing.


" "One day I hope Joel eventually realizes this. Programmers who say they don’t have time to write tests are living in the stone age."

This kind of attitude really pisses me off. "

Then you'll really love "Uncle" Bob's latest blog entry (http://blog.objectmentor.com/articles/2009/10/06/echoes-from...) where he reiterates his inane "Stone Age Programmer" name-calling. The comments on the blog post are more interesting (well, ok, funnier) than the content of the post.


If you enjoyed seeing the contrast in approaches between Norvig and Jeffries, you may also find the following interesting -- calculating bowling scores in

1. OCaml, no tests: http://alaska-kamtchatka.blogspot.com/2009/07/disfunctional-...

2. Clojure with unit tests: http://blog.objectmentor.com/articles/2009/07/19/uncle-bob-j...



One thing I've never really seen discussed is proving unit test coverage. Taking the Clojure example above and its unit tests, nowhere does it make any attempt to show that the few test cases chosen provide complete coverage of all possible bowling scores. Without that, what have you really shown? Sure, I can look at it and my gut feeling is, yeah, that should be enough, but I got that from looking at the OCaml code as well.

In the blog post Brad Fitzpatrick is quoted as saying "Write a test. Prove it to me." Now maybe it's my maths background, but when someone says "prove it" I expect a proof, and the bowling unit test example proves nothing more than that it works in a few common cases.

Yes, I realize that proving anything non-trivial in programming is Very Hard, but without any sort of reasonable attempt at even talking about this problem, TDD seems more or less ad hoc to me.


I can't "prove" my code, but I can prove that my test executes every function, every line and takes every branch and every return. I run my test using valgrind (callgrind with a few options), then I wrote a valgrind parser and combined that with a c++ parser and output the results. Some fun tidbits:

Every time, I discovered a bit of code that can never be executed.

The number of functions that I never even call, even when I am using my test generator (built on top of the same C++ parser).

I enjoy coding. If I spend 1 hour coding a test that finds nearly all of the bugs in a class, that means later I won't spend a week deciphering reports and fixing bugs that are annoying users. I get to hack more and my users get more stable code.

If you want to try out my code, http://arora-browser.org/

The c++ parser I used http://github.com/icefox/rpp/tree

My evil little test generator http://benjamin-meyer.blogspot.com/2007/11/auto-test-stub-ge...

My valgrind tools http://benjamin-meyer.blogspot.com/2007/12/valgrind-callgrin...


You should have a look at QuickCheck. It's a Haskell library where you (more or less) specify laws that your program should satisfy, and it generates test cases on its own. I found it very useful.
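(Python has an analogue in the Hypothesis library; a minimal sketch of the same idea, stating laws and letting the tool invent the cases:)

    from hypothesis import given
    from hypothesis import strategies as st

    @given(st.lists(st.integers()))
    def test_sort_is_idempotent(xs):
        # A law, not a hand-picked example; Hypothesis generates and shrinks cases.
        assert sorted(sorted(xs)) == sorted(xs)

    @given(st.lists(st.integers()))
    def test_sort_preserves_contents(xs):
        assert all(xs.count(x) == sorted(xs).count(x) for x in xs)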


QuickCheck is cool. But don't make the mistake of thinking that it proves that the properties hold universally.


Yes. But it's closer to that goal than coding up test cases by hand. (If `closer' is a suitable expression for comparing infinite distances.)

SmallCheck and Lazy SmallCheck also seem worth a look, but I have not used them yet.


Proving test coverage is just as (in)feasible as proving correctness. This is the elephant in the room for testing: it relies on programmer intuition to enumerate cases. It's easy to neglect a case both in the implementation and in the test. This is why tests are better at finding regressions than new bugs.


I'm unclear on how anybody can cite the author of the latter with any kind of reverence, but I see it happen all the time.


This seems unfair. (a) OCaml != Clojure; maybe the former is better for this problem. (b) Uncle Bob writes "I'm trying to learn Clojure," i.e., he admits he's brand new to the language and doesn't (yet) know what he's doing. So, maybe cut the guy some slack?


I reread the Norvig vs. Jeffries posts, and you're right. That example was breathtaking, and it probably colored my judgement about this Uncle Bob example.


Enlighten us folks who aren't methodology scenesters. What is wrong with this Uncle Bob guy?


Martin Fowler is an enterprise buzzword methodologist. He was into UML, refactoring, patterns, and Java back in the day. Nowadays he is into agile, TDD, and Ruby.

He commands a lot of respect in certain circles, and even I have the Refactoring book, which was a disappointment for me, because it covered practices that were trivial for me after two years of professional work, while going into too much detail.


The plural of anecdote is not data, but when I worked at ThoughtWorks, Martin sent me some code to look at. It was very clean OO code (very Smalltalk-ey in design and form, though it was written in Ruby). The man can code well (not to the level of, say, Peter Norvig, but then not many of us can) and competently; he is very sharp, and when he gets hands-on, he solves problems fast. It is true that he doesn't have any code online where we can look at it.

Another admirable trait I've witnessed is his willingness to say "I don't know" when asked questions outside his circle of competence. He often says it very bluntly (which has caused some heartburn to some ThoughtWorkers once in a while, but that is another story).

His "Enterprise Patterns" book is very useful for anyone working on enterprise software. I believe DHH was influenced by the book when he was hacking Rails.

His upcoming book on rich GUI apps seems promising too. Also, he doesn't make outrageous claims or adopt a condescending attitude toward programmers who don't work the way he does/advocates (unlike, say, Robert Martin). I wouldn't dismiss him as a "know-nothing" talker/blowhard.

As I said, just anecdotes. Make of them what you will.


Enterprise Patterns is a classic; that alone redeems the man's non-hackish career.

P.S. I also enjoyed Enterprise Integration Patterns by Hohpe and Woolf.


His enterprise application patterns book is superb: an excellent reference to have to hand when you're planning a system. And because it's all written down, you can point to it for substance behind your decisions when you get skepticism from straight-line programmers who just want to start coding without getting excited about structure.


Martin Fowler != Robert C Martin (Uncle Bob)


Oops, my mistake. But the above still holds for Martin Fowler, btw. And all those enterprise methodologists look the same to me: they advocate something, but you never see the code they produce, as it is some proprietary CRUD system at the end of the day :)


I don't know him from Adam, but his "TDD-based" bowling attempt in Clojure seems like a microcosm of that other guy's Sudoku solver in Ruby.


I asked because I have seen a few other people whose opinions I respect dismiss his writings. So I had to ask.



Intelligent people often dismiss Robert Martin's writings because he has very extreme/condescending opinions about programming without having the hacker cred to match them. (See 10ren's post above for an example of his condescending attitude and an extreme opinion.)

Someone like Linus Torvalds can pull this off; Bob Martin, not so much. Linus nearly always has good reasons for his stated beliefs, and agree or disagree, you can see why he says what he says. Also he is an acknowledged uber hacker. Bob Martin does write code (FitNesse), unlike most other agile "guru"/consultant types, but it isn't really anything extraordinary (which is all right, as long as he doesn't presume to then teach other people how to program "correctly").

It is interesting that a great hacker like Peter Norvig is modest and unassuming and chooses his words carefully, as does Knuth, for example, while people like Ron Jeffries and "Uncle" Bob say outrageous things with nothing to back them up but faith in their ideology... err... methodology.


Well, except that the bowling scores thing had a clear spec from the beginning, and was an easier fit for a blog post. The Sudoku thing seemed more like thrashing around in the how-to-represent-data space to avoid thinking about the algorithm.


I know which I would like to maintain.


So now that I've voted up every plinkplonk comment, my comment is that most people obviously haven't read the book. The post is pretty good, but the book is better. You should read it.

If you do read the book, you will see that the discussion begins with Norvig saying that he thinks one of the most important things is being able to keep everything in your head at once. Extra tools come in when the problem gets too big to do that, but here's the key: he saw right from the very beginning that the Sudoku problem could be solved with two tools from the AI toolbox. In other words, he saw the entire solution immediately, and it was never too large to fit easily in his head.

Seibel's analysis is fine as far as it goes, but he misses the really important thing here, which is that this is not just an example of someone recognizing a problem that they already knew how to solve. It's a case of someone with the mental tools that allow them to dramatically reduce the (apparent) complexity and size of a very large class of problems that happens to include Sudoku. Seibel takes a bottom-up look at the data structures Norvig used, but this can be misleading, because they weren't designed bottom-up; they were designed all at once. You simply cannot do that unless you have the necessary training in abstract thinking (read: mathematics/formal logic/language development/AI techniques/etc.).

No amount of code-centric techniques or tools will ever make up for not having these tools. There will always be (relatively trivial) problems that you will never be able to solve without them, because you will not be able to fit everything in your head, and your ability to reason about the problem will be crippled by that. Debates about TDD are not even wrong, because they are at the wrong level of abstraction. Spending any significant amount of time discussing it is premature optimization.


I generally take a middle road - I write code that is unit-testable, but I rarely take the time to write exhaustive tests. When bugs arise, I start writing test cases in various components until I find them. Thus, the debugging effort is what grows the test coverage. Of course in a vacuum it's better to have the cases earlier rather than later, but I like this approach as a speed/quality compromise. The key to making this work is designing in a test friendly way - that is the true art.
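(Concretely, the first artifact of a bug report becomes a failing test that then stays in the suite forever; a Python sketch with a hypothetical invoicing module:)

    from invoicing import total_due  # hypothetical component the bug was traced to

    def test_total_due_ignores_zero_quantity_lines():
        # Written while chasing a live bug report; kept as a regression guard.
        invoice = {"lines": [{"price": 10.0, "qty": 0}, {"price": 5.0, "qty": 2}]}
        assert total_due(invoice) == 10.0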


I strongly agree with this. The sad reality is that some of us have deadlines to meet, and I think this is an option that makes sense: write testable code to get the job done and meet the deadline, and let the unit tests come in later when needed, e.g., for new features, bugs, etc.

Not ideal or best practice, but it is practical, IMHO.


It's not too bad. At least you avoid regressions.


I do that a lot too. I try to call it Test-Focused Development, as opposed to TDD.


I'm personally of the opinion that this comprises a large portion (>60%) of the value of unit tests. Unit testable code tends to not have many of the side effects / global variables / nasty state that make nasty bugs, so this alone tends to drive down the number and severity of bugs. The testing itself is icing on the cake.


The part about TDD in the interview with Peter Norvig really jumped out at me, too. A passing test doesn't always mean anything important has changed - like most other programming tools, there are circumstances where unit tests just create the illusion of progress. I also remember Norvig saying* that sometimes the scope of pass/fail tests may be too narrow, because sometimes having 18 of the first 20 results be reasonable is good enough, but typical testing packages aren't really structured to accommodate probabilistic thinking. Certainly more relevant for some problems than others, but interesting nonetheless.
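(In code, that probabilistic spec would be a threshold assertion rather than an exact match; search and is_reasonable are hypothetical stand-ins here:)

    def test_top_results_are_mostly_reasonable():
        results = search("sudoku solver")   # hypothetical system under test
        good = sum(1 for r in results[:20] if is_reasonable(r))
        # 18 of the top 20 is a pass; demanding 20 of 20 would be too brittle.
        assert good >= 18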

I've seen major benefits from automated testing, and I'm convinced they're a net win (especially for maintenance), but I also think that being dogmatic about any particular testing methodology (TDD, etc.) is going to be counterproductive sometimes.

* I don't remember if it was from the same book, PAIP, AIMA, or something else - I've been bouncing around in several AI and Prolog-related texts lately.


The Norvig points were particularly thought-provoking. How do you tell if Google is returning the correct result?

The other is that if you start with a unit test without knowing what you are aiming for (e.g., the puzzle solver), you may not get there, and TDD won't help you.


As far as testing goes, I have a simple approach.

If I just wrote some code and had to poke at it in a REPL with some different inputs to check if it works, then I should take whatever I did and turn it into a test. With a good testing framework, like the one built into Clojure (formerly clojure.contrib.test-is and now clojure.test), it's really easy. I just copy from my REPL and paste it into the appropriate test file, along with the expected results.

If I find a bug, I fix it, write a test for it, and also copy whatever I did to check my work from the REPL into a test file.

That's it. I routinely change code around with all the confidence TDD advocates claim. I don't test trivialities. I don't test whether my database connection was correctly established --- but I will test database-connection-loss error handling if I put any non-trivial logic in there.
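(A sketch of that last kind of test using Python's unittest.mock, with a hypothetical fetch_user_with_retry as the non-trivial logic:)

    from unittest import mock
    from myapp.db import fetch_user_with_retry  # hypothetical logic under test

    def test_retries_once_when_the_connection_drops():
        conn = mock.Mock()
        # First query raises; the retry then succeeds.
        conn.query.side_effect = [ConnectionError("gone"), {"id": 42}]
        assert fetch_user_with_retry(conn, user_id=42) == {"id": 42}
        assert conn.query.call_count == 2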

I'll often write a little snippet of sample code for an API I'm working on, but I never write a full formal test in advance. That makes it too cumbersome to let the actual API evolve as it grows.



Though writing a test case for an evolving API might be of benefit from time to time, because it forces you to use the API (and thus see if it is sensible).


You know what has improved the quality of my code? Having every error on my live servers emailed to me. I feel compelled to fix it because I know that an actual user has had a problem.

I tried to do unit testing for a project. I'd write a method, then write some tests for it. Some observations:

* I thought that writing the tests would give me refactoring ideas for the actual methods. It did not.

* Running my test suite prior to each deploy saved me from shipping a bug once. Once.

* Writing tests is fucking boring.

This was all code that I knew the purpose of in advance. I have no idea how one writes tests when doing exploratory coding.


"I have no idea how one writes tests when doing exploratory coding."

You're probably already writing tests, especially when you're doing exploratory coding. How much code do you really write before you check to see if something works? You're almost certainly doing something. Maybe it's just a small main program where you validate that some methods are giving you the output you expected, or maybe it's a simple web form that you populate with mock data to verify it's persisting to a database. Unless you wait until the entire functionality is ready before a trial run (and if you're doing exploratory coding, there's no such thing as "ready" anyway), you are writing something to test during iterations.

Instead of throwing these tests away, keep them somewhere and run them periodically. If they break, figure out why - and either fix the code, or change the test. When you're done, you'll have a bunch of unit tests.

(Just a quick note: the process I described above works extremely well for some situations, but poorly for others. Some frameworks make it a real hassle to write unit tests. The process I described also is not TDD; it's more of a write-a-little, test-a-little approach, which feels more natural to me. I also don't really like to use TDD, especially for exploratory coding.)


You're right: I write a little, do a test run of the code to see what comes back, fix it if it doesn't work, write a bit more and so forth. And I see how saving those little tests could be useful.


There are testing packages for a couple languages (Python and Lua, for sure) that let you just copy in the text directly from the interpreter and use them as regression tests.
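(In Python that's doctest: paste the interpreter transcript into a docstring and it replays as a regression test. To stay on theme:)

    def score_frame(rolls):
        """Score a single open bowling frame.

        >>> score_frame([3, 4])
        7
        >>> score_frame([0, 0])
        0
        """
        return sum(rolls)

    if __name__ == "__main__":
        import doctest
        doctest.testmod()  # replays the transcripts above, reports any drift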


I simply loved this article and the comparison made between the two Sudoku solvers.

Peter Norvig is one of my favorite programmers ... his solutions are always simple and elegant. Have a look at his spell-checker here (a solution I used in production): http://norvig.com/spell-correct.html

He just kicks ass, and it reinforces that people (at least in software development) are more important than processes.

I don't do TDD because our consultancy gigs only last for 2 months tops, during which we have to do what other firms are doing in 1 or 2 years. We are also most of the time in uncharted waters, and doing proper unit-testing of such modules requires even more thought than the actual functionality. And after two months tops, the project is over, and we won't get any money for extra work. We are doing unit-testing for really critical sections though.

But I do wish I had the time and the skills to do TDD.


fwiw, I wrote the original blog post (the one Norvig was speaking about in his interview) comparing the Sudoku efforts of Norvig and Jeffries: http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-s....


When testing helps, test. When testing hinders, stop testing.

This business of "must" and "always" anything?

Not so much.

Unless your business is a lucrative hourly rate telling those who believe in "must" and "always" what they should be doing instead of thinking for themselves.


In all fairness, how can anyone compare Peter Norvig and Ron Jeffries on the same level?

One is an AI genius and the other is an XP coach. They are on very different levels intellectually.


Solving Sudoku isn't exactly rocket science. I can understand if you don't have the cleanest, most elegant solution in the world, but if you can't even make a tiny bit of progress toward writing a Sudoku solver in a few hours, then I don't think you have any business holding a programming job, much less telling other programmers how to do their jobs.

I mean, those Ron Jeffries blog posts read like someone who has never solved a non-trivial programming problem in his life. He literally makes no headway on any difficult part of the problem, and he spends what appears to be the better part of several hours working hard to get nowhere on code that does very little. If someone is listening to him about how to approach programming projects, I've got a bridge to sell that man.


In fact, solving Sudoku puzzles was a programming assignment given in my first year of undergraduate study. We were given input/output specifications, and told to write in C++. That was it.

Seeing as how most of the class passed, I assume almost any joker can write a half-decent Sudoku solver if sufficiently motivated.


It's not just that Norvig is smart; he's specifically skilled at writing really good code, apparently because that's something he cared about and worked on. At JPL I worked with a bunch of smart researchers with PhDs but some wrote better code than others, and it didn't mean they were on a different level intellectually.

Norvig's book _Paradigms of AI Programming_ has 900+ pages presenting code as instructive as that Sudoku solver; I've never seen a better collection. http://norvig.com/paip.html


Dude, a Sudoku solver is really just direct brute-force search. If a coder can't come up with the brute-force algorithm for a 9x9 Sudoku board, there's something wrong with that person.
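(For the record, the whole brute-force backtracking approach fits in about twenty lines of Python; grid is a 9x9 list of lists with 0 marking empty cells:)

    def solve(grid):
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    for v in range(1, 10):
                        if ok(grid, r, c, v):
                            grid[r][c] = v
                            if solve(grid):
                                return True
                            grid[r][c] = 0   # undo and try the next value
                    return False             # nothing fits here: backtrack
        return True                          # no empty cells left: solved

    def ok(grid, r, c, v):
        # v must not already appear in the row, column, or 3x3 box.
        if v in grid[r] or any(grid[i][c] == v for i in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(grid[br + i][bc + j] != v for i in range(3) for j in range(3))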


Brute force? You can write a solver in 15 simple lines of Zimpl code (pun intended) that uses state-of-the-art constraint propagation and all that.



The point is that some people are too blinded by their ideology to even come up with a crude solution, let alone an elegant one.


"One thing I noticed, reading through Jeffries’s blog posts, was that he got fixated on the problem of how to represent a Sudoku board."

Reminds me of pg's writings on exploratory programming (I think that is what he called it).

For me (maybe I am a bad programmer), I always change my mind on how to represent things once I sit down to code. I can read specs or think about how to do stuff over and over; it all only comes together when I finally start to code. So it sounds as if too much test-first would get me stuck. Plus, as the article points out, it would be boring to always just work on unit tests and never actually produce code that does stuff.

Not that I dislike unit tests, but I tend to write many of them after coding the main stuff (in the testing phase).


Nothing can excuse testing a non-solution, so I won't defend it. But correct code, code that does solve your problem, isn't always pretty: sometimes it's straw instead of gold. And testing is orthogonal: you can write untested gold, untested straw, tested straw, or tested gold. Sometimes you're lucky and untested gold bursts fully-formed like Athena from the head of Zeus. Otherwise, be glad for testing, which lets you turn untested straw into tested straw, and thence, via a refactoring Rumpelstiltskin, to tested gold.


I agree with Spolsky on this one, and I can't help feeling TDD is like agile: one of those things cool companies and corporates use to sound "cutting edge".

We used TDD for a few projects (and agile too) and found it does slow you down no end.

And it also ultimately doesn't catch many of the important bugs and problems. We still had to go through the usual end-point testing cycles.

We write unit tests for all of our APIs and also for some of our core code, but only after the original versions are in place. They are supposed to catch any mistakes or errors in future as we enhance and evolve the code, so we don't break backwards compatibility.

I think it really does come down to what works best for your teams, though. I'm sure plenty of people find TDD a great addition to their arsenal.

But at the end of the day it's just a buzzword like everything else, and we've found that by avoiding those kinds of things we produce good, solid, working code at a fast (though not necessarily faster) pace, and with fewer headaches :)


The thing with unit testing is, a lot of the time it's used to try and make a weakly-typed language with uncontrollable side-effects behave like a strongly typed language in which pure functions can be written.

A team lead who doesn't understand the above sentiment is at best only going through the motions with unit testing.
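(In Python terms, the tell-tale symptom is tests doing a type checker's job; total here is a hypothetical function under test:)

    def test_total_returns_a_number_not_a_string():
        # In a weakly/dynamically typed language this needs a test;
        # with static types it would simply fail to compile.
        result = total({"qty": 2, "price": 3.50})   # 'total' is hypothetical
        assert isinstance(result, float)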


I'm not sure I agree.

Unit testing is about code conforming to a specific standard (i.e., that an API works like X or like Y).

Surely you're referring to fuzz testing?


That was a fabulous analysis. For my part, I can't quite figure out why people seem to get emotional about TDD one way or the other. If it helps you, great. If not, ditch it. Do people work in environments where they'd like to avoid TDD but are forced to work that way? Or vice versa?


"Do people work in environments where they'd like to avoid TDD but are forced to work that way? Or vice versa?"

Yes. Unfortunately, it's one of those things that only works well if you have total buy-in from everyone on the team. Maybe we should avoid methodology that requires too much of that. Reminds me of how communal living only works if everyone pitches in. How about a methodology that works entirely off of inherent human laziness and self interest? XP works off of some laziness. But it's more like Air Cavalry in Vietnam. Land the soldiers behind enemy lines, and they have to do something productive just to save their own butts. Likewise, write some failing tests, and you have to code something to get the tests to turn green. (pass) But the problem is getting the soldiers on the choppers to begin with, or getting developers to write the tests. In the Army, there's the threat of the stockade and the firing squad. In XP, it's just getting fired.

Another problem that startups are supposed to be able to avoid.


Can't the teams just decide this by consensus?

There're a bunch of these practices that require total buy-in from a team. Things like "Are we going to use bugtracking software to coordinate who's doing what?" "Are we going to have daily stand-ups?" "What'll the conventions be for X?" etc.

On the projects I've been on lately, the answer has usually been "Fine by me. Let's give it a try and see if it works out." And if it's not working out, we don't do it anymore. But oftentimes they really do help, and we end up introducing the same practices to our next project.


> And if it's not working out, we don't do it anymore

It's not as simple as that.

Once you start using a bug-tracker the bug database grows and grows, and that data is not easy to migrate to something else. And then you're stuck with a substandard solution that makes your life a hell of a lot harder than using plain old emails with naming conventions.

Daily meetings are good for some people but bad for others. Scrum meetings are advertised as a way to "get in the flow" and to let others know of your problems so they can help you. But in practice a scrum meeting turns out to be just your average status report you give your manager, and it's not going to stop him from interrupting you later anyway. This means that whenever you weren't on your best behavior, you're going to game the system and lie, making the whole meeting pointless and a waste of time.

"Conventions for X" in a team with lots of opinions quickly turn into decision meetings. I had to argue really hard to choose a sane naming-convention for our database, because our tech-lead was a Java programmer "with years of experience" and the project manager was incapable of making such decisions and was delegating these choices to the tech-lead. Two years later and the whole project is a mess, because some decisions where made as if we could change them later, and others where pulled out of somebody's ass as a conflict resolution.

For an epic example of such a train-wreck, go read "Dreaming in Code".

Teams aren't going to decide anything right by consensus. You need one or two programmers on your team who have strong leadership skills and generally just kick ass, and let them decide.


> Once you start using a bug-tracker the bug database grows and grows, and that data is not easy to migrate to something else.

If it's a new project, why would you need the old bugs?

Rereading that, I realize it sounds flip, but realistically the way things work here is that we get the critical bugs fixed, we launch, we get the annoying bugs fixed, and then we promptly forget about everything that's left and go start a new project. There's nothing wrong with reverting to e-mail, or index cards, or permanent marker on the back of your hand for the next project. The bug database is huge - literally millions of bugs - but realistically they're all either fixed or won't be fixed. People only care at the margins, the projects that are actively being worked upon.

> But in practice a scrum meeting turns out to be just your average status report you give your manager

I don't give status reports to my manager. The important part is that my peers know what I'm doing, and it's much easier to tell them all at once than individually. That is, if they care - which is why we don't do them if people don't find them helpful.

We're also allowed to say "I did nothing today" at the stand-ups, with no questions asked (though I suppose if that became "I did nothing this week", some questions might be asked). It's actually the managers that say this most often, because they waste the most time in meetings.

> For an epic example of such a train-wreck, go read "Dreaming in Code".

Read it. IMHO their problem was that they never pinned down the goals for the project in a way that was concrete enough that you could say, "This is what Chandler is, and this is why it's important." If you can't do that, you've lost regardless of your process and how many superstars you have on the team.

The reason they kept running around in circles concerning coding conventions is that they had nothing else to concern themselves with. If you know why you're doing something, you're much more inclined to say "Let's move on and not argue over bikesheds, because we want to get some real work done."

> Teams aren't going to decide anything right by consensus.

Every team I've been on here has done just that. Consensus doesn't mean that there's no decider, or even that everybody agrees. It means that everybody's been asked, and a solution has been proposed, and everyone is content enough with that solution to move forward. Oftentimes, that's because they find it more work to argue than to just hold their tongue and do it, but the net effect is that they can do something and at least see whether they were right or wrong.


So the general conclusion, from various programming legends, seems to be:

• (semi-)automated testing for any complex code: essential

• unit tests for any long lived code: great

• TDD (writing the tests before the code): good in theory but alien to most folks

• tests speed/slow you up/down: open question (though it would seem logical that there will be certain projects that benefit more than others, e.g. the longer lived the code the longer you can earn "interest" on your initial "investment")

Yet there still seems to be a general reaction here of "I'm too smart and/or busy to write tests" and the corollary "Anyone who writes tests, or advocates testing, isn't a good programmer".

Seems a strange disconnect.


"(semi-)automated testing for any complex code: essential"

Where did you see this idea in Coders At Work? Most of them say testing is important. Most of them don't say "semi-automated" testing is "essential". So where is your perceived "consensus" of programming legends coming from?

or this

"TDD (writing the tests before the code): good in theory but alien to most folks"

for that matter? Seibel mentions five developers in his blog post - Zawinski, Knuth, Norvig, Bloch, Armstrong. I didn't see anyone saying "good in theory but alien to most folks". Only Armstrong says he regularly does (something like) "test driven", but if you read his interview he also says he spends a lot of time doodling and getting his abstractions right before coding, hardly the "YAGNI" style of coding and theory of "emergent design" propounded by most agile consultants.

Who in CaW do you think concludes either of the two statements you put forward as "general conclusions from programming legends"?

Or are you saying Robert Martin and co are programming legends? Bit hard to justify I'd think ;-)

You say,

"There still seems to be a general reaction here" .. "I'm too smart and/or busy to write tests" and the corollary "Anyone who writes tests, or advocates testing, isn't a good programmer"."

The former " I am too busy to write (TDD style) tests" is a justifiable argument in certain circumstances (Zawinski says as much) and the latter "Anyone who advocates testing is a poor programmer" is something I don't see anyone here saying. Who said this? Links please? And if someone did, why do you think it is a "general reaction"?

fwiw I see the evolving consensus here as "writing unit tests is valuable in certain circumstances (and not so in others), as long as you aren't fanatical about them" and "what is important is not the methodology but the end result".

You seem to be putting forward your beliefs about the value/validity of TDD/testing etc as the "general conclusion, from various programming legends". I am curious as to how you came up with this apparent consensus.

Don't get me wrong, you certainly have a right to your conclusions. I am just saying the CaW interviews and the posts here don't seem to converge to a conclusion along the lines you say they did.


I got all this directly from the linked article:

Zawinski: created automated tests for a complicated corpus of quasi-standard email headers that he triggered manually. When he rewrote the mailer in Java he unit-tested, because it was easier in a Java OO environment. He says there are specific areas where unit testing would have sped them up, but that they were under so much time pressure, with short-term ROI horizons, that they didn't bother; they just went with the gamble of "getting it right first time". If they had more time, then unit testing was "certainly the way to go". He specifically states that they traded quality for a ship date.

Fitzpatrick: Does "a lot of testing". Tests anything half-way "clever" because others will break it. Forces people who work for him to write tests. "At a certain point, people realize, “Holy crap, it does pay off,” especially maintenance costs later."

Joshua Bloch: automated testing of transactional shared-memory with a "basher". When it failed at high concurrency he had to use unit tests to locate the bug, which wasn't in his code but the underlying Solaris assembly code. This was early 90s and he implies that finding a bug in such code today would probably result in the engineer being shouted at for inadequate testing, though perhaps not in concurrency areas as testing those is still an "art". Argues that writing an API before the code, which he does, isn't technically TDD though I'd say it's in the same ballpark.

Knuth: is "a fan of after-the-fact torture tests".

The blog author interprets the other comments as anti-TDD, though they seem unrelated to me.

Joe Armstrong: Actually does TDD.

Norvig: "I think test-driven design is great. I do that a lot more than I used to do."

I don't understand how you could have come up with a different reading. The original post seems to be half-heartedly questioning TDD (without much support from the quoted sections), but that is on the far end of a continuum that has automated tests, then unit tests, on the way towards it. Which I thought I made clear with my list, where the near end of that spectrum (automating tests for complicated areas) is better supported than the far end (TDD according to the book). Many responses I see here seem to reject testing entirely.

(edit: just to be clear when I say "automated" I mean they wrote a program to test their program. It doesn't necessarily mean that they run it on every change, or check-in or whatever. I just mean it runs and says: pass or fail without a human being having to examine the output, though that output may be helpful when it does fail.)


5-1/2 of 6?

(EDIT: you first claimed that of the 6 developers from CaW, 5-1/2 of 6 supported your claims, Knuth being the "1/2". You've since edited your post.)

5-1/2 of 6?

You must be kidding. You are cherry-picking statements that support your viewpoint and ignoring those that go in the opposite direction.

These were what you said "programming legends" had a consensus on:

"• (semi-)automated testing for any complex code: essential • unit tests for any long lived code: great • TDD (writing the tests before the code): good in theory but alien to most folks"

Let's see now.

"after the fact torture tests" (Knuth) is hardly support for "TDD' or "semi automated testing is essential". SO that's one down (or 1/2 down :-P) So that yo are down to 5 of 6 right there.

BTW Knuth also said (http://www.informit.com/articles/article.aspx?p=1193856)

"As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."

Zawinski, Knuth, Norvig etc don't say anything like "semi automated testing is essential" (one of your conclusions) or "TDD is good in theory".

It is one technique among many they sometimes use, certainly.

Norvig (for example) specifically says

"Seibel: What about the idea of using tests to drive design?

Norvig: I see tests more as a way of correcting errors rather than as a way of design. This extreme approach of saying, “Well, the first thing you do is write a test that says I get the right answer at the end,” and then you run it and see that it fails, and then you say, “What do I need next?”—that doesn’t seem like the right way to design something to me"

So that is 4 of 6.

Zawinski says

"Seibel: In retrospect, do you think you suffered at all because of that? Would development have been easier or faster if you guys had been more disciplined about testing?

Zawinski: I don’t think so. I think it would have just slowed us down. There’s a lot to be said for just getting it right the first time. In the early days we were so focused on speed. We had to ship the thing even if it wasn’t perfect. We can ship it later and it would be higher quality but someone else might have eaten our lunch by then."

In other words, in retrospect, if speed is a concern, "being disciplined about testing" is NOT important, which is hardly "consensus" about the centrality of testing.

So that is 3 of 6.

About Bloch, you say "Argues that writing an API before the code, which he does, isn't technically TDD" and you concluded "it is in the same ballpark"?! You conclude something 180 degrees opposite to what the interviewer understood?

Josh said

" I do disagree with Martin (Fowler) on one point: I don’t think tests are even remotely an acceptable substitute for documentation. Once you’re trying to write something that other people can code to, you need precise specs, and the tests should test that the code conforms to those specs. "

So "write precise specs, write code, and then add tests that the code conforms to the written specs" is "in the same ballpark" as TDD? This is how software has been developed for ages.

By this logic, any thinking about code before coding is "in the same ballpark" as TDD. A strange argument.

Besides, Josh Bloch wrote a test when everything else failed. And when specifically asked if an automated test would have been useful, he says:

"I think a good automated unit test of the mutex facility could have saved me from this particular agony, but keep in mind that this was in the early ‘90s. It never even occurred to me to blame the engineer involved for not writing good enough unit tests. Even today, writing unit tests for concurrency utilities is an art form."

More nuanced than your claim that Josh was all for automated testing, don't you think?

So 2 of 6.

So how did you come to your conclusion that "TDD is great in theory but too strange for most folks" is something they concluded?

You are concluding a few things and cherry-picking statements to support this.

Every developer in the world would love an extensive (semi-)automated test suite if it didn't have a cost to write and maintain.

You can't take a statement from someone that "tests are great" and ignore "tests are for correcting errors" or "when speed is critical, tests are not important".

My point was that you say "programming legends" conclude various things. As I see it they don't conclude anything beyond "tests are good in certain circumstances for specific purposes". They make some generic statements in support of "testing" but are very cautious of "test-driven development" or "you must test every bit of code" or even "you must have an automated test suite".

There is no consensus for the stronger statements you put forward. Nobody says "TDD is good in theory but most people don't get it". Almost everyone says it is a bad idea.

The second part of your original post, that the consensus evolving here on HN is "anyone who advocates testing is a bad programmer", is still unsupported. Please, show me links.

EDIT: I see you've edited your post and are now proposing a weaker claim, "testing" vs "TDD". If you are saying "programs that test other programs" are a useful technique to have in your arsenal, sure thing.

Programmers have written test drivers (in other words, programs that test programs) since the beginning of programming. This is hardly an argument for "there is a consensus among programming legends different from the consensus here on HN".


I think this is going nowhere but, some points:

I assume that anyone who does TDD thinks unit tests are a useful tool, and that anyone who uses unit tests thinks that automated tests are a useful tool. That seems logical.

Regarding thinking about the API, that's not "thinking about the code", it's thinking about the code that's going to call the code. Unit tests are also code that calls the code. Orthodox TDD theory claims that by writing code that calls the code first you are in fact designing good APIs. Seems related to me.

JWZ's comments on speed are all over the map, probably because he's not talking about sustained speed of development, but a very short shipping deadline based on market forces. A sprint versus a marathon. I think his comments strongly support unit testing in the marathon case (which surprised me based on other things I'd read).

If you have to do something because nothing else works, I'd call that "essential".

Regarding TDD, which seems like the real sticking point, as opposed to unit-testing or automated tests. Two of the six actually claimed to do it personally and like it. Two of the others seemed so enthusiastic about unit-testing that TDD didn't seem like a big leap to me. One designs APIs by considering calling code first, which as I said above, reminds me of some elements of TDD. The last one was talking about writing code in pencil. It didn't seem relevant, though the blog poster thought it was.

I didn't say they did TDD, by the book, 100% of the time, or that they thought it was suitable for every task. But if someone actually says they do TDD then who am I to argue?

I initially commented mostly because of the dissonance: the article quotes are very pro-testing, with two people flat out claiming to do TDD, and yet the overall tone of the original post commentary was anti-TDD, and many comments here (and in previous related posts) were anti-testing in general (not specifically TDD, which I believe the true faithful don't even like referring to as "testing", since it's actually about "design"). That anti-testing feeling seemed to be drawing inspiration from these same heroes, who, when I read their own words, struck me as very upbeat about testing.


" that automated tests are a useful tool. "

"useful" is different from "essential". Conflating the two words hardly makes for a coherent argument.

" Two of the others seemed so enthusiastic about unit-testing that TDD didn't seem like a big leap to me."

The problem is in this "leaping" and essentially putting words in their mouths to conclude what they didn't.

When you quote others in support of your conclusions we have to look at what they actually said, not where you landed up after "leaping" from what they said. So you have two of six TDD-ing/testing/whatever it is you are claiming now.

Unit testing existed long before TDD was put forward as a named practice.

Unit testing AFTER writing the code is NOT TDD, as its proponents take great pains to make clear. You can't "leap" to conclusions directly opposite to the meaning of terms.

If you read the actual quotes, only one person says they regularly do TDD (not automated tests after writing the code) in the accepted meaning of the word "TDD" (TDD == write test code BEFORE you write app code, then refactor to get the "design" correct, aka "red-green-refactor").

And the one person who writes tests first, Joe Armstrong, does significant "design up front".

This hardly supports your original claims.

"One designs APIs by considering calling code first, which as I said above, reminds me of some elements of TDD. "

By this logic, if I think about how a user will use my web app, I am doing "TDD". You can't write any code, API or non-API, without thinking of how it is used. It is just that in API design the "user" is code. Every API designer in the world does this thinking.

Doing this thinking is hardly "TDD". Every programmer in the world is then (in your words) "in the ballpark" of TDD practice. Again, hardly a strong argument.

"The last one was talking about writing code in pencil."

This must be Knuth. So now you are down to 1.

"I didn't say they did TDD, by the book, 100% of the time, or that they thought it was suitable for every task."

No. But you said there was some sort of consensus among programming legends that "TDD is good in theory but too alien for most folk" (I am quoting you exactly). No one said anything like this or anything that implied this.

All the actual interviews (vs your "leaps") show is that (a) these developers are aware of TDD, (b) some of them use it occasionally, and (c) the one person who uses it regularly (Joe Armstrong) does many things differently from the theory of "TDD" as propounded by its more mainstream practitioners.

At best you have 1 person of 6 who "TDD"s regularly. This is "consensus" that "TDD is good in theory but alien to most folks"?

Also, in your initial post you explicitly juxtaposed these programming legends' "consensus" with a claim that here on HN there was an emerging counter-consensus that "anyone who advocates tests is a bad programmer". I asked for links to anyone saying that. You didn't provide any.

Let me ask again: how do you support that?

Peter Seibel pulled together a blog post on what his interviewees said about TDD and testing. You interpreted fragments of the interviews to reach unsupported conclusions opposite from Seibel's (which are very balanced, btw). Then you said the "general consensus" on HN is that "anyone who advocates testing is a bad programmer".

When challenged you edit your posts and play with words.

Ahh, forget it. This thread is too deep now. And since you edit your posts continuously, so that my responses don't make sense anymore (apologies to other readers), it is not worth my time trying to keep up.

If you always intended to say "developer testing (including writing unit tests, drivers, whatever) is a good thing" in general, then I agree.

Your original post made very strong claims (since edited, which is very frustrating to responders). That was what I reacted to. And you still haven't supported your claims about the "HN consensus". ;-)

But yeah whatever! Have a nice day.


Great article. Lots of detail and the author doesn't get into editorializing too much.

I'm in the "want to believe" category with TDD, solely for the reason that I think there are a lot of smart people advocating it. Having said that, I haven't seen a lot of teams that use it and continue using it for a competitive advantage -- and that sets off alarm bells. The hype may very well be ahead of reality on this one. I honestly don't know.


Unit testing is the only way to scale lines of code without scaling your team.

