
The best bug tracking system. Don’t raise bugs, write an automated test - fogus
http://www.makinggoodsoftware.com/2010/02/23/the-best-bug-tracking-system-dont-raise-bugs-write-an-automated-test/
======
jasonkester
Here are 5 real-world bug reports, in their entirety. Anybody care to rewrite
them as automated test cases for me?

    
    
* "Try it Now" button sometimes drops below the "Learn More" button in Firefox

* Capitalization inconsistent in top nav

* Popup blockers occasionally stop the App window from opening properly

* Weird timeout errors clumped around 1:00 am every few nights

* Accounting needs to be able to call up weekly income reports
    

In my experience, it's actually pretty rare to find a bug that can be wrapped
in a test case without actually discovering and fixing it in the process. Most
things that end up in the bug tracker are either cosmetic, weird "solar
activity" things, or disguised feature requests.

~~~
jerf
Whenever unit testing comes up, people always cite GUI issues as
untestable. But it should be pointed out that in Fred Brooks's usage of the
terms, this is an _accidental_ problem, not an _essential_ one. GUIs are hard
to test not because GUIs are inherently hard to test, but because _your GUI
doesn't contain provisions for being tested_.

Nothing stops a GUI from being very queryable. Nothing stops you from being
able to query the locations of two widgets and making assertions based on
those locations. Nothing stops you from getting text back out, or verifying
fonts, or verifying the lack of overlap, or lack of popups... except that GUIs
are written to be opaque monoliths, graveyards of data. This is correctable,
but not by the end programmer.

So it is true today that GUIs are untestable, but it doesn't _have_ to be
true. It's going to take a conscious effort by Qt or GTK or someone to _make_
their GUIs testable before anything will change.
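Nothing in principle stops those assertions from being written down. A minimal
sketch in Python, where the `Widget` class and its geometry fields are
hypothetical stand-ins for whatever introspection API a testable toolkit might
expose:

```python
# Hypothetical queryable-GUI test: assert two widgets don't overlap.
# Widget and its fields are invented; a testable toolkit would expose
# something equivalent.
from dataclasses import dataclass

@dataclass
class Widget:
    x: int
    y: int
    width: int
    height: int

def overlaps(a: Widget, b: Widget) -> bool:
    """True if the two widgets' bounding boxes intersect."""
    return not (a.x + a.width <= b.x or b.x + b.width <= a.x or
                a.y + a.height <= b.y or b.y + b.height <= a.y)

# e.g. assert a button hasn't wrapped below its neighbour:
try_it_now = Widget(x=10, y=20, width=100, height=30)
learn_more = Widget(x=120, y=20, width=100, height=30)
assert not overlaps(try_it_now, learn_more)
assert try_it_now.y == learn_more.y  # still on the same row
```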

(I acknowledge your other issues; don't agree with all of them but the GUI
issue is what I'm passionate about.)

~~~
dasil003
As a Rails developer doing a fair amount of TDD and some BDD, I have a first-
hand view of the state of the art of view testing for the web. Tools like
Webrat, Selenium and Cucumber are compelling because they hold the promise of
end-to-end testing.

I'm really glad that people are working on solving these hard problems, _but_
I'm also a designer and usability guy, and I can tell you that these tools
are still very, very crude. You know the sayings "when all you have is a
hammer, everything looks like a nail" and "use the right tool for the right
job"? Well, to developers, code is the hammer. Developers (especially
consultants) would like to believe that if they write a cuke test and it
passes, then their job is done. But the fact of the matter is that test code
is not free to write or even to run, and what a QA person actually tests when
they click through an application gives several orders of magnitude more
coverage than a state-of-the-art acceptance test does (this may change
eventually, but it's an AI-level problem).

I see a lot of value in the initial acceptance tests that verify basic flows,
but they are _just one tool_ in your belt. Usability testing and QA provide
orthogonal _human_ perspectives that give more bang for your buck than
increasingly fine-grained automated tests. Think of it like this: automated
tests take your product from complete failure to mediocre, but design,
usability testing and QA can take it from mediocre to great.

I've seen brilliant programmers get obsessed with "all green" to the point
that they do all their work without firing up a browser, and miss dozens of
terrible usability flaws that would be immediately obvious the first time you
click through.

None of this is in any way disparaging of GUI testing per se; it just has
limitations which I've seen ignored for the sake of code fetishism.

~~~
gnubardt
Not sure if it's any different or better, but Project Sikuli
(<http://groups.csail.mit.edu/uid/sikuli/>), an API for scripting visual
actions in GUIs, could be useful for testing layout and interaction.

Here's an example of using it for unit testing a gui:
[http://sikuli.org/documentation.shtml#examples/TestJEdit.sik...](http://sikuli.org/documentation.shtml#examples/TestJEdit.sikuli/TestJEdit.html)

------
colomon
And of course, the testing for any bug can be easily automated. Not!

Look, some things are very easy to test in unit tests: Does sin(180) return
the right value? If I do operation X on supposedly const object O, does O have
the same value before and after?

And some things are incredibly hard: Does this MP3 sound right after
compression? Does the typography for this combination of letters look right?
Is the user interface responsive enough? What about the crash that only occurs
in just the right hard-to-duplicate circumstances?

Why is it that these hard-core test advocates always seem to assume that all
bugs are of the former sort?

~~~
DannoHung
> Does this MP3 sound right after compression?

A quick check for this would be to have a bunch of wav files that exercise
the different properties the encoder is supposed to handle, run them
through the encoder, read them back out through a standard decoder, and
verify that the waveforms match within a tolerance. With any release you'll
want to do some manual testing, but you can gain higher confidence on small
changes between milestones.
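That round trip might be sketched like this in Python, with waveforms as
plain lists of samples; `rms_error` is a crude stand-in for a real perceptual
metric, and the tolerance is invented:

```python
import math

def rms_error(original, decoded):
    """Root-mean-square difference between two equal-length waveforms."""
    assert len(original) == len(decoded)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, decoded))
                     / len(original))

def waveforms_match(original, decoded, tolerance=0.01):
    return rms_error(original, decoded) <= tolerance

# Simulated round trip: a 440 Hz tone with a small, uniform codec error.
original = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
decoded = [s + 0.001 for s in original]
assert waveforms_match(original, decoded)
```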

> Does the typography for this combination of letters look right?

Same sort of thing: generate a bitmap from the output and compare it against
a pre-approved sample, using a heatmap to determine differences.
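A toy version of that comparison, with bitmaps as 2D lists of grayscale
values; a real pipeline would rasterize the text and load a pre-approved
reference image from disk, and the acceptable deviation here is made up:

```python
def diff_heatmap(reference, rendered):
    """Per-pixel absolute differences between two same-sized bitmaps."""
    return [[abs(r - s) for r, s in zip(ref_row, out_row)]
            for ref_row, out_row in zip(reference, rendered)]

def max_deviation(heatmap):
    return max(max(row) for row in heatmap)

reference = [[0, 255], [255, 0]]   # pre-approved rendering
rendered  = [[0, 250], [255, 5]]   # current rendering, slight drift
assert max_deviation(diff_heatmap(reference, rendered)) <= 10
```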

> Is the user interface responsive enough?

Define what responsive means in quantitative terms. For most interactions,
you're going to want sub-200 ms. For stuff where you want the user to wait,
you'll have to define wait times ahead of time. Maybe you make changes that
cause the latter to break because of a significant change in what you're
doing, but it's good for people to know about that ahead of time, right?
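Once the budget is a number, the check is mechanical; in this sketch
`handle_click` and the budget constant are placeholders for a real
interaction and an agreed figure:

```python
import time

LATENCY_BUDGET_S = 0.200  # "sub-200 ms" for most interactions

def handle_click():
    time.sleep(0.01)  # stand-in for the real UI work

# Time the interaction and assert it stays within budget.
start = time.perf_counter()
handle_click()
elapsed = time.perf_counter() - start
assert elapsed < LATENCY_BUDGET_S, f"interaction took {elapsed * 1000:.0f} ms"
```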

> What about the crash that only occurs in just the right hard-to-duplicate
> circumstances?

Hey, nothing's perfect. But this objection is tautological. It's only hard to
duplicate because you don't understand it yet.

~~~
derobert
"Verify that the waveforms match within a tolerance" The problem is, that is
not easy. Determining how close two pieces of audio _sound_ is the majority of
the lossy audio compression problem. You can of course check whether your
output is bitwise identical, but when it isn't, you need ABX testing. With
people.

~~~
vlisivka
Yes, it looks like a manual test.

BUT, when the QA team finds a problem, you can create an automated test case
which will look for that particular problem only. Right?

For example, QA notices "clicks" every few seconds in the resulting sound. Is
it hard to create an automated test case for that?

~~~
derobert
I'm not sure how hard it would be to create a reliable test for clicks.
Depends on the type of click, I suppose. Some would be fairly easy to detect
(e.g., "for .1s, all samples output are 0, with loud samples on both sides").
Though I'm guessing that would actually result in plenty of false positives,
and would be a fairly carefully tuned (and thus fragile) test case.

A better approach might be to detect it in the frequency domain, after
performing an FFT (that instant drop to 0 will generate a lot of energy on
both sides). I suspect you'll still need the careful tuning; after all, a
sudden burst of energy on your FFT could be a click, or it could be a cymbal.

Not sure how well this would work; I've never tried it, though it sounds like
some fun code to write.
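The time-domain version could be sketched like this; every threshold here is
invented, and would need exactly the careful tuning described above:

```python
def find_dropouts(samples, rate=8000, min_len_s=0.05,
                  silence=0.01, loud=0.2):
    """Return (start, end) index pairs of silent runs bracketed by loud audio."""
    dropouts, run_start = [], None
    for i, s in enumerate(samples):
        if abs(s) < silence:
            if run_start is None:  # a silent run begins
                run_start = i
        else:
            if run_start is not None:  # a silent run just ended
                long_enough = i - run_start >= min_len_s * rate
                loud_before = run_start > 0 and abs(samples[run_start - 1]) > loud
                if long_enough and loud_before and abs(s) > loud:
                    dropouts.append((run_start, i))
                run_start = None
    return dropouts

audio = [0.5] * 100 + [0.0] * 500 + [0.5] * 100  # loud, dropout, loud
assert find_dropouts(audio) == [(100, 600)]
```

A cymbal hit would sail straight past this heuristic, which is why the
frequency-domain approach (and its own tuning problems) comes up next.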

------
alextgordon
This wouldn't work for:

* Public bug trackers

* GUI applications, where writing automated tests is difficult or infeasible. (how would you write a test case for "icon is upside down"?)

* Abstract bugs, such as performance-related bugs or feature requests

* Nondeterministic or otherwise unusual bugs

I'd love to live in a world where all bugs were easily reproducible and even a
non-programmer could write a test case. A quick glance over my issue tracker
reveals this is not the case.

------
Tichy
Raising the cost of reporting bugs might make people want to avoid reporting
bugs.

I found I hardly ever used a bug tracker I had installed on a slow server,
because it was so slow. So I don't think making bug submission even slower
would be a good thing.

~~~
nathanh
It also raises the literal dollar cost of QA people. If you have a large QA
operation, you're probably paying semi-skilled non-programmers. If you had to
hire QA people who can code well enough to write a test, they'd cost a lot
more.

------
moe
Yea, because all testers are capable of writing software tests and because all
bugs warrant the effort to write a test...

Not sure what this is doing on HN, really.

~~~
andrewcherry
Well, while I certainly agree that this is perhaps a way off in terms of
technical capability, that won't always be the case. I'm not talking about
everyone learning to code; that's dead in the water.

But systems which allow business people to define their own acceptance and
test criteria exist now: tools like FitNesse, etc. I'm not saying this is
feasible today, but a future can be envisaged where writing a repeatable test
which determines whether a function works as desired is within the reach of
less technical users.

~~~
Retric
If you let people without a development background write tests, it's easy for
them to say something to the effect of (X > 7) and (X < 4). More generally,
ensuring that there are no conflicts in your requirements is a very hard
problem.
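That example, expressed as code; for simple numeric bounds the conflict is
mechanically checkable, which is precisely what makes the general case's
difficulty easy to underestimate:

```python
def compatible(lower, upper):
    """True iff some real x satisfies x > lower and x < upper."""
    return lower < upper

assert not compatible(7, 4)  # (X > 7) and (X < 4): no such X exists
assert compatible(4, 7)      # (X > 4) and (X < 7): satisfiable
```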

------
whyme
_Communication problems: Sometimes the description of the bug is poor, or the
developer misunderstands the problem._

-> Maybe more effort should be placed on improving communication skills at your company? That would greatly improve many other aspects of your business - not just bug logging.

 _Regression testing. This process requires the handover of code back and
forward between QA and development, this causes versioning problems and
duplicated testing effort._

-> Why? Maybe more effort should be placed on improving communication skills...

 _Low robustness. The process doesn’t guarantee that the error will appear
again in the future._

-> Neither would buggy automated tests - or whom do you need to hire to test the tester?

 _Bureaucracy. Traditional bug tracking systems switches the team focus to
bureaucracy from quality_

-> If you care about your product's quality and your other team members' development goals, it's not bureaucracy, it is quality. If you can't look back at past mistakes, you won't see the system as meaningful. Rather, the focus should be to take the current system seriously and work on improving it.

------
aw3c2
Reporting bugs is already tedious - sometimes more (e.g. when requiring an
account), sometimes less. Asking for a test case would surely reduce the
number of "bug reports" to abysmal figures.

------
antirez
Most synchronization issues in multi-threaded applications can't be tested
automatically. There are also bugs triggered by a specific state of the
application that is complex to encode in a test.

------
mars
My company develops web applications that are used by companies to optimize
internal processes. I recently found out about Selenium, and this tool is
just great. I wasn't aware that it is possible to write tests for even
complex and heavily Ajax-driven web interfaces. It roxxx. Don't get me wrong,
it is indeed a lot of work to write those tests, but once written they'll
save your time and, more importantly, your life. Give it a try.

------
lifeisstillgood
Bug Trackers also play an important prioritisation role - a central location
to decide what work is most important right now.

If we wrote only unit tests, pretty soon someone would hack up a system to
take the names of the tests, and put them in a list. It would be called
testzilla.

------
grumpyfart
If you are a developer doing TDD or some sort of automated testing (which
you should be!) and you are not writing automated tests for bugs, you are
doing it wrong anyway!

If the bug is a GUI inconsistency, a looks-crap-on-my-mom's-computer sort of
bug, then you can't automate it anyway. Assuming your testers' moms haven't
got serial ports.

