

Ask HN: How do people really test their code? - dsc

If that question is too general, which it is, then consider the case of building a data structure and its accompanying algorithms. Now, how, IRL, is code verified both for correctness and for desired performance?<p>Right now, I rely on running a bunch of functions that I write, but I was wondering if there's anything out there that's better (or a methodology that dictates which tests one ought to write).<p>(If the question is still too general, let's consider the case for Java, and maybe C.)
======
chipsy
I'll try to answer the general question.

It's kind of a quantum physics problem, at least with respect to testing in
general. Code that is completely untested hides all errors, and code never put
through its paces won't show a performance problem. As it gets closer and
closer to the "real world" (compiling, initial tests, beta, 1.0, 2.0, etc.)
bugs and scaling issues will show up (often somewhat "magically"). The best you
can do to counter this is simply to make the most efficient use of your time
given the quantity/complexity of features, schedule, and available manpower -
so that the remaining time can go towards a thorough test cycle.

Code built within restricted computational models (stronger type systems,
garbage collected memory, functional-style code, relational logic...) can
eliminate entire classes of errors. This doesn't eliminate the benefits of
tests, but it makes it possible to focus your tests on a smaller subset of all
errors.

Code with extensive ongoing review processes (e.g. space shuttle code, or
perhaps the Linux kernel) can eliminate a different class of errors from
regular tests or restricted models, because it uses the power of human minds
to reason through the concepts repeatedly; a mistake made by one programmer is
not likely to be repeated in exactly the same way by ten or twenty of them.

Also worth consideration is test scaffolding and debugging tools. In a large
codebase, errors can appear farther and farther from their origin. This leads
to a "test suite" (unit tests, functional tests, example datasets) run more-
or-less independently of the application. For some kinds of applications,
relatively elaborate debugging features may be necessary to display and step
through core data structures while the app runs. Debugging-related features
are easy to overlook, but are often well worth the time spent; I have taken
to adding them whenever I encounter a class of bugs that they would help
address, rather than just muddling through the first instance and saying
"hope THAT doesn't happen again!"

Also, to a large extent, language and environment dictate debugging methods -
C code benefits from a machine-level debugging system like gdb, but in
languages with runtime reflectivity like Python or Ruby, you rarely need more
than a print statement to uncover a problem. If you are working with an
embedded device instead of a desktop OS you may have a remote monitor system
or an emulator. If you're working on a webapp, you have server logs and
browser-level tools. Et cetera.

------
Bluem00
In order of decreasing importance, I recommend you: Constantly check that
you're building the right thing. Get some kind of testing framework. Look into
Test Driven Development as a methodology.

With a testing framework, you won't waste time rolling your own. For Java, you
could start with JUnit.
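To make "testing framework" concrete: here's a minimal xUnit-style test sketched with Python's stdlib `unittest` (which copied JUnit's design, so the JUnit version in Java looks almost identical). The `Stack` class is a hypothetical stand-in for whatever data structure you're building.

```python
import unittest

class Stack:
    """Toy data structure under test (hypothetical example)."""
    def __init__(self):
        self._items = []
    def push(self, x):
        self._items.append(x)
    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()
    def __len__(self):
        return len(self._items)

class StackTest(unittest.TestCase):
    def test_push_then_pop_returns_last_item(self):
        s = Stack()
        s.push(1)
        s.push(2)
        self.assertEqual(s.pop(), 2)
        self.assertEqual(len(s), 1)

    def test_pop_on_empty_stack_raises(self):
        s = Stack()
        with self.assertRaises(IndexError):
            s.pop()

if __name__ == "__main__":
    unittest.main()
```

The framework's value is the boring stuff: auto-discovering tests, running each in isolation, and reporting failures without stopping the whole run.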

TDD is a formalized approach to writing test functions as you work, which you
already do, while simultaneously designing your code. There are many sources
on it around the internet, but Kent Beck's book took the magic out of it (a
good thing) for me:
<http://www.amazon.com/Test-Driven-Development-Kent-Beck/dp/0321146530>

Of course, there are many other specific techniques you can use to test that
your algorithms do the right thing, all dependent on what you've made.
Regardless, I try to make sure that a real user gets their hands on an
up-to-date build as often as possible, and they're sure to show me all of the
ways that my software is technically excellent, yet in no way solves their
actual problem.

~~~
steveklabnik
Also, please check out 'Behavior Driven Development', or BDD, which is
basically "TDD done correctly." It alters the description of the process a
bit, and in the end, you write better tests.

Since I'm a Rubyist, I use Cucumber and RSpec to do testing.
<http://cukes.info> <http://rspec.info> (note that you can use Cucumber to
test any language, and I bet you could use JRuby to test Java).

~~~
kanak
Could you recommend some resources for learning BDD?

~~~
steveklabnik
Sure. Now that I'm not on my phone, it's much easier to grab some links.

Here's the original work on the subject, by Dan North:
<http://blog.dannorth.net/introducing-bdd/>

A little Rails specific, but this post by Sarah Mei is pretty awesome:
<http://www.sarahmei.com/blog/2010/05/29/outside-in-bdd/>

These two Railscasts on Cucumber are good:
<http://railscasts.com/episodes/155-beginning-with-cucumber>
<http://railscasts.com/episodes/159-more-on-cucumber>

But really, conceptually, BDD isn't that complicated. The hardest bit is
figuring out the tooling for whatever language you're using, getting
comfortable with it, and practicing. I know the Ruby side of this well, but if
you're using another language, I'm not sure I can be of much help.

~~~
uxp
Adding on to your comment, a friend let me read through a beta copy of The
Rspec Book last year. At the time, only the first couple chapters were written
and it was lacking a lot of content. It was a great resource though, and I
would highly recommend it. The final release is due "Soon", and ordering it
now gives you access to the current revision of the Beta, which I hear is
complete sans-editing. I'm holding out until I can buy a paper copy.

<http://www.pragprog.com/titles/achbd/the-rspec-book>

Your links to the RailsCasts screencasts are great also. PeepCode also has
their Cucumber Screencast that is a little more detailed than RailsCasts.

<http://peepcode.com/products/cucumber>

These are all Ruby or Rails specific. Cucumber itself is a DSL to be used
along with another language, and there are projects underway to implement
Cucumber or Cucumber-like frameworks for other languages like Python, Java and
.NET

~~~
steveklabnik
Thanks. I almost linked to the RSpec Book, but since I haven't read it myself,
I didn't feel comfortable endorsing it. Good to know it's shaping up! Maybe
I'll have to pick up a copy.

------
PKeeble
Personally I use a triplet of tests.

1) The first test I write is a functional end-to-end test that assumes the
application is deployed and running with all configuration in place. That
means there is a database and all the other parts necessary to run the
application. The test follows the BDD format (Given ... When ... Then). Its
purpose is to cross the bridge from a high-level functional requirement (I
want X) to an executable spec for the aspects of that requirement.

2) I then develop a series of unit tests. These don't require a database or
even the file system, they are purely in the language of the application. I
use a mocking framework to isolate units and TDD out all the aspects.

3) Finally I write a performance test. A preloaded set of known data is
inserted into an environment that looks very similar to production. Any
additional specialist data for the individual test is then loaded on top,
and the test is run and asserted against an expected maximum time.
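The "assert against an expected maximum time" part of step 3 can be sketched like this (Python stand-ins; `build_index` and the time budget are hypothetical, and in practice the data would be the preloaded production-like set, not generated in the test):

```python
import time

def build_index(rows):
    # stand-in for the operation under test (hypothetical)
    return {r["id"]: r for r in rows}

def assert_fast_enough(fn, max_seconds):
    """Run fn, fail if it exceeds the time budget, return its result."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    assert elapsed <= max_seconds, f"took {elapsed:.3f}s, budget {max_seconds}s"
    return result

# preloaded set of known data
rows = [{"id": i, "value": i * 2} for i in range(100_000)]
index = assert_fast_enough(lambda: build_index(rows), max_seconds=2.0)
assert index[10]["value"] == 20
```

Wall-clock budgets like this are inherently flaky across machines, which is part of why automated performance testing is hard to get right; budgets usually need generous headroom or a dedicated, quiet test box.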

The combination of the three is working OK for me, but there are still gaps.
It's hard to get automated performance testing right, as there are so many
types of tests you actually want to run that are very hard to verify
automatically.

That is how I do it IRL.
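For step 2, "mocking framework to isolate units" might look like this (a Python sketch using stdlib `unittest.mock`; `OrderService` and its repository are hypothetical names):

```python
import unittest
from unittest.mock import Mock

class OrderService:
    """Unit under test; its database dependency is injected."""
    def __init__(self, repo):
        self.repo = repo
    def total_for(self, customer_id):
        orders = self.repo.orders_for(customer_id)
        return sum(o["amount"] for o in orders)

class OrderServiceTest(unittest.TestCase):
    def test_total_sums_amounts_without_touching_a_database(self):
        # the mock plays the repository's role; no DB or file system needed
        repo = Mock()
        repo.orders_for.return_value = [{"amount": 10}, {"amount": 5}]
        service = OrderService(repo)
        self.assertEqual(service.total_for(42), 15)
        repo.orders_for.assert_called_once_with(42)
```

The point is that the unit test exercises only the service's logic; the end-to-end test from step 1 is what proves the real database wiring works.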

------
aufreak3
TDD folks will ask you to write tests first, then code. At least, for the
example you used to make your question concrete, I'd first collect data I can
use for testing _before_ writing test code. You can even ask around for data.

Doing this helps me a lot 'cos by the time I have the data collected, I
usually have a rather good idea of what code to write and what the edge cases
are, even before writing a slew of tests for it.

If you're writing a data structure + algos for it, then think of a data format
for storing and loading that structure so you can do the data collection in
that format first. It'll also greatly help with writing tests.
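The store/load format idea might be as simple as JSON plus a round-trip test over the collected cases (a sketch; the nested-dict tree format here is hypothetical):

```python
import json

# toy tree as nested dicts: {"value": ..., "children": [...]}
def dump_tree(tree):
    """Serialize the structure to a stable text format."""
    return json.dumps(tree, sort_keys=True)

def load_tree(text):
    """Load the structure back from its serialized form."""
    return json.loads(text)

collected = [  # in practice these would be read from files gathered beforehand
    {"value": 1, "children": []},
    {"value": 1, "children": [{"value": 2, "children": []}]},
]

for case in collected:
    # round-trip must be lossless for every collected sample
    assert load_tree(dump_tree(case)) == case
```

Once load/dump exist, every collected sample doubles as a test fixture you can feed to the real algorithms.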

Scripting languages are great ways to test your code, even if it is in C/C++.
Learning to bind native code to a scripting language such as Python helps a
ton.

------
wilhelm
For web applications? Watir. Unit tests are great and all, but not that useful
compared to high-level functional tests that test the stuff users are
actually interested in doing. With limited time and limited resources, write
Watir tests first.

Test #1: Sign up. Did it work? Were all parameters set correctly? Test #2: Log
in. Did it work? Are you logged in as the correct user? Test #3: Something
that touches your most important business logic. Can the logged in user add an
item to her shopping cart? Post in her blog?

Just a hundred tests like that gets you a long way, and doesn't take that long
to write.

------
brown
Since you mentioned data structures... The nice thing about most data
structures is that they are deterministic, and therefore there is (usually) a
clear definition of "correct". Write your unit tests, make sure they work, and
then make sure they work after all future changes.

For each function, you should have a handful of unit tests. Test the
normal/expected values. Test edge cases (zero, infinity, negative infinity).
Test possible error cases (null values, uninitialized data, etc.). Basically,
you want to cover all the "buckets" of possible inputs.
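The bucket idea, sketched with Python's stdlib `unittest` (`safe_ratio` is a hypothetical function under test; the same three-bucket layout works in JUnit):

```python
import math
import unittest

def safe_ratio(a, b):
    """Hypothetical function under test: a / b with defined edge behavior."""
    if b == 0:
        raise ZeroDivisionError("b must be nonzero")
    return a / b

class SafeRatioTest(unittest.TestCase):
    def test_normal_values(self):
        # bucket 1: normal/expected inputs
        self.assertEqual(safe_ratio(6, 3), 2)

    def test_edge_cases(self):
        # bucket 2: zero, infinity, negative infinity
        self.assertEqual(safe_ratio(0, 5), 0)
        self.assertEqual(safe_ratio(math.inf, 2), math.inf)
        self.assertEqual(safe_ratio(-math.inf, 2), -math.inf)

    def test_error_cases(self):
        # bucket 3: inputs that must fail loudly, not silently
        with self.assertRaises(ZeroDivisionError):
            safe_ratio(1, 0)
        with self.assertRaises(TypeError):
            safe_ratio(None, 2)
```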

"Correctness" is typically assumed. If your code isn't correct, then fix it.

"Robust" is the ideal. Most seasoned devs will have their toolbox of classes,
snippets, and nuggets that they have incrementally tweaked and improved. Over
time, they become ridiculously stable, full of years of bug fixes and all
sorts of edge cases. They'll then use those snippets in project after project.

Testing becomes a lot more challenging when the function is less
deterministic. If it's a heuristic-based function, you will constantly be
balancing the tradeoffs of accuracy vs. performance. For example, I used to
work on a driving directions algorithm. Our team had a library of ~30k routes.
After every code change, we would compare "accuracy" (average driving time)
vs. "performance" (routes calculated per minute).
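A harness for that accuracy-vs-throughput comparison can be sketched like this (everything here is a hypothetical stand-in: `heuristic_route_time` plays the algorithm, and a small generated list plays the ~30k-route library):

```python
import time

def heuristic_route_time(route):
    # stand-in for the heuristic under test: estimate driving time
    return sum(route) * 1.1

def benchmark(routes):
    """Return (average estimate, routes processed per minute)."""
    start = time.perf_counter()
    estimates = [heuristic_route_time(r) for r in routes]
    elapsed = time.perf_counter() - start
    avg_estimate = sum(estimates) / len(estimates)
    per_minute = len(routes) / elapsed * 60 if elapsed > 0 else float("inf")
    return avg_estimate, per_minute

# stand-in for the fixed route library; keeping it fixed is what makes
# before/after comparisons across code changes meaningful
routes = [[5, 10, 3], [7, 7], [1, 2, 3, 4]] * 1000
accuracy, throughput = benchmark(routes)
```

After each change you re-run the same library and diff the two numbers against the previous run, rather than asserting a single fixed threshold.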

Testing becomes even more challenging when you're playing at the system level.
Then you have load testing, stress testing, endurance testing, etc. That's
when you need full time testers.

------
crcarlson
In addition to essentially input-output test driven development at the unit,
subsystem and system levels, I am a huge fan of "design by contract" for
identifying both design and implementation errors. In my personal experience,
no test suite has been fully comprehensive, and rigorously placed assertions,
invariants, etc. pick up all kinds of design and implementation flaws. I find
they also help for debugging by taking the guesswork out of many intermediate
possibilities.
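A minimal sketch of what "rigorously placed assertions, invariants, etc." means in code (Python, using bare `assert`; the `Account` class and its rules are hypothetical, and languages like Eiffel or tools like Java's `assert` keyword give the same effect):

```python
class Account:
    """Design-by-contract style: explicit pre/postconditions and an invariant."""

    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        # class invariant: must hold after every public operation
        assert self.balance >= 0, "invariant violated: negative balance"

    def withdraw(self, amount):
        # preconditions: caller's obligations
        assert amount > 0, "precondition: amount must be positive"
        assert amount <= self.balance, "precondition: insufficient funds"
        old = self.balance
        self.balance -= amount
        # postcondition: this method's obligation
        assert self.balance == old - amount, "postcondition: balance updated"
        self._check_invariant()
        return amount
```

When a contract fires, it fires at the point where the assumption broke, which is the "taking the guesswork out of intermediate possibilities" benefit: the failure isn't discovered three calls later.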

------
emmett
The only real answer is "it depends" for the correct way to do testing. You're
never going to get out of thinking it through, there's no substitute for human
judgement. If you care a lot about the memory profile of your data structure,
you need to test that really carefully. But you might not care very much about
that, in which case it would be a waste of time.

However, I think I can lay your fears to rest about "running a bunch of
functions" - in the end all testing frameworks come down to running a bunch of
functions. All that you get out of a framework is a (sometimes) shorter way to
write those functions in a domain specific way. In the case of a data
structure, a test framework doesn't buy you very much because the domain of
the problem is basically "a computer program", which computer programming
languages are pretty good at making statements about already.

So in short, it sounds like you're already doing the right thing.

------
mhansen
I'm interested in what you mean by the ' _really_ ' in the title. Do you think
there's some kind of secret to how people test that they're not revealing?

If you want to see some 'IRL' test code, have a look at your favorite open
source project. The test code will all be there, usually in a folder just off
the root called 'tests/'.

------
pilom
It very much depends on what you are doing. How many errors per KLOC can you
deal with? A website isn't going to have the same requirements that NASA does.
On the one end, try acceptance testing and user testing. It's cheap and fast.
On the other end, provably bug free code is difficult (not quite impossible
<http://www.ece.cmu.edu/~koopman/ballista/index.html>) but requires better
documentation than anyone should ever write.

------
mkramlich
Rough rule of thumb that works wonders:

Write/modify code.

Run program and/or ensure that codepath is executed.

Did it do what you intended?

If no, figure out what's wrong, fix it, retry.

If yes, move on to the next item on your agenda. (Possibly first doing a
quick refactor to improve readability, etc.)

~~~
Tyrannosaurs
If you refactor you should always retest. I've lost track of the number of
times things have been broken by someone tweaking things to make them better.

~~~
silentbicycle
The flip side of this is that you can do deep refactoring with more confidence
if you have thorough test coverage (and/or a smart type system).

------
wheaties
Have you tried searching on www.StackOverflow.com? There's a ton of
information, often very specific to the types of technology you could be
using.

------
known
<http://en.wikipedia.org/wiki/Test-driven_development>

~~~
prototype56
TDD has nothing to do with testing.

------
kranner
No no, let's keep it general, please.

~~~
dsc
I know you! you're the guy that wants to take the good advice and use it
elsewhere!

^5s

~~~
kranner
OK, I'll bite. 'Code' for me, at the moment, is a web app in beta that needs
functional and cross-browser UI testing.

So until I script all flows into Selenium, twill, windmill or something else,
and until I can figure out how to automate screenshots of all screens in the
OSs and browsers I care about (<http://stackvm.com> ?), I have checklists, and
I go through them manually.

It helps that my web app's UI is short and sweet, but that's probably because
I'm trying to avoid thinking about the combinatorial explosion it's hiding.

~~~
epall
check out <http://saucelabs.com/>! Write your scripts with Selenium RC, run
them against any browser, and get screenshots of every important step.

~~~
kranner
Thanks, I'll have a look.

