
The road to faster tests - duck
http://37signals.com/svn/posts/2742-the-road-to-faster-tests
======
lsb
A 2x speed improvement from better memory management. It's astounding how
good Ruby feels to program in, and how much the webapp bounce cycle can
hide significant infrastructure problems.

------
codex
As an aside, it would be fairly simple to decrease the time taken for a _unit_
test suite to run by switching to incremental testing--that is, not running
tests for code that hasn't changed.

While I've never implemented something like this, I imagine it would be
straightforward for compiled languages if you have a good linker:

1) compile each unit test as a statically linked executable

2) make sure that each unit test outputs to an individual file; e.g. ./test_a
>> test_results_a.xml

3) jigger your build process such that the linker removes dead and unused code
from each test executable. Now each test executable contains code that is
directly used by the test.

4) ensure that your build process only touches a statically linked test if a
dependency has actually changed. This comes for free most of the time, but you
may still have the linker touching files that it doesn't have to. You can
remove these cases by staging the test files with a binary comparator tool
like rsync. If you have a crappy linker, or there's something in your test
executables that is always changing (like a build stamp), you might need to
compare using more advanced tools (like binutils) or something like Google's
Courgette.

5) Run your tests only when the test is newer than the results--that is, when
test_a is newer than test_results_a.xml. This is straightforward to implement
using most build systems, like make, or with a test runner script (sketched
below).

Bam--now a test only runs if some dependent code has changed. If your test
also uses data, you should list it as a dependency in your build system--
either manually or by detecting open() calls at runtime via a shim over libc,
or via strace. Your test run takes much less time, and, best of all,
developers get a lot less spam to read through. This is perhaps a much greater
benefit than increased speed.
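
To make step 5 concrete, here is a minimal sketch of such a runner script in
Ruby (the test_* / test_results_*.xml names follow steps 1-2; everything else
is illustrative, not from any real build system):

    #!/usr/bin/env ruby
    # Sketch of step 5: run each test binary only when it is newer than
    # its results file. Naming follows steps 1-2 above; illustrative only.
    Dir.glob("test_*").select { |f| File.executable?(f) }.each do |test|
      results = "test_results_#{test.sub(/\Atest_/, '')}.xml"
      # Stale: no results yet, or the binary was relinked since the last run.
      stale = !File.exist?(results) || File.mtime(test) > File.mtime(results)
      if stale
        # Overwrite rather than append, so the results mtime reflects this run.
        system("./#{test} > #{results}") || warn("#{test} FAILED")
      else
        puts "#{test}: unchanged, skipping"
      end
    end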

Dynamic languages are a much harder nut to crack, as are integration tests
that are loosely bound to the code.

~~~
elliottkember
I use autotest for this - it notices files being altered and runs the relevant
parts of your suite. If something goes wrong it'll keep retrying that test
every time you save a file, until it passes. I think it's fantastic.

~~~
icefox
Do you have a link to autotest? I run something similar. My Ruby projects are
all in git, and I have a pre-commit hook that, when file X is modified, runs
the tests for X before allowing the commit, and only lets me commit if they
pass. This happens with zero work on my part, so I can never forget to run the
tests. While it doesn't run all of the tests, just running the associated test
catches a very large number of accidental test failures for very little cost.
Lastly, because the tests are being run all of the time, if one suddenly takes
a long time to run I quickly profile it and speed it up. And --no-verify is
always there to bypass the hook if need be.
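
The hook itself is only a few lines. A sketch of the idea in Ruby (it assumes
the conventional layout where lib/foo.rb is covered by test/test_foo.rb, which
won't match every project):

    #!/usr/bin/env ruby
    # .git/hooks/pre-commit -- run only the tests for files being committed.
    changed = `git diff --cached --name-only`.split("\n")

    tests = changed.map do |path|
      case path
      when %r{\Alib/.+\.rb\z}       then "test/test_#{File.basename(path, '.rb')}.rb"
      when %r{\Atest/test_.+\.rb\z} then path
      end
    end.compact.uniq.select { |t| File.exist?(t) }

    exit 0 if tests.empty?

    # Any failing test aborts the commit; git commit --no-verify bypasses it.
    ok = tests.all? { |t| system("ruby", "-Ilib:test", t) }
    exit(ok ? 0 : 1)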

~~~
Kaya
How do you handle the case where file X uses file Y, and you commit file Y?
You could run the test for X when Y is committed in case Y broke X.

~~~
icefox
For most of my projects I don't. Some of them have rules listing that test X
depends upon Y, but for the vast majority of my projects, modifying file X
runs only test X. That still catches a significant number of regressions.
Really it comes down to effort. I can either A) try to remember to always run
the tests before committing, or B) have a basic hook that takes minutes to
write and install, always runs at least one test, and catches nearly all
regressions.

I tried to do A for years. Most of the time you run the tests, but not always.
And heaven help you if you are on a team. There will be someone who never runs
tests, and you will end up having to schedule a chunk of time for regression
fixing before every release. So to answer your question: who cares when
file X uses file Y?

------
kalak451
_Ruby’s Test::Unit library (which we use for Basecamp’s tests) creates a new
instance of the test class for each test. Not only that, but it holds onto
each of those instances, so if your test suite includes thousands of tests,
then the test suite will have thousands of TestCase objects._

On the Java side, JUnit does exactly the same thing. However, I usually see
this manifest as out-of-memory errors rather than a time sink. Over the
years I have gotten some VERY strange looks from clients when I null out
instance variables in my tearDown methods. It usually takes a couple of hours
of explanation to gain some level of acceptance, if not understanding.
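
The same trick translates directly to Ruby's Test::Unit, since it also keeps
every TestCase instance alive until the suite finishes. A sketch (Widget is a
made-up heavyweight fixture):

    require 'test/unit'

    Widget = Struct.new(:payload)  # stand-in for an expensive fixture object

    class WidgetTest < Test::Unit::TestCase
      def setup
        @widgets = Array.new(10_000) { Widget.new('x' * 1_024) }
      end

      def teardown
        # Every TestCase instance survives until the whole suite ends, so
        # anything still referenced by an ivar is pinned in memory for the
        # entire run. Nil it out and the GC can reclaim it after each test.
        @widgets = nil
      end

      def test_widget_count
        assert_equal 10_000, @widgets.size
      end
    end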

~~~
lsb
What are the benefits of holding onto each instance?

~~~
jamis
I don't believe Test::Unit does this intentionally; it's just a side-effect of
the implementation (load all tests into an array, and iterate over the array).

------
aarongough
As an experiment I just implemented the GC tweaks for our test suite (we're
using RSpec), and it gave us a 42% improvement in run time...

    
    
      Before: 914 seconds
      After:  538 seconds
    

Awesome! Thanks to Jamis for sharing!
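
For anyone who wants to try the same thing, the tweak amounts to disabling GC
during the run and forcing a collection every several seconds. Roughly (the
class name and interval follow the post from memory, so treat this as a
sketch):

    # spec/support/deferred_gc.rb -- time-based deferred GC, a sketch.
    class DeferredGarbageCollection
      RECLAIM_INTERVAL = 15.0  # seconds between forced collections
      @last_gc_run = Time.now

      def self.start
        GC.disable
      end

      def self.reconsider
        if Time.now - @last_gc_run >= RECLAIM_INTERVAL
          GC.enable
          GC.start
          GC.disable
          @last_gc_run = Time.now
        end
      end
    end

    RSpec.configure do |config|
      config.before(:all) { DeferredGarbageCollection.start }
      config.after(:all)  { DeferredGarbageCollection.reconsider }
    end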

------
andrewingram
Would it not be possible to track which lines and branches get executed by
each test (you'd normally track this for your code coverage report anyway,
yes?), then have your IDE re-run only the tests that would have been affected
by any changes you've made?

You'd still want to run the full suite when you commit, but it seems like a
possible sanity check. Unless I'm overlooking something :)
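
For concreteness, the coverage-to-tests mapping could be sketched with Ruby
1.9's stdlib Coverage module (illustrative only; a real tool would run each
test in its own subprocess so measurements don't bleed together):

    require 'coverage'

    # Record which lib/ files a single test file actually executes.
    def files_touched_by(test_file)
      Coverage.start
      load test_file
      Coverage.result.select { |path, hits|
        path.include?('/lib/') && hits.compact.any? { |n| n > 0 }
      }.keys
    end

    # Build the map once, then re-run only the tests whose recorded
    # coverage intersects the files that just changed.
    coverage_map = Dir.glob('test/test_*.rb').
                   each_with_object({}) { |t, h| h[t] = files_touched_by(t) }
    changed  = `git diff --name-only`.split("\n")  # or a list from the IDE
    affected = coverage_map.select do |_test, files|
      files.any? { |f| changed.any? { |c| f.end_with?(c) } }
    end.keys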

~~~
jdminhbg
Yes -- in Ruby the tool for that is `autotest`. It watches your project
directory and automatically runs relevant tests when you change application
code or tests.

~~~
patio11
Autotest has been the single biggest instigator in me adopting a more rigorous
testing procedure for AR than I have previously used. The number of bugs it
catches practically before I leave the method I'm tweaking is astounding.

(I mostly do unit testing, but also do some testing against APIs to ensure
that I'm passing them data in the format they expect. One of my external
service providers had downtime yesterday, and autotest caught it about six
hours before they did. I wasn't successful in waking them up, but at least I
was able to push a hotfix to AR to minimize inconvenience while I waited for
their API to come back.)

[Edit: I might consider, if tests were taking that long, spinning up a fleet
of VPSes to munch through them for me, maybe with a continuous integration
server. Unit tests should be disgustingly parallel, since they're small units
of work and shouldn't depend on each other. That means if one MacBook can
execute the suite in 15 minutes, then you should be able to have 30 MacBook-
equivalents, running either in the server room or the cloud, munch through the
same test suite in 30 seconds, give or take a little startup time.]

~~~
jamis
We do have a CI server, and as you said it works well for catching failing
tests. However, it requires that you commit and push your changes in order to
test them, which means you are effectively publishing untested changes to your
entire team. The same goes for any kind of distributed testing, unless you are
using a shared volume to host your sandbox.

I'm running a Mac Pro with 8 cores, so there is a fair bit of parallelization
I can do locally too. Unfortunately, the tests all depend on the database, and
while I can certainly use tools like deep-test to spin up separate DBs for
each worker, I've found that doing so adds a full 60 seconds to the test run.
I fear that until we eliminate the database from (most of) our tests, super-
fast runs will continue to elude us.

CI and distributed tests are good things, no question, but I'm still looking
for ways to make it possible to run my tests locally in TDD fashion. I'm far
from out of ideas; it's just a matter of making time to experiment.

~~~
patio11
This is totally a personal/team comfort question, but is there any reason why
you can't have two remotes? "git push jamistest" might use a few bits on a
spinning platter somewhere, but that is cheap, and there is no reason your
team has to see it if you don't push it to the master repo, any more than they
see changes you keep on your local repo.

~~~
jamis
Aside from me simply wanting to be able to quickly run my tests locally, you
mean? :) Mostly it's just an issue of configuring that so it works for all the
programmers. Each would need their own remote, and each would need to be
hooked into CI. Definitely possible, it just hasn't been a priority.

------
yxhuvud
Interesting. Hopefully someone (preferably someone who already knows the
library code) extracts a fix so that stuff like this doesn't happen.

