
Slow database test fallacy - victorgama
http://david.heinemeierhansson.com/2014/slow-database-test-fallacy.html
======
dragonwriter
DHH either is being disingenuous, or _badly_ misunderstands unit testing.

He opens with this:

> The classical definition of a unit test in TDD lore is one that doesn't
> touch the database. Or any other external interface, like the file system.
> The justification is largely one of speed. Connecting to external services
> like that would be too slow to get the feedback cycle you need.

No, "unit tests" in TDD -- and long before, TDD didn't change anything about
the definition -- are tests that, to the extent practical, test _all_ and
_only_ the functionality of the specific _unit_ under test, hence the name.
_That's_ the reason why external interactions are minimized in proper unit
tests (whether or not TDD is being practiced). TDD _observes_ that such tests
are generally fast, and builds the red-green-refactor cycle around that fact,
but _speed_ isn't the justification for the isolation; isolation of the
functionality being tested from other functionality is the _point_ of unit
testing (which is designed not only to identify errors, but to _pinpoint_
them.)
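
A minimal sketch of the isolation dragonwriter describes, with a hand-rolled stub standing in for a collaborator that might otherwise hit a database (the `PriceCalculator` and `StubTaxRates` names are invented for illustration):

```ruby
# The unit under test: pure logic, with its dependency injected.
class PriceCalculator
  def initialize(tax_rates)
    @tax_rates = tax_rates
  end

  # Returns the gross price in cents for a net amount and region.
  def gross(net_cents, region)
    (net_cents * (1 + @tax_rates.rate_for(region))).round
  end
end

# A stub replaces the real rates source (which might query a database)
# with a fixed, in-memory answer -- so a failing test points at
# PriceCalculator itself, not at the rates lookup.
class StubTaxRates
  def rate_for(_region)
    0.20
  end
end

calc = PriceCalculator.new(StubTaxRates.new)
puts calc.gross(1000, :de)  # => 1200
```

If this test fails, only `PriceCalculator`'s own logic can be at fault, which is the "pinpointing" property the comment is about.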

~~~
adrianhoward
He also seems to misunderstand TDD - which is more about design than testing.
Unit testing and TDD are not synonymous. You can, and I certainly do, test-
drive code over traditional unit test boundaries.

He also seems completely unaware that there is an entire school of TDD/BDD
that doesn't really like mock objects (the Chicago vs London school
[http://programmers.stackexchange.com/questions/123627/what-are-the-london-and-chicago-schools-of-tdd](http://programmers.stackexchange.com/questions/123627/what-are-the-london-and-chicago-schools-of-tdd)).

I've just generally stopped listening to what he says on the topic. It doesn't
seem to match the reality of what folk actually do.

~~~
mcphage
> He also seems to misunderstand TDD - which is more about design than
> testing. Unit testing and TDD are not synonymous.

Actually I think it's more his critics who misunderstand this -- he argues
against TDD, and he doesn't think unit testing is sufficient, but he doesn't
argue against testing, or unit testing.

~~~
dragonwriter
> he argues against TDD, and he doesn't think unit testing is sufficient

But what he describes as "TDD" to argue against it isn't TDD (either in his
description of its substance _or_ his description of its rationale), and what
he describes as "unit testing" to argue that it isn't sufficient isn't unit
testing (again, either in substance or rationale.)

> but he doesn't argue against testing, or unit testing.

He specifically argues that unit testing should be deemphasized if not
outright eliminated, and that the reason for this is the elimination of test-
first as a design practice. So, yes, he does argue against unit testing (of
course, the argument is nonsense, since unit testing was an important practice
before test-first practices, and test-first practices are independent of kind
of testing -- sure, TDD emphasizes _unit_ test first, but ATDD/BDD focus on
acceptance test first; moving toward or away from test-first as a practice is
completely orthogonal to the degree of focus on unit testing vs. other kinds
of testing.)

------
kentwistle
Am I the only one who doesn't find 'All tests in 4 minutes, all model
tests in 80 seconds' very impressive? It sounds like a really long time to me.

You know what could increase the speed dramatically.... decoupling.

I also think decoupling phrased in the context of the Rails 2 to Rails 3
upgrade, where pretty much everything changed, makes perfect sense. Imagine
just having a few wrapper classes that spoke to Rails and only having to adapt
them. Sounds good to me!

Bernhardt: Boundaries
[http://www.confreaks.com/videos/1314-rubyconf2012-boundaries](http://www.confreaks.com/videos/1314-rubyconf2012-boundaries)

Weirich: Decoupling from Rails
[http://www.youtube.com/watch?v=tg5RFeSfBM4](http://www.youtube.com/watch?v=tg5RFeSfBM4)

Wynne: Hexagonal Rails
[http://www.youtube.com/watch?v=CGN4RFkhH2M](http://www.youtube.com/watch?v=CGN4RFkhH2M)
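
The wrapper-class idea could look something like this: the rest of the app talks only to a repository object, and only that one class knows the framework's API, so a Rails 2 to Rails 3 upgrade is confined to it. All names here are hypothetical, and a plain hash stands in for ActiveRecord so the sketch is self-contained:

```ruby
# Only this class would change when the underlying framework changes.
# In real code @store would delegate to ActiveRecord (or its successor);
# here it's an in-memory hash for illustration.
class PersonRepository
  def initialize(store = {})
    @store = store
    @next_id = 1
  end

  # Persist attributes, returning the new record's id.
  def save(attrs)
    id = @next_id
    @next_id += 1
    @store[id] = attrs
    id
  end

  def find(id)
    @store.fetch(id)
  end
end

repo = PersonRepository.new
id = repo.save(name: "Ada")
puts repo.find(id)[:name]  # => Ada
```

The application code above the repository never mentions the framework, which is exactly the flex point parasubvert and others debate below.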

~~~
taeric
You know what has a high chance of introducing a crap ton more bugs?
Unnecessary code. :)

Also, did you read the rest of the post? Typical test cycles are much faster
than that. But a full test of the entire system takes some time. As it should.

This is also why the test suite for git takes a long time, but has a very good
signal to noise ratio.

~~~
jafaku
> You know what has a high chance of introducing a crap ton more bugs?
> Unnecessary code. :)

If only that "unnecessary code" had tests... Oh wait.

~~~
taeric
At some point, you will find you have bugs in the tests. It isn't like that
code is magically immune to mistakes.

Not to mention you have increased the workload of everyone just to mitigate
the impact of additional code. Let me know how that works out for you. :)

~~~
jafaku
Slightly more complex code with tests beats code with no tests every time, and
it works just fine. Having 1 more class or function call is not exactly what
makes an impact on workload. If you are optimizing your application by
reducing function calls you are doing it wrong. But tell me, how does not
having tests work out for you? Can you modify old applications, or
applications that you didn't develop alone, with a high degree of certainty
that you didn't break things in the least expected places?

~~~
taeric
At no point was it declared there were no tests. So... different topic?

Seriously, straw men notwithstanding, this thread is specifically about
tested code where a complaint was raised about how long the tests take. And
the "how long" was only 4 friggin minutes for a full suite, or 4 seconds for a
module.

------
mattgreenrocks
While reading this, I couldn't help but think of Alan Kay's biting assertion
about the pop culture of programming.

I'm not interested in pop culture; I'm interested in being a better developer,
and that requires a highly critical process of evaluating my practice. It's
not enough if something works once, I want to know _why_ it was effective
there, and when I can use it. I want to try practices like TDD just to see how
they affect the design, and then decide if I like that force. I'll use
hexagonal architecture on side projects just to see how it helps, and if it's
worthwhile. In short, I want to continue to _study_ the art of software
development rather than trusting emotion-laden blog posts with something as
serious as my skill.

I don't believe Rails is so special it warrants revisiting all of the lessons
from the past we've learned about modularity, small interfaces, and
abstraction. It's just a framework.

~~~
parasubvert
Rails isn't special. The debate is about the price of decoupling.

Many codebases don't heavily decouple from their frameworks unless they have a
good reason to do so, as they lose the productivity benefits of the framework
in the process. The framework you choose to tightly couple against should be
your flex point -- you don't have to design your own!

The level of tradeoff depends on the framework in question. I can recall a
moderate-sized project where we bound against Hibernate ORM for a couple of
years and eventually had to switch to MyBatis for a variety of reasons. But
since we were using JPA annotations mostly, the coupling wasn't so tight to
make the switch all that hard or brittle.

There are times where hexagonal architecture makes total sense (immature
frameworks, shifting dependencies, etc.), and times where it doesn't, at
least for certain "ports" (you're building a moderately complex Rails/AR
app -- why bother isolating AR?).

------
kyllo
DHH and Uncle Bob are arguing past each other at this point.

Uncle Bob is saying that Rails is not your application, your business objects
that contain all your logic shouldn't inherit from ActiveRecord::Base because
that ties you to a specific version of a specific framework (have fun
migrating to a new version of Rails!) and means you have to design and migrate
your schema before you can run any tests on your model code. You should be
able to test your logic in isolation and then plug it into the framework.

DHH is saying that if you're writing a Rails application, of course Rails is
your application. Why waste hours adding layers of indirection that make your
code harder to understand, just to make your tests run faster?

Of course if it's just a prototype, who cares? But I really agree with Uncle
Bob that tightly coupling your application logic to (a specific version of)
Rails/ActiveRecord is a bad idea if you want to make a long-lasting,
maintainable application of any non-trivial size.
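
One shape of the "test your logic in isolation, then plug it into the framework" idea: keep the business rule in a plain Ruby object that knows nothing about ActiveRecord. The `Discount` class and its rule are made up for illustration:

```ruby
# A pure business rule: 10% off orders of $100 or more.
# No schema, no migrations, no database connection needed to test it.
class Discount
  THRESHOLD_CENTS = 10_000

  def self.for_total(total_cents)
    total_cents >= THRESHOLD_CENTS ? (total_cents * 0.10).round : 0
  end
end

# In the Rails app, an `Order < ActiveRecord::Base` model would simply
# forward to Discount; only that thin model is coupled to the framework.
puts Discount.for_total(12_000)  # => 1200
puts Discount.for_total(5_000)   # => 0
```

A version bump of the framework can break the thin model layer, but the rule itself stays untouched and its tests stay green.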

~~~
munificent
> your business objects that contain all your logic shouldn't inherit from
> ActiveRecord::Base because that ties you to a specific version of a specific
> framework

Any time you introduce an abstraction layer to decouple some code, you're
making a prediction. You're saying, "I think it is likely that I will need to
change the code on one side of this interface and don't want to have to touch
the other side."

This is exactly like financial speculation. It takes time, effort, and
increases your code's complexity to add that abstraction. The idea is that
that investment will pay off later if you do end up making significant changes
there. If you don't, though, you end up in the hole.

From that perspective, trying to abstract your application away from your
_application framework_ seems like a wasted effort to me. It's _very_ unlikely
you'll be swapping out a different framework, and, even if you do, it's a
virtual guarantee that will require massive work in your application too.

Sure, it's scary to be coupled to a third-party framework. But the reality
is that if you build your app on top of one, that's a fundamental property of
your entire program and is very unlikely to change any time soon. Given that,
you may as well just accept that and allow the coupling.

~~~
kyllo
I agree that in most cases, it's reasonable (and cost-effective) to assume
that your Rails app will always be a Rails app.

However, it's _not_ reasonable to assume that your Rails 3 app will always be
a Rails 3 app. You will eventually have to upgrade--if not immediately for
feature reasons then eventually for security reasons. And upgrading a Rails 3
app to Rails 4 is a non-trivial effort; there are a lot of breaking changes,
some of which affect the models (e.g. strong parameters, no more
attr_accessible). If you skip versions you will just accumulate more and more
technical debt.

I think that ideally, you would have your business logic in classes/modules
that don't need to have code changes just because the app framework got a
version bump.

But generally speaking you're right, the decision of whether or not to put in
the up-front work to decouple your business logic from your application
framework, is like an investment decision with costs and benefits. Uncle Bob
is saying it's always worth it, DHH is saying it's never worth it, but I think
the reality is that it's sometimes worth it, depending on you and your
project.

------
rakoo
> The justification is largely one of speed.

Is it?

I was under the impression that you don't include them because a unit test is
testing a very specific piece of code and not the dependencies around it. This
is why you'll mock disk/db/network, just like you'll mock _other_ pieces of
code.

~~~
SnacksOnAPlane
If something is broken in code that I'm interfacing with, I'd like for my
tests to reflect that. I've never understood the "testing little things in
isolation" strategy when it precludes a "test everything all working together"
strategy.

~~~
dllthomas
I don't think it does preclude that, it's just not the focus of _unit_ tests.
Integration tests are also vital.

~~~
SnacksOnAPlane
Yeah, I get that. But it makes me wonder why "unit testing" became such a hot
thing. I think it's just because it's simpler than integration testing. But
integration testing is much more valuable.

~~~
rakoo
True, it's much simpler. It's also much more interesting when you start
having a huge codebase and you want to refactor/add a feature/fix a bug,
because you don't need to set up and verify the whole input/output of your
application, only that of the piece that does the logic you're interested in.

It's also simpler to use unit tests for verifying all possible inputs and
outputs of a component.

But, as others have said, _unit_ tests alone are necessary but not sufficient.

------
droob
I sense that these posts are written for a specific audience, rebutting a set
of arguments familiar to that audience, and that's why they seem so reductive
and narrowly-applicable, but I can't quite grasp how much of the argument
translates to the rest of the world.

~~~
amalag
It is aimed at design architectures which seek to separate application level
concerns from Rails. The theory being that Rails and your specific application
get too tightly coupled. The aspect being talked about here is the testing.

------
d64f396930663ee
I always assumed the point of mocking a database response was to ensure that
you were testing _just_ your code, and not also the existence of a database
with the right schema, the ability to connect to it, as well as the
correctness of the code that rolls back any side effects.

~~~
jarrett
> and not also the existence of a database with the right schema, the ability
> to connect to it

I like testing those things on which my model depends. It gives me much more
confidence. Why wouldn't I want to test them?

> as well as the correctness of the code that rolls back any side effects.

That's a drawback. No arguments from me on that one.

~~~
d64f396930663ee
> I like testing those things on which my model depends. It gives me much more
> confidence. Why wouldn't I want to test them?

Those things all need to be tested, but if a single unit test fails, it's nice
to know that it failed because the code was wrong, not because the database
connection happened to die just then. If I have one test for the logic, and
another that verifies that the database can be connected to, and a third that
verifies the schema is right, then the specific combination of failing tests
tells me a lot more about what's wrong and if my code even needs to be
changed.

------
cousin_it
111 assertions in 4 seconds? Why not 4 milliseconds, or 4 microseconds? These
must be some pretty huge assertions. I guess I'm missing something about
modern programming...

------
parasubvert
There's something to be said for DHH's point here, even though he's confused
about what a unit test is. Integrated and end to end tests are much, much more
important than unit tests. They actually test the application, not a
contrived, isolated scenario.

Much of the testing activity and literature of late has been complaining
about how brittle end-to-end tests are, because all the focus is on pure unit
tests.
This leads to defect pile-up at release time or at the end of an iteration.
Whereas the smoother teams I've worked with did end-to-end and integration
tests all the time. Unit tests existed too, but only when there was
sufficiently complex logic or algorithms to warrant such a test, or if we used
TDD to flesh out interfaces or interactions for a feature.

Many web applications don't have a lot of logic; they have a lot of database
transactions with complex variations for updates or queries. So, _especially_
if you have an ORM, which are notoriously fiddly ... it makes sense to have
the majority of tests (TDD or not) hit the database, since the code will only
ever be executed WITH a database.

Mocking or decoupling the database can introduce wasteful assumptions and
complexities that aren't needed in your code base. The only time it makes
sense to decouple the database is if you expect you'll need polyglot
persistence
down the road and your chosen persistence framework won't help you.

I have worked with developers who prefer test cases to run in under 1 second
on
every save. To me it helps to have a set of unit tests that are in-memory and
very fast, that cover basic sanity checks like model integrity, input
validation and any in-memory algorithms. But the bulk of tests really need to
test your code _as it will be used_ , which often involves database queries.
At worst, use an in-memory database that can load test data and execute tests
in a couple of seconds.

------
jonahx
"These days I can run the entire test suite for our Person model — 52 cases,
111 assertions — in just under 4 seconds from start to finish. Plenty fast
enough for a great feedback cycle!"

4 seconds is really slow, actually, and enough to take you out of flow. With a
PORO Person object, decoupled from the system, that number will easily be sub
500 ms and possibly much less.

~~~
avenger123
I am really trying to understand these flow comments that are coming up.
Waiting for 4 seconds for tests to run or even a few seconds more just seems
like a silly thing to get caught up on. If we are talking minutes then I can
see that but single digit seconds?

~~~
dragonwriter
If you've developed a workflow around the kind of automatic test suites that
run on every save to a source file and provide instant feedback, a
several-second wait can be a significant rhythm break.

~~~
dllthomas
Doubly so if that's synchronous (probably the case with vim).

------
joevandyk
A problem with running all your tests in a single transaction is that that's
not actually what happens when your code is run. You will have multiple
transactions (unless for some reason you wrap every single web request inside
a transaction, which I think is a terrible idea).

There are slightly different things that happen: _now()_ will always return the
same time, deferrable constraints/triggers are useless, you can't have another
database connection looking at the test results or modifying the database (say
you are testing deadlocks or concurrent updates, or you have code that opens a
new database connection to write data to the database outside the current
transaction), etc.

It's fine for simple, vanilla ActiveRecord use where you aren't using lots of
database features, I suppose.

~~~
DrJokepu
I'm curious, why do you think that wrapping every single web request in a
single transaction is a terrible idea?

~~~
paukiatwee
Mainly it's a performance issue; see this SO question
[http://stackoverflow.com/questions/1103363](http://stackoverflow.com/questions/1103363)

~~~
DrJokepu
Sounds like this is pretty specific to some ORM tools!

------
batbomb
> Oracle abomination

Okay... PostgreSQL is great but it still has a bit of catching up to do.

> ... run your MySQL

Wait, Oracle is an abomination but MySQL is okay?

> Before each test case, we do BEGIN TRANSACTION, and at the end of the case,
> we do ROLLBACK TRANSACTION. This is crazy fast, so there's no setup penalty.

You know what is just as easy? Making SQLite databases (aka files) for each
test case. Copy a file, open it, delete it. It has the added benefit of
allowing you to actually commit changes and not worry about rollback. There
are some compatibility issues, and I'm not familiar with all those issues in a
Rails context.
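
The copy-a-file pattern batbomb describes, sketched with plain files standing in for a pre-migrated SQLite database file (in real use the template would be created once by running the migrations):

```ruby
require "fileutils"
require "tmpdir"

Dir.mktmpdir do |dir|
  # A pristine template, prepared once.
  template = File.join(dir, "template.db")
  File.write(template, "pristine fixture data")

  # Each test case gets its own throwaway copy, so it can commit
  # freely -- there is nothing to roll back, just a file to delete.
  2.times do |n|
    scratch = File.join(dir, "test_#{n}.db")
    FileUtils.cp(template, scratch)
    File.write(scratch, "mutated by test #{n}")  # "commit" without worry
    File.delete(scratch)
  end

  puts File.read(template)  # => pristine fixture data (untouched)
end
```

Because every case starts from a fresh copy, tests stay independent even when they commit, which is the benefit over BEGIN/ROLLBACK that the comment points at.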

~~~
rakoo
> You know what is just as easy? Making SQLite databases (aka files) for each
> test case.

By doing this, you're breaking the dev-prod parity rule of 12factor apps [0]:
you should make sure your dev and prod differ as little as possible. If you're
using MySQL in production, you should also use it in dev.

[0] [http://12factor.net/dev-prod-parity](http://12factor.net/dev-prod-parity)

~~~
batbomb
Let's not conflate unit testing with integration testing. SQLite should, in
most cases, work just fine for unit testing.

First off, many people would argue unit testing should never really require a
database. I'm not of that opinion, but I'm not writing CRUD apps, and since
there's not really a SQL unit testing framework, using SQLite is a nice
compromise, especially as unit testing should necessarily be limited in scope.

Integration testing should definitely be done on a system that is in parity
with prod.

------
brasetvik
For local testing on Postgres where you don't care about database reliability,
you can also speed things up a lot by setting `fsync = off` and
`synchronous_commit = off`.

(Never do that on a production database, of course!)

~~~
joevandyk
You can also have postgres running off /dev/shm.

------
daleharvey
Testing dependencies is not a bug. If there is a reason not to test them --
you need to test an error condition, or your dependency is external (OAuth,
etc.) -- then certainly mock, but if there is no need to mock a dependency
other than dogma about some definition of unit test, then it usually isn't
worth it.

With every test, the questions that should be answered are: what bugs is this
going to catch, and which ones will it miss? If you mock a dependency then you
are introducing cases in which it will miss bugs, and there should be a
justification along with it.

~~~
mbrock
The classic justification is that mocking encourages modular code where each
unit has shallow dependencies with well-defined interfaces.

This also means that a mistaken change to one important unit will not break
the entire test suite. Sure, the entire program will break, but it's nice to
get a single failing test.

Mocking also gives a very straightforward way to simulate interactions with
collaborators. You just say "given that the HTTP module returned a 404, this
request should be sent to the error log," instead of initializing those two
modules and arranging for the desired case to happen.
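
mbrock's 404 scenario, written out with Ruby's bundled `Minitest::Mock` (no test runner needed); the `RequestHandler` class and its method names are invented for the sketch:

```ruby
require "minitest/mock"

# The unit under test: given a 404 from its HTTP collaborator,
# it must report the failure to the error log.
class RequestHandler
  def initialize(http, error_log)
    @http = http
    @error_log = error_log
  end

  def fetch(url)
    status, body = @http.get(url)
    @error_log.record(url, status) if status == 404
    body
  end
end

# Mock both collaborators: no web server, no log file required.
http = Minitest::Mock.new
http.expect(:get, [404, nil], ["/missing"])

log = Minitest::Mock.new
log.expect(:record, nil, ["/missing", 404])

RequestHandler.new(http, log).fetch("/missing")
http.verify  # raises if the expected interaction didn't happen
log.verify
puts "404 was logged"
```

The "given that the HTTP module returned a 404" part is one `expect` call, instead of initializing a real HTTP stack and arranging for a genuine 404.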

There's a very old discussion about decreasing coupling and increasing
cohesion that's super important to the whole motivation behind TDD and that
nobody seems to be very interested in anymore...

------
lmm
4 seconds is a long time. I'm reminded of the SVN fans who say things like "I
can commit in 2 seconds, that's plenty fast enough". Which it is, until you've
experienced the alternative, and then you can't imagine going back.

Also, all that separation isn't free. Sure, I don't _need_ to run all my unit
tests every time I make a change - but if they're fast enough that I can,
that's much less cognitive overhead than having to think about which tests are
relevant and press the correct button.

------
thejosh
If using MySQL, and you need to run tests, the following option on our
_DEVELOPMENT_ server really sped things up:

innodb_flush_log_at_trx_commit = 0

------
lazyatom
Hitting the database or not, using fixtures introduces coupling into your test
suite that's often more trouble than it's worth.

[http://interblah.net/the-problem-with-using-fixtures-in-rails](http://interblah.net/the-problem-with-using-fixtures-in-rails)

------
slavoingilizov
So here's how I summarise the whole essay: "Hardware is cheap. Instead of
making your software perform well, why not just throw more hardware at the
problem."

Well, I've tried this before and it didn't work.

------
brown9-2
_runs in 4 minutes and 30 seconds. That's for a 1:1 test:code ratio._

is this claiming 100% test coverage?

~~~
bmm6o
I think it's just a LOC comparison.

~~~
brown9-2
That seems like a not-very-useful metric.

------
metaphorm
today I learned that DHH doesn't know what a unit test is.

