
When to Mock - vkhorikov
https://enterprisecraftsmanship.com/posts/when-to-mock/
======
mabbo
What bothers me about this article is that it doesn't really present
alternatives or real-world examples of how to do what the author is saying.

> Use real instances of managed dependencies in tests.

Is the author suggesting I should spin up an entire instance of Oracle to run
a single 50ms unit test? Because I often want to run my unit tests very
frequently as I develop, ensuring I'm still green, etc. And my code may depend
on a large, complex database. I could have a continually running instance
just for my unit tests, but that's expensive (time, money, hosts, etc) and
means I can't run my unit tests if the wifi goes down.

If we're talking about large, complete systems running very large suites of
system tests as part of a CI/CD pipeline, then sure, I'm on board. Let's use a
real database. But there's lots of other small, simple tests that absolutely
don't need anything more than a mock while I work on business logic bugs.

And no, I'd not like to get into a debate of "you shouldn't bother to write
small unit tests". I find them very useful.

~~~
jaitsu
Writing unit tests is debatable?

~~~
Tomis02
Unit tests are code, and all code has a cost. Writing unit tests without any
benefit is harmful behaviour. Real world example: testing dumb
getters/setters.

~~~
allendoerfer
Writing getters and setters is harmf... - okay, I will stop before things
escalate in here!

~~~
Tomis02
If you live in OOP land and have a natural aversion to directly accessing the
data then by all means, write all the getters and setters you want. But
testing that "int getX() { return X;}" actually returns X is pathological
behaviour. It's harmful because 1) it costs to write 2) it costs to read 3) it
takes up space, distracting from more important tests 4) it costs to run 5) it
prevents you from easily making changes (it's a brittle test). I can't really
see any benefits; if you manage to screw up writing a dumb getter, how can you
trust yourself that the test code isn't wrong as well?

------
harrisonjackson
For a long time, we ran our Django test suite against an in-memory SQLite DB.
It was super fast, which encouraged more tests to be written and a CI/CD
process that allowed everyone to confidently ship code often.

Our production database is postgres.

We kept bumping against things we wanted to do in the application code that
worked well with postgres, but would fail in sqlite. We limited our
development so that we could keep running the tests. We knew that we could run
them against a postgres db, but the development time to rewrite our CI test
runner to spin up a fresh database was not worth it.

Recently we migrated from github + Jenkins to gitlab + our own gitlab-runners
in AWS. During that switch, we prioritized testing against a Postgres DB and
got it all running in a containerized way that spins up a fresh DB for every
test run. The tests are slower but the runners scale horizontally so we don't
mind queueing up as many merge requests as we need to and deployments through
Gitlab environments are a big improvement over our Jenkins deployment job, so
we still ship as often as we want.

Now our biggest testing issue is keeping fixtures up to date.

~~~
malinens
If you keep using SQLite for tests, it forces your database logic to be
universal. You could confidently switch to another DB, like MySQL, at any
time.

~~~
mumblemumble
It also means that you have to stick to the lowest common denominator, which
is approximately equivalent to the state of the art as of a quarter century
ago.

Every benefit has a cost. The benefit doesn't always justify the cost.

edit: To add to that - I've seen more than a couple of major database engine
migrations in my day, so it's not as though that isn't a concern. But none
of them has ever been from one SQL RDBMS to another. More common is migrating
among different classes of database. MySQL to Mongo, Oracle to BigTable, Couch
to Cassandra, something like that. MS Access to MS SQL Server a couple times,
but even those are different enough that it was never going to be as simple as
changing the connection string and having a carefree life.

The speculative future proofing that you do almost never manages to work for
the future you end up actually living.

------
pgt
Panic-closed the tab when a giant popup form appeared asking for my email.
Hate to add noise to the discussion, but I hope the author sees this and
understands that the market doesn't want this and that it reduces readership.

~~~
xhrpost
These kinds of CTAs are everywhere these days, and it's rather frustrating.
Wish there were some solution to filter out sites that use completely
non-ignorable CTAs (as opposed to ones tucked in a corner).

~~~
SAI_Peregrinus
> Wish there was some solution to filter out sites that use completely not-
> ignorable (like in the corner) CTAs.

uBlock Origin, if the default filter lists don't catch it just right-click and
block the element.

------
mumblemumble
So, disclaimer: I'm about to focus on the finger instead of where it's
pointing. Because that's what I feel like jabbering about right now. TFA isn't
just about databases; it makes some interesting points on test practice in
general, and is well worth a read.

Anyway,

A story I've seen all too often when mocking the database: A large development
effort goes into creating test infrastructure. And then there end up being
scads of bugs that weren't caught by the unit tests, because the test doubles
for the database don't accurately emulate important behaviors and semantics of
the real database.

This isn't just a problem with mocking, mind - it's also a problem I've seen
(albeit less often) when using some other DBMS during testing because it's
designed to operate in-memory.

Nowadays it's not too hard to configure a RAM disk for the DBMS to use.
Especially if your test stack runs it in Docker. If you're having performance
problems with your test suite, start there. You might never achieve the same
run times as you could with mocking, but, if there's one thing they hammered
on in my Six Sigma training that I wholeheartedly agree with, it's that you
shouldn't sacrifice quality or correctness for the sake of speed.

It's also not too difficult (not any more difficult than going hog-wild with
mocks, anyway) to set up a mechanism that uses transactions or cleanup scripts
or similar to ensure test isolation, so I don't find that complaint to be
particularly compelling. You can even parallelize your tests if you can set
things up so that each test is isolated to its own database or schema.

~~~
commandlinefan
> scads of bugs that weren't caught by the unit tests

Which is fine! Unit tests will never catch all the bugs - neither will type
safety. Neither will code reviews. Neither will manual testing. But unit tests
do catch the kinds of bugs that unit tests are good at catching, which eases
the burden on the manual testers.

~~~
monkpit
> because the test doubles for the database don't accurately emulate important
> behaviors and semantics of the real database.

Commenter implies that bugs occurred in code that was assumed to be tested and
correct (according to the test specs) because it did have tests. Which is
decidedly Not Fine.

Now, would I consider them to be “unit tests” in this case? Probably not. But
the label you decide to slap on the test doesn’t change the fact that a spec
was written and code was tested against it and passed (falsely) due to mocking
the db.

~~~
mumblemumble
Whether, in practice, I would consider them to be unit tests probably depends
on what my colleagues want to call them, and little else.

Personally, I do prefer a more classicist definition, because I believe that's
the more pragmatic one. But I also believe that arguing over the definition of
the term "Unit test" is one of the most wasteful possible examples of
bikeshedding. Like you imply, the only thing that really matters is the extent
to which your test suite gives you confidence that the software behaves
correctly.

------
UK-Al05
Most of the reason for 'mocking' databases is performance. If you have 8000
tests, a fake database drastically speeds up the run.

One trick I use is to write an in-memory version and use that in unit tests.

I then write integration tests that check the behaviour of the InMemory and
RealVersion are exactly the same, injecting either version into the same
tests. They also check I haven't broken any code in the RealVersion that
isn't covered by unit tests, mostly because it's just external interaction in
there.

If you have to verify inter-service interactions, use contract tests.
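
A minimal sketch of that pattern, assuming a hypothetical user-repository
interface with `add`/`get`; SQLite stands in for the "real" datastore here,
and all names are illustrative:

```python
import sqlite3

class InMemoryUserRepo:
    """Fast fake used in unit tests."""
    def __init__(self):
        self._users = {}
        self._next_id = 1
    def add(self, name):
        uid = self._next_id
        self._next_id += 1
        self._users[uid] = name
        return uid
    def get(self, uid):
        return self._users.get(uid)

class SqlUserRepo:
    """'Real' version; SQLite stands in for the production DBMS."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    def add(self, name):
        cur = self.conn.execute(
            "INSERT INTO users (name) VALUES (?)", (name,))
        return cur.lastrowid
    def get(self, uid):
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (uid,)).fetchone()
        return row[0] if row else None

def check_contract(repo):
    """The same spec is run against both implementations."""
    uid = repo.add("alice")
    assert repo.get(uid) == "alice"
    assert repo.get(9999) is None

for repo in (InMemoryUserRepo(), SqlUserRepo()):
    check_contract(repo)
print("both implementations agree")
```

The key point is that one shared spec is injected with either implementation,
so any behavioural drift between the fake and the real version fails the
build.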

------
bob1029
Our trick is to use SQLite to completely side-step the concern of maintaining
any sort of central testing database. Each developer can clone a fresh copy,
load the .sln, hit F5, and every database required by the application is
immediately available within the same process. There is absolutely no other
software required to be installed.

This also makes it infinitely easier to coordinate complex schema changes.
Each developer can sort it all out on their local branch before we even know
about it. If we had to share some common test database, this would become a
much more painful process.

Also, I do not believe in using mock layers for the database interactions. Our
service implementations are tightly-coupled with their backing datastore. This
is the only way we are able to make SQLite a viable storage medium for a high-
throughput business application. As a consequence, testing our services in
absence of their concrete datastores would be an extremely disingenuous
endeavor for us.
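
In Python terms, the same in-process idea might look roughly like this
(database and table names are purely illustrative): every "database" the
application needs lives inside the test process itself, with no server to
install or share.

```python
import sqlite3

# One in-memory database per logical store, all in the same process.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute("CREATE TABLE main.users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE billing.invoices (id INTEGER PRIMARY KEY, user_id INTEGER)")

conn.execute("INSERT INTO users (name) VALUES ('alice')")
conn.execute("INSERT INTO billing.invoices (user_id) VALUES (1)")

# Cross-database joins work because everything is attached locally.
row = conn.execute("""
    SELECT u.name
    FROM users u
    JOIN billing.invoices i ON i.user_id = u.id
""").fetchone()
print(row[0])
```

Each developer (or test run) gets a fresh, private copy of every store, which
is what makes coordinating schema changes on local branches painless.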

~~~
squiggleblaz
Are you using SQLite as your main production datastore, or are you using
SQLite as a stand-in for something else? It sounds like production. One db
file or several? How big? Can you summarise the non-test benefits and costs?

------
steerablesafe
> Retrieving data from the database is an incoming interaction — it doesn’t
> result in a side effect.

Unless you count the time taken to retrieve the data as a side effect. If you
are implementing a cache with well-specified behavior, then you might want to
test incoming interactions.

Anyway, this just shows that having a side effect can be a matter of
perspective.

------
devit
You should never mock anything. Test against the same dependencies that exist
in production.

If that's impossible (e.g. you get charged for the backends or you are
controlling physical objects), then generalize the program to support
alternative backends and frequently test only those that can work in the test
environment, using some more ad-hoc methodologies for the others.

Also, in general, if you need to change your tests for valid changes in the
implementation, then your approach to testing is completely broken. One
example is dumb testing strategies that check that the code produces specific
SQL queries instead of checking that the code returns correct results.
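
A rough illustration of that contrast, with hypothetical names: assert on
what the function returns, not on the SQL text it happens to emit.

```python
import sqlite3

def active_user_names(conn):
    # The exact SQL is an implementation detail: it may be rewritten
    # (different ordering clause, a join, an index hint) without
    # changing the observable behaviour.
    return [row[0] for row in conn.execute(
        "SELECT name FROM users WHERE active = 1 ORDER BY name")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, active INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("carol", 1), ("alice", 1), ("bob", 0)])

# Brittle: asserting on the generated SQL string breaks on any valid
# rewrite. Robust: assert on the result instead.
assert active_user_names(conn) == ["alice", "carol"]
print(active_user_names(conn))
```

The brittle variant would fail the moment the query is reformatted or
optimized, even though the program still returns exactly the right answer.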

~~~
doublesCs
> Test against the same dependencies that exist in production.

Interesting. I said this at my work recently and got a condescending
explanation about how production things are _production_, we don't touch
them. If we need stuff for development, those are dev things.

I now think that whether this is or isn't a good idea depends on the
specifics. More often than not, I think it makes sense.

~~~
detaro
I think parent means test against the same code, not against the same instance
as production. Tests don't get to talk to the production database, but should
use the same database software, deployed as similarly as possible, as
production does. Against test/demo versions of APIs if possible. ...

~~~
tmountain
With an identical schema to boot.

------
slifin
If your database is mutable, then you have a third-party network resource;
essentially you can't test it, because you have no stable basis in time and
no means of dealing in values.

All you have is a shared place; mocking is essentially putting up a fence
around it to prevent testing into that boundary.

If your database is immutable and is indexed for time travel, you can rewind
or fast-forward your database into your desired state. If your database
supports speculative writes, you can even build up a non-committable state.

I'm sure Rich Hickey has a talk on this in relation to databases.

------
aidenn0
This article uses the Mock/Stub distinction exactly reversed from how I've
always heard it. Naming things is hard.

~~~
ooOOoo
The article uses a Mock/Stub distinction very close to the famous _Mocks
Aren't Stubs_ article by Martin Fowler:
https://martinfowler.com/articles/mocksArentStubs.html

------
axegon_
There are cases in which this is nearly impossible, especially when you rely
on products/services delivered by other teams/companies. That said, whenever
mocking can be avoided, it should be avoided at all costs. I've witnessed the
"but it works on my computer" situation precisely because of that (and
subsequently suffered from it immensely), especially during the development
phase (even if that means pulling 6-7-8 containers, do it): someone misread
the documentation and, instead of "package_signature", used camel case in
their code. Of course it works on your local... Now could you give us back
the 3 hours we spent trying to solve it, please?

------
BiteCode_dev
This is one of the reasons I don't like to put application logic directly in
the database.

It makes everything harder: caching, testing, abstracting, balancing, etc.

~~~
Tomis02
Where do you draw the line between a query and application logic? Does a join
or aggregate function count as application logic?

On another note, a few hours/days spent implementing application code can save
you 5 minutes of writing SQL.

