
Superior Testing: Stop Stopping - ming13
https://arturdryomov.online/posts/superior-testing-stop-stopping/
======
kstenerud
Regardless of all the arguments for and against tests, it's important to
remember your purpose as an IT professional: Your job is to solve the
customer's problem. That's it.

HOW you solve their problems is entirely up to you, and some solutions will
work better than others.

The only thing about customers guaranteed to be consistent is that they will
request changes. Some will be good changes. Some will be bad changes. Some
will be unavoidable changes. Your job here is to guide them through the best
possible change process, with the ultimate goal of solving their problems.

Testing is a useful tool for (1) making sure the system matches the customer's
expectations, and (2) making sure new changes don't break old things.

How you implement tests is entirely up to you, but once again, some solutions
work better than others depending on the situation. Manual testing scales
only up to the point where its cost exceeds the cost of implementing and
maintaining automated tests. You need to become good at estimation here, but
for 90% of projects, automated testing is the most efficient long term
strategy.

WHERE you test is also important. Generally, it's best to test components at
the edge of their interfaces. When you input X, Y should come out, ideally
every time. This is where immutable data and idempotent APIs are VERY helpful
for maintaining a reliable codebase. The more you need to cut into a
component's innards to test it, the more you should be asking yourself why.

Testing is very much about architecture. Make clear boundaries between
components and outside interfaces, use immutable data and idempotent APIs, and
tests become a lot easier to write, and more resilient to change.
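
To make that concrete, here's a minimal sketch in Python (the function and
names are made up): the tests exercise only the component's interface, and
the frozen dataclass keeps the input immutable, so the same X in always
means the same Y out.

    # Interface-edge testing: a hypothetical pure function, tested
    # only through its public interface, never through its innards.
    from dataclasses import dataclass
    from typing import Optional
    import unittest

    @dataclass(frozen=True)  # immutable input data
    class Order:
        subtotal: float
        coupon: Optional[str] = None

    def apply_discount(order: Order) -> float:
        # Pure function: no hidden state, so input X always yields output Y.
        if order.coupon == "SAVE10":
            return round(order.subtotal * 0.9, 2)
        return order.subtotal

    class ApplyDiscountTest(unittest.TestCase):
        def test_coupon_applies_ten_percent(self):
            self.assertEqual(apply_discount(Order(100.0, "SAVE10")), 90.0)

        def test_no_coupon_leaves_subtotal_unchanged(self):
            self.assertEqual(apply_discount(Order(100.0)), 100.0)

        def test_same_input_gives_same_output(self):
            order = Order(100.0, "SAVE10")
            self.assertEqual(apply_discount(order), apply_discount(order))

    if __name__ == "__main__":
        unittest.main()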

~~~
olau
> but for 90% of projects, automated testing is the most efficient long term
> strategy

I don't think this is a good way to think about it.

I see plenty of projects that don't go anywhere, and I also see plenty of
projects that do go somewhere, but where large parts change so little and are
so easy to test manually that there's little point in much automation. Other
parts are complex and full of contradictory business rules so formal tests are
necessary.

Of course, YMMV. If you're inside Google, things are probably looking very
different. But most people aren't.

~~~
bunderbunder
I've inherited one or two projects like the first kind you describe, and when
they don't have tests, it's _awful_.

The problem is, regardless of how stable the code was, or how easy it was for
the original author to test it manually, if the code hasn't been worked on for
ages, then the original author is either gone or can't remember all the
details of how it was supposed to work anymore. Which leaves you in a
dangerous situation, because there's no good way of knowing which behaviors
are by design and which ones are incidental.

~~~
drdeadringer
Documentation.

I swear, if it's not "we don't have time to test" it's "we don't have time to
document". Ok, but you've hired all these firefighters instead.

~~~
bunderbunder
[http://wiki.c2.com/?SharpenTheSaw](http://wiki.c2.com/?SharpenTheSaw)

------
latch
Tests are first and foremost about design. If it's hard to test, there might
be too much coupling. Some code, by its very nature, is hard to decouple.
Just like some code, by its very nature, has to mutate state. But having a
signal (the pain of writing a concise test) is invaluable in proactively
improving code quality.

I think there's a direct relationship between _good_ tests and _good_
codebases. I have to say _good_ tests, because I've seen people attack
non-cohesive code not by making it more cohesive, but by writing non-cohesive
tests. Just any test is NOT better than no tests. Slow tests are awful. Tests
that are flaky are awful. And tests that don't really test properly (or miss
important edge cases) can give a false sense of safety. I've heard people say,
if tests are so hard to get right, maybe they aren't the right tool. I've
since come to the conclusion that no one said programming properly had to be
easy.

What I've found is that, if you test for a long time, the value of tests as a
design tool diminishes, because you gradually write code that's easier to test
(and easier to test is, for all intents and purposes, always easier to read,
reason about and maintain). But you also get better at testing and spend less
time struggling to write tests (because the code is cleaner), so you gain
less, but it costs you less.

And then you still gain the other benefits. Protection against regression,
correctness, documentation (especially of edge-cases). More efficient
onboarding.

Also, there's this common fallacy of speed, cost, quality: pick two. The
reality is a lot more shades of gray, but if you were to generalize it, I'd go
to the other extreme and say that you can't have low cost WITHOUT high
quality. Cost and quality aren't opposing forces of each other; they are both
products of the skill of your team.

------
lordnacho
Once you have something nearing a steady-state design, you need to have tests.
I can understand that something in a very early stage might not have tests
yet.

At my current work we have loads of automatic tests. Every time something is
changed, there are pipelines that test the following:

\- Simple unit tests: Start with state 0, take an action, is state 1 what you
expect? (See the sketch after this list.) There are literally hundreds of
these, and they have indeed found issues. Many edits that would have gone in
have had to be rewritten to cater for corner cases found in the tests.

\- Valgrind suite tests: Since we're dealing with C++, there are some
interesting errors that might happen. Changing the order of some lines can
look innocuous, but sometimes that causes an iterator invalidation. This kind
of thing is hard to spot, but we have automated tests that will stop you
merging code that does this. Things that are UB in C++ can be especially
tricky.

\- Integration tests: Several pieces that each pass their unit tests might
not work together. Luckily you can define a CI script that launches both
and makes them talk to each other.
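
To illustrate the first kind: our codebase is C++, but the shape is the same
anywhere. Here it is sketched in Python with a made-up order book.

    import unittest

    class OrderBook:
        # Hypothetical component under test.
        def __init__(self):
            self.orders = {}

        def add(self, order_id, price):
            self.orders[order_id] = price

        def cancel(self, order_id):
            self.orders.pop(order_id, None)  # cancelling twice is a no-op

    class OrderBookTest(unittest.TestCase):
        def test_add_then_cancel_returns_to_empty(self):
            book = OrderBook()                 # state 0
            book.add(42, 99.5)                 # action
            book.cancel(42)                    # action
            self.assertEqual(book.orders, {})  # is state 1 what you expect?

        def test_cancel_of_unknown_order_is_harmless(self):
            book = OrderBook()                 # corner case: nothing to cancel
            book.cancel(7)
            self.assertEqual(book.orders, {})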

As you can imagine, it takes a lot of work to write the tests. One issue with
a lot of teams is there simply isn't enough time. You have to show progress,
which means showing things that seem to function externally. But you're always
paying for that by having to fix things that you find as you go. Not having
tests is a form of tech debt. It costs more and more as your code base grows.

~~~
tokyodude
I am not convinced this is always true. The first 25 years of my career was
spent making games. Never had tests. We had testers but no tests. I then
worked at a big company with tests. My first experience with them. Working on
something important I loved them but they easily cut my velocity by 70%
compared to what I used to get when working on games. The games were AAA and
shipped to millions of customers on CDs and cartridges, so no bug fixes were
allowed after the fact. Being a big company and a big project, there were
10-30 engineers who maintained testing infrastructure, continuous
integration, etc. That's larger than the entire engineering team on most
games I worked on.

I'm not arguing against tests, and having had the experience of testing I'd
use them when appropriate; I'm just not sure where the exact cutoff is. If I
was working on a game engine for multiple teams I'd be writing tests for
sure. If I was working on a small 5-20 person team with custom tech I'm not
sure I'd start adding tests where I didn't before. Maybe if there was IAP or
multi-player online or some other server component and metrics. Or maybe if
it was easier to get started with and to maintain.

There's a bunch of big differences between games and much other software.
Often games are not maintained after shipping. Of course that's less true
today than it was in the past, with longer-term online games, but it's still
true that many games are pretty much done the moment they ship. Another big
difference is that the teams are 70-90% non-engineers making tons and tons of
data.

I mostly bring this up because software engineers often talk past each other,
since not all software engineering is the same.

[https://www.joelonsoftware.com/2002/05/06/five-worlds/](https://www.joelonsoftware.com/2002/05/06/five-worlds/)

~~~
heavenlyblue
Well, I mainly work in a team where developers ship separate products for
internal usage. So it's not really working as a team. Products are generally
quite technical - it's not your usual CRUD shop.

I do write tests, but only post factum, when I know for sure a component
will be re-used in other components. I write them in cases where I know I
will forget about certain edge cases after some time, so changing the code
would most certainly introduce bugs.

When you have components that are reused over the course of several years (and
obviously being optimised if the requirement is there), the chance of
regressions is severely increased.

So I see that "writing tests for everything" is more of a political stance,
rather than entirely practical.

------
chriswarbo
I heard the "customers don't pay for tests" line recently, from someone
boasting about how they migrated hundreds of thousands of lines of PHP from
5.x to 7.x.

I didn't have an off-the-cuff reply to this, but thinking about it afterwards
the issue became clear to me: customers don't pay for code either! Customers
pay for solutions to their problems.

I think the key phrase is "how do we know?": _maybe_ those hundreds of
thousands of lines of PHP were needed to solve the customers' problems, but
how do we know? After all of that work, does it actually do what's needed?
When the requirements inevitably change, how do we know when we've finished
our patches?

We can never know for sure, but there are ways to gain confidence in what
we've done. Automated testing is a _really_ low-cost way to gain quite a lot
of confidence, which is also relatively stable over time. In contrast, manual
tests are either very expensive or woefully shallow; and re-running them in
the future takes just as much effort each time. Static analysis,
code read-throughs, formal verification, etc. can give us more confidence
than tests,
but at a much higher cost. Simplifying the codebase can also help (code is a
liability, not an asset!), but again that can be expensive.

We should get the most bang for our buck: usually that means adding more
tests. _Occasionally_, if that's not enough, we might sit down and prove
something, or manually step through a printout, etc., but _usually_ we could
get more out of the same time by writing tests.

~~~
arkh
> customers don't pay for code either! Customers pay for solutions to their
> problems.

And that's why you don't test code. You test that your application complies
with your customer's requirements.

I'm happy to see things changing about tests, with people realizing that
fine-grained unit tests are often a hindrance and you should prioritize
end-to-end testing. Test the interface of what you're selling, not the inner
workings.

~~~
purerandomness
Are you aware of the Test Pyramid, and why it exists?

~~~
arkh
> Are you aware of the Test Pyramid

Yes.

> and why it exists?

Because the xUnit consultant crowd worked a lot. It is easier to write unit
tests (which fossilize your code), so there's a lot of tooling around them.
And the reliance on external services with no sandbox or ready-to-use mocks
means E2E is harder to implement. But being harder just means you have to do
your job.

When I buy your software I don't care about how you implemented some pattern.
What I care about is that when I click on this gizmo in that situation, it
does what it should, in some time-frame, using some resources.
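
Roughly like this, as a sketch (Python; the staging URL and endpoint are
made up). The test asserts on observable behaviour and the time budget, not
on the implementation:

    import time
    import requests

    BASE_URL = "https://staging.example.com"  # hypothetical environment

    def test_gizmo_quote_does_what_it_should_in_time():
        start = time.monotonic()
        resp = requests.post(f"{BASE_URL}/api/quote",
                             json={"item": "gizmo", "qty": 1},
                             timeout=5)
        elapsed = time.monotonic() - start
        assert resp.status_code == 200
        assert resp.json()["total"] > 0  # it does what it should...
        assert elapsed < 2.0             # ...in some time-frame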

------
skohan
> It is the goal of every competent software developer to create designs that
> tolerate change [...] Code without tests is bad code. It doesn’t matter how
> well written it is; it doesn’t matter how pretty or object-oriented or
> well-encapsulated it is. With tests, we can change the behavior of our code
> quickly and verifiably. Without them, we really don’t know if our code is
> getting better or worse.

I agree that testing is a valuable tool for making good software, but I think
the idea that all code categorically _requires_ testing to be considered good
is overzealous.

I think the most ardent TDD proponents underestimate the _costs_ of testing.
Tests are also code which can have bugs and has to be maintained, and having a
well-developed test-suite can act as a type of inertia which makes it harder
for an organization to make necessary architectural changes.

Don't get me wrong - I think testing is important, but it is just one tool
which has to be balanced against a number of other factors.

~~~
pjbster
I have two anecdotes about testing. First one was where I decided to refactor
a function which had something like a 10-way cascading branch each of which
had a compound test. Normally I wouldn't bother but, on this occasion, I wrote
something like 60 unit tests in preparation for the exercise and one of those
tests exposed a misplaced closing parenthesis in my solution.

The second case was where an in-house quotes engine was to be migrated to a
SOAP service and some calculations needed to be ported. We didn't have access
to the source so we created a small set of the most complex scenarios we could
come up with and used those to generate calculator requests. I think we had 26
or 27 test cases and they each required non-trivial setup before the
calculator could be invoked. Those cases exercised code which took the
developer about 3 months to refine into a working solution.

So what does this reveal? I don't know. On the one hand, we had just under 60
unit tests which picked up 1 bug whilst, on the other, we had fewer than half
that number of end-to-end tests which were sufficient to build a major piece
of business functionality.

My gut feeling is that end to end testing is a better long term investment and
unit testing is perfect for refactoring but inefficient for anything else.

------
moring
The introduction made me hope for real advice on how to "stop stopping", but
it only repeated the old arguments on why we want tests. Why not deal with
some of the real problems that actually prevent people from writing automated
tests?

Real-world example 1: The system being developed talks to external system X.
We don't want tests to litter the production database, especially since it's
about accounting and we would get into legal trouble for that. However, there
is no possibility to open a test account on the production system, and no
budget for a license for a test installation of X. The main trouble with X is
that its public API (web service) changes from version to version and there is
no useful documentation about it. How would one write integration tests for
that?

Real-world example 2: How would you write tests for a system whose
requirements are unspecified, even at a very coarse level, after the deadline
when it is forced into production by management?

I'm pretty sure that both examples happen in other places than the ones I've
seen them, too.

~~~
z3t4
If it's an accounting service, make a new account/table named "Software test".
Book-keeping was invented to spot errors. If you have a non-zero balance in
your "Software test" account, you probably have a bug in the software.

~~~
hibbelig
So I write some code and a test, and of course the first time I run the code,
the test is going to fail.

So now the balance in our "Software test" account is nonzero whereas it should
be zero.

But audit requires us to record all bookings, so how do we explain these bogus
bookings (both from the erroneous code that made the mistake in the first
place, and from the manual adjustment later that fixes the mistake for the
next test run)?

~~~
z3t4
You can have many types of tests. First you have assertions and unit tests
with mocks/injections that run at compile time together with the type checker
and other low-level tests. Then you have the integration tests that make sure
your code works together with third parties and other APIs. Those tests might
seem unnecessary when you already have unit tests, but it's very nice to know
if some third party has made breaking changes and that _your_ software has
stopped working because of it.
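
For instance, an integration test can pin down the parts of a third-party
contract your code depends on (a Python sketch; the provider URL and fields
are made up):

    import requests

    def test_rates_api_contract_still_holds():
        # Hits the real provider, so a breaking change upstream fails
        # here instead of silently breaking our software in production.
        resp = requests.get("https://api.provider.example/v2/rates",
                            timeout=10)
        assert resp.status_code == 200
        payload = resp.json()
        assert "base" in payload  # fields our code relies on
        assert isinstance(payload["rates"], dict)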

------
airfreak
I like the concept of error budgets. Start off by knowing what kind of quality
and resiliency a system requires and design your test strategy around that.
Means talking to the client about that.

I'm not going to invest a load of time in various types of automated test for
an internal site with a form over a database that 2 users use for low priority
work. The idea of 80%-100% code coverage for basic work like that seems like
waste to me.

But for the critical path of the eCommerce shopping experience I'm going to
write all kinds of automated tests at multiple layers of the stack, right up
to chaos/stress testing it, so that we know when Black Friday comes we can
handle it.

I don't like dogma, and TDD seems too dogmatic for me. I am very pro-testing,
having been a QA, a developer and an ops engineer. I want the freedom to
exercise my own expert judgement. The problem with dogma is that it makes
thinking take a back seat. Suddenly we have 80% code coverage enforced on a
page that loads a grid from a table, going through a three-layered
monstrosity of code.

~~~
rkangel
> I like the concept of error budgets. Start off by knowing what kind of
> quality and resiliency a system requires and design your test strategy
> around that. Means talking to the client about that.

If you have a client knowledgeable enough, then great. But most people who
don't have an engineering background think that the correct number of bugs is
'zero'. It's really hard for them to get their head round something being - to
some degree - buggy, and still being acceptable.

------
Izkata
> What is better — having a test or not having a test? The answer is obvious —
> any test is better than no tests at all.

Nooooooo....

For most people here, this is probably true - because you (and hopefully those
you work with) know how to write tests well.

Examples of badly written tests I've encountered that wasted everyone's time:

* Unexpected environment - such as, Django doesn't turn off the cache while tests are running

* Tautological tests - where the test just repeats what's in the code

* Peeking too far into the implementation - restricts refactors and can create lots of false negatives or false positives depending on what's being asserted

* Mocking out too much - tests that pass when they really shouldn't

* False assumptions/not thinking it through - why did this test start failing on New Year's Day?

* Flaky integration tests

In our case, about half of these could be updated once the issue was
apparent, but the rest were either scrapped or completely rewritten.
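
The tautological case is worth spelling out, because it looks like a real
test (a Python sketch; the function is just an example):

    def is_leap_year(year):
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def test_tautological():
        # Restates the implementation, so it can never catch a bug in it.
        year = 2100
        expected = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        assert is_leap_year(year) == expected

    def test_meaningful():
        # Pins down independently-known answers instead.
        assert is_leap_year(2000)      # divisible by 400: leap
        assert not is_leap_year(2100)  # divisible by 100 only: not leap
        assert not is_leap_year(2019)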

And then there's a number of issues with test coverage giving you a false
sense of security, with things like reaching 100% for a given piece of code,
but only thinking about the happy path.

~~~
gnahckire
I partially agree w/ you. One can always delete the test and then it's like
you have no tests at all.

> And then there's a number of issues with test coverage giving you a false
> sense of security, with things like reaching 100% for a given piece of code,
> but only thinking about the happy path.

Yup, and this is why one writes tests against well-defined
interfaces/boundaries.

------
rkangel
I did a load of work in defence (in the UK) and found it a very frustrating
experience, because the way it works, they weren't even "paying for a
solution to their problems".

You have to underbid the initial proof of concept phase to win it, and then
hack together something that vaguely meets the requirements in the limited
budget.

Then you've got the next 10 contracts over several years to develop that into
the actual product, one that meets all the requirements. The problem is that
nothing in this process encourages you to write good, clean code. If the code
was such a mess that it took 3 times as long as it should to add new
features, that meant we could charge 3 times as many hours.

------
pmontra
> Customers ask providers for a quality product

It's not even the "quality" thing. It's that tests help with delivering the
product.

Of course if they tell me, "I need this tomorrow" because of some emergency
there is hardly enough time to code for the happy path and manually check that
it looks OK. No tests. I make sure they understand it will be full of bugs and
we'll fix them later on. This happens rarely but it happens.

------
millstone
How do people here test their UIs? As an extreme example, the link here
contains hundreds of lines of declarative code; a single typo (unclosed
bracket, misspelled CSS property...) could break the site. So how is it
tested?

Probably it's not tested in any automated way, because manual testing (open
the site and look) is much easier. Maybe lots of software is like that, to
various degrees.

~~~
rkangel
I like the Flutter widget test approach (it's not unique, but it's well
executed there): [https://flutter.io/docs/testing#widget-testing](https://flutter.io/docs/testing#widget-testing)

In Flutter parlance, a whole view (page/screen) is a widget, and that's the
granularity I was testing at. It's good for testing the high-level patterns
that the screen is meant to adhere to, e.g. the Send button appears when
there is some text to send.

You can also have golden images for rendering of each screen. They change a
lot, but that at least forces developers to look at the changes each time and
decide if they're on purpose.
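
The golden pattern in miniature (sketched in Python rather than Dart; the
path and renderer are stand-ins): compare rendered output against a
checked-in golden file, with an explicit opt-in flag for regenerating it
after a reviewed change.

    import os
    from pathlib import Path

    GOLDEN = Path("goldens/send_button_screen.txt")  # hypothetical path

    def render_screen() -> bytes:
        # Stand-in for the real renderer; must be deterministic.
        return b"send-button: visible\n"

    def test_screen_matches_golden():
        actual = render_screen()
        if os.environ.get("UPDATE_GOLDENS"):
            GOLDEN.parent.mkdir(exist_ok=True)
            GOLDEN.write_bytes(actual)  # deliberate, reviewed update
        assert actual == GOLDEN.read_bytes()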

------
tpaschalis
Well, I hope he continues with the idea to "kick-off a series of articles
about testing".

Any suggested readings about testing, HN?

