
Testing Microservices the sane way - alfiedotwtf
https://medium.com/@copyconstruct/testing-microservices-the-sane-way-9bb31d158c16
======
ekidd
_> The Vagrant repo itself was called something along the lines of “full-stack
in a box”, and the idea, as you might imagine, was that a simple `vagrant up`
should enable_ any _engineer in the company (even frontend and mobile
developers) to be able to spin up the stack in its_ entirety _on their
laptops._

This is honestly not that hard to get working, if you have Docker and good
tools. For simple cases, "docker-compose" will work well enough. For complex
cases, my employer open-sourced a tool for exactly this:
[http://cage.faraday.io/](http://cage.faraday.io/). This extends docker-compose
with the idea of multiple pods, staging/production environments, etc.
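
For the simple cases, the "full stack in a box" can be a single docker-compose
file. A minimal sketch (service names, images, and credentials here are
entirely hypothetical):

```yaml
# docker-compose.yml -- hypothetical three-service stack
version: "3"
services:
  frontend:
    build: ./frontend
    ports:
      - "8080:8080"
    depends_on:
      - api
  api:
    build: ./api
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:13
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
```

With something like this checked into the repo, `docker-compose up` plays the
role that `vagrant up` did in the quoted story.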

For testing, we have plenty of unit tests, and we've been experimenting with
pact ([https://github.com/pact-foundation/pact-js](https://github.com/pact-foundation/pact-js)),
which turns microservice consumer mocks into provider contracts.
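
pact-js is the real tool here; purely to illustrate the core idea in stdlib
Python (all names and shapes below are hypothetical, not Pact's actual file
format): the consumer records the interactions it expects from its mock, and
that recording becomes a contract the provider replays against itself.

```python
import json

# The consumer side records the interactions it expects from the provider
# while running against a mock; here we build the contract directly.
interactions = [
    {
        "description": "a request for user 42",
        "request": {"method": "GET", "path": "/users/42"},
        "response": {"status": 200, "body": {"id": 42, "name": "Alice"}},
    }
]

contract = {"consumer": "web-frontend", "provider": "user-service",
            "interactions": interactions}

# The consumer serializes the contract to a file...
pact_file = json.dumps(contract)

# ...and the provider replays each interaction against its own
# implementation, checking that the responses still match.
def verify(provider_impl, pact_json):
    pact = json.loads(pact_json)
    for i in pact["interactions"]:
        status, body = provider_impl(i["request"]["method"],
                                     i["request"]["path"])
        assert status == i["response"]["status"], i["description"]
        assert body == i["response"]["body"], i["description"]
    return True

# A toy provider implementation that satisfies the contract:
def provider(method, path):
    if method == "GET" and path == "/users/42":
        return 200, {"id": 42, "name": "Alice"}
    return 404, {}

print(verify(provider, pact_file))  # True
```

The point is that the provider can break the contract and find out in its own
test suite, without a shared end-to-end environment.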

But at the end of the day, it appears to be absolutely necessary to have end-
to-end tests that run in a staging environment whenever a service is updated.
Otherwise, something inevitably falls through the cracks. It's a nice idea to
think that you can develop each microservice in isolation, but _somebody_
ultimately has to watch site-wide quality.

~~~
bpicolo
It's easy at low scale. It breaks down hard once you're at dozens of services
or more.

~~~
evfanknitram
Why?

~~~
bpicolo
A variety of different factors. More moving parts == more stuff that can
break. Knowledge about a specific service becomes less widespread, so issues
take longer to resolve. Hardware scaling also becomes a problem. It really
sucks to make 50 people rebuild their dev environments because you bumped a
system dep in a container that your app now depends on. At some point, just
building and bringing up all those containers becomes really slow, and you're
going to have to build more and more systems to aid in that.

There's also all the stuff that's tricky to automate - say you're using AWS
Lambdas, step functions, SQS queues. Translating that across different
developer envs is a challenging problem.

I'm very much on the monolith-first approach. There are cases for
microservices for specific business functions when you have few engineers, but
in general it's far more of a scaling layer for engineer count. Single-app
workflows are typically much more productive compared to microservice
workflows up until a certain point.

------
falcolas
Perhaps it's just me, but if you can't run your stack locally for any reason
but space or metadata, you're too tightly coupled to your platform. This is
doubly true in this age of Docker. Of course, Amazon and Google and Azure love
tightly coupled applications, but it's not good for creating portable code.
And portable code is something that's good for your pocketbook as well as your
code quality.

I'll admit, I'm not a fan of testing in production, especially when you're
doing that testing against paying customers. I've just seen it go poorly too
often, with lost customers as the common result. When you're B2C, that's bad
because you now have to acquire more customers not to grow, but just to stay
even; a death-knell for VC-backed startups. When you're B2B, the loss of
customers signals a coming winter for the business. I've been through a couple
of layoffs caused entirely by lost customers.

Remember that your product is most likely a fungible asset - your service
easily replaceable by a competitor - and if you piss off your customer base by
letting blatant bugs out, you will lose customers. The sheen of newness has
worn off tech companies, and customers are not going to be as forgiving of
their time being wasted as they once were.

~~~
dastbe
A pretty pithy response would be that any time you're deploying new software
to your production environment, you're testing against real customers. While
you should have comprehensive pre-production testing, there's no good
substitute for making sure the thing going into Prod is working by running
tests when it is in Prod.

Her list of production tests generally falls into three categories:

* instrumentation of code for observability

* staged rollout of new software to observe issues on a subset of
traffic/hosts against the rest of Prod

* simulating failures in Prod to test against possible random failure
scenarios

Of those, the first should have zero impact on your customers, the second
should be making your customer experience better by reducing the blast radius
of a bad deployment, and the third should be making your service more
resilient to failures over time for your customers. Would you rather find out
that you can't survive an availability-zone outage when you can easily return
that AZ to healthy in a minute, or when you have to wait an hour for recovery?
Honestly, I think it's disrespectful to your customers to gamble on what might
happen in the future rather than asserting that your service can handle the
wide array of known failures that can happen in this world.

~~~
falcolas
If there's any question of how something is going to respond to production
load or production data, there's a missing step somewhere in the release
process.

WRT the three points you bring up:

1) Instrumentation is a part of any good deploy, regardless of where, when, or
why. Instrumentation _is not_ a replacement for testing.

2) Staged rollouts are a good thing, but they don't prove the new object being
rolled out is production ready. They can cordon off major failures, yes, but
see my comment above about missing steps when it comes to major failures.

3) Not all companies can afford the Netflix model of having multiple fully
redundant data centers at all times. I'd even hazard a guess that _most_
companies can't really afford to triple (or more) their infrastructure costs
in addition to the higher development and maintenance costs. Ultimately that's
one thing that testing is good at: ensuring your disaster recovery plan works
without honking off customers when you discover gaps in that plan.

None of these three strategies will replace pre-customer testing. At best full
adherence to the policies (in the absence of end-to-end testing) will only
limit the amount of damage major bugs can do. The question to me is: why are
we OK with just limiting the damage of otherwise findable bugs?

------
stephen
I'm going to have to re-read the post a few times, but my thoughts on this
topic are that cross-system automated tests are futile because, by definition
of being "not your system", you can't easily and deterministically control the
input state (of the combined your-system + not-your-system), so any tests you
build on top of this will be like building on quicksand:

[http://www.draconianoverlord.com/2017/08/23/futility-of-cross-system-integration-testing.html](http://www.draconianoverlord.com/2017/08/23/futility-of-cross-system-integration-testing.html)

My preferred solution, which I haven't had a chance to actually flesh out at
scale (so disclaimer/YMMV), is for all services to ship with stubs:

[http://www.draconianoverlord.com/2013/04/13/services-should-come-with-stubs.html](http://www.draconianoverlord.com/2013/04/13/services-should-come-with-stubs.html)

Where the stub is an in-memory version of the service that its authors
maintain (not the client), so you can achieve the proverbial "deploy all
systems on your local machine", but since they're stubs, they're extremely
quick to boot/reset/etc., and also with allowances (again since they are
stubs) to let you set the per-test input data of any stub your system talks
to.

I believe this would work best with homogeneous/noun/REST-based services, e.g.
all entities in your corporation have a strict/unified CRUD API, so then
"integration" tests (e.g. the proposed stub/no-actual-wire call tests) can
define their input data in terms of entities and be fairly oblivious about
which stubs/systems those entities actually live in.
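
A minimal sketch of the stub idea, assuming a hypothetical "user-service"
whose owning team ships the stub alongside the real client (all names here are
made up for illustration):

```python
# Hypothetical in-memory stub for a "user-service", shipped and
# maintained by the service's own team (not by its clients).
class UserServiceStub:
    def __init__(self):
        self._users = {}

    # Test-only affordance: seed per-test input data directly.
    def seed(self, users):
        self._users = {u["id"]: u for u in users}

    # Mirrors the real service's client API.
    def get_user(self, user_id):
        user = self._users.get(user_id)
        if user is None:
            raise KeyError(f"no such user: {user_id}")
        return user

# Client code under test, oblivious to stub vs. real service.
def greeting(user_service, user_id):
    return f"Hello, {user_service.get_user(user_id)['name']}!"

stub = UserServiceStub()
stub.seed([{"id": 1, "name": "Ada"}])
print(greeting(stub, 1))  # Hello, Ada!
```

Because the stub is in-memory, boot/reset per test is effectively free, and
because the provider team owns it, drift between stub and real service is
their bug to fix, not every client's.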

~~~
chrisweekly
Good thoughts, Stephen!

I was recently reading about Hypermedia HAL APIs[1] (TLDR: add metadata to API
responses, helps w/ discoverability etc) which could conceivably play a role
in solving this kind of problem.

[1] [https://sonalake.com/latest/hypermedia-apis/](https://sonalake.com/latest/hypermedia-apis/)
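
For a sense of what that metadata looks like, a HAL-style response embeds link
relations alongside the data, so a client can discover related resources
instead of hard-coding URLs (the resource shape below is hypothetical,
following HAL's `_links` convention):

```python
# A HAL-style resource: link metadata lives under "_links", keyed by
# relation name (resource shape is hypothetical).
order = {
    "_links": {
        "self":     {"href": "/orders/123"},
        "customer": {"href": "/customers/7"},
        "invoice":  {"href": "/orders/123/invoice"},
    },
    "total": 42.50,
    "status": "shipped",
}

# A generic client can follow links by relation name alone:
def href(resource, rel):
    return resource["_links"][rel]["href"]

print(href(order, "invoice"))  # /orders/123/invoice
```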

------
chatman
Discarding the "full stack in a box" idea _in general_, just because of past
experience with a poorly implemented Vagrant setup, seems naive. A good "full
stack in a box" implementation (docker-compose, swarm, kubernetes, etc.) can
be a useful tool in testing/developing microservices.

~~~
justincormack
Well, there is obviously a point at which it will no longer work (way before
Google scale). But the aim is to stop testing services in a coupled way: the
whole point of microservices is to make decoupling real, not to build a
distributed monolith. Testing in a decoupled way helps this enormously.

~~~
_ZeD_
Yeah, except it doesn't. In those microservice setups, more often than not,
the data change needed to accomplish a client request is 3 or more services of
"distance" away. Still, it needs to be accomplished. If you don't coordinate
all the services involved (all with different developer teams, maybe from
different contractors) AND you don't do an end-to-end test, how can you be
sure you've made the requested change?

~~~
lmm
> In those microservice setups, more often than not, the data change needed to
> accomplish a client request is 3 or more services of "distance" away. Still,
> it needs to be accomplished. If you don't coordinate all the services
> involved (all with different developer teams, maybe from different
> contractors) AND you don't do an end-to-end test, how can you be sure you've
> made the requested change?

You're doing it wrong. If these services are really so deeply entangled that
you can't change and test them one at a time, they shouldn't be independent
services. Merge them, or otherwise rethink your service boundaries.

------
shcallaway
My company is in the process of building out an integration test suite for our
microservices platform. A few pain points:

1. The tests themselves are housed in a separate repository, so you can't
update the tests alongside your service. This means every change to a service
has to be backwards compatible. Hello, multi-step rollouts.

2. Our environments must be highly configurable, so that every permutation of
versioned services can be integration-tested. This is forcing us to adopt an
unnecessarily complex container orchestrator.

3. Service owners are not excited about setting up and contributing to yet
another project. We end up with a lot of out-of-date integration tests. Lots
of noise, if you ask me.

I think a combination of contract testing (e.g. using Swagger, Pact) with
monitoring, canary deployments, and automatic rollbacks would be easier to
maintain and just as effective at catching bugs.
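
A minimal sketch of the canary-plus-automatic-rollback gate mentioned above,
assuming you can query error rates for both the canary and the stable fleet
(the thresholds and metrics source are my own hypothetical choices, not
anything from the article):

```python
# Canary gate: promote only if the canary's error rate is not
# meaningfully worse than the stable fleet's.
def canary_decision(stable_error_rate, canary_error_rate,
                    absolute_ceiling=0.05, relative_margin=2.0):
    # Hard ceiling: more than 5% errors is never acceptable.
    if canary_error_rate > absolute_ceiling:
        return "rollback"
    # Relative check: more than 2x the stable fleet's error rate.
    if stable_error_rate > 0 and \
            canary_error_rate > stable_error_rate * relative_margin:
        return "rollback"
    return "promote"

print(canary_decision(0.01, 0.012))  # promote
print(canary_decision(0.01, 0.08))   # rollback
```

In practice this would run on a schedule against real monitoring data, with
the "rollback" branch wired to the deployment tooling.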

~~~
bpicolo
Once you dive into SoA, you see how much sense a single source repository
starts to make. Developing features across <n> different repositories is
tedious, and a single repo gives ample benefits. Having easy-to-work-with
interface layers like protobuf in a shared repo makes tons of sense, for
example.

It's interesting to see how things like Golang may well have evolved out of
this problem. Calculating a Golang app's dependencies is as trivial as a grep
or two, so it's easy to know which tests to run when libraries change.

~~~
shcallaway
At the very least, this experience has made me question the conventional
wisdom that microservices = good and monolith = bad.

~~~
cle
That's conventional wisdom? Outside of loud, naive bloggers preaching that
microservices solve the world's problems, most engineers I've chatted with IRL
have been pretty skeptical of the overall utility of microservice
architectures because of the high complexity they add.

------
maxxxxx
The discussion about the difficulty of testing microservices makes me wonder
how programmers in 20 years will look at the currently modern systems being
built now. Let's say the system has been in use for 10 years, everybody has
moved on, and now you have to make extensions or bug fixes to a legacy
cloud/microservice/multi-language/multi-server/distributed/queuing system. At
first look, this makes updating an old COBOL system, where you have one big
codebase in one place, look easy in comparison.

~~~
nine_k
Won't the fact that you only need to replace some parts with well-defined
interfaces help?

In my company, we had a legacy PHP system that seemed to me excessively
layered and cut too thinly. In reality, that allowed us to completely replace
the system with Python and Java piecemeal, without ever stopping it.

~~~
maxxxxx
Now imagine in 10 years someone comes in and has to figure out how all these
services written with outdated libraries fit together and how to fix bugs.

~~~
nine_k
HTTP hasn't gotten outdated for quite some time now. Standard connection
points with extremely late binding are what make microservices somewhat easier
to update, as opposed to rebuilding a monolith.

Fitting together is an aspect that becomes _easier_ when you decompose your
app properly, microservices or not. When a component has clear
responsibilities, it's easy(-ier) to understand, modify, and replace. When a
component is reasonably small and isolated, it's easy(-ier) to analyze and
understand, even if it's written in an ancient language using dust-covered,
unmaintained libraries.

------
jimbokun
A lot of interesting content in this article, but I think it could benefit
from some editing. Maybe an article half the size, or two or three separate
articles breaking out the various topics covered.

~~~
coredog64
She's currently writing a book on the subject. I believe these blog posts are
teasers for the final product.

~~~
jimbokun
That would make sense! Seems like an early book draft.

------
friendly_chap
> Ultimately, every individual team is the expert given the specific context
> and needs.

Most companies don't have specific needs. They think they do, but 90%+ of the
companies out there are building either run-of-the-mill CRUD apps or something
barely more technically difficult.

~~~
friendly_chap
Does the downvoter care to explain?

I have spent a good chunk of my life explaining to companies why they don't
need any special hand-rolled platform - leaving hundreds of thousands of
dollars on the table.

I guess parroting buzzwords is always more popular in this industry.

~~~
Abekkus
Managers in an organization often have more incentive to make the problems
they work on look difficult, and therefore important, than to fix those
problems quickly, cheaply, and reliably, which can make them look less
valuable to leadership.

------
zmmmmm
Seems a little bit inflated - yes, docker-compose et al. are a problem if you
have dozens of services all using different databases. But that's a level of
complexity and scale that most applications/organisations won't reach. And if
you do reach it, then you can probably afford to invest in something more
sophisticated to solve the problem (service mocks, etc.).

For situations where the true overall complexity is manageable, I think that
making sure that the environment can be entirely replicated and bootstrapped
easily is actually very good discipline for keeping complexity under control.

------
akud
No mention of mocking out service dependencies, which seems like the obvious
way to go. Does anyone do this?

~~~
Clubber
Absolutely. We have several front-end layers, then nested service layers, and
then nested repo layers; all get mocked depending on the level of the object
we are testing. We then have separate integration tests that test the repo
against an actual database. We design the integration tests so they are
largely data-independent (we load the data we test rather than just mocking
data).
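
A minimal sketch of mocking one layer beneath another, using Python's stdlib
`unittest.mock` (the repo/service names are hypothetical, just to show the
layering):

```python
from unittest import mock

# Hypothetical layering: a service depends on a repo. When unit-testing
# the service, the repo layer is mocked out.
class UserRepo:
    def find(self, user_id):
        raise NotImplementedError("hits the real database")

class UserService:
    def __init__(self, repo):
        self.repo = repo

    def display_name(self, user_id):
        user = self.repo.find(user_id)
        return user["name"].title()

# Unit test of the service with the repo mocked:
repo = mock.Mock(spec=UserRepo)
repo.find.return_value = {"id": 1, "name": "ada lovelace"}
service = UserService(repo)

assert service.display_name(1) == "Ada Lovelace"
repo.find.assert_called_once_with(1)
```

The separate integration tests would then exercise a real `UserRepo` against
an actual database, as described above.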

------
sytse
Is the conclusion to monitor incremental rollouts of microservice updates?

------
pbreit
Article makes me want to scream “just fire up rails/Django and build the dang
app already!!!”

~~~
coldtea
So the article makes you want to dismiss totally different stacks for totally
different needs, and to assume they only used multiple microservices because
they are idiots?

------
hendry
tl;dr, though I thought I'd mention that Postman's monitors are saving the day
for me. You can go further and script white-box tests in its little embedded
JS language.

~~~
shcallaway
Postman can be handy for quick and dirty HTTP tests. Unfortunately, I find the
Postman GUI unintuitive. The alternative -- working with minified JSON -- is
equally painful.

Edit: English

------
matthewtovbin
TL;DR

