Does anybody else get nervous over architectural patterns that exist only/mainly to make unit testing easier?
My preference for these types of situations is to try to encapsulate as much logic as I can into pure functions. In addition to being easy to test, they also have architectural benefits that go beyond testing.
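A minimal sketch of what I mean, with made-up names: the interesting calculation is a pure function, and the I/O lives in a thin shell around it.

```typescript
// Hypothetical example: the discount logic is pure and trivially testable;
// loading the order and persisting the result stay in a thin outer layer.
interface Order {
  subtotal: number;
  couponCode?: string;
}

// Pure: same input always yields the same output; no I/O, no hidden state.
export function applyDiscount(order: Order): number {
  const rate = order.couponCode === "SAVE10" ? 0.1 : 0;
  return order.subtotal * (1 - rate);
}

// Impure shell: the awkward-to-test parts live here, and stay small.
export async function checkout(
  orderId: string,
  db: { load(id: string): Promise<Order>; saveTotal(id: string, total: number): Promise<void> },
): Promise<void> {
  const order = await db.load(orderId);
  const total = applyDiscount(order); // the interesting logic, tested directly
  await db.saveTotal(orderId, total);
}
```

Testing the interesting part is then a one-liner: `assert.equal(applyDiscount({ subtotal: 100, couponCode: "SAVE10" }), 90)`.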
When I was a young nerd, I was a rabid fan of logic, objectivity, science. Anything which smacked of spirituality (or, worse, religion) was immediately dismissed as useless sentimental nonsense. I recognized that some people might _enjoy_ thinking about these fripperies, but I was convinced that any phenomenon which claimed an effect but did not explain the science-based interactions that mediated it (think astrology or religion) was hogwash.
One day I was deathly bored on a school trip, and the only entertainment available was a book on Feng Shui. During the introduction, it advised the practice of imagining a dragon entering a living space through a door, coiling its way through and around the space, and then exiting. The following sentence stated that this was often a good heuristic for helping the human mind judge a space for sight lines, common travel paths, and avoiding clutter.
My mind was blown. The dragon wasn't the point - the dragon was a way to hack the human brain (itself a messy, complex, poorly-understood mechanism) into a better outcome than it would get from a straightforward series of explicable evaluations. If the process demonstrably leads to better outcomes - if it is "useful" - then it can still provide value even if you can't explain _why_ it works.[0]
All of which is a roundabout way of getting to my point, which is: no, I don't get nervous about architectural patterns which _appear_ to only exist to make unit testing easier, because overwhelmingly I have found that "is this architecture easy to test?" is a good heuristic for "does this architecture have well-divided areas and layers of responsibility?". No matter your opinion on the value of tests _themselves_ (and there are very cogent arguments that an over-reliance on unit testing alone can lead to brittleness and false confidence), ceteris paribus making your architecture "more testable" will usually be a good thing in itself, even before any tests are written.
[0] this attitude needs some nuance, of course, when it is actually important to explain how and why the process reaches conclusions. See: ML/LLMs/AGI/AI/whatever-the-current-initialism-is, and recourse in the face of disputed judgements or unforeseen effects.
Testing is at its best when you are testing functions or small subsystems that can be created and destroyed independently and be initialized with a known state.
For instance, the kind of things you write in an introductory programming class (somehow it goes over the heads of most students that the teachers are using automated testing). It's certifiably insane not to test parsing code, whether it is "functional" (text goes in, a data structure comes out) or not (a function gets called to consume text, and callbacks get called).
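For a concrete (made-up) example of the kind of test I mean:

```typescript
import assert from "node:assert/strict";

// A tiny "functional" parser: text in, data structure out. No I/O involved,
// so the test needs no setup beyond a string literal.
function parseKeyValue(line: string): { key: string; value: string } {
  const idx = line.indexOf("=");
  if (idx < 0) throw new Error(`not a key=value line: ${line}`);
  return { key: line.slice(0, idx).trim(), value: line.slice(idx + 1).trim() };
}

assert.deepEqual(parseKeyValue("timeout = 30"), { key: "timeout", value: "30" });
assert.throws(() => parseKeyValue("garbage"));
```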
UI code is tricky because usually the framework is calling into your code, and your code gets called in a shambolic way. Of course that complicates testing, but it's problematic anyway. I have been looking at testing a React application again and boy is it crazy: you have to deal with asynchronous updates and don't really know how many times the app will update itself before you can be sure it has "settled down" and it is OK to check the "output". Testing is bad, with tests that should take 0.04 sec taking more like 4 sec, but the user is being subjected to the same thing: lots of waiting, layout shifts, flashes of incorrect information, etc. And don't get me started on the typical "uncontrolled" component, which is completely out of control because some application state updates feed the state managed by React while others call freaking jQuery (talk about spooky action at a distance!)
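For concreteness, the dance I mean looks roughly like this (the component is made up); you end up waiting on whatever the user would eventually see and hoping that counts as "settled":

```tsx
// Rough sketch using @testing-library/react; <OrderSummary> is hypothetical.
// Rather than guessing how many re-renders happen, wait for the end state
// the user would see: findByText polls until it appears or times out.
import { render, screen } from "@testing-library/react";
import OrderSummary from "./OrderSummary";

test("shows the total once the component has settled", async () => {
  render(<OrderSummary orderId="42" />);
  // Rejects (and fails the test) if the text never shows up.
  await screen.findByText(/total: \$90\.00/i);
});
```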
> architectural patterns that exist only/mainly to make unit testing easier?
> try to encapsulate as much logic as I can into pure functions
These are one and the same. The suggestion may as well have been "the IO monad is hard to test, so move as much code as you can out into pure functions"
From my own experience: Occasionally it feels awkward, but more often than not, the same patterns that make testing easier also lead to an architecture that is easier for me to understand and reason about, because 'having a minimal set of well-defined functions' happens to be good for both, and also leads to better composability, etc etc.
And even in cases where they are not the same, I find that the confidence that automated tests give me is worth the little bit of gymnastics[1] I have to do to make the things testable.
YMMV. :-P
[1] And maybe I should note that I stay clear of all the fancy gizmos that 'modern' testing frameworks provide. If you need to mess around with a mocking framework, that means you did it wrong. It's hard to improve on `assertEquals( expectedOutput, doTheThing(input) )`.
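In practice that style scales to a table of input/expected pairs and nothing else (`doTheThing` here is a hypothetical pure function under test):

```typescript
import assert from "node:assert/strict";
import { doTheThing } from "./doTheThing"; // hypothetical pure function under test

// The whole test is just input/expected pairs: no fixtures, no mocks.
const cases: Array<[input: string, expected: string]> = [
  ["", ""],
  ["hello", "HELLO"],
  ["MiXeD", "MIXED"],
];

for (const [input, expected] of cases) {
  assert.equal(doTheThing(input), expected);
}
```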
> If you need to mess around with a mocking framework, that means you did it wrong
This is a debate that's been going around my professional circle for a while, and I'd love to hear a bit more about your take on it. My position is fairly opposed to this - that mocks are a valuable tool that lets you test situations that are not reachable based purely on input (if you're not mocking, "how does my code behave in the presence of error responses from a dependency?" can only be tested by hard-coding the dependency with "when you encounter this particular input, short-circuit your logic and return an error"), and that lets you test otherwise-slow logic quickly.
I can see, and agree with, the argument that tests which rely on mocking have their confidence bounded-above by your confidence in the correctness of your mocking (that is - if you test against incorrect expectations, your tests will pass for code that will fail in production. GIGO), but to my mind that doesn't mean that tests based on mocks are inherently bad - simply that their restrictions must be acknowledged. They have strengths and weaknesses, and are most valuable as part of a suite of testing strategies. Mocks themselves are not inherently bad, even though an over-reliance on them is.
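To make the "error responses from a dependency" case concrete, here's roughly the kind of test I mean (Jest-style; the gateway and the function under test are made-up names):

```typescript
// Rough sketch, Jest-style; PaymentGateway and createInvoice are hypothetical.
// The mock forces the dependency into a failure mode that is hard to reach
// with real inputs, so we can pin down how our code reacts to it.
interface PaymentGateway {
  charge(amountCents: number): Promise<{ ok: boolean }>;
}

async function createInvoice(gateway: PaymentGateway, amountCents: number): Promise<string> {
  try {
    const result = await gateway.charge(amountCents);
    return result.ok ? "paid" : "declined";
  } catch {
    return "retry-later"; // the behaviour we want to assert on
  }
}

test("falls back to retry-later when the gateway errors", async () => {
  const failingGateway: PaymentGateway = {
    charge: jest.fn().mockRejectedValue(new Error("gateway timeout")),
  };
  await expect(createInvoice(failingGateway, 500)).resolves.toBe("retry-later");
});
```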
...that’s the three lines of code that I’m spending my “not 100% code coverage” on. Each sub-function is 1000% testable without mocks, and the combination of functions is a “secure, logic-less composition”, with little test value.
Basically, if you write your code that way you _never_ need mocks, and generally “mocks are dumb” because saying “Mock( DoRequest( FakeRequest() ), FakeResponse() )” is just wishful thinking compared to a properly factored codebase with _real_ non-mocked, significant, functional-only components.
Hmm - but hang on, though. Doesn't that just bring the problem up a level? This setup assumes that you have "shallow" logic: that `Process` will not call any other methods, and especially that it won't call any dependency services. While it's possible to unit test `DoFoo` and `Process` independently, you've now given up the ability to unit test `foo` itself - more specifically, you've given up the ability to determine the behaviour of `foo` in the presence of arbitrary behaviour from code which it calls (since specifying `DoFoo`'s behaviour would be mocking).
I didn't initially follow what you meant by "that’s the three lines of code that I’m spending my “not 100% code coverage” on", but in the context of this realization, I'm guessing you're saying something like "when my code has to call external dependencies, I wrap the call in this ~3-line pattern, which allows me to directly test the behaviour of each of the components of the call, and I'm comfortable not testing their integration together because it's so straightforward"? Or are you saying that this Command Pattern should be applied _every_ time one calls another function? If so - I guess that works, but I'm back to wondering what that gains? After all, both mocking and this Command Pattern serve to create tests which assert "when my code encounters situation X, it responds in fashion Y" - the difference is just on the "direction of encountering" (mocking controls "when my code calls other code and receives X", Command Pattern controls "when my code is passed Command X"). Don't they both have the same strengths (ability to induce arbitrary situations for the code-under-test to deal with) and weaknesses (if the developer's beliefs about the situations in which X arises are wrong, the tests are asserting on incorrect behaviour; if the business case changes, especially if the type of X changes, then the tests need to be rewritten to match the new situation)?
Oh interesting. So in a sense you are passing the "dependency's response" as a parameter _to_ `Process`, and thus removing the need to mock a dependency in order to induce an erroring dependency-response - you can instead _create_ a payload that resembles an erroring dependency-response, and then directly test "when `Process` is called with `BadPayload`, it should respond as follows". Neat - thanks!
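So, if I'm reading it right, the shape is roughly this (all names made up; `processPayload` stands in for `Process`): the fetch is a thin, logic-less wrapper, and the payload is passed in as an argument, so a bad payload can simply be constructed in the test.

```typescript
// Sketch of the "pass the response in" shape; every name here is hypothetical.
type FooPayload =
  | { status: "ok"; items: string[] }
  | { status: "error"; message: string };

// Pure: decides what to do with a payload, regardless of what produced it.
export function processPayload(payload: FooPayload): string[] {
  if (payload.status === "error") return []; // the behaviour under test
  return payload.items.map((item) => item.toUpperCase());
}

// Thin, logic-less composition: the only part that touches the network.
export async function foo(url: string): Promise<string[]> {
  const payload = (await (await fetch(url)).json()) as FooPayload;
  return processPayload(payload);
}

// In a test, no mock is needed: just construct the bad payload directly.
// assert.deepEqual(processPayload({ status: "error", message: "boom" }), []);
```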
> This presentation shows the fundamentals of the "decoupling from Rails"
I 100% understand why someone would want to do that, though, for unit testing or otherwise. Rails (and every framework inspired by it) seems almost intentionally designed to force you into coding a big ball of mud where everything is held together with magic framework duct tape. I guess some people's brains are wired to enjoy working in big mud balls, but for those of us who don't, adding a decouple-from-Rails-or-Laravel-or-Spring-Boot-or-what-have-you layer is the only hope we have of continuing to receive a paycheck while keeping our sanity intact.
Depends on the definition of easier. If easier is "it won't work with my particular flavour of framework or mocking" then yes, it's a bad idea.
If it's inherently untestable for any setup, that's already a smell and needs some sort of attention. If the smell is accompanied by potential to cause catastrophic error or such, I'd rather lean in to changing the code to be more testable even if it feels like an anti-pattern.
Where I hit this all the time is when people have written various flavours of queues and item processors all wrapped up into one, where a timer or similar mechanism will trigger one or more items to be processed. For example, a debouncing queue where you can add items to it, and it will pause a bit to allow other items to accumulate in the queue so they can be processed in a single batch.
To test the item processing well, you need lots of different tests covering all the branches in the logic.
But to trigger the processing you need carefully timed code full of sleeps.
Combine the two and you have lots of painfully slow tests.
Move the logic out so you can test that nice and fast, then have a handful of slow tests checking the queue mechanism works.
Contrary to some of the other comments here, I don’t think that’s designing your code specifically for your test framework. The logic of how you coordinate processing is a separate thing to the logic of how you process it. So splitting those concerns apart is good design regardless.
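Roughly, the split I mean looks like this (names invented): the batch logic is a plain function you can hammer with fast tests, and the timer-driven queue only needs a handful of slower tests proving it eventually hands over a batch.

```typescript
// Pure batch logic: fast to test, no timers involved.
export function summarizeBatch(items: number[]): { count: number; total: number } {
  return { count: items.length, total: items.reduce((a, b) => a + b, 0) };
}

// Coordination: the debouncing queue itself. Only the "items added within the
// window end up in one batch" behaviour needs the slow, timing-sensitive tests.
export class DebouncedQueue<T> {
  private pending: T[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined;

  constructor(
    private readonly windowMs: number,
    private readonly onBatch: (items: T[]) => void,
  ) {}

  add(item: T): void {
    this.pending.push(item);
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      const batch = this.pending;
      this.pending = [];
      this.timer = undefined;
      this.onBatch(batch);
    }, this.windowMs);
  }
}
```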
I’ve had the privilege of refactoring age-old code, and it takes more than two hands to count the number of times I’ve named a class “TestableMegaFoo”, into which the little testable bits of “MegaFoo” have been judiciously migrated.
There’s another comment in here talking about feng-shui and dragons, I’ll recount the story of “peace pipes” and “killing your enemies”.
It’s tough to find on the modern internet, but I remember reading about Native American traditions of gathering around a peace pipe and applauding your enemies, effectively looking for the good in them.
The story goes that if you do this enough, your enemy “disappears” and you see them as friends.
So it can be with code. Class MegaFoo simply becomes calls to “TestableMegaFoo”, plus a few of the usual side-effect-inducing “less testable” components. Your enemy disappears in a puff of smoke.
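In code, the end state tends to look something like this sketch (names deliberately silly, to match the story):

```typescript
// The extracted, side-effect-free core: easy to exercise in isolation.
export class TestableMegaFoo {
  computeReport(rows: number[]): string {
    const total = rows.reduce((a, b) => a + b, 0);
    return `rows=${rows.length} total=${total}`;
  }
}

// The old class shrinks to glue: side effects plus calls into the core.
export class MegaFoo {
  private readonly core = new TestableMegaFoo();

  constructor(
    private readonly db: { fetchRows(): number[] },
    private readonly log: (line: string) => void,
  ) {}

  run(): void {
    const rows = this.db.fetchRows();        // side effect: I/O
    this.log(this.core.computeReport(rows)); // side effect: logging
  }
}
```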
So structure your code for the testing methodology/mechanism?
Sorry, but that's BS. TDD works to some extent, when it's used as 'specification, then code', but solving the problem is the main thing, not testing that it solves the problem.
The fundamental problem here is "imperative GUI" incentivizing tight coupling between business logic and presentation.
Whereas "declarative GUI" like the browser DOM can be understood & interacted by automated tools without instrumentation of the business logic.
Adding more tight coupling, by coupling your code architecture to the limitations of your testing framework, seems like going in the wrong direction in most cases.
Some programs like games need imperative GUI controls to minimize latency, but trying to automatically test everything in an AAA game is a losing battle. You need QA eyeballs, period (or sophisticated visual AI agents).
I dunno. Sometimes it seems that 90% of the complexity of making web apps is figuring out details, corner cases, and spooky action at a distance with CSS. For crying out loud, the CSS spec comprises 50 documents.
It's one thing to test that element A has the right classes on it, it's another thing to test that a certain CSS selector that is sensitive to the context works correctly, that precedence rules work as expected, or ultimately the document "looks right".
CSS is (too) powerful but not inherently complex. It is often abused.
Ultimately you will need real eyeballs to judge aesthetics. No argument there. But significant QA work can be automated with frontend testing frameworks.
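As a sketch of the kind of automation I mean (Playwright here; the page and selector are made up):

```typescript
// Sketch using @playwright/test; the URL, selector, and colour are hypothetical.
import { test, expect } from "@playwright/test";

test("warning banner picks up the themed colour", async ({ page }) => {
  await page.goto("https://example.com/settings");
  // Asserts on the *computed* style, so selector specificity and precedence
  // rules are exercised, not just "the class is present on the element".
  await expect(page.locator(".banner--warning")).toHaveCSS(
    "background-color",
    "rgb(255, 243, 205)",
  );
  // Pixel comparison against a stored baseline for "does it look right".
  await expect(page).toHaveScreenshot("settings.png");
});
```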
I've never been that impressed with the functional coverage you get from front-end-driven testing frameworks, at least those that drive views. Tests that interface with models in the front-end, though -- priceless!
And CSS is inherently complex, because the rules of precedence are fragile and easily broken.
Or unsophisticated agents. A strategy I've used for testing highly visual code: run the test procedure; take a snapshot of graphic output; generate a hash signature of the output (e.g. CRC32); if the signature has not changed, the test passes; if the signature has changed, queue the test to a human for verification; if the human accepts the output, save the new hash signature for the next automated test.
The TL/DR: if the output hasn't changed the test passes; if it has, get a human to approve the change. Developers can (and probably should) perform most of the manual verification.
It wouldn't completely eliminate test teams for AAA games; but it could substantially reduce the load placed on test teams, allowing them to focus on what test teams do best.
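For what it's worth, a bare-bones sketch of that loop (I've used a SHA-256 digest here rather than CRC32, and `renderScene` plus the baseline paths are made up):

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Hypothetical: whatever produces the raw image bytes for a given test.
declare function renderScene(testName: string): Buffer;

type Result = "pass" | "needs-human-review";

export function checkVisualOutput(testName: string): Result {
  const output = renderScene(testName);
  const signature = createHash("sha256").update(output).digest("hex");
  const baselinePath = `baselines/${testName}.sha256`;

  // Unchanged output: automatic pass.
  if (existsSync(baselinePath) && readFileSync(baselinePath, "utf8") === signature) {
    return "pass";
  }
  // Changed (or brand-new) output: park the new signature and queue the test
  // for a human; on approval, the pending file replaces the baseline.
  writeFileSync(`${baselinePath}.pending`, signature);
  return "needs-human-review";
}
```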