HOW you solve their problems is entirely up to you, and some solutions will work better than others.
The only thing about customers guaranteed to be consistent is that they will request changes. Some will be good changes. Some will be bad changes. Some will be unavoidable changes. Your job here is to guide them through the best possible change process, with the ultimate goal of solving their problems.
Testing is a useful tool for (1) making sure the system matches the customer's expectations, and (2) making sure new changes don't break old things.
How you implement tests is entirely up to you, but once again, some solutions work better than others depending on the situation. Manual testing only scales up to a point before the cost of the manual testing is greater than the cost of implementing and maintaining automated tests. You need to become good at estimation here, but for 90% of projects, automated testing is the most efficient long term strategy.
WHERE you test is also important. Generally, it's best to test components at the edge of their interfaces. When you input X, Y should come out, ideally every time. This is where immutable data and idempotent APIs are VERY helpful for maintaining a reliable codebase. The more you need to cut into a component's innards to test it, the more you should be asking yourself why.
Testing is very much about architecture. Make clear boundaries between components and outside interfaces, use immutable data and idempotent APIs, and tests become a lot easier to write, and more resilient to change.
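As a minimal sketch of what edge-of-interface testing looks like in Python (the function and its names are hypothetical, just to illustrate "X in, Y out"):

```python
# Edge-of-interface testing: feed X in, assert Y comes out, every time.
# `apply_discount` is a hypothetical pure function over immutable inputs.

def apply_discount(price_cents: int, percent: int) -> int:
    """Return the discounted price in cents, rounding down."""
    return price_cents * (100 - percent) // 100

def test_apply_discount():
    # No poking at internals: same inputs, same outputs, every run.
    assert apply_discount(1000, 25) == 750
    assert apply_discount(1000, 0) == 1000
    assert apply_discount(999, 50) == 499   # integer floor, not rounding

test_apply_discount()
```

Because the function takes plain values and returns plain values, the test never needs to cut into the component's innards.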
I don't think this is a good way to think about it.
I see plenty of projects that don't go anywhere, and I also see plenty of projects that do go somewhere, but where large parts change so little and are so easy to test manually that there's little point in much automation. Other parts are complex and full of contradictory business rules so formal tests are necessary.
Of course, YMMV. If you're inside Google, things are probably looking very different. But most people aren't.
The problem is, regardless of how stable the code was, or how easy it was for the original author to test it manually, if the code hasn't been worked on for ages, then the original author is either gone or can't remember all the details of how it was supposed to work anymore. Which leaves you in a dangerous situation, because there's no good way of knowing which behaviors are by design and which ones are incidental.
I swear, if it's not "we don't have time to test" it's "we don't have time to document". Ok, but you've hired all these firefighters instead.
Similarly, tests can serve as a device for asserting contributions are acceptable pre/post-merge as well as evaluating that your dependencies work as expected.
I like to have tests that exercise the real stuff. The tests are flakier but have better fidelity to the actual use.
Portable code can be run on whatever filesystem the user chooses. We can only test with what we have. So this boils down to another way to say "works on my machine" and that may or may not be relevant depending on how well the OS and hardware we test on matches the customer's setup.
It's good to be able to run your tests against whatever filesystem you choose, though.
This has recently eaten up a nonzero amount of my time in maintaining our integration and acceptance test suites.
I'm not sure where I'm going with this, other than to observe that real-ish isn't necessarily a substitute for real. I suppose the "developer empathy" story is that maybe the more empathetic approach in the long run would have been to have developers work on the target platform rather than letting them pick whatever one they used to use at their old gig.
I think there's a direct relationship between _good_ tests and _good_ codebases. I have to say _good_ tests, because I've seen people attack incohesive code not by making it more cohesive, but by writing incohesive tests. ANY test is NOT automatically better than NO tests. Slow tests are awful. Flaky tests are awful. And tests that don't really test properly (or miss important edge cases) can give a false sense of safety. I've heard people say that if tests are so hard to get right, maybe they aren't the right tool. I've since come to the conclusion that no one said programming properly had to be easy.
What I've found is that, if you test for a long time, the value of tests as a design tool diminishes, because you gradually write code that's easier to test (and easier-to-test code is, for all intents and purposes, also easier to read, reason about, and maintain). But you also get better at testing and spend less time struggling to write tests (because the code is cleaner), so you gain less, but it costs you less too.
And then you still gain the other benefits. Protection against regression, correctness, documentation (especially of edge-cases). More efficient onboarding.
Also, there's this common fallacy of "speed, cost, quality: pick two". The reality has a lot more shades of gray, but if you were to generalize it, I'd go to the other extreme and say that you can't have low cost WITHOUT high quality. Cost and quality aren't opposing forces of each other; both are functions of the skill of your team.
At my current work we have loads of automated tests. Every time something is changed, there are pipelines that test the following:
- Simple unit tests: Start with state 0, perform an action, is state 1 what you expect? There are literally hundreds of these, and they have indeed found issues. Many edits that would have gone in have had to be rewritten to cater for corner cases found in the tests.
- Valgrind suite tests: Since we're dealing with C++, you get some interesting errors that might happen. Changing the order of some lines can look innocuous, but sometimes it causes an iterator invalidation. This kind of thing is hard to spot, but we have automated tests that will stop you merging code that does this. Things that are UB in C++ can be especially tricky.
- Integration tests: several pieces that each pass their own unit tests might not work together. Luckily, you can define a CI script that launches both and makes them talk to each other.
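The "state 0, action, state 1" pattern from the first bullet could be sketched like this in Python (the `Cart` component is made up for illustration):

```python
# The "state 0 -> action -> state 1" unit-test pattern, with a made-up Cart.

class Cart:
    def __init__(self):
        self.items = []          # state 0: empty cart

    def add(self, sku: str, qty: int = 1):
        if qty <= 0:             # the kind of corner case tests flush out
            raise ValueError("qty must be positive")
        self.items.append((sku, qty))

def test_add_item():
    cart = Cart()                          # state 0
    cart.add("SKU-1", 2)                   # action
    assert cart.items == [("SKU-1", 2)]    # is state 1 what you expect?

def test_add_rejects_zero_qty():
    cart = Cart()
    try:
        cart.add("SKU-1", 0)
        assert False, "expected ValueError"
    except ValueError:
        pass

test_add_item()
test_add_rejects_zero_qty()
```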
As you can imagine, it takes a lot of work to write the tests. One issue with a lot of teams is there simply isn't enough time. You have to show progress, which means showing things that seem to function externally. But you're always paying for that by having to fix things that you find as you go. Not having tests is a form of tech debt. It costs more and more as your code base grows.
I'm not arguing against tests, and having had the experience of testing, I'd use them when appropriate; I'm just not sure where the exact cutoff is. If I were working on a game engine for multiple teams, I'd be writing tests for sure. If I were working on a small 5-20 person team with custom tech, I'm not sure I'd start adding tests where I didn't before. Maybe if there was IAP or multiplayer online or some other server component and metrics. Or maybe if it was easier to get started and maintain.
There's a bunch of big differences between games and much other software. Often games are not maintained after shipping. Of course, that's less true today than it was in the past, with longer-term online games, but it's still true that many games are pretty much done the moment they ship. Another big difference is that the teams are 70-90% non-engineers producing tons and tons of data.
I mostly bring this up because software engineers often talk past each other, since not all software engineering is the same.
I do write tests, but only after the fact, when I know for sure a component will be reused in other components. I write them in cases where I know I will forget about certain edge cases after some time, so changing the code would most certainly introduce bugs.
When you have components that are reused over the course of several years (and obviously optimised when the requirement is there), the chance of regressions is greatly increased.
So I see that "writing tests for everything" is more of a political stance, rather than entirely practical.
Integration tests are necessary for the exact reason you mention: just because things work in isolation doesn't mean they will work together.
Ideally, for catching unknown bugs, I would also have a property-based generative test suite that is run overnight against lots of random scenarios. This isn't always possible (in fact, so far I've sadly only worked on one project where we did this), but I'd like it to become more common.
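A hand-rolled sketch of the idea, using only the Python standard library (a real project would more likely reach for a framework such as Hypothesis or QuickCheck):

```python
import random

# Property-based sketch: generate lots of random scenarios and check an
# invariant, instead of enumerating examples by hand. The property here
# (sorting preserves length, is idempotent, and orders its output) stands
# in for whatever domain rules your real system has.

def check_sort_properties(trials: int = 1000, seed: int = 0) -> None:
    rng = random.Random(seed)    # fixed seed so failures are reproducible
    for _ in range(trials):
        data = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        out = sorted(data)
        assert len(out) == len(data)                       # nothing lost
        assert sorted(out) == out                          # idempotent
        assert all(a <= b for a, b in zip(out, out[1:]))   # ordered

check_sort_properties()
```

The overnight variant is the same loop with far more trials and randomly drawn seeds, logging any seed that produces a failure so the scenario can be replayed.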
I forgot to mention this part. We have data recorded from external sources, which we run through our code. Millions and millions of state changes, along with their summary states. This also helps to find cases we didn't think of.
About test types: unit tests are worthless unless proven otherwise. Automated analysis is just great, both static (like types) and at run time (like Valgrind); do it as much as you can. Integration tests are the actual tests worth their name; everything else is just development tooling. For them, see the first paragraph.
I didn't have an off-the-cuff reply to this, but thinking about it afterwards the issue became clear to me: customers don't pay for code either! Customers pay for solutions to their problems.
I think the key phrase is "how do we know?": maybe those hundreds of thousands of lines of PHP were needed to solve the customers' problems, but how do we know? After all of that work, does it actually do what's needed? When the requirements inevitably change, how do we know when we've finished our patches?
We can never know for sure, but there are ways to gain confidence in what we've done. Automated testing is a really low-cost way to gain quite a lot of confidence, which is also relatively stable over time. In contrast, manual tests are either very expensive or woefully shallow; and re-running them in the future takes just as much effort each time. Static analysis, code read-throughs, formal verification, etc. can give us more confidence than tests, but at a much higher cost. Simplifying the codebase can also help (code is a liability, not an asset!), but again that can be expensive.
We should get the most bang for our buck: usually that means adding more tests. Occasionally, if that's not enough, we might sit down and prove something, or manually step through a printout, etc., but usually we could get more out of the same time by writing tests.
And that's why you don't test code. You test that your application complies with your customer's requirements.
I'm happy to see things changing about tests, with people realizing fine-grained unit tests are often a hindrance and that you should prioritize end-to-end testing. Test the interface of what you're selling, not the inner workings.
In any case, whilst I think it's good and healthy to debate the different forms of testing, their merits and tradeoffs, etc. I don't think it's appropriate when the alternative is not testing at all (e.g. "customers don't pay for tests").
I've worked at three organisations and managed to introduce automated testing to two of them. Even then, the test suites were "my responsibility", since (a) nobody else was running them and (b) I had to hand-hold people whenever I spotted a breakage (which they inevitably blamed on the test being wrong).
The heuristics I've come to follow are:
- Having automated tests is better than not having automated tests
- It's easier to improve a bad test suite than it is to get a test suite added/accepted
- There usually aren't enough tests
There are exceptions to these rules, but I don't believe them without evidence. For example, "the test suite is slow" won't convince me that there are too many tests; demonstrating systematic redundancy and fragility in the test suite might convince me (e.g. exposing privates in order to test them).
Only once people are on board with the idea of automated testing will I bother to get more opinionated about the specifics.
E2E tests can be slow and extremely fragile though. They have a place and purpose, but they should not be the only form. Why would you skip entirely over "inner workings" tests that could catch bugs sooner?
> and why it exists?
Because the xUnit consultant crowd put in a lot of work: it's easier to write unit tests (which fossilize your code), so there's a lot of tooling around them. And reliance on external services with no sandbox or ready-to-use mocks means E2E is harder to implement. But being harder just means you have to do your job.
When I buy your software I don't care about how you implemented some pattern. What I care about is that when I click on this gizmo in that situation, it does what it should, in some time frame, using some resources.
I agree that testing is a valuable tool for making good software, but I think the idea that all code categorically requires testing to be considered good is overzealous.
I think the most ardent TDD proponents underestimate the costs of testing. Tests are also code which can have bugs and has to be maintained, and having a well-developed test-suite can act as a type of inertia which makes it harder for an organization to make necessary architectural changes.
Don't get me wrong - I think testing is important, but it is just one tool which has to be balanced against a number of other factors.
The second case was where an in-house quotes engine was to be migrated to a SOAP service and some calculations needed to be ported. We didn't have access to the source so we created a small set of the most complex scenarios we could come up with and used those to generate calculator requests. I think we had 26 or 27 test cases and they each required non-trivial setup before the calculator could be invoked. Those cases exercised code which took the developer about 3 months to refine into a working solution.
So what does this reveal? I don't know. On the one hand, we had just under 60 unit tests which picked up 1 bug whilst, on the other, we had less than half that number of end to end tests which were sufficient to build a major piece of business functionality.
My gut feeling is that end to end testing is a better long term investment and unit testing is perfect for refactoring but inefficient for anything else.
That being said, there is a certain degree of intellectual rigidity when it comes to TDD. The point of a good test suite is to capture the behavior of the product from a user perspective, and not to duplicate implementation details, or test the same behavior at three different levels of the architecture.
Ideally, each path in the code should be exercised by one [and no more] test written from the perspective of the product, but not necessarily using the user-visible APIs. To exemplify test restraint, suppose one is building an IDE. Write one test that the IDE frontend surfaces errors from the compiler to the user, and write a test suite for the compiler to check it produces sensible errors, using code coverage tools to make sure all possible errors are accounted for. But don't test each and every compiler error through the IDE frontend APIs.
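A toy Python sketch of that division of labour (all names made up): exhaustive error cases are tested at the compiler boundary, while a single test checks that the frontend surfaces them at all.

```python
# Toy sketch: test error *production* at the compiler boundary, and error
# *surfacing* once at the frontend. All names here are hypothetical.

class CompileError(Exception):
    pass

def compile_source(src: str) -> str:
    """Toy 'compiler' that only accepts the source text 'ok'."""
    if src != "ok":
        raise CompileError(f"unexpected input: {src!r}")
    return "binary"

def frontend_error_banner(src: str) -> str:
    """Toy 'IDE frontend' that surfaces whatever the compiler reports."""
    try:
        compile_source(src)
        return ""
    except CompileError as err:
        return f"Error: {err}"

# Compiler suite: every error case belongs here (coverage tools keep us honest).
try:
    compile_source("bad")
    assert False, "expected CompileError"
except CompileError:
    pass

# Frontend suite: one test that errors are surfaced at all, not one per error.
assert frontend_error_banner("bad").startswith("Error:")
assert frontend_error_banner("ok") == ""
```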
Real-world example 1: The system being developed talks to external system X. We don't want tests to litter the production database, especially since it's about accounting and we would get into legal trouble for that. However, there is no possibility to open a test account on the production system, and no budget for a license for a test installation of X. The main trouble with X is that its public API (web service) changes from version to version and there is no useful documentation about it. How would one write integration tests for that?
Real-world example 2: How would you write tests for a system that has its requirements unspecified, even at a very coarse level, after the deadline where it is forced into production by management?
I'm pretty sure that both examples happen in other places than the ones I've seen them, too.
How would you manually test in a situation like that? If you can't interact with the production API because it would corrupt data, and there is no dev API, would you just be guessing that everything works before you deploy your code?
IMHO, You can't. We did in fact just hope it doesn't break in production.
But then, the article says just do it, so maybe there was a better solution we overlooked.
You're going to spend extra time on the problems you identified. You can either plan for it up front with tests, or be unprofessional and firefight later.
2. Don't. If you don't have fixed requirements your tests will have negative ROI.
2. This is good to know. I might have worked with vague requirements for too long, but knowing this may help identify the parts where testing is indeed possible.
It's been a while since I personally last read it, but, as I recall, it has an entire chapter devoted to each of your examples.
The second problem is that even a fake account doesn't belong on production, not in accounting. They are very strict about such things.
So now the balance in our "Software test" account is nonzero whereas it should be zero.
But audit requires us to record all bookings, so how do we explain these bogus bookings (both from the erroneous code that made the mistake in the first place, and from the manual adjustment later that fixes the mistake for the next test run)?
For most people here, this is probably true - because you (and hopefully those you work with) know how to write tests well.
Examples of badly written tests I've encountered that wasted everyone's time:
* Unexpected environment - such as, Django doesn't turn off the cache while tests are running
* Tautological tests - where the test just repeats what's in the code
* Peeking too far into the implementation - restricts refactors and can create lots of false negatives or false positives depending on what's being asserted
* Mocking out too much - tests that pass when they really shouldn't
* False assumptions/not thinking it through - why did this test start failing on New Year's Day?
* Flaky integration tests
In our case, about half of these could be updated once the issue was apparent, but the rest were either scrapped or completely rewritten.
And then there's a number of issues with test coverage giving you a false sense of security, with things like reaching 100% for a given piece of code, but only thinking about the happy path.
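A tiny Python illustration of that trap (the function is made up for the example): the happy-path test executes every line, so coverage reads 100%, yet the failure modes are never exercised.

```python
# 100% line coverage, happy path only: every line runs, nothing sad is tested.

def parse_qty(text: str) -> int:
    return int(text)   # raises ValueError on bad input -- a path no test hits

def test_parse_qty_happy_path():
    assert parse_qty("3") == 3   # executes the only line: coverage says 100%

test_parse_qty_happy_path()

# Nothing above asks what happens for "", "3.5", or "many" -- the coverage
# number is green while the unhappy paths remain completely unspecified.
```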
> And then there's a number of issues with test coverage giving you a false sense of security, with things like reaching 100% for a given piece of code, but only thinking about the happy path.
Yup, and this is why one writes tests against well-defined interfaces/boundaries.
I'm not going to invest a load of time in various types of automated test for an internal site with a form over a database that 2 users use for low priority work. The idea of 80%-100% code coverage for basic work like that seems like waste to me.
But for the critical path of the eCommerce shopping experience, I'm going to write all kinds of automated tests at multiple layers of the stack, right up to chaos/stress testing it, so that we know when Black Friday comes we can handle it.
I don't like dogma, and TDD seems too dogmatic for me. I am very pro testing, having been a QA, developer, and ops engineer. I want the freedom to exercise my own expert judgement. The problem with dogma is that it makes thinking take a back seat. Suddenly we have 80% code coverage enforced on a page that loads a grid from a table, going through a three-layered monstrosity of code.
If you have a client knowledgeable enough, then great. But most people who don't have an engineering background think that the correct number of bugs is 'zero'. It's really hard for them to get their head round something being - to some degree - buggy, and still being acceptable.
You have to underbid the initial proof of concept phase to win it, and then hack together something that vaguely meets the requirements in the limited budget.
Then you've got the next 10 contracts over several years to develop that into the actual product - meet all the requirements. The problem is that nothing in this process encourages you to write good, clean code. If the code was such a mess that it took 3 times as long as it should to add new features, that meant that we could charge 3 times as many hours.
It's not even the "quality" thing. It's that tests help deliver the product.
Of course if they tell me, "I need this tomorrow" because of some emergency there is hardly enough time to code for the happy path and manually check that it looks OK. No tests. I make sure they understand it will be full of bugs and we'll fix them later on. This happens rarely but it happens.
Probably it's not tested in any automated way, because manual testing (open the site and look) is much easier. Maybe lots of software is like that, to various degrees.
In Flutter parlance, a whole view (page/screen) is a widget, and that's the granularity I was testing at - it's good for testing the high-level patterns that the screen is meant to adhere to, e.g. the Send button appears when there is some text to send.
You can also have golden images for rendering of each screen. They change a lot, but that at least forces developers to look at the changes each time and decide if they're on purpose.
For other kinds of GUIs, you can also emulate mouse + keyboard interaction with tools like Sikuli (http://www.sikulix.com/). This is also scriptable, e.g. in Python, but actually quite painful in my experience:
When your GUI needs significant time for loading and rendering, the Sikuli program might fail to find a button that hasn't appeared yet. So you need to play with timeouts and trial and error.
Any suggested readings about testing, HN?