However, that would be the best case, in which the mock data is sufficient for all N tests. That is not likely to be true, but some number M of mocks can probably still cover all N tests, and since M < N in most cases there are still savings in the number of service calls. In the worst case, N tests will need N mocks and it will be the same as before.
If they actually do this across their code they will never have to run "real service calls", because those services are also tested the same way. It's mocking all the way down.
An issue I see with this is that there's a potential window in which tests pass with bad data. For example, say the tests use a mock and the mock is periodically verified with a service call. The mock's data could go bad, but it won't be marked as bad until the next verification call. Until that happens, the tests using the mock will all pass. It's not clear from the article how they address this.
That's how I would do it, at least.
If you have a task-oriented build system with dependencies (like Google does with Blaze), it'd fit right in.
Though depending on how you deploy, you may have old and new services running at different versions. Assuming a level of backwards/forwards compatibility may be reasonable.
You record your expectations against the service mock, and then later verify that those expectations hold against the real service. This decoupling allows you to run your "end-to-end" test suite very often, say on every check-in, and then only verify the contract every hour or day, depending on how stable your test environments are. The theory is that this gives quicker feedback, reduces flakiness, and avoids a combinatorial explosion in the number of tests that need to run.
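A minimal sketch of the idea, assuming a hypothetical JSON expectations file and helper names (none of this is from the article): the fast suite answers from recorded expectations on every check-in, while a separate scheduled job replays the same expectations against the real service.

```python
# Sketch only: decoupled contract verification, not the article's actual framework.
import json
import urllib.request

EXPECTATIONS = "user_service_expectations.jsonl"  # hypothetical expectations file

def record_expectation(name, path, expected_response):
    """Record an expectation that the fast, mock-backed tests rely on."""
    with open(EXPECTATIONS, "a") as f:
        f.write(json.dumps({"name": name, "path": path,
                            "response": expected_response}) + "\n")

def fake_service(path):
    """In-process fake used on every check-in: answers only from recorded data."""
    with open(EXPECTATIONS) as f:
        for line in f:
            exp = json.loads(line)
            if exp["path"] == path:
                return exp["response"]
    raise AssertionError("no recorded expectation for %r" % path)

def verify_contract(base_url):
    """Slow job (hourly/daily): replay every expectation against the real service."""
    with open(EXPECTATIONS) as f:
        for line in f:
            exp = json.loads(line)
            with urllib.request.urlopen(base_url + exp["path"]) as resp:
                real = json.loads(resp.read())
            assert real == exp["response"], "contract drifted for %s" % exp["name"]
```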
Mocks are usually (inevitably?) a per-test structure, which makes them outrageously fragile and makes it tempting to over-mock just to get a passing test.
I prefer building something that acts like the mocked thing (i.e. a database) but does so on both ends: it looks like a database to the web service and like a web service to the database. Then the stub is my contract.
Well, it's still better, because you only have to run the services you are actually testing, right?
This very much neglects the probably significant overhead of writing, maintaining, and understanding the M mocks on top of the base of N tests, but since, as you said, this is "at Google scale", the trade-off seemingly becomes worthwhile.
Writing tests is not something you knock up during coding; it needs infrastructure: a test suite, some test data, a Jenkins server, and then, whoops, we need better test data, and ...
This touches that feeling - I guess I need to write it down a few more times ...
I think the next successful framework will focus on the entire lifecycle and aim to solve the scaffolding problem.
That would mean a lot more infrastructure (sorry if that wasn't what you meant), but infrastructure that's automatically provided or accepted by everyone involved.
The goal is to model end-to-end tests as state machines and define conditions that trigger transitions. There should be a code-based interface allowing arbitrary transitions using any Go library, and a higher-level interface offering a handful of operations so that non-technical testers can model end-to-end tests.
I wonder if you could record a user action and add variance to the recording as a test? Break a recording into ~5 actions and then perform those actions at various speeds and points of repetition, something like the sketch after the example below.
A recording of
1) Move mouse from A to B
2) Click button
Produces a range of tests including
2) Click button 3 times
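A rough sketch of what that variance generation might look like (the `Action` type and `variants` helper are made up for illustration):

```python
# Sketch: expand a recorded sequence of UI actions into speed/repetition variants.
import itertools
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Action:
    name: str        # e.g. "move_mouse_A_to_B" or "click_button"
    delay_ms: int    # pause before performing the action
    repeat: int = 1  # how many times to perform it

recording = [Action("move_mouse_A_to_B", 100), Action("click_button", 50)]

def variants(actions, speeds=(0.5, 1.0, 2.0), repeats=(1, 3)):
    """Yield copies of the recording replayed at different speeds and repetition counts."""
    for speed, n in itertools.product(speeds, repeats):
        yield [replace(a, delay_ms=int(a.delay_ms / speed), repeat=n) for a in actions]

for variant in variants(recording):
    print(variant)  # each variant would be replayed as its own test case
```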
The declarative language is readable YAML, which translates each named step into a Python function call with arguments (e.g. - click: submit button).
The asynchronous part is 'under the hood': it runs services that log lines of JSON. Those logs are watched using epoll triggers and parsed into objects that can be verified by one of the steps (e.g. check that an email containing the user's name is sent).
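Roughly, the YAML-to-Python dispatch could look like the sketch below; the step names and the `STEPS` registry are invented for illustration and aren't the actual engine.

```python
# Sketch: translate named YAML steps into Python function calls (invented step registry).
import yaml  # PyYAML

def click(target):
    print("clicking", target)

def check_email_sent(containing):
    print("verifying an email containing", repr(containing), "was sent")

STEPS = {"click": click, "check email sent": check_email_sent}

story = yaml.safe_load("""
- click: submit button
- check email sent: name of user
""")

for step in story:
    (name, arg), = step.items()  # each step is a single-key mapping
    STEPS[name](arg)             # dispatch to the matching Python function
```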
I wouldn't say that it's necessarily possible or desirable for a non-programmer to write the stories using YAML, however. Even with a perfect declarative language, you need a mindset of precise thinking in order to write executable stories, and that usually means programming ability. Stories also need refactoring as much as code does, alongside the engine code.
However, the YAML ought to be useful as a way for programmers to collaboratively write and refine stories with product owners/managers, and it should serve as a useful way to communicate changes to the product back to non-programmers.
A stubbing or mocking based approach can only go so far, despite being less "brittle" than end-to-end tests. That is why our team built PANIC, a distributed testing framework for the every-programmer ( https://github.com/gundb/panic-server ). Everybody should be able to do tests like this, not just the Googles.
In that case it's worth dealing with it, as it is not how it should be.
All things being equal,
A depends on B depends on C
is more brittle than
A depends on B
So if you have
Tests depends on A depends on B
It's the same as if you added a layer of complexity :)
Do you really do conflict resolution via alphabetic sorting?
Also, you can have an append-only log where you can run your own conflict resolution algorithm or let manual/human resolution occur on top of it. So all in all it provides the most flexible foundation to build other things from, while still guaranteeing strong eventual consistency.
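As a rough illustration (not GUN's actual API), custom resolution on top of an append-only log might look like this, with last-write-wins as the default strategy:

```python
# Sketch: append-only log of updates with a pluggable conflict-resolution strategy.
log = []  # each entry: (timestamp, node_id, key, value)

def append(timestamp, node_id, key, value):
    log.append((timestamp, node_id, key, value))

def resolve(key, strategy=max):
    """Pick a winning value for `key` from all logged updates.
    The default `max` is last-write-wins, with ties broken by node id."""
    updates = [entry for entry in log if entry[2] == key]
    return strategy(updates)[3] if updates else None

append(1, "node-b", "color", "red")
append(2, "node-a", "color", "blue")
print(resolve("color"))  # -> "blue": the later write wins, but the "red" entry is kept
```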
Testing all edge cases is important to us, which is why we built PANIC to verify our algorithms. Hopefully we'll be posting more results soon!
1. There is none, and it's awful, once there are more than two developers.
2. There are dedicated QA people, who click on things in a more or less formalized fashion depending on the company, and sign off on releases. This is "pretty bad" at most places once you have a product that can't be easily verified by a couple of people in a few hours.
3. You have dedicated Test Engineers, who are just software developers who write test automation. This can be good, but usually ends up with mid-low quality engineers filling this role for a variety of reasons.
4. Test Engineers are just Software Engineers who specialize in test-automation work. They write automation in high bug areas, and keep a larger overview of the product (set) to identify problematic areas and work with the product developers to solve those problems.
5. The article. Test Engineers write infrastructure for testing and reporting, (and possibly proof-of-concepts). Product developers are responsible for their own quality. Test Engineers may also act as consultants for product teams.
It's probably worth pointing out that 2-5 are all reasonable places to be for different-sized companies at different stages. We shouldn't assume that everyone should be at a company that is 'like Google', since things that make sense at Google scale may not make sense for your 20-developer startup.
Do Test Engineers at Google do this kind of work as well?
Also (half important point, half plug): solid test results reporting is underestimated in importance and generally not done well. Doing it right can boost engagement with testing. That's a reason Tesults (https://www.tesults.com) exists, and I'm involved with it.
You could, like, be the guy who writes a few unit tests.
This is based on what I was told when I interviewed there a year ago.
Decoupling is essential in order to have fast, maintainable tests that give you confidence to deploy continuously.
My two favourite techniques for this are a ports & adapters architecture, where we plug in fake adapters for the majority of the tests, and contract tests that give us confidence the fakes behave the same as the real services.
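A minimal sketch of the pattern, with invented port and adapter names: the same contract test runs against both the fake and the real adapter, so the fast tests can trust the fake.

```python
# Sketch: a port, a fake adapter for fast tests, and a contract test shared with the real adapter.
class UserStorePort:
    """The port: what the application needs from user storage."""
    def save(self, user_id, name): raise NotImplementedError
    def load(self, user_id): raise NotImplementedError

class InMemoryUserStore(UserStorePort):
    """Fake adapter plugged in for the majority of tests."""
    def __init__(self):
        self._data = {}
    def save(self, user_id, name):
        self._data[user_id] = name
    def load(self, user_id):
        return self._data.get(user_id)

def user_store_contract(store):
    """Contract test: run against the fake on every build, the real adapter less often."""
    store.save("42", "Ada")
    assert store.load("42") == "Ada"
    assert store.load("missing") is None

user_store_contract(InMemoryUserStore())
# user_store_contract(PostgresUserStore(dsn))  # same assertions against the real service
```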
Can anyone here recommend further reading on this particular problem?
The most important thing is to follow its advice and just start doing it little by little, even small refactors of other people's code. A friend's company had a book club, and some testing book got on the list at one point. As they discussed it they kept having arguments over how effective various things were, so they resolved the arguments by starting a Testing Club: every week everyone would put at least an hour into getting some part of the system under test, or into trying out a testing technique, and then discuss it. Over time they got most of the product under test and fixed a lot of previously unknown bugs.
1. The tester and another team member spent a year developing something that would intercept calls and relay them. Two problems with that: (1) two person-years spent, and (2) it sounds like serious NIH (not invented here) syndrome. The problem that should have been solved was everyone spending the time to write better tests and changing code as needed. Instead, they spent a year on a workaround, invented in-house. Was there nothing else out there that did this?
2. The word is "focused", not "focussed".
3. Lack of detail: how exactly does it work beyond that basic diagram?
4. Where's the code for the project? Would it be useful to others?
However, I admire that the OP posted their experience, and it is useful information.
Can you explain why this sounds like serious NIH syndrome? It looks like they built a system to cache service requests on top of an existing test framework. It seems specific enough that there might not be an existing method that fit well enough. The article is a bit light on details though, so I suppose it's hard to tell.
1. The idea was definitely out there (another commenter posted a link to pacts, which were a strong influence). Part of the process (and I hope this comes out in the article) was trying to find good ways to write better tests. We couldn't really (long story; let's just say "legacy code" to summarize it). So in the end we went with this technique. The whole process took a year, and as far as NIH goes, the existing implementations do not work in Google's setting, so we had to roll our own. That did not take very long, though.
You're asking that every engineer at Google spend more time writing better tests? What does the math end up saying there? That if each engineer spends more than X minutes per year writing better tests, there would be an ROI, where X ends up being a comically low number.
And you'd still probably want a caching system at that scale anyway, especially given a monorepo. Think of the compute hours saved.
NIHS, as you're calling this, can often make sense at scale.
You would know this isn't "fake news" if you had spent the time it took to write your post visiting the link instead.
A challenge in software development has long been the division between test and development.
You could do as Microsoft recently did (more or less abolish "test")... the jury is still out on whether that is what has caused the recent Windows 10 quality issues.
Or you could keep a tester/developer separation. Good luck trying to recruit top people for the tester positions (e.g. testers who are, on average, as smart as your developers) unless you are Google/Facebook.
Either way, I think this is a really interesting issue.
It should be noted: some pieces of software are a lot easier for their developers to test (say, a compiler) than others (say, a GUI).
A glance at a few (edit: random!) users' comment histories is enough to see that this intention (either way) is mostly orthogonal to the topics at hand.
I'm just more interested in calling out drudgery, propaganda, and forced fanaticism about boring topics, and in adding color.
See also: "I'm just saying what everyone else is thinking."
In the case of Microsoft, I was referring both to people doing manual testing and to people developing automated testing.
My understanding (from the outside) is that a substantial number of people at Microsoft who worked in those areas were made redundant. From what I could gather, the goal was to make the majority of testing automated and have the self-test systems be developed by the developers of the respective subsystems themselves.
All of the weird regressions I've seen with Win 10 myself (and read about many people experiencing) match that story.
My feeling is that with something as complex as a desktop OS, which needs to work with a) the history of Windows releases, b) the history of Windows apps, and c) the entire, insanely big spectrum of PC hardware released in the past 5-15 years or so, you do need an army of relatively highly competent people willing to do lots and lots of manual testing over and over and over and over and over ... again. And of course lots of people to build automated systems... but you can't really get away from the manual aspect very easily.
You can think of it like the engineers behind RSpec vs. the people who use it, or the engineers behind Selenium, etc. They're engineers first and foremost, and while you're free to hold the opinion that this line of engineering is "meaningless and trite", at Google SWEs really appreciate the tools SETI teams build.
I think there's a certain amount of recursiveness in the way we look down our nose at test code; we don't like it, so we write code that isn't easy to test, which makes it even harder to write code that is testable in the next composition layer of the system. Repeat a few times and the test code gets pretty horrifying. But somebody needs to crack that nut.
Developers ought to be appreciative that there are people willing to do QA and test infrastructure for them.