
Manual Testing, the Art That Cannot Be Lost - ideqa
http://ideqa.blogspot.com/2016/08/manual-testing-art-that-cannot-be-lost.html
======
hugs
I have a dog in this "fight", since my background is test automation and I
started the Selenium project and two test infrastructure companies (Sauce Labs
& Tapster Robotics). But I get so tired of these arguments. Testing is about
mitigating risk and increasing development speed. If you want to go fast and
there's low risk of failure, a little manual testing (with little or no
automation) is fine. If you want to go fast, and there's a high risk of
failure (lost revenue, reputation, etc.), go heavy on automation. If you have
lots of time and risk of failure is low, do whatever you want.

~~~
zachsnow
Why place them on opposite ends of a spectrum (low failure vs. high failure
costs)? (Curious, not saying it's incorrect.)

At my company we do both and get very different value from them. Automated
testing is great (reproducible, handles regressions automatically, can run all
the time) but it's not intelligent. It can't think of things that you forgot
to test, or identify dependencies between components you thought were
unrelated. You can argue that you should have written better tests, but how do
you learn that? In our case... Manual testing!

On the other hand, your manual QA person can't test all things at all times,
and they definitely won't test the same thing over and over and over forever
without finding a new gig! So ideally you take their input to learn not just
how to test a particular component better (and fix specific bugs) but how to
generally write better tests, so that next time their usual "tricks" won't
work.

------
Clubber
In my experience, automated testing is best on engine / backend type projects
while manual testing works best on user facing projects.

If the purpose of your code is automation, as in an engine, EDI, etc,
automated testing works best.

If the purpose of your code is manual user interaction, manual, human testing
is best.

You can get away with 100% manual testing, but you can't get away with 100%
automated testing.

YMMV.

------
wyldfire
> ... to [satisfy customers] you should implement exploratory/first look
> testing by manual testers which are then converted into automated tests.

Agreed, that's a great way to find bugs. Another complement is to use fuzzing.
With system-level testing that includes UI components, finding a way to
abstract the UI input and detect display errors becomes pretty
challenging. But excluding the very tip-top and lowest-lowest-bottom of the
stack, much of software can be adapted to take fuzzed inputs.
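
Roughly what I mean, as a toy sketch (Python assumed; parse_message is a
made-up stand-in for whatever layer sits just below the UI):

    import random

    def parse_message(data: bytes) -> dict:
        # Stand-in for the layer just below the UI: anything that takes
        # external input and should reject garbage without crashing.
        if not data:
            raise ValueError("empty message")
        return {"type": data[0], "payload": data[1:]}

    def fuzz(iterations: int = 10_000, seed: int = 0) -> None:
        rng = random.Random(seed)
        for _ in range(iterations):
            blob = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
            try:
                parse_message(blob)
            except ValueError:
                pass  # a clean rejection is acceptable behavior
            # anything else (IndexError, hang, crash) propagates and fails the run

    if __name__ == "__main__":
        fuzz()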

~~~
csours
Regarding display errors: Our app has a fun little bug. While navigating
quickly, items from a previous tab will show up on the next tab.

Our automation is very patient: it waits for everything to load and then waits
another second, so the automation never sees the bug.
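
One way to make the automation see it is to assert immediately after the
navigation instead of sleeping. A rough sketch with the Selenium Python
bindings (the URL and locators here are hypothetical, not our real app):

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    # Hypothetical locators; the real app's IDs will differ.
    TAB_TWO = (By.ID, "tab-two")
    TAB_ONE_ITEM = (By.CSS_SELECTOR, ".tab-one-item")

    driver = webdriver.Chrome()
    driver.get("https://example.test/app")

    # Click the next tab and check *immediately*, instead of sleeping:
    driver.find_element(*TAB_TWO).click()
    leftovers = driver.find_elements(*TAB_ONE_ITEM)
    assert not leftovers, f"{len(leftovers)} items leaked from the previous tab"

    # Only after that assertion do we wait for the new tab to settle.
    WebDriverWait(driver, 10).until(
        lambda d: d.find_element(By.ID, "tab-two-content").is_displayed()
    )
    driver.quit()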

------
Jtsummers

      To start let us talk about cost. Each is costly, but each is
      costly in different ways. Manual testing is costly because for
      every development team of four there should be at least one
      manual tester. That means hiring more people, which is
      expensive. Automation testing cuts the cost of manpower by
      almost 400% you only need one automation tester for every 16
      developers, but they is also a certain level of skill required
      for automation testing that requires you to pay the testers
      more. Both are costly just in there own way.
    

We do automated testing, primarily, with manual testing during development and
when diagnosing errors (whether they are errors in our system, errors in
understanding _about_ our system, or errors in systems we depend on).

I don't see 1 tester for 16 developers, though. For us, it's very nearly 1:1.
Now, this is embedded and I've always worked in some way related to aviation.
Perhaps our needs are different. But I honestly cannot imagine a scenario
where you'd have a 1:16 tester-to-developer ratio. That seems like an
environment highly skewed away from QA, in a bad way.

Similarly, how is this a "fight"? You need both. Manual testing is error-prone
and non-repeatable (reliably), especially in systems with tight timing
constraints. It's excellent for identifying errors in development, finding
problems in specifications, and reproducing errors reported by users, but
ultimately testers should be automating their tests. That was my job for
years; I wouldn't want to spend my time flipping switches or pushing buttons
in a GUI for weeks at a stretch just to verify our latest build hasn't
introduced new or old errors.

~~~
nickpsecurity
I figured with good annotations by programmers and Design-by-Contract the
manual or automated testing would be easy enough that there could be a low
number of testers vs developers. 1:1 surprises me unless you're counting each
developer writing their own basic tests or whatever. I'd be interested in
hearing more about what tools or practices you use given I continuously survey
safety-critical development practices.
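
For concreteness, the kind of annotation I have in mind looks roughly like
this (a toy Python sketch; real DbC tooling such as Eiffel or SPARK expresses
the contracts declaratively, plain asserts just get the idea across):

    def withdraw(balance: int, amount: int) -> int:
        # Precondition: the caller must hand us a sane request.
        assert amount > 0, "amount must be positive"
        assert amount <= balance, "cannot overdraw"

        new_balance = balance - amount

        # Postcondition: the result is consistent with the inputs.
        assert new_balance == balance - amount
        assert 0 <= new_balance < balance
        return new_balance

With contracts like that in place, a tester (or a test generator) mostly has
to supply inputs; any violated assertion is a ready-made bug report.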

~~~
Jtsummers
1:1.5 tester:developer may be more accurate.

We don't use a lot of good practices in my current office. I wish we did. We
use C and Fortran; one guy insists on C++ (which he calls C).

We _used_ to (before I came onto this project) do much better with using
static analysis tools and good mapping of: (requirements <-> specification <->
design <-> code) <-> tests. The reason this stopped: people retired. The
people that came in to replace them didn't understand the value of good
practices and tools for those things because, well, they were awesome rockstar
coders. And they are. But they assume _everyone else is too_. Which is a fatal
mistake. They're terrible at management, but good engineers.

A major problem, as well, is that we're on the maintenance end of the system.
These aren't fresh projects or clean slate projects so we're stuck with the
historical code (effectively). And too much of the initial (generally good)
practices and processes didn't survive the transition.

I've read a number of your posts, usually when I see your handle I make a
point to read them regardless of how long they are. And sometimes I realize
it's you just by the content and look for more in a thread. In principle I
think we're on the same page. In practice, I made some expedient choices for
myself that have kept me from being able to (on the job) make extensive use of
these ideas for a while.

One other reason, though, that we'd likely keep the same ratio is just the
nature of what we test here. Radios with an esoteric messaging protocol. It's
only used by us and a handful of others. A test case may involve dozens or
hundreds of key messages to simulate a "conversation" between two or more
radios to determine how our software behaves, with thousands more sent
automatically. It's made more complex by the (IMO) relatively poorly designed
protocol. It was satisfactory, maybe even good, originally but has become
overly complex and possibly contradictory in its specification as new features
were added over the years. Which is where better design (not just code
development) would be nice. I started on, but never finished, going through
all the messages that could be sent, what the responses ought to be, and when,
in order to produce a better simulator than we presently use. One of my
goals was to identify where the specification was contradictory, but I haven't
been given time and I'm not willing to consider it after work hours. Another
goal was to make it easier, with such a high level functional model/simulator,
to develop some of our other systems that rely on this same protocol (or some
subset of it). Something to verify/validate our implementation against.

EDIT: In a past life I worked with LDRA as our primary static analysis tool
for C. We had much better processes in place to track the relationships
between spec, requirements, test procedures and code (we used Telelogic
DOORS). On personal projects exploring these ideas I used Ada, and wanted to
use SPARK Ada but found the documentation at the time to be sparse online, and
my personal time waning so I dropped it.

~~~
nickpsecurity
"The reason this stopped, people retired."

Sorry to hear that. I'm short on suggestions for that one given I focus on
getting tools adopted more than re-adopted. I'll have to look into that
scenario a bit given I bet it plays out a lot.

"I've read a number of your posts, usually when I see your handle I make a
point to read them regardless of how long they are. And sometimes I realize
it's you just by the content and look for more in a thread. In principle I
think we're on the same page"

Appreciate the kind words. I work hard to get the quality stuff out there to
people like you. Hard to assess how it's being received. Comments like that
and occasional emails are what I go on for impact assessment.

"A test case may involve dozens or hundreds of key messages to simulate a
"conversation" between two or more radios to determine how our software
behaves, with thousands more sent automatically. "

Ohhhh. Ok. I'm already seeing it based on prior experience testing both
protocols and UI's. Yeah, that would take a lot of testing if specs were out
of date or insufficient. With specs, you could use design-by-contract on it to
support testing with interface checks and/or auto-generate tests depending on
tooling. I mean, the Fortran static analysis tooling might be behind a bit...
(read: nonexistent).
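
By "auto-generate tests" I mean something in the spirit of property-based
testing: once the spec pins down an invariant, a tool can generate the message
traffic for you. A toy sketch using Hypothesis (the encode/decode pair and the
one-byte-type, one-byte-length wire format are invented for illustration):

    from hypothesis import given, strategies as st

    def encode(msg_type: int, payload: bytes) -> bytes:
        # Contract on the way in: a representable message.
        assert 0 <= msg_type < 256 and len(payload) < 256
        return bytes([msg_type, len(payload)]) + payload

    def decode(frame: bytes) -> tuple:
        # Contract on the way in: a well-formed frame.
        assert len(frame) >= 2 and len(frame) == 2 + frame[1]
        return frame[0], frame[2:]

    @given(st.integers(min_value=0, max_value=255), st.binary(max_size=255))
    def test_roundtrip(msg_type, payload):
        # Spec-level invariant: decoding an encoded message gives it back.
        assert decode(encode(msg_type, payload)) == (msg_type, payload)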

"but never finished, going through all the messages that could be sent and
what the responses ought to be and when to try to produce a better simulator
than we presently use"

Are you actually able to change the protocol in arbitrary ways so long as the
product functions properly in some other black box your company can control?
Or do you have to maintain its existing, functional behavior for backward
compatibility? If the former, you can gradually spec it out then maybe apply a
protocol generator. If the latter, it might stay pretty manual with DbC,
testing, and/or static analysis being best you're going to do.

"In a past life I worked with LDRA "

Oh, you got to work with one of the real ones. Good for you. I haven't been
able to afford them, so I keep trying to get comments on how effective or
usable they are. Did it catch stuff well? Did it make it fairly easy to check
requirements/spec/code consistency, or was there lots of redundancy for the
user? On the latter, the point is that high-assurance development requires a
number of documents that all describe the same thing, essentially views on
the operation of the system. How easily a tool can keep them together and/or
propagate changes in one to another is important to avoid people saying
"screw it: I'll update that later [never]." So, I'm curious about its
usability. Still no consensus on tools for that: Topcased is an open one some
went with, some use traditional SCM w/ specs/docs in it, some HTML (e.g. the
LISP spec), and Karger et al on Caernarvon used Framemaker with lots of
cross-links.

"we used Telelogic DOORS"

Could've been interesting, but IBM acquired it for the Rational suite. Never mind.

"wanted to use SPARK Ada but found the documentation at the time to be sparse
online"

Might be worth re-exploring given they've done a lot over the years. SPARK
2014 was quite an update. Almost nobody does anything with it but there is a
DNS server written in it. Might contain insight on applying SPARK to protocol
design/verification. Not to mention its other, potential benefits. :)

[http://ironsides.martincarlisle.com/](http://ironsides.martincarlisle.com/)

~~~
Jtsummers
LDRA: It's been a bit since I touched it. Checking their site, we were using
LDRA Testbed; our use was for DO-178B certification. I specifically was using
it to develop test cases. What my memory tells me is that we were using it to
test pre- and post-conditions, a lot like you'd use in design by contract, on
functional units of our system. I was a tester, not a developer; I didn't use
its static analysis tools as much, so I don't recall much more.

DOORS: Honestly, at this point I don't care who makes it as long as it works.
What's needed, and what DOORS at the time provided, is:

    
    
      - Concurrent editing.
      - Traceability from one document to another.
      - Workflow control: I can make changes, but someone else has to
        verify/validate. Same as with any issue tracker.
      - Revision control.
    

Everything but the workflow control I can do with a simple markup language,
git, and a few scripts. But in industry it seems people prefer Word documents
and Excel spreadsheets.
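
The "few scripts" part really is small. Something like this is enough to flag
requirements with no test tracing to them (the REQ-NNN tag convention and the
directory names are just an example, not what we use):

    import re
    import sys
    from pathlib import Path

    REQ_ID = re.compile(r"REQ-\d{3}")

    def ids_in(paths):
        """Collect every REQ-NNN identifier mentioned in the given files."""
        found = set()
        for p in paths:
            found |= set(REQ_ID.findall(Path(p).read_text()))
        return found

    if __name__ == "__main__":
        reqs = ids_in(Path("requirements").glob("*.md"))
        tests = ids_in(Path("tests").glob("*.md"))
        untraced = sorted(reqs - tests)
        if untraced:
            print("Requirements with no test coverage:", ", ".join(untraced))
            sys.exit(1)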

SPARK 2014: I will, if I ever have downtime. I leave for the office at 7, go
to the gym until 8, then spend the evening with friends or my girlfriend.
Weekends are my only downtime these days and I'm finding myself lacking
motivation to do this. I need a project to work on, trying that now.

We don't have control of the protocol as there are others making products that
interact with ours (in theory, in practice it's hit-or-miss as some are more
faithful to the spec than others).

One of the things, as an example, that I want to discover by going over the
protocol specification is cases like this:

I can't recall the specifics, but there was a mathematically impossible
requirement. It was related to the discrete logarithm, only some of the values
involved weren't relatively prime with respect to the modulus. So what
happened is that no one implemented that part of the specification. No one
tested it either, because there wasn't a good traceability matrix revealing
that it was untested. One day, someone realized this, implemented a test to
see if this part of the spec was being done, and hey, the test failed.
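
I can't reconstruct the real numbers, but the flavor is something like "find x
such that 3^x = 6 (mod 9)": the powers of 3 mod 9 only ever take the values 1,
3, and 0, so no such x exists. A brute-force check (toy Python, not the actual
protocol math) makes that kind of impossibility obvious:

    def discrete_log(base: int, target: int, modulus: int):
        """Smallest x with base**x % modulus == target, or None if impossible."""
        seen = set()
        value, x = 1, 0
        while value not in seen:
            if value == target:
                return x
            seen.add(value)
            value = (value * base) % modulus
            x += 1
        return None  # the power sequence cycled without ever hitting the target

    assert discrete_log(3, 6, 9) is None   # unimplementable as specified
    assert discrete_log(3, 5, 7) == 5      # a solvable instance, for contrast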

This reveals major flaws in the process: 1) that specification feature should
never have made it into the final spec; 2) the test should have been made, or
at least been in progress, once that spec feature was introduced, which would
have led to (1) being true.

Speaking of gym, it's time for me to go. Thanks for the pointer with
Ironsides. I'll take a look at it and SPARK 2014 over the next few weeks.
We're about to wrap up my current project (well, in 3-4 more months) so now's
a good time to start putting together a demo to try and save the next project
from some of the current issues.

------
mathattack
I think it's silly to make this either-or. It's kind of like decision making:
do you want intuition, data, or both? Data may beat intuition, or vice versa,
but data plus intuition beats either alone. In chess, a computer plus a player
beats computer-only or player-only.

In testing, automated plus manual beats either on its own.

~~~
cle
With cost factored into the decision equation, automated+manual often doesn't
win.

~~~
mathattack
True - there's a spectrum. Generally (5 or 6 experiences) I've found that the
benefits of automation tend to come from repeat experiences (systems that get
updated frequently) and that the weak spots of each mandate at least a modest
investment in the other.

Even with only a little time, a person can spot the "oh sh*t, we don't have a
test case for that, but it's wrong" moments, and enough stuff breaks over and
over again that it's worth putting in at least some automation.

Of course the degree of one versus the other requires judgment and a P&L
decision.

------
haney
It also depends on your deploy cycle. If you want to do continuous deployment
(which seems to make development teams very happy), then you need great unit
and integration testing. If you're OK with waiting a week or more between
deployments, manual QA seems to work well. As someone who's gone 'all in' on
automated-only testing, I will admit that things that would have been caught
by more human sanity checking slipped past me; now that I'm in an environment
where manual QA blocks everything, it feels like the excitement of shipping
has been dimmed a bit. To anyone starting a new project I'd probably recommend
more automated testing than manual, something closer to 12 developers per
manual tester, and developers only checking in code that includes automated
tests.

------
a-saleh
My question would then be, how do I best teach it? I am a QE guy, mostly doing
Selenium-based automation, but because we are hiring generalist devs much
faster than QE people, I would really like to teach them the QE mindset. I
have seen too often that they try to work around a problem rather than
highlight it in a good bug report.

Context: currently I am on a ~6-person QE team with a ratio of ~1 QE to 6
developers, still just a small team in a growing corporation. Our product
people are constantly pushing for a rapid release cycle, and I have heard "We
will fix our failing integration test suites after the release, now we don't
have time."

In the end we mostly do manual checking of the new features, where we can
often persuade our product guys to drop a few too-broken ones off the release
change-log.

The only way we have been able to improve the testing situation so far was by
putting the rest of the team in front of a choice between spending ~50
man-days on manual regression testing or increasing our coverage ...

Anybody else in similar situation?

~~~
achievingApathy
I'm in the same boat. I lead my QA team, both the automation and the manual
testers. There are about 6 QA staff to about 24 developers, so we're vastly
outnumbered. I've worked tirelessly to get any sort of quality initiative as
part of the SDLC as early as possible. Code reviews, pairwise coding, TDD,
anything where I can "shift left" with the testing. Otherwise we would be
outmanned on every release.

------
1_800_UNICORN
My biggest beef with manual QA is that at most places I've worked for or
worked with as a consultant (~15 companies), the QA team either doesn't have
the skills necessary or doesn't care enough to understand the system at the
level I would want if the company is going to spend money on manual testing.
Most of the manual QA people I've encountered are either doing repetitive
checks that I would want automated, or they are checking features in a way
that I would expect the product team to be doing.

I've started to understand why large companies that attract high-quality
talent (e.g. Google) put new developers on QA... you need a developer's
mindset, without being involved in the day-to-day software development, to do
effective and creative manual testing.

------
maxxxxx
I generally push for automated testing, but since you are testing the same
thing again and again, there is a limit to the type of bugs it can find. It
just doesn't find bugs that are totally unexpected. In my experience, all the
really difficult bugs get detected by manual testing. I can take almost any
app that has been tested thoroughly with automated tests and break it within a
day just by playing with it.

I do a lot of UI stuff with complex behavior. It may be different for server
stuff where the expected behavior is easier to describe.

------
galdosdi
I recently read a book that covers this topic pretty thoroughly. I learned a
lot of useful ideas from it.

[https://www.amazon.com/Google-Tests-Software-James-Whittaker...](https://www.amazon.com/Google-Tests-Software-James-Whittaker/dp/0321803027)

------
Animats
In game development, there are more testers than developers. It's the job from
hell.[1]

[1] [http://trenchescomic.com/tales](http://trenchescomic.com/tales)

~~~
nickpsecurity
I quickly figured out what it would be like just by imagining what the gaming
experience is without almost anything that makes it enjoyable (it all gets
added incrementally), plus a huge amount of what pisses gamers off (aka the
bugs). I thought I'd hate the game by the time it was released. I also worried
I might game less in general due to a negative-reinforcement effect. If I did
testing, it would have to be stuff unlike what I normally played so that the
other stuff would still feel so much better.

------
hacker1234567
I love this!!

------
someone7x
The state of QA, if there is such a concept, is not in a good place. To me,
it's more divergently interpreted at each organization than any other role.

For example, this individual seems to work at a place where manual/automation
overlap is taboo and gives rise to the issues described.

~~~
mikestew
_The state of QA, if there is such a concept_

There is, it just rarely applies to software. What all but a tiny fraction of
a percentage of shops do is Quality Control. Were it actually quality
assurance, when I file a bug on the 35 empty _catch_ statements littering our
code, there would be absolutely no debate on fixing it. But because we do
quality control, dev has the option to say "no" when told to fix Pokémon-style
try/catch statements.
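
(For anyone who hasn't run into the term: Pokémon exception handling is the
gotta-catch-'em-all pattern. A deliberately bad toy example, in Python terms:)

    def save_settings(settings, path):
        try:
            with open(path, "w") as f:
                f.write(str(settings))
        except Exception:
            pass  # swallows every failure: disk full, bad path, real bugs, all of it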

But on to your point, yeah, software testing is not in a good place and
probably won't be for a while. For starters, software development is still
quite young, and formal testing is even younger than that. So give it time.
But as long as job descriptions for software testers start with "we're looking
for people to break software!", we're not there yet. Because if it were up to
me, we'd collectively be looking for people to tell us whether or not our shit
works, not to find esoteric bugs. Software verification can _include_ finding
esoteric bugs, but it isn't limited to that. The difference is subtle, but very
important.

An anecdote to back my point: for junior testers I will often describe a
blender, coffee grinder, what have you, and then ask them for a list of tests,
with bonus points for a rough outline of a test plan. 90% of the time
a candidate will come up with creative ways to break the blender (put rocks in
it, flip the switch 1000x), and yet fail to come up with a test case that
tests what happens when I put ordinary food in the blender and flip the
switch. Because we have, as an industry, fostered this idea of the tester's
role as breaking things. That is an extremely naive, immature, and limited
view of the software tester's role. When we can expand beyond that view, then
we might begin to stand a chance of actually verifying software instead of
throwing rocks in a blender.

~~~
achievingApathy
This is absolutely true. Every tester that we get in at our shop has the
break-it mentality, but that just comes as part of how some of these people
are trained to get through testing technical interviews.

Wishing that QA were more standard as part of software development is such a
double-edged sword. We're going to end up with some very finely trained people
who look at defects in only specific ways, with no more "out of the box"
thinking.

