
Java Mutation Testing - onderkalaci
http://pitest.org/
======
cfontes
I can't believe I've never heard of this. It looks really useful.

Thanks for sharing. I will play with it today!

Relevant plugin for IntelliJ:
[https://plugins.jetbrains.com/plugin/7119?pr=idea](https://plugins.jetbrains.com/plugin/7119?pr=idea)

~~~
0hjc
I'd recommend using one of the build tool integrations first - they're part of
the core project. The IDE integrations still needed a bit of work last time I
looked at them.

------
kremlin
That's an interesting concept - testing the tests themselves.

~~~
jasonmp85
I used to use a very convoluted coverage setup with my Rails apps to ensure
that coverage was only counted for the parts directly under test. To clarify:
it's pretty easy in a Rails app to write an integration test that hits one
endpoint and uses every model.

Because end-to-end integration tests don't actually _assert_ anything about
model properties, it's incorrect to record model coverage during such tests.
So each controller test only recorded coverage for the controller it was
testing (and nothing else), each model test only recorded coverage for the
model it was testing (and nothing else), etc.

Of course, in a more IoC/DI-type system (Spring, etc.) well-written tests
don't interact with other objects in the first place: stubs or mocks are
injected. In that case it's easier to ensure a test for a given component only
exercises that one component, but you _still_ have to verify that your
assertions are meaningful for the coverage to ultimately mean anything.

So I guess what I'm saying is even with all those precautions and thought, a
tool like this is _extremely_ useful to tell you "uh, hey, this method you
think exercises all your code isn't actually asserting anything meaningful
about your code's behavior". More of this, please!
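
To make that concrete, here is a minimal sketch in plain Java, assuming a made-up `freeShipping` rule (every name below is hypothetical and nothing here is pitest's API). The "mutant" is written by hand to mimic the kind of boundary change a mutation tool might make. The weak test executes the code, so it counts as coverage, but it cannot tell the two versions apart.

```java
import java.util.function.IntPredicate;

// Hypothetical example: all names are illustrative, none come from pitest.
class CoverageWithoutAssertionsDemo {
    // Code under test: free shipping at or above 5000 cents.
    static boolean freeShipping(int totalCents) {
        return totalCents >= 5000;
    }

    // Hand-written "mutant": the boundary changed from >= to >.
    static boolean freeShippingMutant(int totalCents) {
        return totalCents > 5000;
    }

    // Weak test: calls the code (so it is "covered") but checks nothing.
    static boolean weakTestPasses(IntPredicate impl) {
        impl.test(5000);
        return true;
    }

    // Stronger test: pins down the behavior on both sides of the boundary.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(5000) && !impl.test(4999);
    }

    public static void main(String[] args) {
        System.out.println("weak test vs original:   "
                + weakTestPasses(CoverageWithoutAssertionsDemo::freeShipping));
        System.out.println("weak test vs mutant:     "
                + weakTestPasses(CoverageWithoutAssertionsDemo::freeShippingMutant));
        System.out.println("strong test vs original: "
                + strongTestPasses(CoverageWithoutAssertionsDemo::freeShipping));
        System.out.println("strong test vs mutant:   "
                + strongTestPasses(CoverageWithoutAssertionsDemo::freeShippingMutant));
    }
}
```

Only the stronger test fails against the mutant, which is the signal a mutation testing tool reports as a "killed" mutation.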

~~~
0hjc
If you're working with Ruby, mutant is pretty good ->
[https://github.com/mbj/mutant](https://github.com/mbj/mutant)

------
TheLoneWolfling
A problem with this is that sometimes nondeterminism is OK.

For instance, changing the constant to another prime in the "classic" hashcode
implementation (repeatedly multiply by a prime and add the next field) will
(probably) not trigger any (well-written) tests, and indeed generally won't be
detrimental at all, but will be flagged by this sort of test.
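
A rough sketch of that case, assuming a two-field hash and a hand-picked pair of primes (31 and 37 are illustrative choices, and the helper names are hypothetical): both versions satisfy the only thing a reasonable test would check, namely that equal inputs hash to equal values, even though the actual hash values differ.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical example: names and constants are illustrative only.
class EquivalentMutantDemo {
    // "Classic" hash: repeatedly multiply by a prime and add the next field.
    static int hash31(int x, int y) {
        int h = 1;
        h = 31 * h + x;
        h = 31 * h + y;
        return h;
    }

    // Hand-written "mutant": same algorithm with a different prime.
    static int hash37(int x, int y) {
        int h = 1;
        h = 37 * h + x;
        h = 37 * h + y;
        return h;
    }

    // What a test can reasonably assert: equal fields give equal hashes,
    // and repeated calls give the same answer.
    static boolean contractHolds(IntBinaryOperator hashFn) {
        int first = hashFn.applyAsInt(3, 4);
        int second = hashFn.applyAsInt(3, 4);
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("contract holds with 31: "
                + contractHolds(EquivalentMutantDemo::hash31));
        System.out.println("contract holds with 37: "
                + contractHolds(EquivalentMutantDemo::hash37));
        System.out.println("hash values differ: "
                + (hash31(3, 4) != hash37(3, 4)));
    }
}
```

A test that killed this mutant would have to assert the exact hash value, which is arguably an over-specified test.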

~~~
0hjc
This is what is known as an "equivalent mutation".

Along with high computational cost, it's one of the things identified in
academic research as preventing the widespread use of mutation testing. There
is no general method for distinguishing an equivalent mutation from a normal
surviving one except for getting a human to have a look at it.

Having built pitest to address the concerns around computational cost, it was
a pleasant surprise to find out that in practice equivalent mutations are not
much of a problem.

This isn't entirely by accident - the default mutation operators are
carefully designed to make equivalent mutations unlikely (they don't/can't
guarantee not to create them, but they make them as unlikely as possible).

There's a trade-off here. Pitest has a smaller set of operators than a lot of
research-focussed systems. A larger set of operators would catch more issues,
but would also create a larger proportion of equivalent mutants (and also take
longer to run).

There are more operators you can enable if you wish to change this balance - an
operator that changes constants as you describe is one of them.

I rarely encounter equivalent mutants using the default operators and I know
of some rollouts of pitest where they break the build on anything less than
100% mutation coverage.

I have no figures to back this up, but I strongly suspect the % of equivalent
mutants will be highly dependent on coding style and the domain in which the
code operates.

~~~
TheLoneWolfling
> I have no figures to back this up, but I strongly suspect the % of
> equivalent mutants will be highly dependent on coding style and the domain
> in which the code operates.

Agreed. In everything I've written, at least, I can identify multiple places
where such equivalent mutants do exist, even with just the default operators.
(In particular, hashcode methods - very few mutators break the contract of a
hashcode method, though most of them remove entropy.) But I can easily see
that not being the case with other coding styles.

Are the mutations done documented anywhere? I had to look through the source
to see what's done.

~~~
0hjc
The mutators are documented at
[http://pitest.org/quickstart/mutators/](http://pitest.org/quickstart/mutators/)

------
johnflan
I have used Pitest at work. It is very good, and on more than one occasion it
unearthed wanting tests.

Unfortunately, we had to remove it from our build. Our CI pipeline uses VMs
that were not provisioned with this type of testing in mind and pitest ended
up slowing the build enough to make it painful. If we could get past this, I
would turn it on in the morning.

~~~
the_af
That's too bad that you had to stop using it.

I've never used one of these tools, but I knew they existed. Just today I was
discussing "testing the tests" with a coworker. In my opinion, where I work we
write lots of incomplete/illogical tests, sometimes bordering on cargo
culting. Aside from code-coverage, which is flawed, we have no real measure of
whether the tests we write are effective or not.

~~~
oliverc2
There is a tool to detect duplicate tests which many have found useful:
[http://ortask.com/testless/](http://ortask.com/testless/)

For other kinds of test quality, try Mutator:
[http://ortask.com/mutator/](http://ortask.com/mutator/)

~~~
the_af
Thanks for the links!

I hate to sound negative, but something about that website seems dodgy. I'm
unconvinced duplicate/overlapping tests -- while obviously undesirable -- have
a direct correlation with code quality. Unfortunately, in order to read their
"papers" where they elaborate on this, I have to register :/

I'd rather use something open source like pitest.

(Again, thanks for the links! I don't want to sound too negative)

------
JD557
There seems to be a sbt plugin by the original creator that has not been
updated since last year:
[https://github.com/hcoles/sbt-pit](https://github.com/hcoles/sbt-pit)

Does anyone know if the current version of pit already works well with scala?

------
dlhavema
so if i read this correctly "This filter will not work for tests that utilise
classes via interfaces, reflection or other methods where the dependencies
between classes cannot be determined from the byte code."

if you are using interfaces to inject and mock things, it cannot test your
code?

this sounds really cool, but most of the stuff we do is done with
interfaces...

~~~
jmsguy
Unit tests should have exactly one system under test. That has to be a class,
not an interface.

You may mock dependencies that may/may not be interfaces. This is safer than
using concrete dependencies whose behaviors may change once the fuzzer does
its thing.

The behaviors of the classes that implement the mocked dependencies have
nothing to do with the system under test.

So if you're writing unit tests that test exactly one unit, you should be good
to go.
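
As a rough illustration of that shape (all names below are hypothetical, and the stub is hand-written rather than produced by a mocking library): the system under test is a concrete class, and its only collaborator is an interface replaced by a fixed stub, so the test's outcome depends on the one class alone.

```java
// Hypothetical example: RateSource and PriceConverter are made-up names.
class SingleUnitTestDemo {
    // Dependency expressed as an interface so it can be stubbed in a test.
    interface RateSource {
        double rateFor(String currency);
    }

    // System under test: a concrete class, not an interface.
    static final class PriceConverter {
        private final RateSource rates;

        PriceConverter(RateSource rates) {
            this.rates = rates;
        }

        long convertCents(long cents, String currency) {
            return Math.round(cents * rates.rateFor(currency));
        }
    }

    public static void main(String[] args) {
        // Stub with fixed behavior: no real implementation is involved,
        // so changes to other classes cannot affect this test.
        RateSource fixedRate = currency -> 2.0;
        PriceConverter converter = new PriceConverter(fixedRate);

        long result = converter.convertCents(150, "EUR");
        if (result != 300) {
            throw new AssertionError("expected 300 but got " + result);
        }
        System.out.println("converted: " + result);
    }
}
```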

~~~
krzyk
Regarding one system under test, is there a Java library (preferably with
Maven support) that would check that a unit test exercises a single class and
that everything else is mocked (e.g. with Mockito, or with custom anonymous
classes)?

~~~
ajanuary
Depending on who you talk to, a unit isn't necessarily a single class.

~~~
jmsguy
Actually this is a really good talking point that's often the start of many
interesting "discussions":

Namely, what happens when you're done with the test-code-repeat cycle for a
system under test and you want to make it more architecturally sound / OO /
etc.

You typically might end up doing a refactor in which you extract classes from
the original system under test and colocate common functionality into new
class(es).

In this way you still have the same coverage as before... But you're actually
testing multiple classes as a unit.

Some folks would argue you need to split the tests out. Others would say
"the coverage is there, what's the point?".

Unless I need to do something, I'm not going to do it.

In playing with pitest, it looks like the refactorings might introduce some
fuzzing that needs to be considered in the original (and refactored) SUT
depending on how much conditional logic you're moving around.

Arghh. There goes my evening. Will be playing with this after work now :)

------
c4n4rd
if you are the site owner, please correct:

Its fast, ...

with

It's fast,...

------
oliverc2
There are also other great mutation testing tools for other languages, such as
Mutator: [http://ortask.com/mutator/](http://ortask.com/mutator/)

