
Hypothesis 1.0: A property-based testing library for Python - doismellburning
http://lists.idyll.org/pipermail/testing-in-python/2015-March/006348.html
======
bowyakka
This library is awesome; I am a massive fan of randomized and quickcheck-style
testing. This type of testing has found so many bugs for me it's surreal.

I love the fact that hypothesis is a proper quickcheck and not just the
fuzzing part, and that the author has a crazy stack-based-vm-language-testing
project that might be useful.

I have been using this library with py.test for 6 months now and it's been a
godsend.
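For anyone who hasn't seen the style: the core loop fits in a few lines, and Hypothesis layers data-generation strategies, shrinking of failing examples, and an example database on top. A plain-stdlib sketch of the idea (not Hypothesis's actual API):

```python
import random

def for_all(gen, prop, trials=200):
    """Quickcheck-style driver: prop must hold for every generated input."""
    for _ in range(trials):
        x = gen()
        assert prop(x), "property failed for %r" % (x,)

def gen_list():
    # random integer lists of random length
    return [random.randint(-100, 100) for _ in range(random.randint(0, 20))]

# e.g. sorting is idempotent and preserves length
for_all(gen_list, lambda xs: sorted(sorted(xs)) == sorted(xs))
for_all(gen_list, lambda xs: len(sorted(xs)) == len(xs))
```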

If I ever meet the author I would buy him beers.

~~~
DRMacIver
Thanks so much for the kind words!

The best part of this message is that if you were someone I knew and thus
could trust to only say nice things about me you would offer to buy me gin
instead of beer. ;-)

~~~
bowyakka
Gin is fine, perhaps we can meet at a PyCon one day :P

------
wyldfire
The search strategy is described in the docs [1]. I wonder if they could
capitalize on the algorithm NIST uses in their ACTS software [2]? It's great
for maximizing time spent testing parameters in N different dimensions.

[1]
[http://hypothesis.readthedocs.org/en/latest/details.html#sea...](http://hypothesis.readthedocs.org/en/latest/details.html#searchstrategy-and-converting-arguments)
[2]
[http://csrc.nist.gov/groups/SNS/acts/](http://csrc.nist.gov/groups/SNS/acts/)

~~~
DRMacIver
This looks super interesting and I'll check it out when I'm not doing release
management stuff, thanks!

If you haven't seen it,
[http://hypothesis.readthedocs.org/en/latest/internals.html#p...](http://hypothesis.readthedocs.org/en/latest/internals.html#parametrization)
describes the current algorithm for generation. It's smarter than is typical
for quickcheck, but could definitely be improved.

------
tekacs
Might be more useful pointed [here][1], perhaps?

[1]:
[http://hypothesis.readthedocs.org/en/latest/](http://hypothesis.readthedocs.org/en/latest/)

~~~
pyre
Well, it _is_ a link to the "official" announcement, and the announcement
directly links there, but I can see your point too.

------
michaelmior
GitHub
[https://github.com/DRMacIver/hypothesis](https://github.com/DRMacIver/hypothesis)

------
e12e
Interesting stuff. Can't wait to play with it.

One immediate reaction from reading about things like "@given(float)... def
testFun(x): ... assume(not isnan(x))..." -- I would immediately prefer to
decorate/document my functions (which could be test functions, but more
generally my "real" functions) with these pre-/post-conditions and
assumptions, and generate the tests, something like:

    
    
        @assert(idempotent) #(inv(inv(x))==x)
        def inv(x):
          "Return -x"
          @given(float):
            @assume(not isnan(x))
          return -x
    

Typing this out, I can see why they go in the tests (avoids going down the
rabbithole of creating a partially typed version of python...) -- which is why
@assume(float) isn't on top; it'd imply inv didn't work for integers. Maybe a
weaker:

    
    
        @assume(not isnan(x)) #implies x is number, but not nan
        @idempotent # inv(inv(x)) == x
        def inv(x):
          return -x
    

Now it should be possible to infer that inv(x) should be checked with numbers,
and that it is idempotent. It might be a weak test unless one can specify a
few other things -- but maybe such pre/post "hints" could be combined to make
better short tests? E.g. one could give a different expression to check
against, rather than the function itself: def test_inv(): assert(inv(x), lambda
y: -y)?

At any rate, interesting framework.

~~~
reverius42
I don't think that's what idempotent means. The only way f can be idempotent
while satisfying f(f(x)) == x is if f(x) == x (which implies that f is the
identity function).

The property that makes the function idempotent is that successive application
produces the same result as initial application: f(f(x)) == f(x).
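A quick way to see the difference (abs is idempotent; negation is self-inverse but not idempotent):

```python
xs = [-3, -1, 0, 2, 5]

neg = lambda x: -x

assert all(abs(abs(x)) == abs(x) for x in xs)      # abs is idempotent
assert all(neg(neg(x)) == x for x in xs)           # negation is self-inverse
assert not all(neg(neg(x)) == neg(x) for x in xs)  # ...but not idempotent
```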

~~~
e12e
You're right, of course. I typed that on my cellphone, and was a little quick
looking up alternatives for
@double_application_turns_this_into_an_identity_function -- that made more
sense, but "idempotent" was a little more succinct.

Maybe @left_inverse would be more appropriate? It's been a while since I had
to classify functions.

~~~
jfarmer
If you're curious, the math-y term for an operation that is its own inverse is
_involution_. I don't know that I've ever heard the word in a programming
context, but there it is! :)

~~~
e12e
Thank you! A quick look through my bookshelf reveals that out of four books on
discrete mathematics, only one contains _involution_ in the index: and there it
is mentioned _once_, in a supplementary exercise... I guess that explains why I
couldn't seem to recall a term for "self-inverse" [ed: No, that's wrong too,
lol. Let's just stick with _involution_].

~~~
jfarmer
The word "involution" finds more use in analysis and algebra, where self-
inverses have more interesting (often geometric) properties. The only
interesting property I can think of in the context of discrete mathematics
would be that if S is a finite set and f : S → S is an involution, then the
parity of |S| is equal to the parity of the number of fixed points of f.

That is,

    
    
        |S| ≡ |Fix(f)| (mod 2)
    

where Fix(f) = {x in S : f(x) = x}.
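The claim is easy to spot-check by brute force over every involution of a small set -- an involution on S is just a permutation whose non-fixed points pair up into 2-cycles, so |S| − |Fix(f)| is always even. A throwaway sketch:

```python
from itertools import permutations

def is_involution(p):
    # p is a tuple encoding f on {0, ..., n-1}: p[i] is f(i);
    # involution means f(f(i)) == i for every i
    return all(p[p[i]] == i for i in range(len(p)))

for n in range(1, 7):
    for p in permutations(range(n)):
        if is_involution(p):
            fixed = sum(1 for i in range(n) if p[i] == i)
            assert n % 2 == fixed % 2  # |S| ≡ |Fix(f)| (mod 2)
```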

------
lgierth
Is this similar to Mutation Testing, where the code being tested is mutated,
in order to identify unspecified behavior?

Mutant is a Ruby library for this:
[https://github.com/mbj/mutant](https://github.com/mbj/mutant)

~~~
DRMacIver
No, it's much closer to classic fuzz testing, where the test is held constant
and the data fed to it is varied. I've been meaning to see if I can figure out
a way to use mutation testing to feed into Hypothesis, but it's on the long
list of things I'll try "at some point".

~~~
baq
Could you use something like the recently open-sourced Z3 to assist in finding
minimal examples?
([http://en.wikipedia.org/wiki/Concolic_testing](http://en.wikipedia.org/wiki/Concolic_testing))

~~~
DRMacIver
This is also on the "at some point" list. :-)

Concolic testing is quite hard in Python because of its extremely flexible
semantics. I will probably look into doing _something_ with z3 at some point
when I need an interesting problem to entertain me, but I don't hold out a
massive amount of hope for it being useful.

------
ayrx
This is a very awesome library. I have been integrating it into the test
suites of my various projects this past week, and the author has been more than
helpful, even when I filed bug reports that turned out to be issues in my own
code.

~~~
IanCal
Slightly OT ramble ahead :)

> even when I filed bug reports that turned out to be issues in my own code.

Every time I've used or built a property based testing system I've had
problems when starting because there's something broken in it. This has then
turned out to be an actual bug in the system being tested.

Building a quickcheck-style tester in AS3 drove me nuts at one point, before
finding out that the conversions between strings and floating point numbers
have a bunch of weird issues. There are some obvious ones, but then also things
like certain numbers converting differently if you add a trailing zero (so
0.7362856270 is turned into a different number than 0.736285627 and
0.73628562700). So my little "decode(encode(x)) == x" test example broke!
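The same decode(encode(x)) == x property is easy to state in Python, where (unlike that AS3 runtime) repr of a float is guaranteed to round-trip exactly since Python 3.1, and trailing zeros parse harmlessly:

```python
import math
import random
import struct

def random_float():
    # draw a random 64-bit pattern and reinterpret it as a double, so we
    # hit denormals and extreme exponents, not just "nice" values
    bits = random.getrandbits(64)
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

for _ in range(1000):
    x = random_float()
    if math.isnan(x):
        continue  # nan != nan, so exclude it from the equality property
    # encode with repr, decode with float: must round-trip exactly
    assert float(repr(x)) == x, x

# trailing zeros don't change the parsed value here
assert float("0.7362856270") == float("0.736285627") == float("0.73628562700")
```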

My favourite thing that it found was in a menuing library we were building
(and the reason I built the tester). I set it up to exercise the library by
making the calls a developer might make, and to test various properties of the
overall menu (all elements reachable; for some, if you move left then right
you're back on the same item; etc.). One property was that if there was at
least one item in the list and the list had focus, then one item in that list
had focus. This was found to be broken by starting with only one item,
removing it and adding a new one (reduced test cases are incredibly useful).

That's a fairly boring bug, but the interesting thing (to me) was that when I
fixed it my unit tests broke. I had an explicit test to ensure that this
behaviour was happening, and I'd also written it down in my spec for what
should happen.

The testing tool forced me to consider higher-level questions of what should
be true, and drove out an inconsistency in my library. It's a 'bug' that would
have bitten me many times, but rarely enough that it would probably have
regularly ended up going live.

~~~
DRMacIver
> Building a quickcheck-style tester in AS3 drove me nuts at one point before
> finding out that the conversions between strings and floating point numbers
> have a bunch of weird issues. There are some obvious ones, but then also
> things like certain numbers converting differently if you add a trailing zero
> (so 0.7362856270 is turned into a different number than 0.736285627 and
> 0.73628562700). So my little "decode(encode(x)) == x" test example broke!

I have in fact had to deal with exactly this problem in Hypothesis internals.
The example saving code didn't work correctly with floats in the first
edition, because it was serializing them as JSON, which loses some information
encoded in the actual float. In the end I serialize floats (which are actually
doubles) by converting them to a 64-bit integer with the same bitwise
representation first.
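That bit-level conversion can be written with the struct module; something like this (a sketch of the technique, not necessarily Hypothesis's exact code):

```python
import math
import struct

def float_to_int(x):
    # reinterpret the 8 bytes of an IEEE-754 double as an unsigned 64-bit int
    return struct.unpack('>Q', struct.pack('>d', x))[0]

def int_to_float(n):
    # the inverse: reinterpret the 64-bit int's bytes as a double
    return struct.unpack('>d', struct.pack('>Q', n))[0]

# lossless for every double, including values JSON serialization mangles
for x in (0.1, -0.0, 2.0 ** -1074, float('inf')):
    y = int_to_float(float_to_int(x))
    assert y == x
    assert math.copysign(1, y) == math.copysign(1, x)  # -0.0 sign survives too
```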

The bug ayrx is talking about actually illustrates an amusing feature of
Hypothesis, which is that the code in it is just weird enough that it tends
to do unexpected things that trigger bugs which have nothing to do with the
properties being tested. :-) In this case it was putting unicode objects onto
sys.path, which is 100% allowed but causes problems for certain code running
on Python 2.7 on Windows that previously appeared to work.

~~~
IanCal
> I have in fact had to deal with exactly this problem in Hypothesis
> internals.

Hah, wonderful. It's a great example of how this type of testing can really
dig out odd bugs. Once you've tried property based testing, I think you never
really trust things quite the same :)

> The bug ayrx is talking about actually illustrates an amusing feature
> of Hypothesis, which is that the code in it is just weird enough that it
> tends to do unexpected things that trigger bugs which have nothing to do
> with the properties being tested

Nice :)

Thanks for releasing this library, it's great to see more work being done in
this area. There are usually a few random testing libraries floating about but
adding things like minimisation (and I'm still reading through the templating
& other new stuff) really makes it stand out.

The API testing example in particular actually comes at a perfect time for me,
so it'll definitely be getting some use.

~~~
DRMacIver
> Once you've tried property based testing, I think you never really trust
> things quite the same :)

Oh god, tell me about it. So far I've hit 4 pypy bugs, 3 cpython bugs, two
pytz bugs (one caused by a cpython bug), and I've learned far more about the
edge cases of the language than I ever wanted to know.

Let's just not talk about the number of bugs I've found in Hypothesis itself in
the course of testing bits of the system I was sure worked perfectly.

------
evancordell
This is awesome! And it comes right as I've been getting my feet wet with
Haskell (and property-based testing).

It was really easy to get up and running with a simple test:
[https://github.com/ecordell/pymacaroons/blob/property-tests/...](https://github.com/ecordell/pymacaroons/blob/property-tests/tests/macaroon_property_tests.py)

and I'm excited to use it in more places. Though I'll have to configure
tox/travis not to run hypothesis tests on Python 2.6.

~~~
DRMacIver
Cool! I've added a comment to your commit as I think you've made a mistake in
the strategy setup, but it looks good as a test.

Sorry about the lack of 2.6 support. I looked into it and I could probably do
it but I just don't care enough about 2.6 to put in the work and make
everything uglier. :-)

------
coolrhymes
I love this library and, like others, have found it a godsend. One thing I
want to try is using the lib with the Python mock library to mock services
that can return randomized data of some kind.
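One way that combination might look, using only the stdlib's unittest.mock and random (service and method names here are made up for illustration):

```python
import random
from unittest import mock

def fetch_balance(client):
    # the "real" code under test: talks to some service object
    return client.get_balance()

# mock the service and feed it randomized return values
for _ in range(100):
    client = mock.Mock()
    client.get_balance.return_value = random.randint(-10**6, 10**6)
    balance = fetch_balance(client)
    assert isinstance(balance, int)
    client.get_balance.assert_called_once_with()
```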

------
dirtyaura
I read about QuickCheck over 10 years ago at uni, I was just reintroduced to
property-based testing this week at a HelsinkiJS meetup, and now this Python
implementation has hit Hacker News.

Question: most of my backend code is rather dull CRUD and business logic code.
Are there good tutorials about randomly generating objects with a few fields
and relationships between them, and testing business logic using Hypothesis?

~~~
DRMacIver
Have you seen
[http://hypothesis.readthedocs.org/en/latest/examples.html#fu...](http://hypothesis.readthedocs.org/en/latest/examples.html#fuzzing-an-http-api)
? It doesn't test very much - mostly just that the API doesn't produce a 500
error - but it's a decent example of how you can generate structured data with
some constraints.
[http://hypothesis.readthedocs.org/en/latest/examples.html#co...](http://hypothesis.readthedocs.org/en/latest/examples.html#condorcet-s-paradox)
is also a decent example where the data is more uniform but requires a bunch
of massaging to satisfy the constraints.
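The "generate, then massage to satisfy the constraints" pattern can be sketched without any library at all (every field name below is invented for illustration -- for CRUD-style data the constraint is typically referential: child records must point at generated parents):

```python
import random
import string

def gen_user():
    # a random record with a couple of fields
    name = ''.join(random.choice(string.ascii_lowercase) for _ in range(8))
    return {'name': name, 'balance': random.randint(0, 10**6)}

def gen_order(users):
    # constraint: every order must reference an existing user
    return {'user': random.choice(users)['name'],
            'qty': random.randint(1, 100)}

users = [gen_user() for _ in range(5)]
orders = [gen_order(users) for _ in range(20)]

names = {u['name'] for u in users}
assert all(o['user'] in names for o in orders)
```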

------
leondutoit
Thanks for a great library, can't wait to use it.

