Unit tests are easy since I know what the code does; I can just tell it what to do and what I expect. But with Hypothesis, I have to find some kind of general property, which I have a hard time doing. Any tips or materials I can use?
P.S.: I read your test.py and saw that you use a rule-based state machine. I've never seen that in any tutorial on Hypothesis I've read. What does it do? How do you use it?
For a long time, I felt the same way you describe: all introductions to property-based testing show you things like testing a function that reverses a list, and yes, it all looks very useful and nifty, but it's hard to imagine how to formulate interesting variants for more complex code. These two posts were the first to really give me a fuller picture and lots of practical strategies:
(In general, everything I've read/watched by Scott Wlaschin has been super helpful, and I've never written a line of F#.)
The most basic things to check:
- No valid inputs give an error
- All invalid inputs give an error
These are enhanced if you sprinkle your code with assertions. Writing assertions inside the code is good training in identifying invariants. Something is supposed to always be non-negative? Check it. Something is always supposed to have more than 0 entries? Check it. Something is supposed to be sorted? Check that a[0] <= a[1] and a[-2] <= a[-1].
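A minimal sketch of both ideas together, using a hypothetical function (the function and its invariants are made up for illustration): internal assertions encode the invariants, and the Hypothesis test just checks that no valid input crashes.

```python
# Hypothetical example: internal assertions encode invariants, and a
# Hypothesis test checks that valid inputs never trigger them.
from hypothesis import given, strategies as st

def running_totals(xs):
    """Return cumulative sums of a list of non-negative numbers."""
    totals = []
    acc = 0
    for x in xs:
        assert x >= 0, "inputs are supposed to be non-negative"
        acc += x
        totals.append(acc)
    # the output is supposed to be sorted: check it
    assert all(a <= b for a, b in zip(totals, totals[1:]))
    return totals

@given(st.lists(st.integers(min_value=0)))
def test_valid_inputs_do_not_crash(xs):
    running_totals(xs)  # any assertion failure here is a bug
```

Even this "dumb" test is useful: the internal assertions turn the no-crash property into a check of every invariant you bothered to write down.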
Often you can test symmetry-type properties:
- deserialize(serialize(in)) == in.
- init(); connect(); disconnect(); leaves the state identical to just init();
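The round-trip case is often the easiest one to start with. Here's a sketch using JSON as a stand-in for whatever serialize/deserialize pair you have:

```python
# Round-trip ("symmetry") property: deserialize(serialize(x)) == x.
# JSON here is a stand-in for your own serialization code.
import json
from hypothesis import given, strategies as st

# Recursive strategy for JSON-compatible values (floats omitted to
# sidestep NaN and rounding noise in the equality check).
json_values = st.recursive(
    st.none() | st.booleans() | st.integers() | st.text(),
    lambda inner: st.lists(inner) | st.dictionaries(st.text(), inner),
)

@given(json_values)
def test_roundtrip(value):
    assert json.loads(json.dumps(value)) == value
```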
Or you can use an 'oracle', something external that tells you whether the output is correct or not.
- allTricksOptimized(in) == simpleReferenceImplementation(in)
- myImplementation(in) == thirdparty.implementation(in)
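A sketch of the oracle pattern, with two hypothetical stand-in implementations: an O(n) maximum-subarray function checked against an obviously correct brute-force reference.

```python
# Oracle pattern: compare a clever implementation against an obviously
# correct reference. Both functions are stand-ins for illustration.
from hypothesis import given, strategies as st

def max_subarray_optimized(xs):
    """Kadane's algorithm: O(n) max sum of a (possibly empty) contiguous slice."""
    best = current = 0
    for x in xs:
        current = max(0, current + x)
        best = max(best, current)
    return best

def max_subarray_reference(xs):
    """Brute force O(n^2): try every contiguous slice, including the empty one."""
    return max(sum(xs[i:j]) for i in range(len(xs) + 1)
               for j in range(i, len(xs) + 1))

@given(st.lists(st.integers(min_value=-100, max_value=100)))
def test_matches_oracle(xs):
    assert max_subarray_optimized(xs) == max_subarray_reference(xs)
```

The reference can be as slow and naive as you like; its only job is to be clearly correct.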
In many cases, though, it's better if you can refactor your code to support a stateless interface, and then you can write simple single-step function tests.
How do you know what to test? There's a whole spectrum. Just testing that valid inputs don't crash is often pretty useful. (If you're not sure what valid inputs would be, maybe try writing the README before the tests.) In the case of Dumpulse, I went to the other extreme; its actual semantics are extremely dumb (thus the name), so the test suite includes a reimplementation in Python of the intended semantics, and the test verifies that the implementation in C has the same behavior. (But in constant space and strictly limited runtime, which the Hypothesis test doesn't check.)
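On the P.S. about rule-based state machines: Hypothesis's stateful testing generates random sequences of operations ("rules") against your system, checks declared invariants after every step, and shrinks any failing sequence to a minimal reproduction. A toy sketch (the counter is purely illustrative; a real machine would drive your actual stateful system alongside a simple model of it):

```python
# Minimal sketch of Hypothesis's rule-based stateful testing. Hypothesis
# generates random sequences of rule calls and checks every @invariant
# after each step, shrinking failing sequences to minimal ones.
from hypothesis import strategies as st
from hypothesis.stateful import RuleBasedStateMachine, rule, invariant

class CounterMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.value = 0  # the "system under test" (a toy counter)
        self.model = 0  # a trivially correct model of expected state

    @rule(n=st.integers(min_value=0, max_value=10))
    def add(self, n):
        self.value += n
        self.model += n

    @rule()
    def reset(self):
        self.value = 0
        self.model = 0

    @invariant()
    def agrees_with_model(self):
        assert self.value == self.model

# unittest-compatible test case; pytest or unittest will pick this up
TestCounter = CounterMachine.TestCase
```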
I picked a few functions and classes at random to see what kind of testable properties I could find:
- sympy.polys.rationaltools.together: this function gives a testable assertion in its docstring: "``apart(together(expr))`` should return expr unchanged." That's an easy property to test. Also, though, its output expression is intended to be equivalent to its input expression, so if you put together some kind of generator of expressions (which is actually the hard part in this context), you could verify that together(expr) always evaluates to the numerically same value as expr.
- horizons.world.buildability.settlementcache.SettlementBuildabilityCache: this seems to be a storage system that caches some data and then incrementally updates the cache as the data is modified. The desired property seems to be that adding some data to the cache, then modifying it, gives you the same results as you would have gotten just by adding the modified data to an empty cache. Also I think there's a serialization aspect, so you could check to see if loading the cache from a binary blob gives you the same results as just creating it in memory.
- statsmodels.datasets.macrodata.data.load: this function loads a CSV file from disk. Really all you can test about it, without interposition at the filesystem layer, is that it returns some data. It doesn't really take any inputs. A lot of installation problems could cause it to throw an error, though, so that would be a useful test in some circumstances.
- sre_constants: if executed as a script, this program should generate a syntactically valid .h file in sre_constants.h. You should be able to feed that to a C compiler to verify that, but if you care about that, your build system is probably already doing it!
Do you want to point me at some code of yours so I can see what kind of useful property-based tests could be written for it?
I've dug into parsers a few times, but everything I encountered seemed to think that once I had a parsed tree of commands it was obvious how to consume it...and it wasn't (for me). I've never had the free time to dedicate to experimenting that abstractly, so anytime I'm tempted to write a DSL or similar for a current problem I punt and do something else.
Does anyone have advice on how to find nice ways to use parsers/lexers in a practical way that doesn't involve a massive investment of time or assume I have a lot of abstract comp sci background? Everything I've seen has been more of a compilers course, which seems overkill.
For slightly more advanced needs, parser combinator libraries make parsing and lexing quite straightforward. I honestly wouldn't use a parser generator.
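The core idea fits in a few lines, even without a library: a parser is just a function from text to (value, rest-of-text) or None, and small parsers compose into bigger ones. A hand-rolled sketch (real libraries such as parsy or pyparsing are far more ergonomic; this is just to show the shape):

```python
# Hand-rolled parser-combinator sketch: a parser is a function
# text -> (value, rest) or None, and combinators build bigger parsers
# out of smaller ones.
import re

def regex(pattern):
    """Parser matching a regex at the start of the input."""
    compiled = re.compile(pattern)
    def parse(text):
        m = compiled.match(text)
        return (m.group(), text[m.end():]) if m else None
    return parse

def mapped(parser, fn):
    """Apply fn to the parsed value."""
    def parse(text):
        result = parser(text)
        return (fn(result[0]), result[1]) if result else None
    return parse

def sep_by(item, sep):
    """Zero or more items separated by sep (trailing sep not consumed)."""
    def parse(text):
        values, rest = [], text
        result = item(rest)
        while result:
            value, rest = result
            values.append(value)
            if sep(rest) is None:
                break
            _, after_sep = sep(rest)
            result = item(after_sep)
        return (values, rest)
    return parse

number = mapped(regex(r"\d+"), int)
comma = regex(r"\s*,\s*")
numbers = sep_by(number, comma)  # e.g. numbers("1, 2,3") -> ([1, 2, 3], "")
```

Once the shape clicks, a combinator library is just this with better error messages and more built-in pieces.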
Barring that, you can google "compiler tutorial" and just start looking for one you like. One nice thing about compilers is they consist of a lot of stages and while the whole is arguably greater than the sum of its parts, the parts are all pretty darned useful on their own, too.
I can't argue about the broad applicability of compilers, but the main reason I've avoided such tutorials is that I'm looking for something I can get the basics of in hours/days rather than weeks/months.
Picture the task of what you want to do in your head. Maybe you're writing a linter, and you want to enforce a rule like "never use the 'var' keyword". Write some example code and poke around in the explorer.
Another thing to keep in mind is that once you have a tree, usually you also want to have some tree traversal utilities, so that you can walk over the tree and optionally transform it.
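In Python, the standard library's ast module already ships those traversal utilities: NodeVisitor walks the tree, NodeTransformer rewrites it. A toy "linter" in the spirit of the rule above (flagging calls to eval, as a stand-in for a banned construct):

```python
# Tree traversal via the stdlib ast module: a toy linter that walks a
# parsed Python tree and flags every call to eval().
import ast

class EvalLinter(ast.NodeVisitor):
    def __init__(self):
        self.violations = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and node.func.id == "eval":
            self.violations.append(node.lineno)
        self.generic_visit(node)  # keep walking into child nodes

def lint(source):
    linter = EvalLinter()
    linter.visit(ast.parse(source))
    return linter.violations  # line numbers of eval() calls
```

Swapping NodeVisitor for NodeTransformer and returning modified nodes from the visit_* methods gets you the "optionally transform" half.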
I love just looking through the absurd stuff AFL comes up with, even if it's not causing a crash or incorrect behavior. Like this bit of art it caused my parser generator to produce: https://i.imgur.com/VoV7cU9.png
If you're interested in trying out fuzzing without having to learn the intricacies of AFL or set things up manually, let me know and I can get you set up with an account to play around with.
Happy to answer any questions about AFL/Fuzzbuzz!
 everest [at] fuzzbuzz [dot] io
You won't get the same kind of results that you could by writing your own harness, but it would still be possible to find crashes, extreme memory usage, or timeout bugs. Using something like libdislocator would allow you to expose certain memory bugs as well.
It's similar to oss-fuzz in terms of functionality, in that it lets you integrate fuzzing into your dev workflow by automatically pulling your latest code, fuzzing in the background, alerting you on bugs, running regression testing, etc.
It differs in that while oss-fuzz is only for select large open-source projects, Fuzzbuzz lets anyone sign up and begin fuzzing their code. We also support more languages - the usual C/C++ as well as Golang, Python and Ruby, with more in the pipeline.
Am I underestimating the quantity or something?
> That’s the way things were until mid-2018 when I revisited the project.
Maybe he's just got used to it.