
Rust testing or verifying: Why not both? - adreid
https://alastairreid.github.io/why-not-both/
======
afarrell
Keep in mind that tests & test tooling serve developer-ergonomics purposes
beyond ensuring that the software works:

1. Reproducibly setting up the conditions to trigger a bug.

2. Explaining how a module expects to be used.

3. Sketching the design of a module's interface & affordances.

4. Nudging away from designs with poor encapsulation.

5. Soothing the anxiety of a developer who is worried they are not making
hour-by-hour progress.

6. Reminding a developer what task they were in the middle of when their
brain dumps working memory due to an interrupt ...or just randomly OOMKills
mid-day because a baby woke them up at 3am.

~~~
marcosdumay
A good type system will help with #2, #3, #4, and #6 too.

In fact, types are better for #6 than tests, because they have lower overhead,
so they are more likely to be complete at the interruption.

~~~
afarrell
Agreed on the first 3.

I think you might have misunderstood #6, because I don't see how a type system
can function as a todo list.

~~~
dllthomas
As I am writing my code, I try to make sure that when I need to make a change,
I can make part of the change and the type checker will surface everything
that needs to change in tandem (not always in a single run). The todo list is
then simply the list of remaining errors, and I can usually jump right to
exactly where the fix is needed.
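A minimal Rust sketch of that workflow (the `Event` enum and `describe` function here are hypothetical, invented for illustration): add a variant, and every non-wildcard `match` on the enum becomes a compile error until it handles the new case, so the remaining errors are the todo list.

```rust
#[derive(Debug)]
enum Event {
    Connected,
    Disconnected,
    // Adding `Timeout` makes every non-wildcard `match` on `Event`
    // a compile error until it handles the new case -- the compiler
    // points at each site that still needs updating.
    Timeout,
}

fn describe(e: &Event) -> &'static str {
    match e {
        Event::Connected => "connected",
        Event::Disconnected => "disconnected",
        Event::Timeout => "timed out", // the compiler led us here
    }
}

fn main() {
    for e in [Event::Connected, Event::Disconnected, Event::Timeout] {
        println!("{:?} -> {}", e, describe(&e));
    }
}
```

The trick generalizes: avoid `_ =>` arms on enums you expect to grow, so the exhaustiveness checker keeps working for you.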

This sometimes works out really impressively well. There was a time I was
working on a project that required really low latency at low-to-moderate
throughput. It was greenfield C development, fairly experimental. I made a
habit of tagging functions with an argument describing which of the statically
defined threads the function needed to live on, tagging data with which thread
it should be accessed from, and checking agreement between them (without
runtime overhead).

This couldn't be rigorously enforced by the type system - C has way too many
escape hatches - but it was easy to build a little bit of infrastructure such
that doing it right was significantly easier than doing it wrong, and doing it
wrong was obvious.

When (as is inevitable) we discovered that we wanted to do significantly more
work somewhere and so wanted to move it out of the hot path, or discovered
that some new thing we wanted to do unavoidably had to now live in the hot
path, &c, I could move code between functions and the compiler would tell me
everywhere I was doing something that didn't belong in the new thread.
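The same idea can be sketched in Rust rather than C, where the type system can enforce it without the hand-built infrastructure: zero-sized token types stand in for the per-thread tags (all names here are made up for illustration), and carrying the token costs nothing at runtime.

```rust
use std::marker::PhantomData;

// One zero-sized "capability" token per statically defined thread.
struct HotThread;
#[allow(dead_code)]
struct ControlThread;

// A value tagged with the thread it may be touched from.
struct OwnedBy<T, Thread> {
    value: T,
    _thread: PhantomData<Thread>,
}

impl<T, Thread> OwnedBy<T, Thread> {
    fn new(value: T) -> Self {
        OwnedBy { value, _thread: PhantomData }
    }

    // Access requires a token proving which thread you are on;
    // the token is zero-sized, so there is no runtime overhead.
    fn get(&self, _proof: &Thread) -> &T {
        &self.value
    }
}

fn main() {
    let budget: OwnedBy<u64, HotThread> = OwnedBy::new(250);
    let hot = HotThread; // handed to the hot thread's entry point
    println!("hot-path budget: {}", budget.get(&hot));
    // budget.get(&ControlThread) would not compile: the mismatched
    // token surfaces the cross-thread access at build time.
}
```

Moving code between threads then produces exactly the experience described: the compiler lists every access whose token no longer matches.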

~~~
afarrell
Huh.

hmmm....

You know, I think you've just revealed something about the shallowness of my
experience with type systems.

------
siscia
I implemented a tiny and simple data structure in Rust, a circular buffer:
[https://github.com/RedBeardLab/circular-buffer-rs](https://github.com/RedBeardLab/circular-buffer-rs)

The number of subtle bugs and problems that proptest was able to point out to
me was terrific.

Property-based testing is definitely the tool you want for thoroughly testing
a complex library that needs to maintain its invariants.

The interesting bit is that if you approach the problem with unit tests, you
will mostly write tests that pass. If you approach writing the tests as
property tests, you will write much harder test cases.
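To illustrate the kind of invariant checking involved, here is a dependency-free sketch in the same spirit: proptest would generate (and shrink) the operation sequences automatically, but the core property, that the buffer agrees with a trusted reference model on every operation, looks like this. The `CircularBuffer` below is a hypothetical minimal one, not the linked crate.

```rust
use std::collections::VecDeque;

// Hypothetical minimal circular buffer, invented for this sketch.
struct CircularBuffer {
    buf: Vec<Option<u32>>,
    head: usize, // index of the oldest element
    len: usize,
}

impl CircularBuffer {
    fn new(cap: usize) -> Self {
        CircularBuffer { buf: vec![None; cap], head: 0, len: 0 }
    }

    // Returns false when full instead of overwriting.
    fn push(&mut self, v: u32) -> bool {
        if self.len == self.buf.len() {
            return false;
        }
        let tail = (self.head + self.len) % self.buf.len();
        self.buf[tail] = Some(v);
        self.len += 1;
        true
    }

    fn pop(&mut self) -> Option<u32> {
        if self.len == 0 {
            return None;
        }
        let v = self.buf[self.head].take();
        self.head = (self.head + 1) % self.buf.len();
        self.len -= 1;
        v
    }
}

fn main() {
    // Tiny xorshift PRNG so the sketch needs no external crates;
    // proptest would generate and shrink these inputs for you.
    let mut state: u64 = 0x12345678;
    let mut rand = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };

    let mut cb = CircularBuffer::new(8);
    let mut model: VecDeque<u32> = VecDeque::new(); // trusted reference

    for _ in 0..10_000 {
        let r = rand();
        if r % 2 == 0 {
            let v = (r >> 8) as u32;
            let had_room = model.len() < 8;
            // push must succeed exactly when the model has room
            let pushed = cb.push(v);
            assert_eq!(pushed, had_room);
            if pushed {
                model.push_back(v);
            }
        } else {
            // FIFO order must match the reference model exactly
            assert_eq!(cb.pop(), model.pop_front());
        }
    }
    println!("10000 random operations matched the reference model");
}
```

The off-by-one and wrap-around bugs this style of test catches are exactly the "subtle problems" in question: any divergence from the model fails immediately.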

~~~
eru
Apropos hard test cases: have a look at the floats that Python's Hypothesis
property testing library generates. They are truly evil.

------
pjmlp
More interesting is that from his blog I ended up on Project Oak, which is
being written in Rust:

[https://github.com/project-oak/oak](https://github.com/project-oak/oak)

~~~
adreid
That is the project we are trying to support with our work. Unfortunately,
this means I have to solve lots of hard problems like scaling, usability,
modular verification, etc.

~~~
pjmlp
On the other hand I guess those are nice opportunities for new papers on the
domain. :)

Very interesting research area.

------
ivanbakel
I'm pleased to see that "verification" isn't just relying on random value
generators for property testing in this case. While I agree with the author
that actually running code is still useful, simple specifications have the
power to be much more than glorified fuzzing - and Rust is the perfect
language to start pushing that in. The use of native syntax for int value
ranges is especially nice.

The main question is: how does this compare to other Rust verification
efforts[0]? Having to write specifications in test crates rather than next to
the function definitions is probably going to be a big obstacle to multi-crate
verification.

[0]: [https://github.com/viperproject/prusti-dev/](https://github.com/viperproject/prusti-dev/)

~~~
adreid
This is more at the end of the spectrum occupied by automatic verification
tools like KLEE, SMACK, etc. Auto-active tools like Prusti are super
interesting too, and I suspect that we need a hybrid approach.

~~~
ivanbakel
From what I understand of Prusti, it still aims to be automatic and the
underlying Viper pipeline still relies on constraint solving for the most part
- certainly the Prusti examples don't seem to do anything more complex than
that.

Would you be able to outline the main differences?

~~~
adreid
The difference is really about the level of annotation you provide. In testing
(and the more automated verifiers) you might add assertions. In the
“auto-active” tools, you are adding more function contracts, invariants, maybe
hints to help find a proof, etc.

------
kybernetikos
This article makes me think that we should consider tests as a Monte Carlo
integration over the space that verification tries to cover.

~~~
eru
Interesting perspective. It only really works with property based testing, not
with conventional fixed-example based testing.

~~~
kybernetikos
It is definitely more natural with property based testing, but I think it
works to some extent as a perspective in general.

You have a huge input space where each point represents some combination of
possible inputs, you can think of representing correct behaviour as 0 and bad
behaviour as 1 for each of those points. You're attempting to show by your
verifying and testing as best as you can that the sum across all the inputs is
0.

For fixed-example testing, you sample some points that you consider to be
representative or boundary cases, show that they are 0, and use that to argue
that the whole space is 0. With property testing, you can let it literally do
the Monte Carlo thing. With verification you attempt to do the integration
analytically.
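The analogy can be made concrete with a toy sketch: treat "this input triggers the bug" as an indicator function over the input space and estimate its integral by sampling. The `buggy` predicate and its 1/1000 failure rate below are invented purely for illustration.

```rust
// Hypothetical defect: the code under test fails on multiples of 1000,
// so the "true integral" of the failure indicator is exactly 1/1000.
fn buggy(x: u32) -> bool {
    x % 1000 == 0
}

// Property testing as Monte Carlo: sample the input space and count
// how often the failure indicator is 1. (xorshift PRNG, no crates.)
fn estimate_failure_rate(samples: u64, seed: u64) -> f64 {
    let mut s = seed; // must be nonzero for xorshift
    let mut failures = 0u64;
    for _ in 0..samples {
        s ^= s << 13;
        s ^= s >> 7;
        s ^= s << 17;
        if buggy((s % 1_000_000) as u32) {
            failures += 1;
        }
    }
    failures as f64 / samples as f64
}

fn main() {
    // Verification would compute the integral analytically (0.001 here);
    // sampling only estimates it, with error shrinking as samples grow.
    let est = estimate_failure_rate(1_000_000, 0x9e3779b9);
    println!("estimated failure rate: {est}");
}
```

Fixed-example tests then correspond to evaluating the indicator at a few hand-picked points and extrapolating from there.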

~~~
eru
All good points!

I think that, just as reliability engineering benefited from moving from 'keep
the system always up' (but fail occasionally in the real world) to 'keep the
system up 99.999% of the time' (and succeed in a measurable way), fixing bugs
in software might benefit from such an honest and quantitative approach.

------
sn9
This looks super cool!

I was under the impression that burntsushi's implementation of quickcheck [0]
was the most popular property-based testing library for Rust. Was I mistaken
or was there a reason the authors chose another library as inspiration?

[0]
[https://github.com/BurntSushi/quickcheck](https://github.com/BurntSushi/quickcheck)

~~~
shepmaster
As a previous user of quickcheck and now a user of proptest, I can say that
proptest feels much better to use because it doesn't tie generation quite as
directly to a single type (as a sibling comment mentions).

With quickcheck, I had to define `MyTypeWithAnOddOffset` and
`MyTypeWithAnEvenOffset` (or equivalent formulation) and _composing_ things
was tedious or hard.

With proptest [1], things flow much more nicely.

[1]:
[https://github.com/shepmaster/jetscii/blob/2b1039a9fb8d8ad9c0dbac2c1007c4d0533261c2/src/simd.rs#L387-L409](https://github.com/shepmaster/jetscii/blob/2b1039a9fb8d8ad9c0dbac2c1007c4d0533261c2/src/simd.rs#L387-L409)
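The contrast can be sketched without either crate. proptest's real API is the `Strategy` trait with combinators like `prop_map`, but the essence is that a generator is an ordinary value you can transform, so "odd offset" and "even offset" are one `map` call apart instead of two newtypes; `Gen`, `any_u32`, and `map` below are made-up stand-ins.

```rust
// A generator is just a boxed function from a PRNG state to a value,
// approximating proptest's Strategy trait without the crate.
type Gen<T> = Box<dyn Fn(&mut u64) -> T>;

// xorshift step so the sketch needs no external crates
fn next(seed: &mut u64) -> u64 {
    *seed ^= *seed << 13;
    *seed ^= *seed >> 7;
    *seed ^= *seed << 17;
    *seed
}

fn any_u32() -> Gen<u32> {
    Box::new(|s| next(s) as u32)
}

// Generators compose: no MyTypeWithAnOddOffset newtype needed,
// just transform an existing generator.
fn map<A: 'static, B: 'static>(g: Gen<A>, f: impl Fn(A) -> B + 'static) -> Gen<B> {
    Box::new(move |s| f(g(s)))
}

fn main() {
    let even_offset = map(any_u32(), |x| x & !1); // clear the low bit
    let odd_offset = map(any_u32(), |x| x | 1);   // set the low bit
    let mut seed = 0xdeadbeef;
    println!("even sample: {}", even_offset(&mut seed));
    println!("odd sample: {}", odd_offset(&mut seed));
}
```

In quickcheck's model, each of those constraints would typically become its own wrapper type implementing `Arbitrary`, which is the composition tax being described.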

------
jeremysalwen
It would be cool if you could also use the exact same interface for
coverage-based fuzzing, creating a unified property-testing/fuzzing/formal-
verification interface.

