
Fuzzing is magic – Or how I found a panic in Rust's regex library - mmastrac
https://www.nibor.org/blog/fuzzing-is-magic---or-how-i-found-a-panic-in-rusts-regex-library/
======
lomnakkus
It's really disquieting how much similarity there is between fuzzing and
QuickCheck... and yet _so_ few people have even noticed and know about
_either_. (Yes, fuzzing and QC work at slightly different "levels", I suppose,
but... it's all just virtual execution by the CPU, so...)

~~~
kibwen
Worth noting that there's also an implementation of QuickCheck available for
Rust (by none other than burntsushi, the author of the regex library in the
OP):
[https://github.com/BurntSushi/quickcheck](https://github.com/BurntSushi/quickcheck)
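
For anyone who hasn't seen property-based testing, here is a rough, hand-rolled sketch of the idea. It deliberately does not use the quickcheck crate's API; the names `Lcg`, `random_vec`, `check`, and `reverse_twice_is_identity` are made up for illustration, and unlike real QuickCheck it does no shrinking of failing inputs.

```rust
/// Tiny linear congruential generator so the sketch needs no external crates.
/// (Hypothetical helper, not part of any library.)
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // The high bits of an LCG are better distributed than the low ones.
        self.0 >> 33
    }
}

/// Generates a byte vector of random length (0 to 15) with random contents.
fn random_vec(rng: &mut Lcg) -> Vec<u8> {
    let len = (rng.next_u64() % 16) as usize;
    (0..len).map(|_| rng.next_u64() as u8).collect()
}

/// Runs `prop` against `cases` random inputs and returns the first
/// input for which the property does not hold, if any.
fn check<P: Fn(&[u8]) -> bool>(seed: u64, cases: usize, prop: P) -> Option<Vec<u8>> {
    let mut rng = Lcg(seed);
    for _ in 0..cases {
        let input = random_vec(&mut rng);
        if !prop(&input) {
            return Some(input);
        }
    }
    None
}

/// Example property: reversing a sequence twice gives back the original.
fn reverse_twice_is_identity(xs: &[u8]) -> bool {
    let mut ys = xs.to_vec();
    ys.reverse();
    ys.reverse();
    ys.as_slice() == xs
}
```

Calling `check(1, 200, reverse_twice_is_identity)` returns `None` when no counterexample turns up; a property that is false returns `Some(input)` with the offending input.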

~~~
im_down_w_otp
We've recently started testing C code via Rust QuickCheck.

In addition to the relatively clean Rust -> C interface, it's very nice to have
a higher degree of compile-time assurance of the correctness of the property
model itself. As the models become non-trivial it becomes increasingly
difficult to build them reliably in C itself.

~~~
adrianN
Do you have more information about that? Maybe a blog post?

~~~
im_down_w_otp
Not at the moment, but almost certainly at some point in the relatively near
future.

------
krick
How does the fuzzer actually work? Is the data really just a completely random
string (array of bytes) over and over again, or is there something else? Does
it just choose a random byte a random number of times for each input, or is
there some more sophisticated logic in how exactly it chooses the "random input"?

~~~
lbrandy
libfuzzer (and many of the fuzzers used widely today) is a coverage (as in,
code-coverage) directed fuzzer. So it's not just randomly mutating the input,
but "selecting" for random inputs that happen to light up new code paths. It
works with a (or generates a random) corpus, and prefers the sample inputs
that light up new code paths in subsequent mutations of the population.
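
A toy sketch of that loop, to make the "selecting" part concrete. This is not libFuzzer's implementation: the `target` function fakes coverage by returning hand-assigned branch ids, and `XorShift`, `mutate`, and `fuzz` are hypothetical names invented for the example. Real fuzzers get coverage from compiler instrumentation and use far richer mutations.

```rust
use std::collections::HashSet;

/// Toy target: returns the ids of the "branches" this input reached.
/// Deeper branches are only reachable once the earlier bytes match.
fn target(input: &[u8]) -> Vec<u32> {
    let mut hit = vec![0];
    if input.first() == Some(&b'F') {
        hit.push(1);
        if input.get(1) == Some(&b'U') {
            hit.push(2);
            if input.get(2) == Some(&b'Z') {
                hit.push(3);
            }
        }
    }
    hit
}

/// Small xorshift PRNG so the sketch needs no external crates.
/// The seed should be nonzero, otherwise it only ever produces zeros.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Either appends one random byte or overwrites one existing byte.
fn mutate(rng: &mut XorShift, input: &[u8]) -> Vec<u8> {
    let mut out = input.to_vec();
    if out.is_empty() || rng.next_u64() % 4 == 0 {
        out.push(rng.next_u64() as u8);
    } else {
        let i = (rng.next_u64() as usize) % out.len();
        out[i] = rng.next_u64() as u8;
    }
    out
}

/// Coverage-guided loop: mutate a corpus entry, run the target, and keep
/// the mutant only if it reached a branch nobody reached before.
fn fuzz(seed: u64, iterations: usize) -> (Vec<Vec<u8>>, HashSet<u32>) {
    let mut rng = XorShift(seed);
    let mut corpus: Vec<Vec<u8>> = vec![Vec::new()];
    let mut seen: HashSet<u32> = target(&corpus[0]).into_iter().collect();
    for _ in 0..iterations {
        let parent = corpus[(rng.next_u64() as usize) % corpus.len()].clone();
        let child = mutate(&mut rng, &parent);
        let mut new_coverage = false;
        for branch in target(&child) {
            if seen.insert(branch) {
                new_coverage = true;
            }
        }
        if new_coverage {
            corpus.push(child);
        }
    }
    (corpus, seen)
}
```

The point of keeping "interesting" mutants is that each one becomes a stepping stone: once an input starting with `F` is in the corpus, the fuzzer only has to guess the next byte rather than the whole prefix at once.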

------
kibwen
I'm really excited at encouraging a culture of fuzz testing in Rust. :) Can
someone explain what's currently Linux-specific about cargo-fuzz? Is it to do
with LibFuzzer itself?

~~~
amaranth
The libFuzzer website doesn't cite any supported platforms or platform
requirements so I think this might just be a matter of someone putting in the
effort. The Chromium documentation (where libFuzzer comes from originally)
does make it sound like it either takes effort to integrate with various
sanitizers or (more likely) requires the infrastructure that the sanitizers
do. None of those support Windows, and they seem to work best on amd64 Linux
or, if you're lucky, amd64 macOS.

------
tux1968
It would be interesting to know how such a panic-inducing bug makes it past
Rust's much vaunted borrow-checker and other correctness validations. Or does
the regex library include "unsafe" code blocks?

~~~
kibwen
I don't think this comment warrants the downvotes it's receiving, as it's
entirely reasonable to conflate panics with unsafety if one is only used to
"crashing" in the context of C. It's a misconception that we get relatively
frequently, so we may just need to make sure this is highlighted in our
official tutorials.
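
A minimal sketch of the distinction, assuming nothing about the actual regex bug: an out-of-bounds index in safe Rust is bounds-checked, so it panics (a controlled unwind) instead of reading arbitrary memory. The helper names `byte_at` and `try_byte_at` are made up for illustration.

```rust
use std::panic;

/// Plain indexing in safe Rust is bounds-checked: an index past the end
/// panics rather than reading memory outside the slice.
fn byte_at(data: &[u8], index: usize) -> u8 {
    data[index]
}

/// Wraps the call in `catch_unwind` to show that the panic is an orderly
/// unwind the program can observe, not undefined behavior.
/// (This relies on the default `panic = "unwind"` setting.)
fn try_byte_at(data: &[u8], index: usize) -> Result<u8, String> {
    panic::catch_unwind(|| byte_at(data, index))
        .map_err(|_| "index out of bounds panicked".to_string())
}
```

So `try_byte_at(b"abc", 1)` gives `Ok(b'b')`, while `try_byte_at(b"abc", 10)` gives an `Err`. No `unsafe` block is involved in either case, which is why the borrow checker has nothing to say about it.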

~~~
carols10cents
We have a whole chapter on error handling in the second edition of The Rust
Programming Language[1] that goes into these issues in detail. When we revise
chapter 1 (we're doing that chapter last), we're planning on defining what
Rust means by "safe".

[1] - [http://rust-lang.github.io/book/second-edition/ch09-00-error-handling.html](http://rust-lang.github.io/book/second-edition/ch09-00-error-handling.html)

