
Write Fuzzable Code - matt_d
https://blog.regehr.org/archives/1687
======
papermachete
If I am going to plaster extra asserts and predicates in my code, why not just
use a logic programming language or something like SPARK/TLA+ anyway? Not
looking for an argument, just asking a question.

I've tried replacing asserts with Mercury code in my C++ pet projects. It felt
much easier to insert my thoughts into the source.

By the way, what do you guys think of predicate-based formal verification? I
think overall, C++ Concepts are checked for object class compatibility via a
logic engine, since Concepts are predicates.

~~~
wyldfire
> why not just use a logic programming language or something like SPARK/TLA+
> anyway?

This (usually) requires a comprehensive reference model of your design, whereas
asserts can be added very ad hoc and phased in as appropriate.

Also, fuzzing doesn't require assertions; it just yields more/better fruit
that way.
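As a sketch of what "phased in" can look like (hypothetical function; the invariant assert is an ad hoc oracle a fuzzer can trip, with no reference model required):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run-length header scanner: the ad hoc assert below is an
   oracle a fuzzer can trip without anyone writing a full reference model. */
size_t rle_decoded_size(const unsigned char *in, size_t len) {
    size_t out = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        out += in[i];                 /* in[i] = run count, in[i+1] = byte */
    /* Phased-in invariant: output is at most 255 bytes per input pair. */
    assert(out <= (len / 2) * 255);
    return out;
}
```

A fuzzer will exercise this with arbitrary buffers; crashes are found either way, but the assert also catches silent accounting bugs.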

~~~
drewcoo
Wouldn't it require a set of business rules instead of a comprehensive design
model? It's logic programming. Based on those rules it can determine whether
the input I passed in should produce the output I received.

The logic programming suggestion would be much less work but is usually
overlooked. Probably because that kind of subtle, concise code is the polar
opposite of brute force fuzzing.

~~~
wyldfire
> Wouldn't it require a set of business rules instead of a comprehensive
> design model?

Well, I was imagining something like a library being tested/verified rather
than an application or system. But either way, you have to model the features
of the code being verified, and the model must be comprehensive. An incomplete
reference model will usually result in false defects.

------
sriram_sun
What's a good tutorial for fuzzing? I work with many code bases in my
consulting, none of which have come close to being "fuzzed". I've googled
around and the learning curve is steeper than expected. I am approaching it
like learning a new unit-test framework. Perhaps that's wrong? Is it like
integrating a code coverage tool or memory analysis like valgrind? This
article seems like a bunch of high-level entreaties to go fuzz, etc. Maybe I
missed something.

~~~
matt_d
"Generating Software Tests"
([https://www.fuzzingbook.org/](https://www.fuzzingbook.org/)) is pretty great
(independent of your programming language) - arguably a must-read for anyone
interested in software testing.

John Regehr (the author of the blog post) has written more great posts:

- How to Fuzz an ADT Implementation - [https://blog.regehr.org/archives/896](https://blog.regehr.org/archives/896)

- Better Random Testing by Leaving Features Out - [https://blog.regehr.org/archives/591](https://blog.regehr.org/archives/591)

- Tricking a Whitebox Testcase Generator - [https://blog.regehr.org/archives/672](https://blog.regehr.org/archives/672)

- Fuzzers Need Taming - [https://blog.regehr.org/archives/925](https://blog.regehr.org/archives/925)

- Levels of Fuzzing - [https://blog.regehr.org/archives/1039](https://blog.regehr.org/archives/1039)

- API Fuzzing vs. File Fuzzing: A Cautionary Tale - [https://blog.regehr.org/archives/1269](https://blog.regehr.org/archives/1269)

- Reducers are Fuzzers - [https://blog.regehr.org/archives/1284](https://blog.regehr.org/archives/1284)

In terms of software, DeepState
([https://github.com/trailofbits/deepstate](https://github.com/trailofbits/deepstate))
may be a good place to start for C and C++. Relevant links:

- Fuzzing an API with DeepState: [https://blog.trailofbits.com/2019/01/22/fuzzing-an-api-with-deepstate-part-1/](https://blog.trailofbits.com/2019/01/22/fuzzing-an-api-with-deepstate-part-1/), [https://blog.trailofbits.com/2019/01/23/fuzzing-an-api-with-deepstate-part-2/](https://blog.trailofbits.com/2019/01/23/fuzzing-an-api-with-deepstate-part-2/)

- NDSS 18 paper, "DeepState: Symbolic Unit Testing for C and C++":
[https://www.cefns.nau.edu/~adg326/bar18.pdf](https://www.cefns.nau.edu/~adg326/bar18.pdf)

In terms of choosing among fuzzing solutions, [https://blog.trailofbits.com/2018/10/05/how-to-spot-good-fuzzing-research/](https://blog.trailofbits.com/2018/10/05/how-to-spot-good-fuzzing-research/) is also worth a read -- as well as the article it refers to, [http://www.pl-enthusiast.net/2018/08/23/evaluating-empirical-evaluations-for-fuzz-testing/](http://www.pl-enthusiast.net/2018/08/23/evaluating-empirical-evaluations-for-fuzz-testing/). For a broad survey, see "The Art, Science, and Engineering of Fuzzing":
[https://arxiv.org/abs/1812.00140](https://arxiv.org/abs/1812.00140),
[https://jiliac.com/pdf/fuzzing_survey19.pdf](https://jiliac.com/pdf/fuzzing_survey19.pdf)

More resources:

- Effective File Format Fuzzing – Thoughts, Techniques and Results (Black Hat Europe 2016): [https://j00ru.vexillium.org/talks/blackhat-eu-effective-file-format-fuzzing-thoughts-techniques-and-results/](https://j00ru.vexillium.org/talks/blackhat-eu-effective-file-format-fuzzing-thoughts-techniques-and-results/)

- libFuzzer – a library for coverage-guided fuzz testing: [http://tutorial.libFuzzer.info](http://tutorial.libFuzzer.info), [http://llvm.org/docs/LibFuzzer.html](http://llvm.org/docs/LibFuzzer.html), [https://github.com/ouspg/libfuzzerfication](https://github.com/ouspg/libfuzzerfication)

- Materials of "Modern fuzzing of C/C++ Projects" workshop: [https://github.com/Dor1s/libfuzzer-workshop](https://github.com/Dor1s/libfuzzer-workshop)

- Introduction to using libFuzzer with llvm-toolset: [https://developers.redhat.com/blog/2019/03/05/introduction-to-using-libfuzzer-with-llvm-toolset/](https://developers.redhat.com/blog/2019/03/05/introduction-to-using-libfuzzer-with-llvm-toolset/)

- Fuzzing workflows - a fuzz job from start to finish: [https://foxglovesecurity.com/2016/03/15/fuzzing-workflows-a-fuzz-job-from-start-to-finish/](https://foxglovesecurity.com/2016/03/15/fuzzing-workflows-a-fuzz-job-from-start-to-finish/)

- Materials from "Fuzzing with AFL" workshop (SteelCon 2017, BSides London and Bristol 2019): [https://github.com/ThalesIgnite/afl-training](https://github.com/ThalesIgnite/afl-training)

- Making Your Library More Reliable with Fuzzing (C++Now 2018; Marshall Clow): [https://www.youtube.com/watch?v=LlLJRHToyUk](https://www.youtube.com/watch?v=LlLJRHToyUk), [https://github.com/boostcon/cppnow_presentations_2018/blob/master/05-10-2018_thursday/making_your_library_more_reliable_with_fuzzing__marshall_clow__cppnow_05182018.pdf](https://github.com/boostcon/cppnow_presentations_2018/blob/master/05-10-2018_thursday/making_your_library_more_reliable_with_fuzzing__marshall_clow__cppnow_05182018.pdf)

- C++ Weekly - Ep 85 - Fuzz Testing - [https://www.youtube.com/watch?v=gO0KBoqkOoU](https://www.youtube.com/watch?v=gO0KBoqkOoU)

- The Art of Fuzzing – Slides and Demos: [https://sec-consult.com/en/blog/2017/11/the-art-of-fuzzing-slides-and-demos/](https://sec-consult.com/en/blog/2017/11/the-art-of-fuzzing-slides-and-demos/)

~~~
unknown2374
If anyone, like me, wants to save this comment for later, go to the comment's
permalink
([https://news.ycombinator.com/reply?id=20830846](https://news.ycombinator.com/reply?id=20830846))
and click favorite.

------
d33
Speaking of fuzzing, does anybody know of a solution for fuzzing multi-step
processes? Suppose I was fuzzing a network application, which requires an
entire session for a bug to be discovered. I can't do that with vanilla afl-
fuzz; what tool would enable me to fuzz, say, an SSL/TLS library?

~~~
wyldfire
> Suppose I was fuzzing a network application, which requires an entire
> session for a bug to be discovered

I/O is a general problem for fuzzing, and IMO the simplest/most general
approach is to try to decompose the code under test to find a part that is
able to accept a single input stream.

EDIT: e.g. for an SSL/TLS library -- if you had a

    bool msg_create(msg_t *msg, void *input, size_t len_bytes)

function, you could fuzz that one easily.
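For instance, a libFuzzer entry point for such a decomposed function could look like the sketch below (msg_t and the parser body are hypothetical stand-ins, not a real TLS library):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message type and parser, standing in for the decomposed
   piece of the library that accepts a single input stream. */
typedef struct { uint8_t type; uint16_t len; } msg_t;

static bool msg_create(msg_t *msg, const void *input, size_t len_bytes) {
    if (len_bytes < 3)
        return false;                      /* too short for a header */
    const uint8_t *p = (const uint8_t *)input;
    msg->type = p[0];
    msg->len = (uint16_t)(p[1] | (p[2] << 8));
    return msg->len <= len_bytes - 3;      /* body must fit in the buffer */
}

/* libFuzzer calls this with mutated buffers; rejected inputs are expected,
   while crashes and sanitizer reports are the bugs we're after. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    msg_t msg;
    msg_create(&msg, data, size);
    return 0;
}
```

Built with `clang -fsanitize=fuzzer,address`, libFuzzer supplies `main` and drives this entry point with coverage-guided inputs.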

------
dev_dull
> _In other words, it detects only crashes. We can do much better than this.
> Assertions and their compiler-inserted friends — sanitizer checks — are
> another excellent kind of oracle._

I'm trying to understand this one. Isn't a fuzzer useful for catching
_unpredicted_ exceptions? It seems a non-zero or premature exit of a program
is one of the best scenarios.

~~~
MaulingMonkey
Fuzzing can feed all kinds of bad input into a program where the correct
result is to print an error message and/or return an error code. E.g. if I
feed a malformed PNG into a texture conversion tool, it probably _should_
print an error and exit nonzero, so we can't automatically mark those as test
failures.

On the other hand, if my same texture conversion tool hits a debug assertion
about a corrupted heap, that _should_ be a test failure. On the one hand, it
was kinda predicted - someone wrote a debug assert to catch it, after all - but
on the other hand, it certainly wasn't supposed to happen, no matter what kind
of bad input we were fed. It indicates we're corrupting memory, quite possibly
in a way that _won't_ always do the right thing of exiting non-zero,
especially if the debug assertions are disabled in release builds.

Even when you encounter assertions in more predictable code, it's often the
case that a non-zero premature exit actually wasn't the desired response.
Maybe you wanted to exit gracefully with a better error message. Maybe you
wanted to log an error and continue execution when your HTTP server receives
bad client requests, instead of creating a denial-of-service CVE. Maybe you
had a bug, and your code is mishandling entirely valid input.

If you're using assertions to indicate "I believe this should never happen"
instead of a general error reporting tool, it can be very nice to make them
test failures when fuzzing.
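A tiny illustration of that split (hypothetical names, not from the article): give malformed input an error return, and reserve the assert for an invariant no input should ever be able to violate:

```c
#include <assert.h>
#include <stddef.h>

enum { PARSE_OK = 0, PARSE_ERR_BAD_INPUT = -1 };

/* Hypothetical parser showing the two oracle levels: bad input gets an
   error code (expected, not a fuzzing failure), while the assert guards
   an internal "should never happen" invariant (a fuzzing failure). */
int parse_u16(const unsigned char *buf, size_t len, unsigned *out) {
    if (len < 2)
        return PARSE_ERR_BAD_INPUT;   /* malformed input: report, don't abort */
    unsigned v = (unsigned)buf[0] << 8 | buf[1];
    assert(v <= 0xFFFF);              /* invariant: two bytes can't exceed this */
    *out = v;
    return PARSE_OK;
}
```

A fuzz harness can then treat PARSE_ERR_BAD_INPUT as a pass and any assertion abort as a failure.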

~~~
Sohcahtoa82
> If you're using assertions to indicate "I believe this should never happen"
> instead of a general error reporting tool, it can be very nice to make them
> test failures when fuzzing.

I used to work for a company that used asserts as general error checking and
it was awful. It was for embedded firmware, and I always ran tests (including
security tests) with debug firmware. Non-debug firmware had asserts disabled.
On debug firmware, a failed assert would cause a complete crash with the
display showing the source code file and line number of the failed assert.

Some developers were using asserts on simple user input. For example, this
firmware had a web server with a REST API. One of the APIs would expect a
number as a parameter, and the person doing development added an assert()
statement to make sure the API received a number and not letters or other
characters.

It made security testing a nightmare and most of the assertion failures I ran
into were completely pointless.

~~~
redis_mlc
> It made security testing a nightmare and most of the assertion failures I
> ran into were completely pointless.

Just to clarify for me, what you're saying is that:

1) instead of using asserts to reject invalid input, they should have just
returned the appropriate error code

2) so that the fuzzing harness could continue with the run rather than just
keep dying?

> It was for embedded firmware, and I always ran tests (including security
> tests)

You had me going there for a minute. lol. :)

TIA.

~~~
Sohcahtoa82
Correct.

Basically, I'd see code that looked like this:

    assert(isNumeric(userInput));
    if (!isNumeric(userInput)) {
        return ERR_INVALID_INPUT;
    }
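The fix the thread is pointing at is just to drop the assert and keep the error return; a sketch with hypothetical names matching that snippet:

```c
#include <ctype.h>
#include <stdbool.h>

#define ERR_INVALID_INPUT (-1)   /* hypothetical error code */

static bool isNumeric(const char *s) {
    if (*s == '\0')
        return false;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return false;
    return true;
}

/* Untrusted input failing a check is an expected condition to report,
   not a program invariant to assert on. */
int handle_api_param(const char *userInput) {
    if (!isNumeric(userInput))
        return ERR_INVALID_INPUT;
    return 0;
}
```

With the assert gone, debug builds no longer crash on ordinary bad API input, and security testing can exercise the real error path.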

------
euroclydon
I'm currently fuzzing some ActiveX controls. Fortunately the top-level code is
in JS, so I was able to write a shim to record all the calls and arguments.
Then I can replay them to get the control into a valid state before fuzzing
individual methods.

