Hacker News new | past | comments | ask | show | jobs | submit login
Write Fuzzable Code (regehr.org)
140 points by matt_d 47 days ago | hide | past | web | favorite | 29 comments

If I am going to plaster extra asserts and predicates in my code, why not just use a logic programming language or something like SPARK/TLA+ anyway? Not looking for an argument, just asking a question.

I've tried replacing asserts with Mercury code in my C++ pet projects. Felt much easier to insert my thots into the source.

By the way, what do you guys think of predicate-based formal verification? I think overall, C++ Concepts are checked for object class compatibility via a logic engine, since Concepts are predicates.

The biggest win of predicate-based formal verification (or really any formal verification at all) is forcing you to carefully think about invariants and properties of your code (independently of whether you can prove them). Even if you can't prove them, these serve as useful formal documentation for maintenance purposes and help guide and maximize the efficiency of other techniques such as traditional testing. If you can characterize your code with useful invariants and properties, then you can write useful unit tests and property tests, rather than writing happy path tests that fail to actually catch many bugs.

Assertions are one piece of the puzzle, but often times you want to reason about how something fits in a larger context (the canonical example of this is something like the associativity of an operator, which isn't something you can check for with an assert statement within the operator), which is where independent predicates come in handy.

In general, I think of fuzzing and property-based testing as a lightweight and practical stand-in for the verification part of formal verification. If your formal properties can be proved quickly (e.g. by a short manual proof or some sort of automated engine) then great! If they can't, you can still often times convert it into a fuzzing-friendly format or a property test and get a fairly high degree of confidence of correctness that way.

Because they work at different levels and provide different strengths.

Model-checking a TLA+ spec is fully automatic, but the specification is probably at a high level; fuzzing can find low-level implementation problems.

SPARK or languages/tools like it specifies at the code level, but verifying at the same level as fuzzer can be very laborious if at all feasible (same as using the proof assistant with TLA+).

In short, software verfication comes with huge tradeoffs among labor, soundness and scale. There is no one solution that subsumes the others.

> why not just use a logic programming language or something like SPARK/TLA+ anyway?

This (usually) requires a comprehensive reference model of your design whereas asserts can be added very adhoc and phased in as appropriate.

Also, fuzzing doesn't require assertions, it just yields more/better fruit that way.

Wouldn't it require a set of business rules instead of a comprehensive design model? It's logic programming. Based on those rules it can determine whether the input I passed in should produce the output I received.

The logic programming suggestion would be much less work but is usually overlooked. Probably because that kind of subtle, concise code is the polar opposite of brute force fuzzing.

> Wouldn't it require a set of business rules instead of a comprehensive design model?

Well, I was imagining something like a library being tested/verified rather than an application or system. But either way, you have to model the features of the code-being-verified and the model must be comprehensive. An incomplete reference model usually will result in false defects.

It's a lot easier to sprinkle some assertions throughout an existing legacy application than it is to reimplement that application in something like TLA+

>Felt much easier to insert my thots into the source.

thots -> thoughts?

What's a good tutorial for fuzzing? I work with many code bases in my consulting, none of which have come close to being "fuzzed". I've googled around and the learning curve is steeper than expected. I am approaching it like learning a new unit-test framework. Perhaps that's wrong? Is it like integrating a code coverage tool or memory analysis like valgrind? This article seems like a bunch of high-level entreaties to go fuzz etc. maybe I missed something.

It depends really a lot on what your target looks like, both in terms of what interfaces it has and what programming language you're using.

A very simple case that's a good start is if you have C or C++ code and already have some form of command line tool that parses an input file and does something with it noninteractively. Then you can easily use afl, ideally coupled with address sanitizer (a compiler feature to catch memory safety bugs).

If you want to up your game, don't target applications, target functoins. The go-to tool is libfuzzer.

If you leave C/C++ then the answer depends really on what programming language and what you even want to test for. (Looking for memory corruption bugs is just the most common thing to do with fuzzing, so if you don't have those, you need something else to look for.)

> I am approaching it like learning a new unit-test framework.

That's reasonable.

If you just take a step back from afl or libfuzzer and consider how you would test your product with arbitrary inputs, the critical steps that you have to take are pretty similar regardless of the framework.

It has to be noninteractive and it should ideally be something whose duration doesn't vary by many orders of magnitude (so you can set sane timeouts that you consider failures). It shouldn't have any nondeterminism or behaviors that consider other inputs outside of the ones generated by the fuzzing framework. It's great if you can avoid filesystem interaction.

I was in the same place for a long time. Recently I found the bar to entry in fuzzing rust code amazingly low. The docs aren't great yet but the tools are super easy to use and well put together. I started with the rust fuzz book: https://rust-fuzz.github.io/book/

Here's an example in Golang:


The short version is that you strip down the interesting main loop of your program to accept an item of input, process it, and exit, and then the fuzzer takes over.

"Generating Software Tests" (https://www.fuzzingbook.org/) is pretty great (independent of your programming language) - arguably a must read for anyone interested in software testing.

John Regehr (the author of the blog post) has written more great posts:

- How to Fuzz an ADT Implementation - https://blog.regehr.org/archives/896

- Better Random Testing by Leaving Features Out - https://blog.regehr.org/archives/591

- Tricking a Whitebox Testcase Generator - https://blog.regehr.org/archives/672

- Fuzzers Need Taming - https://blog.regehr.org/archives/925

- Levels of Fuzzing - https://blog.regehr.org/archives/1039

- API Fuzzing vs. File Fuzzing: A Cautionary Tale - https://blog.regehr.org/archives/1269

- Reducers are Fuzzers - https://blog.regehr.org/archives/1284

In terms of software, DeepState (https://github.com/trailofbits/deepstate) may be a good place to start for C and C++. Relevant links:

- Fuzzing an API with DeepState: https://blog.trailofbits.com/2019/01/22/fuzzing-an-api-with-..., https://blog.trailofbits.com/2019/01/23/fuzzing-an-api-with-...

- NDSS 18 paper, "DeepState: Symbolic Unit Testing for C and C++": https://www.cefns.nau.edu/~adg326/bar18.pdf

In terms of choosing among fuzzing solutions, https://blog.trailofbits.com/2018/10/05/how-to-spot-good-fuz... is also worth a read -- as well as the article it refers to, http://www.pl-enthusiast.net/2018/08/23/evaluating-empirical.... For a broad survey, see "The Art, Science, and Engineering of Fuzzing": https://arxiv.org/abs/1812.00140, https://jiliac.com/pdf/fuzzing_survey19.pdf

More resources:

- Effective File Format Fuzzing – Thoughts, Techniques and Results (Black Hat Europe 2016): https://j00ru.vexillium.org/talks/blackhat-eu-effective-file...

- libFuzzer – a library for coverage-guided fuzz testing: http://tutorial.libFuzzer.info, http://llvm.org/docs/LibFuzzer.html, https://github.com/ouspg/libfuzzerfication

- Materials of "Modern fuzzing of C/C++ Projects" workshop: https://github.com/Dor1s/libfuzzer-workshop

- Introduction to using libFuzzer with llvm-toolset: https://developers.redhat.com/blog/2019/03/05/introduction-t...

- Fuzzing workflows - a fuzz job from start to finish: https://foxglovesecurity.com/2016/03/15/fuzzing-workflows-a-...

- Materials from "Fuzzing with AFL" workshop (SteelCon 2017, BSides London and Bristol 2019): https://github.com/ThalesIgnite/afl-training

- Making Your Library More Reliable with Fuzzing (C++Now 2018; Marshall Clow): https://www.youtube.com/watch?v=LlLJRHToyUk, https://github.com/boostcon/cppnow_presentations_2018/blob/m...

- C++ Weekly - Ep 85 - Fuzz Testing - https://www.youtube.com/watch?v=gO0KBoqkOoU

- The Art of Fuzzing – Slides and Demos: https://sec-consult.com/en/blog/2017/11/the-art-of-fuzzing-s...

If anyone, like me, wants to save this comment for later, go to the comment's permalink (https://news.ycombinator.com/reply?id=20830846) and click favorite.

Thank you very much for collating this, the list looks terrific

Awesome list.

For C/C++:

This should help you get started with AFL and more traditional security research workflows:


Consider DeepState for a more development-focused workflow, akin to writing unit tests:


One thing I do occasionally is to push /dev/urandom into functions in various tests. Read n bytes, use it as input. Not quite as extensive as fuzzing libraries, but it works.

Speaking of fuzzing, does anybody know of a solution for fuzzing multi-step processes? Suppose I was fuzzing a network application, which requires an entire session for a bug to be discovered. I can't do that with vanilla afl-fuzz; what tool would enable me to fuzz, say, an SSL/TLS library?

> Suppose I was fuzzing a network application, which requires an entire session for a bug to be discovered

I/O is a general problem for fuzzing and IMO the simplest/most general approach is to try and decompose the code under test to find a part that is able to accept a single input stream.

EDIT: e.g. for an SSL/TLS library -- if you had a

    bool msg_create(msg_t *msg, void *input, size_t len_bytes)
function, you could fuzz that one easily.

For something like an SSL/TLS library, you could have a fuzzing mode that sets the initial randomness to a fixed value. That lets you create a pre-baked input file that can successfully establish a session, and you'd start the fuzzing from there.

You can do the same thing with session IDs or similar. In general when you're fuzzing you want to fix your random seeds (and date/time inputs etc, as well as reset any external data store modified by the program back to initial conditions) - ideally your program being fuzzed is a pure function of its input.

If you want to fuzz a library by making a series of calls, DeepState has special support for that, but that won't handle some other issues with network applications of course.

> In other words, it detects only crashes. We can do much better than this. Assertions and their compiler-inserted friends — sanitizer checks — are another excellent kind of oracle.

I'm trying to understand this one. Isn't a fuzzer useful for catching unpredicted exceptions? It seems a non-zero or premature exit of a program is one of the best scenarios.

Fuzzing can feed all kinds of bad input into a program where the correct result is to print an error message and/or return an error code. E.g. if I feed a malformed PNG into a texture conversion tool, it probably should print an error and exit nonzero, so we can't automatically mark those as test failures.

On the other hand, if my same texture conversion tool hits a debug assertion about a corrupted heap, that should be a test failure. On the one hand, it was kinda predicted - someone wrote a debug assert to catch it after all - but on the other hand, it certainly wasn't supposed to happen, no matter what kind of bad input we were fed. It indicates we're corrupting memory, quite possibly in a way that won't always do the right thing of exiting non-zero, especially if the debug assertions are disabled in release builds.

Even when your encounter assertions in more predictable code, it's often the case that a non-zero premature exit actually wasn't the desired response. Maybe you wanted to gracefully exit with a better error message. Maybe you wanted to log an error and continue execution when your HTTP server recieves bad client requests, instead of creating a denial-of-service CVE. Maybe you had a bug, and your code is mishandling entirely valid input.

If you're using assertions to indicate "I believe this should never happen" instead of a general error reporting tool, it can be very nice to make them test failures when fuzzing.

> If you're using assertions to indicate "I believe this should never happen" instead of a general error reporting tool, it can be very nice to make them test failures when fuzzing.

I used to work for a company that used asserts as general error checking and it was awful. It was for embedded firmware, and I always ran tests (including security tests) with debug firmware. Non-debug firmware had asserts disabled. On debug firmware, a failed assert would cause a complete crash with the display showing the source code file and line number of the failed assert.

Some developers were using asserts on simple user input. For example, this firmware had a web server with a REST API. One of the APIs would expect a number as a parameter, and the person doing development added an assert() statement to make sure the API received a number and not letters or other characters.

It made security testing a nightmare and most of the assertion failures I ran into were completely pointless.

> It made security testing a nightmare and most of the assertion failures I ran into were completely pointless.

Just to clarify for me, what you're saying is that:

1) instead of asserts for invalid input validation, they should have just returned the appropriate error code

2) so that the fuzzing harness could continue with the run rather than just keep dying?

> It was for embedded firmware, and I always ran tests (including security tests)

You had me going there for a minute. lol. :)



Basically, I'd see code that looked like this:

    if (!isNumeric(userInput)) {
        return ERR_INVALID_INPUT;

I'm currently fuzzing some ActiveX controls. Fortunately the top-level code is in JS, so I was able to write a shim to record all the calls and arguments. Then I can replay them to get the control into a valid state before fuzzing individual methods.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact