
American fuzzy lop – a security-oriented fuzzer - pcr910303
http://lcamtuf.coredump.cx/afl/
======
XMPPwocky
AFL is neat because it's conceptually simple, but practically revolutionized
fuzzing, especially for non-specialists. The basic algorithm is

    
    
      def run_test_case(input):
        # return a set() of instructions that the target executes when ran on `input`
        # or throw a CrashException if the target crashes
      
      def mutate(input):
        # mess with the input- flip some bits, delete chunks, set things to 0xffffffff... randomly
        # return the mutated input
    
      def fuzz(initial_test_cases):
        test_cases = initial_test_cases
        coverage_seen = set()
        # collect coverage from the initial inputs
        for case in test_cases:
          coverage_seen += run_test_case(case)
    
        while True:
          fuzzed = mutate(random.choice(test_cases))
          try:
            new_coverage = run_test_case(fuzzed) - coverage_seen
            if new_coverage:
              # ooh, this input did something we've never seen before!
              # save it, so it can be used as a starting point
              # for even more mutation
              test_cases.add(fuzzed)
              coverage_seen += new_coverage
          except CrashException:
              # we successfully crashed the target!
              # save fuzzed off to disk or something and log a happy message
    

There's more to it in practice, of course- for example, run_test_case doesn't
return a set of instructions, it's a bitmap / psuedo-Bloom filter of _basic
blocks_ hit. (A basic block, to a first approximation, is "a sequence of
instructions that doesn't have any unusual control flow" \- so if you run the
first instruction in a basic block, you'll run all the rest.) And there's a
fair bit of complexity involved for performance reasons.

But the core algorithm is simple enough that you can actually implement it in
pure Python, for Python programs, in < 100LOC - and that implementation will
find real bugs in a lot of Python libraries.

~~~
XMPPwocky
And speaking of its simplicity - the original code to instrument programs
involves, uh... doing _this_ on assembly files.
[https://github.com/google/AFL/blob/master/afl-
as.c#L284](https://github.com/google/AFL/blob/master/afl-as.c#L284)

Makes a pretty good case for "worse is better", if you like.

~~~
f-
(Author here)

The funniest part is that this ugly hack kept working across platforms for
many years; whereas when somebody else implemented a "proper" integration with
the clang / llvm API, their solution proved to be extremely fragile. The API
wasn't stable between compiler versions, and because it wasn't really used
much, it had all kinds of bugs, including being outright unusable at times.

Also, most distros packaged clang in a way that made it impossible to compile
the plugin, because of missing or mismatched headers, missing companions
tools, etc. So you had to download and rebuild the whole compiler, which took
hours (and that's if you didn't get stuck in a dependency hell).

So yeah, this was very much a lesson in "worse is better".

~~~
touisteur
\+ it allows working with GCC-based compilers (GNAT for example) and mixed
C/Ada code bases.

Although the instrumentation via asm patching works well in most cases, it can
break down in strange ways. See (shameless plug) :
[https://blog.adacore.com/running-american-fuzzy-lop-on-
your-...](https://blog.adacore.com/running-american-fuzzy-lop-on-your-ada-
code)

Amazing tool, quite extensible (adapted it to work only in memory+tmpfs
without touching disk - specific corner-case... Very easily) and readable.
It's also funny to scale to multiple cores and multiple machines.

------
alltakendamned
I suggest using AFLPlusPus instead, a maintained fork of AFL with a lot of
community patches further extending it's performance and capabilities.

[https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-
thc/AFLplusplus)

AFL itself hasn't seen updates in years.

~~~
f-
One of my design goals for AFL was to make it very simple to use - because
there's plenty of fuzzers that work OK when you dial in 50 knobs just right,
but fail spectacularly otherwise - basically ensuring that nobody but the
author can really use the tool to its full capacity.

While AFL++ is cool, it sort of ditches that philosophy, giving you a lot of
options to tweak, but not necessarily a whole lot of hope that you're going to
tweak them the right way. So, that's one gotcha to keep in mind.

~~~
jonhohle
This shouldn't be understated. I went from having never run a fuzzer to having
a 16-node run executing for thousands of machine hours on a somewhat
unfamiliar C++ codebase with minimal effort.

Both the tool and the documentation made it easy for me jump in, identify
bugs, write new test cases, implement a fix, and verify the fix passed without
issue. I've mentioned it on HN before, but AFL taught me how incredibly
difficult it is, even for experts (think most senior engineers at a FAANG) in
the field, to write C++ without security vulnerabilities. I was even able to
find and fix bugs which were previously reported but no one was able to
reproduce reliably.

If there was an AFL t-shirt, I'd wear it ;-)

------
Animats
By the way, a "lop" is a lop-eared rabbit, one where the ears droop instead of
being upright. The American fuzzy lop is an rather obese domesticated breed,
for what that's worth.

~~~
tomashubelbauer
Holy! I've been calling this software American fuzzy_loop_ for years without
ever noticing my mistake.

------
toasted_flakes
The project home page [1] has a lot of information and a "bug trophy case"

[1] [http://lcamtuf.coredump.cx/afl/](http://lcamtuf.coredump.cx/afl/)

~~~
VBprogrammer
Michal Zalewski is one of those people who, by one method or another, seems to
be able to do 100x what most people manage to do with their life. I still go
back and read his introduction to CNC machining / building robots every year
or two. I still get something new out of it every time.

[http://lcamtuf.coredump.cx/gcnc/](http://lcamtuf.coredump.cx/gcnc/)

------
jkaptur
AFL might be security-oriented, but I think it should be part of the testing
strategy for any software that reads input.

Right now at work I'm modifying an old C++ codebase, and knowing that AFL
can't crash it is just one of the main things letting me sleep at night. (When
I first set it up, it almost immediately found two minor bugs that were
obvious _in retrospect_ ).

------
ris
Coincidentally I've actually spent the last couple of weekends putting
together a little toolkit to allow fuzzing of cpython bytecode using
AFL/AFL++:
[https://github.com/risicle/cpytraceafl](https://github.com/risicle/cpytraceafl)

------
dang
Previous threads:

[https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...](https://hn.algolia.com/?dateRange=all&page=0&prefix=false&query=%22american%20fuzzy%20lop%22%20comments%3E1&sort=byDate&type=story)

[https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...](https://hn.algolia.com/?dateRange=all&page=0&prefix=false&query=AFL%20comments%3E10&sort=byDate&type=story)

I think there have been many more even than these, but can't think of how else
to search for them.

------
devy
Why isn't the site that hosts security research tool downloads is NOT an HTTPS
site? ¯\\_(ツ)_/¯

Chrome browser will rendering non-HTTPS sites as "not secure". And the
download links are local.

[http://lcamtuf.coredump.cx/afl/](http://lcamtuf.coredump.cx/afl/)

