
Linux Filesystem Fuzzing with American Fuzzy Lop [pdf] - grhmc
http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016.pdf
======
cyphar
Fuzzing is an incredibly useful technique for finding bugs in a codebase. But
it should be noted that BUG() is the _only_ valid response if your state is
invalid. Filesystems should _never_ try to soldier on if their internal state
becomes corrupted -- there lie dragons.

~~~
asdfaoeu
The correct response is to stop writing to the filesystem. They mean BUG as in
it crashes the kernel. Having said that, most Linux machines will not let you
mount filesystems unless you are root or physically present, so these bugs
don't seem like major concerns, although they should be fixed.

~~~
cyphar
> They mean BUG as in it crashes the kernel.

I know what BUG means. There's a reason it exists: to make sure that code in
an invalid state doesn't do something really dangerous. assert() is very
useful.

~~~
casas
It should be noted that we've hit that BUG() assertion on ext4 using the mount
option errors=remount-ro - it should just not be possible to trigger an
invalid state causing your kernel to Oops when you've configured it like this.

See Vegard's discussion with Theodore Ts'o about this:
[http://marc.info/?l=linux-ext4&m=144898400422842&w=2](http://marc.info/?l=linux-ext4&m=144898400422842&w=2)
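
To make the disagreement above concrete, here is a toy model of the three ext4 `errors=` behaviours (`continue`, `remount-ro`, `panic`). It is only a sketch: `ToyFs`, `FsPanic` and the free-block invariant are made up for illustration and have nothing to do with the real kernel code. The point is just that `remount-ro` stops writes without taking the whole machine down, while `panic` is the BUG()-style fail-stop.

```python
import errno


class FsPanic(Exception):
    """Hypothetical stand-in for a kernel BUG()/panic."""


class ToyFs:
    """Toy filesystem with a single invariant: 0 <= free <= total."""

    def __init__(self, total_blocks, free_blocks, on_error="remount-ro"):
        if on_error not in ("continue", "remount-ro", "panic"):
            raise ValueError("unknown error policy: " + on_error)
        self.total_blocks = total_blocks
        self.free_blocks = free_blocks
        self.on_error = on_error
        self.read_only = False

    def check(self):
        """Return True if the metadata is consistent, else apply the policy."""
        if 0 <= self.free_blocks <= self.total_blocks:
            return True
        if self.on_error == "panic":
            raise FsPanic("corrupted block counts")  # fail-stop
        if self.on_error == "remount-ro":
            self.read_only = True  # refuse all further writes
        return False  # "continue" soldiers on with bad state

    def write_block(self):
        """Allocate one block, checking the invariant first."""
        self.check()
        if self.read_only:
            raise OSError(errno.EROFS, "read-only file system")
        if self.free_blocks == 0:
            raise OSError(errno.ENOSPC, "no space left on device")
        self.free_blocks -= 1
```

With a corrupted image (say `ToyFs(8, 9)`), `remount-ro` makes `write_block()` fail with EROFS and leaves the counters untouched, `panic` raises `FsPanic`, and `continue` happily allocates from a count that was already wrong.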

~~~
blumentopf
I'm awed by your work but this is really an abomination:

"Unfortunately, company policy prohibits me from sharing the actual code."
([http://marc.info/?l=linux-ext4&m=145007745502639&w=2](http://marc.info/?l=linux-ext4&m=145007745502639&w=2))

------
realo
Super interesting, but...

We have "Time to first bug" for a lot of file systems (ext4, btrfs, hfsplus,
NTFS) covering a wide range of OSes & platforms ... and yet not a single word
about ZFS?

Come on Oracle... you can do better than that.

~~~
zdw
My understanding is that Oracle's Linux devs were behind btrfs before the Sun
takeover and are thus boosting that. Oracle's lawyers are also highly
litigious on licensing issues, so they probably view ZFS on Linux like lions
circling a carcass.

~~~
SEJeff
Close!

Chris Mason wrote btrfs when he started at Oracle (not Sun):

[http://www.linuxfoundation.org/news-media/blogs/browse/2009/...](http://www.linuxfoundation.org/news-media/blogs/browse/2009/06/conversation-chris-mason-btrfs-next-generation-file-system-linux)

------
Ded7xSEoPKYNsDd
Wow, I did the same thing as part of my Bachelor's thesis recently. I'm glad
they ran into issues similar to the ones I did, although I didn't spend much
time on that part of the work. (It still got me the best results, afl is great
like that.)

I guess I should bump reporting/fixing the issues I found on my todo list.

------
bjackman
Awesome! I work on an embedded project that could benefit _enormously_ from
having AFL run against it, but I've never taken the time to do it, because it
would take several engineer-weeks to even investigate if it could be done.
Their approach to "porting AFL to the kernel" makes me think that yes, it
would cost perhaps an engineer-month but at least the outcome wouldn't be
"nope, not practical". Thumbs up.

~~~
jonhohle
If you know the build system for your project, have some way of getting input
via stdin, and have an existing test corpus, it's surprisingly simple.

I was able to set it up in a few hours for a moderately sized library. Along
with valgrind, I was able to find and fix all of the bugs it uncovered after
over a CPU month of testing.
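
For anyone wondering what "getting input via stdin" looks like in practice, here is roughly the shape of such a harness. It is a sketch under assumptions: `parse_record` is a hypothetical stand-in for whatever library entry point you want to exercise, and it is written in Python only for brevity. AFL itself instruments compiled targets, so a real harness would usually be the same few lines in C (or go through a binding such as python-afl, not shown here).

```python
import os
import sys


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: one length byte, then that many payload bytes."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def fuzz_one(data: bytes) -> bool:
    """Run the target once; return True if the input was accepted."""
    try:
        parse_record(data)
    except ValueError:
        return False  # malformed input rejected cleanly: not a bug
    return True


def main(stream=None) -> None:
    """Read one test case from stdin (or `stream`) and feed it to the target."""
    source = stream if stream is not None else sys.stdin.buffer
    try:
        fuzz_one(source.read())
    except Exception:
        # Anything other than a clean rejection counts as a crash.
        # os.abort() raises SIGABRT, which AFL-style fuzzers record as one.
        os.abort()
```

To use it as an actual harness you would call `main()` at the bottom of the script and seed the fuzzer with files from the existing test corpus. The important design choice is the split between "input rejected" (fine) and "anything unexpected" (abort loudly), since the fuzzer only notices the latter.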

~~~
wyldfire
Arguably llvm's libFuzzer is around the same magnitude of complexity and
delivers similar results.

I used it to create a fuzzer for CPython [1] and it didn't take terribly long
to get something going. The majority of my time has been spent on new test
cases.

[1] [https://bitbucket.org/ebadf/fuzzpy](https://bitbucket.org/ebadf/fuzzpy)

------
okket
Once again Ext(4) shows that its praise for a clean, robust code base is well
deserved...

~~~
hannob
Have we lowered our standards so much that "it took longer to crash it with a
fuzzer" already qualifies for "clean, robust codebase"?

~~~
rictic
Our standards were higher in the past? [citation needed]

There's a trade-off between development cost, performance, functionality, and
correctness. Writing a filesystem that has reasonable performance, never
crashes, and never does the wrong thing is hard.

~~~
hannob
> Our standards were higher in the past?

No, but our fuzzers were worse, so we didn't know :-)

------
boardwaalk
Randomly modifying a filesystem image and then fixing up the checksums seems a
little unfair. Would it not be reasonable to write code that assumes data that
matches its checksum is valid? Isn't that the point of a checksum?

~~~
nabla9
> Would it not be reasonable to write code that assumes data that matches its
> checksum is valid?

If you assume that, you don't need checksums in the first place.

~~~
cmurphycode
How's that? If you go read your data, re-checksum it, and that matches the
original checksum, then you have confidence (to the strength of your checksum
function) that the data is not corrupted.
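
A small sketch of what "modify the image, then fix up the checksum" amounts to, since it shows why a matching checksum says nothing about whether the contents make sense. Everything here is an assumption for illustration: the block layout (a 4-byte little-endian CRC32 followed by the payload) is invented, and `zlib.crc32` is only a stand-in, since real filesystems use their own layouts and checksum functions.

```python
import struct
import zlib

HEADER = struct.Struct("<I")  # toy layout: CRC32 of the payload, then payload


def seal(payload: bytes) -> bytes:
    """Prefix the payload with its checksum."""
    return HEADER.pack(zlib.crc32(payload) & 0xFFFFFFFF) + payload


def verify(block: bytes) -> bool:
    """Return True if the stored checksum matches the payload."""
    if len(block) < HEADER.size:
        return False
    (stored,) = HEADER.unpack_from(block)
    return stored == zlib.crc32(block[HEADER.size:]) & 0xFFFFFFFF


def flip_bit(block: bytes, byte_index: int, bit: int) -> bytes:
    """Flip one bit of the payload, leaving the stored checksum alone."""
    raw = bytearray(block)
    raw[HEADER.size + byte_index] ^= 1 << bit
    return bytes(raw)


def fix_checksum(block: bytes) -> bytes:
    """Recompute the checksum over whatever payload is there now."""
    return seal(block[HEADER.size:])
```

A block from `seal()` verifies, the same block after `flip_bit()` does not, and after `fix_checksum()` it verifies again even though the payload is no longer what was originally written. The checksum catches accidental corruption; it does not stop the parser from seeing a well-checksummed but nonsensical structure.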

------
PaulHoule
I expected ext4 to last longer than the others.

~~~
masklinn
It does, by a fairly large margin (alongside XFS)? The "time to first bug"
table is sorted alphabetically, and the times are humanised, so they are not
all in the same unit.

A reverse time-sort would be

    ext4 (2h)
    XFS (1h45)
    GFS2 (8m)
    NTFS (4m)
    NILFS2 (1m)
    HFS (30s)
    HFS+ & ReiserFS (25s)
    OCFS2 (15s)
    F2FS (10s)
    BTRFS (5s)
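
If you want to redo that sort yourself, the only fiddly part is the mixed units. A quick sketch, using the figures quoted in the list above; the parsing rules (`2h`, `1h45` meaning one hour 45 minutes, `8m`, `30s`) are my reading of how the times are written, not anything specified in the slides.

```python
import re

# Figures as quoted in the list above, in alphabetical (table) order.
TIME_TO_FIRST_BUG = {
    "BTRFS": "5s",
    "ext4": "2h",
    "F2FS": "10s",
    "GFS2": "8m",
    "HFS": "30s",
    "HFS+ & ReiserFS": "25s",
    "NILFS2": "1m",
    "NTFS": "4m",
    "OCFS2": "15s",
    "XFS": "1h45",
}


def to_seconds(text: str) -> int:
    """Convert a humanised duration such as '2h', '1h45', '8m' or '30s'."""
    match = re.fullmatch(r"(\d+)h(\d+)?", text)
    if match:
        hours, minutes = int(match.group(1)), int(match.group(2) or 0)
        return hours * 3600 + minutes * 60
    match = re.fullmatch(r"(\d+)([ms])", text)
    if match:
        return int(match.group(1)) * (60 if match.group(2) == "m" else 1)
    raise ValueError("unrecognised duration: " + text)


def slowest_first(times: dict) -> list:
    """Names ordered from longest to shortest time to first bug."""
    return sorted(times, key=lambda name: to_seconds(times[name]), reverse=True)
```

Calling `slowest_first(TIME_TO_FIRST_BUG)` reproduces the ordering in the list, with ext4 first and BTRFS last.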

------
crb002
Why isn't Oracle running this on the JVM or Oracle DB?

~~~
jerven
Who says they aren't? Oracle DB results would be internal. Where would you
start with AFL on the JVM? Class loader verifier?

------
codys
Using a gcc plugin to instrument code for AFL sounds interesting (and
generally useful for speed). Does anyone know if this plugin's code is
available anywhere?

~~~
aseipp
I don't know about their implementation, but I wrote exactly this plugin for
GCC several months ago and announced it on the afl mailing list, as a patch to
the source. The lack of replies led me to believe it was mostly uninteresting
to people - but maybe I should have advertised it more.

You can find the source code here:
[https://github.com/thoughtpolice/afl/commit/e54c0237e934d734...](https://github.com/thoughtpolice/afl/commit/e54c0237e934d7340d477a837eb891c4fe638b26)

It should not be difficult to update this to work on more GCC versions (I only
tested on GCC 4.8.x), but that will take some #ifdef'ery. Porting to newer
AFLs should be relatively trivial.

EDIT: I initially wrote this for no particular reason, mind you, other than to
play around with writing GCC plugins, and the result wasn't so bad, modulo
nonexistent documentation. I also thought it would be nice to have an
identical equivalent to 'afl-clang-fast' for GCC ('afl-gcc-fast'), in the
hopes that perhaps one day the hacky, sed-inspired backends could be removed
from afl. I initially wanted to use this on a POWER machine as proof of a
portable GCC plugin for afl, although I lost interest in porting to a newer
GCC, before losing access to the machine. Watching afl fly on 176 cores was
fun, though.

------
ericfrederich
Any link to the video presentation?

~~~
ericfrederich
Just noticed this presentation is in the future (or a typo)

------
thrownaway2424
Nice to see that the law firm of Oracle still has a few engineers on staff.

