Hacker News
Fuzzification: Anti-Fuzzing Techniques [pdf] (usenix.org)
93 points by kens on Aug 26, 2019 | 36 comments

You know what's probably a better way of making it harder to find bugs in your code using fuzzing? Fuzz it yourself and actually fix the damn bugs.

I love how their justification is:

> Unfortunately, advanced fuzzing techniques can also be used by malicious attackers

So the solution is to make it harder to find bugs instead of finding and fixing them, which affects everybody, not just "attackers". This is like the companies who'll sue or ban researchers for finding bugs in their stuff. Or programmers on HN who criticize researchers for publicizing vulnerabilities when companies refuse to fix them and give them the runaround.

I can already see companies like Apple implementing something like this instead of putting in the required time, effort and money to fix issues, while simultaneously vilifying legitimate security researchers who look for vulnerabilities in their products.

The state-of-the-art analyzers and fuzzers are also really good, especially the ones set up for no false positives. The bar is lower than ever.

In practice I think this is akin to asking someone to write bug free code the first time.

Regardless of this specific paper, assuming all bugs can be found and fixed is not a sound assumption. I'm happy to see techniques that correctly assume undetected bugs exist, and propose solutions to that problem. (Other than asking developers to just find and fix all the bugs, which again, is not possible)

You don't need to find and fix all the bugs to make fuzzing less useful for an attacker - just finding and fixing the low-hanging fruit that's easily detected by fuzzing is enough to make it a lot less productive for everyone else.

One, these guys needed to write a paper. They also can't write a tool that can fix all bugs, but they can write a tool that makes bugs harder for others to find, which is what they did. Last, you never know who else could build on this idea and where they might take it. So relax. :)

Good read, but there is a huge elephant in the room.

In many cases vulnerabilities come from complex, messy, disorganized, or just poorly written code. Assuming the "fuzzification" techniques are implemented by the same developer who wrote the original vulnerable code, how can we be sure that he isn't introducing MORE bugs in his fuzzified version?

For example, if you have a product which is NOT currently vulnerable and add BranchTrap to it the likelihood that you're creating a vulnerability where one never existed before is insanely high.

So you essentially created a code path for the fuzzer that never would have existed if you never tried to fool the fuzzer in the first place.

Assume two binaries, an original and a "fuzzified" version, and an unlimited amount of time: which version would have more vulnerable code paths overall? Probably the fuzzified version, simply because it has more code. The point of producing two binaries is that one can be analyzed quickly and one slowly. But if your original code doesn't contain a vulnerability and the fuzzified version does, your trusted party doesn't mean anything.

This is not insignificant either. Time to fuzz an application no longer matters for the attacker: if he grinds on it long enough, he will have a vuln that the authors aren't aware of (and can't/won't even find now). It doesn't matter if it takes 2 years to find a vuln in your code... Someone will do it.

This is by definition impossible to do without changing the behaviour of the code being 'anti-fuzzed'. This therefore risks introducing bugs which are then much harder to fix, and complicates debugging of issues in the original code which may show up in production.

Note that whether or not that behaviour happens to be undefined in the source language is irrelevant. Undefined behaviour still tends to manifest in a few recognisable ways, which this would then obfuscate.

While an interesting academic exercise, this has negative value to the software industry. I bet it'll still see a lot of use though.

Reminds me a bit of people trying to 'fix' errors with an empty `catch` block...

Hmu for questions! (Daniel Pagan)

I've seen a lot of criticism of this approach - hiding vulnerabilities, instead of actually fixing them. Other mitigations actually prevent exploits (e.g. the combination of NX and ASLR raises the bar for getting code execution: suddenly interactivity and an address leak are required in addition to a stack-based overflow) whereas your mitigation (?) just sweeps bugs under the rug.

From my own skimming of the paper, the discussion of 'why?' boils down roughly to 'exploits are sometimes used for bad things'.

Do you believe your approach will actually improve the security of any systems, or will it just allow lazy vendors to hide their shallow bugs - leaving them to the most motivated (e.g. nation-state) adversaries?

What would you say to someone who was worried fuzzification would introduce additional vulnerabilities? For example, Section 5 reads "To be specific, FUZZIFICATION changes the original condition (value == 12345) to (CRC_LOOP(value) == OUTPUT_CRC) (at line 20)." This is a dangerous bug, since it's trivial to find tons of different values with the desired checksum.

At this point this research is purely academic, so the scope of it was simply to hinder fuzzers while minimizing overhead for users. Like all code, this would also need to be evaluated for correctness (just like your comment!). It introduces new techniques to the already established field of anti-fuzzing.

How do you see your work being used? I only skimmed the paper, but it seems like Fuzzification hinders fuzzing but doesn't actually solve the issues that fuzzing would expose, which makes it seem ripe for misuse by "lazy" teams that aren't willing to fix the underlying issues but see it as a way to obfuscate their vulnerabilities away. For open source projects I can see attackers just fuzzing unmitigated binaries they compile themselves and translating these techniques to release versions, and for closed source projects security is predicated on the original authors running fuzzers themselves and actually acting on the results they find (which has historically been a smaller and less motivated group in fixing vulnerabilities).

There are definitely a lot of considerations that arise with this form of releasing software. It does not fix bad code and should not be used as the only security consideration for a codebase. I could see this being incorporated into a closed source project that has a good testing and code review pipeline as well as regular third party pentests on a release of the software that does not use fuzzification. From a project motivation standpoint, we were examining the internals of fuzzers and found that these techniques could be extremely effective. Right now the project is just academic research and no projects currently use this technology.

I figured you were going in a similar direction as obfuscation-oriented security (eg randomizations) to effortlessly add effort for attackers. I knew you were going to say tested, proprietary applications as a potential target. The ideas were fun to read with clearly plenty of talent going into it. I'm with the naysayers, though, in that it's probably not a great idea or at best might be a good supplement to something else.

If you or anyone else is interested, the strongest approaches when I last looked at everything tried to create full memory safety or do properties like CFI with strong assurance. Softbound+CETS, SafeCODE, and Data-Flow Integrity focused on memory safety with less performance hit. Code-Pointer Integrity with segments had low overhead. Some people were talking about S+C combined with DFI but overhead was crazy. I encourage folks to look at that kind of work, look at whatever [fast] mechanisms are in modern hardware, look at any software improvements in literature, and try to reduce the overhead of techniques like that (esp in combination).

Similar to you, I proposed we use techniques like Softbound+CETS on proprietary, closed applications to increase their security at a performance loss that can be made up for with cheap hardware. And then sell them on that saying security personnel cost way more than a few more rack servers.

Interesting idea. It makes you wonder how to best spend your time: writing better software (i.e. fewer bugs), or investing time in fuzzification (i.e. fewer exploits).

Well, fuzzification is implemented as a compiler pass, so I feel like this is asking whether it's better to invest time writing better software or using -O3. In principle you wouldn't need to spend developer resources on it.

If this ever gets popular, it is just one more reason to strongly prefer free software, I guess, since it can't be "protected" with this.

What prevents free software from using this technique?

Nothing, but attackers can just compile their own versions without these obfuscations to fuzz.

As the other poster said, any such technique would have to be rather easily removable on the source side in order not to be a maintainability nightmare.

I have some bad news for you: there are developers of the Linux kernel who deploy anti-fuzzing techniques (CRCs, which are a surprisingly effective anti-fuzzing technique) and think it's a good idea to reduce the number of people reporting fuzzing bugs: https://marc.info/?l=linux-fsdevel&m=152410224900838&w=2

I didn't draw that sort of conclusion at all from that email: isn't it just saying that the filesystem has a CRC (for reliability, presumably, not because they are trying to deter fuzzers) and they're frustrated that they keep getting fuzzer reports for things that don't pass that check (and don't matter because they're caught by the CRC)? They have a specific suite of CRC-aware fuzzers that they're still using.

The other poster already handled the issue that this is not talking about the same thing, but filesystem-specific CRCs. But why would you think it a good thing if fewer people reported unfixed bugs?

The things people will do to avoid doing actual work...

My opinion: this is an idiotic thing to do. I got triggered in a huge way by this for a number of reasons:

1. An attacker skilled enough to perform black-box binary fuzzing and create a testing harness will only be minimally hindered by anything less than full-blown obfuscation.

2. This approach aims to 'fix' the problem of someone looking at and testing your code; it's hiding (not fixing!) a symptom of the problem: you wrote vulnerable code. Why not spend this effort on improving existing and actually helpful mitigations such as DEP, ASLR, re-linking and pledge systems? Even better, why not rewrite error-prone or complex code in a safe language?!

2.1. This is security through obscurity, the vulnerable code is still there, someone will find it.

3. End users will be inconvenienced because obfuscation implies inefficient code.

4. This widens the gap between white hat research (not funded, often done for reputation) and black hat research which is worth good money for a quality exploit and therefore more sustainable for researchers making it their day job.

When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."


As you can plainly see from the paper, this is a form of full-blown obfuscation --- it's post-compilation program transformation to inject small proof of work functions entwined with program logic, to inject new branches and jump tables, and other things. Their goal isn't to make static analysis impossible, but the underlying CS they're invoking to accomplish their goal is the same as obfuscation.

Nothing that they're doing would be a "minimal" hindrance to people employing fuzzers. Rather, it all seems like an extreme pain in the ass, and it would doubtlessly deter people from fuzzing.

My guess is that the authors likely share your opinion of the utility of anti-fuzzing. This is (literally) a science project, not an advertisement for a product. In the scientific context, that the most popular current-generation fuzzing tools can be straightforwardly retarded through these techniques is an important observation.

1. Yes but think of all the attackers who aren't skilled enough to do that. Making the barrier to entry higher is exactly what you want.

2. Well yeah! If I write vulnerable code I would want to make sure it was as difficult as possible to discover. Who wants easy to discover vulnerabilities? Anyone who I want looking at my code will have the source and can build without obfuscation and fuzzing protection.

2.1 If I'm a big enough target sure, but if I make it as annoying and difficult to work with my software as possible the hope is that you'll go somewhere else.

3. The performance impact is super mild and completely unnoticeable on anything that works in human time. The slight performance hit to protect your IP is a no-brainer calculation. Nvidia famously employs absolutely crazy obfuscation on their chips and they're still the market leader on speed.

4. Companies that use these kinds of techniques don't recognize a difference between the two. If you're not employed or contracted by them and you're looking at their code you are the adversary.

I think it's funny that when we talk about encryption we talk about its strength like "it would take an attacker with a supercomputer X years to brute-force it", but when it comes to things like this we say "security through obscurity! Worthless!" instead of "the total population of people with the skills to even begin to find vulnerabilities is professional security researchers, and they could all fit comfortably in a mid-sized event venue."

Encryption has mathematical guarantees on its resistance against brute-forcing. Security by obscurity just makes it harder for white hats to help you and lets nation states and other determined attackers exploit you secretly.

If by brute force you mean checking the entire key space, fuzzing can be made to sound quite similar to encryption.

There is a specific program path, or "key", that leads to a crash. Fuzzers attempt to search all paths in a somewhat intelligent manner to find the "key" path that leads to a crash.

We can inhibit the fuzzer's ability to find the key by adding more paths to increase the search space. The way in which this paper adds paths takes advantage of the way intelligent fuzzers explore the path space, making it even more difficult to search the entire path space.

The problem is that there isn't a single key in this case, there are many. If there was just a single key to begin with, fuzzing would not be very fruitful for any moderately+ complex program since finding that single path would already be very hard (and therefore has a high cost/benefit).

What fuzzification actually enables is to hide more keys (crashes) from a constant amount of (current) effort. Those crashes are still there and now there are more of them, waiting for someone to find them with an alternative method or simply by buying more fuzzing equipment to scale horizontally.

2./2.1./4. Yes, this is true, and that's exactly where the root of my problem with this direction of research lies. The incentives of software vendors are partially misaligned with those of the users. A vendor will want to maximize their profit; if a tool that implements this research allows them to be lazy, then en masse they will (ab)use it.

3. Software is layered, if every layer of your electron app starts using this type of obfuscation you will notice the slowdown.

1. I think there's an underlying assumption we both make about this research: it will increase the costs of finding exploits, but it will definitively not prevent a skilled attacker.

I also don't think it absolutely disqualifies anyone who's already able to craft an exploit; perhaps it will take them longer, but anyone with the reversing chops to abuse an ASLR weakness can figure out how to patch out or reverse the jump tables.

With that in mind, it comes back to the question:

Is it worth it? Increasing the costs for an attacker while simultaneously adding an incentive for vendors to be lazy and choose a cheap, bolted-on band-aid security approach, as opposed to what we all know is the eventual (expensive) fix for memory-related security problems?

I still think: hell no, don't make this stuff available, invest resources in alternative mitigations if not in outright solving the issue.

> but if I make it as annoying and difficult to work with my software as possible the hope is that you'll go somewhere else.

And yet, Oracle still has some market share.

On 2.1, I'm not sure this is security through obscurity.

I can send a malicious individual a link to this paper saying I used these techniques on my binary, and fuzzing will still be more difficult.

Just like encryption takes my message and jumbles it up to "obscure it", this technique is adding slowdowns and false paths to hinder the fuzzer.

Knowing my message is encrypted doesn't make it any easier to read, just like knowing my binary has been fuzzification-ified doesn't make it any easier to fuzz. The only thing that would help is giving you the key to my message, or telling you which program paths have the added complexity to throw off the fuzzer.

If you have the source code[0], you can compile your own version without fuzzification, and any attacks you find will work just as well on the fuzzified executable. This is not true of eg ASLR. If you don't have the source code, it's security through obscurity.

0: Note that obfuscator output is not source code.
