Hacker News | metzmanj's comments

I work on oss-fuzz.

I don't think it's plausible OSS-Fuzz could have found this. The backdoor required a build configuration that was not used in OSS-Fuzz.

I'm guessing "Jia Tan" knew this and made changes to XZ's use of OSS-Fuzz for the purposes of cementing their position as the new maintainer of XZ, rather than out of worry OSS-Fuzz would find the backdoor as people have speculated.


How many oss-fuzz packages have a Dockerfile that runs apt-get install liblzma-dev first?

Had this not been discovered, the backdoored version of xz could have eventually ended up in the Ubuntu version oss-fuzz uses for its Docker image - and linked into all those packages being tested as well.

Except now there's an explanation if fuzzing starts to fail - honggfuzz uses -fsanitize which is incompatible with xz's use of ifunc, so any package that depends on it should rebuild xz from source with --disable-ifunc instead of using the binary package.


This is interesting, but do you think this would have aroused enough suspicion to find the backdoor (after every Ubuntu user was owned by it)? I don't see why this is the case. It wasn't a secret that ifuncs were being used in XZ.

And if that's the case, it was sloppy of "Jia" to disable it in OSS-Fuzz and not do this:

``` __attribute__((__used__,__no_sanitize_address__)) ```

to the XZ source code to fix the false positive and turn off the compilation warning. No attention would have been drawn to this at all, since no one would have had to change their build script.

With or without this PR, it's very unlikely OSS-Fuzz would have found the bug. OSS-Fuzz also happens to be on Ubuntu 20. I'm not very familiar with Ubuntu release cycles, but I think it would have been a very long time before backdoored packages made their way into Ubuntu 20.


>Woha, is this legit or some sort of scam on Google in some way?:

I work on OSS-Fuzz.

As far as I can tell, the author's PRs do not compromise OSS-Fuzz in any way.

OSS-Fuzz doesn't trust user code for this very reason.


It looks more like they disabled a feature of oss-fuzz that would've caught the exploit, no?


That's what people are saying though I haven't had the chance to look into this myself.

Fuzzing isn't really the best tool for catching bugs the maintainer intentionally inserted though.
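
To illustrate the point, here is a minimal sketch, assuming a hypothetical backdoor that only fires on one specific 16-byte key. The names `MAGIC`, `parse`, and `random_fuzz` are made up for this example and have nothing to do with the real XZ code. A blind random fuzzer has roughly a 1 in 2^128 chance per input of reaching the hidden path:

```python
import random

# Hypothetical trigger value, chosen by the attacker and unknown to the fuzzer.
MAGIC = bytes.fromhex("5a3c9e1177d04b28a6f0c2e4913b7d65")


def parse(data: bytes) -> str:
    """Toy parser with an intentionally hidden code path."""
    if data[:16] == MAGIC:
        return "backdoor"
    return "ok"


def random_fuzz(iterations: int, seed: int = 0) -> int:
    """Feed random inputs to parse() and count how often the hidden path is hit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        length = rng.randint(0, 64)
        data = bytes(rng.getrandbits(8) for _ in range(length))
        if parse(data) == "backdoor":
            hits += 1
    return hits
```

A coverage-guided fuzzer can sometimes get past a plain byte comparison like this one; a trigger hidden behind a cryptographic check would be much harder still.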


It's more likely that fuzzing would blow up on new code and they wanted an excuse to remove it.

After all, if it hadn't had a performance regression (someone could submit a PR fixing whatever slowed it down, heh) it still wouldn't be known.


No projects yet, but I bet we'll have some by next week.


I don't think we have plans to build this for now.

I find it a really cool idea, but for now, running fuzzers natively on Google Cloud with ClusterFuzz (https://github.com/google/clusterfuzz) suits our needs.

One challenge for the WASM approach is it will always be at least as hard to build a project for WASM as it is for native.


Right I think WASM offers some nice advantages over native for distributed fuzzing.

It's also worth pointing out that Mozilla made a (non-WASM) distributed fuzzing project, virgo: https://github.com/MozillaSecurity/virgo but it appears to be inactive.


I haven't done a comprehensive study of this but in general I find that fuzzing programs in different environments (e.g. CPU architectures, OSes) tends to find some bugs that won't be found by fuzzing in just one environment.

But in general, I agree a lot of the bugs in WASM apps could be found by fuzzing their native versions.


Great post Guido!

Guido's bignum fuzzer which tests the correctness of math operations in crypto libraries is one of the most interesting fuzzers we run on ClusterFuzz.
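
As a rough illustration of testing the correctness of math operations, here is a sketch of differential testing: run the same operation through two implementations and flag any disagreement. This is not Guido's fuzzer; the function names are hypothetical, and a hand-written square-and-multiply stands in for "the library under test", with Python's built-in `pow` as the reference.

```python
import random


def modexp_reference(base: int, exp: int, mod: int) -> int:
    """Reference result: Python's built-in three-argument pow."""
    return pow(base, exp, mod)


def modexp_under_test(base: int, exp: int, mod: int) -> int:
    """Stand-in for a library implementation: square-and-multiply."""
    result = 1 % mod
    base %= mod
    while exp > 0:
        if exp & 1:
            result = (result * base) % mod
        base = (base * base) % mod
        exp >>= 1
    return result


def differential_fuzz(iterations: int, seed: int = 0) -> list:
    """Return the inputs on which the two implementations disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(iterations):
        base = rng.getrandbits(256)
        exp = rng.getrandbits(256)
        mod = rng.getrandbits(256) + 1  # avoid a zero modulus
        if modexp_reference(base, exp, mod) != modexp_under_test(base, exp, mod):
            mismatches.append((base, exp, mod))
    return mismatches
```

An empty list from `differential_fuzz` means no disagreement was found on the sampled inputs; any entry is a concrete input worth investigating in one of the two implementations.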


There are tools for fuzzing go: https://github.com/dvyukov/go-fuzz

But I think the kinds of bugs found by fuzzing aren't generally security issues in go (I don't know much about go) as they are in C/C++.

EDIT: See guidovranken's excellent sibling comment for how fuzzing can still be useful for go.


Awesome, I will check that out. Thank you!


btw, ClusterFuzz, the infrastructure behind OSS-Fuzz was open sourced today: https://news.ycombinator.com/item?id=19106771


I work on this. Happy to answer questions if people have any.


How does the decision get made at Google to open source something?

I'm always confused about what gets shut down vs open sourced vs paid product.


Hi there! I work in the Google Open Source Programs Office. Echoing what others have said, it's usually just a matter of an engineer or team deciding it's something they want to do. Other times, it's a strategic choice.

We open sourced our open source policies/docs a while back, so if you're so inclined you can dig deeper there. These two links will be of particular interest: https://opensource.google.com/docs/creating/ and https://opensource.google.com/docs/why/


+1

I can speak a little bit about what motivated us.

We saw from OSS-Fuzz (https://github.com/google/oss-fuzz) that this sort of thing could be widely useful and wanted non-open source code to benefit from making fuzzing easier.


Thanks Josh! Love the meta-open sourcing.


I would guess that it has to do with the usefulness of the project outside of Google. This project could be applied to so many other things (as OSS-Fuzz demonstrates), so open-sourcing it makes perfect sense. It isn’t some kind of classified algorithm, either.


Open sourcing is usually pushed from the bottom. People decide they care about open sourcing their project and they push for it.


Any plans to port this from GCE APIs to something more provider agnostic, like k8s? A lot of us would like to fuzz on on-prem equipment.


We would like to support this use case.

For now, you can actually do the fuzzing on prem while communicating with App Engine. We do this for our OS X bots since GCE doesn't offer OS X.


The other possibility for completely on-prem use right now is running it using the dev server: https://google.github.io/clusterfuzz/getting-started/local-i...


Hi, good stuff! I am curious to know why it was primarily written in Python (according to GitHub 83.3%)?


Generally speaking, code that does orchestration and testing is often easier to write in a dynamic scripting language than in something that is built and compiled, even if it winds up as a custom DSL. I think Python is one of the better options here for its broader community support and tooling.

Aside: I tend to reach for node/js often for similar reasons (despite detractors) mostly because I'm more comfortable with it over Python or Ruby, but also because it's already integrated to most of the build/test environments I'm working on anyway.


Python is great for writing glue code, so I'm not particularly surprised by the language choice.


I’m curious why you seem to be surprised? It’s one of the most popular languages (even within Google).


That's one of the main languages at Google. And they have been moving towards more Go for a few years.


I am quite ignorant on this subject. I looked briefly through the docs, and still feel a little lost. So before I go too much further, would it be possible to use this for web apps or unity games?


>So before I go too much further, would it be possible to use this for web apps or unity games?

Web apps, almost certainly no.

ClusterFuzz (and fuzzing generally) is most useful for finding bugs in C/C++ code so maybe it could work for unity games? I don't know much about them though.


I thought unity was mostly written in either C# or an EcmaScript flavour?


Unity itself (the engine) is written in C++. Game scripts are written in either C# or UnityScript iirc.


Although the engine is currently written in C++, they are in the process of rewriting parts of it in C#, with help of their HPC# subset and Burst compiler, having some ex-Insomniac Games developers like Mike Acton on the team.


Is this the fuzzing tool that was used to find bugs in Bitcoin? I don't see it in the repository.


I don't think so.

But solidity was recently added to OSS-Fuzz: https://github.com/google/oss-fuzz/tree/master/projects/soli...


Keep up the good work, Johnathan! This is a great feature.


Congrats, Jonathan!


Thanks Tanin!


How does it compare to afl?


It uses AFL.

ClusterFuzz is infrastructure for running fuzzers, so we use it to run AFL, libFuzzer, and other domain specific fuzzers we've written.

Using it to run AFL gives us a lot of nice things over using AFL on someone's desktop (such as crash deduplication, automatic issue filing, fixed testing, regression ranges etc.)
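
To give a feel for what crash deduplication means, here is a minimal sketch, assuming crashes are grouped by their top few stack frames. This is an assumption made for illustration, not a description of how ClusterFuzz actually does it; `crash_signature`, `deduplicate`, and the report format are hypothetical.

```python
from collections import defaultdict


def crash_signature(stack_frames, top_n=3):
    """Hypothetical signature: the top N frames of the crashing stack."""
    return tuple(stack_frames[:top_n])


def deduplicate(crashes, top_n=3):
    """Group crash reports (dicts with 'id' and 'frames') by signature.

    Returns a dict mapping each signature to the list of crash ids sharing it.
    """
    groups = defaultdict(list)
    for crash in crashes:
        groups[crash_signature(crash["frames"], top_n)].append(crash["id"])
    return dict(groups)
```

With this scheme, two reports whose stacks start with the same three frames land in one group, so only one issue would need to be filed for them.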

