I'm always confused about what gets shut down vs open sourced vs paid product.
We open sourced our open source policies/docs a while back, so if you're so inclined you can dig deeper there. These two links will be of particular interest: https://opensource.google.com/docs/creating/ and https://opensource.google.com/docs/why/
I can speak a little bit about what motivated us.
We saw from OSS-Fuzz (https://github.com/google/oss-fuzz) that this sort of thing could be widely useful and wanted non-open source code to benefit from making fuzzing easier.
For now, you can actually do the fuzzing on-prem while communicating with App Engine. We do this for our OS X bots, since GCE doesn't offer OS X.
Aside: I tend to reach for Node.js for similar reasons (despite detractors), mostly because I'm more comfortable with it than with Python or Ruby, but also because it's already integrated into most of the build/test environments I'm working on anyway.
Web apps, almost certainly no.
ClusterFuzz (and fuzzing generally) is most useful for finding bugs in C/C++ code, so maybe it could work for Unity games?
I don't know much about them though.
But Solidity was recently added to OSS-Fuzz: https://github.com/google/oss-fuzz/tree/master/projects/soli...
ClusterFuzz is infrastructure for running fuzzers, so we use it to run AFL, libFuzzer, and other domain-specific fuzzers we've written.
Using it to run AFL gives us a lot of nice things over running AFL on someone's desktop (such as crash deduplication, automatic issue filing, verification of fixes, regression ranges, etc.)
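For anyone wondering what one of those libFuzzer targets looks like, here's a minimal sketch (my illustration, not actual ClusterFuzz code; the toy parser and its bug are made up):

    // Minimal libFuzzer target; build with: clang++ -g -fsanitize=fuzzer,address fuzz.cc
    #include <cstdint>
    #include <cstddef>

    // Toy function under test: reads an offset byte, then indexes the payload.
    static int parse(const uint8_t *data, size_t size) {
      if (size < 2) return 0;
      uint8_t offset = data[0];
      return data[1 + offset];  // BUG: offset can point past the end of the buffer
    }

    // libFuzzer calls this once per generated input; ASan turns the
    // out-of-bounds read into a reported crash, which infrastructure like
    // ClusterFuzz can then deduplicate, file, and bisect.
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      parse(data, size);
      return 0;
    }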
I'm super pleased to see this! Abhishek and the ClusterFuzz team were one of our initial customers for Preemptible VMs, still are, and make for a great example. Congrats to the team!
Even in the absence of memory corruption bugs, there is a class of bugs that can emerge in any general-purpose language: slowness/hangs, assertion failures, panics, and excessive resource consumption.
Beyond those, you can detect invariant violations; (de)serialization inconsistencies (e.g. deserialize(serialize(input)) != input); and differing behavior across multiple libraries whose semantics must be identical (cryptocurrency implementations are notable in this regard, as deviation from the spec or canonical implementation in the execution of scripts or smart contracts can lead to chain splits).
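A round-trip target for that (de)serialization check can be as small as this; the escaping codec here is a made-up stand-in for whatever format you actually care about:

    #include <cstdint>
    #include <cstddef>
    #include <string>
    #include <cassert>

    // Toy codec: escapes 0x00 and 0xFF bytes behind an 0xFF marker.
    static std::string serialize(const std::string &in) {
      std::string out;
      for (unsigned char c : in) {
        if (c == 0x00 || c == 0xFF) { out.push_back((char)0xFF); out.push_back((char)(c ^ 0x20)); }
        else out.push_back((char)c);
      }
      return out;
    }

    static std::string deserialize(const std::string &in) {
      std::string out;
      for (size_t i = 0; i < in.size(); ++i) {
        unsigned char c = in[i];
        if (c == 0xFF && i + 1 < in.size()) out.push_back((char)((unsigned char)in[++i] ^ 0x20));
        else out.push_back((char)c);
      }
      return out;
    }

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      std::string input(reinterpret_cast<const char *>(data), size);
      // The invariant under test: deserialize(serialize(x)) == x.
      assert(deserialize(serialize(input)) == input);
      return 0;
    }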
With some effort you can do differential 64-bit/32-bit fuzzing on the same machine, and I've found interesting discrepancies in how JSON parsers interpret numeric values, which makes sense if you think about it (size_t has a different size on each architecture, causing the 32-bit parser to truncate values). This might be applicable to every language that, like Go, does not guarantee type sizes across architectures (not sure?), but I haven't tested that yet.
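The truncation mechanism in isolation, for the curious (my sketch; compile the same file with -m32 and -m64 and compare the output):

    #include <cstdint>
    #include <cstdio>

    int main() {
      uint64_t parsed = 5000000000ULL;  // a JSON number needing more than 32 bits
      size_t stored = (size_t)parsed;   // wraps modulo 2^32 when size_t is 4 bytes
      printf("sizeof(size_t)=%zu stored=%zu\n", sizeof(size_t), stored);
      // 64-bit build: stored=5000000000; 32-bit build: stored=705032704
      return 0;
    }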
You can detect path escapes/traversal (which is entirely language-agnostic but potentially severe) by asserting that any absolute path that is ever accessed within an app is a legal one, or by fuzzing a path sanitizer specifically.
And so on.
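To make the path-escape idea concrete, here's a sketch of fuzzing a path sanitizer; the sanitizer is a deliberately naive stand-in (a single left-to-right pass) so the fuzzer has something to find:

    #include <cstdint>
    #include <cstddef>
    #include <string>
    #include <cassert>

    // Naive sanitizer: strips "../" in one left-to-right pass.
    // BUG: erasing a match can splice a new "../" together earlier in the
    // string (e.g. "....//" collapses to "../"), and it is never rescanned.
    static std::string sanitize_path(std::string p) {
      size_t pos = 0;
      while ((pos = p.find("../", pos)) != std::string::npos)
        p.erase(pos, 3);
      return p;
    }

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      std::string input(reinterpret_cast<const char *>(data), size);
      // The language-agnostic invariant: no sanitized path may still
      // contain a parent-directory escape.
      assert(sanitize_path(input).find("../") == std::string::npos);
      return 0;
    }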
Code coverage is the primary metric used in fuzzing, but other metrics can be useful as well. I've experimented extensively with metrics such as allocation, code intensity (the number of basic blocks executed, which helped me prove that V8's WASM JIT compiler can be subjected to inputs of average size that take >20 seconds to compile), and stack depth.
Any measurable quantity can be used as a fuzzing metric, for example the largest difference between two variables in your program.
Let's say you have a decompression algorithm that takes C as an input and outputs D. Calculate R = len(D) / len(C), so that R is the ratio of decompressed output to compressed input. Use R as a fuzzing metric and the fuzzer will tend to generate inputs with a high decompression ratio, possibly leading to the discovery of decompression bombs.
Wrt. this, I believe libFuzzer now also natively supports custom counters.
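Here's roughly what wiring the decompression-ratio metric into libFuzzer's extra-counters mechanism could look like (Linux-only; the run-length "decompressor" is a made-up stand-in for a real codec):

    #include <cstdint>
    #include <cstddef>
    #include <string>

    // One byte of user-supplied signal: libFuzzer treats counters placed in
    // this section as extra coverage, so inputs that push the ratio higher
    // get kept in the corpus and mutated further.
    __attribute__((section("__libfuzzer_extra_counters")))
    static uint8_t RatioCounter[1];

    // Toy run-length decoder standing in for the real codec under test.
    static std::string decompress(const uint8_t *data, size_t size) {
      std::string out;
      for (size_t i = 0; i + 1 < size; i += 2)
        out.append(data[i], static_cast<char>(data[i + 1]));  // count, byte
      return out;
    }

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      if (size == 0) return 0;
      std::string out = decompress(data, size);
      size_t ratio = out.size() / size;  // R = len(D) / len(C)
      RatioCounter[0] = ratio > 255 ? 255 : static_cast<uint8_t>(ratio);
      return 0;
    }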
Based on Rody Kersten's work, I implemented libFuzzer-based fuzzing of Java applications supporting code coverage, intensity, and allocation metrics, and it should not be difficult to plug this into ClusterFuzz/OSS-Fuzz.
Feel free to get in touch if you have any questions or need help.
Guido's bignum fuzzer, which tests the correctness of math operations in crypto libraries, is one of the most interesting fuzzers we run on ClusterFuzz.
Edit: Just realized I didn't quite address your question fully. https://pentest-tools.com/home is an online service that will run tests, including URL fuzzing and whatnot. All of the features they offer can also be found in open source and proprietary software. Not sure about saving failed tests as Selenium tests for re-running in the future, though I imagine you'd just re-run the same tool in the first place.
My bignum-fuzzer project runs on OSS-Fuzz and tries to find mismatches between bignum computations across different libraries (OpenSSL, Go, Rust, etc.). This is one example of how fuzzing can be useful even if the underlying language is "safe".
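The shape of such a harness, sketched (not the actual bignum-fuzzer code; here OpenSSL's BIGNUM is checked against Boost.Multiprecision on a single addition):

    #include <openssl/bn.h>
    #include <boost/multiprecision/cpp_int.hpp>
    #include <cassert>
    #include <cstdint>
    #include <string>

    // Build with: clang++ -fsanitize=fuzzer fuzz.cc -lcrypto
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      if (size < 2) return 0;
      size_t half = size / 2;

      // Operand A = first half of the input, operand B = second half,
      // both interpreted as big-endian byte strings.
      BIGNUM *a = BN_bin2bn(data, (int)half, nullptr);
      BIGNUM *b = BN_bin2bn(data + half, (int)(size - half), nullptr);
      BIGNUM *r = BN_new();
      BN_add(r, a, b);
      char *dec = BN_bn2dec(r);
      std::string openssl_result(dec);

      // Same computation in Boost.Multiprecision.
      namespace mp = boost::multiprecision;
      mp::cpp_int ba, bb;
      mp::import_bits(ba, data, data + half);
      mp::import_bits(bb, data + half, data + size);
      std::string boost_result = mp::cpp_int(ba + bb).str();

      // A mismatch means a bug in (at least) one of the two libraries.
      assert(openssl_result == boost_result);

      OPENSSL_free(dec);
      BN_free(a); BN_free(b); BN_free(r);
      return 0;
    }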
With some small hacks you can also have Go code coverage instrumentation as a libFuzzer counter.
If you're using goroutines you may want to consider fuzzing with the race detector.
But I think the kinds of bugs found by fuzzing aren't generally security issues in Go (I don't know much about Go) the way they are in C/C++.
See guidovranken's excellent sibling comment for how fuzzing can still be useful for go.
More info: https://github.com/rust-fuzz
Disclaimer: I am the author of the rust fuzzer honggfuzz-rs.
My guess is that they'd use Rust for new code.
Besides, as much as I agree with the benefits of a memory-safe language, and as much as I believe in the urgency of promoting technologies like Rust, C and C++ are going to be part of our lives for a long time.
If you gotta use a knife, make sure you have the tools to keep it sharp.
It makes me wonder why Google wouldn’t put their efforts into using Rust, for example.
Of course, server power is cheap, but not for our planet.
Even if all new code was written in safe languages we would still need to do fuzzing until all legacy code was rewritten -- operating systems, SSL libraries, browsers, load balancers, etc.
It's getting there.
My C programmer friends like the dangerous features of the language and want to use them. Their programs are designed around the assumption that they can mutate anything in memory whenever they want to. They use mutable global variables. They would strongly dislike Rust's memory protections and ownership concept, and its other safety features.
Integrating a new language into an existing codebase is a lot of work; it would be great if Chromium started using it, but I also understand why it may be a harder sell there. Engineering is all about tradeoffs.
Not to say there isn't work on replacement efforts, only that testing what is in use is a practical thing, even if the scale is surprising.
Well, some newer projects in Chrome/Chromium are written in Rust, so for what it's worth there is some effort.
I believe just yesterday the security team published a document which recommended avoiding unsafe languages for safety critical code: https://chromium.googlesource.com/chromium/src/+/master/docs...
But apologies, perhaps I confused Chromium with Chromium OS: https://chromium.googlesource.com/chromiumos/docs/+/master/r...
Rust is listed a little higher up in that link, in the section about avoiding unsafe implementation languages. Given that they list it as an alternative to the things to avoid, I would assume that means it is good, but that is no sure thing.
C++ static analysis is helpful, but it is not memory safe. Modern C++ doesn't really help. In fact, in my opinion, modern C++ tends to be less memory safe than "classic" C++, because it adds all sorts of new fun ways to get use-after-free.
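One concrete example of those "new fun ways" (my illustration, not the commenter's): std::string_view is a non-owning view, and nothing stops it from outliving the string it points into.

    #include <iostream>
    #include <string>
    #include <string_view>

    static std::string make_greeting() {
      return "hello, a string long enough to defeat the small-string optimization";
    }

    int main() {
      std::string_view v = make_greeting();  // the temporary string dies here
      std::cout << v << "\n";                // use-after-free; ASan flags it
    }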
That's usually because the "low-hanging fruit" is mostly gone due to protections like stack canaries, CFI checks, address randomization, and sandboxing, which makes things like buffer overflows or jumping to shellcode more difficult.
I think modern C++ is much safer than legacy C++, not by disallowing the USE after free, but by giving you the tools to make sure that the FREE happens at an appropriate time. A lot of this is not necessarily a feature of the language itself, but of idioms the community seems to agree on, i.e. the CppCoreGuidelines. I'm thinking of (a tiny sketch follows the list):
- never use the delete operator manually, prefer std::unique_ptr
- prefer single ownership over shared ownership
- prefer immutable over mutable state
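A sketch of the first two idioms in practice (mine, not the parent's):

    #include <memory>
    #include <string>
    #include <utility>

    struct Session { std::string user; };

    int main() {
      // Instead of: Session *s = new Session; ... delete s;
      auto session = std::make_unique<Session>();
      session->user = "alice";

      // Single ownership: it moves, it is never silently shared.
      std::unique_ptr<Session> owner = std::move(session);
    }  // 'owner' frees the Session here, exactly once, at a well-defined point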
I agree with you that compared to Rust, this places a lot of burden on code reviewers and is in no way perfect, but I wouldn't say it's going in the wrong direction, making the language less safe.
Additionally, it's not really shared ownership that causes problems; it's the fact that references (and pointers) make it possible to violate ownership semantics. References are used all the time in C++, and you can't use any libraries without them. It doesn't matter if data has a single owner if that owner is deleted and you still hold references to the data. In fact, shared ownership via shared_ptr is safer than unique_ptr, from a UAF point of view.
Shared ownership makes it much harder to reason about lifetimes, increasing the chance of mistakes.
> shared ownership via shared_ptr is safer than unique_ptr, from a UAF point of view.
Only if the dereferencing code holds the shared_ptr. It's also common to pass non-owning pointers to objects held by shared_ptr.
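For instance (my sketch), the object's lifetime is tied to the shared_ptr, not to raw pointers derived from it:

    #include <cstdio>
    #include <memory>

    struct Config { int timeout = 30; };

    static const Config *g_config = nullptr;  // non-owning pointer

    int main() {
      auto cfg = std::make_shared<Config>();
      g_config = cfg.get();  // a raw pointer escapes the owner
      cfg.reset();           // last shared_ptr dies, the Config is freed
      std::printf("%d\n", g_config->timeout);  // use-after-free
    }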
> When people use the delete operator manually and mess up, do they usually delete objects too early or too late?
As you pointed out, problems arise if objects are deleted too early or twice. Double free is common in C and can be exploited as well...
Well, if it isn't getting any safer than the time-disproven technique of relying on human behaviour, and there are automatically safer alternatives, then it is less safe than those alternatives.
Modern C++ is safer, but only if you can ensure that everyone on the team never reaches for C-style tricks, and the same goes for the libraries you link against.
Additionally, that example doesn't seem to be a resource leak in the sense of a memory leak. The exact same problem occurs if the file descriptor is purposefully used after the 'system' call (that is, it isn't being leaked by never being used again). That's more an issue with 'system'/'fork' implicitly inheriting file descriptors than with a resource being leaked.
Also note that Rust has contained a mitigation for that issue for a long time now: https://github.com/rust-lang/rust/pull/24034
> - during the GSubprocess discussion, I originally held the opposite opinion, but eventually became convinced (by Colin) to see the inherit-by-default behaviour of exec() as nothing more than a questionable implementation detail of the underlying OS. Consequently, at the high level, GSubprocess provides an API that gives the caller direct control over what is inherited and what is not, and that's just the way that it should be.
> - this behaviour is not limited to GSubprocess. Closing all fds before calling exec() is a common practice in modern libraries and runtimes, and for good reason.
That's silly. The standard library of new systems programming languages should do the safe thing, not inherit all the mistakes of the OS that the OS can't fix due to backwards compatibility. If Unix were being designed today, I am certain O_CLOEXEC would have been the default. Besides, Rust's behavior matches what happens on Windows, and cross-platform consistency in the standard library is desirable.
In any case, what glib does matches what the Rust standard library does. The latter uses posix_spawn() directly in order to make sure no file descriptors are passed between parent and child. There is an API available to request file descriptor sharing explicitly.
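For reference, the difference at a call site is one flag (plain POSIX, nothing Rust- or glib-specific):

    #include <fcntl.h>
    #include <unistd.h>

    int main() {
      // Inherited by any child that exec()s, unless every call site
      // remembers the flag; this is the historical Unix default being
      // criticized above.
      int leaky = open("/etc/hostname", O_RDONLY);

      // Closed automatically across exec(); the behaviour Rust's standard
      // library and GSubprocess give you by default.
      int safe = open("/etc/hostname", O_RDONLY | O_CLOEXEC);

      close(leaky);
      close(safe);
      return 0;
    }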