
Scaling Static Analyses at Facebook - dons
https://m-cacm.acm.org/magazines/2019/8/238344-scaling-static-analyses-at-facebook/
======
wallstprog
I think one of the unstated problems with static analysis is just keeping
track of the results. I know that when I started working with these tools, it
was a huge PITA just dealing with the various output files.

That's why I created tools to convert the output from different tools into a
common CSV format that can be databased and used to compare output from
different tools, or from different versions of the code (e.g., after fixing
errors reported by the tools).

These tools currently work with cppcheck, clang and PVS-Studio and can be
found here: [http://btorpey.github.io/blog/categories/static-analysis/](http://btorpey.github.io/blog/categories/static-analysis/)
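
To give a flavor of the normalization step, here's a minimal C++ sketch, assuming clang-style `file:line:col: severity: message` diagnostics on stdin and Unix-style paths (the actual tools at the link handle each tool's native format and the edge cases):

    // csv_norm.cpp -- normalize clang-style diagnostics to CSV (sketch).
    // Build: g++ -std=c++17 -o csv_norm csv_norm.cpp
    // Usage: clang++ -fsyntax-only foo.cpp 2>&1 | ./csv_norm > results.csv
    #include <iostream>
    #include <sstream>
    #include <string>

    // Quote a CSV field, doubling embedded quotes per RFC 4180.
    static std::string csv(const std::string& s) {
        std::string out = "\"";
        for (char c : s) { if (c == '"') out += '"'; out += c; }
        return out + "\"";
    }

    int main() {
        std::cout << "file,line,col,severity,message\n";
        std::string ln;
        while (std::getline(std::cin, ln)) {
            // Expect: path:line:col: severity: message -- anything that
            // doesn't split that way (caret/source-context lines) is skipped.
            std::istringstream is(ln);
            std::string file, line, col, sev, msg;
            if (std::getline(is, file, ':') && std::getline(is, line, ':') &&
                std::getline(is, col, ':') && std::getline(is, sev, ':') &&
                std::getline(is, msg)) {
                if (!sev.empty() && sev[0] == ' ') sev.erase(0, 1);
                if (!msg.empty() && msg[0] == ' ') msg.erase(0, 1);
                std::cout << csv(file) << ',' << line << ',' << col << ','
                          << csv(sev) << ',' << csv(msg) << '\n';
            }
        }
    }

Once everything is in that shape, comparing two runs is just comparing two CSV files.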

~~~
nh_99
Interesting approach. Where I work, we use Jenkins for collecting results.
That way, for each build of our application we have a history of results for
static analysis. Jenkins has good tools for storing and displaying this
information, as well as the ability to show trends over time.

~~~
wallstprog
If Jenkins works for you, great. It does seem to support both clang and
cppcheck, although not PVS-Studio (which is one of the better tools out there
in my experience).

Personally, I'm happier with plain old text files that can be manipulated with
awk, grep, etc., can be databased if needed (since they're CSV files), and can
also be compared using my all-time favorite software, Beyond Compare
([http://btorpey.github.io/blog/2013/01/29/beyond-compare/](http://btorpey.github.io/blog/2013/01/29/beyond-compare/)).

------
nickpsecurity
One of the things I like about this article is that it gives another example
of how formal methods catch deep errors that are unlikely to be caught by
human review or testing:

"Overall, the error trace found by Infer has _61 steps_ , and the source of
null, the call to X509 _ gmtime _ adj () goes five procedures deep and it
eventually encounters a return of null at call-depth 4. "

I think the example Amazon gave for TLA+ was thirty-something steps. Most
people's minds simply can't track 61 steps into software. Tests always have a
coverage issue.
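
To make that concrete, here's a toy C++ sketch of the same bug shape (made-up
names; nothing to do with the actual OpenSSL code): the null originates several
calls deep, so a reviewer reading any one function in isolation sees nothing
wrong.

    #include <cstdio>

    struct Timestamp { long value; };

    // Depth 4: the only place a null can originate.
    Timestamp* depth4(bool fail) { return fail ? nullptr : new Timestamp{42}; }
    Timestamp* depth3(bool fail) { return depth4(fail); }
    Timestamp* depth2(bool fail) { return depth3(fail); }
    Timestamp* depth1(bool fail) { return depth2(fail); }

    int main() {
        Timestamp* t = depth1(true);
        // An interprocedural analyzer tracks the null through four returns;
        // nothing in main() alone hints that t can be null.
        std::printf("%ld\n", t->value);  // null dereference
    }

Now imagine the chain is 61 steps long and crosses a dozen files.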

------
SanchoPanda
> Zoncolan catches more SEVs than either manual security reviews or bug bounty
> reports. We measured that 43.3% of the severe security bugs are detected via
> Zoncolan. At press time, Zoncolan's "action rate" is above 80% and we
> observed about 11 "missed bugs."

> For the server-side, we have over 100-million lines of Hack code, which
> Zoncolan can process in less than 30 minutes. Additionally, we have 10s of
> millions of both mobile (Android and Objective C) code and backend C++ code

> All codebases see thousands of code modifications each day and our tools run
> on each code change. For Zoncolan, this can amount to analyzing one trillion
> lines of code (LOC) per day.

11 "missed bugs" on the 100 mm server-side lines of code per run, or ever?

~~~
m0zg
Also, the main issue with static analysis tools tends to be not false
negatives, but false positives. That is, they churn out tons and tons of
alerts that aren't actually bugs. Some such systems alert so much that they
aren't worth using.

~~~
Matthias247
Yes, that's the main problem with traditional static analysis. No one wants to
review the results, because the signal-to-noise ratio is far too low. It also
doesn't help that it's an optional step, not something enforced by the
compiler.

I think this is where languages with stronger built-in analysis (e.g., Rust)
win: the results are better, and since the analysis always runs as part of
compilation, you never get a huge batch of newly flagged bugs all at once
(like what happens the first time you run Coverity on a legacy C++ codebase).

~~~
apaprocki
From experience on large codebases: get to -Wall -Wextra “clean” in both the
latest versions of GCC and Clang, and then tools like Coverity will produce
much more useful results. At that point the signal is exactly what the tool is
meant to provide: mostly improper error handling, and N-level-deep branches
that go wrong because of an error or bad decision in another file that a human
would never associate with the current call chain or think to look at. To be
fair, the tools also work much better when you spend a little time writing
correct models for the complicated pieces (e.g., custom assertion/error
handling, runtime-supplied vtables, custom allocators, etc.), as sketched
below.
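
For anyone who hasn't written one: a Coverity model is just a stub
implementation that you compile with cov-make-library, using Coverity's
modeling primitives. A minimal sketch for a custom assertion handler
(my_assert_fail is a hypothetical project function):

    // model.cpp -- user model, compiled with cov-make-library.
    // Teaches the analyzer that a failed assertion never returns, so it
    // stops reporting "impossible" paths past a custom ASSERT() macro.

    // Coverity modeling primitive: marks the current path as terminating.
    extern "C" void __coverity_panic__();

    // Hypothetical handler invoked by the project's ASSERT() macro.
    extern "C" void my_assert_fail(const char* expr,
                                   const char* file, int line) {
        __coverity_panic__();  // treated as noreturn by the analysis
    }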

------
sanxiyn
We should start to run Infer on all open source C and C++ code in existence.

~~~
dhekir
Not just Infer; other static analyzers would be useful too.

Hopefully Software Heritage
([https://www.softwareheritage.org](https://www.softwareheritage.org)) will
help with that.

~~~
apaprocki
Coverity scans most open-source software you likely depend on:
[https://scan.coverity.com/projects](https://scan.coverity.com/projects)

------
mhxion
Is there something wrong with ACM's load balancer or whatever? I first managed
to read to the end of the article, but trying to download the PDF showed
"Oops! This website is under heavy load." Now the article page is under heavy
load too.

Edit: It worked again right after I posted this comment.

------
sjtindell
Always cool to read about scale.

