
Static Program Analysis [pdf] - ingve
https://cs.au.dk/~amoeller/spa/spa.pdf
======
hyperpape
I haven't read it (it's on my list), but I'll pass along a recommendation by
someone well suited to know:
[https://twitter.com/johnregehr/status/1037098838752784384](https://twitter.com/johnregehr/status/1037098838752784384)

~~~
michaelmior
This is the same document the original post links to :)

~~~
hyperpape
Yes, I should've said that. I meant "this isn't some random link posted to HN,
it is well regarded."

------
yazr
Can someone knowledgeable summarize the current state of the art in program
analysis?

Do we know how to do context sensitive / intra procedural analysis and scale
it to millions of lines of code?

~~~
mynegation
I co-created one of the commercial program analysis tools used by many large
customers on millions of lines of code. I have been out of this for a while but
check in on what is going on every now and then. Our analysis was
context-sensitive and inter-procedural (this is probably what you meant to ask
about, as intra-procedural analysis means "within one function/procedure" and I
cannot imagine a function with millions of lines of code).
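
To make "context-sensitive and inter-procedural" concrete, here is a minimal toy sketch in Python. It is not how any real tool works. It assumes a hypothetical program with one function, `identity`, that returns its parameter and is called from two call sites, "a" and "b", with different constants. All names (`CALLS`, `join`, `TOP`) are made up for illustration.

```python
# Toy inter-procedural constant propagation over a hypothetical program:
#   site "a": identity(1)
#   site "b": identity(2)
# where identity(x) simply returns x.
CALLS = [("a", "identity", 1), ("b", "identity", 2)]

TOP = "unknown"  # abstract value meaning "could be any constant"


def join(x, y):
    """Merge two abstract values in a flat constant lattice."""
    return x if x == y else TOP


def context_insensitive(calls):
    """One summary per function: arguments from all call sites are merged."""
    merged = {}
    for _, callee, arg in calls:
        merged[callee] = join(merged[callee], arg) if callee in merged else arg
    # identity returns its parameter, so every site sees the merged value.
    return {site: merged[callee] for site, callee, _ in calls}


def context_sensitive(calls):
    """One summary per (function, call site): nothing is merged across sites."""
    return {site: arg for site, _, arg in calls}


print(context_insensitive(CALLS))  # {'a': 'unknown', 'b': 'unknown'}
print(context_sensitive(CALLS))    # {'a': 1, 'b': 2}
```

The context-sensitive version is more precise but keeps one result per call context, which is where the scaling cost comes from on large code bases.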

First of all: real bugs and security vulnerabilities can be found, and have
been found, with these kinds of tools. But state explosion is real. To deal
with the enormous computational complexity of sound program analysis, corners
must be cut. Program analysis is usually defined as sound (i.e. all warnings
you get are about real bugs, i.e. no false positives) or safe (if there is a
bug, it is found, i.e. no false negatives). You can make an analysis sound but
it will not be safe, or you can make it safe but it will not be sound. Both
extremes are useless: in the former case you end up with very few warnings (if
any) that are definitely real, but you miss a lot of interesting cases; in the
latter case you are inundated with a huge number of warnings, the vast majority
of which are false positives.
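
The two extremes can be sketched with a toy division-by-zero checker built on Python's standard `ast` module. This is an illustration under assumptions, not a real analyzer: the `SOURCE` snippet and the function names are hypothetical, and "sound"/"safe" are used in this comment's sense (other texts use these terms differently).

```python
import ast

# Hypothetical input program; line 1 of the string is empty.
SOURCE = """
a = 10 / 0
b = 10 / 2
c = 10 / n
"""


def divisions(source):
    """Yield every `x / y` expression found in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div):
            yield node


def definite_warnings(source):
    """Warn only when the divisor is the literal 0.

    No false positives, but `10 / n` is never reported.
    """
    return sorted(
        n.lineno for n in divisions(source)
        if isinstance(n.right, ast.Constant) and n.right.value == 0
    )


def possible_warnings(source):
    """Warn unless the divisor is a non-zero literal.

    Flags every division that might divide by zero, including
    `10 / n` even if n can never be 0.
    """
    return sorted(
        n.lineno for n in divisions(source)
        if not (isinstance(n.right, ast.Constant) and n.right.value != 0)
    )


print(definite_warnings(SOURCE))  # [2]
print(possible_warnings(SOURCE))  # [2, 4]
```

On real code the second mode flags nearly every division by a variable, which is the flood of false positives described above; practical tools sit somewhere between the two.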

All commercial tools and most open source tools that I know are neither safe
nor sound but try to hit the sweet spot to be useful.

I believe most progress in program analysis comes from migrating over time to
safer languages (Rust) and runtimes (Java, Go), where different classes of
problems are eliminated or mitigated, like memory leaks with GC, or deadlocks
with coroutines in Go or message passing in Erlang. The proliferation of IDEs
and the ubiquity of lightweight (lint-like) static analysis tools during
development also help.
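
As a rough idea of what a lightweight, lint-like check looks like, here is a small sketch using Python's standard `ast` module. The rule (flagging mutable default arguments) and the function name `mutable_default_args` are chosen purely for illustration and are not taken from any particular linter.

```python
import ast


def mutable_default_args(source):
    """Return (function name, line) pairs whose defaults are list/dict/set literals."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional defaults plus keyword-only defaults (None means "no default").
            defaults = node.args.defaults + [
                d for d in node.args.kw_defaults if d is not None
            ]
            if any(isinstance(d, (ast.List, ast.Dict, ast.Set)) for d in defaults):
                findings.append((node.name, node.lineno))
    return findings


EXAMPLE = "def f(x, acc=[]):\n    return acc\n\ndef g(x, n=0):\n    return n\n"
print(mutable_default_args(EXAMPLE))  # [('f', 1)]
```

A check like this is purely syntactic: it needs no data flow or inter-procedural reasoning, which is why it is cheap enough to run continuously in an editor.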

~~~
yazr
Many thanks for the reply. Yes, I meant inter-procedural.

For a POC project, which (open source) library/framework would be a good
starting point today?

We are focused on a deep, very specific pointer/array analysis. We can target
any of C, C++, or Java. We have varying experience with
compilers/JITs/builders.

(I worked with WALA/Java 5 years ago.)

~~~
mynegation
These are very different languages. For C++ there are not too many open source
tools because C++ is tough. There is cppcheck, but if you want to do something
very custom, starting with clang may be an option. For Java,
FindBugs/SpotBugs/HuntBugs were the most prominent some years ago. Also check:
[https://github.com/mre/awesome-static-analysis](https://github.com/mre/awesome-static-analysis)

