(Also, I remember reading the author's paper on type analysis in JS: https://cs.au.dk/~amoeller/papers/tajs/paper.pdf, and remember it being really good.)
Do we know how to do context-sensitive, inter-procedural analysis and scale it to millions of lines of code?
First of all: real bugs and security vulnerabilities can be found, and have been found, with these kinds of tools. But state explosion is real. To deal with the enormous computational complexity of program analysis, corners must be cut. Program analysis is usually described as sound (all warnings you get are about real bugs, i.e. no false positives) or safe (if there is a bug, it is found, i.e. no false negatives). You can make an analysis sound, but then it will not be safe; or you can make it safe, but then it will not be sound. Both extremes are useless. In the former case you end up with very few warnings (if any) that are definitely real, but you miss a lot of interesting cases. In the latter case you are inundated with a huge number of warnings, the vast majority of which are false positives.
All commercial tools and most open source tools that I know of are neither safe nor sound, but try to hit the sweet spot where they are useful.
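To make the trade-off above concrete, here is a minimal Python sketch. It assumes a toy world in which program locations are just integers and the ground truth is known, which is never the case for a real tool. All names (ALL_SITES, REAL_BUGS, classify, is_sound, is_safe, and the three example analyzers) are hypothetical and only illustrate the terms "sound" and "safe" as they are used in this comment.

```python
# Toy model, assumed for illustration only: each integer stands for a
# program location that an analyzer could warn about.
ALL_SITES = set(range(10))
REAL_BUGS = {2, 5, 7}  # ground truth; a real tool never has this


def classify(warnings, real_bugs=REAL_BUGS):
    """Return (false_positives, false_negatives) for a set of warnings."""
    false_positives = warnings - real_bugs  # flagged, but not a bug
    false_negatives = real_bugs - warnings  # a bug, but not flagged
    return false_positives, false_negatives


def is_sound(warnings):
    """'Sound' as used above: every warning is a real bug."""
    false_positives, _ = classify(warnings)
    return not false_positives


def is_safe(warnings):
    """'Safe' as used above: every real bug is reported."""
    _, false_negatives = classify(warnings)
    return not false_negatives


# Extreme 1: warn only when certain. No noise, but bugs 2 and 7 are missed.
cautious = {5}

# Extreme 2: warn about everything. Nothing is missed, but 7 of 10
# warnings are false positives.
paranoid = set(ALL_SITES)

# Middle ground: heuristics that accept one false positive (8) and one
# missed bug (7) in exchange for a short, mostly relevant report.
pragmatic = {2, 5, 8}

if __name__ == "__main__":
    for name, warnings in [("cautious", cautious),
                           ("paranoid", paranoid),
                           ("pragmatic", pragmatic)]:
        fp, fn = classify(warnings)
        print(name, "sound:", is_sound(warnings), "safe:", is_safe(warnings),
              "false positives:", sorted(fp), "false negatives:", sorted(fn))
```

The "pragmatic" set is neither sound nor safe in this sense, which is roughly where practical tools end up.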
I believe most progress in program analysis is made by migrating to safer languages (Rust) and runtimes (Java, Go) over time, where different classes of problems are eliminated or mitigated, like memory leaks with GC, or deadlocks with coroutines in Go or message passing in Erlang. The proliferation of IDEs and the ubiquity of lightweight, lint-like static analysis tools during development also help.
For a POC project, which (open source) library/framework would be a good starting point today?
We are focused on a deep, very specific kind of pointer/array analysis. We can target C, C++, or Java. We have varying experience with compilers, JITs, and builders.
(I worked with WALA/Java 5 years ago.)
I am asking because I was thinking about creating such tools myself in the future, preferably going commercial. But at the same time, when I look back at other developers I've known over the years, only about 10% of them used static analysis tools, and below 1% used commercial ones.
Pressure from improving IDE capabilities is also very real.
Your chances of success might be better with something lightweight, integrated with major IDEs and editors, very well polished and filling a specific niche.
One example is the open source tool Infer, which we run on very large bodies of native and Java code at Facebook. http://fbinfer.com