Hacker News new | past | comments | ask | show | jobs | submit login
Clang Static Analyzer and the Z3 constraint solver (cambus.net)
63 points by fcambus 3 days ago | hide | past | favorite | 9 comments

The Clang Static Analyzer is a great codebase to learn from. A long time ago, before Z3 was open source, I added a constraint manager that used STP for the same objective: https://reviews.llvm.org/D15609

It may be dated by now but there is also a tutorial on writing checkers (https://clang-analyzer.llvm.org/checker_dev_manual.html), or follow one of the many examples in the codebase.

My first checker is still there — it just turned 10 years old, and is under 100 lines without comments. https://github.com/llvm/llvm-project/blob/main/clang/lib/Sta...

I'm very happy with the Clang static analyzer used with Z3 in the recommended way (for verification, not as the main constraint handler) through CodeChecker [1]. We use it with "cross translation unit" (CTU) support, to analyze call flows across files.

As said, there are few false positives. For embedded software the fact that Z3 understand bitfields and bit operations is nice. But in general we found that it makes better use of CTU information than a proprietary tool we're using (and now consider phasing out).

When used with cross compiled software the CC+ClangSA combination can be made to work, but is sometimes a bit rough when there are GCC/Clang subtle incompatibilities. But nothing too bad, and CC provided the hooks to work around any issue we found.

Even in "verification" mode, Z3 also helps the analysis as I understand from the docs. Some cheap Z3 analysis are used to prune some paths, which also helps the classical range analysis. Those tools must make some approximation to deal with the combinatorial explosion, so pruning useless paths early can help exploring others leading to bugs. So Z3, even in the default mode, is not only to filter out false positive but can also help finding issues.

I would highly recommend it for people using C and C++. With those languages the tooling around is very important to (partially) mitigate their known "dangerosity".

[1] https://codechecker.readthedocs.io/en/latest/

Does this use tactics to identify contradictions, tautologies, subsumptions, etc in conditions? How does this work?

A static analyzer does what is called "abstract interpretation". A good introduction on the principles is [1], from pioneers in the field.

Very briefly, the analysis tries to approximate the "states space" for the program, and if it intersect problematic value then it detects a bug. For example, if it can approximate the range of values a variable "x" can take, and this range includes 0 and x is used for division, then a division by zero bug can happen.

When an issue is found, those tools can give you the detailed path leading to the bug, from input down to a call chain to the bug.

The challenge here is to find a good approximation of the possible states. If the tool over-approximate, there will be false alarms. If it under approximate, it will miss bugs. Most tools do a bit of both ;)

[1] https://www.di.ens.fr/~cousot/AI/IntroAbsInt.html

Not all static analyzers use abstract interpretation (they should though to find all issues) because they may implement a rules based approach of checking against "common" bugs, or they may use model checking, or they may use some other form of symbolic execution, or do concolic execution.

I think Patrick Cousot (and others) would argue that most of those are some form of abstract interpretation :)

They're using a terrible example, because I'd prefer that null reference to be caught whether the code is reachable or not.

That's in general not desirable. The conditions that lead to the dereference of the null pointer can be arbitrarily complicated, including coming from templated code or macro expansions. For all you know, the condition in that if statement could very well be the same condition to govern whether the pointer is null or not.

> As far as static analyzers are concerned, one of the most important point to consider is filtering out false positives as much as possible, in order for the reports to be actionable.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact