A gentle introduction to static analyzers for C (nrk.neocities.org)
112 points by signa11 7 months ago | hide | past | favorite | 49 comments



When you start thinking about how to write correct C code, there are a few steps along the way. Glorified linters are a great first step; in the best case, they prove useful safety properties of code (no access to uninitialized variables, etc). From there, static analyzers start proving more complex properties -- no stack overflow, for example. The final step along this journey is to proof assistants, proving arbitrary properties not just with regards to safety, but correctness.

One often overlooked part is CompCert [1], a pretty-much-verified C compiler. All of the tools mentioned above prove properties on the semantics of the C code that is written. These properties are only conserved in practice if the translation of the C code to equivalent-semantic machine code is correct, and the execution of the machine code is correct with regards to its modeled semantics. The first part of this chain of trust is where CompCert slots in; for the second part, in practice, we are at the mercy of our silicon vendors.

https://compcert.org/compcert-C.html


CompCert is a true tour de force in what can be done with formal methods, but my experience is that almost no one is actually in a position where semantic equivalence is a major source of issues for them. I more commonly see it brought up internally as a way for people to avoid facing the larger issues traditionally plaguing high-assurance C development that aren't as simple as "just" changing the compiler.


> almost no one is actually in a position where semantic equivalence is major source of issues for them

Well, they don't necessarily know if it is or not...

When you're dealing with safety-critical software (DO-178, etc.), where the cost of verification and validation can easily be 100x - 1000x the cost of writing the code, having traceability from C code to machine code be correct by proof instead of by a manual process can be a lifesaver. Are a significant number of bugs actually excluded this way? That basically comes down to how much the defect rate at the C code level has already been reduced by other activities.

It's hard for me to imagine working on DAL B or DAL A code and not at least seriously considering using CompCert just to reduce the scope of necessary verification activities a bit.


As you alluded to... How many compiler bugs materially affect the C->ASM bit of the pipeline rather than the C source code just being incorrect? I'd wager it's a tiny ratio, but I honestly don't know. Could you point to any research on this, if you happen to have links handy?

I could imagine even 0.01% being significant for e.g. space missions or (if we're being optimistic) cases where lives are at stake.


I found one in a supposedly IEC 61508 compiler. It was nasty.

Every so often our OS needed to read a memory-mapped asynchronous timer, comparing values it emitted until they stopped changing, to confirm that the various system clock domains were safely aligned and that the addition of a periodic tick was not rippling its way through the counter, before using the value. This timer ran at 32.768 kHz, but to see the glitch required reading the timer exactly as the adder rippled. Most of the time the CPU would read in between the ripples and it would work fine.

This driver had worked on several released products with an earlier version of the compiler; we upgraded the compiler slightly to take advantage of a new instruction set in the next chip.

The updated compiler would, in rare cases, assume that sequential volatile reads from the same address had the same value.

Something like while(v!=v){} would become the equivalent of if(v!=v)while(1){};

This compiler fault meant that our previously working driver would lock up the machine after somewhere between an hour and a few days, whenever a particular IRQ occurred during the clock-domain race.

Normally the watchdog would trap most infinite loops in the system, but it didn't here. I knew that it had to be this particular IRQ. I disassembled the whole thing, which only had one loop in it (this one). It took me forever to find, because the loop structure was essentially correct except that it was branching just past one of the loads that needed to be redone. So instead of load x, load y, compare y,x, branch on difference to load x, it branched to load y: a difference of two bytes in the branch offset.

The compiler vendor confirmed this and corrected it in the next compiler version. The ticket number used to be on their website but they no longer publish the change log that far back.


You're asking about the ratio of "compiler bugs in code generation impacting behavior" to "bugs in C source code"? Obviously the number depends critically on the denominator... Defect density for DAL A software is well studied and very, very low. See, e.g., [1]. "Most estimates put the number of residual defects for a good software process at 1 to 10 per KNCSL," of which 1% are of catastrophic severity. These processes give a low enough defect rate that real-world failures are dominated by errors in the requirements process rather than in the process of converting requirements to source code; it's not hard to see that at this point compiler bugs can be non-negligible.

[1] https://apps.dtic.mil/sti/pdfs/AD1088687.pdf


CompCert has had bugs in its C compilation before, for instance because the assembly to binary translation wasn't verified.


Absolutely agree. Compcert solves a very difficult problem better than pretty much any other solution. However, that problem is mostly a theoretical one, even if it's important to mitigate it. It's not a replacement for source level verification and migrating the compiler can be put off until relatively late in a project.


CompCert is verified functional code in Coq that is translated to OCaml for execution, the compiler for which isn't verified. It doesn't solve the problem of verification of imperative code. The state-of-the-art approaches to imperative code verification using Coq (such as https://gitlab.mpi-sws.org/iris/refinedc) struggle to verify basic college-textbook algorithms and are extremely clunky in practice.


Having some experience in this field, I can say that all of the tools presented on the webpage are very lightweight static analyzers. They are all based on processing the syntax tree and will detect things like uninitialized variables, or do style checks.

"Real" static analyzers, which are more useful, are based on symbolic execution and abstract interpretation and they will uncover more interesting classes of errors, like double frees, out of bounds array access, etc.

Note that all analyzers will have false positives and false negatives, because the underlying problem is undecidable.

The best open source "real" static analyzer overall for C++ is the Clang static analyzer. There are many commercial analyzers available, and most C++ shops will use one or more of these.


clang-tidy is mentioned in the blog post, and is what I personally use - it does do control-flow and dataflow analysis to find double frees or missing frees along any possible program trace (though it doesn't do any interprocedural analysis, for hopefully obvious reasons).


AFAIK clang-tidy is a superset of clang static analyzer nowadays.

E.g. quote from the documentation page (https://clang.llvm.org/extra/clang-tidy/):

    clang-tidy has its own checks and can also run Clang Static Analyzer checks.


These are all just lints, not proof systems.

The big problems in C are:

* How big is it?

* Who owns it?

* Who locks it?

Those are the problems which cause trouble after the program has first started to work at all.

These lints don't help with those problems.


At least gcc -fanalyzer and clang-tidy are "proper" static analyzers. Don't be confused by the clang-tidy name: it might have started as a simple linter, but it also includes all the clang analyzer checks (e.g. search for clang-analyzer here: https://clang.llvm.org/extra/clang-tidy/checks/list.html).

And the clang static analyzer indeed does things like static ownership/lifetime tracking following the control flow as much as is possible at compile time. In code bases which haven't run through an analyzer yet it typically does need some handholding via asserts (for instance hinting that a pointer isn't expected to be null, otherwise you'll be swamped with false positives).

Clang analyzer basically tells you things like "this statement here will cause a null pointer access if this seemingly unrelated function over there at the other end of the project is called with this specific argument, and here's how I arrived at that conclusion" (followed by dozens of "calling function x with y...", "entering loop with loop count z", "taking this if-branch because x is greater than y", etc. - in Xcode this is accompanied by a nice control-flow visualization over the source code which makes it a bit easier to follow).

Once a code base is static-analyzer-clean via "assert-hints", it will actually catch bugs at compile time which might be tricky to reproduce at runtime.

Of course static analyzers are still only part of the solution (the "compile time part"), for runtime checks there are the various clang sanitizers.


While they cannot solve all problems, it is worth not allowing the perfect to be the enemy of the good. You are correct that these are huge problems that are not currently solvable, but that doesn't mean that using these tools is a bad thing.


I'd be suspicious of someone taking a stab at running ownership static verification before enabling basic sanity checks like those presented here.

So. Maybe it qualifies as a "gentle introduction" since it's effectively step 1.


Yes, however many C and C++ developers to this day don't use any kind of lint checks, even though lint was already invented in 1979 to deal with typical C flaws, by the group that created C in the first place.

Granted they could have made the language better, but I guess in the spirit of UNIX, they decided to have the separate pass be yet another tool.

If someone is going with C, at the very least they should already consider these, and eventually learn to adopt the real deal in terms of tooling for "safe" C.


Right, people have created whole new languages to solve those


Has anyone compared these open source analyzers with the expensive enterprise options?

So much of this enterprise software is junk, but it seems like the free options are much more basic. In this case, perhaps the simpler version is worse.

I'm also not sure how these are designed. Usually, this requires some fairly sophisticated management of the static test environment, and providing information to the tool.

Perhaps it's just embedded systems, but you also have to manage a custom preprocessor. I'm not sure if these tools have any good rulesets either, whether default ones or rules geared towards e.g. CERT-C and MISRA-C.

Anyway, I've played around with unit test frameworks, too. I'm not sure if I found any that provide good coverage metrics. Also, it seems in both cases you do get a basic suite of behavior:

static: rules you can enable for static checks, logs/reports, etc

unit: framework, scripts, assert, test sequencing, etc

Anybody use anything "heavier weight"? It doesn't have to be "free free", but not for $6-50k+... say less than $2k.


PC-Lint/Flexelint is the grownup/commercial version of lint/compiler analyzers. Looks like they merged with a more corporate software company, but you can still find their webpages on archive.org.

see https://web.archive.org/web/20061206130312/http://gimpel.com... and https://web.archive.org/web/20061208002102/http://www.gimpel....

And for fun, see https://web.archive.org/web/20070204002103/http://www.gimpel...

Two commercial static analyzers I've heard of are Klocwork and Coverity. I know Coverity has a program to scan open source projects. So you can take a look at the output of their scans to see how they fare.

For PC-lint, it was relatively easy to integrate into the build process. Just a few changes in each makefile - probably less if you use something like ninja/cmake to drive the build process.

But adding static analysis to your build is similar to getting a home energy efficiency audit.

The auditor comes in and analyzes your house and points out where you can improve the house's energy efficiency (go from single to double/triple pane windows, fill in cracks around doors, don't use oil/coal to heat the house, etc.) But once they're done and you've adjusted the house accordingly, further audits won't bring up any new issues unless something in the house has changed.

It's a similar thing with a static analyzer (and with turning on compiler warnings). The first time you use it, you'll get a lot of issues - some of them you'll fix, some you'll ignore. But further static analysis won't bring up many new issues, if any, unless the code has changed.

Which means it's important to run the analysis when code is committed. Besides being cheaper/faster to fix problems before the commit, issues won't build up until the next time you run the static analyzer.

These tools don't eat, don't sleep, don't take vacation/sick days, won't get tired looking at the same code over and over again, and know much more about gotchas/footguns in the programming language than most programmers.

When I talked with a programmer who had worked on an enterprise-level commercial static analyzer, we agreed that these types of tools would find up to about 10% of the bugs in a project - closer to 10% for the lower-quality projects.


What I'm missing from static analyzers is English awareness:

    static void
    get_foobar() ...
Get is a verb that means "to retrieve", so "get_foobar" should return something.

    int n_bytes = ...;
Why a signed type? A negative number of bytes is nonsense. In fact, a whole language designed around assigning semantic meaning to names would be cool: f has to be a function, n has to be an integer >= 0, M has to be a square matrix of some type, has_xyz must be a boolean, and so on.


In C23 you can use the attribute

    [[nodiscard]]
on functions to mark that the result of the function shouldn't be discarded. I think there is also a

    [[reproducible]]
attribute for saying that a function has no side effects.


> A negative number of bytes is nonsense.

That's true, but mixing signed and unsigned integers opens a much bigger can of worms (for instance, it's not unusual to add a negative quantity to an unsigned integer). It's often better to just stick to signed integer types which are big enough to cover the required positive range (you only lose one bit, and 63 bits ought to be enough for anybody).

...also an optimizing compiler will usually turn something like:

    assert((index >= 0) && (index < size));
...into a single comparison, since on the assembly level it's free to interpret that 'index' as unsigned.


It's not about the compiler, it's about clearly communicating the intent of the code to the reader.


get_foobar() could update some thread_local static value, for example.... or print something to screen.

n_bytes could easily be negative if it's a byte offset, or used as a sentinel.

This is why we have a type system, to express these things :)


Nah, those are type errors. Some languages use case so that FOO can never have the same type as foo. Nothing prevents a type system from similarly distinguishing between get_foo() and set_foo().


I've written a simple C/C++ linter/static analyzer a long time ago that analyzed large commercial and open source code bases. At that time, compilers didn't catch many of the errors my tool found. A couple of decades later, warnings from the major compilers cover the majority of the bugs my tool looked for.

Spewing out a list of potential bugs to stdout/stderr isn't enough. There needs to be a smooth workflow for the developer to quickly assess if a reported bug is an actual problem. Also the developer needs to be able to tell the tool to ignore the bug in the future, because getting a list of bugs that you know aren't problematic is annoying the next time around, but infuriating after the tenth time.

Commenters have referred to some linters/analyzers as lightweight, etc. IMHO the simple tools will analyze code on a per-function basis. No/little data-flow analysis is done across function boundaries. The more thorough tools will go the extra mile and analyze resources allocated/values used in one function and passed to others. These tools will require a database and several minutes/hours depending on the size of the codebase. They're not meant as a precondition for committing code - they're doing work that computers are meant for - boring, repetitive, exhaustive, dreary code analysis.

But most of the code where my tool found errors tended to reside in the uncommon code path, typically error handling code/uncommon cases.

You don't want to create an Apple-level face-palm-like bug - https://dwheeler.com/essays/apple-goto-fail.html

So turn on the highest level of compiler warnings you can and/or pass the appropriate analyze flag to your compiler.


> -Wall turns on a group of warnings that can catch common mistakes.

I appreciate modern languages, which just catch this at compile time without making people jump through hoops. For example, I can just run "go build", and this won't even compile:

    package main
    
    func main() {
       println("Program run!")
       i := 10 // i declared and not used
    }


That's a terrible example and I hate Go for that. It makes writing code iteratively a huge PITA. It's a mistake to turn warnings into errors. My final code is free of warnings, but having a warning during development, especially one as innocent as an unused variable, should be totally fine. I just wrote this code, and I need to test that something works before I write the next part.


Sorry you feel that way. I agree it's annoying, but I think ultimately it makes for better code. You can always comment out the line, or add a "_ = i" or similar.


I agree, but that is not the most compelling example to be honest. I think a better one is my nemesis:

  int foo() {}
This is not an error or even a warning by default! Worse, even if you have enabled warnings, it can cause stack corruption and crazy insane behaviour in a different file. I've lost hours to this because I was working on code without `-Werror`, I didn't see the warning, and incremental compilation meant the offending file was never recompiled (I was trying to fix the error in a different file).

Of course modern languages don't have this insanity.
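A sketch of how to make that hazard fatal at compile time, assuming a GCC- or Clang-compatible `cc` (file and function names invented):

```shell
cat > ret.c <<'EOF'
int lookup(int key)
{
    if (key == 0)
        return 42;
    /* falls off the end: callers get garbage (undefined behavior) */
}
EOF

# Depending on the compiler, this may pass quietly with default flags:
cc -c ret.c
# -Wall enables -Wreturn-type, and -Werror makes it a hard error:
cc -Wall -Werror -c ret.c   # error: control reaches end of non-void function
```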


Could this explain the miscompilation I was experiencing on macOS some years ago where I forgot a `return` statement, the compiler didn't give a single fuck and spit out an executable anyway, and I spent hours in a debugger wondering how the hell I found myself in random completely invalid instructions for no reason?

Took hours, perhaps even days to track down. I did it by spamming single-step until it stopped being able to find the line that was currently executing.


> This is not an error or even a warning by default!

It depends on the compiler; for instance, in Clang this is part of the default warning set:

https://www.godbolt.org/z/bKnq456eK


gcc has that warning as well (not enabled by default, but enabled with -Wall).

However, gcc does not have a warning that I'm supposed to write `int foo(void){}` (if we're talking about C), which is a real issue.


Ah right. Clang will warn if you actually try to call such a () function with arguments: https://www.godbolt.org/z/qEa55PdG9

Both GCC and MSVC don't care though.


That's not true anymore in C23: () means (void) now.

The missing return type is still an issue though.


Passing '-Wall -Wextra -Werror' on the compiler command line isn't exactly "jumping through hoops", though.

This should be the default of course, but that would just bring out the pitchforks, because people have different opinions (for instance, the same innocent unused-variable error as in Go is currently by far the most controversial feature in Zig).


And then you hit:

    warning: ISO C forbids assignment between function pointer and `void *'
Which, while true, is allowed by (and in fact required by) POSIX. Of course, not passing "-pedantic" will suppress that warning, but you'll also suppress other, critical warnings.

Okay, I'll put away my pitchfork now.


Such warnings might save some hours of pain when writing cross platform code.


This isn't really an innocent Go feature, either. I have a meme saved somewhere that's basically "no, fuck you, variable declared but not used".


Nice article.

I wasn't aware of the GCC -fanalyzer flag; I think that approach is much better than Clang's. The warnings also seem much nicer.


There is also smatch, which is used by the Linux kernel community: https://github.com/error27/smatch


I mean, the best static analyzer is rustc, bar none, but then you have to drop the ancient dialect known as C.


I love Rust, but "bar none" is perhaps overstating it. While formal methods aren't the same thing as static analysis, MISRA, SPARK, and normal Ada shouldn't be dismissed as things vastly less worthwhile.


MISRA is at best a glorified style guide and short for misra-ble. I have yet to actually see MISRA do much to prevent bugs.

I haven't used SPARK or Ada, to be frank, so perhaps they are even superior in their ability to catch bugs with static analysis; I have heard great things. It's something I've been meaning to try.


I've not used MISRA, so I'm really referring to the collection of tools for formal proofs that surround the style guide. Ada was worth trying for me just to contrast with Rust and to add some humility to my Rust evangelism. I don't like the verbosity or the module system, and it is a bit dated as far as supporting anything but imperative styles. There is some confusion about how to deallocate stuff or if you should... Lastly, the thread safety mechanisms are kind of clunky to me.


The open source tools listed are quite basic compared to commercially available tools.


Clang static analyzer (integrated with clang-tidy) is actually really good.


I've found that GCC -fanalyzer is similar to Coverity in the bugs it finds. They both have pros and cons.



