
A Few Billion Lines of Code: Using Static Analysis in the Real World - idlewords
http://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext/
======
scott_s
This is easily the best article I've read on HN in weeks. It's refreshing to
read a report from people who clearly understand their problem well, and are
able to explain the technical and social problems they face.

~~~
gnosis
I was initially skeptical, especially after reading your glowing review (what
can I say? I'm a cynic). But the article won me over. It's actually quite
interesting and informative.

------
yread
_"Why is it when I run your tool, I have to reinstall my Linux distribution
from CD?"

This was indeed a puzzling question. Some poking around exposed the following
chain of events: the company's make used a novel format to print out the
absolute path of the directory in which the compiler ran; our script misparsed
this path, producing the empty string that we gave as the destination to the
Unix "cd" (change directory) command, causing it to change to the top level of
the system; it ran "rm -rf *" (recursive delete) during compilation to
clean up temporary files; and the build process ran as root._
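
One way a wrapper might guard against that chain of events is to refuse to change directory when the parsed path looks wrong. Below is a minimal C sketch of that idea; `is_safe_build_dir` is a hypothetical helper, not something from the article, and the "must be absolute, must not be the top level" rules are assumptions drawn from the anecdote.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the article): decide whether a directory
 * string parsed from build output is safe to hand to "cd" or chdir().
 * Returns 1 if it looks usable, 0 otherwise. */
int is_safe_build_dir(const char *path)
{
    if (path == NULL || path[0] == '\0')
        return 0;   /* the misparse produced nothing at all */
    if (path[0] != '/')
        return 0;   /* assumption: make is expected to print an absolute path */
    if (strspn(path, "/") == strlen(path))
        return 0;   /* "/", "//", ...: the top level itself */
    return 1;
}
```

A caller would check `is_safe_build_dir(dir)` and abort on 0 before any `cd` or cleanup step runs, so an empty parse result stops the script instead of steering it somewhere dangerous.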

------
JabavuAdams
Fascinating! Creating tools for developers is such a PITA.

Interesting observation:

Parsing is considered a solved problem. Unfortunately, this view is naïve,
rooted in the widely believed myth that programming languages exist.

The C language does not exist; neither does Java, C++, and C#. While a
language may exist as an abstract idea, and even have a pile of paper (a
standard) purporting to define it, a standard is not a compiler. What language
do people write code in? The character strings accepted by their compiler.
Further, they equate compilation with certification. A file their compiler
does not reject has been certified as "C code" no matter how blatantly illegal
its contents may be to a language scholar. Fed this illegal not-C code, a
tool's C front-end will reject it. This problem is the tool's problem.

~~~
stonemetal
From what I understand, parsing is a solved problem. Parsing exactly like 40+
different C compilers for which you don't have source is a different beast
altogether.

~~~
tetha
I'd rather call it: Parsing a given language is solved. Understanding the
not-C (love that term) language those 40+ C compilers accept is hard.

------
angelbob
We used Coverity at ACCESS (who make ALP and the First Else, neither of which
you've heard of).

If you're using C and/or C++, it's a truly amazing tool, and chases down some
very weird bugs.

As our upgrade method, we basically enabled serious tests a few at a time,
fixed most of the problems that showed up as a result, and then later enabled
a few more checks on slightly less serious bugs.

But I can see how it's hard to sell that in general. ACCESS was actually a
pretty good, sophisticated user of Coverity in most ways, and it paid us back
handsomely in code quality.

------
camccann
I don't know whether to laugh or cry at some of these anecdotes.

 _As a final example, a buffer overflow checker flagged a bunch of errors of
the form

    unsigned p[4];
    ...
    p[4] = 1;

"No, ANSI lets you write 1 past the end of the array."

After heated argument, the programmer said, "We'll have to agree to disagree."
We could agree about the disagreement, though we couldn't quite comprehend
it._

~~~
abecedarius
Ha! That probably came out of a rule in the standard that a pointer may
_point_ one past the end of an array. (Bump it any further and your compiler
is not required to do anything sensible, iirc.)
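
The difference between pointing one past the end and writing there can be shown in a few lines. This is a minimal C sketch; the names `sum_range`, `sum_p`, and `P_LEN` are made up for illustration and are not from the article.

```c
#define P_LEN 4

/* Sum the elements in [begin, end). 'end' may be the one-past-the-end
 * pointer; it is only compared against, never dereferenced. */
unsigned sum_range(const unsigned *begin, const unsigned *end)
{
    unsigned total = 0;
    for (const unsigned *it = begin; it != end; ++it)
        total += *it;
    return total;
}

unsigned sum_p(void)
{
    unsigned p[P_LEN] = {1, 2, 3, 4};
    const unsigned *end = p + P_LEN;  /* allowed: points one past the last element */
    /* p[P_LEN] = 1; would mean *(p + P_LEN) = 1, a write through that
     * pointer. That is undefined behavior, and it is the pattern the
     * checker flagged. */
    return sum_range(p, end);
}
```

So forming `p + 4` as a loop bound is fine, while `p[4] = 1` dereferences it, and only the former is covered by the rule.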

------
cousin_it
I want a lot more of this kind of article here.

------
peterwwillis
_The C language does not exist; neither does Java, C++, and C#. While a
language may exist as an abstract idea, and even have a pile of paper (a
standard) purporting to define it, a standard is not a compiler._

That is some Zen shit.

------
Daniel_Newby
" _How to handle cluelessness._ You cannot often argue with people who are
sufficiently confused about technical matters; they think you are the one who
doesn't get it. They also tend to get emotional. Arguing reliably kills sales.
What to do? One trick is to try to organize a large meeting so their peers do
the work for you. The more people in the room, the more likely there is
someone very smart and respected and cares (about bugs and about the given
code), can diagnose an error (to counter arguments it's a false positive), has
been burned by a similar error, loses his/her bonus for errors, or is in
another group (another potential sale)."

