
C++ at Google: Here Be Dragons - ryannielsen
http://blog.llvm.org/2011/05/c-at-google-here-be-dragons.html
======
SwellJoe
I haven't worked in C (or C++) heavily in about 6 years, since I shut down my
prior company and stopped working on Squid or having to look at kernel code.
But, these errors are simply beautiful, and make me have vague longings to
work on C projects again (I'm sure I'll get over those longings soon).

These are the kinds of mistakes I made all the time when working in C, and the
kind of thing that made coding extremely tedious...it feels like magic when
the compiler catches them with such clear and concise warnings. For whatever
reason, I didn't use lint very much back then, as I guess I always assumed I
knew what I was doing and that the compiler would catch mistakes. Having this
capability in the compiler is pretty cool and brings C/C++ a small step closer
to working in higher level languages, is what I think I'm trying to say here.

~~~
stephen_g
"Having this capability in the compiler is pretty cool and brings C/C++ a
small step closer to working in higher level languages"

C++0x is another (much bigger, in my opinion) step that makes C++ a lot easier
to program. It's not quite as easy as higher level languages, but far closer
than before, and the performance gains over most other languages make it worth
using.

~~~
roel_v
I agree. Lately I've been writing substantial amounts of 'higher level' code
again, and I find myself writing many checks for the types of variables (as
members or arguments), return values, contents of containers, in situations
where one may make inadvertent conversions etc. I'm thinking that much of the
time I spend on that would have cost me less time if I could have just
specified them in the code and the compiler/runtime would check them for me,
like in C++.

Of course writing unit tests helps too, but in C++ the compiler catches these
things easily. And I haven't had the desire to use a variant in years so the
advantages of having a 'variable' that can be of any type is quite minimal,
imo. The auto keyword in C++0x will make a large portion of the tedious parts
of strong typing in C++ go away, too.
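
To illustrate the point about auto, here is a small sketch assuming a C++0x/C++11 compiler. The function name `total_length` and the container layout are made up for this example. The commented-out line shows the iterator type that would otherwise have to be spelled out by hand; the types are still checked at compile time either way.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: sums the sizes of all vectors stored in the map.
inline int total_length(const std::map<std::string, std::vector<int>>& index) {
    int total = 0;
    // C++98 spelling of the same declaration:
    //   std::map<std::string, std::vector<int> >::const_iterator it = index.begin();
    for (auto it = index.begin(); it != index.end(); ++it) {
        // `it` is still a fully typed const_iterator; only the spelling got shorter.
        total += static_cast<int>(it->second.size());
    }
    return total;
}
```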

IMO things like type hinting in PHP are absolutely steps in the right
direction. There is room for both 'scripting' and 'compiled' languages, but
good support for indicating and checking the expected or required types in
scripting languages helps tremendously in proactively validating programs.

------
timr
To me, the most remarkable thing about this post is that when the rest of the
world is falling in love with the "power" of weak typing systems, Google is
going the _other way_.

~~~
profquail
_the rest of the world is falling in love with the "power" of weak typing
systems_

Really? I'd argue the exact opposite -- with F# and Scala (and some others) on
the rise, I think there are plenty of people who are fed up with weak/dynamic
typing and want to take advantage of building programs with strong type
systems.

At the very least, it seems that the programming world is becoming much more
polarized. Anecdotally, it seems that every developer I know is either a
proponent of strong/static typing or weak/dynamic typing -- but I don't know
anyone that's just sitting on the fence.

~~~
swannodette
Those in favor of only static typing and those in favor of only dynamic typing
are living in the past. Any interesting evolution in programming languages
will allow developers to program with and without types / contracts from the
same language.

~~~
olavk
In Mascara (my own project) you can start out with dynamically typed
JavaScript, but then gradually add type annotation as appropriate to provide
stronger verification. It can be done using structural types without changing
the runtime semantics, and further by rewriting to use classes and nominal
typing.

I think this provides a useful upgrade path, because type verification is a
hassle with small projects, but becomes increasingly valuable (IMHO) as the
program grows.

The problem with current languages is that you have to decide upfront if you
want a language optimized for quick development or strong verification. But
often in the real world, programs start out as quick prototypes, and then grow
into large applications.

------
mayoff
When will they enhance it to flag the other error in this line:

long kMaxDiskSpace = 10 << 30; // Thirty gigs ought to be enough for anybody.

10<<30 is ten gigs, not thirty gigs.

~~~
chandlerc
Doh! Good catch, comment updated. =[ Maybe we do need Clang-for-comments as
well as Clang-for-C++ code.... ;]

~~~
nikki9696
And the error says it's an int, but it's declared long. Am I missing something
about long in C++ not being 64 bits?

~~~
chandlerc
That's the whole point. =] This is a surprising aspect of C++: the shift
expression doesn't have the type of the declared variable.

The integer literals we are shifting are of 'int' type, and the shift occurs
at that type (based on the usual arithmetic conversions). There is a Stack
Overflow question with explanations and a good blog post here about it:

[http://stackoverflow.com/questions/836544/usual-arithmetic-c...](http://stackoverflow.com/questions/836544/usual-arithmetic-conversion-a-better-set-of-rules)

[http://blogs.msdn.com/b/oldnewthing/archive/2004/03/10/87247...](http://blogs.msdn.com/b/oldnewthing/archive/2004/03/10/87247.aspx)

Also, you can look through the C++98 standard to understand all the details.
Relevant sections are [expr]p9 and [expr.shift].
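
To make the rule concrete, here is a minimal C++11 sketch. It is not the article's code: `check_shift_width` is a name made up for illustration, and `kMaxDiskSpace` is reused from the example line quoted earlier in the thread. It checks at compile time that a shift of two int literals has type int, and that widening the left operand moves the shift to 64 bits. The literal shift uses a small count so the sketch itself never overflows.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The result type of a shift is the promoted type of its left operand.
// The type of the variable being initialized plays no part.
static_assert(std::is_same<decltype(10 << 3), int>::value,
              "int << int is computed as int");
static_assert(std::is_same<decltype(std::int64_t{10} << 3), std::int64_t>::value,
              "widening the left operand widens the shift");

// Widen first, then shift: 10 * 2^30 bytes, i.e. ten GiB.
constexpr std::int64_t kMaxDiskSpace = std::int64_t{10} << 30;

// Hypothetical self-check, named for illustration only.
inline void check_shift_width() {
    assert(kMaxDiskSpace == 10737418240LL);
    assert(kMaxDiskSpace > INT32_MAX);  // the value does not fit in a 32-bit int
}
```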

------
matthavener
I wonder if this is an indication that google is moving to clang for compiling
(and not just diagnostic tools). If that's true, maybe this is another nail in
the coffin for gcc? I see apple and google behind llvm/clang, who's behind
gcc? Nobody?

~~~
jrockway
_who's behind gcc? Nobody?_

I hear there's this kernel called Linux that depends heavily on GCC.

~~~
roel_v
But why would the kernel depend on gcc? Are there so many gcc-isms in there
that would be hard to replicate on other compilers?

~~~
adestefan
The biggest one is (was?) the use of GNU variable length arrays. The GNU
extension has different syntax than C99's variable length arrays. There are
also instances of __attribute__ on platform specific code.

------
hsmyers
CLint meet CLang and the better for it. Although if I read correctly between
the lines, there might be a little trouble getting the engineers to buy in :)
Ever met anyone who could actually make it all the way through CLint with all
warnings on? Enough to drive you crazy!

------
siphr
Well written! The simple bugs that it seems to detect are fairly high
frequency so it should therefore improve overall code quality. Looking forward
to playing around with this in my spare time.

------
Natsu
I haven't even done any C++ for a while, but reading those other articles on
HN about undefined behavior in C made the example bugs in this article really
jump out at me.

------
archangel_one
The article implies to me that the third bug (passing 0.5 to sleep() ) is not
caught by gcc. Does anyone know if this is the case? It doesn't seem
excessively hard to produce a warning about shortening like that - the first
two seem more subtle, but that one less so. I don't have gcc on this machine
to check it, but VC++ certainly does emit a warning for that kind of thing.

~~~
chandlerc
GCC definitely has a warning for 0.5 -> int (likely -Wconversion, but I've not
checked). It also has a warning for setting a pointer to "false"
(-Wconversion-null). However, turning that warning on in a codebase where
every warning breaks the build was challenging because of false positives.
We're able to remove false positives and narrow the scope of the warning to
just the buggy code in many cases with Clang, and that allows us to turn these
warnings on much more aggressively.
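
For reference, the 0.5 -> int bug can be reproduced in a few lines. This is a sketch under assumptions: `seconds_requested` is a hypothetical stand-in for a sleep()-style function that takes whole seconds, not the real sleep(), and the std::chrono variant is just one possible fix rather than what the article recommends.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for a sleep()-style API that takes whole seconds.
// It only reports the value it received.
inline unsigned seconds_requested(unsigned seconds) { return seconds; }

// The bug: the double is implicitly converted to unsigned, which discards
// the fractional part, so a request for half a second becomes zero.
inline unsigned buggy_request(double seconds) {
    return seconds_requested(seconds);
}

// One possible fix: carry the unit in the type and convert explicitly.
inline long long to_milliseconds(double seconds) {
    using namespace std::chrono;
    return duration_cast<milliseconds>(duration<double>(seconds)).count();
}
```

Here `buggy_request(0.5)` yields 0, while `to_milliseconds(0.5)` yields 500.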

~~~
lurker19
When is assigning a pointer to be boolean false intentional and correct?

~~~
chandlerc
Most of the cases we ran into were metaprogramming techniques which test
whether an expression is a valid null pointer constant. These got innocently
applied to 'false' and triggered the warning needlessly.
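
As an illustration of the kind of technique meant here, a sketch of the classic sizeof-based overload test follows. The names `Secret`, `IsNullLiteralHelper` and `IS_NULL_LITERAL` are hypothetical, chosen for this example, and this is not necessarily the code that triggered the warning. Whether 'false' itself takes the pointer overload depends on the language version and the compiler (C++98 treats any integral constant expression equal to zero as a null pointer constant; later revisions tightened this), so the sketch only exercises 0 and 1.

```cpp
#include <cassert>

// Incomplete type: nothing outside this file can produce a Secret* value,
// so only a null pointer constant can convert to it.
struct Secret;

// Chosen by overload resolution when the argument converts to a pointer.
char IsNullLiteralHelper(Secret*);
// Fallback for every other argument; returns a reference to a 2-byte array.
char (&IsNullLiteralHelper(...))[2];

// True when `x` is accepted as a null pointer constant. The call sits inside
// sizeof, so it is never evaluated and the helpers need no definitions.
#define IS_NULL_LITERAL(x) (sizeof(IsNullLiteralHelper(x)) == 1)

// Hypothetical self-check, named for illustration only.
inline void check_null_literal_detection() {
    assert(IS_NULL_LITERAL(0));    // literal zero is a null pointer constant
    assert(!IS_NULL_LITERAL(1));   // a nonzero integer is not
}
```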

------
mleonhard
Eclipse highlights these kinds of bugs in java code. It saves me a lot of
time.

------
anonymous246
I wonder how their checks compare to Coverity's and QAC++'s.

I have a passing acquaintance with both, and I'm almost certain both would
have caught the three bugs listed on that page.

~~~
chandlerc
I would expect many of these tools to catch these types of bugs. The
challenging thing for us has been to catch _only_ bugs, and to catch them very
fast during normal compilation.

A lot of the static analyses we've looked into (and I'm hoping for more
detailed blog posts about that in the future) find plenty of bugs, but also
find lots of non-bugs. Combine that with being too slow to run during the
normal build, and you can't break the build when such a bug is found.

I think one of the most interesting aspects of this is how we catch the bugs
early, and force developers to fix them immediately by breaking the build.

~~~
spitfire
These sorts of articles (and the attendant comments about false positives)
always scream out for Ada to me. It's a language designed by a calm, careful
thinker back in the 80's for life critical programs. It has everything Java
and C++ have except the vast number of undefined states and it's designed for
static analysis. By designed I mean, there are formal verifiers and the NSA
has used it in a test security system.

Plus the compiled code is pretty fast. So if you're feeling the need to reduce
your workload take a look at it, you might be surprised.

~~~
cageface
What kinds of libraries are available for Ada? Half the reason I use C++ is
that half the code I need to write is already available in mature libraries.

~~~
spitfire
Library support is aimed squarely at realtime life critical systems. You're
more likely to find a library with some sort of safety certification than not.
If you're expecting to use the latest web libraries or hadoop you'll be
disappointed.

However, there is a small collection of oss Ada libraries out there.

~~~
axman6
I seem to remember there was a pretty interesting web framework written in
Ada, Adaweb I think?

Anyway, Ada has some other amazingly cool features. The concurrency primitives
it offers are very cool and let you make much stronger guarantees about the
interactions between threads than any other language I've seen. For example,
you can define rendezvous sections, which if memory serves, are pieces of code
that are guaranteed to run only once both threads participating in the
rendezvous have arrived, and neither thread can leave the section until both
are ready.

~~~
spitfire
There's ada web server (aws) which is neat. Similar idea to the java web kits
like jetty. You can even hotplug code during runtime.

You're right about the concurrency. Ada has had a bunch of stuff like that
built into the language since 1983.

The particularly cool toys I like are SPARK (a formal verifier tool) and
stackcheck - tells you exactly how deep in the stack your code can possibly
go. (Yes you have to annotate cycles.)

