> They discover bugs by testing random programs against multiple compilers. ...

Wilya · on June 19, 2012

From the linked paper: "Although this compiler-testing approach has been used before [6, 16, 23], Csmith’s test-generation techniques substantially advance the state of the art by generating random programs that are expressive—containing complex code using many C language features—while also ensuring that every generated program has a single interpretation. To have a unique interpretation, a program must not execute any of the 191 kinds of undefined behavior, nor depend on any of the 52 kinds of unspecified behavior, that are described in the C99 standard."

loumf · on June 19, 2012

I took "They guarantee that the inputs are legal" to mean that they limited it to programs with specified behavior. They don't know what the behavior is -- just that it is specified.

If they can do this, it finds a subset of bugs, with no false positives.

silentbicycle · on June 19, 2012

C compilers can be buggy, particularly when you start working with vendor-supplied compilers for embedded platforms. A colleague was furious when he realized that his board's compiler didn't support function pointers.

joshAg · on June 19, 2012

that does seem like a rather large omission. how did he work around that issue?

pmjordan · on June 19, 2012

I could imagine a DSP architecture that doesn't intrinsically support indirect jumps (especially as DSPs frequently use the Harvard memory model). That would make implementing function pointers tricky. I'd probably work around this by making a set of dispatch macros that expand into a giant switch block where each case is a (static) function call. The other option would be self-modifying code, which is annoying to do, to say the least, particularly for Harvard systems.

kragen · on June 20, 2012

If your CPU supports keeping function return addresses on a stack that you can push other things onto, you can do an indirect jump by pushing the address you want to jump to and then "returning" to it. That's a lot easier than self-modifying code or massive switch statements, and just as easy on Harvard as on von Neumann architectures.

silentbicycle · on June 20, 2012

To be honest, I don't know, except that there was a lot of scowling. I think it was a PIC micro.

scott_s · on June 19, 2012

Both loumf and Wilya are correct. In support of their answers, remember that the specification does not specify the results of interesting programs. It says "if you do this, this must be the result." But if you limit yourself to only testing such simple cases, you're not going to find any interesting bugs - because such simple programs are likely to have already been tested.

cube13 · on June 19, 2012

> If the result differs from the specification, it is a bug.

A large part of the C standard is implementation-defined (see acqq's post here: http://news.ycombinator.com/item?id=4131828 ), so the result could differ across compilers without being a bug, and STILL be completely within spec.

scott_s · on June 19, 2012

If I recall correctly, they only generated programs that had well-specified behavior. Not just legal, but specified.

MidwestMuster · on June 19, 2012

Isn't this called "fuzzing"?

dllthomas · on June 19, 2012

It's certainly related. The difference here is that 1) they're comparing output of multiple systems, rather than looking for obviously erroneous behavior of one (segfaults, memory leaks, failed assertions); and 2) the input data is all correct - fuzzing (per my understanding) usually implies tossing bad data in to see if the system breaks (frequently just slightly bad data is more interesting than complete garbage, but either falls under "fuzzing").

scott_s · on June 19, 2012

It's rather like inverse fuzzing. So, zuffing.

_delirium · on June 20, 2012

This kind of coinage is a rather large rabbit hole! Once upon a time in 1963, someone asked: what happens if you take the Fourier transform of a Fourier transform? Well, a Fourier transform gives you a spectrum, so let's call a Fourier transform of that, a new concept called a cepstrum. So what are its bins, analogous to frequency bins? Let's call them quefrency bins, and the cepstrum is therefore a quefrency cepstrum. What's the operation when you modify quefrencies in the cepstrum in some manner other than uniformly, analogous to how one might run a frequency spectrum through a frequency-domain filter? Why, liftering, of course.

andrewcooke · on June 20, 2012

in case anyone else is confused - it (a cepstrum) is the ft of the log of the modulus of an ft. the ft of an ft is just the original signal, time-reversed (up to scaling). https://en.wikipedia.org/wiki/Cepstrum
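In formulas, as a sketch assuming the unitary Fourier convention and the "real cepstrum" variant (definitions vary: the power cepstrum squares the magnitudes, and some authors use a forward rather than inverse transform for the outer step):

```latex
% Applying the Fourier transform twice gives back the signal, time-reversed:
\mathcal{F}\{\mathcal{F}\{x\}\}(t) = x(-t)

% Real cepstrum: (inverse) transform of the log-magnitude spectrum,
% indexed by quefrency \tau:
c(\tau) = \mathcal{F}^{-1}\bigl\{ \log \lvert \mathcal{F}\{x\}(f) \rvert \bigr\}(\tau)
```

The logarithm is what makes the result something other than the original signal: it turns products in the spectrum into sums before the second transform.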