

Automatically fixing bugs in C programs with Genetic Algorithms - peterhunt
http://epr.adaptive.cs.unm.edu/

======
tansey
Their research is certainly cool and novel. On the surface, this is a
straightforward application of genetic programming (evolving abstract syntax
trees):

1\. Take a function that fails some test case(s)

2\. Parse AST

3\. Find a bunch of other lines of code in the program and use those as
possible mutations

4\. Evolve until test performance is improved

The trick is #3. They are working from the hypothesis that the fix for a bug
can often be found in other parts of the program. For instance, you pass in a
variable and forget to check for it being null. It's likely that you have a
check for that somewhere else in the program, and if so, then you can add that
(templated) line of code into your buggy function and it will now pass that
test case.
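
The four steps above can be sketched roughly as follows. This is a toy illustration, not the authors' system: it works on lists of Python statement strings rather than C ASTs, and `BUGGY`, `DONORS`, `TESTS`, and every function name are made up for the example. The "bug" is a missing null check, and the donor pool stands in for lines found elsewhere in the program.

```python
import random

# Hypothetical buggy function body, one statement per entry.
# Bug: it crashes when `name` is None.
BUGGY = [
    "result = name.strip()",
    "return result.upper()",
]

# Statements harvested from elsewhere in the (imaginary) program.
DONORS = [
    "if name is None: return ''",
    "name = name + '!'",
    "result = ''",
]

# Test cases as (input, expected output); the last one exposes the bug.
TESTS = [("  bob ", "BOB"), ("amy", "AMY"), (None, "")]


def compile_variant(body):
    """Turn a list of statements into a callable f(name)."""
    src = "def f(name):\n" + "".join(f"    {line}\n" for line in body)
    namespace = {}
    exec(src, namespace)
    return namespace["f"]


def fitness(body):
    """Number of test cases the variant passes."""
    try:
        f = compile_variant(body)
    except SyntaxError:  # e.g. an empty body after deletions
        return 0
    passed = 0
    for arg, expected in TESTS:
        try:
            if f(arg) == expected:
                passed += 1
        except Exception:
            pass  # a crash counts as a failed test
    return passed


def mutate(body, rng):
    """Delete an existing statement or insert a donor statement."""
    child = list(body)
    if child and rng.random() < 0.3:
        del child[rng.randrange(len(child))]
    else:
        child.insert(rng.randrange(len(child) + 1), rng.choice(DONORS))
    return child


def repair(seed=0, pop_size=20, generations=50):
    """Evolve variants until one passes every test, or give up."""
    rng = random.Random(seed)
    population = [list(BUGGY) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(TESTS):
            return population[0]
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    best = max(population, key=fitness)
    return best if fitness(best) == len(TESTS) else None


if __name__ == "__main__":
    print(repair())
```

In this toy setup the only way to pass the `None` test is to pull in the donor null check ahead of the crashing line, which is the kind of borrowed fix described above. There is no crossover here, so it is closer to mutation-only evolution than to a full genetic algorithm.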

It's certainly not a panacea, but it does work remarkably well for many bug
cases.

~~~
rat
How readable/maintainable is the resulting code?

~~~
Volpe
That's the next genetic algorithms problem :P

------
rcfox
John Regehr has comments on when this should or shouldn't be used:
<http://blog.regehr.org/archives/383>

~~~
sehugg
Yep, an evolutionary algorithm will take advantage of any shortcut and any
loophole in your fitness function. They're best for optimization, not for
ensuring correctness :)

~~~
eru
If you care to read the output, that seems like a good way to find errors in
your fitness function.
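
A minimal sketch of the kind of loophole being discussed, using a made-up example: a fitness function for a sorting routine that only checks that the output is ordered. The names `weak_fitness`, `strict_fitness`, and `cheat` are invented for illustration and do not come from the linked project.

```python
INPUTS = [[3, 1, 2], [5, 4], [1]]


def weak_fitness(candidate):
    """Loophole: only checks that the output is in non-decreasing order."""
    score = 0
    for data in INPUTS:
        out = candidate(list(data))
        if all(a <= b for a, b in zip(out, out[1:])):
            score += 1
    return score


def strict_fitness(candidate):
    """Also requires the output to contain exactly the input's elements."""
    score = 0
    for data in INPUTS:
        if candidate(list(data)) == sorted(data):
            score += 1
    return score


def cheat(data):
    """A degenerate 'sort' that throws the data away."""
    return []  # an empty list is trivially in order


if __name__ == "__main__":
    print(weak_fitness(cheat), strict_fitness(cheat))    # 3 0
    print(weak_fitness(sorted), strict_fitness(sorted))  # 3 3
```

A search process has no reason to prefer the honest solution when the cheat scores just as well, and reading the winning candidate is what reveals the hole in `weak_fitness`.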

------
munin
it's funny, this is exactly how lots of lower-division undergrads at my school
approach writing code! i've heard professors refer to it as "debugging via
brownian motion"

~~~
Someone
You have very persistent undergrads. Most would skip the "check whether the
tests that passed previously still pass" step.

~~~
munin
their grade is (sometimes) largely based on how well they pass unit tests.

------
Havoc
Is it just me or is this a horrible idea? Surely the GA will evolve in ways
that fix the problem but introduce some new problems that you haven't even
considered/tested for?

~~~
breckinloggins
Sure, if you intend to just run the algorithm and check in the changes proudly
proclaiming "it's fixed!"

A better use for it would be as an assistant to help you track down the root
cause of stubborn bugs.

It's kind of like git-bisect in reverse.

~~~
dkubb
I've used a similar technique, called mutation testing, to find code that
wasn't covered by a unit test.

It doesn't use a GA, but it's still pretty neat. It parses the code into an
AST and then mutates each node in the AST, rerunning the tests to see if they
pass or fail. They are supposed to fail; if they continue to pass, it means a
state isn't being tested.

The computer acts as my assistant tracking down cases I forgot to test, so
that I can write the test to catch the mutation. Sometimes it even finds
sections of code that are impossible to reach normally, allowing me to remove
the dead code paths that I might otherwise have missed.

------
erikb
I enjoy the idea a lot: that one day people won't need to program their
machines anymore but can discuss with them what they actually want. Yes, there
will still be a need for good old coding (maybe even more, because someone
needs to write these eloquent programs, right?). And yes, the programs that
result from these discussions will be a lot less clear (and thus prone to all
kinds of bugs). But what does your grandfather need a 100% correct program
for? He needs some food (no matter if carrots or beans), some clean clothes
(no matter if the red or the blue shirt) and someone who reminds him to take
his pills and tells him what tomorrow's weather might be.

I wonder why we don't read more about these topics. Not just because of the
opportunities for less educated people and the chance to automate repeatable
tasks we can't repeat just yet. There are so many things where we are actually
a lot more flexible than we think. For example, when I prototype my new
Android game, I don't really care what pathfinding routines the game
characters use. In the beginning I just want to put together something really
fast to see if my game idea is worth anything. I would really like an IDE to
just fill in the empty spots itself with anything that might work. Over time,
using my more concrete programming input and the data from the test runs that
every coder does while developing, the IDE could improve the code itself.

I imagine an IDE that I can tell "I want to code a computer game. It should be
roguelike. Very roguelike computer games are Rogue, NetHack, ADOM. A little
roguelike computer games are Diablo, Dungeon Siege, Baldur's Gate, Oblivion,
World of Warcraft. Not roguelike computer games are Counter-Strike, Doom,
SimCity. Not computer games are Vim, Firefox, Word. Make prototype!"

~~~
eru
Vi is actually quite nethack-like in its interface.

------
cpeterso
A more tractable problem is the one tackled by Acovea (Analysis of Compiler
Options via Evolutionary Algorithm), a genetic algorithm to find the "best"
optimization CFLAGS for compiling programs with gcc. Unfortunately, the Acovea
project is no longer maintained.

<http://web.archive.org/web/20101223023921/http://www.coyotegulch.com/products/acovea/>

I've been told that Intel's compiler performance team has investigated similar
genetic algorithms.

~~~
InclinedPlane
Such a hugely promising area of research. Also consider something even finer
grained, adjusting optimization options on individual blocks of code at run-
time or through instrumented binaries using genetic algorithms. It could
revolutionize optimization.

------
brindle
1\. The search space is all possible abstract syntax trees. That is a large
search space.

2\. The correct solution must also satisfy all unit tests. Basically the
bug fixing is driven by the fitness function that determines if the bug is
fixed and the program still satisfies the specification, i.e. the unit tests.

If you have a complex program the challenge will be coming up with a complete
specification of the program's behavior.

------
aninteger
One of the problems with machine-generated code is that it's rarely readable
by human beings, so you'd still have to take a lot of time to see what the
algorithm changed. The other huge problem is false positives (although OpenBSD
made lint work for them, so maybe it's not that bad).

------
amikula
Fascinating concept. What if humans wrote the tests, and a genetic system like
this used a specialized language (machine code or JVM bytecode, for example)
to write a solution? What level of sophistication could a system like this
attain?

~~~
rcfox
I assume it would be like finding the interpolating polynomial of a bunch of
points in a row. (Like this: . . . . . .) The resulting polynomial would go
through all of the points, but have very large peaks/valleys in between.

Essentially, you'd get a system that passed all of your tests, but produced
garbage for anything not covered by the tests.

~~~
eric-hu
Your post got me excited because it sounds like you're describing Runge's
Phenomenon ( <http://en.wikipedia.org/wiki/Runge%27s_phenomenon> ).

The Wikipedia article doesn't have the best illustration, but I think the
underlying idea is really wise: in trying too hard to meet your initial
constraints, you can come up with a solution that's only useful at those
constraint points.

~~~
rcfox
I'd never heard of Runge's phenomenon, but it does indeed sound like I was
describing it. Thanks for giving me a name for it.

~~~
eric-hu
Interesting, I thought that was exactly what you were talking about :)

I learned about it in numerical analysis. It illustrates the downside of
trying to be too precise--you can make a polynomial that will go through an
arbitrary number of data points.

As the number of points increases, your function will look less like a line
and more like a magnitude 9 earthquake on a seismograph. The function will
pass through all the points used to define it. However, it'll be useless when
predicting the original data's behavior, as it changes too quickly on small
input.

Instead, mathematicians find more useful functions by relaxing the conditions
so that the model function only has to come 'near' the data points.
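
A small sketch of the effect, assuming the textbook example function 1/(1 + 25x²) on [-1, 1] with equally spaced points (the thread does not name a specific function; `runge`, `lagrange`, and `max_error` are names chosen for this illustration). The interpolating polynomial hits every data point exactly, yet its worst-case error between the points grows as more points are added.

```python
def runge(x):
    """The classic example function for Runge's phenomenon."""
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange(xs, ys, x):
    """Evaluate the unique polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def max_error(n_points, samples=1001):
    """Largest gap between interpolant and true function on [-1, 1]."""
    xs = [-1.0 + 2.0 * k / (n_points - 1) for k in range(n_points)]
    ys = [runge(x) for x in xs]
    grid = [-1.0 + 2.0 * k / (samples - 1) for k in range(samples)]
    return max(abs(lagrange(xs, ys, x) - runge(x)) for x in grid)


if __name__ == "__main__":
    for n in (5, 11, 21):
        # The error grows with n even though every data point is matched.
        print(n, max_error(n))
```

The analogy to test-driven program synthesis is loose but suggestive: the data points play the role of the tests, and the wild swings between them are the behavior nobody checked.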

------
zwieback
I think if you took this technique plus input from compiler warnings and
static code analysis and rolled them all together into a real-time IDE advisor
you would have a nice product.

I don't think fixing programs in post processing is such a great idea but if
the feedback can be used to make me a better programmer I'm all for it.

Scanning neighboring code is a really good idea; when I'm working on someone
else's code I'll try to follow the existing practices instead of inserting
bits and pieces in my own style.

------
caustic
That's an interesting concept, but it seems like they are not alone in this
field. Just recently I stumbled upon a web page by Moshe Sipper, who seems to
work on a very similar topic, "Darwinian Software Engineering":

<http://www.moshesipper.com/finch/>

He makes a rather grandiose claim -- “We believe that in about fifty years'
time it will be possible to program computers by means of evolution. Not
merely possible but indeed prevalent.”

------
5hoom
This just looks so cool. I'm sure this would be very useful on giant legacy
codebases where the author is long gone but the bugs remain, and I just like
the idea of using AI techniques to refactor code. Seems all futuristic and
stuff ;)

~~~
ForrestN
I agree. Fun to imagine the software bugfixing future versions of itself.

~~~
jpadkins
To some, self-improving software is frightening.

------
michaelochurch
Watch out with this. If you auto-fix a Java program, the result is that none
of the original code survives; it's rewritten from scratch in Scala!

------
braindead_in
Reminds me of I, Robot. Someday these C programs will group together and
develop the ability to think independently.

