
CodePhage - Automatic bug repair, without access to source code - tvvocold
https://newsoffice.mit.edu/2015/automatic-code-bug-repair-0629
======
TeeWEE
FYI The full paper can be found here:
[http://delivery.acm.org/10.1145/2740000/2737988/p43-sidirogl...](http://delivery.acm.org/10.1145/2740000/2737988/p43-sidirogloudouskos.pdf?ip=80.113.211.165&id=2737988&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&CFID=524572299&CFTOKEN=51965653&__acm__=1435665069_096eebca6644a0f5e695dea4e5bdc5c8)

In summary, for an app with a bug, you must know a) an input that causes the
bug to show up, and b) an input that doesn't cause an error. It then looks for
similar code on GitHub and tries inserting the checks from that code into the
new code. Then it reruns the program, hoping that the bug is resolved.
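
A toy sketch of that loop, to make the summary concrete. This is not CodePhage's actual implementation (which, per the title, works without access to source code); the `recipient` program, the `DONOR_CHECKS` list, and every helper name here are hypothetical and exist only for illustration. It assumes a "check" is a predicate that rejects unsafe input before the buggy code runs.

```python
from typing import Callable, Optional

# A check returns True when the input is considered safe (assumed model).
Check = Callable[[int, int], bool]
Program = Callable[[int, int], Optional[int]]


def recipient(width: int, height: int) -> int:
    """Hypothetical buggy program: crashes when height is zero."""
    return (width * 100) // height


# Hypothetical checks "harvested" from donor programs that accept the same input.
DONOR_CHECKS: list[Check] = [
    lambda w, h: w >= 0,  # unrelated to the bug
    lambda w, h: h != 0,  # the check the recipient is missing
]


def patched(program: Program, check: Check) -> Program:
    """Wrap the program so that input failing the check is rejected."""
    def guarded(w: int, h: int) -> Optional[int]:
        if not check(w, h):
            return None  # reject the input instead of crashing
        return program(w, h)
    return guarded


def crashes(program: Program, args: tuple[int, int]) -> bool:
    """Return True if running the program on args raises an exception."""
    try:
        program(*args)
        return False
    except Exception:
        return True


def find_patch(program: Program, checks: list[Check],
               bad_input: tuple[int, int],
               good_input: tuple[int, int]) -> Optional[Program]:
    """Try each donor check; keep the first one that fixes the bad input
    without changing the result on the good input."""
    expected = program(*good_input)
    for check in checks:
        candidate = patched(program, check)
        if crashes(candidate, bad_input):
            continue  # bug still shows up
        if candidate(*good_input) != expected:
            continue  # patch broke known-good behavior
        return candidate
    return None


fixed = find_patch(recipient, DONOR_CHECKS, bad_input=(4, 0), good_input=(4, 2))
```

The rerun step is the important part: a transferred check only counts as a fix if the crashing input is handled and the known-good input still behaves as before.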

This approach is very cool, and harnesses the power of lots of developers. But
it's also very limited. However, that's what research is for. Small steps
together are a big leap for mankind :p

I also like this conclusion in the article:

""" In recent years the increasing scope and volume of software development
efforts has produced a broad range of systems with similar or overlapping
goals. Together, these systems capture the knowledge and labor of many
developers. But each individual system largely reflects the effort of a single
team and, like essentially all software systems, still contains errors. We
present a new and, to the best of our knowledge, the first, technique for
automatically transferring code between systems to eliminate errors. The
system that implements this technique, CP, makes it possible to automatically
harness the combined efforts of multiple potentially independent development
efforts to improve them all regardless of the relationships that may or may
not exist across development organizations. In the long run we hope this
research will inspire other techniques that identify and combine the best
aspects of multiple systems. The ideal result will be significantly more
reliable and functional software systems that better serve the needs of our
society. """

~~~
vezzy-fnord
Well, so much for everyone panicking over strong AI taking over their jobs and
writing bug-free software.

Not that it isn't impressive, but there have been all sorts of academic
projects over the years that require lots of manual intervention to make their
outwardly powerful techniques work, and many of them never even see the light
of a public release.

------
Piskvorrr
The old adage "fixing a bug introduces at least two more" comes to mind. Now
fully automated! ;)

But seriously: autogenerating fixes as observed by fuzzing does sound cool.

~~~
collyw
So do "schemaless databases".

~~~
robmccoll
"Schemaless databases" are cool. I would rather maintain as little schema as
possible in the database and instead keep it in my code. Schema must exist
somewhere because my data model exists. Since the data model and my program
are tightly coupled (since the entire point of the program is to manipulate
internal data state due to external and internal stimulus and produce output),
keeping them together makes the overall system manageable versus the headache
of effectively having code in two places. This is similar for complicated
routing configurations pushed to proxies and other external config. They are
really part of your code.
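
One way to read "keep the schema in my code" is a typed model that validates documents as they come out of a schemaless store. The sketch below is only an illustration of that idea; the `User` model, its fields, and the `from_doc` helper are hypothetical and not taken from any particular database or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical data model: the schema lives here, not in the database."""
    id: int
    email: str
    tags: tuple[str, ...] = ()

    @classmethod
    def from_doc(cls, doc: dict) -> "User":
        """Validate a raw document (e.g. a parsed JSON object) against the model."""
        if not isinstance(doc.get("id"), int):
            raise ValueError("id must be an int")
        email = doc.get("email")
        if not isinstance(email, str) or "@" not in email:
            raise ValueError("email must be a string containing '@'")
        tags = doc.get("tags", [])
        if not all(isinstance(t, str) for t in tags):
            raise ValueError("tags must be a list of strings")
        return cls(id=doc["id"], email=email, tags=tuple(tags))


user = User.from_doc({"id": 1, "email": "a@example.com", "tags": ["admin"]})
```

The database stores whatever it is given; the rules about what a valid record looks like sit next to the code that manipulates it.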

~~~
MrBuddyCasino
If you can specify those invariants in your code in a concise and declarative
way, minimizing the chance for bugs, then great. That's not how it usually
works though. Also, how do you want to perform foreign key checks outside of
the db in a way that's fast?

~~~
carapace
"If you can specify those invariants in your code in a concise and declarative
way" ...then you've got a schema. ;-)

------
tempodox
If I exaggerate just a little bit, that means the end of us programmers
(though not right away). Just imagine, anyone can make a rough sketch of a
“computer program” and have a system like CodePhage fill in the blanks. There
would still be CS people for fundamental research and new discoveries, but the
rest of the software industry would collapse into one automated know-it-all
software replicator. Someone wake me up please!

~~~
TeMPOraL
Part of me is scared, but another part thinks "it's about time". The web
industry is mostly millions of people coding up the same CRUDs all over again;
it cries out to be automated.

~~~
tempodox
As for web apps, I would already be happy if you only needed two languages
(client / server, roughly) instead of five-ish.

And yes, lots of Java programs accessing a DB, slapping a front end on it that
lets people enter / request stuff in the browser are prime auto-generation
suspects.

~~~
anentropic
...what are the other three?

~~~
arithma
I guess: Backend, Frontend, Markup, Layout/Presentation, DB Querying

------
tluyben2
Related:
[http://dijkstra.cs.virginia.edu/genprog/](http://dijkstra.cs.virginia.edu/genprog/)

Edit: this one is open source; is the MIT one? I couldn't find any references
on the page.

------
vmorgulis
[https://news.ycombinator.com/item?id=9800883](https://news.ycombinator.com/item?id=9800883)

------
higherpurpose
Related or just a coincidence?

> _An Obama Administration official tells Re/code that recent advances in
> using automated methods to analyze software code for vulnerabilities have
> spurred interest in government circles to see if there’s a way to
> standardize how software is tested for security and safety._

[https://recode.net/2015/06/29/famed-security-researcher-mudg...](https://recode.net/2015/06/29/famed-security-researcher-mudge-leaves-google-for-white-house-gig/)

I just wonder what will happen to Google's Project Vault [1] now that Mudge is
gone. Hopefully it will still be on track.

[1]
[https://www.youtube.com/watch?v=V6qrQzn8uBo](https://www.youtube.com/watch?v=V6qrQzn8uBo)

~~~
miander
Regarding the quote from the article, I'm guessing it's referring to code
fuzzing and static analysis tools which have now become relatively widespread.

------
omouse
I wonder if this works mainly for C, C++, ObjC, Java and C#, Python, Ruby or
would it also work for Lisp/Scheme and other more powerful languages? Does it
only work for crashes?

In any case, CS is awesome. Love it when research that a few years ago would
have been theoretical is applied.

