
Rice University leads $11M effort in big data software analytics - ColinCera
http://news.rice.edu/2014/11/05/next-for-darpa-autocomplete-for-programmers-2/
======
ColinCera
“We envision a system where the programmer writes a few lines of code, hits
a button and the rest of the code appears. And not only that, the rest of the
code should work seamlessly with the code that’s already been written.”

I'm skeptical...

'Writing computer programs could become as easy as searching the Internet. A
Rice University-led team of software experts has launched an $11 million
effort to create a sophisticated tool called PLINY that will both
“autocomplete” and “autocorrect” code for programmers, much like the software
that completes search queries and corrects spelling on today’s Web browsers
and smartphones.'

Interested, but very, very skeptical...

~~~
simplemath
I've seen plenty of platforms that write their own code from UI-type elements.
That code is scary.

~~~
AngrySkillzz
More or less scary than the code generated by a compiler? The JavaScript you
get out of a ClojureScript compiler is pretty scary, but it still runs
reasonably well. Though I suppose it would be pretty hard to write additional
JavaScript to interact with the generated code.

~~~
simplemath
> Though I suppose it would be pretty hard to write additional JavaScript to
> interact with the generated code.

------
hollerith
I wish the people here would upvote more posts by actual researchers and
programmers and fewer press releases from university PR departments.

------
mring33621
This is easy. IDE bot scrapes stackoverflow as you type. Ctrl-space to paste
'best' solution.

~~~
Igglyboo
[http://gkoberger.github.io/stacksort/](http://gkoberger.github.io/stacksort/)

------
jimbokun
From this article, I have absolutely no idea what this software will actually
do.

Can anyone enlighten me as to what they are actually trying to do, from the
perspective of an actual software developer?

Or did they just successfully string together the correct sequence of buzz
words to unlock the grant money?

~~~
leeber
> Or did they just successfully string together the correct sequence of buzz
> words to unlock the grant money?

This, absolutely. Big data, data mining, and machine learning are really cool
topics, but the terms have become overused, overhyped, and used out of context,
especially by people who don't really understand what they are.

I have an old co-worker who spent a lot of time working with large Excel
spreadsheets, using some formulas, sorting them to look for things, etc.

He lists "data mining" on his LinkedIn, and has a ton of people who endorsed
him for it.

This became a little off topic, but I hate how venture capital, grant funds,
the media, and misinformed people completely butcher these topics.

~~~
borplk
"big data" .... because my Excel spreadsheets were like ... really big ...
like you had to scroll down for 5 seconds big.

------
VLM
As usual, the future is already here, just unevenly distributed.

[https://github.com/capitaomorte/yasnippet](https://github.com/capitaomorte/yasnippet)

I do wonder if you gave $11M to João Távora what the end result would be.
Probably pretty cool.

~~~
bhudman
Grants like these baffle me. Having worked for the university, it is a lot
about how you phrase the proposal so you can get the approvers to issue the
grant. Someone had suggested yasnippet, and it already solves many of their
goals. I am skeptical about the auto-correct feature.

~~~
mafribe
One of the PIs is a Gödel Prize winner (M. Vardi). Track record matters in
getting funding.

------
lostpixel
I half-remember they tried to do a subset of this with some LISP/SCHEME, but
it didn't pan out too well.

[http://en.wikipedia.org/wiki/DWIM](http://en.wikipedia.org/wiki/DWIM)

------
boardstretcher
It's a great idea. But, what code base is this autocomplete going to run off
of?

If they are thinking of sourcing the internet itself, there had better be some
kind of omniscient, all powerful proofreader in place, because there are a lot
of people that submit a lot of code that is HORRIBLY insecure, inaccurate,
prone to breakage or just plain spaghetti.

I'd hate to be working on a missile guidance system, only to press <tab> to
complete a code block and end up getting some Intel Pentium FDIV instructions.

------
saurabh20n
The announcement itself is pretty sparse on the proposed approach, but given
the research interests of Swarat Chaudhuri [1] and Moshe Vardi [2], I would
guess they will attempt to combine recent advancements in program synthesis,
program verification, and code mining.

Program synthesis: There has been a lot of interest in the formal methods
community in automatically generating programs (for small instances), with the
target specification coming from input-output examples (e.g., Excel Flash Fill
[3]), program templates or holes (called Sketches [4]), reactive models of
adversarial environments, formal invariants, etc. The solution techniques
also vary considerably: from game-theoretic solving, SAT solvers, and model
checkers to version-space algebras and others. The community has not yet
settled on a specification language or a solving technology. The industrial
nature of the tools being leveraged (e.g., model checkers and SAT solvers from
the hardware community) gives hope for promising developments. A Berkeley
course [5] covers a good spectrum of the current developments.
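
To make "specification from input-output examples" concrete, here is a minimal sketch of enumerative synthesis. It is not how Flash Fill works (the comment above mentions version-space algebras for that); it only brute-forces short pipelines over a toy string DSL. The primitives and the names `PRIMITIVES`, `run`, and `synthesize` are all made up for illustration.

```python
from itertools import product

# Hypothetical toy DSL: each primitive is a named str -> str function.
PRIMITIVES = {
    "upper": str.upper,
    "lower": str.lower,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "initial": lambda s: s[:1],
}


def run(program, text):
    """Apply a pipeline of primitive names to text, left to right."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def synthesize(examples, max_len=3):
    """Return the first shortest pipeline consistent with every example."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None  # nothing in the search space fits the examples


# The "specification" is just two input-output pairs.
examples = [("ada lovelace", "L"), ("alan turing", "T")]
program = synthesize(examples)
print(program, "->", run(program, "grace hopper"))
```

The search space grows exponentially with pipeline length, which is one reason the techniques listed above (SAT solvers, version-space algebras) exist at all.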

If I were to guess, maybe the Rice researchers are approaching the code
completion/correction problem as mining for fragments of large codebases that
are incomplete/incorrect and applying program synthesis to fill those
fragments. Of course that would mean that they would also need to mine the
specification requirements for those fragments. All of this is easier said
than done, and it would be an ambitious project. Swarat has also done some
really cool work on "probabilistic reasoning for programs" and "verification
of probabilistic programs", so that might be part of it too. (Of course, I may
be completely off-base! After all, we are commenting on a non-technical
funding announcement here.)
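
As a rough illustration of "filling an incomplete fragment", here is a toy hole-filling sketch. Everything in it is an assumption for the sake of the example: the `spec` function stands in for a mined specification, `sketch` is a fragment with two integer holes, and the bounded brute-force search stands in for the solver-based search a real tool such as Sketch [4] would use. Nothing here is taken from the PLINY project itself.

```python
from itertools import product


def spec(x):
    # Hypothetical reference behaviour the completed fragment must match.
    return 2 * x + 6


def sketch(x, a, b):
    # Incomplete fragment: a and b are the holes to be filled.
    return a * (x + b)


def fill_holes(candidates=range(-5, 6), inputs=range(-10, 11)):
    """Try every (a, b) pair and keep the first that agrees with spec on all inputs."""
    for a, b in product(candidates, repeat=2):
        if all(sketch(x, a, b) == spec(x) for x in inputs):
            return a, b
    return None  # no assignment in the candidate range works


holes = fill_holes()
print(holes)
```

Note that agreement on a bounded set of inputs is only testing, not verification; the program-verification half of the guess above is about closing that gap.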

[1] Swarat's publications:
[http://www.cs.rice.edu/~sc40/pubs/](http://www.cs.rice.edu/~sc40/pubs/)

[2] Moshe's publications:
[http://www.cs.rice.edu/~vardi/papers/index.html](http://www.cs.rice.edu/~vardi/papers/index.html)

[3] Excel's FlashFill from Sumit Gulwani, researcher@MSR:
[http://research.microsoft.com/en-us/um/people/sumitg/flashfi...](http://research.microsoft.com/en-us/um/people/sumitg/flashfill.html)

[4] The Sketch program synthesizer:
[https://bitbucket.org/gatoatigrado/sketch-frontend/wiki/Home](https://bitbucket.org/gatoatigrado/sketch-frontend/wiki/Home)

[5] Ras Bodik/Emina Torlak: Berkeley course material on Program Synthesis:
[http://www.cs.berkeley.edu/~bodik/cs294fa12](http://www.cs.berkeley.edu/~bodik/cs294fa12)

~~~
infinite8s
I put my money on this comment being reflective of the actual research award
vs all the other comments complaining that it's an $11M grant to build an
autocomplete that scrapes the internet for code snippets.

------
geobmx540
Posted just a bit later on HN:
[http://codesnippet.research.microsoft.com/](http://codesnippet.research.microsoft.com/)

------
chadmckenna
"PLINY is part of DARPA’s Mining and Understanding Software Enclaves (MUSE)
program, an initiative that seeks to gather hundreds of billions of lines of
publicly available open-source computer code and to mine that code to create a
searchable database of properties, behaviors and vulnerabilities."

I feel that the reason DARPA is willing to fund this is because of that last
part: "vulnerabilities".

------
RA_Fisher
Hopefully they publish in public journals!

------
m3sh
All those millions are given for explaining stuff with those papers and not
laughing once.

------
blktiger
Autocomplete + the Internet already erode many programmers' skills to the
point where they can barely write code without help. I can't imagine what this
kind of tool would do.

Not that there is anything wrong with autocomplete. I certainly use it, but
I've seen a lot of programmers that barely understand the code they are
writing.

