
Generating code from natural language is closer than you think - p_alexander
http://blog.stephenwolfram.com/2010/11/programming-with-natural-language-is-actually-going-to-work/
======
blahedo
"Make it possible for programmers to write in English and you will find the
programmers cannot write in English."

I teach computer science and have a particular fondness for introductory CS.
The reason Stephen Wolfram is wrong, wrong, wrong about this is that people
that have never been taught programming can't express themselves precisely
enough in their native language, either; and even among those of us that have
been programming for decades, when we express ourselves in natural language we
_can_ be very precise but it takes a lot more work and becomes a lot more
unwieldy than just writing out our instructions in [pseudo]code.

CS educators have been wishing for a long time that "intro to CS" didn't
equate to "intro to programming". And it doesn't have to, not quite, but the
reason it always seems to revert there is that the prerequisite for _every
other thing in CS_ is, not programming itself, but a certain precision of
thought that is easiest to teach just by teaching students to program. In a
programming language. Because if you try to make them write out instructions
in a natural language, and you notice that they aren't being precise and
therefore deliberately misinterpret that instruction, they just think you're
being a dick about it. They sometimes even think this if you honestly
misinterpret them. (This is true even in a non-CS context.)

Saying that we will soon be "generating code from natural language" is, at
best, misleading. It implies that people who couldn't learn a programming
language will be able to program, which is quite untrue---I promise that with
the possible rare exception of a few pathological edge cases, when people
can't learn to program, the language is the least of their problems. And for
those of us that _can_ and _do_ learn programming languages, all but the
simplest sorts of programs will probably be easier to write in a programming
language (which was designed for that sort of thing) than in a natural
language (which was not).

(And holding up Mathematica as an exemplar is particularly egregious; it is so
loaded with syntax that "just works" that you need to either have a deep
familiarity with traditional mathematical notation or else a degree-level CS
background in programming language theory if you want to have a good shot at
learning the language in anything more than a pattern-matching,
fill-in-the-blank way.)

~~~
shasta
The problem with this analysis is that it discounts the possibility of
interactivity. There are many who cannot think through the logical steps
needed to complete a task, but can watch a system stepping through a task,
stop it when it seems to go wrong, and provide an explanation as to what's
wrong about it. There are a whole lot of problems for which this trial and
error process would converge to a mostly working solution.

Hell, there are many _programmers_ who code things this way - try it, change
it, repeat. Natural language would just allow non-programmers who aren't
familiar with the syntax to join in on this process. Continued practice would
likely improve the reasoning abilities of the user and make the process
faster.

~~~
blahedo
I have no problem with this idea of iterative refinement, and indeed I
actively teach it. There are certainly students who pick up the syntax more
_slowly_ , but this is largely a matter of degree. But the students I refer to
above have trouble even identifying what's wrong about an intermediate value
or where things went off the rails or what a correct answer would even look
like. They're more common than you (as a hacker) would ever believe unless you
teach intro CS. They _can_ be taught at least enough to pass CS1, but it's not
easy and "natural language programming" will not help them with the things
they need help on. (Might actively hurt, actually.)

------
jerf
This is in the class of "demoware", projects that are easy to program fancy
demos for but are very difficult to bring to production status. (See also:
"fully visual programming".) It's only really interesting if they escape from
that. We'll have to wait and see.

~~~
brudgers
There is a difference between visual programming and natural language.

Natural languages already have tokens, syntax, and grammar whereas visual
fields do not have them. All those elements must be imposed onto a visual
language before it can be translated into the machine language (e.g. there is
no obvious convention for visual commands or visual conditionals).

~~~
pak
_there is no obvious convention for visual commands or visual conditionals_

I don't know about that. Within a particular domain, you can certainly come
close to representing these things, or presenting the right controls for a
user to represent commands and conditionals. For an example, check out
QuickFuse <http://quickfuseapps.com>

Before we built this, we thought about common ways people "drew" voice apps,
and commands and conditionals both have certain natural visual representations
in the "voice app" space, e.g., blocks with branching arrows and writing text
to indicate spoken words.

~~~
brudgers
The convention is arbitrary (e.g. most people don't have receivers on their
phones and hanging up is done by pressing down a button not by placing the
receiver in a horizontal position).

If there were an obvious visual convention, the start button would not need to
say "start" and the hang up button would not need to say "hang up."

That's not to say that arbitrary visual conventions can't have great utility
(the alphabet being a case in point). But to illustrate the issues with
graphic conventions, Quickfuse does not use the long established conventions
for flowcharting. It uses natural language instead.

~~~
pak
Why would you exclude text labels from visual programming? I think this is an
artificial distinction. Sometimes they provide the best balance of space
consumption vs. clarity. Labels on their own aren't natural language, they are
at most terms put into context by the surrounding graphics; would your
criteria be that visual programs use icons for every single concept?

Also, you'll find that where we departed from conventions for
flowcharting, we did it to save pixels, or make the UI more accessible. There
is a tradeoff in visual programming between ease of editing and clutter--
compare with Max MSP, which has a stark UI, at the cost of having you memorize
certain textual commands. If you draw a bare flowchart, it's not obvious how
to manipulate it until you draw other GUI controls on top of it or make some
modifications. However, it is the general paradigm we are tapping into.

~~~
brudgers
> _"Why would you exclude text labels from visual programming?"_

My initial comment was a response to _"fully visual programming"_ in the
ancestor comment and the implication that natural language programming faces
similar challenges.

The expediency provided by labels is a result of the arbitrariness of graphic
interpretation. I am not suggesting that the use of labels isn't helpful, only
that the use of labels doesn't really differentiate "visual programming" as a
subset of programming, i.e. in and of itself a text label is not significantly
more visual than text on a terminal screen.

For example, the label is necessary because the visual convention for "start"
is ambiguous. Left, right, top and bottom are all used as a starting position
depending on the arbitrary conventions of the context. Likewise, a Ma
Bell-style receiver, a green flag, a vertical stroke, or a hand with index
finger extended as if to press a button may all be used to indicate start.

I used your departure from flowcharting conventions as an illustration of the
unique problems with graphical communication conventions. I am not implying
that deviating from flow charting convention is a bad idea.

My point is that your deviation from flowcharting conventions is arbitrary in
the sense that it was driven by factors irrelevant to the process of flow
charting (i.e. the limitations of the medium on which the flowchart is
presented rather than concerns about the mapping of graphic symbols to
processes).

------
stellar678
I would be fascinated to see several hundred years down the road how natural
languages and computer languages have commingled and evolved into something
new. I'd be inclined to believe that bringing natural language to computers
won't just be a one-way street.

You already see this in places like Hacker News, where people often use
constructs like "s/thing/other thing/" because it's more concise and useful
than writing out the natural language version.

~~~
alanh
I upvoted you. But I am not sure that "s/thing/other thing/" is more concise
or useful than all alternatives.

"thing" → "other thing" is the same length, and while not natural language, it
isn’t a computer language.

I think using the sed-like (right?) language is more useful as a signaller.
Check it out, yo, I grep shit _all the time_.

~~~
silentbicycle
The whole s/X/Y/ thing is very Unix, and is a (sub)cultural signifier as much
as anything. I'm not sure if it's originally from ed, sed, or what, but most
people (self included) probably picked it up from vi (nvi/vim/etc.) or perl.

X->Y makes just as much sense, but the s (for "substitute") makes it mnemonic
- I read it as "sub X for Y".

~~~
aaronblohowiak
ed gave rise to sed and vi: ed -> em -> ex -> vi (ex's VIsual mode). vim is Vi iMproved.

x->y suggests lambda to many people in our community.

Edit: I meant to provide some additional information to other readers, not to
disagree in any way.

~~~
silentbicycle
Well, right, but how many people here have actually used ed standalone? (
_lone hand_ ) It's overwhelmingly likely that most people picked it up from
vi(m).

------
bradly
While it would be neat to give the power of programming to everyone, I'm not
convinced coding in a natural language would necessarily be better/easier than
writing code in Ruby or Lisp or Python.

Sure, you eliminate the first big hurdle in programming, but learning the
syntax of a programming language is usually one of the easier parts of software
development.

~~~
brudgers
"Draw a red circle" is enough to get a red circle. Basic, C++, or Javascript
just aren't that closely coupled with the way we think.

The advantage of natural language over a high level programming language would
appear to be analogous to that which a high level programming language has
over assembly.

I think you may be conflating programming with software development. People
still develop software in assembly language, but few people use it in lieu of
javascript on the web.

~~~
necubi
Where is the red circle drawn? How large is it? What shade of red? How are
those arbitrary values chosen, and how do I choose others?

The inherent problem with NLP is that human languages are imprecise and
ambiguous. For an example of the problems with NLP, try using Wolfram Alpha.
While it returns useful results for many queries, as soon as you stray off the
beaten path it can become an exercise in frustration as you try to figure out
exactly which phrasing it will accept, especially when I already know the
Mathematica command that would accomplish the desired goal (or can look it up
quickly).

Obviously Alpha isn't the end-game of NLP and improvements can be made to
accept more constructions. But ultimately, you're going to have to restrict
yourself to a subset of your natural language's syntax, grammar and
vocabulary, and I believe that learning what that subset is is far more
difficult and frustrating than just learning a programming language.

Furthermore, merely knowing an NL programming language isn't enough to be a
programmer--you still need to learn how to think logically and
algorithmically, which seems to be the hard part for people new to
programming.

~~~
akkartik

      Draw a circle.
    
      Color it red.
    
      Make it smaller.
    
      No, 5% bigger.
    
      Make the radius 5 units.
    

Refining specifications like that seems like the holy grail of movements like
aspect-oriented programming, NLP or no. Separating concerns is a _really_ good
thing.

~~~
mkramlich

      circle = Circle(); // default origin, radius, color, etc.
      circle.set_visible(true)
      circle.color = RED
      circle.make_smaller()
      circle.set_radius(5)
    

// or whatever. just saying we can do approximately this already today, in
most any modern language, as long as you're willing to express it in the prog
language rather than the natural language. And I'm not sure having it
expressed in a natural language is better in any significant way.

~~~
alextp
This only works if someone has added such an interface to every object in the
programming language, the user is familiar with the dot-calls-method
convention (and the state-is-modified convention), etc. I think to do actual
programming it's probably best to use an actual programming language, as you
do want all the details under control. To do what most people do with
mathematica, however, (and the way most people use mathematica is more like
the way they use an interpreter than the way they write in a source file, by
trying out all sorts of combinations of some little pieces while they work out
the actual problem they're trying to solve in their heads) it seems like a
good idea, as you're just removing friction from the system (what if
circle.make_smaller() was named circle.reduce_size()? Is this responsibility
better offshored to an intellisense-like technology with heavy autocompletion?
But in general I don't want to be told what to type, I want what I type to
work)

------
erikpukinskis
Awesome, Stephen Wolfram has duplicated Terry Winograd's 1971 PhD thesis,
which ran in 256K of memory on a PDP-6.

Winograd: <https://hci.stanford.edu/~winograd/shrdlu> Wolfram:
[http://blog.stephenwolfram.com/data/uploads/2010/11/conesphe...](http://blog.stephenwolfram.com/data/uploads/2010/11/conesphere_11.jpg)

------
amichail
A related thought experiment: if programmers were to work for free, would non-
programmers think up lots of clever things for them to build?

~~~
jpwagner
This is a great point.

After a few minutes of thought, the biggest beneficiaries of natural language
coding would be what I'd call late bloomers: people who always thought like
coders, but never actually did any programming.

------
justin_vanw
Well, since it isn't possible (natural language isn't precise enough to even
communicate efficiently with other humans who share the same hardware as you,
much less an unthinking automaton such as a computer), that would put the time
frame at around never. If it ever actually happens, that would by definition
of 'never' be sooner than I think, so all he has to do is actually accomplish
it instead of talking shit his entire life, and he'll have proved his
statement correct.

Maybe he'll call it 'A New Kind of Programming Language'.

~~~
derefr
Programming languages are _specificational_ : you say everything up-front,
then the computer interprets all your statements at once and executes them.

Natural language, however, is _conversational_. You say A, the person you're
talking to interprets that as Z and asks for clarification, you explain the
difference between A and Z, now the other person thinks M and asks more
clarifying questions, etc.

Computers are fully capable of working conversationally, instead of
specificationally; it just requires that instructions be stored in a form
that's a bit more complicated than a linear tape (e.g. a database of
constraints, like Prolog.)
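
To make the contrast concrete, here's a toy sketch (Python, with invented
phrases and properties, not any real system's API) of a conversational loop
that refines a stored description instead of demanding a complete
specification up front:

      # Each utterance refines the accumulated description, and the system
      # echoes its current interpretation back (the "clarifying question" step).
      spec = {"shape": "circle", "radius": 10, "color": "black"}

      def refine(utterance, spec):
          words = utterance.lower().split()
          if "red" in words:
              spec["color"] = "red"
          if "smaller" in words:
              spec["radius"] *= 0.9
          if "bigger" in words:
              spec["radius"] *= 1.1
          return spec

      for utterance in ["color it red", "make it smaller", "no, a bit bigger"]:
          spec = refine(utterance, spec)
          print(spec)  # show the user what was understood so far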

~~~
silentbicycle
In more conventional programming terms, some languages have REPLs* for a
conversation, and support _declarative_ programming ("Here's what I want,
figure it out"), but most are procedural ("Do this, than this, than this, then
give me the result").

* Read/eval(uate)/print loops

Prolog (also Erlang and some others) is mixed; you reload a file of rules
_read as a whole_ , but can easily prompt the system for easy testing, and
reloading is very fast. It's very convenient with a Prolog shell terminal and
a vi window, two buffers in Emacs, etc.
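
To illustrate the procedural/declarative split above with a trivial Python
sketch (a list comprehension isn't full logic programming, but it gives the
flavor):

      nums = [3, 1, 4, 1, 5, 9, 2, 6]

      # Procedural: spell out the steps ("do this, then this, then this")
      evens = []
      for n in nums:
          if n % 2 == 0:
              evens.append(n)

      # Declarative-ish: describe the result ("here's what I want")
      evens = [n for n in nums if n % 2 == 0]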

~~~
derefr
Prolog has a REPL, but only new _facts_ can be declared through it
efficiently, not new _rules_. If you (re-)declare a rule (through `assert`),
the entire constraint database is actually re-evaluated behind the scenes. The
big problem in conversational declarative programming is how to start with
general-purpose rules, and work downward with more and more special-case
exceptions, without each new assertion taking longer to integrate into the
database than the last.
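
As a toy illustration of that "general rules plus special-case exceptions"
shape (plain Python with invented names, nothing to do with Prolog's actual
internals), where more specific assertions are consulted before earlier,
general ones:

      rules = []  # (predicate, conclusion) pairs, most recently asserted first

      def assert_rule(predicate, conclusion):
          rules.insert(0, (predicate, conclusion))

      def query(x):
          for predicate, conclusion in rules:
              if predicate(x):
                  return conclusion
          return None

      assert_rule(lambda bird: True, "flies")               # general-purpose rule
      assert_rule(lambda bird: bird == "penguin", "swims")  # special-case exception

      print(query("sparrow"))  # -> flies
      print(query("penguin"))  # -> swims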

Inform is an example of a natural-language-ish, rule-based system (for
programming text adventures) that _could_ efficiently re-declare rules in a
REPL (if not for its basis in virtual machine image formats that expect to be
compiled from complete specifications.) Inform guarantees efficiency by using
a hub-and-spoke system of rules: rather than every rule having the possibility
to interact with every other rule, rules can only interact with rules in their
own "rulebook" (module), the core rulebooks (standard library), and the "meta"
rulebook (monkeypatches to re-specify libraries.) Thus, integrating a new
definition only takes O(k + n + e) time—where k, n, and e should all be
small—rather than O(n^2). This works well for Inform, but I'm not sure whether
it would be as effective in a general-purpose programming environment.

~~~
aaronblohowiak
Do rules need to be global in scope?

~~~
silentbicycle
Not necessarily. Some Prolog implementations have module/packaging systems,
some don't. Prolog is a weird language - many details feel very antiquated* ,
yet on the whole it's _way_ ahead of its time (esp. constraint programming). I
think it would fare _much_ better as an embedded library (like e.g. Lua or
SQLite), rather than a freestanding language. Working on it, though I will
likely finish other projects first.

* Case in point: Loading a file is "consult"; I assume this is historically because Prolog was originally a language for doing NLP in French. (See e.g. HoPL-2.)

------
waterlesscloud
Sure, right after physics is adequately expressed in natural language.

There's a reason physicists express their concepts in mathematics, and that's
because math is the language humans devised to express those things, having
found natural language inadequate.

Programming is similar in that regard.

------
ADRIANFR
This is clearly demoware, good enough to impress the general public. If it
could go from "Draw a red circle" to "Draw a red circle overlapped one third
with a blue square with white dots over the lower left corner", that would be
progress. And then I wonder whether this would also work: "Paint a white-dotted
blue square that intersects over a third of a red circle in the lower-left
corner." Or will it give a "compile error"?

Why not just create a DSL (e.g. in Scala) with a simple, standardized NL-like
syntax that can give meaningful "compile errors"? There is no need to impress
the general public.

------
mkramlich
Two reactions to this piece.

1. I'm not sure what problem so-called natural language programming is trying
to solve.

2. Though I admire this man's building of Mathematica and the company that
sells it, I'm generally not a fan of what I perceive as his history of
"discovering the obvious" and self-promotion. Or rather, rediscovering things
or making them sound like he invented them or came up with them for the first
time. Cellular automata and their implications in his book "A New Kind of
Science" were one case, and this piece sounds like more of the same. I give
him a little slack because he's in business and so there's the self-promotion
angle, but not much.

------
hackinthebochs
First off, did he really have to plug ANKS? Honestly it seems like every piece
of writing I come across from him he has to mention it.

Secondly, I think eventually programming will have to become more mainstream.
And this will be done through some form of a natural language interface.
Programming is, at its best, a tool for solving problems. It's way too
limited now, with only an elite group getting to control its use.

Eventually programming will have to become a tool as common as mathematics.
The problems we face are only going to continue to grow in complexity where
the only way to get a handle on them is through automated computation. The
user will need the instant, iterative feedback that only a self-made program
can provide. For this to happen, the interface to programming will need a
radical change.

~~~
Uchikoma
Most people are not able to use mathematics as a tool.

~~~
hackinthebochs
That's not true. Most people with jobs that aren't just manual labor use basic
math to solve problems. As the problems get more complicated, the tools have
to grow as well.

Furthermore, I'm not talking about your average office assistant. I'm talking
about professionals in a science-related field. Being able to use programming
as a tool is going to become invaluable.

------
Uchikoma
This will end exactly like Infocom adventures. They are mostly nice, but
quite often you hunt for the exact phrase the parser understands (though to be
fair, that happened more often in Magnetic Scrolls games).

But would you call using a text adventure (put blue ball in red box)
programming?

~~~
jcl
I was thinking the same thing. And it's no coincidence that Inform 7 -- the
latest iteration of a popular text adventure game programming language -- is
perhaps the most extreme example of a programming language pretending to be
English.

Here's the actual source code of a game:
<http://inform7.com/learn/eg/bronze/source.html>

Of course, it's really a precise programming language that happens to read
like English; the same text might not work if it were rearranged into
semantically equivalent English.

------
IAforyears
I think that the title should be: Generating code that can be described in a
short phrase is closer than you think.

We can create a database of the different phrases that people use to suggest
a command. For example:

"Calculate the limit", "find the limit" ... all of these map to Limit, and so
on.
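
A toy sketch of that kind of lookup (Python; the phrases and the Limit command
are just the examples above):

      # Many phrasings map to one command.
      PHRASES = {
          "calculate the limit": "Limit",
          "find the limit": "Limit",
          "compute the limit": "Limit",
      }

      def to_command(query):
          return PHRASES.get(query.strip().lower())

      print(to_command("Find the limit"))  # -> Limit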

To sell a product, this kind of ability is well received. It's like telling
your telephone: "Please call this number for me, the number of my friend
Alfred." The computer looks up Alfred in its database and connects to that
number. That is not deep AI, but it is a useful trick for selling products.

------
zellyn
Sure, by the time you enter all the specifics of the Circle, you've got
something more unwieldy than a succinct programmatic description.

But for most people unfamiliar with Mathematica syntax, typing "draw a red
circle" and having the computer choose sensible (or any) defaults yields a
template of the exact code one would have needed to type. That saves Googling
or reading the manual for circles, and teaches the syntax in a very natural
way.

~~~
elasticdog
That is correct...it is nice to be able to see the underlying Mathematica code
that any natural language query generates. Here's your "draw a red circle"
example:

<http://img502.imageshack.us/img502/4355/drawaredcircle.png>

------
ckcheng
Not the same thing, but similar, from 1983:

"A natural language query system for a prolog database" (Hadley)
<http://ir.lib.sfu.ca/handle/1892/7254>

Not general code generation, I suppose, but there has been quite a bit of
work on natural-language-to-database-query systems in AI research elsewhere.

------
EGreg
Natural language can lead to contradictions and ambiguities. Just look at
that Star Trek episode where they supposedly brought down androids built by a
very advanced ancient race just by saying stupid and self-contradictory
things.

Which by the way was a stupid premise but hey :)

------
Uchikoma
A programmer can code in English (though a programming language is easier,
more concise, and easier to understand) or in a programming language.

A non-programmer cannot write code, neither in English nor in a programming
language.

------
hsuresh
On a tangent, has anyone seen/used Intentional Workbench? They claim to be
changing the way we program, but I haven't seen anything concrete.

------
jared314
The "uncanny valley" might apply here.

~~~
yafujifide
Might be right, you, I think.

------
skybrian
It's over-hyped as usual, but this looks like it could evolve into a handy way
of generating snippets of working code that you can then modify. Think of it
as an alternative to doing a Google search for a useful example to start from,
or a better kind of Rails scaffolding.

------
madcaptenor
Closer than we think, maybe, but further than Stephen Wolfram thinks.

------
hasenj
I'm suspicious of the usefulness of such an approach.

What's easier, "5 + 10" or "five plus 10", or even worse "five added to ten"?

Dijkstra had an article about that, titled: On the foolishness of "natural
language programming"[1].

[1]:
[http://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/EW...](http://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/EWD667.html)

------
alanh
These guys are _smart._

------
pama
To Wolfram Alpha: Hello, HAL. Do you read me, HAL?

Result: Affirmative, Dave. I read you.

