IMO the argument is sound in one direction: the precision, structure, and unforgiving nature of programming languages are a feature rather than a bug. That said, I don't think it follows that there is no benefit to natural language interfaces to machines. I think he goes wrong here:
> When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.
I would say that he leaves out a critical detail: the nonsense of which is not obvious _and also not important_.
Exactly. When you instruct a human in natural language you can rely on them to do disambiguating and nonsense filtering, which means you don’t need to take the time to phrase things precisely. If you’re really lucky, when you give them an ambiguous instruction, they will even ask for clarification.
And these are precisely the sorts of things LLMs can bring to the table. With the added benefit that they are infinitely patient, and don’t suffer from embarrassment if they get it wrong.
As long as anything they produce is reversible and malleable they can be induced to try again or refine their output indefinitely, which helps solve ambiguities and confusions - I don’t have to prematurely jump to precise language if the LLM can figure out what I want from a vague description.
> And these are precisely the sorts of things LLMs can bring to the table.
LLMs have shown some promise here, obviously, but I think at the end of the day you are still going to have to enumerate all of the edge cases and error states w.r.t. a particular segment of code and it's not clear to me that LLMs are going to make this faster in many real world cases.
LLMs shine with boilerplate stuff that zillions have done before ("create a linked list implementation for me in language XYZ", "create a dropdown list with these five options", etc) but it is less clear to me that they are (or can be) useful for more custom domain- or application- logic.
There are certainly many classes of things for which specialized notation/representations are far clearer than natural language. Imagine describing how to play Beethoven's 5th with [insert human language here] rather than standard musical notation.
It's probably a continuum of suitability, and not a binary yes/no thing, but I think even after LLMs really hit their full potential most software development will probably still be best achieved with the precision of specialized programming languages and not natural language.
We’re perhaps talking about very different levels of ‘programming’. I mean it just in the sense of ‘expressing what you want a computer to do’.
Think about a thing someone needs to do in an office somewhere.
‘Get all the data from the excel file attached to this email, find the CRM record for each company listed and make a list of the ones whose data needs updating’.
That’s the kind of thing you can ask John in accounting to spend a day doing.
Or you can write a script. Try to express in precise unambiguous code exactly how you want to accomplish it and spend a couple of days debugging it.
Or you can ask an LLM with some tool access to do it and maybe it’ll give you a list in a few minutes.
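For comparison's sake, the "write a script" version might look roughly like this. This is only a minimal sketch: openpyxl handles the spreadsheet, while `get_crm_record` and the column layout are made-up stand-ins for whatever CRM the office actually uses.

```python
# Rough sketch of the "script" option. get_crm_record and the column layout
# are hypothetical; a real version would call the actual CRM API.
import openpyxl


def get_crm_record(company_name: str) -> dict | None:
    raise NotImplementedError("replace with a call to the real CRM")


def companies_needing_update(xlsx_path: str) -> list[str]:
    sheet = openpyxl.load_workbook(xlsx_path, read_only=True).active
    stale = []
    for row in sheet.iter_rows(min_row=2, values_only=True):
        name, email = row[0], row[1]  # assumed layout: A = company, B = email
        record = get_crm_record(name)
        if record is None or record.get("email") != email:
            stale.append(name)
    return stale
```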
And if you need to do that same task every week: either John gets good at doing it, or he gets bored and quits; if you made a script you have to keep bug fixing it and dealing with the changes in the excel file, and eventually someone tells you that it really needs to be implemented in The System so you file a Jira ticket and a product owner comes out and writes down your user story and six months later you get a web form you can upload your excel attachments to that sometimes produces the list of clients to update…
But if you asked the LLM to do it… you just ask it to do it again the next week.
This is the bit that maybe programmers haven’t quite realized yet: if I can get an LLM to just do that task, I don’t need to go and get someone to build some software to do it. I can get the computer to do it myself.
> This is the bit that maybe programmers haven’t quite realized yet: if I can get an LLM to just do that task, I don’t need to go and get someone to build some software to do it.
I think everybody understands that part, and that the cynicism/skepticism is about LLMs' ability to deliver on that aspect of it.
Part of training John, or writing a script, or telling an LLM, is handling all of the edge cases that exist in even a "simple" import job.
- What to do if there is a duplicate record? Create a dupe? Do an upsert? If the data conflicts, how do we decide which version "wins"?
- What about fuzzy dupes? Two records that probably refer to the same entity but don't match exactly, likely due to typos or slightly different accent characters, etc
- "find the CRM record for each company listed" <-- okay, and what if the record doesn't exist? fail?
- Is the import job an all-or-nothing thing? In other words should we wrap everything in a transaction and back it out if there's a single failure? Or do we keep going? Do we log failures? Email somebody?
- If we're wrapping the whole job in a transaction, should we lock out the users while this is happening so they don't see stale data? Do we notify them? Do we just do it in the middle of the night?
- Should we log changes to an audit table for debugging purposes?
- What if the import file is invalid? Do we log, email, etc?
etc etc etc.
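To make that concrete, even a bare-bones version that settles just three of those questions (upsert on duplicates, all-or-nothing transaction, failure logging) already looks something like the sketch below; the SQLite schema here is invented purely for illustration.

```python
# A hedged sketch of a few of those decisions made explicit: upsert on
# duplicate names, one all-or-nothing transaction, and a log for auditing.
# Assumed schema: companies(name UNIQUE, email, updated_at).
import logging
import sqlite3

logging.basicConfig(filename="import.log", level=logging.INFO)


def import_rows(rows: list[dict], db_path: str = "crm.db") -> None:
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # single transaction: any failure rolls everything back
            for row in rows:
                conn.execute(
                    """INSERT INTO companies (name, email, updated_at)
                       VALUES (:name, :email, :updated_at)
                       ON CONFLICT(name) DO UPDATE SET
                           email = excluded.email,
                           updated_at = excluded.updated_at""",
                    row,
                )
        logging.info("committed %d rows", len(rows))
    except sqlite3.Error as exc:
        logging.error("import aborted, nothing committed: %s", exc)
        raise  # surface the failure instead of silently continuing
    finally:
        conn.close()
```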
Like Visual Basic, Excel macros, etc., I think LLMs will enable some non-programmers to do scripty, programmery stuff, and I think that is excellent. But anything really robust (and honestly, "simple" import jobs are often critical path and therefore need to be fairly robust) is going to require some rigor. A person telling an LLM what to do is still going to need to address all of those edge cases and then some.
It's also very unclear to me how we'll debug LLM-managed processes. When things go wrong, how do we determine what/when/why? And in fact, how do we determine LLM-managed processes are correct in the first place? These particular concerns seem completely solvable, but they will need to be solved.
This is true, and a good reason why automation is so valuable. It's not just the speed, but also the reproducibility. If the script goes wrong, it will go wrong in a predictable way that can be investigated and debugged. If John goes wrong, he will go wrong sporadically and without any clear reason - maybe he was tired one day, or maybe he got an important phone call in the middle of the task another.
With AI (or at least the versions of it that we're looking at today) we get the speed of a script with the "programmability" of a person, but the question still remains what sort of failure mode we can expect - predictable like a script, or unpredictable like a person. Right now, it's looking like LLM failures tend to be fairly unpredictable (in the sense that a prompt can give a good response one day and a bad one the next based on difficult-to-explain factors).
A random John has a much better track record of getting the result right instead of making it all up.
And you can tell John to be extra careful (in fact, if he has a bit of experience, you won't have to tell this either) and he will be. While if you tell the LLM to be extra careful, it may track intermediate steps better or just make some other answer up.
And yeah, this is very likely intrinsic to the architecture we use in our LLMs.
The part about the script is quite a stretch, let's face it. I'm not denying that asking in natural-ish language may be more comfortable for some operations, but when something becomes repetitive (and most things people do for money are), running a script hands down wins over typing out the same phrase every day.
Another example: look at the solution for a cubic equation written out in words and compare it with the same solution expressed in algebraic notation, e.g. Tartaglia's "Quando chel cubo".
http://dx.doi.org/10.1080/17513472.2014.933552
Algebraic notation is not just more concise/precise. It enables solving harder problems that otherwise wouldn't be even attempted.
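For reference, this is roughly what Tartaglia's rhymed recipe for "cube and things equal to a number" (x^3 + px = q, with p, q > 0) compresses into once algebraic notation is available:

```latex
% Tartaglia's "cubo et cose" case, x^3 + px = q (p, q > 0), in modern notation.
\[
  x \;=\;
  \sqrt[3]{\,\sqrt{\frac{q^{2}}{4} + \frac{p^{3}}{27}} + \frac{q}{2}\,}
  \;-\;
  \sqrt[3]{\,\sqrt{\frac{q^{2}}{4} + \frac{p^{3}}{27}} - \frac{q}{2}\,}
\]
```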
"When you instruct a human in natural language you can rely on them to do disambiguating and nonsense filtering, which means you don’t need to take the time to phrase things precisely. If you’re really lucky, when you give them an ambiguous instruction, they will even ask for clarification."
I have several friends and an entire career worth of coworkers on the spectrum who would unintentionally challenge the shit out of that statement.
Try asking a programmer to provide a formal proof of correctness of even a simple algorithm. Most are not trained to do this. Many try to forget it when they leave academia. Few actually try to use it.
Expecting a human to disambiguate and do nonsense filtering is a crap shoot. Usually done through trial and error. We write things we think are correct according to an informal specification, and since there’s no way to prove our implementation is incorrect… we review it, write a few example tests, and deploy it.
Surprisingly this is good enough most of the time because the liability for mistakes lies with the company, most mistakes are benign, and users will tolerate being frustrated to a degree.
This is why I’m not optimistic about LLM’s replacing programming. The art and science of programming is rooted in formal mathematics. It takes a sophisticated ability to reason and a certain amount of creativity to write beautiful proofs. And formal maths demands a level of rigorous thinking that even mathematicians balk at.
The allure that EWD is writing about here is very much alive. Many people hope to have a genie inside their computer do the thinking for them. It may not be the answer we desire but precise thinking is what we need and I would rather have better tools and languages for doing formal mathematics and programming with them than using a really expensive system to piece together the program I want from other peoples’ code and guessing what I want from an imprecise specification.
If the specification itself isn’t precise then any program I give you, by definition, cannot be wrong.
> When you instruct a human in natural language you can rely on them to do disambiguating and nonsense filtering, which means you don’t need to take the time to phrase things precisely.
This relies on a feature of human communication that the LLMs do not have an approach for: empathy. I am able to discern meaning from ambiguity because I can put myself in your shoes and decide what you probably meant.
Computers can probabilistically guess what was meant in natural language, but it will be missing out on enormous context of who the speaker is and what they're attempting to achieve (granted, you could provide this context in text form to the LLM, but then you are no longer taking advantage of the human ability to disambiguate, instead relying on verbosity and redundancy to constrain the LLM to catching your drift)
I think it depends on the system you are building.
If you're building system software that must be reliable, then precision is fantastic and mandatory.
But if you're building some light weight automation that will have enough human oversight that it can tolerate a bit of failure, it's very powerful to open up the implementation to a layperson.
> IMO the argument is sound in one direction: the precision, structure, and unforgiving nature of programming languages are a feature rather than a bug.
If we have learned anything from the previous decade of programming, it is that strong (and static?) types, strictness, and compiler feedback to developers are features that nearly everyone appreciates. We see this in the popularity of Rust and TS, static type inference tools, and linters like eslint, mypy, ruff, and so on.
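As a tiny, made-up illustration of the kind of feedback those tools give: the call below is flagged by a checker like mypy before the code ever runs, whereas without the checker the mistake only surfaces once that path executes.

```python
# Made-up example: a price accidentally stored as a string.
def total_cents(prices: list[int]) -> int:
    return sum(prices)


order = {"widget": "499"}  # oops: the price is a str, not an int

# A checker like mypy rejects this call (list[str] is not list[int]);
# without it, the bug only shows up at runtime as a TypeError inside sum().
total = total_cents(list(order.values()))
```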
> I would say that he leaves out a critical detail: the nonsense of which is not obvious _and also not important_.
Not important to whom?
Strongly disagreeing with you here, because misleading people is quite important to those who carefully manufacture nonsense.
And if it's not important to you, you're helping them reproduce it.
For further reference, I direct you to one of the best (in my opinion) essays of all time: Orwell's "Politics and the English Language"[1].
Like no other, this essay illustrates how harmful the critical flaw Dijkstra pointed out is if we aren't careful enough about how we use our "natural" language.
I put "natural" in quotes, because something like Google's press release on any given matter is anything but natural, and yet it's this kind of language that both sticks around and is deliberately used to create non-obvious nonsense.
I think you might have committed a straw man error. My point is not that approximate or vague language is always good, it's that it's sometimes good. Sometimes imprecise communication is good and useful, sometimes it's doublespeak.
Perhaps I'm hopelessly out of touch with pragmatism, but I think Dijkstra's comments would still be applicable even if we could trust GPT to write code correctly.
Getting some task done is only half the reason I write programs, even professionally; just as important (if not more important) to me is that I understand something precisely and completely. Writing a program is how I do that. That program is usually in the form of code.
Programs aren't just a means to an end, they're an end unto themselves. Or more correctly, a precise and complete understanding of a problem and its solution is desirable even if it's unnecessary for the task immediately at hand.
Although I do mean that understanding is valuable in the way that art or music are valuable, or that education is valuable [1], I also mean it's more valuable than any particular program that's a product of it (and I think there are practical reasons why [2]).
I see the same issue with the interminable line of todo list apps, mental map apps. They confound the utility of the process - surveying my immediate responsibilities and ranking them, connecting concepts being learned in a new domain - with the product. In my view it has to do with a 'productivity fetish'. We feel like we need to be always outputting something, when the real goal often is to internalize concepts.
In the meantime we’re using programming languages that cause us all kinds of problems because there’s a ton of constructs that cause undefined behavior.
I agree - we should not try to make computers that understand us better - we should learn to understand computers better.
I argue that software is a form of literacy (hell, I'm rewriting a book on it for the third time). But we don't say "instead of novels let's just stick to rearranging cut out panels from Marvel back issues."
We spend years training children to handle words and speech and presenting arguments - and we need to spend more on empirical testing and skepticism.
Literacy is a technology - and we bend our minds to encompass it - and we should expect no less of software
Is, though, the literacy of software strictly necessary, or an artifact of lower technology? Isn’t it the case that you can use plain English to explain _exactly_ what a piece of software does, and as such, you can, _exactly_ supplant software with plain English?
I think everyone in this thread is patting their backs right now about the necessity of precise syntax and symbology to express programs, but that’s still missing the point. Literacy and communication is built around the ways our minds already work (indeed, it’s often an artifact of a culture distilled to its essence). We _don’t_ need computer language symbols and grammar and syntax — it’s just another form of expression, and definitely not one that fits the human mind nicely.
> We _don’t_ need computer language symbols and grammar and syntax — it’s just another form of expression, and definitely not one that fits the human mind nicely
A computer is a machine that implements a particular form of mathematics. We happen to use them for fuzzy, imprecisely defined tasks, thanks to mountains of convenient abstraction, but we do very much need precise mathematical language for other classes of problems!
A language is a vehicle. If we need to build a website, natural language specs might get us where we want to go. If we need to build a rigorous model of some phenomenon, natural language is often an obstacle. There is a reason mathematicians invent and adopt formalisms: they extend our capacity for precise computation, mental or digital.
> Isn’t it the case that you can use plain English to explain _exactly_ what a piece of software does, and as such, you can, _exactly_ supplant software with plain English?
Is it any easier to do that than to “program”? In my mind that’s the exact same thing, just a different, less syntactically strict programming language. Writing a good specification is crazy hard, and it's usually incomplete, so you end up going back and forth with the human compiler (the programmer) to get a common understanding.
One might even argue that a cleaned up, domain-specific language actually helps mutual understanding. It’s not always the case that written text is easier to comprehend — see for example math notation.
I've been thinking about this argument, and while I agree that having a symbolic construct ("code") that strictly models the solution is ideal, arriving at that model via a single 'layer' of code that must be written perfectly by a programmer is a big hurdle, given the small fraction of humans who seem to have an aptitude for programming (much as with advanced math). It's probably a less versatile strategy than iteratively building the model with AI assistance, guided by natural language. I think in the future the AI agent will probably code the application, asking its human "PM" partner to clarify each ambiguity and adjusting as it goes. Finally, it should be able to generate charts, narratives, and diagrams explaining how the created application works, so that a non-programmer can validate the solution.
The entries ascend in date as they ascend in number. I believe that the date for this is 1978 since it sits between entry 665 "Trip report E.W.Dijkstra, U.K. – Bahamas – U.S.A., 11–30 April 1978" and entry 674 "Trip report E.W.Dijkstra, Zürich, 2–7 July 1978."
This is definitely the perspective of an academic - focused on the ideal at the expense of what's actually useful.
Programming in natural language will definitely be less precise than programming in an actual programming language, but with the very beneficial tradeoff that a broader swathe of people can use it. I'm not a programmer, but I've been using GPT to write scripts to automate pulling data from my Shopify store for an ecommerce business for reporting purposes. I'm sure that professor would be disgusted by the code, but the thing is... it works.
In my case, using natural language via GPT to create the script is obviously superior to any programming language. To write it myself in Python, I'd have to learn Python. Alternatively, I could hire a developer to do it, but then I'd have to write a product spec for them to implement, which takes just as much time as writing that same level of detailed spec and giving it to GPT (less, because I can half-ass the spec initially and iterate as I go with no communication overhead). Then, of course, I'd have to pay the developer. And while the results might be better in some very theoretical sense of code quality, in practice they would be the same, because the script would do the thing I need it to do without breaking anything.
There's a reason there's more than one programming language - each has upsides and downsides. As natural language becomes a programming language, it will have the upside of broad accessibility with the downside of relatively low precision. The good news for the author is that you don't have to use it.
>> To write it myself in Python, I'd have to learn Python.
Yes, you need to learn a programming language in order to program in it; that takes a long time and a lot of hard work, and there are no shortcuts.
And yes, with ChatGPT et al you can get the job done without paying the cost of learning the language - albeit only in a very limited way, because the performance of systems like ChatGPT in programming, when evaluated systematically (rather than listening to people's subjective opinions about it), is always very poor (and not in a "very theoretical sense of code quality", but in the sense of code that just doesn't do the job).
But my question is: ChatGPT is one product, by one company, that isn't even particularly tailored to the programming task. Are you really comfortable handing off your ability to write programs and create applications to one product, owned by one company? Doesn't that just completely take away your independence and even your choice of what you can program?
> the performance of systems like ChatGPT in programming, when evaluated systematically (rather than listening to people's subjective opinions about it), is always very poor (and not in a "very theoretical sense of code quality", but in the sense of code that just doesn't do the job).
I feel like you didn't read what I wrote. I used ChatGPT to write a script. It takes information from Shopify and writes it to a reporting spreadsheet in a way that I used to do manually. It literally does do the job - that was the whole point of my comment. You couldn't do the job any better, because the job is already done perfectly - it fulfills 100% of what it needs to do.
> Are you really comfortable handing off your ability to write programs and create applications to one product, owned by one company? Doesn't that just completely take away your independence and even your choice of what you can program?
I cannot understand how you're getting to this point. I already can't program (at least not to any particularly useful standard), so how does using ChatGPT to program take away my independence? It gives me the ability to do more things than I could before without relying on others who can program - that's an increase in independence.
And in terms of giving away my choice... no. I can still learn Python or any other language I choose. You're writing as though I've signed an exclusive programming rights contract with OpenAI. I'm just using their tool because it does what I need.
LLMs like ChatGPT are the modern gateway drug into programming. If ChatGPT removes some independence then users can always use Bard, or use an open source LLM eventually.
> but in the sense of code that just doesn't do the job
Regardless, people who are new to coding are making useful things with it that were unattainable to them before.
And soon, with future versions of GPT, the produced code will be of better quality anyway.
10,000 hours in programming is a romantic idea for most people in non-STEM professions.
Most people need only a familiarity for writing small scripts that help them on their day to day, as they do not have the time to dedicate years or even months of their life to a new hobby.
Unlike programmers attaining mastery over the span of 10 years, they are trying to reach mastery in their own field.
> some people found error messages they couldn't ignore more annoying than wrong results, and, when judging the relative merits of programming languages, some still seem to equate "the ease of programming" with the ease of making undetected mistakes.
> the fetters of medieval scholasticism — a vain attempt at verbal precision
&
> computer science would consist mainly of the indeed black art how to bootstrap from [using only language as an interface to machines]
Dijkstra was making a prophecy even though he looked (at times hypothetically) to the past. The first quote cuts through to the essential paradigm shift that led to enlightenment. The second is an apt description of what "programming" in the next decade may look like.
I'll add my own little ironic prophecy: if machines do become intelligent, they will most likely use formal symbolic systems to interact with each other.
Anyone proposing that we use ChatGPT to write programs or query databases should read and ponder this and pretty much any other Dijkstra papers they come across.
What is the SQL query for: “What was the name of the guy, you know from that consulting company like Deloitte but the other one, who worked on the design for some authentication system last year with Dick?”
I’ve seen LLMs find matches for queries like this, even if “Dick” is a nickname for “Richard”. An awful lot of business questions are in this category and neither relational theory nor full text search is useful to solve such problems.
An LLM coupled with a vector database could solve this, and provide entire new categories of products instead of a sub-standard substitute for an existing product.
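As a very rough sketch of that idea: the notes, the toy trigram "embedding", and the scoring below are all stand-ins. A real system would use a learned embedding model and a vector store, and only that would also bridge "Dick" to "Richard" or "like Deloitte but the other one"; the toy version just rewards shared wording.

```python
# Hypothetical sketch: fuzzy retrieval over CRM notes via embeddings and
# cosine similarity. embed() is a crude stand-in for a real embedding model.
import numpy as np


def embed(text: str) -> np.ndarray:
    vec = np.zeros(256)
    for i in range(len(text) - 2):
        # Hash character trigrams into a fixed-size vector (toy embedding only).
        vec[hash(text[i:i + 3].lower()) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


notes = [
    "Richard (consulting) worked on the authentication system design with Dick, 2022",
    "Accenture consultant reviewed the billing module last year",
    "KPMG engagement: data warehouse migration",
]

query = ("who was the guy from the consulting company who worked on the "
         "authentication system design last year with Dick")

q = embed(query)
# Vectors are normalised, so a dot product is the cosine similarity;
# the note sharing the most query wording scores highest.
for score, note in sorted(((float(np.dot(q, embed(n))), n) for n in notes),
                          reverse=True):
    print(f"{score:.3f}  {note}")
```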
> I’ve seen LLMs find matches for queries like this, even if “Dick” is a nickname for “Richard”. An awful lot of business questions are in this category and neither relational theory nor full text search is useful to solve such problems.
Question - have you seen an LLM that can answer queries like this correctly and consistently, for a range of natural inputs?
The ones I've seen often produce correct answers, but likely often produce apparently correct but wrong answers, and often produce garbage. For those systems, we've just changed the human's work from "write a query" to "filter the correct answers from the garbage" :(
Yes, provided the answer for the question is in context, not training. If you feed GPT4 a document and ask questions, it will virtually always answer those questions right. If you ask it for info in its training set, it will hallucinate sometimes.
With Anthropic's 100k token context window, you can do a whole lot with this.
Sure - once the hallucination problem is solved. Otherwise it'll just tell you the guy's name is Harry whether that's supported by the data available or not.
Speaking of that, has anyone shown a convincing demo of that working at all yet, or is it just hype? I've seen one short clip of it allegedly working as long-term memory, but the reliability could've been anything from zero to one.
> provide entire new categories of products instead of a sub-standard substitute for an existing product.
The question is, do people really want to give up deterministic execution or not. Sure, it might find that “Dick”, but will it reliably find the plain simple “Richard, order by date” query? Or will it sometimes omit/add a few additional records?
An LLM might also create a meta-analysis questioning the value of the basic assumptions involved in the desirability of creating 'entirely new categories of products' - which, if you look at the history of financial products invented in the United States over the past 40 years, has some validity.
When Elon Musk et al. talk about 'slowing down AI' that's the kind of thing that seems to concern them. What if AI could make a convincing argument for unionization at Tesla and the institution of a VW-style system of corporate management? Very frightening, I guess...
Sure he warned of dangers, but also invested heavily in AI for autonomous driving by hiring DL researchers ("a million robot taxis by 2020").
He also invested in OpenAI until he had a falling out about who would be in charge of the company. He recently founded X.AI, which is supposedly also building generative AI.
Everyone sees danger, but many, including Musk, are forging ahead.
Both were existing companies bought up by him, with the "founder" title included. Also, paying salaries for talent is very different from having said talent oneself. He has no expert knowledge on anything, one might have thought that at least he is good at programming (paypal), but look at twitter, he ain't good even at that.
Or just try it --- and quickly realise that LLMs can tell lies confidently, and mislead you in ways that no human would do (except perhaps the best --- or worst --- of politicians.) To paraphrase a memorable comment I saw here recently: AI combines the speed of machines with the imprecision of humans.
I had some conversations with ChatGPT about python coding. I am not so sure that an AI that’s specialized on code couldn’t write and modify code with the user having a dialog in regular language. I think that’s where it will be going.
When I write code it's usually not working the way I want it on the first try. There are errors and it also doesn't do exactly what I and users would want it to do. So I iterate. I compile and test and debug and modify.
But how could an LLM do that kind of thing if it cannot actually run the code, inspect the results, and compare them to some criteria it has in its "head"? Does any LLM do such a thing? Oh yes, you say, surely next year, but let's focus on what we know to be possible now.
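To make the question concrete, the kind of loop being asked about would look something like the sketch below. It is purely hypothetical: the model call is an injected placeholder, and whether any current LLM product reliably runs such a loop is exactly what's in question.

```python
# Hypothetical generate/run/repair loop. ask_llm is passed in as a plain
# function so any code-generation model could be plugged in.
import pathlib
import subprocess
from typing import Callable, Optional


def iterate(spec: str, test_file: str, ask_llm: Callable[[str], str],
            max_rounds: int = 5) -> Optional[str]:
    feedback = ""
    for _ in range(max_rounds):
        code = ask_llm(f"Write a Python module satisfying:\n{spec}\n{feedback}")
        pathlib.Path("candidate.py").write_text(code)
        result = subprocess.run(
            ["python", "-m", "pytest", test_file, "-q"],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return code  # tests pass: accept this version
        feedback = f"The tests failed with:\n{result.stdout}\nPlease fix the code."
    return None  # gave up after max_rounds
```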
I’ve been working on a tool called autorails. It is a GitHub application that reads issues and creates PRs. The ReAct tool in LangChain with a custom DockerGitGitHub tool results in this PR:
What’s your modern update on this article from 1978? Today is 2023-05-17 and this is on the first page of hacker news. Can you find actionable advice here for prompt engineers?
$ARTICLE
GPT-4:
This article from 1978 discusses the idea of natural language programming, which is the concept of programming computers using languages that are more like human languages, as opposed to the structured and formal programming languages that we use today. The author argues that such an approach is not feasible due to the imprecision and ambiguity inherent in natural languages, and the historical precedent of mathematics flourishing only after adopting formal symbolisms.
Looking at this concept from a modern perspective (2023), natural language programming is still not the norm, but there have been significant advancements in natural language processing (NLP) technologies. The development of sophisticated AI models such as GPT-3 and GPT-4 by OpenAI have demonstrated the ability of machines to understand and generate human-like text based on prompts. However, these models do not imply the feasibility of natural language programming. Instead, they are used in applications such as chatbots, translation services, and tools for generating human-like text.
In terms of applying the article's insights to today's software engineers, there are a few key points to consider:
1. *Understanding of Formal Languages:* Despite advancements in NLP, the importance of formal languages for programming remains. Engineers should have a deep understanding of the programming languages they use. The ability to think logically and algorithmically is crucial, and this is best facilitated by formal languages.
2. *Designing Interfaces:* The argument for "narrow interfaces" is more relevant than ever. As software systems become more complex, the need for well-designed and specific interfaces increases. Engineers should strive to create interfaces that are clear, concise, and precise in their function and communication.
3. *Natural Language Processing Skills:* While natural language programming might not be feasible, the importance of NLP is growing. Understanding how to use and implement NLP technologies could be beneficial for creating more intuitive user interfaces and interactions.
4. *Communication Skills:* The article criticizes the ambiguity of natural language, but in the human realm, effective communication is crucial. Engineers should cultivate their ability to communicate clearly and precisely, both in their code and their interactions with others.
In conclusion, while natural language programming might not be viable, the lessons from this 1978 article remain relevant. The need for precision, formalism, and clear interfaces in programming continues to be important, and the growing role of natural language processing technologies presents new opportunities and challenges for today's engineers.