Study of non-programmers' solutions to programming problems [pdf] (ucr.edu)
342 points by yxlx on Apr 15, 2016 | 170 comments

Top three takeaways for me: event-based logic, sets instead of loops, and using past tense instead of state. Events and LINQ-like queries are popular enough; that last one is interesting.

Especially in an environment where you mostly interact with objects via events, I think querying an object's past sounds pretty doable. Naively we could hold on to all events ever and query for one that matches what we're talking about. Less stupidly, these past tenses are usually in forms like "if this happened recently" or "if this happened ever" which a compiler could rewrite into a variable that the relevant event sets.

So, the compiler sees "if Pacman ate a power pellet within the last ten seconds" in whatever syntax it accepts. It goes to Pacman's "eat power pellet" function and appends code to set Pacman's last-power-pellet-eaten variable, which it has to introduce. The original conditional gets rewritten in terms of timestamp comparison.
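A minimal Python sketch of what that rewrite could look like (the names `last_pellet_time`, `eat_power_pellet`, and `is_energized` are all invented for illustration):

```python
import time

class Pacman:
    def __init__(self):
        # Variable introduced by the hypothetical compiler pass.
        self.last_pellet_time = float("-inf")

    def eat_power_pellet(self):
        # ...original eating logic would go here...
        # Appended by the compiler: record when the event fired.
        self.last_pellet_time = time.monotonic()

    def is_energized(self):
        # "if Pacman ate a power pellet within the last ten seconds",
        # rewritten as a timestamp comparison.
        return time.monotonic() - self.last_pellet_time <= 10.0
```

The past-tense condition never stores "history" as such; it compiles down to one extra variable and a comparison.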

Using sets instead of loops is a sensible thing to do. After using LINQ or a proper functional language it's hard to go back.

Using past tense instead of explicit state handling is actually a very interesting idea that I don't see being commonly used. Worth thinking about.

But... Event-based logic is often the source of horrendous complexity and bugs, due to timing and the exponential complexity of dealing with the various sequence permutations. This is one of those things that I firmly believe people should be conditioned to avoid at all costs, by using either the functional (immutability + recursion) or logic (rules + deduction) flavors of programming. It's really not intuitive to get started with, but it provides huge benefits in the long run.

The "sets instead of loops" one makes me think of Ruby, where operations on sets are a basic part of the philosophy in a way they aren't in a bunch of languages.

Even the basic "do this 5 times" is expressed in the same way as a set comprehension:

    this_set_of_items.each do |item|
      puts "hey, look at #{item}"
    end

    5.times do |index|
      puts "look, a line!"
    end

I took it more like the way you specify certain jQuery operations, such as:

    $('.card').hide()

to mean "hide all cards", or

    $('.card.green').show()

to mean "show all green cards". It's really comfortable to think about operating on the set like this and not having to worry about looping over the individual items.

In fact, thinking of it as a set frees the computer to do things concurrently, invisibly.

A lot of other comments here say "ah, but that's just JS iteration still"... but there's really nothing to prevent it being the equivalent of a goroutine: no order implied, no iteration implied, but all the things in the set will have hide() or show() called on them.
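A rough sketch of that in Python: every element of the set gets hide() called, with no order promised and the calls potentially running concurrently (the `Card` class here is made up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

class Card:
    def __init__(self, name):
        self.name = name
        self.visible = True

    def hide(self):
        self.visible = False

cards = {Card("ace"), Card("king"), Card("queen")}

# "Hide all cards": no iteration order is promised; the calls
# may be distributed across worker threads.
with ThreadPoolExecutor() as pool:
    list(pool.map(Card.hide, cards))

assert all(not card.visible for card in cards)
```

The caller only states what should happen to the set; whether the runtime iterates sequentially or fans out is an implementation detail.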

I am not sure that would count as working on the set. It looks like iteration.

From the article:

a) 1___ 2___ 3___ 4___ 5___ 6___ 7___ 8___ 9___

Thinks of them as a set or subsets of entities and operates on those, or specifies them with plurals. Example: Buy all of the books that are red.

b) 1___ 2___ 3___ 4___ 5___ 6___ 7___ 8___ 9___

Uses iteration (i.e. a loop) to operate on them explicitly. Example: For each book, if it is red, buy it.

In your case it looks like you are operating on each item and not on the set.
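The article's two phrasings map directly onto a comprehension versus an explicit loop, e.g. in Python:

```python
books = [("red", "A"), ("blue", "B"), ("red", "C")]

# Set-style: "buy all of the books that are red".
bought_as_set = [title for colour, title in books if colour == "red"]

# Loop-style: "for each book, if it is red, buy it".
bought_in_loop = []
for colour, title in books:
    if colour == "red":
        bought_in_loop.append(title)

assert bought_as_set == bought_in_loop
```

Same result either way; the difference is purely in which mental model the notation encourages.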

It sounds very like array programming, e.g. APL and its derivatives.

That's just Array.forEach in JS. Or in Scala:

  (1 to 5) foreach { _ =>
    println("look, a line!")
  }

That's really nothing to do with sets.

Logically it can be expressed as operating on the set {1, 2, 3, 4, 5} - I think that's the point: one can perceive it differently, either as a set operation or as a loop/iteration.

  class Numeric
    def times(&b)
      (1..self).each(&b)
    end
  end
It's really no different than the Scala you'd write for the same thing:

  implicit class LongHelpers(l: Long) {
    def times(f: => Any) = (1L to l) foreach { _ => f }
  }

  10L times println("here's a line!")
To interpret that as a Set operation would make the definition of a Set operation so broad as to be useless IMO.

sets aren't ordered

The set of ordered sets is a subset of the set of sets. Totally ordered sets, which I think are synonymous with chains(?), are still sets.

But yes, I was talking loosely: my point was to differentiate between a mental model of iterating over a sequence (or ordered set) and one of applying a set-like transformation. My mental picture was of a field formed by a matrix operation on a limited space as being composed of the summation of a series of vector transformations; my maths language doesn't really allow me to properly describe it, however. Loosely: you can break down a [subset of] 3D transformation[s] of a surface into a series of 2D transformations performed iteratively, or you can apply a 3D transformation; they're different mental images of acquiring the same result.
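One concrete instance of that idea, written out loosely: any rotation of 3D space factors into a product of planar rotations (the Euler-angle factorisation), each of which acts in only a single 2D coordinate plane:

```latex
R = R_z(\gamma)\, R_y(\beta)\, R_x(\alpha),
\qquad
R_x(\alpha) =
\begin{pmatrix}
1 & 0 & 0 \\
0 & \cos\alpha & -\sin\alpha \\
0 & \sin\alpha & \cos\alpha
\end{pmatrix}
```

Applying the three factors one after another is the iterative picture; multiplying them out into a single R is the whole-transformation picture. Same result, different mental model.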

> using past tense instead of state

Isn't Clojure's philosophical foundation almost exactly this? Adding the dimension of time to data structures? I vaguely recall a conference talk by Rich Hickey from the early days about it.

You may be thinking of Datomic (http://www.datomic.com/), a product by Cognitect, where Rich Hickey works. However, the Clojure data structures also have the time dimension, in the context of the language's immutable structures, but it is not something you typically concern yourself with too much.

I looked it up; it's this talk, which indeed is the first one about Clojure and its philosophical foundations:

https://github.com/matthiasn/talk-transcripts/blob/master/Hi... (slides)

http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hic... (recording)

The opening lines are:

> So I’m going to talk about time today. In particular, how we treat time in object-oriented languages generally and maybe how we fail to. So I’m trying to provoke you today to just reconsider some fundamental things that I just think we get so entrenched with what we do every day, we fail to step back and look at what exactly are we doing.

Whoever downvoted my previous comment does not know their Clojure, that's for sure.

Well, the time stuff has to do with Clojure's different ref types (Ref, Atom, Var, Agent...) combined with the persistent data structures.

the past-querying sounds a lot like reactive programming, doesn't it? you lay out an expression (and you would be typically using set/aggregate operations there as well) and the expression kinda just sits there quietly, accumulates events until a certain condition or threshold is reached, when it fires a signal further down the chain...

This is a nice tl;dr, thanks

The 3 elements make sense even as a programmer, but of course you would need to do them yourself in your program logic and might opt for something simpler

Adding a variable is creating state. Defeating "past tense without state."

What you described ("has ever", "has recently") seems like a classical finite state machine.

The entire point is the model presented to the programmer, not the guts. Of course it's implemented in terms of state. Computing hardware is stateful.

This is really fascinating: our security folks are using event-based history. Must go talk to them again in this context.

Thank you

> Especially in an environment where you mostly interact with objects via events, I think querying an object's past sounds pretty doable

Sounds a bit like Event Sourcing.

I can imagine logic like:

Do action "foo" if user_account has ever been in the "deactivated" state.
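A minimal Python sketch of that kind of check under Event Sourcing assumptions (the `Account` shape and the event names are invented for illustration):

```python
class Account:
    def __init__(self):
        self.events = []  # append-only event log

    def apply(self, event):
        self.events.append(event)

    def has_ever_been(self, state):
        # Past-tense query: scan the log instead of keeping a flag.
        return any(e == ("state_changed", state) for e in self.events)

user_account = Account()
user_account.apply(("state_changed", "active"))
user_account.apply(("state_changed", "deactivated"))
user_account.apply(("state_changed", "active"))

if user_account.has_ever_been("deactivated"):
    pass  # do action "foo"
```

Because the log is the source of truth, "has ever been deactivated" stays answerable even after the account returns to "active".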

It's funny we make the presumption that the natural way is the best way.

Programs are for humans to read. That is their sole purpose.

Bah, the past tense in a summary is just bad grammar. It is perfectly correct to use the present, also called the narrative present, as Wikipedia will have you know.

My perspective, as someone who came relatively later to programming as a career, is that the skill of "thinking like a programmer" isn't too hard to acquire once you've taken a few college CS classes and started to simply program a lot.

Rather, what I'm constantly amazed at (and maybe it's because I'm working in stereotypical enterprisey environments?) is how complex, convoluted, long, and hard to follow a lot of code is. So much code goes into what would seem, to a non-programmer or novice, like a fairly simple task. And so much code is really horribly written, designed, and implemented. It's very much true what your professors say about how you'll spend more time reading code than writing it. Anyways, I'm sure my perspective as both a junior dev and a late career switcher is somewhat biased, but there you have it.

Well, that's the thing. Simple code is hard to write. It's easy to end up with a convoluted mess, because often the first version is just the programmer implementing their thought process as they think through the problem.

It's rare that people get the chance to go back and fix up the code once they have a better understanding of the problem.

I have a similar story. After taking a not-unusual academic path from Math through Linguistics to Cognitive Science, I eventually got started in programming to support my research. I had previously avoided programming largely out of fear. When I thought of the effort it took me to produce a half page of correct mathematics, I blanched at the thought of trying to write thousands of lines of working code.

When I finally delved into Perl, Python, C, etc. and reading open source code (mostly C, mostly greenfield projects), I was flabbergasted to discover how much coding is by guess and by golly. I am genuinely unable to grasp how a person with the intelligence to master the technical details that programming requires could be so lacking in the ability to elegantly conceptualize a problem domain and produce suitably organized code. I have come to wonder if it isn't a psychological inhibition. Like some people keep their desk obsessively neat but others work amid a filthy mess, not because they're too lazy to clean but because they're more comfortable that way.

Once in a while, I see a programmer demonstrating the clarity of analysis I'm accustomed to seeing in science (and even the humanities). This is the kind of programmer who will produce a shorter and clearer solution in C than the average programmer will produce in Python. It's rare, though, so I'm constantly amused/irritated by the frequently stated assumption on HN and elsewhere that programmers have superior analytical skills.

For any non-trivial programming problem (in code size, complexity, etc.):

There are programmers with little analytical skill who, given enough time, will end up with code doing mostly the right things... unfortunately, from the outside there is no way to tell to what degree the code evolved from trial and error or from true analysis...

Then there are skillful, analytical programmers who over the course of the dev cycle for a problem burn out their "analytical mana" due to simple exhaustion, external pressure, or both, or simply because the problem grows conceptually over their headroom during the work on it... backtracking and reworking already-written code is still cumbersome, always a risk, and unpleasant ('trashing something already done').

Then there is often time, eroding the most clean, analytical code with every little update into a mess, unless of course energy is spent to re-analyse and rewrite.

Then there are of course some differences between the areas in which programming is used.

'how much coding is by guess and by golly'

I hear this quite often from people with a math or physics background etc., who are often accustomed to reading expositions of ideas in math papers: presentations of the final versions of the formulas.

Code quite often is more equivalent to the notes a mathematician would make and use before summarizing them into the published paper.

Those math whitepapers are executed in the heads of humans, so they tend to be kept clean, minimal, and analytically sound.

Those code artefacts are executed by machines, which don't care at all about such things; thus, as long as the humans are satisfied with the end results, the 'inner structure' tends to erode. Code is read more often than written, but it is often only read later and by other people, so... first things first: it has to work, right? ;)

But yea, I mostly agree that many programmers slide down into the habit of a very special kind of

"reactive programming" ;)

where they try to hit the target by incrementally zeroing in on a solution, reacting only to the results of the previous version of the code. Due to the action-reaction structure this tends to be quite flow-inducing and satisfying... but the results are seldom analytically pleasing or sound.

I think that code is often written in much the same way that metal forgers hammer metal into shape. It takes a while, there is a lot of frustration involved, a lot of mistakes are made in the process of getting the end result, and often, when deep skill is not present, a bulldozer approach to solving the problem becomes the norm. Clever code gets used and overlooked like small landmines. Modify the clever code and the landmine blows up in your face with 12 broken features, and the whole application turns out to rely on some clever piece of logic that befuddles most people at a single glance.

Writing code is mostly a creative process. When you use standardized tools, the creative process eventually starts to look common, but when you learn how to use the general tool set (analogous to a carpenter's tool belt), you end up with a more custom-tailored product that is very specialized for a specific task. Since there are so many ways to solve any specific problem when the tools you are using are the equivalent of screws, nuts, hammers, and screwdrivers, anything can be fabricated to make the process work.

The real key here is that when creative-problem-solving meets design-pattern-nirvana, then you can start refactoring into the standard patterns that everyone is accustomed to seeing, and then things start to make more sense.

It's like the PubSub pattern is beaten to death in so many different workflows, but ultimately it's the same shit. Event-based programming is PubSub. Clicking causes a publish/emit, and every entity subscribes to these events/emits on a particular address of a specific pattern.

When we all start to speak the same language, I think writing code will become more a practice of writing in our general idioms. A lot of programmers are not formally trained, and nobody does an apprenticeship before they write code; they just start.

The mismatch between the high-level notion of programming (solving interesting problems) and the daily routine is there because in practice only a relatively small part of code deals with domain knowledge, and the rest deals with mundane technicalities (persistence, validation, security, logging, diagnostics, user interface technicalities, command routing, data translation...). The latter part certainly has its share of interesting challenges, but most of it is boring cruft.

Mathematical proofs of correctness of even simple programs WITHOUT the mentioned technicalities are very complex (multiple times longer than the actual programs). This suggests that there is a lot of inherent complexity in the core software.

Why do you hold the assumption that code can be made much simpler than it is?

I've done many years of programming and wrote some bigger programs from scratch by myself. Almost always the actual code is far more complicated than one would assume it should be. And it's not because it's of low quality. It happens that writing something nontrivial that works in the real world requires all the details, checks and abstractions it contains.

If you were in school recently, maybe you were led to think that programming is an extension of math, where everything is pure, simple, and provable. But it isn't - programming is much messier.

I'm not OP, but I can answer the question as if it was addressed to me. I believe that a lot of "real life" code can be simplified because I've had way too many cases where I open some program and remove 60 to 80 percent of its code without affecting the overall logic. (This usually includes obvious duplication, needless abstraction layers and things that reimplement functionality of standard libraries.)

The difficult part usually isn't the restructuring itself, but understanding what the program is supposed to be doing in the first place. Fortunately, deleting code is one of the best ways to learn about what it's doing. You need some good tooling to do this safely, though.

> needless abstraction layers

In my experience this is one of the worst offenders in large code bases. I've worked on code that does what it's supposed to do, but you can't see that it's doing it, because the solution to the initial problem statement is completely diluted in a mess of factories, abstract classes, gigantic class hierarchies, delegates, and so many other abused patterns.

In some cases the extreme level of abstraction can be justified by the need to have a system that's extensible in many different places. But more often that's not it, it's really just patterns being overly used and abused. You can rewrite the code to be much clearer with less than half the original size and have no negative impact from it.

And after you've "fixed" this "problem" for the current state of the app, you happily leave, and the next person is tasked with adding new features, and suddenly they find the app incredibly rigid and impossible to modify, so they add some abstractions like factories, delegates, etc...

I mean, anecdotes. I find it funny that in a profession allegedly heavily influenced by science, we keep trading anecdotes ("in my experience", "more often than not", etc.) instead of having any hard data, tests, and experiments to compare.

I wasn't advocating writing extremely rigid, non-extensible code. I was alluding to unnecessarily abstract code bases where e.g. there's a class hierarchy with 7 classes when in fact 3 would properly describe the problem domain. Some people get really carried away coming up with abstractions and in the end they just write a lot of meaningless or purposeless code.

I'm for doing data analysis, at least on the code bases I work on, but alas I would have to do that in my own time.

The difficult part is to have the time to do the restructuring. Customers and bosses won't pay for it.

Rewrite small bits of it when you're in the area adding features or fixing bugs.

Rewriting code that works and doesn't need new features is pure waste.

It happens that writing something nontrivial that works in the real world requires all the details, checks and abstractions it contains.

I think it helps to separate essential complexity, due to the nature of the problem you’re trying to solve, from accidental complexity, which comes from other sources. In an ideal world, you’d represent the essential complexity as clearly and flexibly as possible, and minimise the amount of accidental complexity you put on top.

That accidental complexity can come from lots of different sources: tools that aren’t a perfect fit for what you’re trying to do, a design that isn’t as clean as it could be, an unfortunate choice of data structures or algorithms… I suspect it probably is fair to say that a lot of the accidental complexity that goes into real world programs could have been avoided if, for example, more appropriate tools had been used or better design decisions had been made during development.

The problem is that when people get told to cut a board to 1 meter and get handed a 0.5 meter board, we say "fuck you"; the program, on the other hand, needs to be explicitly told how/when to say "fuck you", and it needs that for basically every single possible source of error.

That doesn't fit into either of your categories really but it's where so much of the actual complexity comes from.

I for one applaud your use of the vulgar and am doing my best to upvote you mightily.

I claim that the essential complexity of real world software is inherently high. Yes, the accidental complexity can balloon to infinity (FizzBuzzEnterpriseEdition), but even the simplest possible software solving a given real world problem will be always highly complex.

This is why mathematical proofs of software correctness are impossible in practice. A small 100-200 LOC program takes a qualified mathematician weeks to create a proof of correctness. No particular step of a program is complex, but it combines a lot of these simple steps.

I claim that the essential complexity of real world software is inherently high.

Sometimes it certainly is high, but why inherently? Surely it depends on the particular problem you are interested in solving?

This is why mathematical proofs of software correctness are impossible in practice. A small 100-200 LOC program takes a qualified mathematician weeks to create a proof of correctness.

As someone who actually does formally prove things about algorithms significantly larger than that from time to time, I believe you’re exaggerating here.

See the CompCert C compiler[1] for an example of using formal proofs in real world programming at a much larger scale.

No particular step of a program is complex, but it combines a lot of these simple steps.

And this leads to the “secret” of being able to construct proofs for realistic programs in useful amounts of time, in my experience: you have to be able to decompose the fundamental problem into manageable parts, prove some properties of interest for those parts, and then be able to compose the things you’ve proved separately in order to prove useful things about the overall system. We do this all the time with some properties in strong, static type systems, for example.

[1] http://compcert.inria.fr/compcert-C.html

> Sometimes it certainly is high, but why inherently? Surely it depends on the particular problem you are interested in solving?

Even the simplest "real world" programs do things like: communication with a database, string manipulation, calling library functions which can fail.

> As someone who actually does formally prove things about algorithms significantly larger than that from to time, I believe you’re exaggerating here.

I wrote a master's thesis that touched on the subject, and that's what I put in the introduction as an illustration of the field's difficulty (I believe it was not plucked from thin air ;). I don't know the field now, but ca. 10 years ago program proofs were mainly refinements of high-level axioms down to executable code, so there was no separate "proof" step for a finished program; still, the added complexity of proving the refinement steps was huge.

Even the simplest "real world" programs do things like: communication with a database, string manipulation, calling library functions which can fail.

Sure, but these things are exactly what I’m talking about with accidental complexity. You’re talking here about databases and strings and library functions, not about whatever real world problem you’re ultimately trying to solve. Complexity that comes from the tools you use or how you use them is exactly what I’m arguing we should ideally minimise when programming, but we often don’t for various reasons.

I was agreeing with the essential / accidental complexity dichotomy until I read this comment. I think we need a third layer, or at least we need to differentiate between accidental complexity because we need to use real-life tools like databases and strings, and accidental complexity because, as J. B. Rainsberger puts it, "we're not so good at our jobs". [1]

[1] https://vimeo.com/79106557

That was a frustrating video to watch. There are a few obvious flaws, like the “proof” that doesn’t actually prove the original claim at all, and the way you literally can’t reach the equivalent of

    add x y = x + y
using the proposed Agile programming model. However, the giant elephant in the room seemed to be that all of those tests the presenter was talking about, and any aspects of the overall design of the production code that exist only to support those tests, are themselves what he terms accidental complication. By his own argument, it seems we should never write tests!

This is why I prefer to talk about minimising accidental complexity rather than eliminating it. Essential complexity is logic you can’t avoid. It’s fundamental to the problem you’re solving, and any correct and complete solution must take it into account. Accidental complexity is logic that in principle you can avoid. However, sometimes you don’t want to: diagnostics and test suites bring practical benefits other than directly solving the original problem, and we will usually accept some extra complexity in return for the value they add. Similarly, using a tried and tested tool, albeit one designed to solve a more general problem, may be more attractive than implementing a completely custom solution to our specific problem from scratch, even though again it might introduce extra complexity in some respects.

I don’t really mind whether we break down accidental complexity into finer divisions. There are lots of sources of accidental complexity. There are lots of potential benefits we might receive in return for accepting some degree of accidental complexity. My original point was just that some complexity is avoidable and often it is possible to simplify real world code by eliminating some of that accidental complexity, even if we choose to accept it to a degree for other reasons.

> and then be able to compose the things you’ve proved separately

This has always struck me as the really challenging, problematic part about correctness proofs. I've no formal training in this area (sadly), but how does one go about being able to just compose your proof components? Often as not, that seems like a pipe dream - http://queue.acm.org/detail.cfm?id=2889274

There are two common ways in which I've seen this happen.

In one, the code started simple and evolved in place to accommodate bugs or additional use cases. Some refactoring has happened in the past, and that added some complexity to the overall project, but still not all of the abstractions were right. Further refactoring may have been attempted, but proved too involved given the value of the task at hand and the complexity the code had already acquired. The code is now more complex further raising the bar for refactoring when the next change comes along.

In the other, substantial effort went into designing the code to be flexible. It was recognized that not all use cases could be understood at the outset, and that refactoring things mid-project often fails due to time constraints. A design was arrived at based on the most complete understanding of the requirements at the time, and that design naturally had a certain amount of intrinsic complexity to it, as the requirements were complex. As development progressed, it was discovered that the design was not ideal for the actual requirements of the project. Plumbing was added and layers were tacked on to work around the issue.

What I conclude from this is that you're damned if you do and you're damned if you don't.

Practically, I generally prefer working with code that was arrived at by the first approach. It often has a few good abstractions in it, along with a bunch of methods or classes that just combine too much stuff in one place. Perhaps some things are plumbed into places they shouldn't be. This can, with time and patience, be peeled apart into something a bit more manageable. By contrast, my experience with the second strategy is that it leads to a much less manageable mess, where most time is spent finding the code that actually does anything. There is much more plumbing to rip out, and seemingly innocuous changes have far-reaching consequences.

It could be that the reality reflected by the code base is simply complicated. It could be that technical debt isn't measured properly or at all.

It could be a "that's your problem, buddy". That's more likely.

As someone with a definite and well-established viewpoint on CS education, I sort of expected to hate this paper... but I didn't. I thought it made some excellent points, and I very much liked the idea that people not trained as programmers might be able to point us toward new paradigms.

At this point in the development of programming languages, the problem is not really that we can't build languages that do what you want; by and large, for unambiguous specifications (and yes, that is a big qualifier), we can.

At this point, then, the conversation shifts from "how can I meet the machine's needs?" to "how can the machine meet my (programming) needs?" Another analogy: we're no longer just stone-age people looking for a rock that doesn't shatter when we hit things with it; we can shape our rocks now, and we're trying to figure out what shape allows us to hit things hard without cutting our hands.

Go, functional and declarative programming! Oops, I gave it away. Sorry.

I feel that at its core programming is about taking a conceptual idea (e.g. pac-man moving around a maze) and determining the unambiguous logic that describes it. The language used to express that description has a significant effect on the end result, but it's the ability to develop the logic in the first place that really separates "programmers" from "non-programmers".

"Non-programmer" isn't meant as a slight. This style of problem solving works great a significant portion of the time. Natural languages can describe a solution to a lot of problems very concisely because a) there's a lot of implicit context that clears up many of the potential ambiguities, and b) you're typically present and available to handle any unexpected situations that may arise. For many problems, the best solution is one that can be specified quickly, will work 90% of the time, and can be easily adjusted for most of the other 10% of the time. Natural languages and fuzzier thinking work great for this.

But this approach doesn't work well for problems where the solution is either too complex to be easily described using natural language, or situations where data sizes or time constraints make it unfeasible for you to be available to handle unexpected situations. In this case the solution requires all of the logic to be precise, unambiguous, and developed up front. It's a different way of thinking than what has typically been asked of humanity, and natural languages are pretty poor at expressing that logic.

I think the paper has some good points, but I'm not sure how much you can really draw from it other than verification that natural-style problem solving doesn't work well for the type of problems that are typically solved by programming. If you asked a bunch of experienced programmers to write programs that will tell you how to "go to the store and buy me some milk", you'd probably get similar results about how the programs didn't handle the many different unexpected situations that might occur in such a simple task.

>> But this approach doesn't work well for problems where the solution is either too complex to be easily described using natural language,

For such solutions, let's say in the business domain, the programmer could work with the domain experts on the general structure of the domain model (business objects, members, and methods), while the domain expert designs and fills in the methods, maybe validation rules (with some tool), all small code sections (so it might be easier to think unambiguously), and then the model will be fed to an automated system like naked-objects/ISIS that will handle all the technical stuff automatically.

Then if the system detects an ambiguity, it will offer debug info (and a code view) in a format that domain experts understand, and let them fix it, or ask for help. And of course you could add testing and code review with programmers and domain experts (who can now read the code) to the mix.

And yes, sure, this won't fit every system. But it may extend the power of the domain expert.
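A rough sketch of that division of labor, with all names hypothetical: the programmer supplies the object's structure, and the domain expert contributes small, self-contained validation rules.

```python
# Hypothetical sketch: the programmer defines the object's shape, while a
# domain expert fills in small, independent validation rules.
class Invoice:
    def __init__(self, amount, customer_id):
        self.amount = amount
        self.customer_id = customer_id

# Rules a domain expert might write: each is a tiny, self-contained check.
validation_rules = [
    ("amount must be positive", lambda inv: inv.amount > 0),
    ("customer id is required", lambda inv: bool(inv.customer_id)),
]

def validate(invoice):
    """Return the descriptions of every rule the invoice violates."""
    return [desc for desc, rule in validation_rules if not rule(invoice)]
```

The point is only that each rule is small enough to reason about unambiguously; the surrounding machinery would be the automated system's job.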

>> or situations where data sizes or time constraints make it unfeasible for you to be available to handle unexpected situations.

For such situations, a search engine with access to the full code, specified in an ambiguous language (not necessarily natural), could help tools find or build code containing an optimized form, and maybe offer help with integrating it.

It's fundamentally wrong to have non-domain-expert, "generic" devs. To be effective, your devs need to become domain experts in whatever they are developing.

>For many problems, the best solution is one that can be specified quickly, will work 90% of the time, and can be easily adjusted for most of the other 10% of the time. Natural languages and fuzzier thinking work great for this. But this approach doesn't work well for problems where the solution is either too complex to be easily described using natural language, or situations where data sizes or time constraints make it unfeasible for you to be available to handle unexpected situations. In this case the solution requires all of the logic to be precise, unambiguous, and developed up front.

The good thing is, both extremes are not mutually exclusive, but rather exist on a continuum. You can start with fuzzy specifications and progressively refine them by case-based reasoning and induction until you get some general, unambiguous rules. Programmers work this way when building new programs, even though they hate to admit it and like to say that the program logic emerges fully formed from the specifications (ha!) and that they just need to understand the problem in their heads first (no tinkering and trial and error happens in programming, right?).

I believe the next radically different family of languages will be one that embraces that fact and allows developers to better handle that progressive refinement in all steps of the process, rather than require the program to be fully formed and flawless in order to operate. In some sense, modern IDEs already work that way.

This is a wonderful idea. There are few programming environments that are useful to non-programmers with the possible exception of Excel.

A study like this helps us gain inspiration and to remind ourselves how non-programmers think about programming problems.

In the words of the linked pdf: "Programming may be more difficult than necessary because it requires solutions to be expressed in ways that are not familiar or natural for beginners."

Medicine may be more difficult than necessary because it requires solutions to be expressed in ways that are not familiar or natural for beginners.

Law may be...

Physics may be...

Getting the point? Beginners may not always be the best yardstick of everything.

I think you're missing the point. We don't have any control over the fact that medicine and physics are complicated because they reflect nature. Law is complicated because it's an attempt to create a set of rules that apply to all possible human scenarios.

This study, and the person you're replying to, aren't saying that all programming languages are unnecessarily complicated and should be changed. They're just saying that they're inaccessible to beginners. The authors of the study note that they're designing a programming language for beginners, so these things are useful for them to know.

One of Rich Hickey's talks addresses this (I think it was 'Are We There Yet?'). In it, he references musical instruments, which are absolutely NOT designed for beginners. Why should they be? Tools shouldn't be designed specifically to PRECLUDE beginners, but why should any of them be optimized for beginners?

A good tool should allow someone to progress from beginner to novice to expert, but at the end of the day, the expert is who the tool is really designed for. Anything else is just bonus.

Nobody is arguing for replacing expert tools with these beginner-optimized languages. Continuing the instrument metaphor, I'd say they're like recorders: being taught in elementary school won't make them ubiquitous in the professional industry. But they're a great way to introduce students to a simple, relatively inexpensive way to create music.

>In it, he references musical instruments, which are absolutely NOT designed for beginners.

Musical instruments increasingly are designed for beginners. A typical software suite makes it easy to make decent music by tapping a few buttons more or less in time.

Classical instruments aren't designed for beginners, because they weren't designed at all - they evolved from crude and simple historical originals.

But they're a subset of music as a whole.

Arguably the problem is that computer languages are NOT designed for anyone. This is why so many software products and services are significantly broken so much of the time - to an extent that would be ridiculous and completely unacceptable for hardware objects.

I don't see a problem with at least exploring new language models that have roots in perceptual psychology instead of in hardware design.

Classical instruments are designed, just as much as any other artifact created by humans. If pianos were not designed, then neither were bicycles, looms or printing presses.

> Tools shouldn't be designed specifically to PRECLUDE beginners

I'm reminded of a lesser-known talk[1] by Dan Geer, in which he discusses cybersecurity maturing into its own field - its own science - using T. S. Kuhn's definition[2] as a rough guideline. One of the criteria used to recognize the transition into a specialized field was the use of jargon. Maturity as a stand-alone field has happened when it becomes necessary to invent new jargon that is - by definition - inaccessible to outsiders.

To be clear, I completely agree that tools shouldn't preclude beginners whenever possible, but I suspect that may not be possible as everything becomes more specialized.

[1] https://www.youtube.com/watch?v=fHZJzkvgles

[2] https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...

Rich Hickey is definitely an interesting person.

The question for me is: do we actually want to master a bare-bones instrument (for example, a violin), or do we want to make music?

The more you advance in a field, the more you control you want to have over the concepts you work with and what you produce. So an expert will look for a more flexible tool, one that gives access to all these concepts. Whether that's a bare bones instrument or a computer program, it will be unsuitable for a beginner.

> We don't have any control over the fact that medicine and physics are complicated because they reflect nature.

I don't agree that this is all that different.

In physics, some phenomena can be either simple or complex to describe, depending on what formalism you use to describe them. And sometimes, complex formalisms can be hard to understand at first, but help produce simpler descriptions of the phenomena you're looking at.

For example functions and derivatives are hard to understand at first, but they drastically simplify the description of classical physics.

So in programming just as in physics, you just need the right conceptual tools to describe your problem. And while boolean logic may be hard to understand at first, it really helps in specifying complex behaviors.

In that sense, I think I agree with GP: over-simplifying the conceptual tools may not result in better problem descriptions.

Honest question, because this comes up reasonably often in PL discussion. Do people genuinely have a hard time with Boolean logic? If I was to make a list of things I found hard when learning about programming I'm not sure Boolean logic would make the cut.

I said boolean logic because the paper mentions fifth graders, who are unlikely to be at ease with complex boolean formulae. But there are a number of more complex concepts that would also work: OOP, map or reduce, mutexes, and so on.

To answer your question, in my experience (I have taught programming 101 a number of times) many people are indeed confused when they need to reason about complex if statements. Typically, they make redundant tests, and it takes a little while before they understand why that is not needed.

Interesting, thanks!

>We don't have any control over the fact that medicine and physics are complicated because they reflect nature.

But doesn't the machine reflect nature for a programmer as well? I think this paper does little more than capture fantasies about the inherent capabilities of machines and recite general logic elements of natural language.

To expand on this, even expert programmers have a harder time adjusting to the mindset of a lot of modern programming languages, which is one of the reasons a lot of software written by experienced programmers can still be riddled with bugs. Just because they're skilled at programming in this fashion does not imply that they could not be skilled in a "more natural" style.

Did YOU get the point? Maybe Medicine and Physics and Law shouldn't be so complicated also?

Medicine and physics are not purely human constructs tho.

the universe and how it changes over time isn't a human construct, but physics is, one that tries to describe the universe and how it changes over time at a certain level of complexity. same for medicine, except that the human body is actually a human construct ;)

If you figure out how to make physics simpler and still be accurate, the physicists would definitely like to hear it.

Oh, really?

Yeah, really. One of the driving forces of theoretical physics is simplification. That's what all the quantum-gravity-unification noise is about: having two theories is messy, even if they both produce solid predictions.

Unless your point is "they won't listen to a poor sap like you", in which case, take a chill pill.

>same for medicine, except that the human body is actually a human construct

The human body is merely (re)produced by humans, whereas with "construct" here we mean the design of things (which, in the body's case would be, e.g. nature, evolution, etc).

It's interesting because many characteristics carry over to other areas. For example, my wife's brother (a youth pastor) and I didn't have a wine opener at a hotel. Before I got back to the room he got started on it and used a knife to effectively drill out the cork; the obvious cost here is that you've turned the cork into sawdust. The wine was pretty terrible, so I had him filter it through a coffee filter. It was still terrible, but as I walked to the room I thought about a few approaches and their associated costs; he just did the first thing that came to mind. I found the "non-engineering approach" very interesting. Of course every solution has costs: his was sawdust-filled wine, mine was having to push a cork out of the way to pour.

Pro-tip: I've seen it demonstrated that if you repeatedly whack the bottom of a bottle of wine with the heel of your shoe (or, in a similar manner, smack the bottom of the bottle against a tree trunk), the cork will slowly emerge on its own.

It occurs to me if you get it wrong, glass shards from the bottle could emerge before submerging into your flesh, so perform this trick at your own risk.

I've seen a variant of this attempted.

One of my colleagues put a wine bottle into a shoe, so the base of the bottle was resting where your heel would go. Then repeatedly 'tapped' the bottle/shoe combination heel first against the wall. Apparently hoping the hydraulic rebound would cause the cork to back out.

The bottle smashed. The concrete wall was permanently stained, the shoe was ruined and the carpet needed to be cleaned.

Very entertaining, but not terribly useful.

PLEASE don't do this. A friend of mine had done it a few times and was getting confident in the process. Once, while I was watching him, the bottle broke into tiny pieces and went straight into his hand. He had around 30 pieces of shattered glass that made their way several centimeters deep inside. I still remember calling emergency services and waving my hands when they arrived so that they saw us. My friend came back a few days later with his hand wrapped in bandages after a pretty awful night in the hospital. Pretty stupid idea if you ask me. Don't do it. Remember: one hand is one half of your text editor...

Have you ever tried to break a wine bottle? They're pretty sturdy. This is actually not very difficult to do:


Although, I would think borrowing a wine opener from reception would've been the easiest approach in this case.

Pro-tip: slowly twisting the cork out of the bottle (as if it were a screw) would eventually cause it to pop out. No need for a wine opener.

I do this anytime I don't have a wine opener handy.

In many wines the cork is fully inside the bottle, there's nowhere to grip it.

GP post is right, but left out a detail. I have used this trick multiple times and it usually impressed people:

Scenario is wine with the cork fully inside the bottle. Take a long slender knife, stab through the center of the cork as far as possible. Twist and pull, with very little pull and lots of twist. The cork comes out. The key here is that the pressure of twisting prevents most of the slippage that would happen if you just pulled the knife out.

Enjoy impressing friends/family.

It's a bit nerve-wracking reading people who use their fingers for a living doing stuff that risks permanently injuring those fingers!

The infantry grunt knuckledragger side of me (large prefrontal cortex) sometimes ignores such obviously real risks because they seem so minimal compared to past experience. (Which is dangerous in itself. This kind of thinking got more of my buddies killed in motorcycle accidents stateside than died in combat.) Point is, you are right to be concerned about safety, but I would argue this particular method is pretty safe if done right. I mean, step 1 is stab the cork; that's not hard to do. I guess the knife could slip and come out quickly, but that's what the lateral pressure of the twist is there to prevent.

The argument that we have to take people who are unfamiliar with the work and look at how they approach it is alienating me. Would you design mathematical notations based on the opinion of 5th graders? Would you build a skyscraper based on how 5th graders feel about it because it's more natural?

EDIT: For those who didn't read the article, I say 5th grader here because a large part of the study is actually about them. Not because I arrogantly compare people from other fields to children.

"Would you design mathematical notations based on the opinion of 5th graders?"

Haven't read the entire paper but...learning from something doesn't mean you need to design something 'based' on that.

Studying how people unfamiliar with a specific way of solving problems approach that problem is actually a great technique for trying to understand common patterns and approaches in how humans think about problem solving.

Some problem-solving strategies will be more successful than others, and it is worth understanding what is common to the successful ones (especially when compared to the unsuccessful ones).

Identifying patterns in the way people solve these problems can prove to be great design insights that go into the design of problem-solving solutions for those problems.

Sure. That's what we did with numerical notation. We switched to a positional decimal system that even fifth graders can understand and do operations with.
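Positional notation is exactly that kind of simplification: each place carries a weight ten times the place to its right, so reading any number is one uniform rule. A minimal Python sketch (the function name is mine):

```python
def decimal_value(digits):
    """Interpret a string of decimal digits positionally.

    Each step multiplies the running value by the base (10) before
    adding the next digit, which is the same place-value rule fifth
    graders learn.
    """
    value = 0
    for d in digits:
        value = value * 10 + int(d)
    return value
```

The same loop works for any base by swapping the 10, which is part of why the notation won out over, say, Roman numerals.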

Yes, except where what the 5th graders come up with fails to meet other important criteria.

A lot of notation is the way it is because of history and inertia, rather than because practical considerations or requirements means it needs to be that way.

If there are changes we can make to make languages more approachable without making them worse in other ways, it makes sense to opt for making them more approachable.

> A lot of notation is the way it is because of history and inertia...

Less than you may think. Leibniz and other mathematicians spent years debating notational forms in mathematics before settling on what we have. See Florian Cajori's _A History of Mathematical Notations_.

So history and inertia by now, in other words. The point is not that people didn't think about them when they first came up with them, but that they are not generally regularly revised based on e.g. practical teaching experience or research.

False: even our languages are full of quirks and take much more time to learn than Esperanto for example.

> Would you design mathematical notations based on the opinion of 5th graders?

It's not their opinions. It's about how they naturally think.

And, yes, if I could change mathematical notation in a manner that made it easier to learn without introducing a more significant disadvantage, I would do it in a heartbeat.

Things are the way they are for a reason but that reason isn't necessarily that the way things are is the best.

I don't see why that shouldn't be taken into account. Coming up with notation that is intuitive is an important part of the design process and like the article said, this is meant to see if you can't use children as a way of measuring intuition quantitatively.

Yes and no. If we need outsiders it means we have become too arrogant ourselves. Need a fresh take.

The thing about programming is that you're typically doing it FOR someone other than yourself. In other words, the programmer is the "middleman" between some domain and the computer. The idea is ultimately to get rid of the middleman.

It is a mistake and somewhat arrogant to view your domain experts as "5th graders".

When was this written? They quote an article from 1985 saying that programming languages have not been designed with human interaction in mind. I think quite a lot has happened since then.

I also think the conclusion that people prefer sets over loops is biased because the problem domain is a database table where it is more natural to work with sets.
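To make the loop-vs-set contrast concrete, here is the same filter written both ways (Python used purely as a neutral illustration; the `scores` data is made up):

```python
scores = [("alice", 90), ("bob", 55), ("carol", 72)]

# Loop style: explicit iteration and mutation, step by step.
passing_loop = []
for name, score in scores:
    if score >= 60:
        passing_loop.append(name)

# Set/query style: describe the result, not the steps,
# closer to LINQ or a SQL WHERE clause.
passing_query = [name for name, score in scores if score >= 60]
```

Both produce the same list; the question the study raises is which phrasing people reach for unprompted.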

In any case I find the study interesting to compare with how people write software requirements, and not with how languages are designed. Requirements are often written in natural-language form, often by product managers who are non-programmers. I found the answers to be extremely similar to user stories seen in requirements, in particular the use of the end-user perspective when describing how Pac-Man should move.

Of course non-programmers aren't used to being extremely precise with their grammar to express solutions to logic problems. We have the capability of getting the "gist" of what someone is trying to express that machines currently don't. Expressing those thoughts to a machine requires precise syntax and well-formed flow control. I honestly don't find much in this paper that's very surprising, but I think the participants did pretty well given that they aren't expected to be extremely precise on a day-to-day basis.

It's not just about non-programmers being insufficiently precise. It's also about the different ways they express precision compared to how you have to do it for existing programming languages. For example, the paper talked about how the participants tended to talk about doing something to everything in a set in vectorised terms whereas scalar languages tend to be more common outside of scientific settings.

Is it really necessary to make programming more accessible to non-programmers? (this is how interpret some of the introduction about making programming easier to a 'beginner')

How is it different from making structural engineering more accessible to non-structural engineers, dentistry more accessible to non-dentists, etc?

Take my latter question not so literally—I mean to ask what is wrong with everyone having their own profession as a result of their passions/natural talent?

I don't think this is a matter of making programming easier for people who are not interested in programming - I look at this as a study of how "normal" programming could be made to model how the human mind might think naturally.

Non-programmers often need to "program". For example, consider writing email filtering rules. The wording and flow could be improved with the learnings from this study. "Apply this label to this mail and all the ones like it, then archive all of those mails".
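A hypothetical sketch of what a rule close to that phrasing might look like as code; every name is invented, and "all the ones like it" is interpreted here as simply "same sender", which is an assumption:

```python
def like(example):
    """Match mails 'like' the example: here, just the same sender
    (an assumption about what 'like it' means)."""
    return lambda mail: mail["from"] == example["from"]

def apply_rule(mailbox, matches, label):
    """'Apply this label to this mail and all the ones like it,
    then archive all of those mails.'"""
    for mail in mailbox:
        if matches(mail):
            mail["labels"].add(label)
            mail["archived"] = True
```

The gap between the sentence and the code is mostly in pinning down "like it"; the rest maps almost word for word.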

You would be surprised how much you can get away with by learning to work with the machine directly. The way the human mind works without training is naturally ambiguous and chaotic. There are of course benefits to offloading some of the thinking and parsing to the compiler or interpreter or whatever; that's why we have Python, because C and assembly weren't enough.

But human beings have been inventing formal notations because they end up being the right tool for the job, most of the time, to guide thought, even if the upfront cost of using them takes some work. I suspect that being afraid of formality will gradually make you lose power over the computer, and will cause unpredictable results when it isn't crystal clear where the impedance to communication and understanding with the machine lies.

I'm all for the more declarative style of programming though, even if it doesn't resemble any particular natural language specifically (remember that papers like these still have a grand Anglo bias in their implementations – imagine the sparsity of a Chinese version of these studies, and the potential difference of the results!).

They need to "program," not program. The difference is key—people who can't write email rules just aren't "computer savvy" and that's not such a large issue with the new generations.

What seems to be happening is people trying to figure out how to make teaching programming easier—like the guy below who works as a TA. I wasn't a TA but I did help a lot of people in school with programming and other CS assignments, and I definitely noticed lots of people in the field for the wrong reason. Either because they thought they could make a lot of money or because their parents wanted them to do it—but they had no natural interest or talent and didn't understand anything. Some of them declared many times they "hate" math or science.

Why do we have to go out of our way to teach people like this computer science principles or programming? They aren't interested in it and don't think the right way. There are many other professions and schools out there, surely most people can find what they really are cut out for, right?

I work as a teaching assistant in a compsci 101 lab - I found this paper to be incredibly eye opening, and will hopefully help me link the intuitive thoughts people have with the specific language features they should be using. We just did a lab on lists and for loops, and this paper basically clarified what the issue most people were actually having, that they wanted to operate on the list in aggregate. While this paper does seem to be looking at potential changes for new languages, it's useful for more than that.

And what makes you think that these potential new language features won't make things easier for experienced programmers as well? Sure, everyone will eventually get their head around iterative loops, but should we not be using the most efficient language for our brains? At the end of the day, a non-intuitive concept will have inherent overhead.

I taught javascript to an older person once, and I think for-loops are one of those things that people who studied programming think "are not so bad", but it's absolutely senseless and insane to outsiders.

I'm thinking

    var output = [];
    for (var i = 0; i < listOfThings.length; i++) {
        var thing = listOfThings[i];
        var newThing = applyAction(thing);
        output.push(newThing);
    }

versus

    listOfThings.map(thing => applyAction(thing));

and even that is a little bit iffy.

I replied to another guy, but it addresses your answer too.

Indeed, if there was something I could force the average person to learn it would not be computer programming, it would be mass balance or other elementary principles of accounting.

Input + Generation = Output + Consumption + Accumulation

It applies to so many problem domains. They should teach it in 5th grade.
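The identity above can be rearranged and checked mechanically; a minimal sketch (function and parameter names are mine) solving for accumulation:

```python
def accumulation(input_, generation, output, consumption):
    # Mass balance rearranged:
    # Accumulation = Input + Generation - Output - Consumption
    return input_ + generation - output - consumption

# Example: 100 units in, 10 generated, 70 out, 20 consumed -> 20 accumulate.
```

Any one of the five terms can be solved for the same way, which is what makes the identity so broadly applicable.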

"Is it really necessary to make programming more accessible to non-programmers?"

no, but it might be helpful for designing programming languages. not necessarily to make programming languages that're easier for beginners to use, but because there are two immediate entities that interact through a programming language, the computer and the programmer. so learning how humans naturally tend to formulate problems and solutions could be valuable for designing that interface.

Last I checked, dentists don't brush our teeth for us.

And all dentists, at one point, were non-dentists.

Yeah, but nobody set out to make dentistry easy for them, or to coach them through it when clearly they have no natural ability to learn the sciences required. Some people just don't get it.

I wouldn't know how to write something that "summarizes how I (as the computer) should move Pacman in relation to the presence or absence of other thing." I can't figure out what that's supposed to mean. Given the image shown, I would have said "if PacMan reaches a wall, the computer should not continue moving PacMan in that direction."

Also, why anthropomorphize the computer? And why are we "summarizing" instead of instructing the computer? The question asked for a declarative solution and then is surprised that the children gave declarative answers. [edit: I misread what was being said in that portion of the paper and that portion of has been replaced by this bit in square brackets]

The experiment seems extremely sloppy. It can't possibly show what it purports to show: there's way too much in the design of the study that could have biased the results, the sample sizes are tiny, the task is unclear, and the quantitative analysis is subjective.

> The list of things they taught kids for the PacMan study also seemed very... well, technical.

What list? Perhaps I'm just missing it or there's another link I haven't seen but they don't seem to mention having taught the kids anything--especially anything technical--before the Pac-Man study.

My mistake; that list was something the researchers prepared before the task was conducted, not part of it.

Not exactly the same, but many years ago I did a small experiment: I gave a few relatively easy logic puzzles to my family (including my grandmother, who only finished elementary school) and to friends studying CS and math. In general, my family solved the puzzles faster, making very basic (but useful) representations, than my friends who followed more formal thinking.

Obviously I can't extrapolate or make big assumptions from this tiny experiment, but I had underestimated my family members' capabilities. Also, it is obvious that in complex areas of study it's very difficult to come up with a solution if you are a novice.

Note: This was published in 2001. Here's a Scholar link of other papers citing it (none of the top ones are very recent) in case anyone wants to dig deeper:


A good search term is "computational thinking" which probably needs to be combined with a couple of synonyms for "non-programmer".

Reminds me how non-programmers quickly pick up the vectorization of R and programmers keep writing loops.
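The same contrast can be sketched outside R (Python lists standing in for R vectors here; the data is made up):

```python
prices = [10, 20, 30]

# Loop style many programmers reach for first:
doubled_loop = []
for p in prices:
    doubled_loop.append(p * 2)

# Whole-collection ("vectorized") style, closer to R's `prices * 2`:
doubled_vec = [p * 2 for p in prices]
```

In R the second form is the default way of thinking, which may be why newcomers pick it up so readily.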

It feels a little bit disingenuous to say that you can use a spreadsheet's builtin "sum" function to compute a sum. It honestly sounds like an argument in favor of functional languages; here's a Haskell program that's just as simple:

  Prelude> sum [1, 2, 3]
  6

Hey, you could even search for a library that sums lists for you in C, then include and call that function.

And in C++ you would write this as follows (with a range library it would be a one-liner):

  const auto & xs = {1, 2, 3};
  cout << accumulate(begin(xs), end(xs), 0);

Yes, it's a simple first order function on some kind of collection.

A sum function is not very `functional' at all. (It sure feels at home in functional languages, more than in imperative bit-by-bit piecemeal programming, sure.)

I think the idea that the author was trying to express was that in a spreadsheet language, you select a data range and then choose a function to perform on that data. That's the "functional" aspect I took away from my reading of it.

It's no huge surprise that spreadsheets implement a form of FRP. Specifically, you can look at loeb's function [1] as an example of the relation. Being able to focus on the data itself while easily composing operations is definitely functional in nature.

However, FP is not a silver bullet either. In practice I think humans care a bit less about correctness than compilers do, and would like some aspects of the language to be 'fuzzy', for lack of a better word. To most people "1" and 1 are the same thing, so why shouldn't the compiler understand that? There could be potential for a dynamic-FP language of sorts.
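One guess at what such fuzziness could mean in practice: normalize values before comparing, so "1" and 1 agree. A hypothetical sketch (the normalization rule is an assumption, and deliberately crude):

```python
def fuzzy_eq(a, b):
    """Treat values as equal if they normalize to the same string,
    so fuzzy_eq("1", 1) holds. Whitespace is ignored too."""
    return str(a).strip() == str(b).strip()
```

A real language would need a much more careful rule (this one distinguishes 1 from 1.0, for instance), which hints at why compilers usually refuse to guess.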

[1] http://blog.sigfpe.com/2006/11/from-l-theorem-to-spreadsheet...

`Fuzziness' and FRP are pretty orthogonal.

I.e., even in Haskell as it is, it's pretty easy to add an 'if' that takes 'truthy' and 'falsy' values like in Python. Just use a type class 'Booly' that includes a conversion function toBool.

For example, a typical C program to compute the sum of a list of numbers includes three kinds of parentheses and three kinds of assignment operators in five lines of code

Let me take the opportunity to plug a new language which I have spent the last 5 months designing: github.com/jbodeen/ava

An ava solution -- 9 lines of code, and 1 set of parentheses -- would look like this:

  let rec sum list = 
    let are_we_at_the_end = 0 in
    let take_a_number_and_the_rest_of_the_list n list =
      add n ( sum list )
Maybe we need to zoom out of ancient languages into more intuitive paradigms if programming is to become easier for more people to access

Could you elaborate on how this is more understandable to a novice than the typical C program? My guess is that neither is immediately understandable, and that the C program is clearer to someone experienced.


  sum(ListOfNumbers) ->
      sum(ListOfNumbers, 0).
  sum([Number | RestOfList], Subtotal) ->
      sum(RestOfList, Subtotal + Number);
  sum([], Total) ->
      Total.
edit: blargh... sorry @simoncion, I didn't see your reply. It apparently takes me longer than 14min to type that out on my phone without typos + proper spaces to treat it like code :-)

> sorry @simoncion...

No worries. It's astonishing how absolutely awful on-screen phone keyboards still are for doing anything more involved than writing a brief human-language message.

That's because they are optimized for that.

No reason, apart from economics, why even with current IDE technology we couldn't make one that works well for specific programming languages.

> That's because they are optimized for that.

That's like a big chunk of my point. :) Phone/tablet keyboards really suck for anything other than short human-language text entry. You wanna write a five-page paper? A code snippet to demonstrate a problem? Forget about it.

It's astonishing that these devices have been around for at least seven years and their packed-in keyboards still fail at these tasks.

> Maybe we need to zoom out of ancient languages...

Maybe. Compare your function to the equivalent Erlang one:

  sum(L) ->
    sum(L, 0).
  sum([], Acc) ->
    Acc;
  sum([Num|Rest], Acc) ->
    sum(Rest, Acc+Num).
Assuming that it's in a module called 'math', run it like so:
  math:sum([1, 2, 3]).

Typical C program, eh? Yet another reason to not program in C, I guess.

sum = std::accumulate(begin, end, 0);

Wow, that's pretty verbose. But I guess the long variable names are partly to blame.


let listsum = foldl (+)

You probably want foldl' (+) 0

foldl1 (+) ?

I think it's better to give a sane result for the empty-list case when it's possible.

There are two aspects of this study that I think nullify its value.

First, they take naive user reasoning as normative. Nobody remains a naive user for long. When I first started learning to program, shifting from set-based to iteration-based reasoning about collections was a bit of a jolt, but it didn't take me long to become comfortable with it.

Second, they ignore that programming is an activity requiring much more precise reasoning than typical daily life. You must learn to think differently and it is beneficial for the notation to enforce this.

I will also point out that English-like programming languages have been promoted for decades and they haven't caught on.

Isn't Excel essentially "programming for non-programmers"?

So what's the closest language/language subset that fits this?

Good question. Racket tries, and because it's Scheme gets a good part of the way. Mozart/Oz tries, with a different basis, and gets somewhere good.

Visual Basic also seems quite close, especially with features like Linq.

> Visual Basic also seems quite close

Honest question - what is there about Visual Basic that makes you say this? I've written small programs with it and on the surface it feels close to "C# if someone changed all the keywords".


You might like Functional Relational Programming. See http://lambda-the-ultimate.org/node/1446

This is definitely true, and why I had more than a few stumbles learning it after I was already an experienced programmer.

Case insensitivity in string comparisons is a good example. It makes perfect sense to a non-programmer but is not at all what a programmer would expect.

What case insensitivity are you referring to?

MySQL with default collation / locale / operators does case insensitive string comparison IIRC

Exactly. Something like 'P' = 'p' evaluates to true.
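A rough sketch of that behavior in Python (assumption: plain case folding; real MySQL collations are charset- and locale-aware):

```python
# Sketch of a case-insensitive string comparison, roughly how a MySQL
# default collation behaves (assumption: simple case folding only).
def ci_equal(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()

assert ci_equal("P", "p")     # 'P' = 'p' is true under the collation
assert not ci_equal("P", "q")
assert "P" != "p"             # plain comparison, what a programmer expects
```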

I always tell non-programmers (and sometimes programmers interfacing with the systems I maintain):

Don't tell me what I need to do, tell me what You want to do. Mostly clients seem to think they need to come up with the way to accomplish stuff, rather than express the need and let the programmer figure out how to meet that need.

The problem with programming isn't solving simple problems. The hard part is dealing with the hard problems. An important contribution of Computer Science is the recognition that abstractions (functional, procedural, object-oriented, relational, ...) are necessary when writing software of any significance. The paper's simple problems perhaps give insight into how a programming language for children should be designed, but for the most part its recommendations should be ignored.

The paper points out that non-programmers (5th graders) have trouble with NOT, AND, and OR and suggest in a separate paper that table based queries can avoid some confusion with these Boolean operators. I'm sorry, but a programming language without Boolean operators is going to be worthless. Just because 5th graders haven't learned De Morgan's Laws doesn't mean that we should throw out Boolean operators. What about lambda expressions, functions as first class elements, higher dimensional arrays, recursion, complex numbers, binary and decimal internal integer representations, floating point with exponents, built in log functions, setjump/longjump, call-with-continuation, threads, concurrency, interrupt handlers, atomic locking, streams, files, relational data bases, the list goes on and on.
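(For reference, the De Morgan equivalences in question, checked exhaustively in Python:)

```python
# De Morgan's laws, checked over all four Boolean combinations.
for a in (False, True):
    for b in (False, True):
        assert (not (a and b)) == ((not a) or (not b))
        assert (not (a or b)) == ((not a) and (not b))
print("both laws hold")
```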

Programming in a programming language for kids tends to be tedious and very concrete. Scratch bored me to tears. Perhaps it's a good fit for kids, but it's not going to be used to write a web server. I just don't see how these "experiments" give us any insight into non-toy programming.

In the 1970's there were still plenty of professional programmers and fellow grad students that felt like programming in assembly language was the highest form of programming. It was challenging, I did my fair share, but it was also brutish and nasty. There were no powerful abstractions to facilitate ones programming. Everything was concrete and explicit and terrible. The history of programming languages has been to build a tower of increasingly powerful abstractions over the hardware below. C++ templates, Haskell's type system, Scheme's call-with-continuation, SQL, these are so far removed from the simple little operations being performed by the processor, but they give us the power to write the programs that we do.

The use of abstraction in programming isn't limited to programming languages. Libraries supporting matrix operations won't make sense to a 5th grader or anyone else that hasn't studied matrices. So how is a 5th grader going to describe rotation of a graphical element? They don't know matrix math or trigonometric functions? Should these be eliminated from programming languages? Operating systems also insulate us from the hardware through abstractions not present in the physical hardware: processes, scheduling, virtual memory, files, abstract sockets, networks, threads. What about the other tools we use like relational data bases, source code control systems like git, and bug trackers? How about pseudo-random numbers and encryption? What do 5th graders know of these?

Finally, some professional programmers have to understand deeper issues, programming complexity, turing incompleteness, regular expressions, context-free grammars, LR parsing, performance of algorithms, correctness arguments. All of these issues have some impact on programming. Are we really going to throw all of this out because it is confusing to 5th graders? or even adults that haven't studied these issues?

Firstly, no-one is suggesting we get rid of the hard programming. But actually, I disagree with your point of view.

I do research into A.I., and often work with companies. I find problems fall into 3 categories. Approximately:

50% of problems are extremely trivial problems which we've know how to solve for at least 10 years, we just need to help the companies use the existing techniques.

15% of problems require techniques from the last 10 years or so, so require extensive up-to-date expertise but aren't interesting research.

5% of problems are interesting, hard problems which lead to interesting research problems.

30% of problems are so far beyond the state of the art they are impossible.

Helping people in that first 50% solve their problems without having to talk to us is an interesting area. At the moment they use tools like Excel and Microsoft Access. I believe there must be better tools for those problems, which aren't any more complicated. Why can't my mother (for a concrete example) easily create and update a schedule for her darts league without needing me to help?

I took a business analytics class which taught SQL and basic machine learning. We exclusively used an excel plugin, and it was exceptionally easy to get great results. This sounds like something that would help that 50%.


non-programmers tended to define/use

- declarative event-based rules over imperative flow

- set manipulations instead of one-by-one iterative changes

- list collections instead of arrays. ability to sort implicit

- rule-based exclusions for control flow instead of complex conditionals with NOTs

- object-oriented state but no inheritance

- abstract past/future tense to describe information changing over time instead of defining state-variables

other issues

- under-specified mathematical operations

- AND used as logical OR e.g. "90 and above"

- life-like motion/action assumed instead of defined; e.g. not defining x,y location and frame-by-frame delta
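The set-manipulation point maps pretty directly onto comprehension-style code; a small Python sketch of the contrast (the scores list is made up):

```python
scores = [72, 95, 88, 61, 90]  # made-up data

# One-by-one iterative style:
passed_loop = []
for s in scores:
    if s >= 90:
        passed_loop.append(s)

# Whole-collection style ("everything that scored 90 and above"):
passed_set = [s for s in scores if s >= 90]

assert passed_loop == passed_set == [95, 90]
```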

In the paper, they dismiss the example of "if you score 90 and above" as incorrect use of "and" (or too vague to turn into any formal logic).

However, looking at your summary, it suddenly sticks out that this issue could actually be connected with the tendency to use set manipulation. "If you score 90 and above", and I suspect many other seemingly abusive uses of "and", can turn out to be perfectly valid if you consider them as (infinite) set manipulations. However, I'm not sure which of the two explanations is closer to the actual cognitive processes behind such a phrase. Seems to me that humans are naturally comfortable with many set manipulations, while current computers require fairly elaborate abstractions in order to deal with them as sets, especially infinite. This might be one of the gnarly parts of human -> machine translation.

To elaborate on this point, "if you score 90 and above" could be parsed in two different ways:

  1. "if [you score 90] and [you score above 90]"
  2. "if you score in [{90} 'and' {x: x > 90}]"
[1] is unsatisfiable. [2] is still ambiguous, as it's unclear in natural language whether 'and' is a set union or intersection.

In mathematical terminology, 'and' in this context would mean set intersection, but I don't think it's necessarily "incorrect" to have this mean set union in natural language.

To elaborate, take: C = A union B. Here are two propositions about C:

  I. forall c in C. (c in A) OR (c in B)
  II. (forall a in A. a in C) AND (forall b in B. b in C)
These propositions are not equivalent. [I] actually implies C is a subset of (A union B), and [II] implies that it's a superset. Note that set builder notation for C, {c: (c in A) OR (c in B)} is structurally very similar to [I].
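The non-equivalence is easy to check on small finite sets; a Python sketch (sets A and B are arbitrary examples):

```python
A = {1, 2}   # arbitrary example sets
B = {3}

def prop_i(c):   # [I]: every element of C is in A or B, i.e. C is a subset of A|B
    return all((x in A) or (x in B) for x in c)

def prop_ii(c):  # [II]: C contains all of A and all of B, i.e. C is a superset of A|B
    return all(a in c for a in A) and all(b in c for b in B)

assert prop_i({1}) and not prop_ii({1})                    # subset, not superset
assert prop_ii({1, 2, 3, 4}) and not prop_i({1, 2, 3, 4})  # superset, not subset
assert prop_i(A | B) and prop_ii(A | B)                    # A|B satisfies both
```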

I think [II] is the interpretation of 'and' that is intended through the natural language use. It's essentially a form of set construction: I am constructing a set; it contains 90, and it contains the numbers above 90. As a set construction it also adds an implicit constraint that the new set can't contain anything not in the operands, so that resolves the superset ambiguity (it would be patently absurd in natural language to claim that 55 could be in the set "90 and above").
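That construction can be written out literally; a Python sketch over a finite universe of scores (the 0-100 range is my assumption):

```python
# "90 and above" as a set construction: {90} 'and' {x : x > 90}, with the
# natural-language 'and' read as union (assumed finite universe of scores).
universe = range(0, 101)

ninety = {x for x in universe if x == 90}
above_ninety = {x for x in universe if x > 90}
ninety_and_above = ninety | above_ninety  # the 'and' as union

# ...which collapses to the predicate a programmer would have written:
assert ninety_and_above == {x for x in universe if x >= 90}
```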

I don't read it that way. To me the use of the word "AND" is a red herring, as it wasn't meant to imply a logical operation, but rather in the usual sense (and meaning "plus that") to denote the range as half open.

  [90, infinity)
as opposed to:

  (90, infinity)

The ACTUAL meaning:

3. "if [you score 90] do [x] and if you score [above 90] also do [x]"

I think about it as:

"90 and above" is the smallest set X satisfying both:

* 90 is in X


* Above(90) is a subset of X

Another description of this set is:

An element x is in the set "90 and above" if x is 90 OR if x is in Above(90).

AND/OR are dual to each other, and it's just a matter of perspective on whether you're building up the set(OR) or constraining the set(AND).

Analysis: non-programmers don't naturally code to the Von Neumann machine architecture. They instead declaratively define higher-level rules and operations with a naive grammar strongly influenced by human language and experience.

I don't think "programmer" and "non-programmer" is a binary distinction. Most programmers don't have to think when iterating over an array, but even experienced programmers sometimes take a non-programmer-like approach when they see an unfamiliar problem.

In other words, if we could make a programming language that's more approachable to non-programmers, we might benefit everyone.

I thought the most interesting one was the use of then as a temporal construct. First this, then this, then that. That's the common use in literature, but that sense of then is implied by statement sequence or function composition in programming. When you are programming the use of then as conditional consequence is so natural you don't think of it as different to the way most people (66%) use it.

This is something I'm having trouble explaining to Business Analysts - they tend to think and define systems declaratively, whereas when we receive a set of requirements we prefer to work from an imperative set of instructions. I've not succeeded in conveying this (they either think devs are just being difficult/lazy or they just fail to grasp what we need) - anyone else suffered this? If so, have you got any tips as to how it could be better conveyed?

They are telling you what to build, you should be deciding how to build it. A business analyst who writes requirements as imperative pseudo code is doing most of your thinking for you.

That is like saying that the person who came up with the conjecture did more work than the person who discovers the proof...

Maybe you could use a more declarative language? Or build a domain-specific language yourself?

Indeed, if they give you declarative rules, use prolog or some expert system engine.
