Ask HN: What human language features could be useful for the writing of code?
60 points by webmaven on April 7, 2018 | 58 comments
Currently, programming languages seem to be largely isolating[0] and invariant rather than inflected[1].

Could operands such as variable names be usefully inflected (probably agglutinatively[2] for simplicity's sake) to indicate type or other contextual constraints?

How about more extensive stacking[3] of operators (an existing example would be += that combines addition and assignment) the way some languages do verbs?

Could we use tenses[4] somehow to make parallel and concurrent programming easier?

Could articles[5] be used to do things like pass variables or instantiate objects?

Anyway, there seem to be opportunities to make programming languages more expressive by borrowing human language features rather than using ever more complex typographical conventions. What do you think?

[0] https://en.wikipedia.org/wiki/Isolating_language

[1] https://en.wikipedia.org/wiki/Inflection

[2] https://en.wikipedia.org/wiki/Agglutinative_language

[3] https://en.wikipedia.org/wiki/Serial_verb_construction

[4] https://en.wikipedia.org/wiki/Grammatical_tense

[5] https://en.wikipedia.org/wiki/Article_(grammar)




I've been experimenting with various riffs on Common Lisp for about thirty years now, and eating my own dog food. I've found four things that turn out to be huge levers for me:

1. Python-style iterators. No matter what I want to iterate over, I just write:

    (for item in thing do ...)
or if I want to collect the results:

    (for item in thing collect ...)
2. A universal binding construct, which I call binding-block or BB for short. It aggressively eliminates parens so that my code ends up looking like:

    (bb var1 (expression)
        var2 (expression)
        ...)
3. Generic functions, and in particular a set of standard generic functions that perform the most common operations on any data type. For example, REF dereferences anything: it replaces ELT, NTH, SLOT-VALUE, and AREF, and even certain database queries.
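For readers who don't speak Lisp, the same idea can be sketched in Python with `functools.singledispatch` (my sketch of the concept, not the author's actual REF):

```python
from functools import singledispatch

# One generic "universal dereference": a single operation that
# indexes sequences, mappings, and object slots.
@singledispatch
def ref(obj, key):
    # Fallback: treat the key as a slot/attribute name (like SLOT-VALUE).
    return getattr(obj, key)

@ref.register(list)
def _(obj, key):
    return obj[key]  # like ELT/NTH/AREF

@ref.register(dict)
def _(obj, key):
    return obj[key]  # a keyed lookup

class Point:
    def __init__(self):
        self.x = 7

print(ref([10, 20, 30], 1))  # 20
print(ref({"a": 1}, "a"))    # 1
print(ref(Point(), "x"))     # 7
```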

4. Classes, with very light use of inheritance.

For an example of code written in this style, take a look at:

https://github.com/rongarret/tweetnacl/blob/master/ratchet.l...

You can transfer some of this to other languages, e.g.:

https://github.com/rongarret/ratchet-js/


Well, I've been working on a language based on linguistic universals for over ten years now. It's called Pyash.

Of the things you mentioned, it is analytical rather than isolating, so there is morphological separation between grammatical concepts and stem constructs, much like Lojban's cmavo vs. gismu.

Biggest difference from Lojban and other unnatural programming languages is the use of grammatical cases for denoting parameters.

For instance: "hyikdoka tyutdoyu plostu", which glossed is "one _number _accusative_case, two _number _instrumental_case, plus _deontic_mood", or in colloquial English, "Increase the number one by the number two!"

(The result is "the number three", or "tyindoka li".)

It also has a rich type system based on noun classifiers, a language construct popular in Asia. An English example is "two dollars of corn", where "dollars-of" is the noun classifier for "two".

In terms of tenses and concurrency, I already have a fairly elaborate asynchronous parallel execution model, but it could probably use a future tense for scheduling. And the computer could accurately describe its state by combining tense and grammatical aspect.

Articles are generally found in languages that have lost a nominative-accusative case distinction, and can generally be piled up with anaphoric references, i.e. referring to variables.

Here is a paper I wrote about it last year: http://liberit.ca:43110/1DYjc22BP5VkqNLJgr3nQfoGiNLbGFGffG and, for slightly more information, http://pyac.ca

If anyone is interested, I can put up a more recent paper as well.


Well, I'm certainly interested; it sounds absolutely fascinating.


Good to hear! Here is a paper I wrote at the beginning of this year: http://liberit.ca:43110/1PLUkrWc9KGrj3d8BaPctpC3CbrXtspHq2/p...



I know Perl has a bad rep for being noisy, but it also has some nifty conventions to make code more rational. For example, you can put your `if` condition after your expression, or use `unless` rather than an awkward `if (!(...))` kind of thing.


Ruby copied those two, by the way. Useful features.


Yes! a great way for OP to avoid "ever more complex typographical conventions" /s



I like programming languages much more than natural ones, so what usually makes me happy is seeing a language-for-humans resemble a programming language, like https://en.wikipedia.org/wiki/Lojban.

But thank you so much for the links! I learned quite a lot from them, and now I at least know that the reason German feels more similar to Russian than to English is the "analytic vs. inflected" distinction.


Pretty much along your line of thinking, check out this 2003 paper by Crista Lopes titled "Beyond AOP: Toward Naturalistic Programming":

http://www.dourish.com/publications/2003/oopsla2003-naturali...

Abstract:

> Software understanding for documentation, maintenance or evolution is one of the longest-standing problems in Computer Science. The use of “high-level” programming paradigms and object-oriented languages helps, but fundamentally remains far from solving the problem. Most programming languages and systems have fallen prey to the assumption that they are supposed to capture idealized models of computation inspired by deceptively simple metaphors such as objects and mathematical functions. Aspect-oriented programming languages have made a significant breakthrough by noticing that, in many situations, humans think and describe in crosscutting terms. In this paper we suggest that the next breakthrough would require looking even closer to the way humans have been thinking and describing complex systems for thousands of years using natural languages. While natural languages themselves are not appropriate for programming, they contain a number of elements that make descriptions concise, effective and understandable. In particular, natural languages' referentiality is a key factor in supporting powerful program organizations that can be easier understood by humans.


Pronouns, for starters. Letting users rephrase commands to highlight the most salient pieces is another. More flexible syntax would be good, too; people are really good at interpreting language, and compilers are nowhere near as good.


I know we're in brainstorming mode, so please don't interpret this as criticism of your suggestion.

Don't named variables serve the same purpose as pronouns? What's an example of a pronoun-like usage in a hypothetical programming language?


> Don't named variables serve the same purpose as pronouns?

Variable names are closer to proper names than to pronouns. Try taking any piece of narration and removing the pronouns; it'll read something like:

Mark woke up in the morning when Mark's alarm went off. Mark dangled Mark's feet over the side of Mark's bed, and slipped Mark's feet into Mark's slippers.

A lot of code reads like that, too.


I imagine the parent meant something like this:

    if (x > maximum) {
        it = maximum; // 'it' refers to x
    }
As for a hypothetical usage, I imagine it would make refactoring easier in some places, but could also open code up to potential errors (accidentally referenced another variable? 'it' suddenly changes and you didn't notice!)
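To make the idea concrete, here is a toy desugaring sketch in Python. It is purely hypothetical: it just substitutes the first identifier from the condition for 'it'; real resolution would need a parser and rules for ambiguous antecedents.

```python
import re

def resolve_it(condition, body):
    """Naive 'pronoun resolution': rewrite the pronoun 'it' in the
    body to the first identifier mentioned in the condition."""
    match = re.match(r"\s*([A-Za-z_]\w*)", condition)
    if match is None:
        raise ValueError("no antecedent found in condition")
    antecedent = match.group(1)
    return re.sub(r"\bit\b", antecedent, body)

print(resolve_it("x > maximum", "it = maximum"))  # x = maximum
```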


Named variables require the creation of, well, names. Naming things is one of the few hard problems in computer science, and at the extreme ends are single-letter variable names or variables_that_are_short_phrases, both of which are annoying to use.

Several languages have usage I'd consider "pronoun-like". Perl has $_ and Bash has $?. Unfortunately, "$?" doesn't exactly roll off the tongue, and there's some truth to criticisms of Perl's readability. That may be related to the quantity of them that Perl supports, though.

self in various languages supporting OO is a fairly common example of what I'd consider good pronoun-like usage - if I'm inside a method that belongs to a class, what self refers to is accessible and largely unambiguous.


See, I hate the use of self, or this in code. We want to specify things exactly in code, and deliberately adding that sort of indirection seems halfway mad. If forced to write classes, I would much rather refer to the class name from within methods, or something similar.


This is valid Python:

    >>> class Foo:
    ...    def __init__(Foo, x):
    ...        Foo.x = x
    ... 
    >>> foo = Foo(5)
    >>> foo.x
    5

The variable name `self` is a convention, as is `cls`. You can name the first argument to a method whatever you please. The same is true for method receivers in Go:

    type Foo struct {
        X int
    }

    func (Foo Foo) Bar() int {
        return Foo.X * 2
    }

That will compile and perform as expected. I'm not sure about Java/C#, but in JavaScript you can bind `this` to another variable if you'd like:

    var Foo = {
       x: 5,
       foo: function() {
           Foo = this
           return Foo.x * 2;
       }
    }


_ in the Python interactive shell seems like a good example; it is bound to the result of the previous expression. It'd also be nice to have some sort of statement-scoped variable to help reduce error-prone repetition without polluting the block namespace, as functions often take multiple arguments derived from the same intermediate value.
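The shell's behaviour can be modelled in a few lines (a simplified sketch: the real interactive shell only rebinds `_` for non-None expression results, via sys.displayhook):

```python
def mini_repl(expressions):
    """Evaluate expressions in order, binding '_' to the previous
    result, like the interactive Python shell does."""
    env = {}
    results = []
    for expr in expressions:
        value = eval(expr, {}, env)
        env["_"] = value  # the 'pronoun' for the last result
        results.append(value)
    return results

print(mini_repl(["2 + 3", "_ * 10", "_ + 1"]))  # [5, 50, 51]
```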


Many programming languages have naming conventions which act as a sort of inflection. Classes may always be capitalized, whereas functions and methods may not be. Constants might be written in all caps, while private variables might be preceded by an underscore.

You can draw comparisons between natural language and programming languages just fine. Maybe functions are like verbs and arguments are like the subject. And maybe partial application of a function is like having an intransitive verb. Maybe a decorator is like an adverb and a type constraint is like an adjective.

In general, though, I think the metaphor about programming languages is wrong: programming languages are instructions, not really languages. A programming language should be easy to read out of order more like a magazine than a novel, it should be possible to get a general grasp of what it's doing by glancing at the code, and the code should eliminate ambiguity and uncertainty. Natural languages don't work that way; they're repetitive, vague, and rely on context (social and otherwise).


Exactly. The only 'natural language' exercise I can think of that compares with programming is the drafting of legal documents, which also aim to specify something precisely according to a very particular set of definitions.

Unsurprisingly, legal documents have a reputation for being at least as difficult to read as some code.

Natural language, designed for human communication, is, by its nature, unsuitable for precise definition, and a precise language is also unsuitable for human communication.

It really makes sense once you realize how much conversation and prose depend on puns, allegory, allusion, symbols, context, emotion, and so on. All of which are hostile to stating a thing as exactly as possible.


There is something along those lines that I've been thinking about: Human languages evolve over time.

Programming languages also change from version to version, but it is the creators of the language making the changes. In human languages, the changes are organically created by the users. Changes that don't end up being useful never take off.

I think it would be practically possible to build a pipeline for this to happen: allow users to extend/modify the language [0][1] and allow those extensions to be shared.

Some languages already have a process for taking popular 3rd party libraries and eventually integrating them into the standard library distributed with the compiler, the same could apply to language modifications.

[0] Modifying the language could involve _removing_ features .

[1] This may need to be more complicated than syntactic macros, which is currently the most common means of extending languages.


In terms of the language specification, it's the creators who make the changes. But closer to human languages would be the dialect in use in any given project: often a subset of all the features, combined with conventions for how to use it, what to call things, etc. That might well evolve similarly to human languages.


JavaScript does have an extensible transpiler (Babel) that allows everyone to make changes, and you can make proposals to the TC39 committee, too.


Human language features are, by and large, egregiously detrimental to programming languages.

An expert speaker of one human language can require years, even decades, to become equally proficient in another.

Woe to a programming language which sucks that badly.


Great question!

I think we're slowly getting there. First we largely had verbs (procedures) that acted on data somehow. OO added a bit more on the noun side, so we could start to form simple sentences. IMHO much of the appeal of OO is due to this richer "sentence structure", with messages sent to objects.

Of course you could overdo it (see Yegge's Execution in the Kingdom of Nouns[1]), and the FP guys definitely think we overdid it and verbs were fine all along and who needs anything else. Verbs!

Higher Order Messaging[2] adds "adverbs", so we can say something about how those messages are to be delivered. It allows for some very expressive sentences, which you could also describe as crunching architectural patterns. See Sam Adam's great talk[3].

Polymorphic Identifiers[3][4][5] go back to the noun side, adding what you might describe as noun phrases, so you can have something more complex that again acts like a noun. Sort of.

Then there's Alistair Cockburn's work[6].

[1] https://steve-yegge.blogspot.de/2006/03/execution-in-kingdom...

[2] https://en.wikipedia.org/wiki/Higher_order_message

[3] https://www.hpi.uni-potsdam.de/hirschfeld/publications/media...

[4] https://link.springer.com/chapter/10.1007/978-1-4614-9299-3_...

[5] https://www.slideshare.net/MarcelWeiher/in-processrest

[6] http://alistair.cockburn.us/Using+natural+language+as+a+meta...


Damian Conway explored some of this topic in "Lingua::Romana::Perligata -- Perl for the XXI-imum Century". See http://users.monash.edu/~damian/papers/HTML/Perligata.html .


I dunno, linguistics seems to mess up programming more, like mixing singular and plural names for fields/variables (customer_name vs. customers_name). I'm always trying to whittle stuff down to the best descriptors rather than trying to make things more expressive. Expressiveness is best in comments. :-)


> Could articles[5] be used to do things like pass variables or instantiate objects?

I wrote about something similar to this (called "type-named objects") in a post last year[1]. The idea is that for small utility methods, you shouldn't have to interrupt your main flow of thought to come up with names for your formal parameters; in fact, you shouldn't even have to name your locals. If you're working in a language with a (perhaps already verbose) static type system, you can just leverage that.

So, for example, instead of this:

  appendNode(node: Node): void {
    this.tail.next = node;
    this.tail = node;
  }
... you can do:

  appendNode(Node): void {
    this.tail.next = the Node;
    this.tail = the Node;
  }
This does introduce some precedence/binding gotchas that I go into in more depth. But that hints toward further improvements. From the aforementioned post, you could end up with a language that allows:

  the ControlMessage's id := defocus
Or, if you add support for "passive voice" to your language:

  id from the ControlMessage := defocus
Completely unrelated, but related to the use of "the", Carmack wrote a tweet[2] that I've got embedded in the comments of a project of mine.

> I have moved to naming global singletons with a The* prefix -- ThePacketServer, TheMasterServer, TheVoip, etc. Feels pretty good.

Since then, I stopped creating "proper" singletons altogether. Instead, it'd be a quasi-singleton, where I write a `MasterServer` class, and then bless an instance as `TheMasterServer`. Application code will always import, refer to, and otherwise operate on the latter, but tests are free to instantiate their own. This is especially useful if you have implemented some sort of developer switch in the UI that allows you to launch an embedded test harness within a live, running instance of the app. In that case, you don't exactly want your tests to be fiddling with your "real" singleton.

1. https://www.colbyrussell.com/2017/02/16/novel-ideas-for-prog...

2. https://twitter.com/ID_AA_Carmack/status/575788622554628096


I can't remember the name, but there is a language with type-named parameters. It's an object-oriented language inspired by Java.

It doesn't use "the" -- it's clear from context whether something is a type name or reference, so you just use "Node". To handle multiple variables of the same type, you can use Node, Node', Node'', etc.

It also supports phrasal methods, where arguments can appear in the middle of the method name. For example, what would be divide(x, y) could be defined as divide(x)By(y) instead.
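Phrasal methods can be approximated in languages without them by chaining. A Python sketch, using the `divide`/`by` names from the example above (an illustration of the style, not the language being described):

```python
class Phrase:
    """Lets divide(10).by(2) read like a sentence, with the argument
    slotted into the middle of the 'method name'."""
    def __init__(self, x):
        self.x = x

    def by(self, y):
        return self.x / y

def divide(x):
    return Phrase(x)

print(divide(10).by(2))  # 5.0
```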


I believe Swift uses this idea, where the 2nd and subsequent parameters often have preposition-like argument labels:

    divideInto(x, by: y, mod: z)


I feel like this would be a great argument for an `args` keyword, in the same way that `this` is a keyword. `args` would just be a JSON-like object with some additional syntactic sugar for adding/removing parameters.

So if you have an overloaded method, say

  print(String)

  print(String, Int)
you could just do

  print(args.extend(Int))


Seems like AppleScript.


Code phonology (vocalization of code) might be a good place to start: https://news.ycombinator.com/item?id=16554865


Maybe some keywords relating to causality/attribution. For instance the word "because", used to express constraints.

"Usury is forbidden due to Sharia law."

Or the modern idiom,

"No universal health care, because reasons."

In programming you could do something like

"Foo is constrained because bar", where 'bar' is a set of limitations that are noted elsewhere and can be updated. Not sure if there's really anything new here though, plus it's really just a new keyword not a new linguistic graft.


Sounds like Prolog.


I would say Python is pretty close to the English language, and that's part of why it is so easy to understand. If you write idiomatic Python, it can read like sentences. There are also jokes like "I wrote pseudocode and Python can actually run it". The ternary operator, for example, is closer to English grammar than in any other language, because it was constructed with this in mind.
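For reference, Python's conditional expression does put the condition in the middle, mirroring English word order:

```python
age = 20
# Reads almost like the English sentence:
# "status is 'adult' if age is at least 18, else 'minor'."
status = "adult" if age >= 18 else "minor"
print(status)  # adult
```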


Too often I begin writing pseudocode only to realize that writing valid Python is actually quicker. It's sloppy Python, but we'll get back to that once the concept is proven.


What you are looking for is APL and its descendants. While you are at it, find Iverson's paper "Notation as a Tool of Thought".


The biggest one I would say is "context".

It would be nice to avoid repetition in a controlled manner by having some context.

This would make it less tedious to pass the same arguments to a function over and over again. For example, say a web application has authenticated the current user.

In the code that follows you could somehow tell it "by 'user' we mean the current user, by 'query' we mean query the database using the current database connection".

Effectively this would be similar to a transparent auto-wired dependency injection.
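One way to approximate this today is Python's `contextvars` module (a sketch of the idea, not a full dependency-injection framework; the `current_user` variable and `greet` function are invented examples):

```python
from contextvars import ContextVar

current_user = ContextVar("current_user")

def greet():
    # No user parameter: "the user" is resolved from the ambient context.
    return f"hello, {current_user.get()}"

token = current_user.set("alice")
print(greet())  # hello, alice
current_user.reset(token)
```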

The biggest mismatch between human and computer languages is that we as humans are very context-aware.

Computers are at the other end of the spectrum. Between two different function calls they don't remember or assume anything, so we have to keep passing them what's obvious to us. That in itself increases bugs, because it's a lot easier to make a mistake when you are doing something repetitive 50 times than when you declare it correctly once, in one place.

Other useful ones could be having some expressions that are not strictly binary or function calls.

And the ability to define these for custom types.

For example "user is administrator", "1001 (is|is not) palindrome", "100 is divisible by at least 2 of [1, 10, 50]".

Another one, accessing attributes/fields, "firstName of user", "factors of 100", "favoritePosts of user".

Tests about attributes/fields, "if user has twoFactorAuthentication".

Another one: more "space friendly" languages, so you could do "let the thing we are printing := 'test';".

Now let me say I'm fully aware most of what I have said is very difficult if not impossible.

The problem is language complexity and parsing and also composition. The examples I have made don't really hold up in general cases or compose well so it's hard to fit them in general purpose languages.

If we were using projectional structure editors they would be much much easier to achieve. Because we wouldn't have to deal with language rules like that. You could render the program structure in a human friendly way without forcing the computer to parse and understand the same thing.

One of the biggest bottlenecks with programming languages today is that humans and computers are sharing a text based language with each other. And as a result we have to compromise on it to achieve a balance where it's easy enough for the computer to parse and understand and easy enough for humans to read and write.

If we could separate the two, we could optimize the languages for ourselves and let the computers use something much more precise and unambiguous under the hood, like binary, that makes them happy.

All much easier said than done.


> a transparent auto-wired dependency injection.

I think you're looking for dynamic scopes. These are annoying for most programming, since you can't just look around nearby the way you can with lexical scopes, you have to figure out what the call stack for the function would look like. When used well, however, they can simplify things considerably.

UNIX environment variables and stdin/stdout/stderr are dynamically scoped. Common Lisp seems to call these 'special variables' and has an "earmuff" convention of putting asterisks around their name to set them off.
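For illustration, dynamic scoping can be emulated in Python with a binding stack (a sketch of the mechanism only; the Lisp-style earmuff name `*out*` is an invented example):

```python
from contextlib import contextmanager

_stack = [{"*out*": "stdout"}]  # outermost (global) bindings

def lookup(name):
    # Search the dynamic environment from innermost binding outward.
    for frame in reversed(_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)

@contextmanager
def dynamic(bindings):
    # Push bindings for the dynamic extent of the with-block.
    _stack.append(bindings)
    try:
        yield
    finally:
        _stack.pop()

def report():
    # Like a callee reading stdout or an environment variable: it sees
    # whatever its *callers* have bound, not its lexical scope.
    return lookup("*out*")

print(report())                      # stdout
with dynamic({"*out*": "logfile"}):
    print(report())                  # logfile
print(report())                      # stdout
```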


> I think you're looking for dynamic scopes.

Thanks for writing the comment I now don't have to.


> context

Oh, please no. I've had to play fixer to too many codebases that used interesting tricks to implicitly import code, without pointing to a source. Trying to suss out the origin of a specific function named 'get' in a 100k line application, in a file that invokes, but never declares it, is an exercise in frustration. Less context, please, more explicitness.


I was imagining it in a wish-list, futuristic, next-generation way that is safe and controlled.

Not just "throw implicit/global stuff on the wall". We already know how to do that.


I think it's the default context that makes bash powerful (like the cwd).


Probably not when the same word has 5 meanings depending on context.


Pronouns/anaphora:

if (x.y.z == 3) delete it


If (a == 3) || (b == 4) delete it

Which one?


The propositions that are true. "If a or b, delete it" becomes weird when both are true, because in conventional programming languages b often isn't evaluated at all (short-circuiting), since we don't care about its value. In this case, however, you do need to evaluate it, because you have to know whether "delete it" refers to a or to both a and b.


If you want to delete both if either is true you could write:

  If (a == 3) || (b == 4) delete them


what if you want to delete a on the b condition?


If there's one condition, use `it`. If there's two, use `this` and `that`, i.e.

  if (a == 3) || (b == 4) delete this //to delete a
and

  if (a == 3) || (b == 4) delete that //to delete b


What if there are three?

Also, 'this' should refer to the temporally closer element, so your choices are backwards.


Instead of `it` for one, and `this` and `that` for two, use `este`, `ese` and `aquel` for three (from the 3-tier pronoun system in Spanish).

Please don't ask what happens if there are four. Very few (if any) natural languages have more than three tiers of demonstrative pronouns, so computer languages shouldn't either, if they're borrowing ideas from natural language to be more readable. They would use identifiers or numeric subscripts instead.


In TXR Lisp, I came up with an anaphoric "if" called ifa which is quite different from Paul Graham's aif:

https://www.nongnu.org/txr/txr-manpage.html#N-018F39B0

"If two or more arguments are it-candidates, the situation is ambiguous. The ifa expression is ill-formed and throws an exception at macro-expansion time." [Rule 2]


Narrative.



> Anyway, there seem to be opportunities to make programming languages more expressive by borrowing human language features rather than using ever more complex typographical conventions. What do you think?

I think I'd like to see a single hypothetical example of such an improvement and how it actually improves anything. Things like this usually sound grand, but tend to go nowhere, for obvious reasons.



