
The danger of "language" in programming (2009) - phowat
http://loup-vaillant.fr/articles/language
======
kbenson
I think it's worth bringing Perl into this discussion, as a language designed
by linguists to make it easier to express yourself in ways similar to how you
think. Thinking about this puts the following statement in a new light:

 _Natural languages are exclusively used to talk to people, while programming
languages are also used to talk to machines._

We _also_ use programming languages to communicate with other programmers
(including our future selves). Making clear delineations like the one above
can serve to help define a problem, or it can arbitrarily reduce your options
without cause. I think this is the latter.

Perl has many, many nuances. This is often derided by people who prefer more
rigid languages. I like to think that it helps me see the programmer's intent
and thought process more clearly through the code.

Obviously there are downsides to this, just as there are downsides to overly
rigid languages. Again, differentiating languages early, based on what we
_assume_ they will excel at, may not be beneficial. People who move freely
between different language dichotomies seem to have success in choosing the
right tool for the job, whether that be algorithm, language, or platform.

~~~
rtpg
As someone who has a passing knowledge of perl, what do you mean by nuances?
It always seemed that perl the language is pretty minimal.

~~~
stcredzero
No. Perl has a lot of syntax; there is a lot of language containing a lot of
nuance. Python is an example of a language with moderate syntax, probably
several times smaller than Perl 5's (assuming Perl is about the same size as
Ruby, a fair assumption given that both depend on complex interactions between
lexer, parser, and runtime to fully parse). Smalltalk is several times smaller
than Python, and actual Lisp implementations have syntaxes that are 1/3rd to
1/2 the size of Smalltalk's.

The above is all objective. You can reduce a language to a formal grammar and
count the number of terminals and nonterminals in that grammar.

~~~
jleader
Not all linguistic nuance is syntactic. For example, the semantics of object
models can contain a lot of nuances, as can the lifetimes of variables,
visibility of identifiers, etc. While there are ways of encoding semantics
(denotational, axiomatic, etc.), the size of the encoding will depend a lot on
how closely the language's semantics resemble the semantics of the encoding
you choose.

~~~
stcredzero
Okay, then. Single letters in regular expression statements.

------
dogles
I disagree with this argument. Modern programming is done in groups of people;
you are not only communicating your intent to the machine, you are
communicating your intent to fellow engineers. It is in this case that nuance
is important. For example, C++ references and pointers are exactly the same as
far as the machine cares, but the difference is important for communicating
intent to other engineers.

This isn't to say that C++ lacks significant flaws, but I don't think nuance
and variety of expression is one of them.

~~~
jeorgun
Out of curiosity: what intent would you say is communicated by choosing
references over pointers, or vice versa? I'm relatively new to C++, but I
haven't noticed any real difference beyond personal preference.

~~~
stonemetal
Pointers can be null; references cannot. Therefore, if you use a reference,
you are communicating that nulls are not allowed. You can't new or delete
through a reference, so you are saying something about the lifetime of the
referenced object (that this code absolutely refuses to manage it).

~~~
catnaroek
This is only partially true. A C++ reference can outlive the object it
references. A better argument would be Rust's borrowed pointers, which are
statically checked to never point to an invalid object throughout their entire
lifetimes.

To talk about lifetimes, you need linear logic in the core language. Merely
having destructors is not enough.

~~~
stonemetal
_This is only partially true. A C++ reference can outlive the object it
references._

I didn't mean to imply that it couldn't, just that by using a reference
instead of a pointer you are disclaiming responsibility for lifetime
management of the referenced item, since you cannot delete a reference.

C++ being what it is, one can write ref = *(new object()); delete &ref;, but
the intention you are expressing by using a reference is more hands-off in
nature. It isn't retargetable, there's no built-in ability to new/delete, etc.

~~~
catnaroek
Manually making sure a reference does not outlive the referenced object (or,
at least, that a reference is not used after the object has been destroyed)
looks pretty much like "lifetime management" to me. What C++ affords you is
merely ownership management.

------
candybar
Another thing that's easy to forget is that human languages often evolve to be
difficult to learn, with this difficulty later being used as a social weapon
against outsiders and a marker of status. This evolution is at times
deliberate, at other times accidental, but all languages go through this
phase; where no such difficulty exists, insiders invent jargon and slang to
create the barrier.

Though now I think of it, programming languages aren't immune from this
phenomenon either!

------
zimbatm
Another danger is focusing on the syntax, or maybe the standard library, when
talking about a "programming language". Programming is done in an environment.
Things like editor support, debugging capabilities, compatibility with
operating systems, speed of execution... they all count towards the usefulness
of a programming environment, and are often overlooked when comparing fizzbuzz
implementations.

------
ianstormtaylor
_> There are two ways to deal with bugs. Correcting them, and avoiding them._

Does anyone on HN know of other places to read more about this idea? I
strongly believe it, and think it's incredibly important to being able to
argue for refactoring and code quality. I would love to read more about it if
there are good articles out there.

~~~
Mithaldu
I can mostly answer this from the perspective of a Perl developer, but maybe
this'll show you some ways to go on reading. The things I know as most
important are:

    
    
Testing
    

Put shortly: ever written software? Ever tested a feature manually by running
the executable or clicking around in the web browser? Now imagine you never
had to do a manual test more than once, because you put every manual test you
did into a little program that runs it for you and knows how to output and
summarize the results. Suddenly you can write more code, because you spend
less time testing your software, and you test it more, because you simply need
to run "make test" to run ALL the little programs that test your features.
See: [https://metacpan.org/pod/Test::More](https://metacpan.org/pod/Test::More)

    
    
Static Analysis
    

You think you write clean code? You think you don't use anti-patterns? You
sure? I know I'm not sure. Now how about running a program that will take
apart your source code and run it through hundreds of little rule checks,
contributed by hundreds of people, which will cruelly and mercilessly point
out where YOU FUCKED UP. See:
[https://metacpan.org/release/Perl-Critic](https://metacpan.org/release/Perl-Critic)

------
mjp94
The comparison made between English and C++ (spoken/written language and
programming language) makes me wonder if people should even be comparing the
two at all. They do serve similar purposes in a sense, but what I'm wondering
is whether we would even compare them if they weren't both called "languages".
If the words used for programming languages and written/spoken languages were
completely disjoint, would this point even be brought up?

~~~
agentultra
They share the same node, _language_ , so I don't see how they could avoid
comparison. In fact, _programming language_ could be argued to be a
specialization of spoken language. After all, we're not punching in bytes by
hand on control cards anymore.

~~~
hercynium
Indeed, the blurring of this distinction became the means by which Larry Wall
earned his college degree!

[https://en.wikipedia.org/wiki/Larry_Wall#Education](https://en.wikipedia.org/wiki/Larry_Wall#Education)

He talks about it in more depth in this interview:
[http://youtu.be/aNAtbYSxzuA](http://youtu.be/aNAtbYSxzuA)

------
hercynium
_There are two ways to deal with bugs. Correcting them, and avoiding them._
... _Of these two, the only one that is significantly influenced by the
structure of programming languages is avoidance._

This is an interesting statement to me, and one that I think may be telling
about the author's experience as a programmer. This is not my way of saying he
is a poor or inexperienced programmer! I skimmed through a few other articles
on the site, and what stood out to me is that he is heavily "academic." Some
of the best programmers I have worked with had deep academic experience and
were quite adept at writing fast, concise, correct code in many different
languages.

What stands out to me is the absence of something that (as is my personal
observation) most experienced programmers in non-academic settings begin to
realize at some point... While _avoiding_ bugs (by, for example, choosing a
language with strict type-checking, or immutable data/vars) is quite
important, writing code _that can be easily corrected_ is arguably just as
important!

In regards to just the use of a programming language, this comes down to
choosing a careful balance of constructs - syntax, features, naming, etc.
Coding standards for a project or organization are not just there to stoke
somebody's ego or cater to the "lowest common denominator". Well-chosen
standards and conventions are there to simultaneously avoid bugs _and_ make
code easier to debug and maintain - which is what one must do to make
corrections!

 _It is therefore crucial to insist on it when choosing (or designing) one._

This sentence is what really drove me to comment: No programming language can
prevent all bugs. Not even _most_ bugs.

In practice, I have even found that constructs and limitations that are
intended to prevent bugs of one type can lead to bugs of some other type. In
the worst cases, these limitations can lead to programmers making code that is
much more verbose and complicated, which, of course... leads to more bugs that
are _harder_ to correct.

IMO, one should choose a language based on many, many more criteria than
"avoidance of bugs." Personally, one of _my_ top criteria is to choose a
language with which those who will write and maintain the software (now _and
in the future_ ) are going to be most productive.

~~~
catnaroek
> writing code _that can be easily corrected_ is arguably just as important!

The easiest errors to correct are the ones that are detected as early as
possible.

> No programming language can prevent all bugs. Not even _most_ bugs.

Sure. No programming language can prevent you from misunderstanding a
specification. But some languages can prevent you from missing corner cases in
your case analysis, or from redundantly writing overspecific code that works
in essentially the same way for a wide range of data types.

> In practice, I have even found that constructs and limitations that are
> intended to prevent bugs of one type can lead to bugs of some other type.

I have no idea what you are going on about. What kind of bugs do features like
algebraic data types (Haskell, ML), smart pointers (Rust, to a lesser degree
C++), effect segregation (Haskell) and module systems (ML, Rust) _lead_ to? I
can only see the bugs they prevent.

Normally, the kind of feature that "leads to bugs of some other type" does not
try to prevent bugs in the first place, just mitigate their consequences. For
example, bounds-checked array indexing does not try to prevent programmers
from using wrong array indices; it just turns what would be a segfault into an
IndexOutOfRangeException.

> Personally, one of _my_ top criteria is to choose a language with which
> those who will write and maintain the software (now _and in the future_ )
> are going to be most productive.

I see writing buggy code as negative productivity. So a language that gives
you the illusion that you are writing correct code, when you in fact are not,
actually makes you less productive.

~~~
Chris_Newton
Probably my favourite open-ended interview topic for programmers is asking
them to rank various properties code might have in order of importance, and
explain why they chose the order they did.

For example, one possible list of properties might be conciseness,
correctness, documentation, efficiency, maintainability, portability,
readability, and testability.

Often, I can learn a great deal about what sort of person I’m talking to just
by watching them define their terms, decide what assumptions they think are
necessary, and then reason through the resulting dependencies.

I get the feeling that the parent posters (hercynium and catnaroek) might
argue for quite different orders, but both with good reasons.

~~~
galaxyLogic
Over the years I've come to the conviction that the two most important
properties of programs are simply 1. correctness and 2. maintainability.

A program that does not do what it's supposed to is of little value. This is a
relative metric, however: a program can do many valuable things right, yet
have a few bugs.

But once a program does what we want, what else do we want from it? We want
the ability to change it easily, so it can do even more things for us.

Maintainability is also a relative metric, and even harder to quantify than
"correctness". However when looking at two ways of writing a specific part of
a program, it is often easy to say which produces a more maintainable
solution.

~~~
Chris_Newton
Many good candidates seem to anchor on those two properties (correctness and
maintainability) as a starting point. More generally, they tend to identify
that some of the properties are directly desirable _right now_ , while others
have no immediate value in themselves but are necessary to ensure that you can
still have the directly desirable properties _later_. Which ones take priority
under which conditions can be an enlightening conversation, often leading to
related ideas like technical debt, up-front vs. evolutionary design, and so
on.

------
sengstrom
I got a syntax error somewhere on this line: "Not because it's wrong (although
it is), but because it's right."

This is interesting however - how about substantiating it with a concrete
example? "C++ does have many nuances. It is a very interesting and very subtle
language, to the point even machines (namely compilers) disagree about its
meaning."

~~~
troels
Good thing you're not a computer, then.

------
toolslive
Isn't the original statement about C++ a variation on the Sapir-Whorf
hypothesis?
[http://en.wikipedia.org/wiki/Linguistic_relativity](http://en.wikipedia.org/wiki/Linguistic_relativity)

~~~
badman_ting
Sapir-Whorf is a claim about the effect a language has on the cognition of
those who speak it. i.e., the claim is about the cognition and not the
language. The C++ statement was just a silly reason (IMO) for why C++ is a
better tool than other languages for expressing to a computer what a person is
thinking.

------
vincvinc
Let me add a short side-note from the 'Natural Language' side of the table.

TLDR: Comparing programming languages and human languages is a dangerous
thing, not just because people differ from machines when being told to do
something, but even more because the daily use of natural language by humans
is so fundamentally based on our biological and cognitive context that, if you
really think about it, the parallels in functionality between these two types
of 'language' are interesting to consider but greatly limited.

Human language works something like this: agent A wants or feels something. If
this even slightly involves another agent (B), there's a big chance that A
will choose to communicate something to B.

Before deciding what to say, the following (among other things) is considered:

- A knows that B shares a tremendous amount of similar information with her

- About most of this info, A knows that B knows that A knows this, therefore:

- A can expect B to infer anything that A would like B to infer from what she
says.

Results:

- The code (language) used does not itself contain even 10% of the information
necessary to 'understand' the situation and what A's motives are for speaking.
It merely contains lots of very multifaceted and nuanced pointers, about which
any two agents would disagree on what has priority. [1]

- A huge part of day-to-day communication is extra-linguistic.

Take this example:

You and I walk to the campus library together. I notice a certain bike and
point your attention to it. Since you know that I know that it's your
girlfriend's, you take it as me saying "hey! your GF's there too!". However,
you two might've broken up. If I know this, I might point at it to say "maybe
we'd better relocate, mate." However, this totally depends on you knowing that
I know (so that you know that I mean this and not the opposite), and on me
knowing that you know that I know (so I can assume that you will infer what I
actually meant out of the possible meanings).[2]

And this is just pointing. Imagine when we start using language to talk about
the bike, or about other people and what they told us. Imagine all the
management of meta-knowledge required.

Basically, we all do this on a daily basis. We have been trained from birth to
make these kinds of considerations subconsciously in order to effectively
communicate with others. And not just about what the other person knows. Also
about what he expects, what kind of words he uses, what he is looking at, etc.

For more, read something like Tomasello's "Origins of Human Communication" to
get started. The more you know about this, the more you notice it around you
(and start using it to your advantage). Fun stuff!

[1] I believe this is one of the reasons Google Translate will keep sucking.

[2] above example also courtesy of Tomasello, but retold from memory, so
forgive me if I've gotten too creative!

------
mkramlich
There's also often no single language that a modern software system is
expressed in. Even when there's ostensibly only one language used (e.g.
Python), the total system behavior of a website is also typically expressed in
SQL, sh, provisioning APIs, web server config, database config, and possibly a
variety of other DSLs.

