
Mistakes in programming language design - SlyShy
http://beza1e1.tuxen.de/articles/proglang_mistakes.html
======
singular
I totally disagree on the parser-unfriendly syntax point (no. 1). C++ is a
particularly bad example in that code cannot be definitively parsed without
semantic information, but I think limiting yourself to a language which
can be expressed as LALR/LL(1) is pretty brain-dead - you limit yourself to
what a given algorithm can express cleanly rather than what is clearest to the
programmer. It's actually pretty hard to make a language "naturally"
LALR anyway, at least without hacks of some kind. C# definitely isn't LALR;
there are many features which get in the way of LALR-ness, such as generics,
which are nonetheless useful.

I do agree on every other point, however!

~~~
joe_the_user
And I completely agree with you here.

The progress in programming language design is going to come from eliminating
_human-unfriendly_ syntax. The closer you can get to something that lets
someone express their intentions without restrictions, the better. The
computer is, uh, the servant of the human.

There are many barriers in the way of programming languages or computers in
general adapting to human functioning. But whenever someone takes the line
that says humans must-adapt-to-computers, it is a fail. And that fail is going
to be swept away by the next technology which _just works_.

~~~
antichaos
Programming languages force programmers to think clearly and express ideas
unambiguously. It's not a bad thing we should avoid; it helps us write robust
software.

~~~
Zak
Both points are valid. Programming languages should be designed for people to
use, and only secondarily for machines to execute. However, they should be
designed so as to encourage clear expression of ideas when practical.

I think whether a general-purpose language should discourage or prevent
unclear thinking is an open question. I don't think I've seen an example of a
language doing so without being somehow crippled, and that's a Bad Thing.

~~~
binaryfinery
" Programming languages should be designed for people to use"

Exactly. Which means they must be tool-friendly, hence LALR/LL(k). Unless
you want to do all your refactoring yourself.

~~~
singular
I don't think LALR/LL(k) implies that automated refactoring is easy, or that
anything else makes it hard!

Obviously non-context-free grammars are a problem, but a GLR or PEG grammar is
still perfectly refactorable.

You forget that refactoring an intuitive grammar to LALR involves totally
mangling that grammar and _making it more difficult for everyone_.

If the language is expressed as a GLR or PEG grammar then tool makers have it
easier, I reckon - the grammar is easier to understand.

The way to really ensure it's easy for everyone is for the compiler writer to
provide some abstract yet intuitive representation of the underlying grammar.

It really isn't an either/or, it's a case of both at once, I think!

~~~
binaryfinery
I think you hit the nail on the head there. "Context-free" is more of a
requirement than LALR or LL. The author clearly isn't a parser writer. I am,
but I've yet to try a PEG grammar because ANTLR does such a great job.

But this doesn't change his point (or mine): languages should be created that
are easy to write parsers for, so that it's easier to write good tools.

And I think this applies to the grammar too: ANTLR's .g files are a good
grammar language because it's easy to write tools for them - hence AntlrWorks.

~~~
beza1e1
I'm the author and I did write some parsers. However, I'm more in the write-
parser-by-hand camp, and context-free is not so hard there. ANTLR needs strange
hacks to parse Python with its whitespace indentation. A handwritten parser
just needs some context within the lexer.

In my eyes a parser generator is the solution if you want to trade efficiency
for cross-language portability. ANTLR is probably the best solution at this
point, though the C backend is quite weird.
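(For readers unfamiliar with the "context within the lexer" idea: here is a minimal sketch, in Rust for concreteness. All names here - `Tok`, `lex_indentation` - are made up for illustration; this is roughly how Python-style significant indentation is handled: the lexer keeps a stack of indent widths and emits synthetic Indent/Dedent tokens, so the parser itself can stay context-free.)

```rust
// Sketch: an indentation-tracking lexer front end. The lexer holds a
// stack of indentation widths and turns changes in indentation into
// explicit Indent/Dedent tokens for the parser.

#[derive(Debug, PartialEq)]
enum Tok {
    Indent,
    Dedent,
    Line(String), // the line's content; real tokenization elided
}

fn lex_indentation(src: &str) -> Vec<Tok> {
    let mut stack = vec![0usize]; // open indentation levels
    let mut toks = Vec::new();
    for line in src.lines() {
        let trimmed = line.trim_start();
        if trimmed.is_empty() {
            continue; // blank lines don't affect indentation
        }
        let width = line.len() - trimmed.len();
        let &cur = stack.last().unwrap();
        if width > cur {
            stack.push(width);
            toks.push(Tok::Indent);
        } else {
            while *stack.last().unwrap() > width {
                stack.pop();
                toks.push(Tok::Dedent);
            }
        }
        toks.push(Tok::Line(trimmed.to_string()));
    }
    while stack.pop().unwrap() > 0 {
        toks.push(Tok::Dedent); // close any blocks still open at EOF
    }
    toks
}

fn main() {
    // "if x:" opens a block; "z" closes it again.
    println!("{:?}", lex_indentation("if x:\n    y\nz\n"));
}
```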

~~~
binaryfinery
I like to get shit working first, and then downcode if the PG does a horrible
job. Nothing better than a hand-written anything, but I like the options of
tools. Strange hacks for Python support your original argument, I believe.

------
RodgerTheGreat
It's possible that I'm missing the thrust of #0, but I don't see how an
object-oriented language completely without the concept of a "null pointer"
could be used to construct elementary data structures like linked-lists and
trees without resorting to special classes as terminators. It would be just as
clumsy as using null pointers in the first place.

If "not-null" references were provided in addition to normal references, the
effect would be pretty similar to the common usage of "final" fields in Java:
compiler-enforced initialization.
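(For comparison, a minimal sketch of how a null-free language handles the terminator question, written in Rust since it has no null references; the `Node` type and `sum` function are invented for illustration. The end of the list is the `None` case of a built-in option type rather than a null pointer or a hand-written terminator class, and the compiler forces every traversal to handle it.)

```rust
// A singly linked list without null: "end of list" is the None variant
// of the standard Option type, not a null pointer or a special class.

struct Node {
    value: i32,
    next: Option<Box<Node>>, // None marks the end of the list
}

fn sum(list: &Option<Box<Node>>) -> i32 {
    match list {
        None => 0, // the empty/terminator case must be handled
        Some(node) => node.value + sum(&node.next),
    }
}

fn main() {
    // Build the list 1 -> 2 -> end.
    let list = Some(Box::new(Node {
        value: 1,
        next: Some(Box::new(Node { value: 2, next: None })),
    }));
    println!("{}", sum(&list)); // prints 3
}
```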

~~~
Chris_Newton
There are alternatives to making references nullable by default.

For example, functional programming languages often support algebraic data
types and pattern matching. Creating the equivalent of a nullable value is one
possible application of these features. Typically, the type system would make
sure you can't get to the underlying value without first checking that your
data is non-null via pattern matching, so the equivalent of dereferencing a
null pointer simply isn't possible; attempting it is a compile-time error.

Note that this idea is independent of the underlying type, so there's no
particular reason you couldn't have a type system that could wrap an OO class
type in an algebraic data type. Unfortunately, for now, these ideas seem to
live in different worlds: I know of no mainstream OO language that has
anything close to the power of algebraic data types and pattern matching that
is commonly available in functional programming languages. But this is one way
you could have an OO-friendly language with powerful data structures and
without relying on the idea of a nullable reference.
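(A concrete sketch of the mechanism, in Rust rather than a functional language, though the idea is the same; the `Customer` type and `greeting` function are invented for illustration. The option type wraps any type, including a class-like struct, and the only way to reach the inner value is through a pattern match.)

```rust
// Option<T> is an algebraic data type with two constructors, Some(T)
// and None. There is no dereference operation on the wrapper itself,
// so "dereferencing null" can't even be written; you must match first.

struct Customer {
    name: String, // an ordinary class-like type being wrapped
}

fn greeting(maybe: &Option<Customer>) -> String {
    match maybe {
        // The inner value is only in scope inside this arm.
        Some(c) => format!("Hello, {}", c.name),
        // The compiler rejects the match if this case is missing.
        None => String::from("Hello, stranger"),
    }
}

fn main() {
    let a = Some(Customer { name: String::from("Ada") });
    let b: Option<Customer> = None;
    println!("{}", greeting(&a)); // Hello, Ada
    println!("{}", greeting(&b)); // Hello, stranger
}
```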

~~~
silentbicycle
Pattern matching seems to mix really poorly with conventional OO subtyping /
inheritance, though multi-methods seem like a good synthesis. (I haven't
really used CLOS enough to decide, though.)

~~~
Chris_Newton
I'm not sure OO fundamentally requires the kind of inheritance relationship
that is familiar from languages like C++, Java and C#, though.

For example, the Liskov Substitution Principle is phrased in terms of a
subtype relationship between two types. This is a logical relationship rather
than a physical one: it is about whether both types present the same interface
in terms of (some aspect of) their state and behaviour, and therefore
generalised code can be written that interacts analogously with instances of
either type.

This sub _type_ relationship need not be realised via a sub _class_
relationship, which is a physical relationship related to the mechanics of
inheritance hierarchies and so on.

I don't see any problem with mixing pattern matching and OO, if you allow a
representation as an algebraic data type as one view of a class and provide a
mechanism to convert between them in some suitable way. Indeed, I'm fairly
sure I've seen papers investigating this approach, though I'm afraid I can't
immediately remember what the authors called it so I could look them up; if
anyone knows what I'm talking about, please post citations if you have them.
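(A rough illustration of subtype-without-subclass, sketched in Rust, which has interfaces but no class inheritance at all; `Shape`, `Circle`, `Square`, and `total_area` are invented names. Two unrelated types satisfy the same interface, and the generalised code is written against the interface alone.)

```rust
// Two types with no inheritance relationship both implement the same
// trait (interface). Generic code depends only on the logical subtype
// relationship ("has an area"), not on any physical subclass relationship.

trait Shape {
    fn area(&self) -> f64;
}

struct Circle { r: f64 }
struct Square { side: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}

impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

// Generalised code in the LSP sense: works with any Shape instance.
fn total_area(shapes: &[&dyn Shape]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let c = Circle { r: 1.0 };
    let s = Square { side: 2.0 };
    let shapes: Vec<&dyn Shape> = vec![&c, &s];
    println!("{}", total_area(&shapes)); // circle's pi plus square's 4
}
```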

~~~
silentbicycle
Late reply to a late reply.

While I think you're probably right, if I'm already representing things with
algebraic data types, I'd rather just skip using OO altogether.

------
jmillikin
> Dynamic languages are fast enough to implement internet services and outgrow
> the demeaning term "scripting language".

This is never going to happen, because "scripting language" is used as a slur
or insult, rather than a term with any usefully defined meaning. It's not
possible for Python to "outgrow" being a scripting language as long as that's
used as a condescending shorthand for "doesn't look like C++".

Consider Ruby. How many applications do you know of which use Ruby for
scripting? Or Python -- I think my system has two applications which can be
scripted in Python (Gimp and Blender), but many dozens of applications
_written in_ Python. If Ruby and Python are scripting languages, then so are
Smalltalk, Haskell, Boo, or dozens of other high-level languages. And yet,
you'll never hear of these being called "scripting languages" because nobody
has an axe to grind against them.

------------

Unrelated to that, Haskell does have pointers (including null pointers) -- you
can use them just like pointers in C/C++.

~~~
j_baker
"This is never going to happen, because "scripting language" is used as a slur
or insult, rather than a term with any usefully defined meaning."

This isn't necessarily true. See Ousterhout's dichotomy:
<http://home.pacbell.net/ouster/scripting.html>

~~~
jmillikin
That's a really wonderful link, and I wish more people used the definition it
advocates. I've one minor complaint, in that it conflates typeless and
dynamically-typed languages, but considering when it was written that's not a
big deal -- strong, dynamically-typed languages probably didn't exist in any
significant way.

Unfortunately, almost every use of "scripting language" I've ever heard is
more along the lines of:

"We can't write our GUI frontend in Python^! Scripting languages are too slow
for rendering advanced graphics."

Usually uttered by somebody who's been writing C++ for 20 years and refuses to
touch anything more recent.

^ Substitute Boo, Haskell, Scala, Clojure, or anything else without enough
curly braces

~~~
Zak
_strong, dynamically-typed languages probably didn't exist in any significant
way_

Need I do the Smug Lisp Weenie thing?

That said, a lot of people didn't make the distinction between dynamically-
typed and untyped twelve years ago. It was common to refer to Scheme as
untyped then, and I think a few people still do.

~~~
jmillikin
I haven't used LISP in years, so my information may be out-of-date, but isn't
it mostly untyped? I don't remember it having any mechanism for defining types
-- the only types were conses and atoms (numbers, strings, etc). There was no
way to say "this string is a Name and this is a City, and they are different
types".

~~~
Zak
Common Lisp has several kinds of user-defined types (types, structs, classes).
It is possible to write non-trivial code without ever using them, and I don't
think they really fit into traditional Lisp style. Classes are fairly popular
in modern Common Lisp code.

------
j_baker
I don't necessarily agree with #2. Yeah, SML has formalized syntax and
semantics, but it's also a frozen language. It simply isn't practical to have
completely formalized semantics for a language like Python or Ruby that is
still actively being changed. Nor is it necessarily always possible to
eliminate all implementation-specific quirks.

In short, it's a good thing for a language to have few quirks, but there is
such a thing as overspecifying.

~~~
silentbicycle
These things don't need to be taken to their logical extremes, though -
handling _most_ of the semantic edge cases is still an improvement, even if it
isn't exhaustive.

Emphasizing well-defined semantics could also be a push toward keeping the
language small and clean.

