
The Safyness of Static Typing - mpweiher
http://blog.metaobject.com/2014/06/the-safyness-of-static-typing.html
======
dylukes
This article misses the point of what a type system brings to the table. The
key benefit isn't catching errors.

The benefits are largely discoverability, enabling better tooling, improving
the number of optimizations compilers can make, and guiding the programmer.
These are all things we benefit pretty much unilaterally from. The trade off
is having to be slightly more explicit (Java and C# haven't helped this, as
they insist you sign everything in triplicate, rather than promote type
inference).

For functions with sufficiently rich types, there are often only a couple ways
to implement the function. Sometimes, there are few enough that a compiler can
actually derive it for you.
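
A minimal sketch of that point, using Rust purely for illustration (the function names are made up here). With no trait bounds on the type parameters, the body cannot inspect, copy, or invent values, so the signature leaves essentially one implementation that returns normally:

```rust
// `A` and `B` have no bounds, so the only way to produce a `(B, A)` is to
// take apart the pair that was passed in and put it back the other way.
// (Panicking, looping forever, or side effects like printing are ignored.)
fn swap<A, B>(pair: (A, B)) -> (B, A) {
    let (a, b) = pair;
    (b, a)
}

// Same idea: a function of this type that returns normally can only
// hand back its argument.
fn identity<T>(x: T) -> T {
    x
}
```

Calling `swap((1, "a"))` gives `("a", 1)`; there is no other value of the right type for it to return.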

Having switched between typed and untyped languages repeatedly, I can't
emphasize enough just how much rich, strong types contribute to the
readability of the code.

And on yet another front, it may only be "2%", but I'm sure most people on
here know how it feels to have written a couple hundred lines of code, only to
suddenly find something of the wrong type somewhere it shouldn't be... It only
takes a fraction of a percentile for a program to be utterly and completely
useless.

~~~
dylukes
To expand a little bit:

I have found at least anecdotally that the benefits of a strong, rich type
system are multiplicative with other features, not additive. For example,
types in Java are largely a nuisance. The type system lacks facilities to
express obvious things (I want a list of things that all have this interface)
in clean ways. Instead you have to resort to "clever" hacks which ultimately
just circumvent the guarantees you wanted to establish. For small projects
this is a non-issue. For large projects, this is hell.

By contrast, type systems thrive in contexts with algebraic or sealed case
types, or any form of pattern matching really. Or just plain old enums. In
conjunction, these features enable very powerful static checks. Forcing you to
handle None/Null/NONE/Nothing/nil cases everywhere encourages critical
thinking. This introduces more issues, such as staircase code, but these are
largely (I would personally say completely, and with a nice surplus) fixed by
things like pipes, monads, computation expressions, and so forth.

This extends to libraries as well. I can't count the number of times I've used
libraries which changed their APIs in "non-breaking" ways, such that certain
functions returned types I assumed would never be returned. Was this my fault?
Yes. But if the return type had been strongly typed and I had been working in
such a language as described above then:

1) I would be forced to handle every extant constructor in that type.

2) If a new one was added, I'd get a compiler warning alerting me to the fact
that I hadn't handled it (this is great!).

3) If the type changed entirely, I'd get an error warning me to the fact that
my code was no longer compatible with the API provided.
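
The three points above can be sketched roughly as follows, in Rust for illustration. `FetchResult` and its variants are hypothetical stand-ins for a library's return type, not any real API. Note that rustc reports a missing case as a hard error rather than a warning:

```rust
// Hypothetical library return type; the variants are invented for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum FetchResult {
    Found(String),
    NotFound,
    Redirect(String),
}

pub fn describe(result: &FetchResult) -> String {
    // No wildcard (`_`) arm on purpose: if a variant is added to the enum,
    // this match is no longer exhaustive and the code stops compiling until
    // the new case is handled.
    match result {
        FetchResult::Found(body) => format!("found {} bytes", body.len()),
        FetchResult::NotFound => String::from("not found"),
        FetchResult::Redirect(url) => format!("redirect to {}", url),
    }
}
```

If the return type changed entirely (say, to a plain `String`), every `match` on it would fail to type-check, which is case 3.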

In the end, working with and writing libraries is about respecting contracts.
Types are a tool for codifying those contracts. Strong typing and matching
facilities are even more powerful tools for alerting you to violations.

~~~
dllthomas
I love refactoring in a good, statically typed language. Change the types in a
carefully breaking way, and you're immediately told every single place you
need to change. Seven times out of ten I can manage to squeeze this out of my
C, even before static analysis tools, though I certainly wind up leaning more
heavily on my tests than in some other languages.

------
pcwalton
Depending on your type system, Heartbleed can be a type error. As can at least
(a) buffer overflows; (b) buffer size calculation; (c) format string misuse;
(d) SQL injection. There are many real programming languages used in practice
in which all of these are type errors. There are also more experimental and/or
research-oriented type systems that protect against other vulnerabilities; for
example, reliance on untrusted input in a security decision, or integer
overflows.

Despite that, I do agree with the notion that type systems are a tradeoff, and
some projects benefit more from strict typing than others. Dynamically typed
languages certainly have their place.

~~~
mindslight
Type system features are indeed tradeoffs, but whether type enforcement is
done at compile time or run time need not be. It seems like the dynamic camp
recognizes this best (eg Common Lisp, Dylan, Typed Racket), while beginning
with a static foundation would be more suited to providing the raw efficiency
that it enables.

In a sense this is already done with eg C+lua or C+python. But having them in
the same language, with the exact same type system/object model, with the same
syntax would be something else (hint hint ;)

------
jbapple
> In fact, Milner has supposedly made the claim that "well typed programs
> cannot go wrong". Hmmm...

That quote from Milner is from his 1978 paper "A Theory of Type Polymorphism
in Programming". The actual quote is about a technical property of some type
systems called "preservation". In his paper, "wrong" is a value without a type
that no program evaluates to. Milner's statement that well-typed programs
don't go wrong is a technical statement, not the title of an editorial piece.

Edit: here's a link to the paper:

[http://www.research.ed.ac.uk/portal/files/15143545/1_s2.0_00...](http://www.research.ed.ac.uk/portal/files/15143545/1_s2.0_0022000078900144_main.pdf)

------
forrestthewoods
I consider regular expressions to have the rare and elusive "write-only" flag
set. Those bastards are almost impossible to decipher after the fact.

I increasingly consider dynamic languages to fall into the same "write-only"
camp. One week you get work done fast and efficiently. A few weeks later a new
edge case is encountered and it doesn't work. Figuring out exactly why can be
more than a little frustrating.

Dealing with code other people wrote can be even worse. Yes the language is
dynamic. No the expectations are not. If you call a function with some object
and it's not the right kind then it's either not going to work or it won't
work like you expect. Jumping into the middle of a system and trying to figure
out what the requirements and expectations are for every object in every
function is a colossal waste of time. Yes it could be documented, and that
takes even more time! Static typing makes it remarkably clear and
straightforward.

~~~
nine_k
The OP says that much of the static typing benefits are in documenting code.

Having code properly documented, in a machine-verifiable form, is _no_ small
benefit. It's what maintainability is based on.

~~~
dllthomas
It's worth noting that one can hack together some machine-checkable
documentation orthogonal to typing, too. As a limited example, I do this with
inline TODOs (in a greppable format) while I'm working my way through a
problem.

~~~
carussell
Machine-verifiable TODO items sounds interesting. If this comment was bait, it
worked. Where're some examples I can look at?

~~~
dllthomas
I don't have much relevant posted anywhere. The gist is "git grep TODO", but I
do a little extra work to make it extra convenient with vim's quickfix buffer:

    
    
        search:
            @git grep -n "$$PATTERN"
    
        todo:
            @PATTERN=TO''DO make search \
                | sed -n 's/\([^ \t]*\)\(.*TO''DO \(([0-9]*)\)\):\?\(.*\)/\1 \3 \2\4/p' \
                | sort -k 2
    

Running ":make todo" in vim then pulls everything of the form "TODO (1):" out,
sorts according to priority (the value in parentheses), and dumps them into my
quickfix buffer (including the locations), which I can easily step through.

I also surface changes in the todo list as comments in my git commit buffer,
for reference.

------
nbouscal
Trying to discuss the merits of static type systems by using a Java-level type
system as your example case is disingenuous and silly. None of the static
typing advocates I know (myself included) advocate for type systems that weak.
If you want to make an honest comparison, you need to compare to a strong type
system like those in the ML family. Those type systems can solve many of the
25 bugs referenced.

~~~
nine_k
Even stronger type systems, e.g. based on dependent types, can even solve the
buffer overflow problem at type level.

Alas, no widespread language uses such a system yet.

Various problems revolving around combining tainted / untainted strings, like
SQL injection, must definitely be solvable by the type systems of ML, Haskell,
or Scala.
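
A rough sketch of the tainted/untainted idea, written in Rust only for illustration. The type and method names are invented, and the quote-doubling escape is a deliberately naive assumption; real code should use the database driver's parameterized queries:

```rust
/// Text that came from outside the program and has not been escaped.
pub struct Untrusted(String);

/// A SQL fragment considered safe to send. The field is private, so when
/// these types live in their own module the only way to build one is
/// through the constructors below.
pub struct SqlFragment(String);

impl Untrusted {
    pub fn new(raw: &str) -> Self {
        Untrusted(raw.to_string())
    }
}

impl SqlFragment {
    /// Only `'static` strings (in practice, literals in the source) are
    /// accepted as trusted SQL text.
    pub fn literal(sql: &'static str) -> Self {
        SqlFragment(sql.to_string())
    }

    /// Append untrusted text as a quoted string, doubling single quotes.
    /// Illustrative escaping only.
    pub fn quoted(mut self, value: &Untrusted) -> Self {
        self.0.push('\'');
        self.0.push_str(&value.0.replace('\'', "''"));
        self.0.push('\'');
        self
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

There is no function that turns an `Untrusted` directly into a `SqlFragment`, so splicing raw input into a query is a type error rather than a runtime surprise.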

------
lelf
> _for example, most of the 25 top software errors don't look like type
> errors to me, and neither goto fail; nor Heartbleed look like type errors
> either,_

Most of them do to me. (But I'm coming from Haskell, so it's not only the
static typing, it's the very strong one).

Anyway, [http://ro-che.info/ccc/17.html](http://ro-che.info/ccc/17.html)

~~~
bunderbunder
I'm seeing where a lot of them could be prevented if you set up your types
very scrupulously, but I don't think most of them are inherently type errors.

That said, I would say that the OP's assertion is itself a type error. The
link is a list of "most dangerous" errors, where "most dangerous" is
specifically defined as bugs that create security vulnerabilities. This is
being used in support of a statement about what the most common bugs are.
"Security vulnerabilities" and "software defects" are two different (if
related) things, so that's a type error in the argument. And "dangerous" and
"common" are two different characteristics, so that's a second type error in
the argument.

~~~
dllthomas
Aren't "dangerous" and "common" different values of the same type?

~~~
bunderbunder
They might be different subtypes of a supertype called "stuff you should worry
about". But not every common error leads to an arbitrary code execution
exploit (for example), and not every arbitrary code execution
exploit is a result of a common error.

------
yawaramin
> ... most of the 25 top software errors don't look like type errors to me....

First off, most of the 25 top software errors aren't actually _programming_
errors. There's nothing any programming language, whether static or dynamic,
can do to help you with them. So that's really a red herring.

Secondly, look again. At least two of the top 25 _are_ actually type errors:

CWE-134 Uncontrolled format string: this goes away if you constrain the input
to be a discriminated union type instead of a plain string.

CWE-190 Integer overflow or wraparound: this goes away if you use an integer
type which _can't_ wrap around. In fact, they almost come out and suggest
exactly this in the details:

> Use a language that does not allow this weakness to occur or provides
> constructs that make this weakness easier to avoid.
>
> If possible, choose a language or compiler that performs automatic bounds
> checking.
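
As a small illustration of the CWE-190 case, here is a sketch in Rust (the function and parameter names are hypothetical). The standard `checked_mul` and `checked_add` methods return `None` instead of wrapping, so the overflow case shows up in the return type:

```rust
/// Size in bytes of `count` records of `record_size` bytes plus a header,
/// or `None` if the computation would overflow `u32`.
pub fn buffer_size(count: u32, record_size: u32, header: u32) -> Option<u32> {
    // `?` returns `None` early if the multiplication overflows.
    count.checked_mul(record_size)?.checked_add(header)
}
```

A caller cannot use the result as a plain number without first deciding what to do when it is `None`.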

------
bfung
Here is the actual statement about 2%:

[http://schd.ws/hosted_files/buildstuff2013/ce/The%20unreason...](http://schd.ws/hosted_files/buildstuff2013/ce/The%20unreasonable%20effectiveness%20of%20dynamic%20typing%20for%20practical%20programs.pdf)

"2% of reported Github issues for Javascript, Clojure, Python, Ruby, are type
errors"

The study referenced in that presentation,
[http://www.inf.fu-berlin.de/inst/ag-se/teaching/V-EMPIR-2014...](http://www.inf.fu-berlin.de/inst/ag-se/teaching/V-EMPIR-2014/doc/jccpprtTR.pdf),
studies 80 implementations of the same program by 74 people. The reported
times in the study do not exceed 63 hours.

The study referenced in the blog post,
[http://courses.cs.washington.edu/courses/cse590n/10au/hanenb...](http://courses.cs.washington.edu/courses/cse590n/10au/hanenberg-oopsla2010.pdf),
is a study with "49 subjects that studies the impact of a static type
system for the development of a parser over 27 hours working time."

The only conclusion that can be drawn from these datapoints, if 2 is enough,
is that programs written in no longer than 3 days time can be implemented
faster in dynamic languages than statically typed languages. There's nothing
that calls out maintainability of a program over years of time, nor the number
of pre-written libraries used in those programs recorded.

------
josephlord
But when the type system can ensure that you handle nil values safely and can
check that every case is covered in a switch statement I'm sure that covers
more than 2% of bugs.

~~~
mpweiher
Why are you sure?

~~~
NotAtWork
Not properly guarding for None has been over 2% of my bugs in working with
Python.

I'd guess it's more like 5-10%.

There's probably another 5-10% that are problems dealing with trying to
uniformly iterate over different kinds of structures, but I'm willing to admit
I might just not know the correct way to do it (ie, map or fold or something).

~~~
actsasbuffoon
The interesting thing is that the majority of statically-typed languages
wouldn't catch the null for you. C, C++, Java, C#, and many others accept null
as a valid input for any type.

There are plenty of languages like Haskell, Haxe, Idris, Agda, Roy, Rust, and
Elm that get this right. Anecdotally it seems that most developers who
advocate static typing aren't using these languages. It seems like the
majority are writing in one of the Algol derived languages from the first
paragraph.

Scala gets it half right with Option, but included null for Java compat. I've
not written much Scala, but I've already encountered numerous bugs that
resulted from some library returning an unexpected null, and the type system
decided that was totally fine.
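
For contrast, a minimal sketch of what "getting it right" looks like, in Rust for illustration (the function names and the map of users are made up). Absence is part of the return type, and the value cannot be used until both cases are handled:

```rust
use std::collections::HashMap;

/// Look up a user's email. A missing user is `None`, not a null reference.
pub fn email_for<'a>(users: &'a HashMap<String, String>, name: &str) -> Option<&'a str> {
    users.get(name).map(|s| s.as_str())
}

pub fn greeting(users: &HashMap<String, String>, name: &str) -> String {
    // Treating the `Option<&str>` as if it were a `&str` is a compile error;
    // the match forces the missing case to be written down.
    match email_for(users, name) {
        Some(email) => format!("mail {} at {}", name, email),
        None => format!("no address on file for {}", name),
    }
}
```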

~~~
josephlord
Yes C/C++ are pretty weakly typed and Null pointers are an issue in them all.
They still catch some issues though and I find it does help me sometimes
compared with Ruby for example. If they only caught 2% of the issues I would
be a little surprised but it is certainly in the range of possibilities
depending on the stage of development (I think it would catch more as you type
your code but possibly 2% or less when evaluating a finished, debugged tested
product).

You can also add Swift to the list of languages that get it right in this
regard. When receiving a return value from obj-c that can be null it comes
back as an optional so you SHOULD safely transform it to a real value or only
perform conditional actions on it. Of course there is the possibility to
coerce it into the full type.

------
austinz
The 2% figure doesn't take into account opportunity cost. Maybe a project
written in a dynamic language has a well-engineered, comprehensive unit test
suite that is large enough to cover for most type errors. Maybe that same
project in a statically-typed language might not require so many unit tests -
or maybe it might. Maybe developers working on the dynamic language version of
the project spend more time reading through the code to understand how
unfamiliar modules work - or maybe they don't. This sort of stuff is far more
useful to know - and far more difficult to measure.

The 2% figure states little more than the fact that a project being developed
by competent engineers can be debugged and tested to the point where the
project becomes quite reliable and errors become rare, regardless of whether
or not it is implemented using dynamic or static typing. I don't think any
reasonable static typing advocate would try to argue that static typing is the
_only_ way to catch type errors.

The parser study is even less informative. It covers a project of very small
scope using both a custom language (with a type system of very little
expressive power) and a custom development environment, and does not take into
account long-term maintainability or extensibility.

If you want evidence, keep on looking.

------
chubot
Wow, this is fantastic. Now I have a word ("safyness") for what I and others
have been saying for so long: that static typing prevents the bugs that are
easiest to catch. If it were free, I would take it, but it often does so at a
large cost to productivity, and even understanding of the dynamic/runtime
behavior of the system.

That second problem I summarize using the phrase the "map is not the
territory".
([http://en.wikipedia.org/wiki/Map%E2%80%93territory_relation](http://en.wikipedia.org/wiki/Map%E2%80%93territory_relation))
I had heard this phrase for many years, outside the context of programming. I
never quite understood what it meant until I spent time around hardcore C++
and Haskell people. Static typing is a model (a map) for runtime behavior.

~~~
stcredzero
_what I and others have been saying for so long: that static typing prevents
the bugs that are easiest to catch_

Not only are such bugs the easiest to catch, but they include bugs that can be
insidious, expensive, and prohibit certain refactorings if they are _not_
caught.

This isn't to say that static typing is the only way. It's more accurate to
say: If you aren't catching those bugs automatically, you are doing it wrong.

~~~
dllthomas
Right. And it's not just _that_ the typing is static, but that you are _using_
it properly. Cf:

    
    
        struct price { int value; };
        struct quantity { int value; };
    
        int placeOrderUnsafe(int price, int quantity);
        int placeOrderSafer(struct price, struct quantity);
    
        ...
    
        placeOrderUnsafe(quantity.value, price.value); /* compiles, but does the wrong thing */
        placeOrderSafer(quantity, price); /* caught at compile */

------
verroq
>Because static typing is only worth the 2%

>If you felt it helped it's because it was the placebo effect

>Remember the time you mistyped something and the program crashed at runtime
when it should have been a compile error. That was the 2%.

>There's not one time you passed in the wrong object and duck typing smoothed
over the error, until you find it at runtime. That was the 2%.

>Remember the time the compiler made your program faster because it could
deduce additional information from the types? That was the 2%.

>Remember the time your IDE helped you write programs, detect potential errors
because it could deduce more information from the types? That was the 2%.

>if you advocate static typing, then you are comparable to a religious zealot.

What a load of garbage.

~~~
lisper
You left out the most important part: there is evidence -- in the form of at
least one peer-reviewed study -- to support all these claims.

~~~
dllthomas
Comparing the description of the paper to the abstract, TFA seems to
substantially overstate things.

From TFA: _" [T]here was a study [...] which found the following to be true in
experiments: not only were development times significantly shorter on average
with dynamically typed languages, so were debug times."_

From the abstract: _" This paper presents an empirical study with 49 subjects
that studies the impact of a static type system for the development of a
parser over 27 hours working time. In the experiments the existence of the
static type system has neither a positive nor a negative impact on an
application's development time (under the conditions of the experiment)."_

~~~
dllthomas
As I mentioned elsewhere, the abstract seems to disagree with the paper. I'm
not sure what conclusions to draw from that fact...

------
barrkel
I rotate between Java, Ruby (and CoffeeScript) in my current job.

Java is less productive in terms of having much less expressive style and
abysmal library support for higher level abstractions.

On the other hand, in Ruby I spend hours of my day fighting various DSLs -
rspec, Factory Girl, ActiveRecord etc. getting my code free of things that
could be trivially eliminated with static typing, so long as it was expressive
enough to cope with the abstractions I'm working with.

And that's the catch. The abstractions I'm working with in Ruby are simply not
sanely expressible in Java. I'd need a much, much better type system to
program at the same level of power.

Overall, I've found the productivity of dynamically typed languages highly
unconvincing when compared to typed DSLs - not embedded DSLs, actual DSLs with
parsers and type checkers and semantic errors at the same level of abstraction
as the domain. But maintaining a DSL is not something many people can afford
to do, so we muddle on, bouncing between horrifically verbose Java and Ruby
that needs ridiculous levels of testing to stop it falling apart into a pile
of mud - especially if you ever want to even think about refactoring it some
day.

~~~
Smudge
I find that the two languages are actually not as dissimilar as you may be led
to believe. You _can_ write your Ruby a bit more like Java (use more plain old
class objects for most of your business logic, and wrap usage of the heftier
DSLs/gems inside of simpler interfaces), and support it with the same kind of
tests (which are now easier to write & maintain because your code's concerns
are more isolated). Sure, you miss out on the type checking, but I'm not sure
that is as painful as it sounds when you remove many of the other factors it
can be easy to fall into with Ruby.

The key here is that, while Ruby lends itself to building DSLs layered within
DSLs (I'm looking at you, Rails), you don't _have_ to use it that way. And,
I'd argue, when you reach a point where your Ruby looks a bit more Java- or
C#-esque, the more "truthy"/"safy" aspects of static typing end up playing a
much smaller role than it seems from the outset.

------
garthk
I miss the automated refactoring I enjoyed writing a hundred thousand lines of
C#, but it's all feels. I'll never miss the time I spent fighting the type
system when it got in my way. Covariance and contravariance in generics,
doubly so. That's probably feels, too.

We're not talking productivity, though, but safety. Bugs that static typing
would have caught are rare enough that I call them out as I make them in
pairing sessions to throw a bone to the Java fans on the team. Dynamic typing
is simply not causing a massive uptick in bugs.

Our maintenance problems in JavaScript come mainly from trouble following code
using functional composition and callbacks. I'm not sure there's a type for
"this method had better either call back or call an asynchronous method,
and the same goes for the callback provided to that method, ad infinitum", but
I'd find that handy.

------
stcredzero
_I recently had a Professor of Computer Science state unequivocally that
anyone who doesn't use static typing should have their degree revoked._

From what I have seen, static typing has one important quality: It is somewhat
more clueless-management proof! It's entirely possible to have perfectly
cromulent development in a dynamically typed environment. Unfortunately, over
a 10 year lifespan, it's also likely that during some span of time, the
project will be mismanaged and someone will do something stupid vis-a-vis
putting an object of the wrong type in an instance var, temporary, or
collection, thereby causing problems sometime down the line.

Does this mean that dynamic typing is no good, or should only be used for
prototyping? I think not. I think it's more indicative of the poor quality of
management of programming teams in the general population.

------
jestar_jokin
The greatest advantage I've found for static typing is IDE support. I find
it's a lot easier to change things in something like Java using Eclipse or
IntelliJ IDEA, vs Python using PyCharm, just because the IDE can have deep
knowledge of what the code is doing, and all the places that need to
change. Of course, if the language has restricted expressiveness, you might
introduce other components to make up for it (e.g. XML, templating DSLs like
FreeMarker, SQL); then you've basically introduced dynamic functionality and
can end up with runtime errors anyway.

------
ScottBurson
Types are invariants, but only some invariants can be expressed as types. The
most complex invariants -- which are the most difficult to keep working as the
code evolves -- generally can't be expressed as types. This is why, to me,
static typing as a religion is not attractive.

On the other hand, type systems are improving. On the other other hand, no
computably checkable type system will ever let us express all our invariants.
So this question will never be entirely settled.

~~~
theseoafs
> On the other other hand, no computably checkable type system will ever let
> us express all our invariants. So this question will never be entirely
> settled.

I don't understand. Unless we can produce a type system which literally
prevents _all_ errors, type systems are useless?

~~~
ScottBurson
Where did I say they were useless?

When I say that static typing _as a religion_ does not appeal to me, I mean
that I think dynamically-typed languages are reasonable choices for some kinds
of programs, and that I do not agree with the sentiment quoted in the article
that their use should be considered grounds for revocation of one's degree.

That doesn't mean I don't see the point of static typing as well.

------
maxk42
I find that in strongly-statically-typed languages, I spend more time juggling
the types of data than I do solving the problem at hand.

~~~
stefantalpalaru
You're supposed to solve most of the problem in the type design phase.

~~~
dylukes
This. If you're writing code before having at least thought out a preliminary
solution, you're doing it wrong.

Types let you codify your idea, and then make sure your implementation aligns
with it as you write it.

------
stefantalpalaru
It's not just static typing. It's strong static typing like you see in
Haskell, Idris, Scala, F#, etc.

------
gw
I too have noticed that static typing enthusiasts use dogmatic rhetoric more
often than their opponents. One reason may be that the benefits of static
typing are more obvious (catching type errors, documenting code). The largest
benefit of dynamic typing, I believe, is that it encourages you to solve your
problems with generic data structures like lists and maps, rather than
inventing new types for everything in your program. This, I think, is what
Alan Perlis was talking about in his foreword for SICP [1] when he compared
Pascal to Lisp, noting that in the former "the plethora of declarable data
structures induces a specialization within functions that inhibits and
penalizes casual cooperation".

[1] [http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-5.html](http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-5.html)

~~~
muhuk
Do you mean heterogeneous lists and maps?

~~~
gw
No, that's not what I was referring to. Perlis' quote refers to something much
more fundamental. When you make custom data structures (classes, types, etc)
to represent each distinct part of your problem, this specificity makes your
functions less reusable and moves away from Perlis' famous quote (in the very
next sentence) about the benefit of many functions operating on few data
structures.

~~~
muhuk
I'm not sure if I understand.

Your data structures can be parametrized, no? I mean nobody creates a
ListOfFoos or HashMapOfBars, it's List[Foo] or HashMap[Bar] where Foo and Bar
can be replaced with anything and methods defined on List and HashMap would
still work.

Can you perhaps give a more concrete example?

~~~
gw
Take the example of a simple game where you move a player around. In
statically-typed languages, it is typical to define a type called Player
containing two floats representing the x,y position. In a dynamically-typed
language, it is typical to just use a map and store the position as key-value
pairs.

Note that inheritance does not solve the problem of overly-specialized
functions. The statically-typed language could make Player inherit from the
built-in HashMap type, but the functions that require Player will still not
work with anything else.

~~~
srean
> but the functions that require Player will still not work with anything
> else.

I think you are equating statically typed to Java. There are plenty of
statically typed languages which will take care of this, while still avoiding many
runtime errors. Heck, even C++ templates will do this for you, although C++
with concepts would be a lot better, sans the compile time. Typeclasses also
address the same problem: to signal compile error early.
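
A sketch of the typeclass-style answer, using a Rust trait for illustration. `Positioned`, `Player`, `Enemy`, and `move_by` are hypothetical names picked to match the game example upthread:

```rust
/// Anything that has a 2D position.
pub trait Positioned {
    fn position(&self) -> (f32, f32);
    fn set_position(&mut self, x: f32, y: f32);
}

pub struct Player {
    pub x: f32,
    pub y: f32,
}

pub struct Enemy {
    pub pos: (f32, f32),
}

impl Positioned for Player {
    fn position(&self) -> (f32, f32) {
        (self.x, self.y)
    }
    fn set_position(&mut self, x: f32, y: f32) {
        self.x = x;
        self.y = y;
    }
}

impl Positioned for Enemy {
    fn position(&self) -> (f32, f32) {
        self.pos
    }
    fn set_position(&mut self, x: f32, y: f32) {
        self.pos = (x, y);
    }
}

/// Written once against the trait, so it is not tied to `Player`.
/// Passing a type without a `Positioned` impl is a compile error.
pub fn move_by<T: Positioned>(thing: &mut T, dx: f32, dy: f32) {
    let (x, y) = thing.position();
    thing.set_position(x + dx, y + dy);
}
```

The function requires only the capability it actually uses, which is close to what the dynamic version relies on implicitly.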

------
jeremyjh
Can we just leave Haskell out of it? It really makes no sense for the
Haskellers to come here and say 'but these would all be type errors in my
program'. It's like when there is an argument between Bud Light and Miller
Light and you come in with 'but no seriously you have to try this new IPA I
just brewed'. It's just not even the same topic.

