
The Unreasonable Effectiveness of Dynamic Typing for Practical Programs - greggman
http://games.greggman.com/game/dynamic-typing-static-typing/
======
eloff
I used to be a huge dynamic languages fan, with python being my favorite
language over the majority of my career. Then I worked on a large python
project of ~100k LOC with a team of ten. That's when I realized that writing
code faster isn't the problem. Reading it is, making changes to code someone
else wrote is, and refactoring across dozens of modules is a problem. Static
languages help a lot with all three. I still love dynamic languages for small
tasks, but I'd rather use a static language like Go which still keeps much of
the dynamic feel, and helps catch my mistakes at compile time. I actually just
set it to watch the project directory for changes and recompile automatically
in a terminal on another screen (actually to run the unit tests, which
includes compiling.) That's constant feedback, comparable to dynamic-language
speed for edit-compile-test runs. But the best part is the unit
tests. They run so much faster with Go that I don't get distracted waiting for
them to finish, and that keeps me in the zone and much more productive.

~~~
incepted
Agreed. I often find that most people who are very enthusiastic about dynamic
typing have only seen it in action on solo or small projects.

Add complexity or people and dynamic typing quickly crumbles.

~~~
dgritsko
I made this point in an earlier comment, but I'm a firm believer that if the
speed at which you can physically type code is the bottleneck, then you are
going to eventually have much bigger problems. Hastily written code will
require more time to fix than it will cost you to slow down and think about a
good design, regardless of language.

~~~
bjourne
If you can code quickly, then the effort required to refactor previously
written bad code is lower. I.e. it is not that you can type faster that
matters, it is that you can fit more change iterations in the same time frame.
And many times multiple change iterations are required because you can't see
beforehand whether a particular refactoring will be a net positive or
negative.

~~~
habitue
Strongly typed languages beat dynamic languages hands down for refactoring.
The ability to know immediately what broke without having to run all your code
paths dominates any time savings gained by typing faster.

I'm a fan of strongly typed languages with good type inference (and therefore
less typing), but even in verbose languages like Java this is true about
refactoring.

------
Udo
Having worked with (and sometimes implemented) different flavors of both
static and dynamic languages over the course of my 20+ years in programming, I
still feel a little guilty about not coming down on one extreme side like most
of my colleagues.

I do like the fact that static typing gives me more confidence when
refactoring things, and I do appreciate the speedier execution, but at the
same time when programming in a dynamic language I only rarely run into bugs
that could have been avoided by a (moderately strict) type system.

To me, the best feature of dynamically-typed languages is dynamic container
objects (JS object literals, PHP arrays, Lua tables, Lisp/Scheme lists),
because instead of having every single container be a purpose-built one-off
abstraction that is often not even completely inspectable, much less
extendable, I can just have one standard structure for any data.
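
A quick sketch of what I mean (in TypeScript, with an invented record, just
to show the runtime behavior):

```typescript
// One generic, standard structure for any data - a plain object literal -
// instead of a purpose-built container class per use case.
const order: Record<string, unknown> = {
  id: 42,
  items: ["book", "pen"],
  shipped: false,
};

// The structure is fully inspectable and extendable at runtime:
order["trackingCode"] = "XYZ-123";      // extend with a new field
const keys = Object.keys(order).sort(); // inspect every field
console.log(keys);
```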

In the end though, both paradigms are tools to be used appropriately. While
there is a great deal of problems that can be solved with either of them, they
clearly each have strengths and weaknesses to be understood within the context
of the program's purpose.

~~~
u320
_To me, the best feature of dynamically-typed languages is dynamic container
objects (JS object literals, PHP arrays, Lua tables, Lisp/Scheme lists),
because instead of having every single container be a purpose-built one-off
abstraction that is often not even completely inspectable, much less
extendable, I can just have one standard structure for any data._

This is perfectly possible in static languages as well.

~~~
Udo
It depends on the language and it depends on how it's idiomatically used. In
theory, you can make capability claims about most languages interchangeably,
but the question is if the language really affords it or not. For example, I
would say that Rust's _enum_ gives you a lot of this capability easily,
whereas C/C++ usually does not - _union_s and explicit casting
notwithstanding. But I do fully acknowledge you could make that claim for almost
any language, and be technically correct.

~~~
u320
Sure. I just mean that if you want to have that feature in your language,
static typing will not prevent you.

In practice almost no mainstream static language does this though.

------
ianamartin
Dan Luu covers a ton of literature here[0]. The takeaway is that all of the
major studies are flawed in pretty serious ways - enough that you can't draw
strong conclusions.

[0] [http://danluu.com/empirical-pl/](http://danluu.com/empirical-pl/)

~~~
dgritsko
The study the OP references (from the original talk) seems suspect to me; in
particular, the fact that the researcher created both a static and a dynamic
version of his own language. Generalizations about the time required to "get
stuff done" in _all_ (or even _most_ ) statically/dynamically typed languages
feel unwarranted. And then there are statements like this:

"he found [that] it took less time to find [type] errors than it did to write
the type safe code in the first place."

Maybe that's an indication that his homegrown statically typed language was a
crappy language, rather than exposing some fundamental truth about statically
typed languages in general?

~~~
pekk
> Maybe that's an indication that his homegrown [...] language was a crappy
> language, rather than exposing some fundamental truth about [...] languages
> in general?

Can you come around and make this same argument any time someone makes a
blanket statement about static languages being unconditionally better than
dynamic ones?

~~~
dgritsko
I guess what I was hoping to emphasize is that the number of variables
involved in trying to quantify something like this make it difficult (probably
impossible) to have an accurate comparison -- even more so when attempting to
generalize the results of those comparisons to larger categories of
programming languages.

------
OmarIsmail
This is why I think gradual typing with structural types is the Right Way, and
will be what all modern programming languages move towards (like how we don't
have to worry about manual memory management anymore).

JavaScript with Flow, or TypeScript, is a great example of this. Perl 6 is
using gradual typing.

The key thing is that there are times where you want the inflexibility of
static typing, and there are times where you want the benefits of dynamic
typing. Structural types also remove dependencies because the function defines
the _structure_ of the type it expects, and not a specific reference.
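
A minimal TypeScript sketch of that decoupling (names invented):

```typescript
// Structural typing: the function declares the shape it needs, not a
// specific class, so callers need no import of, or inheritance from, a
// particular type to satisfy it.
interface HasName {
  name: string;
}

function greet(x: HasName): string {
  return "Hello, " + x.name;
}

// A plain literal and an unrelated class both match the structure:
class User {
  constructor(public name: string, public id: number) {}
}

console.log(greet({ name: "Ada" }));
console.log(greet(new User("Grace", 1)));
```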

Also, I think github repos (and the person's tests) are heavily biased
towards individual projects. There's a massive difference between something
only one person works on, and something that an entire team is developing over
the course of years (as team members come and go).

~~~
ionforce
> benefits of dynamic typing

What, specifically?

~~~
OmarIsmail
Speed, specifically when doing some quick prototyping/exploration.

~~~
ionforce
Why is a dynamic language quicker for prototyping and exploration?

------
Symmetry
I would think that the increased productivity of Python over C++ has more to
do with Python being a higher level language than static typing per se.

I'm translating some Python into C++ right now for performance reasons. Manual
memory management, lack of list comprehensions, and many other things are
making it slower to write than the Python was but having to specify types is
really the least of it. The extra visual noise often makes the code less
readable but that isn't true of all static languages.

------
fjh
While I appreciate the sentiment of settling debates with data, you have to
actually measure the right things. That statistic that says that only 2% of
bugs would be prevented by static types? That's based on the assumption that
everything that would be a type error in a statically typed language manifests
as an exception in Python. But that's obviously not even close to being true.
The really annoying bugs that type systems prevent are far more insidious than
that. For example Python will happily let you use the greater-than operator on
an int and a function. So if you accidentally write `f < 10` instead of `f() <
10`, you will not get a type error, but your program will have a bug that
manifests in your program logic going wrong and leading to wrong results
somewhere down the line.
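
The same silent comparison exists in JavaScript. A TypeScript sketch (casting
through `any` to reproduce the untyped behaviour, since the compiler would
otherwise flag it):

```typescript
function f(): number {
  return 5;
}

// In untyped code, `f < 10` is legal: the function coerces to a string,
// then to NaN, and the comparison silently evaluates to false.
const buggy = (f as any) < 10; // no error, just a wrong answer
const intended = f() < 10;     // what the author meant
console.log(buggy, intended);  // false true
```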

I've analysed all bug tickets for a Python system at a previous job for
several months, tracking how many of our bugs would have been prevented by a
Haskell-style type system. This isn't very scientific either, but for my
sample it was somewhere between 70%-90% depending on your interpretation. I'm
not saying this generalises to all projects, but I can definitely say that the
2%-number is hilariously wrong.

------
cryptica
It's good to see hard data on this. I hope more people will publish similar
research in the future. I've done a lot of work with both statically and
dynamically typed languages and I've always known that dynamic languages
were more productive.

I've often made comments on HN suggesting my preference for dynamically typed
languages and these comments often got downvoted - this surprised me because I
thought that my view was the consensus.

To be fair, I think the rise of the web and in particular of the JSON format
for APIs and the use of typeless NoSQL databases have favored dynamically
typed languages. JSON objects have no type, so when you write statically typed
code, you have to add logic to cast everything into concrete types instead of
accepting the data as provided. If you use a NoSQL database, you will get
dynamic typing in the storage layer as well so you won't have to worry about
types anymore... In such a scenario, you can enforce the consistency of
various parts of your data as much or as little as you like.

~~~
btilly
I believe that dynamic languages make initial development faster, and
maintenance more expensive. In particular with static types it is much easier
to launch into a refactor and depend on the type system to tell you about
dependencies that you forgot about. In a dynamic language, similar refactors
are scarier.

That said, I personally prefer working in dynamic languages. It is more fun
for me. But I don't think it is necessarily the right choice for all
employers.

------
tunesmith
This seems littered with potential fallacies to me, but I'm only basing that
off the summary of the video. It isn't really useful to lump all statically-
typed languages into one bucket since some have confusing compiler messages
and some don't. Same with counting "type errors" of dynamic languages, it
seems the definition of "type errors" would entirely depend on how exhaustive
the hypothetical static type system would be. Also, it's kind of weird (and
I've seen this before from other dynamic type enthusiasts) to imply that unit
tests are only for dynamically typed code - we use them in statically-typed
languages too!

I generally hold the point of view that dynamically typed languages are great
for prototyping and when you are in the "build fast and break stuff" phase,
and that statically typed languages are better for when correctness and
maintenance matter more. It's generally mapped my career path as I've moved
from being a freelancer for small clients to a contractor/consultant for large
clients. One basic indication for when you might want to consider pulling more
statically-typed languages into your stack is when you start running into
those really confusing run-time bugs/behaviors that are really hard to track
down.

~~~
nine_k
Entirely anecdotally, I found that in the same amount of time I can write down
more functionality in a (toy) Haskell program than in a (toy) Python program.
Partly this is because finding trivial errors becomes easier (despite cryptic
compiler messages), partly because the language and standard library often let
you express algorithms in fewer words. (Due to this, without a strict type
system and type inference, Haskell code would likely become brittle far faster
than Python code.)

~~~
tunesmith
It's a good point because I think a lot of the static-vs-dynamic typing
discussion gets confused by the functional-vs-imperative discussion.

------
andreyk
This 'Unreasonable Effectiveness' naming scheme has really started to be
abused, huh? I mean this really just points out that dynamic languages are
somewhat faster to work with than static languages - hardly 'Unreasonable'. I
think as a rule large project/codebase -> static typing is nice, but for fast
scripting/problem solving dynamic is clearly the way to go (hardly original).
I don't think the cited data here provides a good argument against that - the
study involved a quick problem (several hours), and the bug breakdown does not
convince me that having type checking does not prevent many bugs in the long
term. The final point about anti-modularity is just weird given OOP and all.
Still, some interesting data.

~~~
golergka
Unreasonable Effectiveness Considered Harmful.

------
preordained
Good to see some effort put into making a case in the debate (for either
side). I drank the Haskell Kool-Aid pretty hard a few years back, and I wanted
to believe that all the anecdotes people used to support the language had to
be true. You're more productive because of all the compiler can do for you,
you're safer, you're more correct, etc... My experience after some time was
that significant effort was spent with me serving the type system, rather than
it serving me. The article's statements are in line with the lessons I've
learned, and I'm not going to feel bad about throwing such an anecdote
around on a message board... but more to the point, I feel GOOD that some
people are taking strides to make this conversation less anecdotal.

------
Animats
He's doing experiments with programming in the small, probably with students
as subjects. Not useful. Typing is most valuable for enforcing consistency
within large programs, or between software components from different sources,
where a change in one part may break other parts. If he wants to experiment
with this, he should have groups of three or four students independently write
parts of a program, then integrate them.

There's convergence on when to declare types. As I mentioned in a previous
post, the trend is toward declaring function parameter types, but inferring
the types of local variables. Go and Rust take this route. C++ now supports it
with "auto". Even Python is considering adding optional function parameter
type declarations.
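
That convergent style looks roughly like this in TypeScript (an invented
helper, analogous to Go's `:=` and C++'s `auto`):

```typescript
// Parameter and return types are declared; local variables are inferred.
function totalPrice(prices: number[], taxRate: number): number {
  const subtotal = prices.reduce((a, b) => a + b, 0); // inferred: number
  const tax = subtotal * taxRate;                     // inferred: number
  return subtotal + tax;
}

console.log(totalPrice([10, 20], 0.5)); // 45
```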

~~~
zokier
Inferred types are still considered static in most cases.

------
throwaway13337
I know everyone has their preference here so it doesn't help to just talk
about anecdotes.

That said, the author really doesn't seem to take note of the largest benefit
of type safety: Self-documentation.

Looking at someone else's source code with type hints gives you so much more
of an idea of what's going on and what sorts of parameters any given function
might take. With static types, that documentation is guaranteed to be right.

Dynamic languages also encourage the use of clever reflection in ways that
make your code unreadable to someone who has limited scope of context.

For me, I'd rather be using a lib created in a static language.

------
AnimalMuppet
That first datapoint... at worst, programs took 60,000 seconds to write.
That's 1000 minutes - just under 17 hours, or about two working days. In that
time, it took
less time for the programmer to catch type errors than for the programmer to
write the static typing to catch the errors. I can buy that.

Now make the code base 10,000,000 lines. Maintain it for two decades with
dozens of programmers. Now how relevant is that 2-days-at-the-worst datapoint?

------
tracker1
I'm glad to see this... I've felt for many years now that I'm far more
productive in dynamic languages than static ones. I always really liked C#,
but since I started using node.js more, I find I get a lot more done. When I
use modules and functional patterns, I tend to avoid a lot of the bugs I get
in C#/Java, with simpler code that's easier to understand... Unit tests become
only slightly more interesting, as I have to use proxyquire to override the
require system for the module I'm testing, but in the end the actual code is
easier to reason about. Also, DI/IoC is mostly unnecessary, which is a huge
pain point when debugging .Net code (and I'd presume Java is similar).

It does take a bit of discipline to keep things smaller/modular and not do too
much in any single file/module... but overall the code comes out more
reliable. It works very well for front-end systems and the direct services
they work with... farther back, another systems language may be a better
option.

------
badloginagain
I was about to comment about my personal experience, but realized the irony
before posting. I won't say I'm convinced either way, but it's great to see
some data enter the conversation.

~~~
sago
Your irony meter is more sensitive than a large number of commenters whose
responses are variants of "no, because... anecdote!"

I agree, it is interesting to see data, though I struggle to design a study in
my head for how to really show this either way. But... anecdote... it does
amuse me how various languages claim to improve development time/safety/bug
rates/cost (OOP, types, functional, etc.). But these big claims are not quite
borne out by companies investing in those languages wiping the floor with their
competitors. If there really was a significant difference, I'd expect
evolution to take its course.

~~~
EvanPlaice
Attempting to argue with ideology is a self-defeating proposition.

In the presence of 'belief' there is no room for reason.

Which reminds me. I haven't seen Dogma in a while. Metatron is also one of my
favorite roles played by Alan Rickman. I guess I found my plans for this
evening.

------
mjt0229
Did his statically typed language include reasonable type inference? Because
if not, then of course it took longer to write static types.

If so, then, well, that's interesting. I'm not sure I'm ready to take the
training wheels off for myself, but maybe worth some thought?

~~~
ty56
The earlier parts of the talk generalize out to advanced languages like F# and
OCaml, but the "data" part of the talk assumes static languages are completely
represented by C++, Java and C#. It was pretty disappointing.

------
fmstephe
I would love to get a report from someone with a lot of Erlang experience.

Erlang is a great example of an industry oriented dynamically typed
programming language.

But... they recently, maybe more than 5 years ago now, went to a lot of
trouble to integrate a really nice static type system into it. They also had a
type analyser, Dialyzer, for a long time before that.

The impression I had was that some significant number of people experienced
some dynamic typing pain and a lot of effort was put into reducing this by
strengthening the type system.

I am just a part time Erlang hacker, so I will keep out of this debate. But I
would love to hear from anyone who experienced this in a professional
development environment.

------
evanspa
I've been doing a lot of programming in both lately (Clojure and Objective-C),
and
did a ton of Java earlier in my career, and I find I like both. The biggest
pain point for me in Clojure is the inability to easily refactor.

To overcome the limitation that the machine can't help as much when using a
dynamic language, I've found that writing good unit tests with solid coverage
gets me pretty far, and shortens the feedback loop when I break something.

But yeah, I sure do miss the right-click and 'rename' feature from Eclipse and
Xcode when writing Clojure.

------
u320
One of his slides says that C++ and Java are equally productive, but less
productive than C. Are we supposed to believe that OO is a net negative but
automatic memory management makes no difference?

------
muraiki
I've used both dynamic and statically typed languages, and also dabbled in some
of the more flexible static languages like Haskell and Scala. While I
personally prefer statically typed languages, I find Perl 6's gradual typing
to take an interesting approach. Perl 6's type checker operates differently on
functions and methods: functions, roles, and private methods are checked at
compile time, but public methods are checked at run time. As Jonathan
Worthington wrote in a presentation on Perl 6's approach, lexical scoping is
"your language" but a public method call is "the object's language": "it's for
the receiving object to decide how to dispatch the method," which might be
familiar to people who have used Smalltalk. This approach also allows easy
interoperability with libraries from dynamic languages, such as using Perl 5
or Python code in Perl 6.

I'm very interested to see how this approach works out in practice. Slides for
Jonathan's "Getting Beyond Static vs Dynamic" can be found here:
[http://jnthn.net/papers/2015-fosdem-static-dynamic.pdf](http://jnthn.net/papers/2015-fosdem-static-dynamic.pdf)

------
jakub_g
(working full-time on JavaScript code for last 4 years)

TL;DR: a good dynamic codebase is possible, but not when you throw _too many
devs_ at it (particularly _junior devs_), and only when you set a very high
quality bar. Otherwise, in finite time the codebase will converge to a big
pile of mess.

IMO a big dynamic-language codebase with multiple people working on it can
only survive with extensive test suite, lots of mock data available, automated
quality tooling (jshint etc) and proper code review in place. All of those are
well-known best practices, but they require good developers and certain
discipline.

In particular, when the app retrieves lots of data from a server, it's easy to
get lost when the frontend app doesn't even know what data it operates on, it
just knows it gets "some JSON". In the old-school Java world you'd have this
mapped into a bean and you'd at least know what you're working with.
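
The typed-world equivalent, sketched in TypeScript with invented field names:

```typescript
// A declared shape for "some JSON": anything that reads `user` now
// documents and checks what data it operates on.
interface UserPayload {
  id: number;
  name: string;
  email?: string; // optional fields are explicit, not discovered in a debugger
}

const raw = '{"id": 7, "name": "Sam"}';    // what the server sent
const user: UserPayload = JSON.parse(raw); // note: JSON.parse returns `any`,
                                           // so this is a trusting cast,
                                           // not a validation
console.log(user.id, user.name);
```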

Previous project I worked on was an inherited poor JS codebase with lots of
unrefactorable magic happening inside, multiple event buses and what not.
Since JS is a very dynamic and very permissive language, someone can abuse
some language constructs to make it very hard to reason about the code.

In a pathological extreme you may get "everything is an object" codebase but
no idea what keys each object has. Bonus points if the code is passing some
JSON around and adding keys to that object in an ad-hoc manner, scattered
across many functions. No tests and no mocks meant that to learn the shape,
you'd have to run the app and set breakpoints. But the same variable might
have had a totally different type of data inside depending on how some
if-elses executed. Most of our bugs were due to some objects not having some
keys (sometimes) for some reason, etc.

~~~
EvanPlaice
So, if you want to slice off an object and name it, just use ES6 classes.

    class NameMe {
      constructor (obj) {
        for (var prop in obj) this[prop] = obj[prop];
      }
    }

This literally just spits out a named object with an identical structure to
the object it's being created with.

If you want to make all keys visible and guarantee no exceptions are thrown
when a key isn't set, just define default values.

    class NameMe {
      string = '';
      number = 0;
      bool = false;
      array = [];
      object = {};

      constructor (obj) {
        for (var prop in obj) this[prop] = obj[prop];
      }
    }

You could always _gasp_ initialize everything to null if you're concerned
about the defaults screwing up application logic. That's how static type
systems address initialization by default anyway.

Want to enforce specific types? Add setter methods that include validation
checks. Do I need to continue?

None of these characteristics are difficult to define in Javascript.
Structuring data in a manner that's easy to reason about is no more difficult
in Javascript than it is in any typed language. Even before the introduction
of classes you could achieve the same using prototypes.

If the code you've encountered was difficult to reason about, it's because the
devs who wrote it suck at writing code that's easy to reason about. Self-
documenting code is a naming problem not a type theory problem.

Any code base, whether static or dynamic should contain an extensive
collection of mock data and unit tests to verify common cases and check for
breaking edge cases. Unless the project is a one-off fire-and-forget
implementation - in which case, who cares: maintenance is somebody else's
problem (I'm not stating this is a good idea, just what usually happens in
practice).

----

Bonus: How about a couple of cases that 'literally can't even' be done in
statically typed OOP languages.

Extending existing objects:

Sure, you can use object inheritance but that requires explicitly defining a
class that represents the new structure. Except that now creates a deeply
coupled relationship between the child and parent classes. Which will
inevitably become a maintenance issue if the business logic ever changes. Say
hello to technical debt.

Proof: Why do all of the collection classes in Java inherit from Vector?

You can accomplish the same in JS in any object (class based or not) using
.extend() (provided by most 3rd party frameworks). Which essentially does a
deep copy of all the object properties. No deep coupling between objects, no
technical debt incurred.
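
The pattern can be sketched with object spread (a shallow merge; framework
`.extend()`s often deep-copy, and all the object shapes here are invented):

```typescript
// Compose a new object from several sources without declaring a class
// hierarchy; later sources win on key collisions (object spread plays the
// role of a framework .extend()).
const base = { id: 1, name: "widget" };
const pricing = { price: 9.99, currency: "USD" };
const extended = { ...base, ...pricing, name: "deluxe widget" };

console.log(extended);
```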

Multiple Inheritance:

I'm surprised nobody in the OOP community talks about this anymore.

Since OOP relationships between objects only point to the external interfaces
(ie class definition) rather than reference the underlying data, there needs
to be a system to represent those relationships.

Multiple inheritance was cast off as not technically feasible due to the
possibility of users creating circular references - a la 'the diamond
problem.'

The solution: branch everything off of a central 'god object' and represent
all relationships between classes as a directed acyclic graph.

This is an artificial constraint that only applies to OOP. That not only makes
it unnecessarily painful to reason about the structure of class definitions,
but it makes it impossible to pass data to adjacent leaves in the DAG without
some form of external global state (ie the singleton pattern).

In JS it's very easy to use multiple inheritance. Just create an object and
add data from as many other objects as you want using .bind(). Done with an
object but still need to maintain its state? Return a new function that
maintains a reference to its enclosing function, i.e. use it as a closure.

Stop for a second to consider. The .bind() method, which is crucial to adding
functional-like aspects to an imperative programming language, is physically
not possible in OOP. The closest equivalent is to pass in ref/out params as
arguments into the constructor (ie creating more deep links to external
classes).

------
kras143
My personal experience with a large project which I started in Python and
later moved to Haskell: I did indeed get things done quickly in Python and had
the majority of the problem solved. Then I had a few nasty bugs which made me
change/refactor the code, and that's where my problems started. I quickly
realized refactoring a huge codebase in Python was really difficult. Maybe
there is a better way to organize my Python code, I do not know. Then I moved
to Haskell (partly because of the excellent Parsec library, which made things
much simpler compared to the yacc-style PLY I was using in Python). Initially,
the fix-compile-execute cycle was really painful, but I soon realized how I
could figure out some functional bugs (not type bugs) just by reasoning about
the types. The compiler too helped in some cases with valid type-conversion
errors. I would have caught such issues in Python only if I had a very large
test suite which covered those corner cases. Nevertheless, I am happy with the
move and my love for static typing is only going up every day.

------
mapleoin
I wonder if the results would be the same had they compared a more advanced
statically typed language with type inference, like an ML or even Go.

~~~
mjt0229
I just (more or less) duplicated your comment somewhere...jinx.

------
dgritsko
A lot of points being made seem to focus on how quickly one is able to write
code, and how dynamically typed languages are better since "you can write code
faster!". If the speed at which you are physically able to write your code is
the bottleneck, you are going to have problems regardless of whether your
language is dynamically typed or statically typed.

------
AnimalMuppet
> Another point he made is that writing static types is often gross and
> unmaintainable whereas writing unit tests not.

I suppose they _can_ become "gross and unmaintainable", if you do it badly.
And I'm sure some people do it badly, but... really? That sounds like someone
didn't know how to use static types. (Yeah, I know, No True Scotsman...)

------
everyone
Does he mention performance? (I haven't watched the whole video.) That would
be the biggest reason to go for C++, say, for me as a game programmer. Also
(once again) this is a very webapp-developer-oriented talk; stuff like
"stringly typed programming" and "all we do is put strings in http requests"
ignores all non-webapp devs.

------
diebir
This does not take into account the fact that IDEs provide massive amount of
help for statically typed languages and marginal to no help for dynamically
typed ones.

For instance, every IDE I have tried fails miserably on Python. Javascript is
even worse, even with a very smart tool like IDEA.

------
ergothus
Interesting - the primary point, that devs spend more time on type safety
than they would fixing the few type bugs they'd have made, matches my
impressions.

That said, when it comes to tests there are definitely times I miss having
strong typing.

I recall one poster several years ago remarking with surprise when he
discovered that not everyone agreed with him on what was most important to
optimize: dev time or run time.

In my world, devs are massively overworked/overneeded, so anything to reduce
dev time that doesn't cripple the product sounds like a good thing. Different
markets will have different needs, but I've found a number of devs that
consider run time vastly more important regardless of market.

------
KirinDave
Are we really still referring to Prechelt's work 16 years later (more really,
it took that group a long time to assemble their data)?

[https://page.mi.fu-berlin.de/prechelt/Biblio/jccpprt_computer2000.pdf](https://page.mi.fu-berlin.de/prechelt/Biblio/jccpprt_computer2000.pdf)

This study has been cited over and over despite nearly universal criticism for
stacking the deck in favor of dynamic typing.

------
MaulingMonkey
Lies, damned lies, and statistics.

If you want to argue that most hip dynamic languages will allow faster
development than most hip static languages in many situations? Sure, I'll buy
that. E.g. jumping from C++ to Ruby for a bit was, for many of the projects I
worked on, a major productivity boost.

But I buy it because it's got enough weasel words, and it's focused on the
languages in practice rather than the actual language attribute. Because I can
think of clear and concrete examples where adding static typing helped: E.g.
using Typescript to add static typing information to existing Javascript. This
massively improved my speed in picking up new APIs via judicious abuse of
Intellisense. And typescript as I'm using it is doing very little more than
just adding static type information - much closer to purely comparing static
vs dynamic typing than any combination of languages I can spot on these
charts.

The article's statistics don't measure the productivity difference of static
typing vs dynamic typing - it measures _certain statically typed languages_ vs
_certain dynamically typed languages_ (among a million other caveats.) C++ vs
Python? The biggest difference there isn't static vs dynamic - the former is
weighed down by some of the most cumbersome explicit type annotation in
existence (when explicit type info isn't fundamental to static typing at all).
Worse, it has the nastiest grammar causing horrible build times, an absolutely
insane dependency system, and a standard so absolutely rife with
implementation defined, unspecified, and undefined behavior that it doesn't
even define the size of its common integer types. And I've yet to meet a C++
codebase that doesn't use its common integer types.

The article also dismisses type errors too lightly, IMO. "Out of 670,000
issues only 3 percent were type errors (errors a static typed language would
have caught)". But does this include such things as SQL injections, where SQL
Data was mistakenly treated as SQL Commands? While most APIs don't leverage
the type system to differentiate these in a way that will cause errors (be it
at runtime _or_ build time), I do consider this a type error, one that _could_
be caught with type system. Unfortunately, handling the single vanilla string
type you typically get tends to take precedence over creating such a type
separation...
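To make the "SQL data vs. SQL commands" point concrete, here is a minimal
Python sketch of that kind of type separation (the `SqlCommand`/`UserInput`
names are hypothetical, and a static checker such as mypy is assumed to be
enforcing the annotations):

```python
from typing import NewType

# Hypothetical distinct types: a raw user-supplied string vs. a vetted
# SQL fragment. A static checker (e.g. mypy) rejects mixing them up,
# even though both are plain str at runtime.
SqlCommand = NewType("SqlCommand", str)
UserInput = NewType("UserInput", str)

def escape(raw: UserInput) -> SqlCommand:
    # Toy escaping for illustration only -- real code should use
    # parameterized queries rather than string escaping.
    return SqlCommand(raw.replace("'", "''"))

def build_query(name: UserInput) -> SqlCommand:
    # Because this accepts only UserInput and returns only SqlCommand,
    # a checker flags any call site that passes a raw str straight in.
    return SqlCommand(
        "SELECT * FROM users WHERE name = '" + escape(name) + "'"
    )
```

At runtime nothing changes; the win is entirely at check time, which is
exactly the "single vanilla string type" problem described above.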

------
incepted
> People believe static typing catches bugs. People believe static typing
> helps document code. People believe static typing makes IDEs work better and
> therefore save time, etc. BUT … those are all just beliefs not backed up by
> any data

Yeah... no. It's backed up by data. Mathematics, even.

Types enable safe automatic refactorings. Without types, refactorings can
break your code if not supervised by humans.

~~~
jwdunne
It's not just that it can make IDEs 'work better'. The compiler can work
better with a great static type system. Haskell, for instance, is mind
blowing.

Perhaps the problem is that static typing is confused with how some languages
implement it. It can be, and has been, implemented such that it dramatically
improves compilation, correctness, static analysis and documentation without
making code horrendously verbose.

~~~
madflame991
> Haskell, for instance, is mind blowing.

What's mind blowing is the volume and obscurity of the error messages you
get from the type checker. It's harder to understand the error message than to
debug the program if compiled "unsafely".

~~~
agumonkey
It's a threshold I guess. My formal proof teacher said he thinks in type
checking unification all the time. To him a program is a proof tree
construction.

With time, the way I approach code has also become much more abstract. You
look at invariants more than at the code itself.

------
andrewchambers
That the examples use C, C++, and Java as the statically typed languages
didn't impress me. There are far better languages. Also, the performance of
the solutions will be worse in dynamic languages.

------
p4wnc6
> This first slide is from a research paper where the researcher wrote his own
> language and make both a statically typed and dynamically typed version then
> got a bunch of people to solve programming problems in it. The results were
> that the people using the dynamic version of the language got stuff done
> much quicker.

Does this first plot control for notions of quality and extensibility of the
different solutions? A faster-to-develop but sloppier solution in a dynamic
language which requires more painful investment to refactor for future use
cases should not necessarily be viewed as better. If you are only saving short
term time at the expense of much more long-term time, then whether it is a net
win for you depends on your discount function.

> What was most interesting was that he tracked how much time was spent
> debugging type errors. In other words errors that the statically typed
> language would have caught. What he found was it took less time to find
> those errors than it did to write the type safe code in the first place.

For which developers, and with what level of experience with static typing?
This was true for me 3 months after I started learning Haskell. Now I have > 8
years of Python experience and less than 2 years experience with Haskell and
the type system demonstrably speeds me up. Way, way faster to use Haskell's
type system first than to use Python's type system and trace backs to debug
type errors later. (I still like and use Python a lot -- just sayin.)

> The guy giving the talk, Robert Smallshire, did his own research where he
> scanned github, 1.7 million repos, 3.6 million issue to get some data. What
> he found was that there were very few type error based issues for dynamic
> languages.

> So for example take python. Out of 670,000 issues only 3 percent were type
> errors (errors a static typed language would have caught)

This strikes me as one of the most problematic parts of the post. To me this
just seems to be evidence that in Python, at least, TypeError is more common
when you are using something interactively, and you can resolve the issue for
yourself (because it generally directly means _you_ are using it wrong, and
it's _not_ the library's fault).

This also resonates with my experience with Pandas on GitHub. Early on there
was a lot of TypeError stuff with index-related issues, but once the bulk of
that work became mature, index errors were then a signal of a novice user who
needed to change the user code, and not at all an indication of a library
problem worthy of opening a GitHub issue.

It seems totally reasonable to me to hypothesize that the types of problems
worthy of becoming GitHub issues are not usually TypeError. But TypeError
might still be a huge proportion of all of the errors encountered out in the
wild.

Further, there's also some selection effects here for users who actually post
things to GitHub. When I worked in quant finance, and everything was in
Python, it was an hourly occurrence for hugely important parts of the system
to hit type errors, and they were all incredibly painful to fix in the legacy
code. This was just accepted as a way of life, and because the investment
staff weren't incentivized to care much about code, they usually just hacked
their own workarounds, and would never have dreamt of actually opening a GitHub
issue about type errors (that would be way too slow of a dev cycle for them,
which is why the state of the code was so poor in the first place!)

> His point there is that all that static boilerplate you write to make a
> statically typed language happy, all of it is only catching 2% of your bugs.

This is absolutely false and not a valid generalization of the presented data.
For one, a major claim of static typing proponents is that by writing with
static typing, it eliminates bugs from ever being introduced, and allows you
to use a compiler workflow to verifiably remove entire classes of bugs. When
you run some bit of Python and it does not produce a TypeError -- that doesn't
mean the code is free of errors. It might just mean you got lucky that the
data or the user selections or whatever didn't happen to hit the TypeError
corner case. With a static language, you _know_ that certain classes of errors
are not even possible -- not just that they didn't happen to occur this one
time, but that they _cannot_ occur. This is _very_ different.

Further, another claim of static typing proponents is that the _design
process_ of code with static types also leads to fewer bugs because the mandate for
static types forces you to clarify befuddled design ideas before the program
will work. The benefit of this is murkier, for sure, but it's still something
that can't be addressed by this particular data.

> Some other study compared reliability across languages and found no
> significant differences. In other words neither static nor dynamic languages
> did better at reliability.

It's interesting to me that that chart doesn't include any functional
languages. Let's try it again with a pure functional language and see, and
then also compare, say, Clojure with Haskell. If it keeps on robustly bearing
out the same trend, then I might start to question my current beliefs on
defect rates in dynamic, imperative languages.

> Part of that was reflected in size of code. Dynamic languages need less
> code.

This again is relative to the ability of a developer and also relative to
different types of tasks. However, it's not really fair to compare languages
like C, where brevity of syntax was not too big of a language design priority,
with a language like Python, where brevity of syntax is sometimes militant
(just try talking with "Pythonistas" on Stack Overflow about why one-liner-
ness is really not that useful). And also, at least part of the result is
fixed for you: static typing at the very least requires the extra type
annotations -- although here again you could try against something like
Haskell where you have very powerful type inference. I would be extremely
surprised if, for equivalently experienced developers, Haskell programs were
not consistently shorter than Python programs.

> He points out for example when he’s in python he misses the auto completion
> and yet he’s still more productive in python than C#

Try Jedi in emacs (or whatever the equivalent must be in vim). Although, I for
one hate IDEs (get off my lawn) and I also hate autocompletion and editor
utilities that jump to function or class definitions. I've never noticed a
significant speed up from these, except possibly when I am merely reading code
from a large codebase that is brand new to me. But I have often experienced
huge slowdowns from the features getting in my way.

> Another point he made is that writing static types is often gross and
> unmaintainable whereas writing unit tests not.

See Haskell. Also, writing unit tests can be a nightmare in OO and imperative
settings, where you need some inscrutable cascade of mocked architecture to be
able to test things. This is where something like Haskell's QuickCheck can
make life a lot easier. I'm sure you could cook up something like that in
Python too. But I strongly believe that writing unit tests in Python is way
uglier and more frustrating than writing type annotations in Haskell.
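As a rough sketch of the QuickCheck idea in Python (hand-rolled here for
illustration; in real Python the Hypothesis library does this properly, with
shrinking and smarter generators -- the function names below are made up):

```python
import random

def run_length_encode(s: str) -> list[tuple[str, int]]:
    # Function under test: simple run-length encoding.
    out: list[tuple[str, int]] = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    return "".join(ch * n for ch, n in pairs)

def check_roundtrip_property(trials: int = 500) -> None:
    # QuickCheck-style: generate random inputs and assert an invariant
    # (decode(encode(x)) == x) instead of hand-picking test cases.
    rng = random.Random(0)  # seeded for reproducibility
    for _ in range(trials):
        s = "".join(rng.choice("ab") for _ in range(rng.randrange(20)))
        assert run_length_decode(run_length_encode(s)) == s, s
```

The property states a law about all inputs, which is much closer in spirit
to a type-level guarantee than a handful of example-based unit tests.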

~~~
p4wnc6
continued ...

> Static types are also anti-modular. You have some library that exports say a
> Person (name, age ..). Any code that uses that data needs to see the
> definition for Person. They’re now tightly coupled. I’m probably not
> explaining this point well. Watch the video around 48:20.

This seems just wrong to me. You can declare structs as static in C and
provide public helper functions that internally create data types, apply other
static functions to them, and then produce results from them. In Haskell, it's
very common to avoid exporting value constructors for data types, and to
instead provide helper functions that allow for the implementations to remain
hidden from anyone using the module. Modularity really has nothing at all to
do with the dynamic vs. static typing debate.
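A rough Python analog of that hide-the-constructor pattern (the
`_Person`/`make_person` names are hypothetical): keep the class module-private
and expose a smart-constructor function, so the representation can change
later without touching call sites:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class _Person:
    # Leading underscore marks the raw constructor as module-private;
    # callers are meant to go through make_person below.
    name: str
    age: int

def make_person(name: str, age: int) -> "_Person":
    # Smart constructor: the one sanctioned way to build a value, so
    # invariants are enforced in a single place.
    if age < 0:
        raise ValueError("age must be non-negative")
    return _Person(name.strip(), age)

def greeting(p: "_Person") -> str:
    return f"Hello, {p.name}"
```

Python only enforces the underscore convention socially, where Haskell's
module exports enforce it for real, but the modularity story is the same:
hiding the value constructor is orthogonal to static vs. dynamic typing.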

I'll also throw one more downside of dynamic typing into the ring -- you
sometimes will see really poor attempts to use so-called "defensive
programming." In Python this is an especially bad code smell -- you'll see a
huge block of assert statements right at the top of a function definition, in
which all kinds of type properties and invariants of the arguments are
asserted, so that TypeError can be raised immediately.

For one, in a dynamic typing setting, it's probably better if that stuff is
the burden of the caller rather than the callee; in the spirit of a function
"doing one thing and doing it well", it shouldn't also have to carry around all
of its own type and invariant assertions. Notice that in a static language
though, this isn't a problem and even is a huge benefit because it doesn't
require the huge, human-error-laden block of asserts to achieve it. Just a
nice, simple static typing annotation and then the _compiler_ will deal with
it.
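A minimal Python sketch of that contrast (the `mean_*` names are made up):
the assert-wall version versus an annotated version that hands the type
checks to a static checker such as mypy. Note that a plain type annotation
covers the type checks but not richer invariants like non-emptiness:

```python
def mean_dynamic(xs):
    # The "defensive programming" smell: a wall of asserts re-checking
    # the argument's type properties by hand on every single call.
    assert isinstance(xs, list), "xs must be a list"
    assert len(xs) > 0, "xs must be non-empty"
    assert all(isinstance(x, (int, float)) for x in xs), "xs must be numeric"
    return sum(xs) / len(xs)

def mean_typed(xs: list[float]) -> float:
    # With annotations, a static checker verifies callers before the
    # program runs; the body stays focused on its one job. (The
    # non-emptiness invariant would still need a check or a richer type.)
    return sum(xs) / len(xs)
```

Both behave identically on good input; the difference is where and when the
bad input gets caught.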

Related to this, and as a final point, we should also give more
"severity" to dynamic typing exceptions that occur at run time due to type
errors. For example, in the financial job I mentioned before, it was
commonplace for an analyst to submit a very large batch processing job to the
internal job manager. Some of these jobs took > 48 hours to compute and the
output would mutate databases and so on.

So when someone set it running on Friday evening and expected there to be
results in a database on Monday, imagine how awful it was to see that a
TypeError had occurred and that not only did your manually created assertions
fail to capture it, but also, there was no way of proving it couldn't happen
without just running your code -- so you burnt maybe 30 hours of computational
effort just to be told that upon hitting a certain point in the code, here's a
TypeError.

This kind of error, which is categorically eliminated from possibility in a
well-written static language program, should count for way, way more than a
simple and stupid "oh I tried to call the API function with a list instead of
a tuple, whoops my bad, let me just arrow-up in IPython and do it again"
TypeError (though it's not clear to me that several of the data sources
referenced in the post would make this distinction or penalize these types of
errors more).

~~~
pekk
> You can declare structs as static in C and provide public helper functions
> that internally create data types, apply other static functions to them, and
> the produce results from them.

You can do that, with training and careful effort. But it was a design flaw
that you have to do it manually, and that it isn't mandatory and trivial for
even beginners to do. At the time C was "designed," this wasn't necessarily
known to be important. We have no excuse today. But languages which do this
wrong by default are still popular.

> In Haskell, it's very common to avoid exporting value constructors for data
> types, and to instead provide helper functions that allow for the
> implementations to remain hidden from anyone using the module.

In general, if calls require knowledge of type information at the call site,
and the type needs to change for any reason (which becomes more likely as type
annotation reaches further into program semantics) then all the call sites
will need to be updated, or there will be an error. In any published library,
this means backward compatibility is completely broken and everyone else's
code needs to change.

This is a misdesign in C and in a number of "statically typed" languages which
crib from it.

> you'll see a huge block of assert statements right at the top of a function
> definition,

I almost never see this. The only time I see it is when a dogmatic true
believer in the ideology of static typing writes Python. People can do stupid
things in any language.

> this isn't a problem and even is a huge benefit because it doesn't require
> the huge, human-error-laden block of asserts to achieve it.

Humans are still required to provide type information, which means they can
still make errors. Even better, correcting these errors often affects the
interface at call sites, which means the fix has to break backward
compatibility.

> so you burnt maybe 30 hours of computational effort just to be told that
> upon hitting a certain point in the code, here's a TypeError.

You were not reasoning correctly about your code. Proper testing should have
been your safety net, but you weren't testing properly. If you are even
vaguely trained and you are even vaguely trying, writing code which emits
TypeError in production takes some doing.

The number of shops which never have problems in production is vanishingly
small in ANY language.

It sounds to me like you got started in Python, and are identifying beginner's
mistakes with the language itself.

~~~
p4wnc6
> In general, if calls require knowledge of type information at the call site,
> and the type needs to change for any reason (which becomes more likely as
> type annotation reaches further into program semantics) then all the call
> sites will need to be updated, or there will be an error. In any published
> library, this means backward compatibility is completely broken and everyone
> else's code needs to change.

Notice I said you avoid exporting the _value_ constructors. You're still free
to export or not export the data type itself as you wish, allowing users to
reference the type in type annotations while still not letting them ever
construct their own value of the type except through helper functions.

This achieves even better modularity, because then in the implementation file,
you can change what happens with the _value_ constructors however you want,
and you can service backward compatibility to your heart's content without
ever requiring the _users_ of the data type to even be aware that anything is
changing.

Maybe you are referring to something else, but I am referring to data type and
value constructors in Haskell. The data type itself is a distinct semantic
construct in Haskell from the constructors of values of that data type, and
they can have different privacy properties.

> I almost never see this.

Well, I've seen it over and over in production critical code in three
different organizations ... so our anecdotes disagree.

> Humans are still required to provide type information, which means they can
> still make errors. Even better, correcting these errors often affects the
> interface at call sites, which means the fix has to break backward
> compatibility.

It depends on the language. In Haskell, for example, you could just make a
type union, with one case for the old-style interface and one for the new,
corrected version. It's very easy to do, still has the upsides of type
checking, and doesn't break backward compatibility.
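The same trick sketched in Python with `typing.Union` (the
`lookup`/`UserRecord` names are hypothetical): old call sites keep working
while new ones migrate, and a checker forces both cases to be handled:

```python
from typing import Union

class UserRecord:
    # Hypothetical evolution: the API originally took a bare user id
    # (int); the corrected interface takes a structured record.
    def __init__(self, user_id: int, region: str) -> None:
        self.user_id = user_id
        self.region = region

def lookup(user: Union[int, UserRecord]) -> int:
    # Old-style int callers still type-check; a static checker also
    # insists both branches of the union are handled here, so neither
    # interface is silently dropped.
    if isinstance(user, int):
        return user
    return user.user_id
```

Backward compatibility is preserved without giving up the checking, which is
the point being made against the "types break all call sites" objection.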

> You were not reasoning correctly about your code. Proper testing should have
> been your safety net,

Except you missed the relevant test case, whereas a tool like QuickCheck would
have had a better shot at discovering a corner case that humans couldn't have
anticipated.

> It sounds to me like you got started in Python, and are identifying
> beginner's mistakes with the language itself.

I'm not sure what you're referring to. The code I was working with was written
by a mix of many Python developers. Some were core committers to the Python
language itself; some were data analysts who didn't want to be programming.

I can say that I haven't had significant front-end experience in Python. But
I've touched a lot of most other major areas, particularly in very low-level
NumPy code, LLVM stuff with both Numba and llvmlite, pandas, Excel tools, and
many different database technologies and ORMs.

I will say though, that in the projects where we switched from pure Python
over to statically-typed Cython, it cleared up tons and tons of our issues,
many of them almost over night.

Rather than me finding beginner mistakes in Python, it seems to me like you
worked on one single system that suffered a lot of issues with backward
compatibility, and you're generalizing that backward compatibility experience
to other areas where you're less familiar (like solving the same backward
compatibility stuff in Haskell).

