Curry-Howard, the Ontological Ultimate (psnively.github.io)
75 points by kryptiskt on Oct 15, 2014 | 80 comments



This is very vague about what Curry-Howard means. Curry-Howard means that any type can be interpreted as a theorem in some logical system, and any term can be interpreted as a proof of its type.

This does not mean that those theorems have anything to do with your program. Take the following function:

    swap : forall a,b. (a,b) -> (b,a)
    swap pair = (snd pair, fst pair)
The type here is forall a,b. (a,b) -> (b,a). The logical meaning of this type is (a and b) => (b and a). Note that this is a theorem in logic, not a theorem about your program.

This is entirely different from what the types say about your program. Those properties come from the progress/preservation proofs of the type system. Now, it is certainly the case that in dependently typed languages you can use the logic you get from Curry-Howard to express theorems about values in your program. That's the whole point of dependent typing. But it's important to keep in mind that statements such as this:

> Look again at the OCaml Set module example above. You already get a universally-quantified property just from using the type system to guarantee an invariant expressed in an ADT. That’s the import of the Curry-Howard Isomorphism, minus any appeals to more expressively powerful type systems.

are nonsense. The fact that the module enforces the property

    P(n) === S1.mem n (S1.singleton n) = true
    Q(n) === forall m \in int. m /= n -> S1.mem m (S1.singleton n) = false
has nothing to do with Curry-Howard. Which proves Kell's point about the equivocation around type theory.
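
For concreteness, here is the same pair of properties checked at the value level, as a sketch using Haskell's Data.Set in place of the OCaml Set module from the post:

    import qualified Data.Set as Set

    -- P(3): the inserted element is a member of the singleton
    p3 :: Bool
    p3 = Set.member (3 :: Int) (Set.singleton 3)         -- True

    -- one instance of Q(3): any other element is not a member
    q3 :: Bool
    q3 = not (Set.member 4 (Set.singleton (3 :: Int)))   -- True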


> You express your specification as types (theorems), and your implementation proves those theorems. Then your code is correct by construction. This is a true statement; it isn’t open to debate, not being a matter of opinion.

That's technically correct, but a little bit divorced from reality. Let's say I want to write a function that counts the primes below one million. Since that's a function with no arguments that returns a single integer, the corresponding theorem by Curry-Howard is that "at least one integer exists". The function "return 0" is a proof of that theorem. That doesn't seem too useful!
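
To make that concrete, here is a sketch in Haskell (illustrative definitions): both of them inhabit the same simple type, so both "prove" the same Curry-Howard theorem, even though only one computes the intended count.

    -- the intended program
    countPrimesBelowMillion :: Int
    countPrimesBelowMillion = length [p | p <- [2 .. 999999], isPrime p]
      where
        isPrime n = all (\d -> n `mod` d /= 0)
                        [2 .. floor (sqrt (fromIntegral n :: Double))]

    -- an equally valid "proof" that an Int exists
    bogusCount :: Int
    bogusCount = 0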

You might argue that we should use a richer type system to express the specification. Sure! We just need to express the idea of "primes below one million" in the type system. But that's at least as difficult and bug-prone as writing the program in the first place, so I have a hard time seeing the benefit.

Most real world functions are the same way. Say you're programming a game and you need a function to fire the gun. Play the sound, add some screenshake, create a bullet object, etc. Go ahead, express that as a theorem. Don't forget to come back and tell us what you gained by doing that.


>You might argue that we should use a richer type system to express the specification. Sure! We just need to express the idea of "primes below one million" in the type system. But that's at least as difficult and bug-prone as writing the program in the first place, so I have a hard time seeing the benefit.

   Theorem primes_below_million_spec: forall l m,
      (forall p, In p l -> p < 1000000 /\ ~exists n, n <> p /\ n <> 1 /\ Divisible p n) ->
      length l = m ->
      primes_below_million = m.
Done.


I think that's missing an implementation of the function that matches the type. Is that also easy?


That's not so easy. That's the point, right? It's easier to write a specification than an implementation, so it's valuable to check that the implementation satisfies the spec.


Well if I wasn't so lazy, it would be easy to implement a Sieve of Eratosthenes algorithm yielding a list of primes up to some number, and then another function feeding 1000000 to that function and returning the length of the resulting list. The hard part would be actually proving the theorem I stated in terms of those two functions.
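
For what it's worth, a sketch of those two easy pieces in Haskell (using the classic naive "sieve" by trial division rather than a true Sieve of Eratosthenes; the Coq proof of the theorem above about them is the part that stays hard):

    primesUpTo :: Int -> [Int]
    primesUpTo n = sieve [2 .. n]
      where
        sieve []       = []
        sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

    primesBelowMillion :: Int
    primesBelowMillion = length (primesUpTo 999999)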


Let's go further. If I'm writing, say, numerical calculations, types can save me from getting kilometers when I expect miles (or, worse, kilograms). What it can't do is save me from numerical instability, round off errors, and a whole host of other problems known to the field of numerical analysis.

If my code is calculating the angle of a control surface on a plane, the type system is not going to say anything about whether I get 40 results every second. Without those constant, timely updates, the plane is going to crash.

A robust type system means that "your code is correct by construction"? That seems incredibly naive. There are more ways for programs to be wrong than are provable via a type system.


Concerning 'numerical instability': with a sufficiently advanced type system you should be able to save yourself from something like numerical instability. You could e.g. ensure all numbers have an 'order of magnitude type' and define operations between 'order of magnitude' types to raise errors for certain (ranges of) combinations. However, there is (AFAIK) no convenient way to express this in any language, let alone without performance suffering greatly.
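
As a toy sketch of that idea in Haskell (names are illustrative, and this captures only a sliver of it: the Mag type below just refuses to add numbers tagged with different orders of magnitude):

    {-# LANGUAGE DataKinds, KindSignatures #-}
    import GHC.TypeLits (Nat)

    -- a Double tagged with a (rough) order of magnitude
    newtype Mag (e :: Nat) = Mag Double deriving Show

    -- addition is only allowed within the same magnitude,
    -- where absorption/cancellation is less of a worry
    addSameMag :: Mag e -> Mag e -> Mag e
    addSameMag (Mag x) (Mag y) = Mag (x + y)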


Um, maybe you missed the point about dependent types? You can express, as a type, a function that takes no arguments and returns the primes below 1,000,000! That's kind of the point of the talk, and it is supposed to be possible in languages like Agda and Idris (I don't know enough about them to give any syntax here). See jules' comment below about how Curry-Howard does not mean your implementation is a full proof as well.

Considering that code which flies planes and fires guns is generally written with proofs about the control systems (if not required by law to be), your examples are a bit weak.


The constructive nature of Idris makes it hard (at least for me) to express a prime under 1,000,000 as a type. For example, it's trivial to express a composite as a GADT:

  data Composite : Nat -> Type where
    factors : (x : Nat) -> (y : Nat) -> Composite ((S (S x)) * (S (S y)))
but there is no analogous construction of a prime type. Luckily most day to day programming looks more like the composite case than the prime case.


As an example, here's how primality is defined in the Agda standard library. It is essentially the requirement of being at least 2 combined with the negation of "has a nontrivial divisor".

http://agda.github.io/agda-stdlib/html/Data.Nat.Primality.ht...


It seems to me that using dependent types to express the type "the number of primes below one million" would take at least as much code, and be at least as error-prone, as writing that function in a dynamically typed language directly. In fact I'd be surprised if the dependently typed version were less than 2x the size of the direct version.


This is why they brought up the language that uses the same syntax for types as it does for your function definition.

It really sounds like you haven't played with Coq or Agda, because expressing these things is not that much code at all. Also, the presentation literally covers that complaint, and how they think more programming languages are needed.


But the "proofs" of those control systems isn't anything that looks like a type system, is it? (An argument of the form "due to Curry-Howard, all proofs are type systems" will be regarded as uninteresting at best.)


Absolutely not, at least not to me. I'm not a huge type theory guy or anything.

But the claim that "proofs are a burden, especially for things with side effects like shooting a gun" runs into the fact that systems which shoot guns do have proofs, quite apart from their type systems. That invalidates the example, if not the argument.

The presenters certainly think these languages will help construct those proofs.. that I'm less certain of.


  Again, how he gets this from Amanda’s and my presentation is beyond 
  me
Yes, I don't doubt it is beyond him. Yet he is one of only two very smart people with very interesting things to say who have managed to get themselves removed from my various feeds, because I just couldn't stand their idolatry of static typing and never-ending condescension toward anything remotely critical of it anymore.

I haven't seen the presentation, but I don't for a second doubt they implied that:

  If somebody stays away from typed languages, it [means]
  they’re stupid, or lack education, or are afraid of maths. There 
  are [no] valid practical reasons for staying away [from type 
  languages]. 
He believes they were being very constructive, showing an understanding of other opinions and taking a nuanced view of static typing, with an eye for valid criticisms. I know, from the online reviews, that many did not experience the talk that way.


Reading this debate has been an exercise in sifting valuable commentary from a bunch of opinionated drivel, on both sides. In particular, the current article suffers from the author assuming that every single comment is addressed to his presentation, despite it getting a very brief mention near the beginning. The 'seven deadly sins' article had a better tone, but still spent most of its time on non sequiturs and anti-intellectual cheap shots.

I'm quite interested in type systems (and the lack thereof) but the level of debate around them is astonishingly poor.


It's also odd because Kell only specifically accuses the Snively/Laucher talk of committing two of the "seven deadly sins", and that isn't one of the two.


I think he took Stephen Kell's post as a direct response to their talk when in fact Stephen Kell's post was about the entire culture around static typing and their talk is only one small piece of it.


Yes, and the interesting thing is to me this exactly demonstrates the problem that is being pointed out by Kell (for the community in general) and by me (for Snively in particular).

What I feel this blog post demonstrates is that if you are somewhat critical of something Snively says, you are met with a barrage of arguments that counter everything you said, in every way he could possibly find to disagree. That motivation will always lead to some ridiculous and only remotely relevant disagreements, which are now being pointed out here and over on Reddit.

This blog post feels like it was written in anger: the enemy must not just be refuted on some points: he must be destroyed. His credibility must be tarnished by showing he didn't get anything right.

I think I understand this, because I used to do this. I stopped because I wondered why I was so often getting into ridiculous, irrelevant, arguments about things that I didn't even care about but that just came up in a discussion. Proving the other wrong in any way became more important than finding an answer to the question before us.


I've been guilty of this too. The funny thing is, if you actually want to win an argument, that's exactly what you shouldn't do. If you respond to everything the other person said, and refute each of his points, then inevitably somebody will come along, take one of your refutations of an unimportant issue, and point out a factual error in it. Even if you are 100% correct, somebody will come along and misinterpret one of your sentences. At this point you've lost the debate, because now the entire debate is focused on that small point you got wrong, and it can be difficult to steer it back to the more important issues.

No, if you want to win an argument you should do exactly the opposite. You should find one weakness in your opponent's argument, and then make a bulletproof refutation of that. You may think that because he made other points that are wrong it's better to point out all the ways in which he is wrong, but that's a bad strategy. Inevitably you will be wrong or misinterpretable in one of your many refutations and then you've lost.

Both of these strategies are of course detrimental to having a productive debate ;-)


Can you recommend some even-handed reading on the opposing views here?


This comment by Matthias Felleisen is a very polite and well-considered position from an academic on the dynamically-typed side of the spectrum: http://existentialtype.wordpress.com/2014/04/21/bellman-conf...

Harper himself can't be relied on to be so cordial, unfortunately.


Speaking of the culture around static typing: Bob Harper's blog, where Matthias Felleisen gets 15 downvotes per courteous, thoughtful response...


Seems like voting has now been turned off, or at least I don't see any votes any more.


I'm not sure that there are any opposing views here. I think that once you filter out the actual arguments, they are mostly independent of each other. Which opposing views did you take away from these two blog posts?


Curry-Howard is central to how you think about programming vs. Curry-Howard is almost completely irrelevant to how you program?

But maybe that disagreement goes away if I restate it slightly: Curry-Howard is central to how you think about programming vs. Curry-Howard is almost completely irrelevant to how you actually write programs. That is, if I am actually in the process of writing a program, I don't think about Curry-Howard very much, even if I'm working with the type system in Haskell. If I think about programming at a highly abstract level, then, OK, Curry-Howard might be considered a central insight.


The talk he's defending was tiresome and dripping with condescension (I agree with Stephen Diehl: https://twitter.com/smdiehl/status/519160854517149696). Typed-language folks don't need advocates who waste breath talking about "incompetent Ruby developers moving to Clojure" and whatnot. There's enough positive stuff to talk about! I think the folks who gave that talk need to read Gershom Bazerman's excellent post here (he has no trouble giving talks filled with excitement about type- and category-theoretic concepts, with no condescension): http://comonad.com/reader/2014/letter-to-a-young-haskell-ent...


It seems oddly defensive for Paul Snively to interpret Stephen Kell's "Seven Sins" essay as being mostly about his (and Amanda Laucher's) Strange Loop talk – sure, their talk was mentioned as an example in passing (along with another talk about Rust), but in no way is the essay specifically about their talk. The seven sins essay addresses the entire discourse about static and dynamic typing; whether one particular talk exhibits a particular "sin" or not doesn't mean there's not a real problem.


>>In a typed language, only polymorphism which can be proved correct is admissible. In an untyped language, arbitrarily complex polymorphism is expressible.

>This misses the point, which is that “arbitrary” has proven to be unlikely to be correct. Ultimately, this argument comes down to “you should trust human programmer intuition,” which I gladly admit to being an assumption I’m unwilling to make.

It's true that “arbitrary” has proven to be unlikely to be correct in the general case.

However, allowing untyped types(?) has also proven to be useful. I think this is because there is yet to be a language that makes expressing complex types easy.

We work in an environment where FizzBuzz is considered a good way of filtering people who can program.

I'm generally a proponent of "strongly typed" languages. But I do worry about the cognition overhead the syntax causes, and I do wonder if there is a "better way".


> I'm generally a proponent of "strongly typed" languages. But I do worry about the cognition overhead the syntax causes, and I do wonder if there is a "better way".

What about a dynamic language + runtime type checking + (nearly) 100% test coverage? You get the flexibility of a dynamic language and the safety of a statically typed language (in fact you get more, since runtime type checking is more powerful).


100% test coverage means that every line of code is executed by the test suite. This is probably a good thing.

Yet it does not mean that the code is correct for any definition of "correct" other than "passes the associated test suite." One unit test makes a line of code covered. It does not make the domain of inputs covered - 100% coverage does not mean that corner cases have been handled correctly - or that they have been considered at all. Again, all it means is that every line of code was executed at least once.

There are useless tests:

     def test167():
         # the line calling foo(8) is now "covered", but only for one input
         if foo(8) == val:
             return "passed"
         return "failed"


This of course is true. But some test coverage tools will give you path coverage, which helps with that.

Of course, this doesn't really substitute for a type system either.


It can be very costly to achieve 100% test coverage and static verification can reduce this cost by giving you some tests "for free". For example, you never need to write a test to check for null-pointer-exceptions if you code in Haskell.
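
As a tiny sketch of that: in Haskell, the possibility of absence is part of the type, so callers are forced to deal with it rather than a test suite having to catch the missing null check.

    import qualified Data.Map as Map

    -- the Maybe in the result type makes "might be missing" explicit;
    -- callers must handle Nothing instead of risking a null dereference
    lookupAge :: Map.Map String Int -> String -> Maybe Int
    lookupAge ages name = Map.lookup name ages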

Static types also help in situations other than testing. For example, you can use "guardrail programming" to be much more liberal when refactoring things: want to change a function signature? Just change it and the compiler will point out what other parts of the code need to be updated. This is even if you haven't written a test suite yet or are doing big changes that invalidate the old test suite.

That said, I think the perfect world would be one that lets you mix dynamic and static typing in a single program (some things are hard to verify statically, but would be nice if we could at least express them as types) but I think we are still a way off that being a practical alternative.


I know in C# they have the dynamic keyword which allows for dynamic typing when you need it, but lets you use static typing when you don't.


Yes, C# and lots of other languages have that feature. Where it gets really tricky is when you want to dynamically check the type of functions or objects, since you need to run the assertions when the function/method gets called rather than when it gets assigned to some reference. It's also tricky to give a parametric type to code that is implemented with dynamic types.


I'm not convinced that requiring test coverage is a good trade off.

Don't misunderstand me: tests are good. But language syntax that makes some bugs impossible is also good.


The cognition overhead of 100% test coverage is probably absurd, as well.


Probably you are right, but for a library I think it should be an obligation to provide 100% coverage. For a generic program, maybe I should provide 100% coverage of at least the "critical paths".


Might also be worth it to emphasize the "intended paths."

Of course, that is probably just saying that there should be documentation.


Especially given that you can't reasonably enforce the rigor of your test suite by way of your language.


Oddly, you could, in a way. Just treat your language and compiler as a tool in the toolchain. Add in a testing segment and you can set rules on it.


I still don't understand the notion that dynamic types give you anything more than static types. Are there really useful programs that can't be written in, say, Haskell?


IMO, it's not just about what useful programs you can write. It's also about how much of the type system you need to learn to get the program to compile. For example:

    function foo(f){
        return [f(["1", "2", "3"]),  f([1,2,3])];
    }

    function length(xs){
        return xs.length;
    }

    foo(length);
Anyone can see what this Javascript program is doing, but to write the Haskell equivalent you would need to understand higher-rank types and enable the corresponding language extension, which gets in the way of beginners.
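
For illustration, a sketch of that Haskell equivalent (RankNTypes is the extension meant; foo's argument has to be polymorphic itself):

    {-# LANGUAGE RankNTypes #-}

    foo :: (forall a. [a] -> Int) -> (Int, Int)
    foo f = (f ["1", "2", "3"], f [1, 2, 3 :: Int])

    len :: [a] -> Int
    len = length

    main :: IO ()
    main = print (foo len)   -- (3,3)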


Unfortunately I do not know Haskell well enough to give you an answer. I was referring to this:

> I think this is because there is yet to be a language that makes expressing complex types easy.

Since I'm obliged to use a dynamic language (JavaScript), I'm looking for a solution in it. And in JavaScript it is really easy to express a complex type, say "the set of all the even integers between 2 and 14 but 10", using runtime checking to enforce the constraints. Is it simple in Haskell to define such a set in its type system?


Simple? Yes. Pleasant? Not really. I'm not a Haskell expert but I can write that type in Scala (the Haskell version probably has less boilerplate, but still some):

    sealed trait Even[N <: Nat]
    object Even {
      implicit object zero extends Even[_0]
      implicit def evenSS[N <: Nat](implicit e: Even[N]): Even[S[S[N]]] = new Even[S[S[N]]]{}
    }
    sealed trait EvenAndBetweenTwoAndFourteenAndNotTen[N <: Nat]
    object EvenAndBetweenTwoAndFourteenAndNotTen {
      implicit def forN[N <: Nat](implicit e: Even[N], le2: Le[_2, N], le14: Le[N, _14]) =
        new EvenAndBetweenTwoAndFourteenAndNotTen[N]{}
      // deliberately ambiguous second instance at 10: implicit resolution fails there, excluding 10
      implicit object interfere10 extends EvenAndBetweenTwoAndFourteenAndNotTen[_10]
    }
Like I said, it's cumbersome, but there's nothing actually complicated about writing this, it's just boilerplate to represent it in the type system.

On the other hand, in Idris or the other languages mentioned in the article, you should be able to write that kind of constraint at the type level just as easily as you can at the value level. That's where it gets really exciting.


In javascript, any method that relies on receiving a value of "the set of all the even integers between 2 and 14 but 10" should probably check that it did in fact receive such a thing or it might be buggy. In a statically typed language, that check can easily be centralized to wherever an `EvensFrom2To14Without10` value is constructed.

Here's an example in rust[0]. If the `print_bare` method relies on only receiving even numbers between 2 and 14 excluding 10, then it is buggy, because it isn't checking its input, but the `print_tagged` method is correct without needing to check!

[0]: http://is.gd/9uHrc7


That's not really expressing a complex type in JavaScript, it's expressing a complex (such as it is) runtime constraint -- which you can do easily in Haskell as well. JS doesn't give you anything Haskell doesn't there.

It's true that making that a proper, statically-enforced type in Haskell takes some complexity, but if you aren't willing to do that, you could make it a runtime-enforced constraint on a statically-enforced type of integers just as easily in Haskell as, in JavaScript, you could make it a runtime-enforced constraint on a value that is not statically enforced to be anything more specific than a valid JS value.
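
A sketch of that runtime-enforced version in Haskell (names are illustrative), using a smart constructor so the check is centralized at construction time:

    newtype Evens2To14No10 = Evens2To14No10 Int deriving Show

    mkEvens2To14No10 :: Int -> Maybe Evens2To14No10
    mkEvens2To14No10 n
      | even n && n >= 2 && n <= 14 && n /= 10 = Just (Evens2To14No10 n)
      | otherwise                              = Nothing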


I do not know Haskell well enough either. But I didn't really mean 'using the type system everywhere'. About your question, from what I've seen you can always go symbolic: define an abstract thing such as twofourteenexceptten that represents that idea and use it. Or just use a filter everywhere to ensure that a function requiring a 2and14but10 won't be called on something with other numbers. In that case you're not losing anything compared to a dynamic language.


Even Haskell has some kind of dynamic types.


Well, does it fucking not have dynamic types? Make an argument instead of just downvoting.

http://hackage.haskell.org/package/base-4.7.0.1/docs/Data-Dy...
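
A minimal sketch of what Data.Dynamic gives you (a value packaged with a runtime representation of its type):

    import Data.Dynamic

    example :: Maybe Int
    example = fromDynamic (toDyn (42 :: Int))    -- Just 42

    mismatch :: Maybe String
    mismatch = fromDynamic (toDyn (42 :: Int))   -- Nothing: the runtime type check fails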


"But I do worry about the cognition overhead the syntax causes, and I do wonder if there is a "better way"."

I won't claim this is the full answer, but part of the better way is to simply cut away the syntax, revealing it to have been accidental complexity all along, and never a fundamental part of static typing. Even just working in a language with "auto" or "var" or whatever local construct creates a variable based on the type of the expression on the right without you having to puzzle it out yourself makes static typing go from an enormous pain to barely more work than dynamic typing.

I learned C++/Java in school in the 90s and spent most of the 200Xs in the dynamic world precisely because the paperwork of the static languages of the time was absurd. However, working with Go or a modern C# is totally different. (I haven't done C++ with "auto", so I can't add it myself, but others can speak to that.) In my opinion it makes static typing easy enough that it tips the balance back in favor of it; you put this tiny effort in and you get a ton back out.

Go's structural typing-ish automatically-satisfied interface is even easier; do you have a concrete class with some method you like, but need to be able to switch out the actual object with something else? Just declare the relevant interface, done. You can even after-the-fact "duck type" standard library code that you shouldn't directly modify under pretty much any circumstances, with no hassle. The costs of the static type system are almost (but not quite!) eliminated, and in return you get, well, static typing. Weak by P. Snively's standards, sure, but still better than nothing!

Haskell, of course, well, you fight with the static typer for a few weeks at first, but once you get good at Haskell (and I don't even mean "great", just good) you eventually get to the point where you discover the static type errors are fundamentally correct and trying to tell you something important, at which point you stop hating it for telling you how wrong you are and start loving it for telling you about a problem long before any other language would have. Personally I think everyone who considers themselves a professional programmer should clock at least six months with Haskell, not for any of the reasons you usually hear, but because it will teach you the rare skill of listening to your code. Static type systems, even the simpler ones such as Go or Java, aren't just slapping your hands because you've failed to satisfy their disconnected, academic ideals... they are saying things, and not just things like "you screwed up this method call" but very often things like "this design is fundamentally flawed, you've failed to account for how this value needs to be communicated to that function". It's important to get to the point that you understand this feedback as more than just the computer just being a computer.

This stuff often ends up explained in very academic language, but I have found it to be a brutally, brutally practical skill. And I write "real code" for things that do things on networks at scale, in the cloud, etc.


The nice thing about dynamic types is that they give you the expressive power to code things that would not be allowed (or at least not encouraged) in your type system, even though they might be allowed in a different type system. Back in the day, one notable example of this was parametric polymorphism: it comes "for free" in even the simplest dynamic language but took a while to become mainstream in static languages.

IMO, in an ideal world you would be able to choose between dynamic and static verification of types depending on what your program is doing. This might even let us work with richer types than we do now! For example, most typed languages still do dynamic tests for array indexes, because doing those statically requires dependent types (which are hard to use and not always worth the trouble). But if we had a language that lets us talk about array lengths in the type system, we would at least be able to give better error messages if something goes wrong, even if we only find out about it at runtime.
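
As a rough sketch of what "talking about array lengths in the type system" can look like (length-indexed vectors, in Haskell with a few extensions):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Z | S Nat

    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- taking the head of an empty vector is now a compile-time error,
    -- not a runtime bounds failure
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x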


I understand everything you are saying (didn't I make that clear in my original comment?). But I think you missed the point: I'm not arguing type systems aren't good. I think they are great. (re-reading - maybe I'm unfair on this. Your point about modern languages like Go and C# being better is a fair one)

I'm arguing that the single-dimensional view the original article had on the speed of implementation vs correctness lacked some useful counter-perspectives. In fact, I think it was a pretty cheap shot, since it didn't recognize it as a critical tradeoff at all.

> Go's structural typing-ish automatically-satisfied interface is even easier

Yes, I think the Go type system has some nice features.

> Haskell, of course, well, you fight with the static typer for a few weeks at first

And that's exactly the cognition overhead I'm talking about. Some productive programmers barely understand an if statement inside a loop. Using recursion is a doubtful prospect.

I think the idea of people like that fighting a type system for weeks because it's good for them is completely unrealistic.

Note I'm not arguing that it isn't good for them: of course it is. And of course a type system "is telling you things". But people are writing real programs quicker without it.

I do like optional type systems. I think that some combination of optional type systems, test systems and tooling integration could lead to a system than slowly adds types to improve the reliability of programs, without the cognitive load that the traditional upfront typing workflow imposes.

But I'm just speculating about that.

I do know that untyped languages are often more useful than typed languages precisely because of the increased difficulty of programming that type safety brings. I think that tradeoff is disappointing.


> This stuff often ends up explained in very academic language,

What, that software should be correct according to the specification (either a formal one or an informal and implicit one)? Because that's the academic motivation I've found for things like type systems and program specification.


I mean like "Curry-Howard isomorphism" and "state space" and "monad" and such. It makes it very easy to dismiss as academic wankery, which is helped by the fact that, well, yeah, some of it is, honestly. But under that academic wankery is some brilliantly practical stuff that is hard to learn from the communities that are over-focused on "practicality".


> I mean like "Curry-Howard isomorphism" and "state space" and "monad" and such.

What's the problem with those?


Regarding the referenced talk at Strange Loop, I have noticed that I have yet to see a good talk ever done in that style where dual-presenters oscillate back & forth. Especially when they're trying to elicit laughter. I did manage to enjoy the talk between cringeworthy moments, at least.

There is an interesting slide at around 21:00 where they show four quadrants that represent:

What do you get when you pass a Term to a Term? (a function)

What do you get when you pass a Type to a Type? (templates, generics, type variables)

What do you get when you pass a Term to a Type? (dependent types)

What do you get when you pass a Type to a Term? (inheritance, overloading)

It was an interesting slide. It manages to make dependent types seem like an essential missing piece for the sake of symmetry. I'm not sure I understand the last one, though: how is inheritance and/or overloading the act of passing a type to a term?

Does anybody know if this idea is explained better elsewhere?


That slide was a simplification of the Lambda Cube that they showed right at the end. That's probably a good starting point for better references: http://en.wikipedia.org/wiki/Lambda_cube ... the wiki page isn't particularly approachable, but there's more detail if you follow the links.
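
For a rough feel of three of those corners in ordinary Haskell (the fourth, terms appearing in types, is exactly what dependent types add; reading "type passed to a term" as instantiating a polymorphic term is one common interpretation, and the names below are just for illustration):

    {-# LANGUAGE TypeApplications #-}

    -- term -> term: an ordinary function
    double :: Int -> Int
    double x = x * 2

    -- type -> type: a type constructor applied to a type
    type IntPair = (,) Int        -- IntPair Bool ~ (Int, Bool)

    -- type -> term: a polymorphic term instantiated at a chosen type
    intId :: Int -> Int
    intId = id @Int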


One problem with this talk (and many other talks about static typing) is that it seems to take at faith value the common claim that static typing is a net benefit for program correctness because it prevents programs from compiling that would otherwise create runtime errors (e.g., at around the 13 minute mark and again after the 18 minute mark). [1]

This seems like it's just common sense, but common sense claims do not always hold up in science and (as with many other claims regarding software engineering benefits that presumably derive from this or that language feature), it has rarely been tested.

This is not surprising: most computer science departments don't have the infrastructure to do work that involves human test subjects (unlike, say, psychology departments). Not only does an ethics review board have to sign off on such tests (no matter how harmless they may seem), creating meaningful tests involving non-trivial programming tasks can be pretty involved (in a talk, the author of the first paper I discuss below mentioned that one problem with the study was that it had been very expensive).

But at least some such tests have been done, and the conclusions are not necessarily what common sense would tell us.

Consider for example, the following series of papers:

Hanenberg, S. An Experiment About Static and Dynamic Type Systems - Doubts About the Positive Impact of Static Type Systems on Development Time. In Proceedings of OOPSLA/SPLASH 2010. ACM, Reno 2010. [2]

From the conclusion: "In this paper we presented results of an experiment that explores the impact of static and dynamic type systems on the development of a piece of software (a parser) with 49 subjects. We measured two different points in the development: first, the development time until a minimal scanner has been implemented, and second the quality of the resulting test cases fulfilled by the parser. In non of these measured points the use of the static type system turned out to have a significant positive impact. In the first case, the use of the statically typed programming language had a significant negative impact, in the latter one, no significant difference could be measured."

Now, that is one data point that only relates to the initial development of a piece of software, too (and as we'll see in a moment, it's not as though dynamic typing is a free lunch, once we look at maintenance tasks), but it does contradict common sense: The defect rate of the statically typed software is not actually better (at least not in a statistically significant way). Earlier studies had also shown that statically typed languages (unsurprisingly) did have benefits over languages that performed neither compile-time nor runtime type checks (such as ANSI C vs. K&R C w.r.t. function arguments).

Now, a follow-up paper:

Kleinschmager, S.; Hanenberg, S.; Robbes, R.; Tanter, É. & Stefik, A. Do static type systems improve the maintainability of software systems? An empirical study. IEEE 20th International Conference on Program Comprehension, ICPC 2012, Passau, Germany 2012, S. 153-162. [3]

Three conclusions from this paper on the effect of static type systems on software maintenance: (1) "Static type systems help humans use a new set of classes." (2) "Static type systems make it easier for humans to fix type errors." (3) "For fixing semantic errors, [...] no differences with respect to human development times [was observed]."

The paper also references a number of other studies with related results that may be interesting.

A second follow-up paper was the following:

Spiza, S. & Hanenberg, S. Type Names without Static Type Checking already Improve the Usability of APIs (As Long as the Type Names are Correct) - An Empirical Study. Proceedings of the ACM Conference on Aspect-Oriented Software Development, AOSD '14, Lugano, Switzerland 2014. [4]

From the abstract: "This paper describes an experiment with 20 participants that has been performed in order to check whether developers using an unknown API already benefit (in terms of development time) from the pure syntactical representation of type names without static type checking. The result of the study is that developers do benefit from the type names in an API’s source code. But already a single wrong type name has a measurable significant negative impact on the development time in comparison to APIs without type names."

What we're seeing is that while there do seem to be measurable benefits from static typing, the actual nature of these benefits may not be what one may expect on the basis of common sense alone (it's actually even more complicated than I can convey in this brief summary).

[1] To be clear, the talk makes many other immensely sensible points, and I don't think there's even enough material to conclusively say this one is right or wrong; my point is that it's essentially unproven either way, and that there is evidence that our intuition may be wrong here.

[2] https://courses.cs.washington.edu/courses/cse590n/10au/hanen...

[3] http://pleiad.dcc.uchile.cl/papers/2012/kleinschmagerAl-icpc...

[4] http://dl.acm.org/citation.cfm?id=2577098


Just took a quick look at paper #2, which is quite fascinating -- to try and isolate the effects of static / dynamic typing, they developed two slightly different programming languages which varied along only this axis.

I have some sympathy with this approach, but I'm not convinced you can actually separate things in this way. The type systems I actually find useful are not just a static analysis layer, but have significant impacts on the semantics of the language and the way libraries are designed. I'm not sure you could just 'turn off' the type system in Haskell and have anything even remotely sensible, for example.

I sometimes wonder if there's enough mature open-source projects in the wild now to be able to draw some statistical conclusions about the long-term effects of language choice, but that would be quite a project...


> I'm not sure you could just 'turn off' the type system in Haskell and have anything even remotely sensible, for example.

You can actually do exactly this:

https://ghc.haskell.org/trac/ghc/wiki/DeferErrorsToRuntime

Of course, the type system doesn't vanish, but checking types is deferred until runtime, turning Haskell into a dynamic language with an unusually elaborate tag system.
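
A minimal sketch of what that looks like in practice:

    {-# OPTIONS_GHC -fdefer-type-errors #-}

    -- this ill-typed binding now compiles with a warning instead of an error;
    -- the type error is raised only if `oops` is actually evaluated at run time
    oops :: Int
    oops = "not an Int"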


It's still not comparable to a dynamic language however. Simply constructing a list [1, "Hello"] will already produce an error. Dynamic languages usually allow this, and only signal errors when the values are used in an erroneous manner, like 1 + "Hello". It does exactly what it says on the tin: it displays type errors that were found at compile time with the same type system that regular Haskell uses, but the actual displaying is deferred until run time. There is no actual tag checking going on at run-time.


It does make Haskell a dynamic language, but it also isn't really what's meant. Generally, when people compare statically typed languages and dynamically typed languages, what they are doing is comparing the various ways in which the type system imposes restrictions.

The way in which Haskell differs most strongly and obviously from Ruby is not that it catches the type errors at compile time, but that the type errors can be caught at compile time. This in general allows smarter tools: compare Eclipse for Java, with its 'type hierarchy', 'find method calls', and many refactoring operations, against the far less rich tools available for Python.


Thanks so much for this point. I think the key here is conclusion (3) about semantic errors.

I remember endless discussions about the Liskov Substitution Principle and how it relates to subtyping/subclassing. I think what people easily forget is that static types do almost nothing for semantics, and the notion of "behavioral subtyping" is a lot more interesting but also a lot more elusive.

And I say all this as a lifelong static typer, by the way, and not even of the fancy Haskell variety. I know what not to expect from static type checking so I'm a lot happier with what it does provide me.


Before even loading up the papers I knew they'd be using some horseshit type system like Java's.


The programming tasks (a scanner and a parser) did not require complex data structures, so one would not expect Java's type system to be a limiting factor.


This comment alone is enough to make Haskellers (or other proponents of good type systems) not take this seriously.

Anything trying to cast broad strokes about the usefulness and effectiveness of type systems can't use a language with a type system like Java's.


And your comment is enough to make people not take you seriously. You sound like a stuck-up jerk rather than a thoughtful person.

Perhaps you could explain exactly what it is about Java's type system that makes it not worth even discussing. That would be a comment that was actually useful. (And don't say something like "because Java doesn't have higher-kinded types". We know that it doesn't. What about that lack makes it not even worth discussing?)


I stated the problem with citing that research when trying to appeal to people familiar with Haskell-like type systems (as in the linked blog post).

My real point is that you can't really draw strong conclusions from this research when talking about type systems in general. If you wanted to say "It doesn't seem like type systems like Java's improve productivity" then I wouldn't have any issues, and would probably even agree.

> You sound like a stuck-up jerk rather than a thoughtful person.

I will agree that I phrased what I said a little harshly, but damn.


OK, now you sound like a thoughtful person that I can actually have a conversation with. And I probably responded a bit overly harshly, too.

But I keep seeing this "Java's type system is essentially worthless" idea, tossed off glibly as if it were God's own truth, and I've never seen a justification for it. I get that it's not Hindley-Milner. I get that it doesn't have higher-kinded types. Why does that make Java's type system worthless?


As a type system, purely formally speaking, Java is... well, it's just kind of OK.

It has parametric polymorphism. It gets subtyping & co-/contra-variance right -- for generic types only, not arrays, and requires explicit annotations ("extends", "super"). It can express higher-order functions. Its primary formal weakness compared with, say, Haskell98 or SML, is the lack of higher-kinded types (ignoring for the moment the issue of Haskell-style typeclasses or ML-style modules & functors), which is a pretty big weakness but not a game-killer (see: Rust, Elm).

But pragmatically, it's a pain in the ass to work with. (Possibly this has improved since I worked with it. I hope it has!) It has a weird "everything must be an object" view of the world; it requires a file for every type I wish to define (barring nested classes, and anonymous classes - the latter of which is a notable but late improvement). Expressing HOFs is possible, but very unpleasant (AIUI it's gotten better lately). And just to top it all off, it doesn't have type inference, which just makes everything a chore.

The way I see it, Java is playing catch-up with the innovations introduced by more forward-thinking languages. It seems doomed to being forever a middle-of-the-road language. Which is... well, it's just kind of OK.

But it's not gonna inspire devotion, and people who worked with it when it really did suck (perhaps it still does, I dunno) are gonna remember that and hate it.


Thanks for the answer. But could you clarify something for me? What's the difference between higher-order functions and higher-kinded types?

And, yeah, Java's type system... I'm not going to claim enthusiasm, merely that it seemed adequate for what I wanted it to do.


> But could you clarify something for me? What's the difference between higher-order functions and higher-kinded types?

roughly, the relation is:

(type constructor):(simple function)::(higher-kinded type):(higher-order function)

That is, a higher-kinded type is a type which takes a type constructor as one of its parameters, in the same way that a higher-order function is a function that takes a function as one of its parameters.
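
A small Haskell sketch of the two sides of that analogy:

    -- higher-order function: a function that takes a function as an argument
    applyTwice :: (a -> a) -> a -> a
    applyTwice f x = f (f x)

    -- higher-kinded: Pair1 is parameterised over a type constructor f
    -- (of kind * -> *, e.g. Maybe or []), not over a plain type
    data Pair1 f a = Pair1 (f a) (f a)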


Obviously, it's not worthless, that's just hyperbole, but it's sad to see people who would probably be the first to say that they are reasonable, evidence-based, pragmatic computer engineers fall into the Manichaean trap that is extremism: either something sucks (Java) or it's awesome (Haskell).

The truth is that Java's type system is a bit in-between. It's a reasonable type system that's tractable for a huge portion of the developer community (hence its success) but which falls short of some required qualities to be able to show proofs by applying the Curry-Howard isomorphism.


Parsing, at least, absolutely can benefit from a powerful type system: consider Haskell's Parsec library, which uses monads & monad transformers to express a powerful parser combinator system.
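
For a flavour of it, a minimal Parsec sketch (combinators only; the monad/transformer machinery is where the type system does the heavier lifting):

    import Text.Parsec
    import Text.Parsec.String (Parser)

    integer :: Parser Int
    integer = read <$> many1 digit

    -- comma-separated integers, built from smaller parsers
    integers :: Parser [Int]
    integers = integer `sepBy` char ','

    -- parse integers "" "1,2,3"  ==>  Right [1,2,3]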


Writing a parser combinator library does not require a powerful type system, especially not a custom one for a specific task (the students who were doing the scanner/parser task were limited to an environment where they had no access to third-party libraries, not that any existed for the language they were given). For a simple parser combinator implementation from scratch you require nothing more than basic types and the type of whatever AST you're generating. Parametric polymorphism is useful when you want to write a reusable parser combinator library, something that wasn't necessary, but even then you don't really need more than that (see other existing parser combinator libraries).

Higher order functions can make your life easier, but note that the students had access to a Java-like language (of which there was both a statically and a dynamically typed version) with Ruby-style closures, so that wasn't an obstacle.


Fair point.


> Parsing, at least, absolutely can benefit from a powerful type system: consider Haskell's Parsec library, which uses monads & monad transformers to express a powerful parser combinator system

But AFAICT, you don't really need type-system support to use monads or monad transformers -- Haskell's purity, static typing, and use of monads for IO forces Haskell users to become familiar with monads, which may encourage Haskellers to develop the skill to use them for other purposes, but there is nothing stopping you from using monads and monad transformers in a dynamically typed language.


No, you don't. But the question at hand isn't using a dynamically vs. a statically typed language, it is, given that you are in a statically typed language, is a Java-style one sufficient?

Indeed, this is one of the benefits of dynamically typed languages: you're not restricted to any particular type discipline; you don't have to worry that the type system of the language you're in is too restrictive.

OTOH, if you want it to be easy to write reusable code, having static typeclasses with type inference can be quite useful. You could write a monad transformer stack in a dynamically typed language, but you can't always write code that's generic across any monad without explicitly passing around an object that represents the monad in question. (Racket's "units" might be a reasonable solution to this problem, though.)
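
A small sketch of that kind of reuse in Haskell, where the "object that represents the monad" is supplied implicitly by the type class machinery:

    import Control.Monad (replicateM)

    -- works for any monad: IO, lists, parsers, ...
    collectThree :: Monad m => m a -> m [a]
    collectThree = replicateM 3

    -- collectThree getLine :: IO [String]
    -- collectThree [0, 1]  :: [[Int]]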



