
Look at the humongous type that Hindley-Milner infers for this tiny program - luu
http://spacemanaki.com/blog/2014/08/04/Just-LOOK-at-the-humongous-type/
======
flebron
Perhaps an easier example is to do this in GHCi:

    
    
        let x = id id id id id id id id id id id id id id id id id id id id id id id id
    

See how long that takes (add ids if your machine can take it). To deduce a
type for x, GHCi will use 2^k type variables if you have k ids. This is
easily seen by induction:

If k = 1, then the type of x is a -> a. This takes 2 = 2^1 = 2^k variables. If
k > 1, it is of the form id f. Let t be the type of f. t has 2^{k - 1}
variables, by induction. Thus the outermost id must take something of type t
(this is 2^{k - 1} variables) and produce something of that same type t (this
is another 2^{k - 1} variables), that is, it's of type t -> t. 2^{k - 1} +
2^{k - 1} = 2^k.

Note that this information is purely at compile time. Haskell isn't typed at
runtime, this is all just to check types. At runtime, x is a no-op, as
expected.
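The doubling recurrence above is easy to sanity-check outside the compiler. A small Python sketch of the counting argument (just the recurrence from the induction, not GHCi's actual machinery):

```python
def type_var_count(k):
    """Occurrences of type variables in the type built for a chain
    of k ids, following the induction above."""
    if k == 1:
        return 2                      # id : a -> a  (two occurrences)
    # With k > 1 the expression is `id f` where f has k - 1 ids; the
    # outermost id gets type t -> t, doubling the count for t.
    return 2 * type_var_count(k - 1)

print([type_var_count(k) for k in range(1, 6)])  # [2, 4, 8, 16, 32]
```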

~~~
seanmcdirmid
I used to be able to kill Scala compiler performance with just a few
identifiers. Well, Scala is trying to deal with nominal subtyping, which is
well known to be quite hard (so the type inference algorithm isn't really H-M,
and must rely on heuristics).

~~~
virtualwhys
> I used to be able to kill Scala compiler performance with just a few
> identifiers

Interesting, got an example? Also, what version of Scala were you on?

~~~
seanmcdirmid
This was in 2006/7 when I was working on scalac, so the early 2.1/2.2 days. I
seemed to often push the type system in places that were not well defined,
mostly because my scala style was so different (very OO, but with heavy
recursive use of type parameters). Unfortunately, I've forgotten what the
cases were, but they were usually simple constructions.

~~~
virtualwhys
heh, ok, the stone ages of Scala ;-)

Current Scala (2.11) has many issues to be ironed out, but I'd be surprised if
the topic at hand here were one of them.

------
seliopou
The title's a little misleading, as it suggests that there's something strange
or even wrong about the type that OCaml or Haskell infer for the program. Of
course the type that these systems infer for the program is correct, and the
author does demonstrate an understanding of that. But that a small input to a
system can produce a large output should be little surprise to any person that
works with complex systems, especially computer systems.

The underlying mechanism that leads to the computational complexity in HM(X)
systems--the process of generalizing types to schemes and instantiating
schemes to types in order to support let-polymorphism--would make for an
interesting discussion, but the author hasn't hit on that yet. Unfortunately,
the only people who really understand that part of the algorithm are the
people who have implemented it, and there aren't many of those people around.
Strange but true. The algorithm that's taught and implemented in most
undergraduate PL courses (PLAI-based courses included) does not support
let-polymorphism. If the author's reading: upgrade to ATTAPL[0] and start
implementing.

Also, to answer the question "How to compile Turing machines to ML types",
have a look at typo[1] for one way to do it in Haskell (by reducing it to
lambda calculus (shameless plug)).

[0]:
[http://www.cis.upenn.edu/~bcpierce/attapl/](http://www.cis.upenn.edu/~bcpierce/attapl/)

[1]: [https://github.com/seliopou/typo](https://github.com/seliopou/typo)

~~~
JadeNB
I think that this is my new favourite esolang. Type-level naturals
([http://www.haskell.org/haskellwiki/Type_arithmetic](http://www.haskell.org/haskellwiki/Type_arithmetic)),
eat your heart out! What was your inspiration for making this?

By the way, sorry for a "useless use of cat"
([http://en.wikipedia.org/wiki/Cat_%28Unix%29#Useless_use_of_c...](http://en.wikipedia.org/wiki/Cat_%28Unix%29#Useless_use_of_cat))
nit-pick, but note that

    
    
        $ cat examples/fac.typo | typo
    

should probably be

    
    
        $ typo < examples/fac.typo

~~~
seliopou
Thanks! Typo made it to HN last summer, so here's a response to your question
on motivation[0]. Also, the 'cat' thing has come up before. Here's my response
to
that[1] as well. But now that I'm rereading my response it doesn't seem clear
what I mean. The README for the language demonstrates how you can use shell
grouping as a poor man's module system. When you do that, you have to use
'cat' on single files in certain situations. So I stuck with 'cat' for the
simple example so it's consistent with the more complex examples. If you take
a look at the typo script[2] that wraps the compiler, you'll see that I know
when to use redirects and when to use 'cat'. But regardless of all that, I
think most modern machines can spare an extra process.

[0]:
[https://news.ycombinator.com/item?id=6176340](https://news.ycombinator.com/item?id=6176340)

[1]:
[https://github.com/seliopou/typo/pull/4](https://github.com/seliopou/typo/pull/4)

[2]:
[https://github.com/seliopou/typo/blob/cc64ec38603a50c6543b38...](https://github.com/seliopou/typo/blob/cc64ec38603a50c6543b3834d47b4ff0431c2b3e/typo#L14)

~~~
JadeNB
Thanks for the helpful reply. By the way, I didn't mean to suggest that you
didn't know when to use `cat` vs. re-directs; I just thought it might have
been an oversight. I'm sorry if it came across as rude.

------
throwaway_yy2Di
What about this example?

    
    
        f0 x = (x,x)
        f1 x = f0 (f0 x)
        f2 x = f1 (f1 x)
        f3 x = f2 (f2 x)
        f4 x = f3 (f3 x)
    

This is worse than O(c^n): the size of the last function's type is O(2^2^n).

(In spacemanaki's blog example, each new expression doubles the size of the
previous one. Here, each new expression _squares_ the size. f4 builds 2^2^4 =
65,536 copies of 'x', and f5 will crash your compiler if you define it).
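The squaring is easy to tabulate. A quick Python sketch of the size argument (counting copies of `x` in the fully expanded result, not simulating any real compiler):

```python
def copies_of_x(n):
    """Copies of x in the expanded result of f_n, where
    f0 x = (x, x) and f_n x = f_{n-1} (f_{n-1} x)."""
    if n == 0:
        return 2               # (x, x)
    c = copies_of_x(n - 1)     # f_{n-1} y holds c copies of y; feeding it
    return c * c               # f_{n-1} x (c copies of x) squares the size

print([copies_of_x(n) for n in range(5)])  # [2, 4, 16, 256, 65536]
```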

~~~
natte
O(2^2^n) = O(4^n) = O(c^n) ; O(2^3^n) = O(8^n) = O(c^n) ; ...

~~~
cousin_it
I think grandparent meant 2^(2^n), not (2^2)^n.

~~~
chengsun
Even so, O(4^n) != O(8^n); there is no constant k such that k * 4^n >= 8^n.
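A quick numeric check of that claim in plain Python (`n` is chosen, hypothetically, as the bit length of `k` so that 2**n exceeds k):

```python
# k * 4**n >= 8**n would require k >= 2**n, which fails once n is large
# enough, so no single constant k works for all n.
for k in (10, 1000, 10**9):
    n = k.bit_length()        # smallest n with 2**n > k (or close to it)
    assert k * 4**n < 8**n, (k, n)
print("no constant k suffices")
```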

------
munificent
This is a fantastic post. I'm passingly familiar with ML and have heard the
term "let polymorphism" a number of times, but this is the first time I've
seen it clearly explained.

~~~
spacemanaki
Thanks, that's very flattering, as I've been a long-time fan of your blog!

------
tomp
> Why can’t we just allow polymorphic lambda-bound variables?

A few reasons. The first is that, in general, type inference for higher-rank
polymorphism is undecidable. The second is that even if it were decidable, it
would be impractical - I think that if we have a program like this one:

    
    
      let test(f) = (f(1), f(true))
    

the assumption that the programmer made an error is statistically much more
likely than the assumption that `f` should have a polymorphic type. Another
reason is that it's not entirely clear what kind of polymorphic type to infer
for `f` - should it be `forall a. a -> a` or e.g. `forall a. a -> int`?

> What does this have to do with different “ranks” of polymorphism?

Higher-rank polymorphism means that polymorphic types can appear _within_
other types; in a system that supports higher-rank polymorphic types, the
parameter `f` in the function `test` above could be declared with the type
`forall a. a -> a`, and the function `test` would then have the type `(forall
a. a -> a) -> int * bool`. Higher-rank polymorphism is formalized using System
F, and there are a
few implementations of (incomplete, but decidable) type inference for it - see
e.g. Daan Leijen's research page [1] about it, or my experimental
implementation [2] of one of his papers. Higher-rank types also have some
limited support in OCaml and Haskell.

> How is let-polymorphism implemented? How do you implement it without just
> copying code around?

The "standard" implementation consists of two operations, _generalization_ and
_instantiation_. At let bindings, types are _generalized_ by replacing all
unbound type variables with polymorphic type variables (care must be taken not
to generalize type variables that can be bound later, but that's a secondary
issue here). Every time a variable with a polymorphic type is used, its type
is _instantiated_ , which means that all polymorphic type variables are
replaced with fresh unbound type variables.

    
    
      let f = fun x -> (x, x)
    
      print f(1), f("boo")
    

In the example above, the type inferred for `fun x -> (x, x)` is `_a -> _a *
_a`, where `_a` is an unbound (but not polymorphic) type variable. At let
binding, this type is transformed into `forall a. a -> a * a` by replacing
`_a` with a polymorphic type variable `a` (in most ML languages, the `forall`
is implicit). Then, when `f` is used in `f(1)`, its type is instantiated into
`_b -> _b * _b` and `_b` is unified with the type of `1`, `int`. When `f` is
used in `f("boo")`, its type is instantiated into `_c -> _c * _c`, and `_c` is
unified with type of `"boo"`, `string`. Since `_b` and `_c` are different
variables, there is no error here (if `f` had a monomorphic type, `_b` and
`_c` would in fact be the same, and this example would result in an error
"cannot unify type int with type string").
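The two operations can be sketched in a few lines of Python (a toy model with types as tuples, not any real checker, and ignoring the care needed to avoid generalizing variables bound in the environment, as noted above):

```python
import itertools

fresh = itertools.count()

def generalize(ty):
    """Turn unbound variables (strings starting with '_') into
    quantified ('poly', name) variables, yielding a type scheme."""
    if isinstance(ty, str) and ty.startswith('_'):
        return ('poly', ty)
    if isinstance(ty, tuple):
        return tuple(generalize(t) for t in ty)
    return ty

def instantiate(scheme, mapping=None):
    """Replace each quantified variable with a fresh unbound one;
    occurrences of the same variable share one fresh name."""
    if mapping is None:
        mapping = {}
    if isinstance(scheme, tuple) and scheme[:1] == ('poly',):
        return mapping.setdefault(scheme[1], '_t%d' % next(fresh))
    if isinstance(scheme, tuple):
        return tuple(instantiate(t, mapping) for t in scheme)
    return scheme

# f = fun x -> (x, x) infers as  _a -> _a * _a, generalized at its let ...
scheme = generalize(('_a', '->', ('_a', '*', '_a')))
use1 = instantiate(scheme)   # at f(1):     _t0 -> _t0 * _t0
use2 = instantiate(scheme)   # at f("boo"): _t1 -> _t1 * _t1
```

Because each use site gets its own fresh variables, unifying `use1` with `int` and `use2` with `string` never conflicts, exactly as described above.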

> What’s the relationship between let enabling exponential function
> composition and the exponential time result?

So there is quite a bit of copying happening every time a variable with
polymorphic type is used. I've never studied the worst-case scenarios of
Hindley-Milner, but I imagine that it has to do with instantiation copying
large polymorphic types around.

> Do implementations of Hindley-Milner actually represent types as dags and
> utilize structural sharing?

Yes; an unbound type variable is typically represented as a reference cell,
which is assigned to the type the type variable is unified with. So, if we
have

    
    
      double : forall a. a -> a * a
    
      double (1, 1, 1) : (int * int * int) * (int * int * int)
    

then the type of double is first instantiated, yielding `_b -> _b * _b`, and
then the type of the parameter is unified with the type of the argument, `int
* int * int`. At this point, the reference cell in the internal representation
of `_b` is updated to point to `int * int * int`, which means that both
occurrences of `_b` in the result type point to the same representation of
`int * int * int`.
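That sharing can be modeled with a mutable cell (a toy Python sketch, not OCaml's or GHC's actual representation):

```python
class TVar:
    """An unbound type variable as a mutable reference cell; once
    unified, every type mentioning it sees the same target."""
    def __init__(self):
        self.ref = None                  # None means still unbound

    def resolve(self):
        return self if self.ref is None else self.ref

def unify_var(var, ty):
    var.ref = ty                         # one write updates all occurrences

b = TVar()
double_ty = (b, '->', (b, '*', b))       # instantiated:  _b -> _b * _b
arg_ty = ('int', '*', 'int', '*', 'int')
unify_var(b, arg_ty)

# Both _b occurrences in the result resolve to the *same* object, so the
# type is a dag, not a tree:
result_ty = double_ty[2]
assert result_ty[0].resolve() is result_ty[2].resolve() is arg_ty
```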

[1] [http://research.microsoft.com/en-us/projects/fcp/](http://research.microsoft.com/en-us/projects/fcp/)

[2] [https://github.com/tomprimozic/type-systems/tree/master/first_class_polymorphism](https://github.com/tomprimozic/type-systems/tree/master/first_class_polymorphism)

~~~
userbinator
_Another reason is that it's not entirely clear what kind of polymorphic type
to infer for `f`_

How about forall a. a -> b? If we make the assumption that f _is_ polymorphic,
then nothing can be inferred about the return value, so it should also be
polymorphic?

~~~
tomp
Let's instead talk about the type of `test`. It could be one of these:

    
    
      test : (forall a. a -> a) -> int * bool
      test : forall b. (forall a. a -> b) -> b * b
      test : (forall a b. a -> b) -> (forall c d. c * d)
    

The first case is the most "obvious" one with polymorphic types, the one that
I described above.

The second is the one that I hinted at, but I used the less general type
`(forall a. a -> int) -> int * int`, i.e. instantiating `b` with `int`. The
second type
above is a more general one, as it could be instantiated into other types such
as `(forall a. a -> bool) -> bool * bool` as well, but this is the standard
first-rank HM polymorphism.

The third type above is the type that would be inferred if `f` had different
polymorphic type variables as parameter and return types. However, there are
no values that have the type `forall a. a`, so a function with type `forall a
b. a -> b` either (1) cannot return (i.e. diverges or raises an exception) or
(2) is a dynamic _cast_ function, i.e. a hack (`Obj.magic` is an example of
such a function, but using it is "dangerous" in the sense that it bypasses the
type system and so might result in undefined behavior at runtime).
Disregarding case (2), the function `test` would also not return, so it's
equivalent to say that its type would be

    
    
      test : forall c. (forall a b. a -> b) -> c
    

and as such, it would be pretty useless.

~~~
pacala
Not sure why "test : forall b. (forall a. a -> b) -> b * b" is "the standard
first-rank HM polymorphism". The first comment implied "test" is a type error
in HM, typed as "forall a b. (a -> b) -> b * b", which fails to unify a with
both int and bool.

Based on the principle of least privilege, "test : forall b. (forall a. a ->
b) -> b * b" feels the right type. Only the body of test feeds a to the first
argument, so forall a is scoped to the first argument, whereas the caller of
test needs to consume b, so we scope forall b to the whole expression.

~~~
tomp
What I meant is that going between `forall b. (forall a. a -> b) -> b * b` and
`(forall a. a -> int) -> int * int` is something that is possible in standard
HM (i.e. if we consider the `forall a. a -> ...` part as opaque to HM).

I guess you're right, the "least polymorphism" we need to add to make `test`
pass the type checking is to make just the argument of `f` polymorphic...
However, in standard HM without any hacks/dynamic types, a function with the
type `forall a. a -> b` for some fixed `b` can't really do anything useful
with its parameter - it can't inspect it, it can only wrap it up in a data
structure, but it can't return the data structure, unless the type of the
parameter is wrapped in an existential type (but unwrapping it, there is still
nothing to be done with it...).

~~~
pacala
Great example, it now clicks. We lifted b all the way up because it needs to
be consumed by the caller; we might as well lift a as well, as types with free
variables, like b in "forall a. a -> b", are useless. Which looks a lot like
let-polymorphism :)

------
quink
Somewhat related:
[http://en.wikipedia.org/wiki/Billion_laughs](http://en.wikipedia.org/wiki/Billion_laughs)

------
Camillo
If you're busy, just read the original Stack Overflow Q&A:
[http://stackoverflow.com/questions/22060592/](http://stackoverflow.com/questions/22060592/).
It's 1/20th the size of the blog post and I got all I needed out of it.

------
Patient0
I've found the following:
[http://okmij.org/ftp/Haskell/AlgorithmsH.html#teval](http://okmij.org/ftp/Haskell/AlgorithmsH.html#teval)

to be very useful for understanding how the HM type algorithm works.

They rephrase the problem as "writing an expression interpreter" for your
program that recursively evaluates the abstract types of each expression
instead of the concrete value.

I found this to be a very intuitive way to understand it.

It also then makes it easier to see why some "type inference" algorithms might
guarantee to terminate while others might run forever: the more complicated
your type system, the less abstract the "type values" are. They become like
actual "values", and the behaviour becomes closer to the actual running of the
program, which may not terminate.
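The "interpreter over types" reading can be sketched in a few lines. This is a hypothetical Python mini-language (not the code at the link above): the evaluator returns types where an ordinary interpreter would return values.

```python
def type_eval(expr, env):
    """Evaluate an expression to its type instead of its value.
    Forms: bool/int literals, variable names, ('add', e1, e2),
    ('if', cond, then, else)."""
    if isinstance(expr, bool):           # check bool before int:
        return 'bool'                    # True is an int in Python
    if isinstance(expr, int):
        return 'int'
    if isinstance(expr, str):
        return env[expr]
    if expr[0] == 'add':
        if {type_eval(expr[1], env), type_eval(expr[2], env)} != {'int'}:
            raise TypeError('add expects ints')
        return 'int'
    if expr[0] == 'if':
        if type_eval(expr[1], env) != 'bool':
            raise TypeError('condition must be bool')
        t1, t2 = type_eval(expr[2], env), type_eval(expr[3], env)
        if t1 != t2:
            raise TypeError('branch types disagree')
        return t1
    raise ValueError('unknown form: %r' % (expr,))

print(type_eval(('if', True, ('add', 1, 'n'), 0), {'n': 'int'}))  # int
```

The recursion mirrors ordinary evaluation exactly, which is the point: make the "type values" less abstract (closer to real values) and the type checker behaves more like the running program.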

------
bmh100
Fascinating behavior. Since this is an edge, would it ever occur in human-
written code? Could there be a situation where a computer would end up
generating such code?

~~~
spacemanaki
It's unlikely to occur as the type is exponential in size and thus quite
unwieldy to do anything with! I haven't worked with generated ML or Haskell
code, but I still think it's unlikely to be an issue in "real" code.

~~~
nostrademons
It's rarely a compile-time performance issue because O(2^N) algorithms perform
well on modern hardware when N ~= 3. You could nest the `double` example from
the article 6 times and it would still only be a 64x slowdown.

Where it does show up, annoyingly, is in error messages. You actually see this
all the time with GCC's STL error messages (C++ templates provide a
parameterized type system very similar to Haskell or ML, but without the
inference, at least until C++11). STL containers are usually parameterized not
only over the key and value, but over the comparator and allocator. Nest a
couple of them and you suddenly have a type with several dozen variables. GCC
expands the whole type out, without typedefs, when printing error messages,
which means that you will sometimes get compile errors that are 10 pages long.
Clang fixes this by preserving the typedefs, without which you wouldn't want
to write such a type. Haskell has the same problem, except worse because it
infers types, but GHC has gone to great pains to format error messages so they
are at least readable if verbose.

------
toolslive
Actually, there's a use case for these things: you can simulate varargs,
and yes, the compilation time can be made arbitrarily large:
[https://gist.github.com/toolslive/5957292](https://gist.github.com/toolslive/5957292)

    
    
      $> time ocamlc vararg.ml
      real    1m9.372s 
      user    1m9.264s
      sys     0m0.044s
    

Try adding a few a's and see what that gives.

~~~
spacemanaki
Thanks for pointing that out! I hadn't made the connection and now I've got
another reason to figure out how fold works (have yet to wade through all of
[http://mlton.org/Fold](http://mlton.org/Fold))

------
BorisMelnik
Question: anyone know where the pronunciation of "!" as "bang" came from? I
always called these exclamation points, but see a lot of programmers calling
them "bangs." Anyone?

~~~
AnimalMuppet
Well, they are exclamation points. But for many punctuation characters,
there's a hacker nickname (or more than one). I've heard of "bang" for "!",
"hook" for "?", and "shriek" or "splat" for "#".

The Jargon File has more detail:
[http://www.catb.org/jargon/html/B/bang.html](http://www.catb.org/jargon/html/B/bang.html)

~~~
lmartel
My favorite of these: I've worked with a few people who used "huh" for the
coffeescript `?` operator.

~~~
quink
While I haven't seen anyone actually use this pronunciation I'd consider 'p',
short for predicate, a fairly logical way of pronouncing '?'.

If nothing else, it's LISPy in origin at least, without actually being
inherent to LISP. For example, with a ternary operator:

    var adult = (age > 18) ? true : false;

Pronounced: 'adult' equals (age greater than 18) p true else false.

------
troels
That sub title is hilarious.

~~~
pohl
This little program walked out on stage. What it did next blew the compiler
away.

------
mehwoot
Compilers hate him! Find out this one simple program....

~~~
JadeNB
Indeed, that is already (essentially) the subtitle of the article.

~~~
mehwoot
Well, I look like an idiot... I read the whole article and then came back to
the comments, and it popped into my mind. Which was apparently just me
remembering it.

[http://i.imgur.com/Uwysayc.png](http://i.imgur.com/Uwysayc.png)

~~~
JadeNB
Now that is a classy _mea culpa_!

------
htk
English called, they are running out of exclamation points.

~~~
antics
This talk was submitted to a conference which literally requires exclamation
points to be in the title. So that's where they come from.

