> Why can’t we just allow polymorphic lambda-bound variables?

A few reasons. The first is that, in general, type inference for higher-rank polymorphism is undecidable. The second is that even if it were decidable, it would be impractical - I think that if we have a program like this one:

  let test(f) = (f(1), f(true))
the assumption that the programmer made an error is statistically much more likely than the assumption that `f` should have a polymorphic type. Another reason is that it's not entirely clear what kind of polymorphic type to infer for `f` - should it be `forall a. a -> a` or e.g. `forall a. a -> int`?

> What does this have to do with different “ranks” of polymorphism?

Higher-rank polymorphism means that polymorphic types can appear within other types; in a system with support for higher-rank polymorphic types, the parameter `f` in the function `test` above could be declared with the type `forall a. a -> a`, and the function `test` would then have the type `(forall a. a -> a) -> int * bool`. Higher-rank polymorphism is formalized using System F, and there are a few implementations of (incomplete, but decidable) type inference for it - see e.g. Daan Leijen's research page [1] about it, or my experimental implementation [2] of one of his papers. Higher-rank types also have some limited support in OCaml and Haskell.
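For a rough, hedged illustration (not from the post - the record name `poly_id` below is made up): OCaml won't infer a higher-rank type, but it will check one that is written down explicitly, e.g. as a record with a polymorphic field:

  (* sketch: the field annotation 'a. 'a -> 'a is explicitly polymorphic,
     so the body of `test` can apply it at several different types;
     test : poly_id -> int * bool *)
  type poly_id = { f : 'a. 'a -> 'a }

  let test r = (r.f 1, r.f true)

  let () =
    let (n, b) = test { f = fun x -> x } in
    Printf.printf "%d %b\n" n b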

> How is let-polymorphism implemented? How do you implement it without just copying code around?

The "standard" implementation consists of two operations, generalization and instantiation. At let bindings, types are generalized by replacing all unbound type variables with polymorphic type variables (care must be taken not to generalize type variables that can be bound later, but that's a secondary issue here). Every time a variable with a polymorphic type is used, its type is instantiated, which means that all polymorphic type variables are replaced with fresh unbound type variables.

  let f = fun x -> (x, x)

  print f(1), f("boo")
In the example above, the type inferred for `fun x -> (x, x)` is `_a -> _a * _a`, where `_a` is an unbound (but not polymorphic) type variable. At the let binding, this type is generalized into `forall a. a -> a * a` by replacing `_a` with a polymorphic type variable `a` (in most ML languages, the `forall` is implicit). Then, when `f` is used in `f(1)`, its type is instantiated into `_b -> _b * _b` and `_b` is unified with the type of `1`, `int`. When `f` is used in `f("boo")`, its type is instantiated into `_c -> _c * _c`, and `_c` is unified with the type of `"boo"`, `string`. Since `_b` and `_c` are different variables, there is no error here (if `f` had a monomorphic type, `_b` and `_c` would in fact be the same variable, and this example would result in an error like "cannot unify type int with type string").
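To make that concrete, here is a minimal OCaml sketch of the two operations, assuming a toy type representation in which unbound variables are mutable reference cells (the names are illustrative, and the level tracking needed to avoid generalizing variables that can still be bound later is omitted):

  (* toy representation: unbound variables are mutable cells,
     quantified ("forall"-bound) variables are a separate constructor *)
  type ty =
    | TInt
    | TString
    | TArrow of ty * ty
    | TTuple of ty list
    | TVar of tvar ref
    | TQVar of int
  and tvar = Unbound of int | Link of ty

  let counter = ref 0
  let fresh_var () = incr counter; TVar (ref (Unbound !counter))

  (* generalization (at a let binding): unbound variables become quantified *)
  let rec generalize = function
    | TVar { contents = Link t } -> generalize t
    | TVar { contents = Unbound id } -> TQVar id
    | TArrow (a, b) -> TArrow (generalize a, generalize b)
    | TTuple ts -> TTuple (List.map generalize ts)
    | (TInt | TString | TQVar _) as t -> t

  (* instantiation (at every use): quantified variables become fresh unbound ones *)
  let instantiate t =
    let table = Hashtbl.create 8 in
    let rec go = function
      | TQVar id ->
          (match Hashtbl.find_opt table id with
           | Some v -> v
           | None -> let v = fresh_var () in Hashtbl.add table id v; v)
      | TVar { contents = Link t } -> go t
      | TArrow (a, b) -> TArrow (go a, go b)
      | TTuple ts -> TTuple (List.map go ts)
      | (TInt | TString | TVar _) as t -> t
    in
    go t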

> What’s the relationship between let enabling exponential function composition and the exponential time result?

So there is quite a bit of copying happening every time a variable with a polymorphic type is used. I've never studied the worst-case scenarios of Hindley-Milner, but I imagine that it has to do with instantiation copying large polymorphic types around.

> Do implementations of Hindley-Milner actually represent types as dags and utilize structural sharing?

Yes; an unbound type variable is typically represented as a reference cell, which is assigned the type that the variable is unified with. So, if we have

  double : forall a. a -> a * a

  double (1, 1, 1) : (int * int * int) * (int * int * int)
then the type of `double` is first instantiated, yielding `_b -> _b * _b`, and then the type of the parameter is unified with the type of the argument, `int * int * int`. At this point, the reference cell in the internal representation of `_b` is updated to point to `int * int * int`, which means that both occurrences of `_b` in the result type actually point to the same representation of `int * int * int`.
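Sticking with the toy representation from the earlier sketch (and omitting the occurs check), unification just updates the reference cell of an unbound variable, so every occurrence of that variable immediately points at the shared result:

  (* sketch of unification over the toy representation above;
     the occurs check is omitted for brevity *)
  let rec unify a b =
    match a, b with
    | TVar { contents = Link a' }, _ -> unify a' b
    | _, TVar { contents = Link b' } -> unify a b'
    | TVar { contents = Unbound i }, TVar { contents = Unbound j }
      when i = j -> ()                       (* same variable: nothing to do *)
    | TVar ({ contents = Unbound _ } as r), t
    | t, TVar ({ contents = Unbound _ } as r) ->
        r := Link t                          (* all occurrences now share t *)
    | TArrow (a1, a2), TArrow (b1, b2) -> unify a1 b1; unify a2 b2
    | TTuple ts1, TTuple ts2 -> List.iter2 unify ts1 ts2
    | TInt, TInt | TString, TString -> ()
    | _ -> failwith "cannot unify"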

[1] http://research.microsoft.com/en-us/projects/fcp/

[2] https://github.com/tomprimozic/type-systems/tree/master/firs...




Clearly I should write more blog posts with lots of questions at the end, thanks!


> Another reason is that it's not entirely clear what kind of polymorphic type to infer for `f`

How about `forall a. a -> b`? If we make the assumption that `f` is polymorphic, then nothing can be inferred about the return value, so it should also be polymorphic?


Let's instead talk about the type of `test`. It could be one of these:

  test : (forall a. a -> a) -> int * bool
  test : forall b. (forall a. a -> b) -> b * b
  test : (forall a b. a -> b) -> (forall c d. c * d)
The first case is the most "obvious" one with polymorphic types, the one that I described above.

The second is the one that I hinted at, but I used the less general type `(forall a. a -> int) -> int * int`, i.e. instantiating `b` with `int`. The second type above is a more general one, as it could be instantiated into other types such as `(forall a. a -> bool) -> bool * bool` as well, but this is the standard first-rank HM polymorphism.

The third type above is the type that would be inferred if `f` had different polymorphic type variables as its parameter and return types. However, there are no values that have the type `forall a. a`, so a function with type `forall a b. a -> b` either (1) cannot return (i.e. diverges or raises an exception) or (2) is a dynamic cast function, i.e. a hack (`Obj.magic` is an example of such a function, but using it is "dangerous" in the sense that it bypasses the type system and so might result in undefined behavior at runtime). Disregarding case (2), the function `test` would also not return, so it's equivalent to say that its type would be

  test : forall c. (forall a b. a -> b) -> c
and as such, it would be pretty useless.
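For instance, in OCaml the only way to get the type `'a -> 'b` inferred (without a cast like `Obj.magic`) is a function that never returns normally - a rough sketch:

  (* OCaml infers 'a -> 'b because the body never produces a value *)
  let absurd _ = failwith "unreachable"
  (* val absurd : 'a -> 'b *)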


Not sure why "test : forall b. (forall a. a -> b) -> b * b" is "the standard first-rank HM polymorphism". The first comment implied "test" is a type error in HM, typed as "forall a b. (a -> b) -> b * b", which fails to unify a with both int and bool.

Based on the principle of least privilege, "test : forall b. (forall a. a -> b) -> b * b" feels like the right type. Only the body of test feeds a to the first argument, so forall a is scoped to the first argument, whereas the caller of test needs to consume b, so we scope forall b to the whole expression.


What I meant is that going between `forall b. (forall a. a -> b) -> b * b` and `(forall a. a -> int) -> int * int` is something that is possible in standard HM (i.e. if we consider the `forall a. a -> ...` part as opaque to HM).

I guess you're right: the "least polymorphism" we need to add to make `test` pass type checking is to make just the argument of `f` polymorphic... However, in standard HM without any hacks/dynamic types, a function with the type `forall a. a -> b` for some fixed `b` can't really do anything useful with its parameter - it can't inspect it; it can only wrap it up in a data structure, but it can't return that data structure, unless the type of the parameter is wrapped in an existential type (and even after unwrapping it, there is still nothing to be done with it...).


Great example, it now clicks. We lifted b all the way up because it needs to be consumed by the caller, might as well lift a as well, as types with free variables, like b in "forall a.a -> b", are useless. Which looks a lot like let-polymorphism :)



