Hindley-Milner (never seen it written as M-H) is beautiful because it's an elegant extension of simply-typed lambda calculus (STLC) that brings a very useful trait, polymorphism.
Here the first 3 rules are those of STLC. The magic happens in [Gen] and [Inst]: namely, you can generalize free type variables. If you have a function that works on lists of elements of type α when there is no constraint on α, it works on lists of any type. This brings quantification into the type system, i.e. you go from:
∀ α. reverse : α list → α list
to
reverse : ∀ α. α list → α list
(in the first one, ∀ is part of the discourse, and in the second one, part of the type). [Inst] states that you can replace a quantified variable by any type, specializing the scheme (e.g. going from ∀α. α list → α list to int list → int list). This is the role of the ⊑ operator, which is unfortunately not explained in this notation (and this is confusing for people not used to type systems, i.e. most people).
Technically, [Let] is redundant because it is a combination of [App]+[Abs]: let x = e1 in e2 is equivalent to (λx.e2) e1. There is a difference in the ML branch of languages (as opposed to the Haskell branch) because in the presence of memory references (mutable cells) you can't generalize everywhere, so only let-bound variables are given a type scheme σ (both in the let...in construct and in toplevel phrases). That's why there is usually an explicit generalisation step at this point. Or maybe I have overlooked something in this particular presentation. HTH
[Abs] is monomorphic while [Let] is polymorphic. Using the Wikipedia article's example: \f.(f true, f 0) will not typecheck, while let f = \x.x in (f true, f 0) will be given the type (bool, int).
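To make the contrast concrete, here is a minimal sketch in OCaml (my own example, not from the article); the rejected version is left commented out so the snippet compiles:

(* let-bound f is generalised, so it can be used at two different types *)
let ok = let f = fun x -> x in (f true, f 0)
(* val ok : bool * int *)

(* lambda-bound f gets a single monotype, so this would be rejected:
   f cannot be both bool -> 'a and int -> 'b *)
(* let bad = fun f -> (f true, f 0) *)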
No, this non-equivalence applies directly to the typing of the lambda calculus, which was the topic of Hindley's original papers and was later proven correct by Damas.
On the contrary, the value restriction is a part of the extended rules used for the typing of ML. It was introduced because ML isn't pure: it supports references. The value restriction is there to prevent them from having a polymorphic type. The rule is simple: syntactic values are the only things receiving a polymorphic type. What is a syntactic value? To quote Geoffrey Smith's notes on SML '97: "Basically, a syntactic value is an expression that can be evaluated without doing any computation".
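As a hedged illustration, this is roughly how the distinction shows up in an OCaml toplevel (the '_weak printing is OCaml-specific and varies between versions):

let id = fun x -> x   (* a syntactic value: generalised to 'a -> 'a *)
let empty = []        (* a syntactic value: generalised to 'a list *)
let cell = ref []     (* an application, not a syntactic value: gets a
                         weak type such as '_weak1 list ref, not 'a list ref *)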
All this is done in order to ensure type safety. Actually, if references could have a polymorphic type, the type system would accept incorrect programs such as:
let r : 'a list ref = ref []
let r1 : int list ref = r
let r2 : string list ref = r
r2 := ["a"]
let v : int = 1 + List.hd !r1
This does not work because r, r1 and r2 are references to the same cell, so the int read through r1 is actually the string stored through r2.
The value restriction is a bit of an overkill: it rejects programs which could be safely typed. Other solutions to this problem exist. Actually, Standard ML used a different solution at the beginning (carrying along information about the type held in a cell) but it had other drawbacks. MLton has a really good page about all that: http://mlton.org/ValueRestriction
You may be interested to know that the value restriction is not there to protect against unsafe programs. Rather, it is because polymorphic values get elaborated to functions after type inference, which produces some exceptionally unintuitive (but safe) behavior.
You could elaborate polymorphic values to type functions; and if you were compiling ML to System F or System F-omega, that is certainly what you would do. But in practice this might produce performance problems (I shouldn't have to call a function every time I use the empty list just because it has polymorphic type!). I must admit I'm not sure what SML and Haskell compilers actually do.
So in some sense the value restriction is there to prevent us from having to make the ugly choice between unsafety on the one hand, and inefficiency and unintuitive behavior on the other.
In the presence of polymorphism and mutable cells, the value restriction (or some variant of it) is necessary to establish type soundness (i.e. that typed programs are safe), so I'd argue that it achieves this goal.
As for this explanation, I'm not entirely convinced. In the first example, seeing x as two different cells depending on the type is interesting, but I have trouble seeing how it is possible to express its semantics without keeping runtime type information. Is System F compatible with type erasure?
I'm getting a few detailed responses :) Unfortunately, I'm getting conflicting answers. I'm having trouble wrapping my head around why App+Abs couldn't be equivalent to Let in the absence of the value restriction.
It's because of the way the premises are written in regard to the type environment in the rules and how things are unified. It's directly linked to free and bound type variables and yes, it's complicated to understand.
If you look at the [Abs] rule, you will see "x : tau" in the premise. Here, by definition, tau is not quantified: it represents a single type; it is necessarily monomorphic. Concretely, it means x will be bound to a type variable in the type environment, and that type variable can only be unified with one type.
On the other hand, the premise for [Let] is e0 : sigma, x : sigma. Unlike tau, sigma is quantified (polymorphic), so it can be instantiated at a different type in different expressions.
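A toy OCaml representation (the names here are mine, purely illustrative) may make the tau/sigma distinction more tangible:

type mono =                        (* tau: no quantifiers *)
  | TVar of string
  | TArrow of mono * mono

type scheme =                      (* sigma: quantifiers up front *)
  | Forall of string list * mono

(* [Abs] binds the parameter to a bare mono in the environment; only [Let]
   wraps the inferred mono in Forall, e.g.
   Forall (["a"], TArrow (TVar "a", TVar "a")) for the identity function. *)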
I completely understand why it trips you up. The way quantification affects the unification process is quite complicated. I finally understood it only after having to write my own rules when I was implementing a type checker for my master's thesis.
If you don't mind the formalism, read the Wikipedia article carefully, paying particular attention to how tau and sigma are defined and to the paragraph about free and bound type variables. If you want something more practical, look at the Peyton Jones book I linked to in this thread. Section 9.5 is the important part, but it's difficult to get without having read the whole chapter (possibly chapter 8 too). It's still complex, but far less abrupt than the Wikipedia article.
I understand the notation, but I don't understand the reasoning behind the Abs+App restriction (why does it have to be that way). I've been working on my own language and haven't gotten to the type inference yet, so maybe I'll finally understand once I go through it.
If the SPJ book is the one about implementing functional languages, then I've got a copy on my laptop I can peruse when I get home from work.
The difference between abstraction bindings and let bindings is that let variables are only ever bound to the value of a single expression, whereas the argument of a function may be bound to the value of any number of expressions.
This causes difficulties because the set of free type variables needs to be calculated in order for generalisation to produce safe type schemes, and that is tough to do for an unknown set of expressions.
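As a hedged sketch (the helper names are hypothetical), the generalisation step only quantifies the variables that are free in the inferred type but not free anywhere in the environment:

(* returns the type variables that may safely be quantified *)
let generalizable ~env_free_vars ~type_free_vars =
  List.filter (fun v -> not (List.mem v env_free_vars)) type_free_vars

(* generalizable ~env_free_vars:["'b"] ~type_free_vars:["'a"; "'b"]
   = ["'a"] : only 'a may go under the forall *)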
I was not totally right in my "uncle" comment. It has to do with the value restriction in the sense that this is where the distinction occurs (lambdas are never generalized; let-bindings are generalized for values), but the original HM system (without refs) indeed already made this distinction.
I believe that if you don't do it and generalize everywhere, unification of types is harder or even impossible because everything has a type scheme.
This would look a lot like System F, for which type inference is undecidable, for example.
Yes. For example, if you generalize reference values, you can type ref [] as ∀a. (a list) ref, then store [4] in it, then retrieve a float from the head of the stored list, thus breaking the type system. It is a bit similar to the array type variance problem that struck Java. One solution is to distinguish between syntactic forms that may allocate memory (e.g. function applications) and the ones that cannot (e.g. abstractions, constants, variables), and only generalize the latter.
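Spelled out in OCaml-like syntax (this is a deliberately unsound would-be program; real OCaml gives the cell a weak '_a type instead and rejects the last line):

let cell = ref []               (* imagine it were given ∀'a. 'a list ref *)
let () = cell := [4]            (* used at int list ref *)
let f : float = List.hd !cell   (* used at float list ref: would read an int as a float *)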
(Operationally speaking, the two forms (let and abs+app) are always equivalent, though).
His question is very valid. I got a math degree before I got a PhD in computer science, and this weird notation was the hardest part of the switch. The funny thing is, if you submit papers using a saner, easier-to-understand notation (like notation closer to a functional program), then computer scientists freak out and complain about the notation. I predict this kind of notation will be dead within the next 20 years; it's just too ugly.
The inference rule notation is standard within mathematical logic, which is where it comes from. I'm not an expert on the history of mathematics, but I've seen inference rules for example in Gentzen's 1935 paper and I'm sure they are quite a bit older.
I haven't seen a notation that is easier to read than inference rules. Writing them as functional programs can be problematic, as the rules do not always fully specify an algorithm. One judgement may have different derivations. If one writes down algorithms, such as HM type inference, then one does in fact often use notation in the style of functional programming.
Given the JVM supports Unicode identifiers, and given the rich vein of mathematical operators therein, I always thought it would be fun to do the APL thing with Java. Scala went a (very) little way towards this, but I don't think anyone has taken it to logical extremes yet.
(although I did once threaten my co-workers that I would tell the theoretical mathematician on the team about Unicode identifiers in Java. He used to write LaTeX in the Javadoc comments [which Doxygen handles rather nicely])
I believe you misremember.
From what I recall, Fortress had an ASCII syntax where special combinations would be translated into symbols by the doc/rendering system, e.g. NN would be rendered as ℕ (the bold N used for natural numbers).
Emacs has the TeX input mode which lets you type special characters using their TeX names. So you can type \alpha for α, \to for → and so on. And if you don't like that for whatever reason, there are other alternative input modes you can use.
If you're using a tool which forces you to use a map to find glyphs, you're simply doing it wrong. Really, there is no excuse in this day and age not to support Unicode well!
I don't like the notation, because there are usually a lot of implicit assumptions made for each particular set of inference rules. In order to understand the inference rules, you usually first have to understand what implicit assumptions are made in the current context.
In the current example, there is just no need to write Hindley-Milner in this notation. Using pattern matching and sets will do just fine.
One thing that's bad about the notation is not so much the horizontal bar notation, but that the pattern matching leaves so much implicit. You would have exactly the same confusion with pattern matching and sets. The problem is that the names of the variables are semantically significant. For example, x can only stand for a program variable, not for a compound expression; e can stand for a compound expression, but not for a program variable. There is nothing except the name to distinguish them, and this is usually not explained. Similarly, τ stands for a monotype, whereas σ stands for a polytype (leaving aside the inst rule, where σ can confusingly also stand for a monotype, and the first rule, which should use τ instead of σ). Additionally, a lot is left unexplained by the absence of definitions for the square inclusion relation and the free(.) function.
The horizontal bar notation is not ideal however. In my opinion Prolog notation is clearer, because it reads the right way around, and it makes it clear what relation we are defining. In Prolog notation you'd define a typeof(context, term, type) relation:
typeof(C, var(X), T) :- member(X:T, C).
typeof(C, app(X,Y), B) :- typeof(C, X, A -> B), typeof(C, Y, A).
...
In Prolog you should read :- as "if". So the first rule says: "the type of a variable X is T if `X:T` is in the context C".
I think there are philosophical reasons why people prefer inference rules over working with sets. Many people work or have worked in a context of constructive logic, as constructivism comes in naturally when you consider computability. Now, implication is a far simpler concept than sets. Implication is modelled by cartesian closed categories, while for sets one needs additional structure, e.g. toposes. So I suppose the fact that inference rules are preferred over pattern matching and sets has to do with them being perceived as being based on simpler principles.
Of course, all this is a matter of taste and for particular applications like HM it doesn't make any difference at all which notation to use. I don't want to defend the rule notation too much; it takes up a lot of space in papers and if you have something better then fine. But maybe that explains why people prefer this notation.
Let us agree to disagree. The tendency is to remove strict math notation (particularly notation with implicit assumptions) from programming. Once it is removed, programming languages that used such notation get adopted by wider and wider communities and gradually become mainstream. And on the opposite side, languages that keep heavy math notation, even if pushed vehemently by the academic community, tend to die down.
I think this will be the demise of Haskell, by the way. As good as it is, the notation is still full of '||', '!!', '++', ':'... So I think there will be another iteration in the functional world. Haskell will die just like ML/OCaml/etc. died, because Haskell, with slightly better syntactic sugar, captured that segment of developers.
Actually, I also think that Haskell will not become mainstream. As a matter of fact, one of the main things I do not like about Haskell is that the type class mechanism makes too many things implicit and that it allows too much overloading. It seems that it is sometimes hard to understand what a code fragment does by just looking at the fragment. By contrast, OCaml code is much easier to understand.
But these are not so much syntactic issues. It always seemed to me that people are actually drawn to Haskell's syntax.
> It always seemed to me that people are actually drawn to Haskell's syntax.
Totally anecdotal, but this is really not the case for me. I have only dabbled in Haskell, and never really used it for a project from end-to-end, but have used Standard ML quite a bit in the last few months to write a lexical analyzer generator [0] and just recently spent a weekend and a half using OCaml.
There are a number of Haskell features and general design that I think are better than the MLs. While modules and functors are quite cool, I think typeclasses end up being more practical. Standard ML gets a lot of little things right that I think OCaml gets flat out wrong (mutable strings, throwing exceptions instead of returning 'a option), but Haskell seems to take it further, with a standard library that I can only describe as very tasteful. This is just my impression after reading half of Real World Haskell and doing a lot of dabbling but not a lot of real work, so I'm sure it's incomplete.
Haskell's syntax IS nicer than Standard ML's and OCaml's, but that is the least important thing IMHO. If there were an ML (strict, impure) with typeclasses and with as tasteful a design as Haskell at the edges, I would probably use that over Haskell, even if the syntax was ugly(ier). As it stands, I'm inclined to eventually switch to using Haskell, and just figure out how to deal with the other bits of it (laziness, purity), just so I can use typeclasses and the well-designed bits.
[0] In case you doubt how much ML I've written, the repo is on GitHub
No, I haven't, but does it add more to ML than a bunch of syntax changes? It looks like an attempt to make ML more palatable to programmers familiar with C and Java. I skimmed the docs on that site and saw that it does have ML-style modules and functors, but does it also include something like typeclasses? I thought combining the two was still almost an open research question, although tbh I haven't slogged through this paper yet: http://www.cse.unsw.edu.au/~chak/papers/modules-classes.pdf
To be completely honest, I'm a little skeptical given that the first page in the "Mythryl for SML programmers" highlights adding special-cases to the front-end to support printf: http://www.mythryl.org/my-Mythryl_printf.html I don't really miss printf in SML, although if I wanted it I think I would rather reach for this kind of type hackery instead: http://mlton.org/Printf
Yes, it's verbose, but verbosity also adds information that may be useful (although + is probably not a good example for this). With 'easier to understand' I mean that when looking at a bit of code I can tell what it does without having to know much about the context (typing, overloading). I did not mean to say that the code is prettier. Haskell code tends to be very pretty.
It's probably difficult to strike the right balance between too little and too much overloading. Maybe OCaml does not allow enough overloading. For my taste typical Haskell code uses too much. But that's a matter of taste, I suppose.
>when looking at a bit of code I can tell what it does without having to know much about the context
How do type classes make that harder? You either need to understand what "foo" does, or you need to understand what "fooForThisType" does. What's hard about the former?
You need to know the type. I wouldn't want to say that makes reading code "hard", but it means that you need more context in order to understand the code. (Imagine, for example, that you're trying to understand a patch, where not all types are obvious).
You only need more context in order to know where to look to understand the code. If you don't know what foo does, you have to go look at the definition of foo. If foo is obvious or common, then you don't care which instance's version of foo is being called. To a person reading the code, + is + regardless of whether it is adding ints or floats. If you are looking at a patch and don't know what a function does, then you need to go look at the actual code. In which case, the types are now obvious (or can be given to you by the compiler if they are not obvious).
Are there alternative approaches I am not aware of? All I can see is encode the type in the name of the function (fmapList, fmapMaybe, fMapVector, etc), or forcing the functions to all be in separate namespaces and have to qualify them (List.fmap, Maybe.fmap, etc). Both seem much worse than typeclasses.
Is sequent calculus (http://en.wikipedia.org/wiki/Sequent_calculus) really that odd that you never encountered it in your math studies? I personally find it quite easy to read, with nicely structured proofs.
Of course, what is "easy" to read is always up to the individual to a certain degree. But this notation is used nowhere in math outside of logic. That should tell you something :-) Also, in the modern logic that is actually used in interactive proof assistants, this notation is not present any more, but it is then reintroduced when you write a paper about what you did!
Well, of course the logic notation in question is used mostly in logic contexts. In the same manner, as a student of logic I will not come across special notation from analysis.
Although I am not working in that area any more, when I used interactive theorem provers the one I used most would present the proofs in sequent calculus. It showcased a nice connection between the logical framework and a small-step operational semantics reading of the formulas.
In my view, it is important to understand the reasons for different fields' choices of notation; it typically reveals interesting things about the fields in question. When starting out learning a new field, the notation often seems strange and unintuitive, but after a while one starts to recognize that there are many valid reasons why a certain style of notation is used. Of course, sometimes it is just an accident of history, and bad choices stick around :)
I've hopped around quite a bit in my studies (semantics, formal methods, operations research, theoretical computer science, pure algebra, logic), and I always feel overwhelmed with new notations and concepts. Nowadays I tend to expect it when trying to learn something new.
Hmm, H-M, AKA Damas-Milner, is on p. 331 in TAPL (indexed as D-M only, so easy to overlook), and I can't see where it is in Harper's Practical Foundations, AFAICT.
As an aside, the other day I started flipping thru Harper's Lazy Eval writeup, and found it surprisingly readable, even if I was skipping reading the rules.
A solid computer science education would have helped him understand this. Not required for a 'normal' programmer but the more interesting parts of our trade exist in Computer Science rather than Software Engineering.
Eh. The first time I saw typing rules in class was in a moderately specialized graduate course, so it would have been very easy to get a perfectly good CS education without learning about them.
Of course, I go to a university that seems very biased towards practical engineering sorts of topics over theory, so your experience may vary.
In Univ of Cambridge CS course it's covered in second year undergraduate material - see http://www.cl.cam.ac.uk/teaching/1213/Types/ (though not referred to by name explicitly).
Yeah, CS programs vary. At a SUNY school about 8 years ago, my experience was that it veered slightly more towards the practical, at least to the extent that we spent relatively little time with complex mathematical notation as we learned CS concepts.
Except for basic stuff like set theory and Boolean logic, math was for the required math classes.
If you take a class in compilers you should see these sorts of inference rules when talking about operational semantics. Though maybe it isn't as popular to use operational semantics anymore. It's very useful to define rules in this syntax when working on semantic analysis and code generation.
I didn't get the background really necessary to really understand the type-theory parts of this notation until I took a graduate course in programming language theory.
Yeah, at my school operational semantics was only covered in that same PL theory course. The undergraduate programming languages/compilers course never went into it, unfortunately.
Computer science is a very wide field these days. I am a computer scientist but my education (math-wise) mostly focused on numerical math, not high-level symbolic logic. Sure, I could understand it if I took the time, but it certainly looks like abracadabra to me at first glance :-)
It is indeed interesting how different educations differ. We certainly learned some of that, but it was never the center of attention; geometric algorithms were another larger focus. Ofc there is so much to learn and it's only possible to cover part of that in the 4-5 years.
Ha! I think you'll find that's a matter of opinion.
Personally, I loved studying CS (my degree was in logic so I straddled the worlds of Math and CS). It's a topic I find infinitely interesting.
Having said that, Engineering is also very interesting. All the CS theory in the world isn't going to save you when you're trying to complete a large system full of lots of moving parts - many of whom are developers :)
I tend to disagree with you, in the sense that Damas-Milner seems to me the typical example of an algorithm which seems simple but actually has subtle implications and trade-offs. Robinson's unification, which is at the center of the algorithm, is already made a bit tricky by quantification as soon as you have polymorphism. Then, some rules actually have deeper implications and trade-offs than it seems (especially let).
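For the curious, here is a hedged sketch of Robinson-style unification over a toy type representation (association lists instead of union-find, and all the names are mine, not from any of the texts mentioned here):

type ty = TVar of string | TConst of string | TArrow of ty * ty

(* does type variable v occur in t? (the occurs check) *)
let rec occurs v = function
  | TVar w -> v = w
  | TConst _ -> false
  | TArrow (a, b) -> occurs v a || occurs v b

(* apply a substitution (an association list) to a type *)
let rec apply subst = function
  | TVar v -> (try List.assoc v subst with Not_found -> TVar v)
  | TConst _ as t -> t
  | TArrow (a, b) -> TArrow (apply subst a, apply subst b)

(* compose s2 after s1 *)
let compose s2 s1 = List.map (fun (v, t) -> (v, apply s2 t)) s1 @ s2

(* compute a substitution making t1 and t2 equal, or fail *)
let rec unify t1 t2 =
  match t1, t2 with
  | TVar v, t | t, TVar v ->
      if t = TVar v then []
      else if occurs v t then failwith "occurs check"
      else [ (v, t) ]
  | TConst a, TConst b when a = b -> []
  | TArrow (a1, b1), TArrow (a2, b2) ->
      let s1 = unify a1 a2 in
      let s2 = unify (apply s1 b1) (apply s1 b2) in
      compose s2 s1
  | _ -> failwith "type mismatch"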
For people interested in type checkers, there are a couple of really good chapters written by Peter Hancock in a book available online: The Implementation of Functional Programming Languages by Simon Peyton Jones. They entirely cover the theory and there is a complete implementation in Miranda. Here is the link:
http://research.microsoft.com/en-us/um/people/simonpj/papers...
Two professors were arguing in front of a chalkboard, for hours and hours. Finally one of them nodded in agreement and said, "You're right, it's trivial."
I've been in situations like this. I'll give a simple rule set for some topic that someone hasn't quite managed to defragment in their brain (and thus knows only as a bunch of rules and special cases), and then they'll spend hours going "But what if <x>?". Eventually they believe me that my simplified model is equivalent to their overcomplicated internal model. The result is usually closer to "You're right; it is that simple." but that's fairly similar to "It's trivial."
In short: Often people need to try a lot of "Yeah, but what if <x>?" before they satisfy themselves that something is in fact as simple as it appears.
The quote was familiar but I couldn't remember the source. The meaning is clear though! Amazing never-ending discussions; it reminds me of the Feynman anecdote where he observed mathematicians' discussions.
The first edition of PLAI has a nice introductory chapter on types that explains this notation and has a whole section on type inference. It's also freely available online, so that's nice. (chapter 10 is on types) I haven't looked at the second edition at all yet, but the chapter on types looks like it has more code than math.
Benjamin Pierce's Types and Programming Languages seems to be the gold standard of intro type theory, and definitely covers this notation and the math behind it, in a much more rigorous fashion than PLAI. It is a much deeper book though, so it doesn't get to type inference and Hindley Milner until later. I've read some of it and haven't actually gotten this far yet, but can say that it is a superb textbook, it's very well written and easy for an amateur to follow, as long as you're relatively comfortable reading and writing proofs. (but you don't need more math than your standard undergrad discrete math course)
You might also want to look at Lambda the Ultimate's Getting Started page, which has a wide range of suggestions covering more than just types, but programming languages in general.
I learned from Foundations of Programming Languages by John C. Mitchell. However, it is a very dense book and I wouldn't want to work through it without a partner or a class.
Abstract languages like this are merely 'clever' as opposed to useful. It is a shame that maths and science are polluted by so much of it - it makes otherwise easy subjects seem alien and difficult.