Types | 445 points by ivank on Aug 24, 2016 | 192 comments

> You don't get this for free and you have to help the computer verify these.

For library types, the most commonly used properties will have already been proven; you wouldn't have to write a proof that "sort" sorts a list any more than you'd have to write the sort function itself.

The computer can also be surprisingly good at proving these things in many cases. E.g. I was playing with the proof assistant Isabelle recently, and I tried to prove that given an int, a proof that twice that int is greater than 13, and a proof that four times that int is less than 29, then that int must be 7. I thought I'd have to fiddle around with induction on Peano numbers and the like, but nope, it could be proved with just a single "by arith".
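For comparison, the same fact falls to a single linear-arithmetic tactic in Lean 4 as well, where `omega` plays roughly the role of Isabelle's `arith`. A sketch:

```lean
-- If 13 < 2n and 4n < 29 for an integer n, then n = 7.
-- `omega` decides linear (Presburger-style) integer arithmetic,
-- so no manual induction on Peano numbers is needed.
example (n : Int) (h1 : 13 < 2 * n) (h2 : 4 * n < 29) : n = 7 := by
  omega
```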
> For library types, the most commonly used properties will have already been proven; you wouldn't have to write a proof that "sort" sorts a list any more than you'd have to write the sort function itself.

Sorting was just an example. Most proofs of nontrivial algorithms come with reams of hand-written proofs. Happy to see examples that go against this, though.

> The computer can also be surprisingly good at proving these things in many cases. E.g. I was playing with the proof assistant Isabelle recently, and I tried to prove that given an int, a proof that twice that int is greater than 13, and a proof that four times that int is less than 29, then that int must be 7. I thought I'd have to fiddle around with induction on Peano numbers and the like, but nope, it could be proved with just a single "by arith".

Doesn't that just fall into the Presburger arithmetic decision procedure? The problem is you can rarely tell when the proof assistant is going to complete the proof for you (e.g. you're fine if it's within Presburger arithmetic but not if it requires induction).
You don't have to write a proof if you don't want to. It's quite possible to simply write your sorting function without a proof; it's just that at that point you don't know that your function works as advertised (which of course is then just the same thing that you get in a non-dependently-typed language).

```
data Comp = Gt | Lt | Eq

sortBy : (a -> a -> Comp) -> List a -> List a
sortBy = ...
```

The above is perfectly acceptable in Idris, given that your implementation of sortBy type checks. And it makes no use of dependent types. Separately to this function, you can write a proof that it's correct (if you want). Or unit tests, for that matter.
> You don't have to write a proof if you don't want to.

I know this; I'm simply pointing out that when you do decide to write a proof, it can be a very challenging task (much harder than adding generic types to Java code, for example).
> it could be proved with just a single "by arith".

Tactics are probably a usability improvement for mathematicians who are used to proving everything by hand. But for programmers used to type inference, they're a step backwards. Mathematicians typically prove much deeper results, but they do so at a much slower rate than programmers write programs.
Huh, tactics are pretty great. Say exactly what you want and the computer programs itself!

It's unfair to just compare development time between tactic-generated programs in a dependent language and manually written programs in a non-dependent language. The end result in the dependent language is much more valuable.
I'm not comparing dependent types vs. no dependent types. I'm comparing higher-order dependent types (Agda, Idris, Coq, etc.) vs. first-order dependent types: https://news.ycombinator.com/item?id=12350147

Seriously, Coq-style proof scripts are utterly unreadable when they grow past a certain size. The only way to understand them is to replay them, so that you can see the intermediate hypotheses and goals.
> The only way to understand them is to replay them

Reminds me of the same kind of problem that arises in trying to understand the runtime behavior of nontrivial programs involving side effects, mutable state, multithreading, etc. Sometimes the debugger is the only practical way to figure out what the program is doing. Add concurrency into the mix, and things can get tricky very quickly.

Dependently-typed programming and machine-assisted proof is still an emerging thing, and there is probably lots of room for exploring ergonomic improvements. It will be interesting to see where this goes.
 Your analogy between debugging effectful programs and replaying proof scripts is frighteningly accurate.
I agree. Note that the Coq-style tactics-based approach is not the only possibility.

In Agda, you are expected to write proofs in a functional style, similar to Haskell programs. There is a small amount of integrated automation, which helps you fill in holes in your proofs — however, the results are explicit proof terms, inserted at the right place in your program.

Systems which behave in this fashion satisfy the de Bruijn criterion, as described in Geuvers (2009), “Proof assistants: History, ideas, and future”.
I had/have a similar feeling. I thought about it some time ago and came up with the following proposal:
That's mostly true, but the speed at which programmers write programs depends on how bulletproof they have to be. There have been times when, if you include the debugging and maintainability difficulty, proving things to my compiler would have been a step forward for me. There's probably a sweet spot.
> That's mostly true, but the speed programmers write programs depends how bulletproof they have to be.

Of course, but, in any case, writing programs at the same rate mathematicians prove theorems simply isn't viable.
Agreed. These days, the most popular dependently typed languages are based on some variant of Martin-Löf type theory, in which arbitrary mathematical propositions are expressible at the type level. I'm not convinced that these designs offer a good power/cost ratio. For example, these languages often have very limited type inference.

In practice, most of the properties that a programmer (rather than, say, a mathematician using a proof assistant) would want to enforce mechanically (say, invariants of data structures) are expressible as first-order statements (lambda-bound variables are never applied). This strongly suggests that the ergonomics of dependent types could be significantly improved by deliberately limiting their expressive power.

By the way, the idea isn't new: the Damas-Milner type system, which is most certainly known and proven technology, is a subset of System F-omega's type system for which type inference is possible, at the reasonable (IMO) price of not having higher-ranked or higher-kinded types. A dependently typed language with a first-order type level would benefit from reusing the basic architecture of a Damas-Milner type checker: a constraint generator and a constraint solver powered by a first-order unification engine.
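To make "first-order unification engine" concrete, here is a minimal sketch in Python (not any particular checker's implementation; type terms are ad hoc tuples, variables are strings starting with a quote, and the occurs check is omitted for brevity):

```python
# Minimal first-order unification, the core of a Damas-Milner checker.
# A type is either a variable ("'a"), an atom ("int"), or a compound
# type written as a tuple, e.g. ("->", t1, t2) for a function type.

def walk(t, subst):
    """Chase variable bindings until we reach a non-variable or an unbound variable."""
    while isinstance(t, str) and t.startswith("'") and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst):
    """Return a substitution (dict) extending `subst` that unifies t1 and t2, or None."""
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if isinstance(t1, str) and t1.startswith("'"):
        return {**subst, t1: t2}
    if isinstance(t2, str) and t2.startswith("'"):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # clash, e.g. int vs bool

# Unify 'a -> int against bool -> 'b:
s = unify(("->", "'a", "int"), ("->", "bool", "'b"), {})
print(s["'a"], s["'b"])  # bool int
```

A production solver would add the occurs check (rejecting `'a = 'a -> 'a`) and a union-find structure for speed, but the shape is the same.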
> By the way, the idea of deliberately limiting the expressive power of types isn't new: the Damas-Milner type system, which is most certainly known and proven technology, is a first-order subset of System F-omega's type system for which type inference is possible. A dependently typed language with a first-order type level would benefit from reusing the basic architecture of a Damas-Milner type checker: a constraint generator and a constraint solver powered by a first-order unification engine.

Have you seen languages like DML (lets you capture linear arithmetic properties only, and automates the proofs of these)?

https://www.cs.bu.edu/~hwxi/DML/DML.html

> Dependent ML (DML) is a conservative extension of the functional programming language ML. The type system of DML enriches that of ML with a restricted form of dependent types. This allows many interesting program properties such as memory safety and termination to be captured in the type system of DML and then be verified at compile-time.

What program properties are useful to capture, and which you can mostly automate, is a really interesting problem.
> Have you seen languages like DML (lets you capture linear arithmetic properties only and automates the proofs of these)?

Yes. But what I have in mind is a little bit more ambitious. I want dependent types restricted to be parameterized by syntactic values (in the sense of ML's value restriction) of what Standard ML calls `eqtype`s. For example, correctly balanced red-black trees would be:

```
(* clearly eqtypes *)
datatype Color = R' | B'
datatype Nat = Z | S : Nat -> Nat

datatype Tree =
  | Empty : Tree (B', Z, 'a)
  | R : Tree (B', 'h, 'a) * 'a * Tree (B', 'h, 'a) -> Tree (R', 'h, 'a)
  | B : Tree ('c, 'h, 'a) * 'a * Tree ('c, 'h, 'a) -> Tree (B', S 'h, 'a)
```
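For contrast, here is roughly what those invariants look like as a dynamic check, sketched in Python with an ad hoc encoding (None for the empty tree, otherwise a `(color, left, value, right)` tuple). The dependent Tree type above makes violating either check a compile-time type error instead of a runtime one:

```python
# Dynamic check of the two red-black invariants the Tree type encodes
# statically: (1) a red node has no red child, and (2) both children
# of every node have the same black height.

def black_height(t):
    """Return the black height of t if the invariants hold, else raise ValueError."""
    if t is None:  # empty trees count as black, height 0
        return 0
    color, left, _, right = t
    if color == "R":
        for child in (left, right):
            if child is not None and child[0] == "R":
                raise ValueError("red node with red child")
    hl, hr = black_height(left), black_height(right)
    if hl != hr:
        raise ValueError("unbalanced black height")
    return hl + (1 if color == "B" else 0)

ok = ("B", ("R", None, 1, None), 2, ("R", None, 3, None))
print(black_height(ok))  # 1

bad = ("R", ("R", None, 1, None), 2, None)
try:
    black_height(bad)
except ValueError as e:
    print(e)  # red node with red child
```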
 Any examples? I'm curious for examples of a nontrivial type that would catch lots of common programming errors where the proofs can be automated. I see lots of new dependently typed languages but not much interest in addressing the proof automation aspect.
You don't need type theory to verify program properties; people have been using first-order methods for decades to prove properties about programs. For example, there was a line of work in ACL2 that verified a microprocessor implementation, as well as plenty of modern work using tools like SMT to automatically prove program properties; see Dafny and F* for language-based approaches. There are plenty of language-agnostic approaches as well. My colleagues at UW have a paper in this year's OSDI verifying crash consistency for a realistic file system with no manual proof.
If (like me) you didn't know what the abbreviation SMT stands for: I looked it up, and it is Shiver Me Timbers http://encyclopedia.thefreedictionary.com/Shiver+Me+Timbers Seems to be some kind of pirate speak.

Only kidding! It stands for Satisfiability Modulo Theories https://en.wikipedia.org/wiki/Satisfiability_modulo_theories

“In computer science and mathematical logic, the satisfiability modulo theories (SMT) problem is a decision problem for logical formulas with respect to combinations of background theories expressed in classical first-order logic with equality. Examples of theories typically used in computer science are the theory of real numbers, the theory of integers, and the theories of various data structures such as lists, arrays, bit vectors and so on. SMT can be thought of as a form of the constraint satisfaction problem and thus a certain formalized approach to constraint programming.”

I'm not sure I'm any the wiser after that …

A notable building block appears to be SMT-LIB http://smtlib.cs.uiowa.edu/index.shtml
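As a small concrete taste, the "twice x greater than 13, four times x less than 29" puzzle from upthread can be posed to an SMT solver such as Z3 in SMT-LIB 2 syntax (a sketch; exact output formatting varies by solver):

```smt2
; Is there an integer x with 2x > 13 and 4x < 29?
; A solver answers "sat", and the model gives x = 7.
(declare-const x Int)
(assert (> (* 2 x) 13))
(assert (< (* 4 x) 29))
(check-sat)
(get-model)
```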
 Nice. SMT can be (and indeed has been) integrated into type-based approaches to program verification as well.
Yeah, I think SMT is really the state of the art right now in this area. Leo de Moura (one of the main authors of Z3) has been working on Lean for the past couple of years. There is a small group of us (~5) who have been working on a big release for the past couple of months. The goal is to bring the ease of use of SMT automation to type theory, so you can get both the ability to do induction and automation.
 I wish you guys good luck. Freedom from Coq-style proof scripts should be the goal.
I'd love to give more concrete examples, but I'm slightly hampered by the fact that the type system I want isn't actually implemented anywhere. I have a rough sketch of the design of the type system I want, and I've been looking for a computer scientist, logician or mathematician to help me polish the design and prove that it is, in fact, type safe. But I couldn't find anyone. :-|
Why not just learn Coq or Agda yourself? There are all sorts of introductions for simple type systems that I'm sure you can build off of.
 I have no idea how to implement fresh unification variable generation (my type system is a superset of Damas-Milner, kind of like how the calculus of constructions is a superset of System F-omega) in Coq or Agda. Everywhere I've seen it implemented, it involves mutable state. So, “in principle”, I could fake it using something like the state monad, and plumbing around the collection of unification variables generated so far, but in practice reasoning about this seems like a pain.
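The explicit-state alternative described above really is just threading a counter; a trivial Python sketch of the plumbing (names are made up):

```python
# Fresh unification-variable generation without mutable state:
# thread a counter through explicitly and return the updated value
# alongside each result, exactly the plumbing a state monad would hide.

def fresh(counter):
    """Return a new metavariable name together with the updated counter."""
    return f"?t{counter}", counter + 1

v1, c = fresh(0)
v2, c = fresh(c)
v3, c = fresh(c)
print(v1, v2, v3)  # ?t0 ?t1 ?t2
```

The pain the comment describes is that every function touching types must now accept and return this counter, which is precisely what makes formal reasoning about the checker awkward.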
You could take a look at Coquand (2002), “A formalised proof of the soundness and completeness of a simply typed lambda-calculus with explicit substitutions”:

https://www.dropbox.com/s/xmj7yv1i1moe0ag/Coquand2002.pdf

Drop in to #dependent on Freenode IRC, and let’s chat.
Type inference isn't the most important thing; most languages that employ type inference are moving towards feature sets that make full inference hard, if not impossible. Not to mention you can get pretty good "inference" in dependently typed programming languages just by designing an appropriate higher-order unification procedure. I have never met a daily user of these tools who finds this to be the bottleneck.

In practice you spend way more time writing proofs in these systems than doing anything else. There is evidence that we don't need to limit the expressive power of our logics, just design appropriate automation for the logic. Lean, a new ITT-based theorem prover from MSR, has been focused on building automation for type theory. It already has promising results showing that formal-methods techniques can be scaled to type theory (see the website for more information on recent work).
> Type inference isn't important, most languages that employ type inference are moving towards feature sets that make full inference hard, if not impossible.

What I'm saying is that I don't like this direction, because type inference is very important to me.

> Not to mention you can get pretty good "inference" in dependently typed programming languages just by designing an appropriate higher order unification procedure.

Such higher-order unification procedures must always be tailored towards specific use cases, since higher-order unification is in general undecidable. Effectively implementing a different type checking algorithm for every program I write doesn't feel like a good use of my time.

What I want is a type system that is more honest about how much it can help me: it only promises to enforce properties expressible as first-order statements, but offers much better inference, and the remainder of my proof obligation is hopefully small enough that I can fulfill it manually.

> In practice you spend way more time writing proofs in these systems than doing anything else.

Indeed, and that's what I'd like to fix.
I like your proposal of first-order dependent types. Shocked I haven't heard of it before now, given the advantages. I hope someone helps you build it.

On a related note, what do you think about the Shen LISP approach, where sequent calculus can be used for advanced typing? Does it have the expressibility and decidability benefits you described? I hadn't seen many specialists talk about it, so it's still an unknown to me. If it does, I figure some default type systems could be created for a specific style of Shen and/or specific libraries. Or something similar done in an ML.
 I'm not familiar with Shen, but, if I recall correctly, its type system is Turing-complete, so it's probably more complicated than I'd be comfortable with.IMO, the topmost priority in a type system designed for programmers ought to be automation, not expressiveness. Programmers don't spend their time proving deep theorems. Most of what a programmer does is actually fairly boring, but it ought to be done correctly, reliably and within a reasonable time frame.
I totally agree. They're not going to use it if we make them work for it, neither hard work nor even light work. It needs to be no harder than basic annotations, and preferably closer to existing static typing. Let them focus on functions instead of proofs.
> Proving F matches the type is arbitrarily hard and is impossible to automate in general

So what?

Even proving that array indices are not accessed out of bounds within loops would be a considerable leap in the state of the art of industrial programming languages. For most cases (i.e. linear integer arithmetic), that's always automatically (dis-)provable. We have to start somewhere...
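Indeed, the in-bounds obligation for a typical loop is itself a linear-arithmetic statement; in Lean 4, for instance, the `omega` tactic discharges it automatically (a sketch):

```lean
-- If the loop index satisfies i < n, and the array length satisfies
-- n ≤ len, then the access at index i is in bounds: i < len.
example (i n len : Nat) (hi : i < n) (hn : n ≤ len) : i < len := by
  omega
```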
> Even proving that array indices are not accessed out of bounds within loops would be a considerable leap in the state of the art of industrial programming languages. For most cases (i.e. linear integer arithmetic), that's always automatically (dis-)provable. We have to start somewhere...

It's a super interesting and important area; I was just pointing out it's a challenging one with lots of open issues.
 The only time I remember using indexed list access in Haskell was on a dirty script that would break in 3 months anyhow. They are just ugly, and throw a huge amount of safety away, for no big gain at all.Mainstream languages will gain much more from better iterators than by a ton of effort that just proves your indexes are within bounds.
 While I agree that better iterators are desirable, iteration is far from the only operation on arrays (or any other collection for that matter).
To some degree, you can choose what you want to prove about your algorithms in those languages.

It's perfectly possible to implement a sorting algorithm without proving that it actually sorts the input, or indeed returns a permutation of the input list at all.

In that sense, you can choose how much effort you want to put into it.
> To some degree, you can choose what you want to prove about your algorithms in those languages.

> It's perfectly possible to implement a sorting algorithm without proving that it actually sorts the input, or indeed returns a permutation of the input list at all.

Sure, but my point is that any nontrivial property requires the programmer to write difficult formal mathematical proofs, which is an activity completely unlike what you see in mainstream strongly typed languages (e.g. Haskell, OCaml).

Also, consider what happens when you start making use of generics in a type system by converting all your Object lists to String lists: you find you have to start adding generic types all over the place, in any code that touches that code, until the compiler errors stop. If you decided to capture permutations of input lists in a dependently typed language, the exact same thing happens, except this time capturing that property for all your algorithms could be a monumental task of writing complex proofs (unlike before, where you're just labelling things).
> Sure, but my point is that any nontrivial property requires the programmer to write difficult formal mathematical proofs which is an activity completely unlike what you see in mainstream strongly typed languages (e.g. Haskell, OCaml).

So don't write the proofs then. Dependently typed languages don't require you to write proofs; they give you the ability to if you choose. Considering that it's impossible to write such proofs in OCaml or Haskell, it seems strange to complain that they're hard to write in a dependently typed language. "Hard" is strictly less difficult than "impossible".
> Considering that it's impossible to write such proofs in OCaml or Haskell, it seems strange to complain that they're hard to write in a dependently typed language.

My sole point is that when I see articles on here about dependently typed languages, the issue of how the proofs are generated, and the challenges involved in doing this, is overlooked almost every time. I think it's important that these challenges are highlighted, so people don't think dependent types are a magic wand that gives you safety with little effort.
 The article was talking about the differences in power between different type systems. It's not even primarily focused on dependent types, let alone advocating for using them or claiming that they're easy. In fact it notes that the more advanced type systems "make the language more complex." If the article were suggesting everyone should use dependent types, or that they were easy, then your admonishments about dependent type systems would make sense here, but as it is I'm not sure.
The Idris example seems to need further explanation:

> In Idris, we can say "the add function takes two integers and returns an integer, but its first argument must be smaller than its second argument":

> add : (x : Nat) -> (y : Nat) -> {auto smaller : LT x y} -> Nat

> add x y = x + y

That's all well and good if you know the values of x and y at compile time. Consider a program that reads x and y from STDIN. The user could provide an x that is equal to or larger than y (or could provide only one value, or values that are not numbers). I see no way to deal with that except to throw a runtime error. Is that what would happen?
The missing explanation is that `smaller` is a third argument to the function. Its type is a proof that x < y. Since it is in curly brackets with the auto keyword, the compiler will fill in this proof in many cases, like when the values are statically known.

In the case where the values are read from the external environment, first you would have to compare them before calling the function. The comparator would return either a proof that x < y, or a proof that x >= y. In the first case you plug that value into the `smaller` argument; in the second case it's your responsibility to signal whatever kind of application-specific error is appropriate (assuming x >= y is some kind of erroneous condition).
Though the comparator cannot return just a boolean, because it could be 'wrong'. This is where the dependent constructs come in, to make sure the answer is correct.
 What would happen if you did not compare them before calling the function, and just passed them in such that x >= y?
 Nothing; it would return a partially applied function that waits for a proof that `x < y` before continuing. (Trying to use that function as an integer would be a type error).The key here is that `add` is not a function `Nat -> Nat -> Nat`, it is a function `(a : Nat) -> (b : Nat) -> LT a b -> Nat`. There are just some compiler features that allow you to avoid having to write out the full `LT a b` proof each time you want to add things!
What's even more interesting is the structure of "LT a b". Let's assume that Nat has the common Peano arithmetic structure:

```
data Nat : Type where
  Zero : Nat
  Succ : Nat -> Nat
```

Here "Zero" represents zero and "Succ" represents "one more than". Hence "Succ (Succ (Succ Zero))" is one more than one more than one more than zero, AKA three.

Given two such values "x" and "y", how on Earth can we prove that "x < y"? What would that even look like?

Well, there's a really obvious case: we know that "x < Succ x", so we can define a value "Obv" which represents this:

```
Obv : (x : Nat) -> LT x (Succ x)
```

Notice that we don't write "y" explicitly, since it can be written in terms of "x".

What about those cases where "x" and "y" differ by some other amount? We could define values for "x < Succ (Succ x)", "x < Succ (Succ (Succ x))", and so on, but that's clearly redundant and inconvenient. Instead, we just need to spot that all of these fit the pattern "x < Succ y", where "x < y". This lets us argue by induction: if "x < y", then "x < Succ y", which we can represent using a value "Ind" like this:

```
Ind : (x : Nat) -> (y : Nat) -> LT x y -> LT x (Succ y)
```

This is enough to prove that "x" is less than any number greater than "x". In fact, we don't need to give the values of "x" and "y" explicitly, as they appear in the "LT x y" types and can hence be inferred. In languages like Idris we indicate inferrable parameters using braces, hence our "LT" type looks something like this:

```
data LT : Nat -> Nat -> Type where
  Obv : {x : Nat} -> LT x (Succ x)
  Ind : {x : Nat} -> {y : Nat} -> LT x y -> LT x (Succ y)
```

Now, this looks familiar. Compare it to the definition of "Nat": we have two constructors, one of which ("Zero"/"Obv") can be written on its own, whilst the other ("Succ"/"Ind") is recursive, requiring an argument containing the same constructors.

Values of type Nat include:

```
Zero : Nat
Succ Zero : Nat
Succ (Succ Zero) : Nat
```

And so on, whilst values of "LT x y" include:

```
Obv : LT x (Succ x)
Ind Obv : LT x (Succ (Succ x))
Ind (Ind Obv) : LT x (Succ (Succ (Succ x)))
```

Although LT contains more static information than Nat, it actually follows exactly the same structure. What does this mean? Values of type "LT x y" are numbers; in particular, they're the difference between "x" and "y"!

In the case of "LT", these numbers start counting from one (since one number is not less than another if their difference is zero). If we define a similar type for "x <= y" it would count from zero, and the same can be done for ">" and ">=".

These types are actually really useful. They're also closely related to linked lists, vectors, etc. although they store dynamic information as well as static.

I assume this is all old hat to those with mathematical training, but I found it interesting enough to write about at http://chriswarbo.net/blog/2014-12-04-Nat_like_types.html
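The "proofs are numbers" observation can be made concrete with a throwaway Python model of the proof terms (tuples standing in for the Idris constructors; entirely illustrative):

```python
# Model the LT proof terms: Obv proves x < Succ x, and ("Ind", p)
# proves x < Succ y given p, a proof of x < y. Counting the Ind
# wrappers (plus one for Obv) recovers the difference y - x that
# the proof term encodes.
Obv = "Obv"

def ind(p):
    return ("Ind", p)

def gap(p):
    """The difference y - x encoded by a proof p of LT x y."""
    if p == Obv:
        return 1
    tag, inner = p
    assert tag == "Ind"
    return 1 + gap(inner)

print(gap(Obv))            # Obv : LT x (Succ x), so the gap is 1
print(gap(ind(ind(Obv))))  # Ind (Ind Obv) : LT x (Succ (Succ (Succ x))), gap 3
```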
 You can't because the function takes three arguments: an x and y of type integer, and a "z" of type "proof that x < y".This is the "types as propositions, proofs as instances" Curry-Howard correspondence lightbulb: The type "x < y" is a proposition, and if you can find any z at all of that type, then z is a proof of the proposition. So by passing in such a z that has been verified to have that type, you've certified that x < y.
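The correspondence is easy to see in a language like Lean 4, where a proposition literally is a type and a proof is a term of that type (a sketch):

```lean
-- Each `example` states a proposition as a type; the term after `:=`
-- is the inhabitant z that certifies it.
example : 2 + 2 = 4 := rfl
example : 2 < 7 := by decide
```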
> What would happen if you did not compare them before calling the function, and just passed them in such that x >= y?

It would be a type error. All the compiler would know is that "x" and "y" are strings/integers, so it would tell you they were the wrong type. If you add a branch checking that they are the correct type, then in that part of the branch the compiler will know they are the correct type and allow "add" to be called.

Think about a Java program that takes a string input, converts the string input to an integer type and then passes this to a function that accepts integers only. If you just tried to pass the string to the function directly, the compiler wouldn't allow it. Same thing, but the type system is more expressive.
 Assuming they were integers before, what type would they become after they were compared?
It's not that they change type. It's that the comparison function returns either a proof that one is greater than the other, or that they are equal. When you are in the right branch, you can pass that proof (type) along with the value into other functions.
Here's a full code example:

```
import Data.String

-- takes two integers, and a proof that x < y, and yields an integer
add : (x : Integer) ->
      (y : Integer) ->
      (prf : x < y = True) ->   -- require a proof that x < y
      Integer
add x y prf = x + y

main : IO ()
main = do
  sx <- getLine                 -- read string from input
  sy <- getLine                 -- read string from input
  let Just x = parseInteger sx  -- assuming int parse is ok, else error
  let Just y = parseInteger sy  -- assuming int parse is ok, else error
  case decEq (x < y) True of    -- decEq constructs a proof if x < y is True
    Yes prf => print (add x y prf)
    No _ => putStrLn "no prf, x is not less than y"
```

Let's say I mess up the sign of the comparison on the case line and write decEq (x > y) instead... then I'd get a type error:

```
When checking argument prf to function Main.add:
        Type mismatch between
                x > y = True (Type of prf)
        and
                x < y = True (Expected type)
```

There's no way to construct the prf value artificially, or sneak in different parameters that are unrelated to the prf value. It's either a compile error or it's valid.
 that is pretty awesome
 An existential type - "some unknown types x (a subtype of integer) and y (a subtype of integer) for which LT x y" (or else the other branch). Languages designed for these techniques generally make it easier to write those types than it is in say Java (and in some languages it would be impossible to write that type at all) and infer them so you're not constantly writing them, though there's usually a way to express them directly/explicitly if you need to.
Maybe an analogy to a more mainstream language helps. I like using Kotlin these days:

http://kotlinlang.org/

which has nullable types. Imagine this code:

```
fun useMessage(msg: Message) { .... }

val s: Message? = someSocket.readNextMessage()  // Returns null if the socket has been closed.
useMessage(s)
```

In Kotlin this would be a type error: the type of 's' is Message?, and the question mark means it's possibly null. It won't compile. You can fix it by doing this:

```
if (s != null) useMessage(s)
```

The act of testing 's' restricts its type inside the if block: we've proven it's not null, therefore the compiler will now accept this proof as evidence that the code is safe.

Idris isn't quite the same, because it's a lot more general and the proofs are explicit instead of being implicit in the control flow: Idris types don't change when you test them; you get given a proof 'object' instead. But the basic idea is the same: your program does something that proves something about the type of a runtime value, and that data can be used to improve program correctness.
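The same null-test narrowing can be sketched in Python under a checker like mypy, where an explicit None test narrows an `Optional[str]` (`read_next_message` here is a made-up stand-in for the socket call):

```python
from typing import Optional

def use_message(msg: str) -> str:
    # Only accepts a definite str, like useMessage in the Kotlin example.
    return msg.upper()

def read_next_message(closed: bool) -> Optional[str]:
    """Hypothetical stand-in for someSocket.readNextMessage()."""
    return None if closed else "hello"

s = read_next_message(closed=False)
# Here s has type Optional[str]; passing it straight to use_message is
# rejected by a static checker such as mypy. The None test narrows it:
if s is not None:
    print(use_message(s))  # within this branch, s has type str
```

As in Kotlin, the narrowing lives in the checker's flow analysis rather than in an explicit proof object.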
> That's all well and good, if you know the values of x and y at compile time. Consider a program that reads x and y from STDIN. The user could provide an x that is equal to or larger than y (or could provide only one value, or values that are not numbers). I see no way to deal with that except to throw a runtime error. Is that what would happen?

Before you could call "add", you would have to add a branch/condition to verify that "x" and "y" met the pre-conditions. The compiler would then know, from the context of where you called "add", that the pre-conditions were always met. As you couldn't guarantee the pre-conditions were always met, you would have a code branch that decides what to do when the pre-conditions aren't satisfied.

The important difference from other languages, though, is that the compiler enforces that you're never allowed to call "add" unless you know the inputs satisfy the pre-conditions.
Not in Idris, because it would force you to do that check manually and pass in the evidence as proof. It has a bunch of tools to help you do this, namely lifting values into types with 'dependent pairs'. Idris doesn't automagically calculate everything for you, but makes sure at compile time that you provide evidence at run time for the things you have asserted in the types. It makes more sense after going through some of the tutorials and books.
I do NOT know anything about Idris, but I'd imagine you have to do a check along the lines of `if (x < y)` before calling the function.
 > Consider a program that reads x and y from STDIN.A program like this should fail to compile! You can't feed two arbitrary numbers from stdin to the "add" function above. In order to make it compile, you'd need to add some logic that uses min/max to swap the values so x is less than y.
    x = readFromStdin
    y = readFromStdin

    if (x < y) {
      // in this scope, you have proven x < y
    } else {
      // in this scope, you have proven x >= y
    }
 x and y do not have to be known at compile time. If I let y be a random Nat, and x be y - 1 (or 0 if y == 0), then I can statically prove that LE x y.
Decent overview of many concepts, but the opening line isn't strictly true:

> A type is a collection of possible values.

A type is a proposition, and "this binding has one of these possible values" is merely one kind of proposition. So a type is much more powerful than a simple set-based interpretation, even though this is how most people think about it.

For instance, in Haskell you can encode a region calculus that ensures prompt reclamation of file handles and other scarce resources [1]. The relations between nested regions needed to ensure correctness seem awkward to interpret as a set, but it's straightforward if you just think of types as a relation.
(Static) types may be seen as propositions, but this is just one interpretation of them. We can also interpret types as sets (although this doesn't work out very well except in very simple languages), as domains (most commonly), as cpos, as relations over closed terms (PERs), and so forth.

The types-as-propositions interpretation is a particularly useful one because it lets us take ideas from the field of formal logic and apply them to programming. But it is only one interpretation, and not a definition of what types are.

I don't think it's possible to give "type" a good, short definition. For me, John Reynolds' definition comes closest: "Type structure is a syntactic discipline for enforcing levels of abstraction." But that's rather a mouthful, and not a good introduction to what types are.

> A type is a proposition, and "This binding has one of these possible values" is merely one type of proposition.

The type that an expression or variable has does not correspond to the proposition that that expression has a certain set of possible values! You're mixing the object level (the language and its types) with the meta level (assertions about the language). This is a total misunderstanding of Curry-Howard. Rather, the values an expression might take on can be interpreted as proofs of the corresponding proposition.
> The types-as-propositions interpretation is a particularly useful one because it lets us take ideas from the field of formal logic and apply them to programming. But it is only one interpretation, and not a definition of what types are.

Ehhh, it kind of is a definition. Think of it this way: a type system is a topology we lay over programs, with open sets representing observable (using finite computation) properties and closed sets representing deniable (using finite computation) properties. The finest such topology possible is the Scott topology, which can distinguish every single observable value and deniable property from every other: it encodes all the information you can have about a program without solving the Halting Problem.

Type systems, then, are coarser topologies over programs than that one, which group observably different expressions and traces into the same open sets, which are then called "types". When those types are guaranteed not to contain _|_, the divergent program, they can also be considered topological spaces, with the programs in them acting as points in the topological space.
> (Static) types may be seen as propositions, but this is just one interpretation of them. We can also interpret types as sets (although this doesn't work out very well except in very simple languages), as domains (most commonly), as cpos, as relations over closed terms (PERs), and so forth.

Right, I wasn't advancing the propositional definition as the only one, merely as one that sufficiently captures the generality, because the article's current definition of types-as-sets is insufficiently general.
Under HoTT's take on propositions as “types where any two inhabitants are equal”, “types as propositions” is even less general than “types as sets”.

As I stated in a comment on that gist, what types denote varies from one type system to another: https://gist.github.com/garybernhardt/122909856b570c5c457a6c...
No one except mathematicians and type theorists are using HoTT, where every programmer in existence is using first-order and sometimes second-order logic. I think it should be obvious which is more appropriate if this is supposed to be an introduction.

The idea that types express propositions in first- or second-order logic about a program one is writing is sufficiently general for every programmer to understand the proper scope of type checking. Types-as-sets is simply false as an introductory concept.
> No one except mathematicians and type theorists are using HoTT,

Right.

> where every programmer in existence is using first-order and sometimes second-order logic.

I'm not so sure about this one. Perhaps by “programmer” you mean “every Haskell programmer”? Even then I'm still not sure. Are you counting “let's approximate `forall x. P(x)` with the large but finite conjunction `P(a) /\ P(b) /\ P(c) ...`” as “using first-order logic”?
I mean pretty much every programmer, even C programmers. Simple business logic is rife with such propositions. Note that I didn't say they use first-order logic to encode invariants in types or anything, but they obviously use it everyday to write programs.

Haskell programmers also use higher-order logic, which is also common now in Java and C#.
> Note that I didn't say they use first-order logic to encode invariants in types or anything, but they obviously use it everyday to write programs

I wasn't talking about types either. Most programmers I've talked to simply can't use formal logic.
Hmm, I don't think you're being very charitable. That these programmers can write working programs suggests they sufficiently understand disjunction, conjunction, negation, conditionals and quantification.

Perhaps they have trouble abstracting the semantics from their preferred language and/or syntax, but that doesn't mean they don't grasp their meaning.
 I offer this thread as evidence: https://news.ycombinator.com/item?id=12342583
Hey, it's not the fault of the programming masses that some evil motherfucker conflated implementation extension (inheritance) and subtyping.

Subtyping = "A <: B, therefore I can start with an A, forget the details irrelevant to B, and just use it as a B."

Inheritance = "A inherits B, if I try to actually use it as a B, it promises to work. However, it might actually blow up in my face, because it was keeping its extra details secret from me and promising they wouldn't blow up."
> programming masses, some evil (bleep)

Too harsh. Logic is hard for humans. Evolution has trained us to take lots of mental shortcuts, not to be perfect calculating machines. Even the most technically competent people rely on their intuition. What differentiates them from the rest of us is that they have trained their intuition to make it more reliable [0].

It's the job of a language designer to arrange things so that intuition doesn't mislead programmers regarding the meaning of language features. Before proposing a new feature, the designer must anticipate how programmers will think about code that uses this feature, and there had better not be a mismatch between what programmers think the code means and what the code actually means.

> A inherits B, if I try to actually use it as a B, it promises to work.

There's nothing wrong with implementation inheritance per se [1]. It addresses a very common use case: you have a thing Foo, and you want to produce another thing Bar that's similar to Foo, with minor changes here and there. This describes accurately how many things work in real life. However, you can't assume that Bar can always be used where Foo is expected. Those “minor changes here and there” might have rendered Bar incompatible with Foo. Hence, “inheritance is not subtyping”. As stated above, it's the job of a language designer to anticipate this.

[1] Other than the lack of mathematical elegance, but few things are mathematically elegant in life.
> Hence, “inheritance is not subtyping”. As stated above, it's the job of a language designer to anticipate this.

Yes, it is their job, by not allowing subclasses to be used as subtypes.
Some subclasses can be safely used as subtypes. For example, those that meet the following two conditions:

(0) Not directly mutating superclass fields. (Reading them is fine.)

(1) Not overriding non-abstract methods of the superclass.
If classes are types, subclasses should be subtypes -- so the above restrictions should just be applied to subclass relationships.

One could also have a non-subclassing mechanism of implementation sharing, which might be declared something like this:

``````
class Related does (method1, method2) like Base;
``````

or

``````
class Related does all except (method3) like Base;
``````
> If classes are types, subclasses should be subtypes

Disagree. C++ added private inheritance for a good reason. There are scenarios where you want to inherit part of an implementation, but override some of the behaviour for code reuse, without inheriting any kind of subtyping relationship.

There are also scenarios where you want to declare a subtype without inheriting any parent behaviour.

So ultimately subtyping is distinct from subclassing.
 That syntax is too heavyweight (IMO). AFAICT, most of the time, when people use inheritance, they don't want to define perfect subtypes. They just want to reuse as much already implemented behavior as possible.
> That syntax is too heavyweight (IMO).

I'd be fine with something more compact for the default case of "steal all behavior except what is explicitly overridden", such as:

``````
class Related like Base;
``````
(0) Do you have a use case for cherry-picking which methods are inherited? If not, it seems like too much complexity to me.

(1) Do you think there would still be a use case for “extends”? If not, I'd rather drop it as well.
> Do you have a use case for cherry-picking which methods are inherited?

The most obvious is that it makes intent explicit in the case of multiple similarity. (I prefer "similarity" to "inheritance" for non-subtyping implementation reuse.)

It might in some cases make sense to do this at an interface level rather than (or in addition to) the single-method level; there's probably a lot to work out in the ergonomics of non-subtyping implementation reuse.

> Do you think there would still be a use case for “extends”?

Sure, classical subclassing (subtyping with implementation sharing) remains an important use case. It might be desirable to statically verify certain guarantees (some have been suggested in at least one subthread of this thread by another poster) to assure that the subtyping is reasonable, but even without such verification the use case remains important.
> The most obvious is that it makes intent explicit in the case of multiple similarity

Keep in mind that similarity is explicitly intended to be a quick-n-dirty form of code reuse. If the programmer has enough time to refactor the code, he or she should probably factor out the common parts, and then use proper (subtyping) inheritance. In view of this, I don't think making similarity overly complicated is a good idea.

> Sure, classical subclassing (...) remains an important use case. (...) even without such verification the use case remains important.

It's precisely the lack of such verification that destroys the guarantees you can extract from Liskov's substitution principle.
> If the programmer has enough time to refactor the code, he or she should probably factor out the common parts, and then use proper (subtyping) inheritance.

I do see the attraction of the idea that all implementation sharing ultimately reflects something that can be expressed through a subtyping hierarchy adhering to the LSP, but I'm not 100% convinced that it is the case. Absent certainty on that point, I'd like to have a model of similarity that is workable when viewed as an ultimate, rather than interim, model.

Also, I think that this kind of explicitness (with similar syntax) is desirable for subtyping inheritance in languages that support multiple inheritance, so it keeps similarity in line with inheritance, with the distinction only in the relationship being expressed (so, while it may make similarity more complex, it keeps the whole language more simple but explicit.)

> It's precisely the lack of such verification that destroys the guarantees you can extract from Liskov's substitution principle.

I agree that static verification of sanity in subtyping inheritance reduces the problem of LSP violations (as, frankly, does non-static verification, such as testing frameworks using subtype relationships to automatically apply tests for supertypes to subtypes.)

The degree and type of verification (as well as other details like cherry-picking and support for multiple similarity) that is appropriate for these kinds of things really depends on the language. Extending Perl or Python (dynamic languages with multiple inheritance) to support similarity alongside inheritance is probably going to favor different tradeoffs than extending Java (static with single inheritance) to support similarity.
> I agree that static verification of sanity in subtyping inheritance reduces the problem of LSP violations

The LSP itself is never violated in a type-safe language. What's violated is the guarantees that you expected the LSP to buy you. For instance, Java doesn't promise anything about the meaning of non-final methods. If a programmer assumes such guarantees, the problem is with the programmer, not Java. If he wants such guarantees to hold, he has to use a different language.
 Very true. Mixin delegation is the thing we actually want here.
> Yes, it is their job, by not allowing subclasses to be used as subtypes.

That's probably too extreme. Subtyping should just be explicitly declared rather than implicit when subclassing.
That's definitely not being charitable. Subtyping has some pretty subtle interactions with other typing features, which is why it's still being studied more than 20 years after Cardelli's "Subtyping Recursive Types".

The pernicious conflation of subtyping and subclassing in common OO programming languages just compounds the confusion.
> For me John Reynolds' definition comes closest: "Type structure is a syntactic discipline for enforcing levels of abstraction." But that's rather a mouthful, and not a good introduction to what types are.

The nice thing about this definition is that it reveals why reflection renders a type system useless: it destroys the type system's ability to enforce abstractions.
Well, there are several ways to think about types, especially regarding this article, which tries to cover static and dynamic types.

Most formal definitions/treatments of types I've come across do not apply to dynamic types at all; usually the typing formalism says nothing about dynamically typed programs/values other than giving them one big recursive type. What we would informally call "dynamic types" are then treated completely separately, e.g. via "tags".

While such distinctions are certainly useful, an introductory/broad-overview explanation like this seems to benefit without such complications/distractions.
> While such distinctions are certainly useful, an introductory/broad-overview explanation like this seems to benefit without such complications/distractions.

I don't think misrepresenting types is the way to go. An introduction like I just gave seems simple enough, where a type is a proposition about a program, and you then just explain that most such propositions are "variable X may take on values from set Y".

It's important not to perpetuate the false notion that simple sets of values suffice to explain types, because it entails the common and equally false notion that full test coverage in a dynamically typed language suffices to match the safety of a statically typed language.
> > There are several ways to think about types.

> I don't think misrepresenting types is the way to go. An introduction like I just gave seems simple enough, where a type is a proposition about a program...

chriswarbo's point is: That's one way to think about types. It's not the only way, or even the only valid way. If you're going to disagree with chriswarbo, simply ignoring his/her point and reiterating your own isn't a good way to do it.

Can you demonstrate that your definition is the only valid way to think about types? Can you demonstrate that all other ways are subsets of your way? Those would be useful replies that would actually answer chriswarbo.
 I never claimed it was the only valid definition, but the current set-theoretic definition is simply wrong. The propositional definition is at least right (and fairly common), and isomorphic to other definitions.
Is it set-theoretic though? The article states:

> A type is a collection of possible values.

"Collection" doesn't necessarily imply "set"; perhaps the author chose this word on purpose to avoid such complaints? Would you also complain if a Web page giving an informal introduction to topology said "a space is a collection of points"? Keep in mind that in HoTT, types are spaces and points are values, and hence those two statements have similar meanings.

I know when I'm writing about technical topics, I often look for words which don't have a widely-used technical definition, since that frees me up to focus on what I want to talk about, rather than getting bogged down in the particulars of this-or-that preconceived notion. I could certainly imagine myself wanting to get across the idea of types "containing" values, and carefully choosing the word "collection" specifically to avoid the baggage associated with the word "set". Maybe the author did the same?

In such cases, I would much rather someone point out an existing field which is the same as what I describe ("Your idea of types as 'collections of values' sounds a lot like HoTT's idea that types are spaces of points"), rather than an existing field which is not what I describe yet uses a similar word ("Your use of the word 'collection' sounds a lot like 'set', which is a different theory without much relevance here").
> "Collection" doesn't necessarily imply "set"; perhaps the author chose this word on purpose to avoid such complaints?

I think it's still wrong, but let's get concrete with the more unusual type systems to check. What is the "collection of values" for a type in TyPiCal [1]?

What is the "collection of values" for substructural types, like linear types [2] or affine types [3]?

I don't think "collection", set-based or otherwise, can capture these types.
Didn't know about TyPiCal. Like the properties it can prove. Thanks for the link!
It's also simple enough to direct readers to note that types refer to expressions, not values. It's only through later evaluation relations that those expressions and their values end up sharing types.
 Types can be seen both as collections of values and as collections of expressions. (I'm defining a “value” as “something the operational semantics of the language lets you substitute a variable with”.) In a call-by-value language, the former is embeddable in the latter. In a call-by-name or call-by-need language, the two coincide.
 I disagree that it's a good idea to think of types as a "collection" (loosely) of values. It may be the case that you do not distinguish computation and value (here, I'm thinking of CBPV) but types only classify values as interpreted via their embedding into expressions.
 If a type couldn't be regarded as a collection of values in a strict language, then induction on datatypes would simply be unsound. Of course, induction on datatypes is unsound in Haskell, but it's sound in Standard ML.
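The Haskell caveat comes from laziness: a datatype also contains infinite and partial values, which no finite chain of constructors reaches. A small illustration (`omega` and `countDown` are invented names for the sake of the example):

```haskell
data Nat = Z | S Nat

-- In a strict language like Standard ML, every value of Nat is built
-- from finitely many constructors, so induction over Z and S covers
-- all of them. In Haskell, this is also a perfectly good Nat:
omega :: Nat
omega = S omega

-- countDown terminates on every finitely-built Nat...
countDown :: Nat -> Int
countDown Z     = 0
countDown (S n) = 1 + countDown n

-- ...but `countDown omega` loops forever, so a property "proved by
-- induction" (e.g. that countDown terminates) does not actually hold
-- for all Haskell values of the type.
```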
> If a type couldn't be regarded as a collection of values in a strict language, then induction on datatypes would simply be unsound.

Just because induction on some types is sound, it does not follow that induction on all types is sound. Indeed it's not in general, even though it may be in specific languages.
(0) I didn't say all types. I said datatypes.

(1) Induction on datatypes is a powerful tool for reasoning about programs in strict languages, and it's only possible when you regard datatypes as collections of values.
 Then data types are a restricted notion of types for which induction applies, so we agree that types in general are not a collection of values.
Every type in a call-by-value language determines two collections: one of values and another of expressions, the former embeddable in the latter. This is plainly obvious when you see its operational semantics.
 Not all types need to have induction principles. The ones that do are nicer, of course, precisely because they correspond to values.
> An introduction like I just gave seems simple enough

Simple enough for what?

And I don't see why a bit of pedantry about Haskell's type system requires rewriting an introduction that's expressive and actually simple, not just "simple enough" from a certain point of view.
 It's not pedantry, it's a basic definition. There are plenty of languages with stronger type properties that can't be captured by types-as-sets. While the current introduction may be simple, it's also wrong.
And of what consequence is this purported "error" to the reader of this article? Would you edit a children's book about zoo animals from "This is a lion" to "This is a photograph of a lion printed onto paper"? What pedagogic purpose does your correction serve?

We teach Newton's physics to children (and even many adults) even though it has been supplanted by later theories, not because we don't know what the correct answer is, but because Newtonian physics is much easier to grasp and because it still gives a useful view of the world around us, even if it breaks down in the extremes. Why is this any different?
Formalism aside, it seems like it's most useful in practice to think of type as defining what can be done with a value. This sidesteps the idea of type being something permanent and inherent to a value. In practice, the type might be metadata discoverable and actionable at run-time, it might be something that exists at compile-time to determine correctness of a program (based on how values are used), or both.

Does this point of view leave out anything crucial?
> Formalism aside, it seems like it's most useful in practice to think of type as defining what can be done with a value.

That might be a simpler way of explaining it, since "proposition" is rather abstract. Or perhaps: a type defines the valid state transitions your program can undergo at any given point, i.e. given the data in your program (the program state S0), the types of all of that data define a set of valid transitions to another data set (program state S1).
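One concrete reading of "types define the valid state transitions": encode the state in a phantom type parameter, so that invalid transitions fail to type check. A hedged Haskell sketch (`Conn`, `send`, etc. are invented names):

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

-- States a connection can be in, lifted to the type level.
data ConnState = Open | Closed

-- The phantom parameter s records the current state; no state exists
-- at runtime, only in the type.
newtype Conn (s :: ConnState) = Conn { connName :: String }

openConn :: String -> Conn 'Open
openConn = Conn

-- Sending is a transition that is only defined from the Open state.
send :: Conn 'Open -> String -> Conn 'Open
send c _ = c

closeConn :: Conn 'Open -> Conn 'Closed
closeConn (Conn n) = Conn n

-- send (closeConn c) "hi"  -- rejected: Conn 'Closed is not Conn 'Open
```

Here S0 and S1 from the comment above are literally the type indices, and the set of functions accepting `Conn s` is the set of valid transitions out of state s.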
> merely one type of proposition

Shouldn't that read "one kind of proposition"? :p
``````
1 + eval(read_from_the_network())
``````

> If we get an integer, that expression is fine; if we get a string, it's not. We can't know what we'll get until we actually run, so we can't statically analyze the type.

> The unsatisfying solution used in practice is to give eval() the type Any, which is like Object in some OO languages or interface {} in Go: it's the type that can have any value. Values of type Any aren't constrained in any way, so this effectively removes the type system's ability to help us with code involving eval. Languages with both eval and a type system have to abandon type safety whenever eval is used.

No, you can give EVAL a return type of T (anything) but still make type inference work:

``````
(+ 1 (eval (read)))
``````

The meaning of a type in SBCL is: if an expression evaluates without error, then its type is ... In other words, the type of the above expression is NUMBER.

There is also a NIL type (bottom) which represents the empty set of values. If a function has a return type of NIL, it means that it does not return a value normally. This is the case for the ERROR function, or the (LOOP) expression.
> if an expression evaluates without error, then its type is…

What’s curious is that this is true in every language with general recursion/looping:

``````
int f(int n) {
  while (true) {}
  return n + 1;
}
``````

(In other words, the type of “f” isn’t the logical formula “a → a”, but rather “a → ¬¬a”.)

The type of “eval” can also be fully static; it can just return a type that you have to do case analysis or runtime type inspection on to use.

In Haskell, with a data type:

``````
data Value = Integer Int | Text String | ...

eval :: String -> Value

foo = do
  value <- read_from_the_network
  let result = eval value
  case result of
    Integer n -> return (1 + n)
    _ -> error "that was not an integer!"
``````

Or with RTTI:

``````
eval :: String -> Dynamic

foo = do
  value <- read_from_the_network
  let result = eval value
  case fromDynamic result of
    Just n -> return (1 + n)
    Nothing -> error "that was not an integer!"
``````

“fromDynamic” has the type:

``````
Typeable a => Dynamic -> Maybe a
``````

Which is to say “given a dynamic value, and its runtime type information, I can either give you the value back with a static type, or nothing if I can’t perform the downcast”—much like “dynamic_cast” on pointer types in C++.
> What’s curious is that this is true in every language with general recursion/looping:

Yes, thanks for pointing that out.

> The type of “eval” can also be fully static; it can just return a type that you have to do case analysis or runtime type inspection on to use.

I'd say that's what type T stands for in Lisp, since you can then dynamically dispatch on the actual type of the value.
> The type of “eval” can also be fully static; it can just return a type that you have to do case analysis or runtime type inspection on to use.

Yes! The problem is, we don’t have a static type system which would allow us to intensionally analyse syntax at runtime. See section 9 of Sheard (2001), “Accomplishments and research challenges in meta-programming”: https://www.dropbox.com/s/pathrmr7rufkw1d/Sheard2001.pdf

If you’re interested in this subject, perhaps drop in to #dependent on Freenode IRC, or ping me on Twitter.
``````
add : (x : Nat) -> (y : Nat) -> {auto smaller : LT x y} -> Nat
add x y = x + y
``````

Actually, (GHC) Haskell can express something like this type if you turn on the required extensions.

``````
data Nat = Z | S Nat
  deriving (Eq, Ord, Show)

data SNat (n :: Nat) where
  SZ :: SNat Z
  SS :: SNat n -> SNat (S n)

type family Compare (a :: Nat) (b :: Nat) :: Ordering where
  Compare Z     Z     = EQ
  Compare (S a) Z     = GT
  Compare Z     (S a) = LT
  Compare (S a) (S b) = Compare a b

type a < b = Compare a b ~ LT

type family (+) (a :: Nat) (b :: Nat) :: Nat where
  Z     + a = a
  a     + Z = a
  (S a) + b = S (a + b)

add :: (a < b) => SNat a -> SNat b -> SNat (a + b)
add = ...
``````
Since this article is about types, any behaviour the compiler doesn't check or generate a type error for is "allowed".
The issue with that idea is that there is more than one C compiler. Some of them issue no warnings at all, and some of them (like `gcc`) will yell at you and probably break your code if you attempt to do some of the things I outlined.

I disagree with saying it is 'allowed' by the type system, because a smart C compiler could throw out your code and refuse to compile it if you do such things, and in doing so would still be within the C standard. Some work has already been done on this front in `gcc` (and I would assume `clang`) - the issue tends to be that by doing this, you end up breaking some older code that used to work fine. And since older compilers didn't enforce these rules, people aren't as aware of them.
 In theory, languages have type systems. In practice, compilers do.If there's more than one C compiler, and they behave differently on this, then in practice, C has more than one type system.
Someone who has experience with both Agda and Idris, could you comment on which is easier to get started with? Coming from a Haskell background, I'd like to try my hand at writing more interesting constraints into my types. Any experiences using these languages for (non-research) work?

I've played around with Coq a little and it certainly feels more like a proof assistant than a programming language. It was fairly confusing and unfamiliar, but interesting nonetheless. Something with slightly more traditional ergonomics might be easier to pick up.
> Someone who has experience with both Agda and Idris, could you comment on which is easier to get started with? Coming from a Haskell background, I'd like to try my hand at writing more interesting constraints into my types. Any experiences using these languages for (non-research) work?

The learning curve is almost vertical, to be completely honest, from personal experience. I find most tutorials are written by people with a strong background in logic, type systems and mathematics, so it's a struggle to rely only on your knowledge of mainstream programming languages, as you're learning lots of new things at once (e.g. formal proofs, formal specifications, logic, pure functional programming).

It's a super interesting area if you can stick with it though, so keep at it!
> most tutorials are written by people with a strong background in logic, type systems, and mathematics...

I'm a little used to this from Haskell, although I imagine the dependently-typed languages are even worse in that regard. Working through the Idris tutorial now and at least the syntax is familiar.
> I'm a little used to this from Haskell, although I imagine the dependently-typed languages are even worse in that regard. Working through the Idris tutorial now and at least the syntax is familiar.

That's what I mean by the difficulty jump. Like when you go from C++ to Java or the other way around, there's a lot of familiar things to cling on to while you make the transition. With dependently typed languages, there's a massive pile of unfamiliar stuff, so the steep learning curve is unavoidable really. Someone needs to write a "dependently typed programming for Java programmers" guide. :)
I found Idris pretty easy to get started with - that's part of its goal, making dependent types easier to get started with. From there I think it would be easier to go to Agda or Coq, but I haven't explored much there myself.
 +1 for Idris. The tutorials were pretty solid for it too when I tried it, and the mailing list was very helpful when I had questions [1]. There's even a WIP book! [2]
 Their main docs are pretty good too: http://docs.idris-lang.org
Thanks. I've started with the tutorials but have yet to see an example that convinces me it's worth all the pain. I'll carry on a little further, though.

EDIT: the pain of such a strict language, not necessarily Idris in particular.
 I have very limited experience, but it's worth saying that Idris was designed by an experienced Haskell user with the intention of using it to write practical programs.
> If we try to call this function as add 2 1, where the first argument is larger than the second, then the compiler will reject the program at compile time

This is not entirely true in general. auto is only a best-effort search. It might not be able to prove something that is true, and will fail - for example, if it has to recurse deeper than 100 constructors. In this case, the user of the function has to provide a proof himself.

See it as a bloom filter: if it doesn't fail, you're sure your function is correctly called; if it does fail, it might still be correctly called, but the compiler just didn't have enough time to prove it. In that case a user needs to convince the compiler that he is right.
 I don't see the problem here. I said that Idris will reject the program if the proof fails. That's what you're saying too, unless I'm very confused. (It's late, so maybe I am?)
Yes, but it might also reject correct programs. That's the point I'm trying to make.
 That doesn't contradict the quote that you provided. The quote just says "the compiler will reject the program at compile time [if there's no valid proof]", which is true.
 If this topic interests you, Julia's type system is worth checking out [1]. Julia is unusual among dynamic languages not just for having a powerful type system (including parametric types), but also for making it idiomatic to make full use of types. Aside from the benefits to the programmer, this also allows the compiler to get to Rust-level performance where it's needed.I find that this is a nice compromise that gets a lot of the benefits of static type systems (small, local errors as opposed to behavioural bugs, ability to express concepts and constraints) without the drawbacks (loss of interactivity, or inability to escape the type system when it's appropriate).
It sounds very ad-hoc? (And Julia's benchmarking has not been reputable in the past, FWIW.)

> without the drawbacks (loss of interactivity, or inability to escape the type system when it's appropriate).

One can absolutely have interactivity in a statically typed language, and virtually all statically typed languages support casting when you absolutely need to.
"Ad-hoc" seems subjective but to me it feels very carefully thought out. And while I've found the benchmarks to be accurate, I more importantly mean myself (and many others I know) being easily able to get high performance code when we need it.

My second paragraph was only a personal take on what works for me, so YMMV, but nevertheless: I'm not arguing against static type systems in principle, but these are practical concerns rather than theoretical ones. I don't know of a mainstream static language that supports a fully dynamic REPL (Haskell's isn't great) and casting back and forth to `interface{}`, `Object`, or `Any` everywhere is a huge overhead compared to duck typing. I look forward to a future language solving those issues, but that's not the case today.
> "Ad-hoc" seems subjective but to me it feels very carefully thought out.

I guess I'd like to see a simple formalism.

> myself (and many others I know) being easily able to get high performance code when we need it.

That's true for a lot of languages. It doesn't make it Rust-level.

> I'm not arguing against static type systems in principle, but these are practical concerns rather than theoretical ones. I don't know of a mainstream static language that supports a fully dynamic REPL (Haskell's isn't great) and casting back and forth to `interface{}`, `Object`, or `Any` everywhere is a huge overhead compared to duck typing. I look forward to a future language solving those issues, but that's not the case today.

Shrug, not my experience. I don't know what you mean by "fully dynamic", but I've been very happy with the Scala REPL - I've never once thought "this would be easy in the Python REPL but I can't do it because of the type system". Casting is indeed a big overhead if you're doing it everywhere, but my experience is the cases where you need it are extremely rare.
 What do you feel is lacking from Haskell's REPL?
> The unsatisfying solution used in practice is to give eval() the type Any, which is like Object in some OO languages or interface {} in Go: it's the type that can have any value. Values of type Any aren't constrained in any way, so this effectively removes the type system's ability to help us with code involving eval. Languages with both eval and a type system have to abandon type safety whenever eval is used.

It's not quite true that they must abandon type safety: rather, they must abandon a certain kind of compile-time type check. It's not unsafe to use eval or Any values (i.e., it won't crash the system); one must simply examine the Any value (at runtime, natch) and do stuff with it.
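In Python terms (a hypothetical sketch, since Python is dynamically typed throughout), "examining the Any value at runtime" looks like narrowing the result of eval before using it:

```python
# Sketch: eval returns a value of unknown ("Any") type; before using it,
# we inspect it at runtime and handle each case. Nothing here is unsafe --
# we just pay for the check at runtime instead of compile time.
def use_eval_result(source: str) -> str:
    value = eval(source)  # type is effectively Any
    if isinstance(value, int):
        return f"an int: {value * 2}"
    elif isinstance(value, str):
        return f"a string of length {len(value)}"
    else:
        return f"something else: {value!r}"

print(use_eval_result("2 + 3"))      # examines the value, finds an int
print(use_eval_result("'ab' * 2"))   # finds a string
```

The function name and the cases handled are illustrative only; the point is that the dispatch a static checker would do at compile time has moved into ordinary runtime branching.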
> Haskell has no equivalent of the Idris type above, and Go has no equivalent of either the Idris type or the Haskell type. As a result, Idris can prevent many bugs that Haskell can't, and Haskell can prevent many bugs that Go can't. In both cases, we need additional type system features, which make the language more complex.

Type checking Idris is much simpler than Haskell.
 It doesn't say "type checking", though. It says "the language", which I think of as including the respective preludes.
 The core theory (and implementation) is much simpler for almost all dependent type theories. At least this is my personal feeling after having read a lot of GHC code, written an Idris backend, the native backend for Lean, and significant portions of `rustc`.
 It's generally (or maybe universally) more difficult to "get" dependent typing than plain old ML-family static typing. I mean, this HN comments page alone has multiple threads where people don't think that the "x > y" proof example even works. For most programmers, that's how non-obvious the possibility of proving nontrivial properties of code is. Maybe I should change the wording in the OP to make it clear that I'm talking about the set of ideas required to use the language, not the characters that need to be typed into a computer to make it work.
> It's generally (or maybe universally) more difficult to "get" dependent typing than plain old ML-family static typing.

That's true, but Haskell has come a long way since a direct comparison like that would make sense.

> I mean, this HN comments page alone has multiple threads where people don't think that the "x > y" proof example even works.

I think this is because people have some preconceived idea about what dependent typing is before they come to dependently typed languages. So, not only are they learning about dependent types, but they are also fighting their preconceived notions. That was certainly the case for me. It's an interesting facet of the human condition; every time we learn a new tool, we have to be reminded that it's not magic.

Now, having become proficient, I have a better intuition for how to get my complex Idris programs to type check than my complex Haskell ones. Idris is actually simpler; it's not just about line counts.
 > Type checking Idris is much simpler than Haskell.Is it? How's that?
A short answer is that the theory is just simpler. Non-dependent type theory spends a lot of effort separating values, types, kinds, and the valid operations over them. For example, you need lots of extensions to make it possible to use a type at the value, type, and kind level in Haskell. Each extension to the type system adds more complexity and requires special code to implement its functionality (DataKinds, PolyKinds, TypeFamilies, etc). You also end up needing at least two or more different kinds of ASTs: one that describes expressions, one for types, one for kinds, and so on.

All of this falls away in dependent type theory: you can represent the whole language with a single class of pseudo-terms. For example, the core expressions can be captured by the grammar below:

`````` term := x | f x | forall (x : Type), B x | lambda (x : A), e | Type ``````

Type checking (not inference) is straightforward to implement for these theories and can be done in about a page or two of code. Of course, "elaboration" or "type inference" for these theories can be much more complex, but Haskell's isn't simple either. The most recent publication on the type-inference algorithm is roughly 80 pages.
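The uniformity is easy to see in a toy representation (a hypothetical Python sketch of the grammar above, not any real implementation): a single family of term constructors covers variables, application, Pi types, lambdas, and Type itself, so a value-level function and its type are built from the very same pieces:

```python
from dataclasses import dataclass

# Hypothetical sketch: one class of pseudo-terms for the whole language,
# with no separate ASTs for expressions, types, and kinds.
@dataclass
class Var:
    name: str

@dataclass
class App:
    fn: object
    arg: object

@dataclass
class Pi:          # forall (x : dom), cod
    var: str
    dom: object
    cod: object

@dataclass
class Lam:         # lambda (x : dom), body
    var: str
    dom: object
    body: object

@dataclass
class Type:
    pass

# A value-level term: the identity function, lambda (x : A), x
identity = Lam("x", Var("A"), Var("x"))

# Its type, built from the *same* constructors:
# forall (A : Type), forall (x : A), A
id_type = Pi("A", Type(), Pi("x", Var("A"), Var("A")))

print(identity)
print(id_type)
```

A checker over this AST is one recursive function; contrast that with maintaining parallel expression, type, and kind ASTs plus the coercions between them.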
 Haskell's type system has accrued complexity over the years, whereas a small dependently typed language can be implemented very easily [1].
 That's not a fair comparison, though. A fair comparison might be Haskell's type system compared to Idris, or a small dependent typed language against a small type inference language.
 The question was, "how's that?". My answer was, it's possible for a small dependently typed language to be simpler than Haskell.I'd already made the assertion that Idris was less complicated than Haskell.
I've been very curious about Perl6's gradual typing[1] which, as I understand it, draws a lot from Haskell's type system but in an optional manner.

`````` subset NonNegativeInt of Int where * >= 0; sub fib(NonNegativeInt $nth) { given $nth { when 0 { 0 } when 1 { 1 } default { fib($nth-1) + fib($nth-2) } } } ``````

[1] http://blogs.perl.org/users/ovid/2015/02/avoid-a-common-soft...
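A rough dynamic-language analogue (a hypothetical Python sketch, not Perl 6's actual mechanism) is a decorator that enforces the refinement at call time; the constraint Perl 6 attaches to the `NonNegativeInt` subset type is checked on every call:

```python
import functools

# Hypothetical sketch: enforce a "subset type" constraint at call time,
# roughly like Perl 6's `subset NonNegativeInt of Int where * >= 0`.
def takes_non_negative_int(fn):
    @functools.wraps(fn)
    def wrapper(n):
        if not (isinstance(n, int) and n >= 0):
            raise TypeError(f"expected a non-negative Int, got {n!r}")
        return fn(n)
    return wrapper

@takes_non_negative_int
def fib(nth):
    if nth == 0:
        return 0
    if nth == 1:
        return 1
    return fib(nth - 1) + fib(nth - 2)

print(fib(10))   # 55
# fib(-1) raises TypeError instead of recursing forever
```

The difference from a static refinement type is when the constraint is checked, not whether: here a violation is only caught on the call that commits it.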
Kind of shocked nobody mentioned this, even if it is a bit of an aside, but umm, I've been dying for anything at all from Gary Bernhardt - I don't even know what to say except that it makes me hope, however unrealistically, that we will one day get something like the magnum opus that is "Destroy All Software" from him again.
 You know he just started to post new stuff, right?
 I know. Mind blown.
I haven't read/seen it. I'm guessing you know more than I'd see in an abstract. So, what's the significance of that work?
The article is mainly focused on static typing. I guess it makes sense since the title is "Types". Although reading this, you might just think that dynamic typing is good for nothing.

The more practical part of the article is great. However, the theoretical part could use some work; in particular, the terminology isn't actually clear. Take some examples: what is "memory-safe"? The only example makes it sound like bounds checking. And what does "no way to escape the language's type rules" mean? There are some usages of "valid program" and "invalid program" that set my spider sense tingling as well.

> However, no dynamic language can match the speed of carefully written static code in a language like Rust.

> Any blanket statement of the form "static languages are better at x than dynamic languages" is almost certainly nonsense.

Those two quoted statements come from the article. The section "Arguments for static and dynamic types" didn't mention a single reason why dynamic typing could be preferable over static typing. It could at least have mentioned macros, or the hand-waving claim that dynamic typing makes for prettier code at times (I think the author of Python's requests library made that argument in one of his blog posts).

I think the last part (ranking the languages by their type system's power) should be removed.

> There's a clear trend toward more powerful systems over time, especially when judging by language popularity rather than languages simply existing.

Well, it looks true since we didn't consider any dynamic language at all...

Focusing more on the "Concrete examples of type system power differences" would have been better for the article, I think. Something along the lines of: an example of a non-nullable type with something simple in TS, to the optional/Maybe monad, then a dependent type letting you multiply an (m * n) matrix by an (n * p) one and get an (m * p) result back.
> The author is highly probably on the static typing camp.

First and foremost, I don't think dividing people into these clear-cut "camps" is good for this discussion (and probably for most discussions, too...).

Second, I very much doubt that the author is a purist of static typing, or even a strong proponent of it. He has a series of programming screencasts called Destroy All Software, and in most of the episodes I've seen he uses Ruby, Bash or Python. So, unless he has dramatically changed his style since then, I don't think he has any problems using dynamically typed languages.
 Sure, I will edit that sentence out.
 > And what does "no way to escape the language's type rules" mean?I can give you an example, from bitter experience.In the original Pascal, the size of an array was part of the type of an array. There was no possible type for a variable-sized array. And you couldn't cast from an array of one fixed size to an array of another fixed size.Well, we were trying to do a 2D numerical simulation on a grid, with the size of the grid being user-specified. The original version of the program used a linked list to simulate the array, since you could have an arbitrarily-long linked list. But this meant that, if the simulation was 60x60, to reference the cell directly below the current one, you had to follow 60 links to get there!I replaced it with a 2D array of a fixed size, with the size being the largest amount we could fit into available memory (it was on an embedded system, so the amount of memory available was predictable). Of that, we used only the user-specified subset to do the simulation.TL;DR: If you need to do something that's outside what's allowed by your type system, and there's no way to escape the language's type system, you're trapped.
>> However, no dynamic language can match the speed of carefully written static code in a language like Rust.

>> Any blanket statement of the form "static languages are better at x than dynamic languages" is almost certainly nonsense.

> Those two quoted statements come from the article.

I think the author had in mind that "ALL static languages are better at x than ALL dynamic languages" is nonsense. He probably didn't have in mind that "*a* static language IS better at x than *a* dynamic language" is nonsense.
So what might make dynamic languages easier to write in? I enjoy dynamic languages more because:

* "double kilometerToMiles(double km) { return km / 1.6 }": here the types are of really low value, yet I have to type them in.

* A mutable list of generic and variable-arity event handlers: really hard to specify as a type (most languages cannot do it), really easy to use.

* Co- and contravariance and lists: the math works exactly opposite to intuition.

* When you change your mind on the types/arity beyond what your IDE can follow, it is a lot of low-value work.

* If your test suite has 100% coverage of arguments/return-value and field usage, you have type checked your program.

If I am going to write down types, they'd better be of value to me. Like types that are null or non-null, or tainted (from the user) or non-tainted. Or two numbers that are kilometers vs miles. Or only accessible under a certain lock or other preconditions. And be flow-sensitive (Kotlin-style null checks), not make me write even more code.

But if you are a dynamic language, do be strongly typed (no "1" == 1). Also be dynamic in arity; that is (part of) a type. Don't be hand-wavy about scoping: lexical scoping is probably the only predictable way to do scoping. Fail fast, like preventing spelling mistakes in nonexistent field names, for example. And allow this to work: "2 * Vector(10,10)".

And from that perspective there is still quite some room for improvement in all classes of languages, I suppose, and it's good that there is a lot of movement and experimentation today.
> double kilometerToMiles(double km) { return km / 1.6 }

But what if you could have `miles kilometersToMiles(kilometers km)`? Under the hood it's the same old doubles, but you would be able to catch errors such as the one that caused the Surveyor failure [1].

E.g. Haskell has units[2] and dimensional[3] that allow you to do exactly that.
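A dynamic-language approximation (a hypothetical Python sketch; Haskell's units/dimensional libraries do this at compile time, with no runtime cost) is to wrap the raw doubles in distinct classes so that mixing units fails loudly at runtime rather than silently producing a wrong number:

```python
from dataclasses import dataclass

# Hypothetical sketch: distinct wrapper types for kilometers and miles.
# A static units library rejects mixed-unit arithmetic at compile time;
# here we can at least make it fail at runtime instead of corrupting data.
@dataclass(frozen=True)
class Kilometers:
    value: float

@dataclass(frozen=True)
class Miles:
    value: float

def kilometers_to_miles(km: Kilometers) -> Miles:
    if not isinstance(km, Kilometers):
        raise TypeError(f"expected Kilometers, got {type(km).__name__}")
    return Miles(km.value / 1.6)

print(kilometers_to_miles(Kilometers(3.2)))   # Miles(value=2.0)
# kilometers_to_miles(Miles(3.2)) raises TypeError: the units don't match
```

The failure mode shifts from "wrong trajectory computed" to "exception at the call site", which is the runtime shadow of what the type-level version guarantees before the program runs.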
 > * "double kilometerToMiles(double km) { return km / 1.6 }" here the types are of really low value, yet I have to type them in.You shouldn't have to, and you wouldn't in say Haskell.> a mutable list of generic and variable arity event handlers. Really hard to specify as a type (most languages cannot do it). Really easy to use.You want the handlers to have types corresponding to the events they handle, right? I.e. a labelled product/coproduct (similar to HList). They do exist.> * co- and contravariance and lists, the math works exactly opposite of intuition.I don't understand the claim here?> * when you change your mind on the types/arity beyond what your IDE can follow, it is a lot of low value workIn a well-designed language most of your functions will be generic, and the parts you have to change should only be the parts that your change actually affects. Obv. this ideal is not fully achieved yet, but we keep getting better at it.> * if your test suite has 100% coverage of arguments/return-value and field usage, you have type checked your program.Not true. That one example works for a given function does not prove the type is correct for all arguments. As a trivial example you could define a function like "def f(i) = if(i == 1000) "blah" else i". Typechecking would catch that immediately, but to catch it through testing you'd have to test every possible integer as input.
To most commenters: Indeed static typing has advantages as well, I just didn't list them. And some languages solve certain "complaints" I had.

I would suggest, though, that comparing Haskell to JavaScript is almost like comparing a train with an offroad bike. Which you choose when is a discussion at a whole different level.

As for the list-of-event-handlers example: the point is that it's a trivial piece of code. Any jQuery plugin writer can do it. Yet the type math behind it is complex. Java cannot express it. Most languages cannot be generic in function arity, so they cannot express it. Spending time on it is not worth the trouble. And paying this penalty at the API level, where every user has to pay, is just a sin.

Some discussion from the Dart team on covariant lists: https://www.dartlang.org/articles/design-decisions/why-dart-...

The test suite idea was incorrect. Even 100% code coverage will not do. But for "normal" code, add a reasonable amount of coverage, and any production runtime type error will be really surprising. And even when it does happen, the fix is clear and quick.

A personal feeling on this topic is that it is all tradeoffs. But it makes me really happy to see gradual typing gaining traction. So you can start on your offroad bike, but end on a well-oiled train, where that is worth the tradeoff. So you don't have to choose up front. Perhaps eventually we can even get runtime guarantees on performance/object layout when the type checker is happy.
I only see this now. Good reply. All valid concerns. But:

> "I want all runtime errors to be at least this surprising", that level will be easier to achieve via types than via tests

For the type of the values, yes, but you still need tests for the content of the values. E.g. you can only index into arrays with integers, but can your code access the array out of bounds? So that level is implicitly achieved.

> but the trouble is that code that's 95% typechecked might as well be 0% typechecked

That is absolutely not my experience. Types at module boundaries really help, even if that is only 5% of the code. On the other hand, getting to 100% typechecked can be hard in current gradual systems, mostly because some code is hard to type even though it works fine, and humans can reason about it fine. Though you might be right that it is probably better expressed in a more type-sound way. (Unless it is parametrization on function arity that does you in, grr.)

But to me, unless you are coding in Idris or such, types are like the derivative of your code. Yes, they show the general "slope": how it fits together. But they say nothing about the values that flow through it, or the correctness of the code. You still need tests, and the type checker is not much more than a spell checker. Its usefulness is not in dispute (though overestimated by some). Just that in most of today's popular languages, you pay a price that is nontrivial. The choice is a tradeoff.
> For the type of the values, yes, but you still need tests for the content of the values. E.g. you can only index into arrays with integers, but can your code access the array out of bounds? So that level is implicitly achieved.

So you create a path-dependent type for valid indices into a given array (or, more likely, use a safe way of expressing iteration so that you don't need to see indices directly). Yes, you can check that your indexing is valid with tests, but it's more efficient to do it with types.

> That is absolutely not my experience. Typed at module boundaries really helps, even if that is only 5% of the code.

Maybe if they're enforced at those boundaries - does that happen? I'm thinking mainly of gradual typecheckers that are erased at runtime, which IME are useless, because a type error can easily propagate a long way before it's actually picked up, and you see an "impossible" error where something has a different type at runtime to its declared type.

> But to me, unless you are coding in idris or such, types are like the derivative of your code. Yes they show the general "slope"; how it fits together. But it says nothing on the values that flow through it, or the correctness of the code.

Not my experience once you get used to working with the type system. Anything that should always be true in a given part of the code, you make into part of the types. Make the types detailed enough that a function with a given signature can only have one possible meaning, and then any implementation of that function that typechecks must be correct.
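The "safe way of expressing iteration" point has a mundane analogue even in a dynamic language (a hypothetical Python sketch; the function names are mine): if the loop never manufactures an index by hand, there is no out-of-bounds index left to test for.

```python
# Sketch: the same computation with and without raw indices.
def pairwise_sums_indexed(xs):
    # Index-based version: correctness depends on the bounds arithmetic
    # being right (the classic off-by-one habitat).
    return [xs[i] + xs[i + 1] for i in range(len(xs) - 1)]

def pairwise_sums_safe(xs):
    # zip can only pair up elements that exist: there is no index to get
    # wrong, so the out-of-bounds error is unrepresentable here.
    return [a + b for a, b in zip(xs, xs[1:])]

print(pairwise_sums_safe([1, 2, 3, 4]))   # [3, 5, 7]
```

A dependently typed language can make the indexed version just as safe by typing the index itself; in languages without that, restructuring the code so indices never appear achieves the same effect.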
> * if your test suite has 100% coverage of arguments/return-value and field usage, you have type checked your program.

The most horrible argument in favor of dynamic typing, in my opinion. Test suites shouldn't be about checking whether a return value is of the expected type; that's ridiculous. It's implementing a type checker manually.

There are a lot of things statically typed languages can do to make types painless:

- (global) type inference (Crystal does that)
- co- and contravariance; it is not limited to dynamically typed languages
- optional parameters
- ad-hoc type declarations like in C# (e.g. {x:1,y:"a"} is a type)
- contracts
- pattern matching
- generics
- etc.

I just dislike dynamically typed languages. Even interpreted languages should be statically typed, IMHO. Again, there are multiple ways to make statically typed languages painless when it comes to types.
> Test suites shouldn't be about checking if a return value is of expected type

Indeed, and so they don't have to be. When you have reasonable code coverage, it is really surprising to encounter type issues in production, and they are the easiest mistakes to fix. But the point is, runtime checks happen implicitly as you run the code. So no "implementing a type checker" needed...

On the other hand, my comment about when you get the equivalent of being fully type checked was incorrect. Even 100% code coverage will not guarantee it for some code.

I like both kinds of languages, and use both. While you get some advantages in static languages, you also pay a price in language semantics and effort. It is good to point that out, so we can continue making progress, like Crystal or the latest C# (and the latest F#!). There used to be a time when the mood was: Java is good for all, stop confusing us with your new languages.
> * if your test suite has 100% coverage of arguments/return-value and field usage, you have type checked your program.

This is absolutely false. Others have pointed this out, but types are much more general than runtime checks in dynamically typed languages. For instance, you can check at compile time that your program doesn't have data races or deadlocks:

http://www-kb.is.s.u-tokyo.ac.jp/~koba/typical/

Or design a library in Haskell to ensure file handles and other scarce resources are promptly reclaimed:

http://okmij.org/ftp/Haskell/regions.html#light-weight

Neither of these are testable in dynamically typed languages. Static types are much more powerful than dynamic types, in general.
 Writing down the types isn't a necessary property of static type systems. In Haskell, F# or (OCa)ML, you almost never have to actually mention any types.I do agree that Java-style types aren't of much value. That catches few errors at significant cost.
 Catching logical errors is only one out of several things that static typing gives you.Incidentally, right now this is a few positions above this posting: https://news.ycombinator.com/item?id=12350063Better IDE support, a kind of self-documentation and increased performance come to mind. I know there are counter-examples for each of these points, but the general trend cannot be denied.
 The cost of non-Java-style types is compile time :)
 > "double kilometerToMiles(double km) { return km / 1.6 }"What about:`````` toMiles (Kilometers k) = Miles (k/1.6) `````` Then you do:`````` let x = Kilometers 5 print (toMiles x) `````` And, of course, any calculation will only accept the correct units. At this naive implementation, you'll have to add the conversions to satisfy the compiler, but there are Haskell libraries that bring automatic conversions too.Anyway:> Like types that are null or non-null, or tainted (from the user) or non-tainted.That's Haskell.> Or two numbers that are kilometers vs miles.That's my example up there.> Or only accessible under a certain lock or other preconditions.Yep, Haskell.> And be flow sensitive (kotlin style null checks), not make me write even more code.That's emulating what people do in Haskell.
 I think types can be both frustrating and life-saving.However, the "danger" for me lies primarily in implicit conversion. That is, Python vs Lua/JS/R. When writing large programs, some things happen implicitly and linting may not catch it.Everything else besides implicit conversion just helps structure the programs in specific ways and helps speed them up.
 So, how do these advanced types work?For their `add` example, I imagine just having x
In Idris, there are functions `believe_me` and `really_believe_me`, which assert things to the compiler. They're generally only used for FFI code, or for extremely tedious proofs.

If you wanted to prove that a list was sorted, you'd write another datatype:

`````` IsSorted : Ord a => List a -> Type ``````

which represents the proposition that the input list is sorted. You could encode this by requiring that any new element in the list is either the first element, or less than or equal to the previous element. Then, your sorting function could return a dependent pair:

`````` (l : List a ** IsSorted l) ``````

which can be read as the list, accompanied by a proof that it is, in fact, sorted. Since your `IsSorted` type requires that the list is sorted, by construction, you won't be able to supply the output of the function unless your code is correct (or you use `believe_me`!).

The compiler doesn't generally come up with counterexamples for you; with this style, if the proof isn't complete, it won't compile.
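A runtime caricature of that dependent pair (a hypothetical Python sketch; the crucial difference is that Idris checks the proof at compile time, while this check runs on every construction) is a constructor that refuses to build the "sorted list" value unless the property actually holds:

```python
# Sketch: a value that can only be constructed if the invariant holds,
# loosely mimicking the dependent pair (l : List a ** IsSorted l).
# Idris discharges the obligation at compile time; here it costs a
# runtime scan, and a violation surfaces as an exception, not a type error.
class SortedList:
    def __init__(self, items):
        if any(a > b for a, b in zip(items, items[1:])):
            raise ValueError(f"not sorted: {items}")
        self.items = list(items)

def my_sort(xs):
    # Returning through SortedList forces the "proof" to be checked:
    # a buggy implementation cannot hand back a SortedList value.
    return SortedList(sorted(xs))

print(my_sort([3, 1, 2]).items)   # [1, 2, 3]
# SortedList([2, 1]) raises ValueError: the "proof obligation" fails
```

The analogy only goes so far: this checks each particular list at runtime, whereas the Idris proof covers all inputs before the program ever runs.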
 What types are and what they do is all well and good, but what I want to know is how they would help me write a larger application. Would it help it be more correct, be written faster, or be more flexible? Does it just help on a line-by-line programming basis, or can it have any ramifications on a higher level like product design?
 I wrote a little article, which has been posted here before, that shows the power of different type systems and what kind of bugs you can expect to prevent: http://kevinmahoney.co.uk/articles/tests-vs-types/
 > A type is a collection of possible valuesOk, but do two collections with the same values always correspond to the same type?
 It depends on the language. If you have, for example, a type-and-effect system, a type can contain information (like "we wrote to stdout while computing this") that is never reflected in the value itself. In a formal setting, it's more appropriate to describe a type as a collection of pieces of source code, so that `2+3` and `print "computing...\n"; 2+3` have different types even though they'll evaluate to the same thing.
 Generally, no, at least with nominal typing.
Yes, but remember: you can always say that the values are different. A blue 5 and a red 5. Five feet, or five inches.
 Yes, they do.
I read that type checking is easier with dependent types. I'm barely aware of Agda and the other languages that are used for proofs, but I wonder how useful (and how dangerous) it would be to design a language like this for more mundane purposes (like, for example, building websites, data manipulation, GUIs, etc).
 Hope this makes it into Wikipedia instead of here.
 It'll get moderated out.
For this article specifically, it shouldn't be on Wikipedia. The quality still needs a lot of polish, and there are too many opinions.
 This is a decent summary of why I'm doing it. Wikipedia is full of "high quality" content, where "high quality" means "directly exposing every detail that anyone ever thought to nitpick", and this topic is a nitpicker's paradise. There's value in that detail-heavy style, but it's difficult to learn from and it's a type of writing that I rarely want to read. As for opinions (and lack thereof), they prevent Wikipedia from saying many things that are both well-known and easily stated. The lack of clear summaries of widespread disagreeing opinions keeps our implicit knowledge implicit.
I agree with your reasoning and your description of Wikipedia. However, despite all the flaws, the wiki is reasonably helpful, and I doubt filling it with more opinions would help the case.
