Hacker News new | past | comments | ask | show | jobs | submit login

Great overview article but I have a comment.

> In Idris, we can say "the add function takes two integers and returns an integer, but its first argument must be smaller than its second argument":

> If we try to call this function as add 2 1, where the first argument is larger than the second, then the compiler will reject the program at compile time.

> Haskell has no equivalent of the Idris type above, and Go has no equivalent of either the Idris type or the Haskell type. As a result, Idris can prevent many bugs that Haskell can't, and Haskell can prevent many bugs that Go can't. In both cases, we need additional type system features, which make the language more complex.

I really wish when people talk about dependently typed programming languages (e.g. Idris, Coq, Agda, ATS), they would mention the effort involved to get these compile time checks to happen.

These languages allow you to capture almost any program property you can think of as a type. Instead of being limited to saying function F "returns a number/list/string" like common type systems, you can capture arbitrarily complex properties like "F returns an even number", "F returns a permutation of the input list", "F always halts" or "F returns a proof of Fermat's Last Theorem". You don't get this for free and you have to help the computer verify these.

Proving F matches the type is arbitrary hard and is impossible to automated in general (e.g. the halting problem tells us we cannot always tell if a function halts). For example, consider the type "F returns a sorted permutation of the input list", where F could be an implementation of bubble sort, quicksort, merge sort, radix sort or something even more complex. Each algorithm will require a different and potentially complex proof and most of the time the programmer needs to help the computer find the proof. It's completely unlike what people are used to from type systems and is an even more massive leap than for someone moving from the type systems of e.g. C++/Java to Haskell.

I'm glad people are more aware of dependently typed languages but I feel the immense jump in how practical they are to use is severely overlooked.




>You don't get this for free and you have to help the computer verify these.

For library types, the most commonly used properties will have already been proven; you wouldn't have to write a proof that "sort" sorts a list any more than you'd have to write the sort function itself.

The computer can also be surprisingly good at proving these things in many cases. E.g. I was playing with the proof assistant Isabelle recently, and I tried to prove that given an int, a proof that twice that int is greater than 13, and a proof that four times that int is less than 29, then that int must be 7. I thought I'd have to fiddle around with induction on Peano numbers and the like, but nope, it could be proved with just a single "by arith".


> For library types, the most commonly used properties will have already been proven; you wouldn't have to write a proof that "sort" sorts a list any more than you'd have to write the sort function itself.

Sorting was just an example. Most proofs of nontrivial algorithms come with reams of hand written proofs. Happy to see examples that go against this though.

> The computer can also be surprisingly good at proving these things in many cases. E.g. I was playing with the proof assistant Isabelle recently, and I tried to prove that given an int, a proof that twice that int is greater than 13, and a proof that four times that int is less than 29, then that int must be 7. I thought I'd have to fiddle around with induction on Peano numbers and the like, but nope, it could be proved with just a single "by arith".

Doesn't that just fall into the Presburger arithmetic decision procedure? The problem is you can rarely tell when the proof assistant is going to complete the proof for you (e.g. you're fine if it's within Presburger arithmetic but not if it requires induction).


You don't have to write a proof if you don't want to. It's quite possible to simply write your sorting function without a proof, it's just that at that point you don't know that your function works as advertised (which of course is then just the same thing that you get in a non-dependently-typed language).

    data Comp = Gt | Lt | Eq

    sortBy : (a -> a -> Comp) -> List a -> List a
    sortBy = ...
The above is perfectly acceptable in Idris, given that your implementation of sortBy type checks. And it makes no use of dependent types. Separately to this function, you can write a proof that it's correct (if you want). Or unit tests, for that matter.


> You don't have to write a proof if you don't want to.

I know this, I'm simply pointing out that when you do decide to write a proof it can be a very challenging task (much harder than adding generic types into Java code for example).


> it could be proved with just a single "by arith".

Tactics are probably a usability improvement for mathematicians who are used to proving everything by hand. But for programmers used to type inference, they're a step backwards. Mathematicians typically prove much deeper results, but they do so at a much slower rate than programmers write programs.


Huh tactics are pretty great. Say exactly what you want and the computer programs itself!

It's unfair to just compare development time between tactic-generated programs in a dependent language with manually written programs in a non-depenendent language. The end result in the dependent language is much more valuable.


I'm not comparing dependent types vs. no dependent types. I'm comparing higher-order dependent types (Agda, Idris, Coq, etc.) vs. first-order dependent types: https://news.ycombinator.com/item?id=12350147

Seriously, Coq-style proof scripts are utterly unredable when they grow past a certain size. The only way to understand them is to replay them, so that you can see the intermediate hypotheses and goals.


> The only way to understand them is to replay them

Reminds me of the same kind of problem that arises in trying to understand the runtime behavior of nontrivial programs involving side-effects, mutable state, multithreading, etc. Sometimes the debugger is the only practical way to figure out what the program is doing. Add concurrency into the mix, and things can get tricky very quickly.

Dependently-typed programming and machine-assisted proof is still an emerging thing, and there is probably lots of room for exploring ergonomic improvements. It will be interesting to see where this goes.


Your analogy between debugging effectful programs and replaying proof scripts is frighteningly accurate.


I agree. Note that the Coq-style tactics-based approach is not the only possibility.

In Agda, you are expected to write proofs in a functional style, similar to Haskell programs. There is a small amount of integrated automation, which helps you fill in holes in your proofs — however, the results are explicit proof terms, inserted at the right place in your program.

Systems which behave in this fashion have the de Bruijn criterion, as described in Geuvers (2009) “Proof assistants: History, ideas, and future”.

https://www.dropbox.com/s/4mwtxojg7yqb365/Geuvers2009.pdf


Thanks for the links (this one and the one from your other reply), I'm checking them.


I had/have a similar feeling.

I have thought about it some time ago and came up with the following proposal:

http://matej-kosik.github.io/www/doc/coq/coq-trunk-proof-tre...


That's mostly true, but the speed programmers write programs depends how bulletproof they have to be. There have been times that if you include the debugging and maintainability difficulty, proving things to my compiler would be a step forward for me. There's probably a sweet spot.


> That's mostly true, but the speed programmers write programs depends how bulletproof they have to be.

Of course, but, in any case, writing programs at the same rate mathematicians prove theorems simply isn't viable.


Agreed. These days, the most popular dependently typed languages are based on some variant of Martin-Löf type theory, in which arbitrary mathematical propositions are expressible at the type level. I'm not convinced that these designs offer a good power/cost ratio. For example, these languages often have very limited type inference.

In practice, most of the properties that a programmer (rather than, say, a mathematician using a proof assistant) would want to enforce mechanically (say, invariants of data structures) are expressible as first-order statements (lambda-bound variables are never applied). This strongly suggests that the ergonomics of dependent types could be significantly improved by deliberately limiting their expressive power. By the way, the idea isn't new: the Damas-Milner type system, which is most certainly known and proven technology, is a subset of System F-omega's type system for which type inference is possible, at the reasonable (IMO) price of not having higher-ranked or higher-kinded types. A dependently typed language with a first-order type level would benefit from reusing the basic architecture of a Damas-Milner type checker: a constraint generator and a constraint solver powered by a first-order unification engine.


> By the way, the idea of deliberately limiting the expressive power of types isn't new: the Damas-Milner type system, which is most certainly known and proven technology, is a first-order subset of System F-omega's type system for which type inference is possible. A dependently typed language with a first-order type level would benefit from reusing the basic architecture of a Damas-Milner type checker: a constraint generator and a constraint solver powered by a first-order unification engine.

Have you seen languages like DML (let's you capture linear arithmetic properties only and automates the proof of these)?

https://www.cs.bu.edu/~hwxi/DML/DML.html

> Dependent ML (DML) is a conservative extension of the functional programming language ML. The type system of DML enriches that of ML with a restricted form of dependent types. This allows many interesting program properties such as memory safety and termination to be captured in the type system of DML and then be verified at compiler-time.

What program properties are useful to capture and what properties you can mostly automate is a really interesting problem.


> Have you seen languages like DML (let's you capture linear arithmetic properties only and automates the proof of these)?

Yes. But what I have in mind is a little bit more ambitious. I want dependent types restricted to be parameterized by syntactic values (in the sense of ML's value restriction) of what Standard ML calls `eqtype`s. For example, correctly balanced red-black trees would be:

    (* clearly eqtypes *)
    datatype Color = R' | B'
    datatype Nat = Z | S : Nat -> Nat
    
    datatype Tree =
      | Empty : Tree (B', Z, 'a)
      | R : Tree (B', 'h, 'a) * 'a * Tree (B', 'h, 'a) -> Tree (R', 'h, 'a)
      | B : Tree ('c, 'h, 'a) * 'a * Tree ('c, 'h, 'a) -> Tree (B', S 'h, 'a)


Any examples? I'm curious for examples of a nontrivial type that would catch lots of common programming errors where the proofs can be automated. I see lots of new dependently typed languages but not much interest in addressing the proof automation aspect.


You don't need type theory to verify program properties, people have been using first-order methods for decades to prove properties about programs. For example there was a line of work in ACL2 that verified a microprocessor implementation, as well as plenty of modern work using tools like SMT to automatically prove program properties, see Dafny, F* for language based approaches. Though there is plenty of language agnostic approaches as well. My colleagues at UW have a paper in this year's OSDI verifying crash consistency for a realistic file system with no manual proof.


If (like me) you didn't know what the abbreviation SMT stands for -- I looked it up and it is Shiver Me Timbers http://encyclopedia.thefreedictionary.com/Shiver+Me+Timbers Seems to be some kind of pirate speak.

Only kidding! It stands for Satisfiability Modulo Theories https://en.wikipedia.org/wiki/Satisfiability_modulo_theories

“In computer science and mathematical logic, the satisfiability modulo theories (SMT) problem is a decision problem for logical formulas with respect to combinations of background theories expressed in classical first-order logic with equality. Examples of theories typically used in computer science are the theory of real numbers, the theory of integers, and the theories of various data structures such as lists, arrays, bit vectors and so on. SMT can be thought of as a form of the constraint satisfaction problem and thus a certain formalized approach to constraint programming.”

I'm not sure I'm any the wiser after that …

A notable building block appears to be SMT-LIB http://smtlib.cs.uiowa.edu/index.shtml


Nice. SMT can be (and indeed has been) integrated into type-based approaches to program verification as well.


Yeah I think SMT is really the state-of-the-art right now in this area. Leo De Moura (one of the main authors of Z3) has been working on Lean for the past couple of years. There is a small group of us (5~) who have been working on a big release for the past couple of months. The goal is to bring the easy of use of SMT automation to type theory, so you can get both the ability to do induction and automation.

Lean: http://leanprover.github.io/


I wish you guys good luck. Freedom from Coq-style proof scripts should be the goal.


I'd love to give more concrete examples, but I'm slightly hampered by the fact the type system I want isn't actually implemented anywhere. I have a rough sketch of the design of the type system I want, and I've been looking for a computer scientist, logician or mathematician to help me polish the design and prove that it is, in fact, type safe. But I couldn't find anyone. :-|


Why not just learn Coq or Agda yourself? There are also sorts of introductions for simple type systems that I'm sure you can build off of.


I have no idea how to implement fresh unification variable generation (my type system is a superset of Damas-Milner, kind of like how the calculus of constructions is a superset of System F-omega) in Coq or Agda. Everywhere I've seen it implemented, it involves mutable state. So, “in principle”, I could fake it using something like the state monad, and plumbing around the collection of unification variables generated so far, but in practice reasoning about this seems like a pain.


You could take a look at Coquand (2002) “A formalised proof of the soundness and completeness of a simply typed lambda-calculus with explicit substitutions”.

https://www.dropbox.com/s/xmj7yv1i1moe0ag/Coquand2002.pdf

Drop in to #dependent on Freenode IRC, and let’s chat.


Type inference isn't the most important, most languages that employ type inference are moving towards features sets make full inference hard, if not impossible. Not to mention you can get pretty good "inference" in dependently typed programming languages just by designing an appropriate higher order unification procedure. I have never met a daily user of these tools that find this to be the bottleneck.

In practice you spend way more time writing proofs in these system then doing anything else. There is evidence that we don't need to limit the expressive power of our logics, just design appropriate automation for the logic. Lean, a new ITT based theorem prover from MSR, has been focused on building automation for type theory. It already has promising results that formal methods techniques can be scaled to type theory (see the website for more information on recent work).


> Type inference isn't important, most languages that employ type inference are moving towards features sets make full inference hard, if not impossible.

What I'm saying is that I don't like this direction, because type inference is very important to me.

> Not to mention you can get pretty good "inference" in dependently typed programming languages just by designing an appropriate higher order unification procedure.

Such higher-order unification procedures must always be tailored towards specific use cases, since higher-order unification is in general undecidable. Effectively implementing a different type checking algorithm for every program I write doesn't feel like good use of my time.

What I want is a type system that is more honest about how much it can help me: it only promises to enforce properties expressible as first-order statements, but offers much better inference, and the remainder of my proof obligation is hopefully small enough that I can fulfill it manually.

> In practice you spend way more time writing proofs in these system then doing anything else.

Indeed, and that's what I'd like to fix.


I like your proposal of first-order, dependent types. Shocked I havent heard of it before now given advantages. I hope someone helps you build it.

On related note, what do you think about the Shen LISP approach where Sequent Calculus can be used for advanced typing? Does it have the expressibility and decidability benefits you described? Hadn't seen many specialists talk about it so still an unknown to me. If it does, I figure some default type systems could be created for a specific style of Shen and/or specific libraries. Or something similar done in a ML.

http://www.shenlanguage.org/learn-shen/types/types_sequent_c...


I'm not familiar with Shen, but, if I recall correctly, its type system is Turing-complete, so it's probably more complicated than I'd be comfortable with.

IMO, the topmost priority in a type system designed for programmers ought to be automation, not expressiveness. Programmers don't spend their time proving deep theorems. Most of what a programmer does is actually fairly boring, but it ought to be done correctly, reliably and within a reasonable time frame.


I totally agree. They're not going to use it if we make them work for it. Neither work hard or even lightly. Needs to be no harder than basic annotations but preferably closer to existing, static typing. Let them focus on functions instead of proofs.


> Proving F matches the type is arbitrary hard and is impossible to automated in general

So what?

Even proving that array indices are not accessed out of bounds within loops would be a considerable leap in the state of the art of industrial programming languages. For most cases (i.e. linear integer arithmetics), that's always automatically (dis-)provable. We have to start somewhere...


> Even proving that array indices are not accessed out of bounds within loops would be a considerable leap in the state of the art of industrial programming languages. For most cases (i.e. linear integer arithmetics), that's always automatically (dis-)provable. We have to start somewhere...

It's a super interesting and important area, I was just pointing out it was a challenging one with lots of open issues.


The only time I remember using indexed list access in Haskell was on a dirty script that would break in 3 months anyhow. They are just ugly, and throw a huge amount of safety away, for no big gain at all.

Mainstream languages will gain much more from better iterators than by a ton of effort that just proves your indexes are within bounds.


While I agree that better iterators are desirable, iteration is far from the only operation on arrays (or any other collection for that matter).


To some degree, you can choose what you want to prove about your algorithms in those languages.

It's perfectly possible to implement a sorting algorithm without proving that it actually sorts the input, or indeed returns a permutation of the input list at all.

In that sense, you can choose how much efford you want to put into it.


> To some degree, you can choose what you want to prove about your algorithms in those languages. > It's perfectly possible to implement a sorting algorithm without proving that it actually sorts the input, or indeed returns a permutation of the input list at all.

Sure, but my point is that any nontrivial property requires the programmer to write difficult formal mathematical proofs which is an activity completely unlike what you see in mainstream strongly typed languages (e.g. Haskell, OCaml).

Also, consider how when you start making use of generics in type systems by converting all you Object lists to String lists; you find you have to start adding generic types all over the place for any code that touches that code until the compiler errors stop. If you decided to capture permutations of input lists in a dependently typed language, the exact same thing happens except this time capturing that property for all your algorithms could be monumental task of writing complex proofs (unlike before where you're just labelling things).


> Sure, but my point is that any nontrivial property requires the programmer to write difficult formal mathematical proofs which is an activity completely unlike what you see in mainstream strongly typed languages (e.g. Haskell, OCaml).

So don't write the proofs then. Dependently typed languages don't require you to write proofs; they give you the ability to if you choose. Considering that it's impossible to write such proofs in OCaml or Haskell, it seems strange to complain that they're hard to write in a dependently typed language. "Hard" is strictly less difficult than "impossible".


> Considering that it's impossible to write such proofs in OCaml or Haskell, it seems strange to complain that they're hard to write in a dependently typed language.

My sole point is that when I see articles on here about dependently typed languages, the issue of how the proofs are generated and the challenges involved in doing this is overlooked almost every time. I think it's important these challenges are highlighted so people don't think dependent types are a magic wand that gives you safety with little effort.


The article was talking about the differences in power between different type systems. It's not even primarily focused on dependent types, let alone advocating for using them or claiming that they're easy. In fact it notes that the more advanced type systems "make the language more complex." If the article were suggesting everyone should use dependent types, or that they were easy, then your admonishments about dependent type systems would make sense here, but as it is I'm not sure.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: