The Little Typer (2018) (thelittletyper.com)
297 points by bmer on Oct 11, 2022 | 96 comments



This is one of my two favorite books in The Little <X>er series. The other is The Reasoned Schemer.

The Little Typer provides an introduction to dependent types. These can be used to guarantee things like "applying 'concat' to a list of length X and a list of length Y returns a list of length X+Y". It is also possible, to some extent, to use dependent types to replace proof tools like Coq. Two interesting languages using dependent types are:

- Idris. This is basically "Haskell plus dependent types (and minus lazy evaluation)": https://www.idris-lang.org/

- ATS. This is a complex systems-level language with dependent types: http://www.ats-lang.org/ It fills a niche (vaguely) similar to Rust and Ada SPARK, with a focus on speed and safety.

The Reasoned Schemer shows how to build a Prolog-like logic language as a Scheme library. This is a very good introduction to logic programming. And the implementation of backtracking and unification is fascinating. (Unification is a powerful technique that shows up in other places, including type checkers.)

This is an excellent series overall, but these two books are especially good for people who are interested in unusual programming language designs. I don't expect dependent types or logic programming to become widely-used in the next couple generations of mainstream languages, but they're still fascinating.


> [Dependent types] can be used to guarantee things like "applying 'concat' to a list of length X and a list of length Y returns a list of length X+Y".

As a curious bystander, what else? When are dependent types useful in practice? How do they fit into software development practices such as testing and DRY?

As an example of what I'm looking for, I see regular static typing as a way to move a class of runtime errors to compile time errors, increasing your confidence in aspects that otherwise would need repetitive unit tests.


One interesting aspect, at least for us old systems guys where runtime is important, is that you can not only move errors into compile time, you can move validation checks into compile time.

Simple example: returning the head of a list is partial; normally, you have to check whether the list is empty and return an error. With dependent types, you can guarantee that the list is not empty, moving that test to some higher level where it's actually meaningful.

If you're manipulating arrays, you can guarantee at compile time that the indices have the right values, moving the tests outside of a function that gets called often. (In fact, when I was playing with ATS, this is what I did: I picked one boring validation test in the code at a time and moved it into the type. It vastly simplified the code.)
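
A minimal Idris-style sketch of the head example, using the standard Vect type ('safeHead' is just an illustrative name):

  import Data.Vect

  -- The type says the input has length (S n), i.e. at least one
  -- element, so there is no empty case to check at runtime
  safeHead : Vect (S n) a -> a
  safeHead (x :: xs) = x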


> Simple example: returning the head of a list is partial; normally, you have to check whether the list is empty and return an error. With dependent types, you can guarantee that the list is not empty, moving that test to some higher level where it's actually meaningful.

You don't need dependent types for that. In particular, non-empty lists are isomorphic to a tuple containing a value for the head and a list for the tail:

  type NonEmptyList<T> = Pair<T, List<T>>
Note that this is just the type of cons ( https://en.wikipedia.org/wiki/Cons ). We can use this to define List recursively (representing an empty list (AKA nil) as the single value of type Unit):

  type List<T> = Either<Unit, NonEmptyList<T>>
If we inline the definition of NonEmptyList, we get the usual definition of List<T> = Either<Unit, Pair<T, List<T>>>


Internally, the Idris2 compiler uses de Bruijn numbering for its terms, which is tricky to get right once you start trying to manipulate terms. To help with this, it keeps a list of the names corresponding to the de Bruijn indices at the type level. The type checker catches mistakes as you write code to manipulate them (inserting names, removing names, or renaming things).

I ran through some exercises teaching this technique[1]. My take-away as a beginner was that, while it was a pain to make the compiler happy, the code the compiler didn't like was actually wrong.

I remember reading that someone had written a functional red-black tree (with a proof of correctness leveraging dependent types) and said the delete algorithm naturally followed from the types. That was not my experience (delete wasn't immediately obvious to me), but I need to find some time to give it another go.

The Idris2 docs have an example of a lambda expression interpreter[2] whose terms have a vector of the types in the environment as a type parameter (erased at runtime), to ensure everything is type safe in the resulting interpreter.

TypeScript has added some bits of dependent typing. For example, you can declare that a function returns true only if its argument has type X, or that a function throws if its argument is falsy, and I think there's type-level string manipulation. TypeScript is fairly pragmatic (it's not a research language), so I presume that specific functionality is useful to a lot of people. I've used a little bit of it to help the type checker infer non-null after my assert function was called.

All of that said, I think people are still exploring how dependent types can be used in practice.

[1]: https://github.com/edwinb/SPLV20/blob/master/Code/Lecture2/E... [2]: https://idris2.readthedocs.io/en/latest/tutorial/interp.html

There is a copy of #1 at github.com/dunhamsteve/SPLV20 that is updated to work with the latest Idris2.


TypeScript does not have dependent types. It has type-level constants and surprisingly powerful type manipulation facilities, including string manipulation on string type constants. But there's no way to operate on types at the "dynamic" expression level - they're deliberately erased, to the point that transforming TypeScript to JavaScript is almost trivial.

For example, take the very first example of a dependent type in the Idris tutorial - a function `isSingleton` that consumes a `Bool` and produces `Nat` if you give it `True` and `List Nat` if you give it `False`; it then uses `isSingleton` as part of the type of `mkSingle`, a function that consumes a `Bool` called `x` and produces an `isSingleton x` - that is, a `Nat` if `x` is `True` and a `List Nat` if `x` is `False`. There's no way to write this in TypeScript. You could have `isSingleton` return dynamic constructors instead, but you can't use that `isSingleton` as part of the type of `mkSingle` to generate a return type. Your `mkSingle` would be stuck with the signature `(x: boolean): number | number[]`; which branch of the return type you end up with depends on `x`, but there's no way to make TypeScript properly understand the connection.
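
For concreteness, that tutorial example looks roughly like this:

  isSingleton : Bool -> Type
  isSingleton True  = Nat
  isSingleton False = List Nat

  mkSingle : (x : Bool) -> isSingleton x
  mkSingle True  = 0
  mkSingle False = []

The return type of mkSingle is computed by applying isSingleton to the value of its argument, which is exactly the kind of type/value interplay TypeScript erases.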


> Internally, the Idris2 compiler uses de Bruijn numbering for its terms, which is tricky to get right once you start trying to manipulate terms

I'm lost. Why does it use them; what does it get from doing so? Also if it's internal why would you as a language user (vs implementer) ever encounter them?


Dependent types allow for more nuanced statements to be made by types. You can, for instance, have lists that are sorted in ascending order as a type, or implement messaging protocols at the type level, where you can only construct a reply in the message exchange if you've received the correct preceding message.

It's very cool, but it's very expensive to compile, which is why it's still research stuff, I believe.


It's not only “compile expensiveness”: you quite literally have to prove your types in the program, which is seldom easy.

E.g. if you were to write a merge sort and encode in the type that it outputs a sorted array, you would have a very hard time actually proving it correct, akin to algorithm analysis.

Though to be fair, most dependently typed languages allow for types without proofs, and having it in the type that an array is sorted, for example, can be valuable even without formal verification.


One way to think about it is that you can extend your testing to prove the correctness of your program up to some notion of what “correct” means. You can reason about your program in your program, and that allows you to capture another class of errors than regular static typing does. In principle, a strong enough dependent type system allows you to encode mathematics, and you can then prove things about your program in the same language as you are writing it. As for practical examples, you should look at Idris and some of the talks given about it. They do very fancy things while still being somewhat performant. Besides errors and testing, having dependent types also allows your IDE to reason about your program and can give much stronger hints and code completion.

But I agree that it is often a hard sell for normal developers. Given the choice, I would happily write large code bases in a dependent language, but I am also very biased having worked a lot in them.


One really nice "everyday" example that comes from Idris that I really like is the ability to typecheck a printf function. You provide the format string, and then the typechecker will know what to expect from the rest of the arguments!

Usually that check is hard-wired into C compilers, so that "printf("%d", 2.3f)" becomes an error, but with dependent types you can do it at the user level, in library code.


And here is a gist that demonstrates how simple it is to implement such a type-safe printf function:

https://github.com/mukeshtiwari/Idris/blob/master/Printf.idr
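
The core of the trick looks roughly like this (a condensed sketch; the names may differ slightly from the linked code):

  -- Parse the format string into a structured description
  data Format = FInt Format | FOther Char Format | FEnd

  format : List Char -> Format
  format ('%' :: 'd' :: cs) = FInt (format cs)
  format (c :: cs)          = FOther c (format cs)
  format []                 = FEnd

  -- Compute the *type* of printf from that description: each FInt
  -- adds an Int argument, and the final result is a String
  PrintfType : Format -> Type
  PrintfType (FInt f)     = Int -> PrintfType f
  PrintfType (FOther _ f) = PrintfType f
  PrintfType FEnd         = String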


It seems to be doing string ops at compile time and emitting a suitably typed function, e.g.

   format ('%' :: 'd' :: cs ) = FInt ( format cs )
Does that make it a bit like template metaprogramming, if template metaprogramming could disassemble strings?


Idris uses the same language for run-time and compile-time; and for value-level and type-level. There's no need for a separate template language, or even a separate language for writing types. There's nothing string-specific; it's just using ordinary functions from the standard library :)


Intriguing! When working on its own programs does it interpret or compile the functions?


(Note that I've not used Idris 2.x yet, so some of this might be out of date)

The type-checker uses an interpreter. In fact there are a few steps involved:

- The "elaborator" translates Idris code into a simpler, more verbose intermediate language (an extension of lambda calculus). Idris programs can actually do metaprogramming/codegen by controlling this elaborator: http://docs.idris-lang.org/en/latest/elaboratorReflection/el...

- The type checker can beta-reduce (AKA interpret) expressions which occur in types. The rules of this interpretation are a little different to normal Idris evaluation: in particular, Idris is call-by-value, whilst the type-checker isn't (I think it's lazy?). For example, say we had a program like this:

  -- This lets us write types like 'Equal Int 1 1' and 'Equal String "hello" "bye"'
  data Equal: (t: Type) -> t -> t -> Type where
    -- This is the only constructor for such 'Equal' types. Notice that it can
    -- only return values of type 'Equal t x x', i.e. values only equal themselves
    refl: (t: Type) -> (x: t) -> Equal t x x

  -- A simple implementation of natural number addition
  data Nat: Type where
    zero: Nat
    succ: Nat -> Nat

  plus: Nat -> Nat -> Nat
  plus zero     y = y
  plus (succ x) y = succ (plus x y)

  -- A proof that 0 + n = n for all n 
  zeroLeftIdentity: (n: Nat) -> Equal Nat (plus zero n) n
  zeroLeftIdentity n = refl Nat n
Let's type-check this 'zeroLeftIdentity' function:

Its return type is 'Equal Nat (plus zero n) n', and its return value is 'refl Nat n'. The definition of 'refl' tells us that value has type 'Equal Nat n n', so this function is well-typed iff 'Equal Nat (plus zero n) n' is the same as 'Equal Nat n n'.

These two expressions are not "intensionally equal" (i.e. they aren't the same sequence of symbols). However, Idris is a bit more lenient, and will accept 'beta-equivalence' (i.e. running them gives intensionally-equal results). In this case, we could run the function call 'plus zero n' to see if it returns 'n'.

However, there's a problem: Idris (like most programming languages) is call-by-value, so it evaluates all of the arguments of a function call before evaluating the function body. For example, consider how the following program is evaluated:

  times zero     y = zero
  times (succ x) y = plus y (times x y)

  double x = plus x x
  square x = times x x
  doubleThenSquare x = square (double x)


  doubleThenSquare (succ zero)
  square (double (succ zero))
  square (plus (succ zero) (succ zero))
  square (succ (plus zero (succ zero)))
  square (succ (succ zero))
  times (succ (succ zero)) (succ (succ zero))
  plus (succ (succ zero)) (times (succ zero) (succ (succ zero)))
  plus (succ (succ zero)) (plus (succ (succ zero)) (times zero (succ (succ zero))))
  plus (succ (succ zero)) (plus (succ (succ zero)) zero)
  plus (succ (succ zero)) (succ (plus (succ zero) zero))
  plus (succ (succ zero)) (succ (succ (plus zero zero)))
  plus (succ (succ zero)) (succ (succ zero))
  succ (plus (succ zero) (succ (succ zero)))
  succ (succ (plus zero (succ (succ zero))))
  succ (succ (succ (succ zero)))
This evaluation strategy doesn't work for our 'plus zero n' example, since we don't know the value of 'n'. Our type must be correct for all n, so consider a concrete call like 'zeroLeftIdentity (doubleThenSquare (succ zero))'. Its type would be 'Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))', which would reduce like this via call-by-value:

  Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))
  Equal Nat (plus zero (square (double (succ zero)))) (square (double (succ zero)))
  ...
  Equal Nat (plus zero (succ (succ (succ (succ zero))))) (succ (succ (succ (succ zero)))) 
  Equal Nat (succ (succ (succ (succ zero)))) (succ (succ (succ (succ zero))))
Notice that the call to 'plus zero _' was the last thing we evaluated, since call-by-value requires that its arguments ('zero' and 'doubleThenSquare (succ zero)') be fully-evaluated first. Going back to the general case, where we don't know what 'n' is, we cannot evaluate the type 'Equal Nat (plus zero n) n' any further; it's "stuck".

However, we can use a different evaluation strategy, for example call-by-need (AKA "lazy evaluation", as used by e.g. Haskell and Nix). In that case we evaluate function bodies before evaluating their arguments (treating arguments more like local variables). If we use call-by-need, the above reduces like this:

  Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))
  Equal Nat (let y = doubleThenSquare (succ zero) in y) (doubleThenSquare (succ zero))
  Equal Nat (doubleThenSquare (succ zero)) (doubleThenSquare (succ zero))
  Equal Nat (let x = succ zero in square (double x)) (let x = succ zero in square (double x))
  Equal Nat (let x = succ zero; x2 = double x in times x2 x2) (let x = succ zero; x2 = double x in times x2 x2)
  ...
  Equal Nat (succ (succ (succ (succ zero)))) (succ (succ (succ (succ zero))))
Notice in this case that we don't actually need to fully-evaluate these expressions: they matched the required pattern by the third line! Since the actual value of 'n' is irrelevant for those first few lines (it's just getting passed around unevaluated), that works for all n, which is what we need, i.e. 'Equal Nat (plus zero n) n' reduces (lazily) to 'Equal Nat n n', and hence the declared return type is beta-equivalent to the type of 'refl Nat n', and hence zeroLeftIdentity is well-typed.

Hence type-checking tends to use call-by-need (or the less-efficient call-by-name) evaluation, since it prevents expressions involving unknown variables from "getting stuck" unnecessarily; even when the actual language we're checking uses some other strategy, like call-by-value.

PS: Regarding the actual code outputted by the compiler: the Idris compiler has multiple targets, including C, JavaScript, etc. Recently it's added a backend based on Chez Scheme, which I think is actually faster than compiling to C.


That's an amazing answer, thanks! This is going to take a bit of time to digest but much appreciated.


I was gonna say that's possible in Rust and also isn't a big problem, but the more I think about it, the more this comes up.

First, languages where a compiler plug-in or linter does this don't really count, because you can't extend it and they rely on specific tools. Even Rust, which has relatively good meta-programming, is very cumbersome to use. Writing proc macros is known as an arcane skill that – even though it is technically also in Rust (which is nice) – involves heavy manipulation of ASTs and has numerous pitfalls. It would be better to leave the ASTs to compiler folks if possible.

I get the sense that these features can be really useful for deserialization, which is a very common use case in regular software development. Even where it's safe, like in Rust, deserialization is prone to logic errors and often requires lots of boilerplate (unless you use an existing library, but then you must rely on unintelligible proc macro wizardry).


It's not possible in Rust. Yes, you can write a macro for it or use the existing one, but now you get all the disadvantages of macros.

For instance, you cannot (to my knowledge) manipulate a macro with another macro. You can't feed the format macro into another macro to get a new macro. With types you can do that (obviously).


This is also possible in typescript which is a little more mainstream than Idris.


TypeScript's type system is not consistent though.


> As an example of what I'm looking for, I see regular static typing as a way to move a class of runtime errors to compile time errors, increasing your confidence in aspects that otherwise would need repetitive unit tests.

Dependent typing just makes it easier to do that, by making the type system more expressive. For example, a map where the values are of different types, but where the type for each key is fixed, is something that people do all the time in dynamic languages, but it's very cumbersome to express that to a static typechecker without dependent types, so even in a statically typed language people usually make do with a Map<String, Object> and have to use unit tests to confirm the behaviour.
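
For instance (an illustrative Idris sketch; the keys and value types here are made up):

  data Key = Name | Age

  -- Each key fixes the type of its value
  ValType : Key -> Type
  ValType Name = String
  ValType Age  = Nat

  -- An entry is a dependent pair: a key together with a value
  -- whose type is computed from that key
  record Entry where
    constructor MkEntry
    key : Key
    val : ValType key

  people : List Entry
  people = [MkEntry Name "Ada", MkEntry Age 36]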


IIUC, you can use it to ensure you never index outside the limits of your static array at runtime, this check being done at compile time.


One example I like is a binary search function that only accepts a sorted list.
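
A hedged Idris-style sketch of that idea ('Sorted' is an illustrative proof type, not a standard library one):

  import Data.Nat

  -- Evidence that a list of naturals is in ascending order
  data Sorted : List Nat -> Type where
    SNil  : Sorted []
    SOne  : Sorted [x]
    SCons : LTE x y -> Sorted (y :: ys) -> Sorted (x :: y :: ys)

  -- Callers must supply the sortedness evidence with the list
  search : (xs : List Nat) -> Sorted xs -> Nat -> Bool
  search xs prf target = ?search_impl  -- body elided in this sketch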


Well, you can do that in traditional languages as well, just wrap your list in a SortedArray type.

Chances are your “sorted” function won't formally prove the algorithm correct in a dependently typed language either.


I have a different pov about this. It's an intellectual need to encode logic at the type level; it's not showing lots of practical value yet, just like most interesting and generic techniques in the mainstream today.


It's a practical need. It allows for the compiler to enhance your productivity by stopping mistakes and errors as soon as possible.

For instance, you can't even write a generic sliding-window function supporting dynamic window lengths without this type-level logic, except of course if you don't mind your program blowing up on a mistake, or writing excessive tests for every piece of your code.
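
A hypothetical Idris signature for such a function; the arithmetic in the type guarantees there are exactly n + 1 windows of width k over a vector of length k + n:

  import Data.Vect

  windows : (k : Nat) -> Vect (k + n) a -> Vect (S n) (Vect k a)
  windows k xs = ?windows_impl  -- implementation elided in this sketch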


Well, can it spot race conditions? Or will it suddenly solve the halting problem?

Safety is not a binary value. Dependent types can increase safety, though they do come with a significant cost, which might indeed be better spent on additional testing.


> Safety is not a binary value

Exactly. Encoding logic in types doesn't have to fix all problems all the time. It's sufficient if the advantages outweigh the costs.



> I don't expect dependent types or logic programming to become widely-used in the next couple generations of mainstream languages

I do; these are low-hanging fruit for improving software engineering practices.


Here's my reasoning on why dependent types may still be a couple of language generations from the mainstream:

- Dependent types are hard to retrofit onto an existing language. (Retrofitting generics has had very mixed results, and dependent types are likely harder to retrofit?)

- It takes a good 10 years for a new language to really hit the mainstream, even if everything goes right. (This could change if a big player pushes a language.)

- All type systems represent a tradeoff, "Roughly how much extra confidence can I get in this program, and for how little extra time spent learning and coding?" Dependent types give you some very interesting new guarantees, but they require picking up new skills. But a future language in this space might make these tradeoffs more appealing!

- I'm not sure that any of the existing dependently-typed languages have hit the magic combination of features and tools that would be needed to break out.

In the nearer term, I can imagine dependent typing showing up in at least two niches:

1. If you're currently writing extremely high-assurance software, the sort of thing where you write a proof in Coq and then use that to generate C code, then something similar to Idris might be a win. Ditto for some people using model-checker tools like TLA+?

2. I could imagine a dependent type system being "bolted on" to unsafe Rust to formally verify correctness of code that regular Rust can't verify. This might require using Rust attributes and an external tool. But I could see it happening if somebody loved The Little Typer and wanted an ambitious thesis project, for example.

But even though I don't expect dependent types to go mainstream in the next 5-10 years, they're one of those ideas that might occasionally contribute a major inspiration to a challenging project.


The big problem I see with dependent types is that they ultimately require doing proofs whether you're using proof assistants like Coq or model checkers like (normal :-)) TLA+. Proofs are hard, for those who haven't had the training and experience (the coding bootcamp goes right out), and proof software is bizarre, opaque, and normal manual proof techniques don't apply or are difficult to translate.

That being said, things like Ada SPARK and Frama-C are the same sort of tools as dependently typed languages, but are more from the assertion model and so use imperative code rather than tossing functional language on top of everything else.


> The big problem I see with dependent types is that they ultimately require doing proofs... Proofs are hard, for those who haven't had the training and experience (the coding bootcamp goes right out), and proof software is bizarre, opaque, and normal manual proof techniques don't apply or are difficult to translate.

Proofs are programs (via Curry-Howard), so it's not completely unrelated to what programmers do day-to-day. Often, the correctness proof is as simple as beta-reduction, which languages like Idris will perform automatically; although this usually relies on a good choice of representation, which is a skill in itself. Outside of the happy path, things certainly do get harder, but there's always a choice of whether to just skip the proof and say "trust me", which is no worse than other approaches.

A nice example is vector concatenation:

  // Peano encoding of natural numbers; e.g. 3 = Succ (Succ (Succ Zero))
  data Nat: Type where
    Zero: Nat
    Succ: Nat -> Nat

  // Addition, by recursing on the first argument
  plus: Nat -> Nat -> Nat
  plus Zero     y = y
  plus (Succ x) y = Succ (plus x y)

  // Lists of a known length
  data Vec: Nat -> Type -> Type where
    VNil : (t: Type) -> Vec Zero t
    VCons: (t: Type) -> (n: Nat) -> (head: t) -> (tail: Vec n t) -> Vec (Succ n) t

  // Append a Vec of length m to another of length n:
  //  - The resulting Vec has length 'plus m n'
  //  - All elements of the inputs and output have the same type t
  append: (t: Type) -> (m: Nat) -> (n: Nat) -> Vec m t -> Vec n t -> Vec (plus m n) t
  append _ _ _ (VNil _)         y = y
  append t _ n (VCons _ l x xs) y = VCons t (plus l n) x (append t l n xs y)
The correctness proof of 'append' follows from beta-reduction. In the `VNil` case, we need to prove that the type of y matches the return type, i.e. that for all n and t, Vec n t = Vec (plus m n) t.

In this branch the first vector argument is VNil _ (where '_' indicates something we don't care about). According to the definition of Vec, this argument has type 'Vec Zero _', and according to the type of 'append' it has type 'Vec m t'. Hence Vec Zero _ = Vec m t, telling us that m = Zero (and this occurrence of '_' corresponds to t, but we don't care).

Hence the context of this first branch contains the binding m = Zero; so our original equation Vec n t = Vec (plus m n) t beta-reduces to Vec n t = Vec (plus Zero n) t. This occurrence of 'plus' matches the first branch of its definition, so we can beta-reduce further to get 'Vec n t = Vec n t', Q.E.D.

In the second branch of 'append', we need to show that the type of 'VCons t (plus l n) x (append t l n xs y)' is the same as the return type 'Vec (plus m n) t'. First we can check that the recursive call to 'append' is well-typed, i.e. xs has type Vec l t, and y has type Vec n t, which are both correct.

The return type of that recursive call is Vec (plus l n) t, matching the '(plus l n)' argument given to 'VCons'. Hence the type of the 'VCons' call is Vec (Succ (plus l n)) t, so we need to match this with the overall return type of 'append': Vec (Succ (plus l n)) t = Vec (plus m n) t

In this branch, the first vector argument is VCons _ l x xs, which (from the definition of Vec) has type Vec (Succ l) t. According to the type of 'append', it also has type Vec m t, so we know that Vec (Succ l) t = Vec m t, and hence m = Succ l.

Hence our equation for the return type beta-reduces to Vec (Succ (plus l n)) t = Vec (plus (Succ l) n) t. The occurrence of 'plus' on the right-hand-side matches the second branch of its definition, so we can beta-reduce this to: Vec (Succ (plus l n)) t = Vec (Succ (plus l n)) t. Q.E.D.

The steps above correspond exactly to "running the program" (i.e. evaluating/reducing the expressions), and is performed automatically in languages like Idris. Hence we don't need to write out any separate "proof", the definition of the 'append' function is proven correct just by evaluating the expressions which occur in its types (taking into account the context of the branches those expressions occur in). This is the "happy path".

The reason I like this example is that we can completely break it, simply by swapping around the 'plus m n' to 'plus n m' in the type of 'append'. That gives us expressions like 'plus n Zero' and 'plus n (Succ l)', which don't match the branches of 'plus', and hence won't beta-reduce. In order to make this work, we need to prove the general theorem 'plus x y = plus y x', and use that to flip the type into the original order. Proving that theorem requires proving lemmas like 'plus x 0 = x', etc. which is absolutely the sort of thing day-to-day programmers don't want to bother with!

On the other hand, it's certainly nice for library developers to have the option of proving such things. Again, we can always skip the proof if we don't feel it's worth the effort, and the result is no less safe than the non-dependent systems already used by those day-to-day programmers!


> Dependent types are hard to retrofit onto an existing language.

Dependent types are hard to retrofit onto existing problems. Software everywhere wants to work with data of indeterminate size/parametrization at runtime. You want your system that tracks inventory stock to work with three warehouses or four, etc... Dependent types make all that stuff part of the type system, so now "list of 3 warehouse_t" and "list of 4 warehouse_t" are no longer equivalent.

Obviously that's all "solvable", but not in a way that matches the intuition of generations of hackers who have been building these systems. It's in some sense analogous to Haskell's "Death by a Thousand Monad Tutorials". At some point you just have to recognize that this isn't the right metaphor for the problem areas it's trying to address and move on.


> Dependent types make all that stuff part of the type system, so now "list of 3 warehouse_t" and "list of 4 warehouse_t" are no longer equivalent.

Dependent types allow you to make that stuff part of your type system. But of course, Idris (and any other dependently typed language) still has "List" as a datatype which does not encode the length as part of its type - and if it didn't, it would be trivial to define. And it's easy enough to convert between the two types, should there be any need to either increase or decrease the precision of your types at different points in your program, depending on the need. There may not be a need to know at a type level how many items are in your cart, but there's surely a value in knowing, for example, that sorting a list doesn't change the number of elements, or that matrix multiplication is well-defined.

You may argue that merely having the capability to use dependent types will lead to people abusing and overcomplicating things - but somehow we're already doing a very good job overcomplicating things with much simpler languages, e.g. Java reflection magic, or Ruby metaprogramming, so I'm not sure the "we shouldn't give people the rope to hang themselves" argument is valid.

This is a bit different from the monad example, in that it's practically impossible to write Haskell without understanding monads, since you can't do IO without one.


Your point is good, but your example isn't. :-)

A dependent type can easily express "list of n warehouse_t's where 0 < n <= 1,000,000,000".
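
For instance, a hedged Idris-style sketch (the bound and the names are illustrative):

  import Data.Vect
  import Data.Nat

  -- A vector of warehouses whose length n is only known at runtime,
  -- packaged with proofs that 0 < n <= 1000000000
  data Warehouses : Type -> Type where
    MkWarehouses : (n : Nat) -> LT 0 n -> LTE n 1000000000 ->
                   Vect n w -> Warehouses w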


Not if the input data isn't known at compile time it can't. I get what you're saying, and admit that my example was glib and short, but the point remains: putting "state" data into the type system forces programmers to be explicit about things that were just part of the algorithm before.

The word "automagical" is a term of art for a reason. It's an ideal that our ancestors have been baking into the concept of "good code" for decades and decades, and dependent types throw it in the trash in favor of a variant idea of "correctness" that IMHO doesn't correspond to what society actually wants its computers to do for it.


> Not if the input data isn't known at compile time it can't

look at the idris example of printf (https://gist.github.com/chrisdone/672efcd784528b7d0b7e17ad9c...) for how to do this, the idris book has some really good example of how you can handle what you would think is "not known at compile time" but can actually be parameterized


Sure it can. It can be as simple as this (pseudo-code):

  warehouses = get_warehouses_via_io()
  if len(warehouses) == 0:
      -- Here you have a proof that warehouses is empty
      return Err("Oh no couldn't get warehouses")
  else if len(warehouses) > 10000:
      -- Here you have a proof that warehouses contains more than 10000 elements
  else:
      -- Here you have a proof that the number of warehouses is between 1 and 10000 inclusive
      do_stuff(warehouses)

Even Haskell, which barely has dependent typing, can do that. You simply have to introduce a proof that your list's length is between those bounds.


Once again, I'm not saying you can't do it. I'm saying that doing it requires filling your app with that sort of boilerplate nonsense when five decades of software engineering have been trying to remove it and considering it good practice to have done so.


Honest question: if the different list lengths are relevant to the business and they are not in the types, where do you think they go?

They surely don't disappear, just like type errors and bugs don't disappear in dynamically typed languages.

No, it's just that instead of in the code they are now in your head (but not your coworkers' heads). Is this really better?


If statements are boilerplate? How else would you do branching? Gotos? There are definitely cases where dependent types add a lot of boilerplate. But it's not these simple cases. And guess what. You pay for what you use. If you don't want to prove something and just run your program then you can just do it.


By "boilerplate nonsense" do you mean input validation?

Because that is all that is.


It isn't, though. "Write a program to sort the words you find in the input file" is something everyone here learns in leetcode training, and when done "properly" requires no "validation" of the input (beyond stuff like runtime error checking of allocations I guess).

I shouldn't have gotten involved. And I think I need to retract my hedging above: this is exactly like monads. People caught up in the mania just can't understand why the rest of the world doesn't like their solution and insist on it being some kind of failure on the part of humanity to understand it.


> "Write a program to sort the words you find in the input file"

That's as trivial to write in a dependently typed language as it is in, say, Python (leaving aside the fact that you don't like monads), e.g. Idris:

  main : IO ()
  main = do
    Right contents <- readFile "myfile.txt"
      | Left err => printLn err
    let theLines = lines contents
    let sorted = sort theLines
    -- do something with sorted
Of course, that doesn't use any dependent types, but nobody says that you have to use dependent types for a program so trivial you're unlikely to screw anything up.

(And also: the standard library could have instead defined the function "lines" such that it does use a dependent type. In that case, almost nothing about the program would change, except you would now write "let (n ** theLines)" because the length is now part of the return value.)

Dependent types are for where it makes sense to encode invariants, e.g.

  matMult : Matrix k m -> Matrix m n -> Matrix k n
And of course, that function works for matrices of arbitrary dimensions, so the argument "we can't know the size of the matrices at compile time" doesn't apply.


People have been giving you counter examples left and right while you keep it abstract and non-concrete. Why not just provide simple examples that show what you mean? Stomping your feet and calling other people maniacs is not really a great look.


A dependent type can also just be "list of n warehouse_t", where n is completely unconstrained. I feel the constraint on the size of the input has made the discussion much more complicated than necessary, because now we need to carry around proof types, which are a more advanced use case.


True, but at that point you're not using dependent types.


You are, because the type still depends on the value of n. It's just that that value is unconstrained. It can still be used parametrically in type signatures, though, e.g.

  append : Vect n a -> Vect m a -> Vect (n+m) a
That's not possible without dependent types.


I think variants of Datalog have some traction

https://en.wikipedia.org/wiki/Datalog

Full Prolog, though, is not that great a language. At first it looks clever that you can write imperative code by taking advantage of the implementation, but it's actually really awkward. It was hoped that you could parallelize Prolog, but you can't unless you weaken the language and make it even more awkward.


> It is also possible, to some extent, to use dependent types to replace proof tools like Coq.

Not sure what you mean here - Coq is a proof tool _based on_ dependent types. It's a dependently typed programming language with a mechanism for constructing proof terms in an (arguably) more pleasant or natural way bolted on top.


They probably mean that bringing dependent types into a programming language proper partially obviates the need for Coq and similar systems. Having dependent types, and having mechanics in-language to produce proofs, would cover a large part of what people (who are producing programs and not focused strictly on the theorem-proving side) are using Coq and others for.


For others: I think what was meant was "The Reasoned Schemer": https://mitpress.mit.edu/9780262535519/the-reasoned-schemer/


Fixed. Thank you!


> I don't expect dependent types or logic programming to become widely-used in the next couple generations of mainstream languages, but they're still fascinating.

Why have an expectation one way or the other? The sophistication of the compile-time analyses added to, e.g., Rust makes it a plausible place for such features to appear.


A very mind-blowing and challenging book about types.

The Strange Loop talk and my own interview of the authors might give some sense of what it's all about. There is also some interesting work online showing how the language used in the book can be created.

https://www.youtube.com/watch?v=VxINoKFm-S4

https://corecursive.com/023-little-typer-and-pie-language/

https://davidchristiansen.dk/tutorials/nbe/


I was at that talk by David (the first link). Highly recommended watch; he is such a good presenter. The way he went full circle there and had the room in raucous applause at the end was something I never forgot.


Thanks for the links! If Haskell is more your style than Racket, there's a Haskell version of the implementation tutorial at https://davidchristiansen.dk/tutorials/implementing-types-hs... .


The Strange Loop talk was very enjoyable. You are a gifted public speaker. And I just want to say that I am excited to read your book on Lean in full, too!


For a way more practical introduction to coding a simple type checker I like [1].

For a practical intro to type inference I love the articles on Eli Bendersky's blog [2]. Follow the links to learn also about underlying concepts (unification, logic programming, etc).

--

As an aside, I know a lot of people love the "The little ...X..." series of books but in my personal opinion they are unnecessarily cryptic. For me, the supposedly pedagogical style of teaching feels confusing and frustrating.

Not sure what came first, but that book series makes me think of a trend that possibly started with "_why's poignant guide to Ruby". I just feel like at this point of my life I'm a cranky programmer who just wants to get things done. The extra "sugar" and whimsy does nothing for me.

I know, I know... I must sound like a party pooper... Sorry if you like these books... I just feel like there's a lot of praise for them and not much of a counter opinion, so I thought I would share mine :-).

--

PS. I used to be dismissive of blog posts compared to books, but I eventually discovered that there are subjects that are way better documented by individual blog authors, often demoing with ++working code++ in widely popular languages like Python or JavaScript! I'm very grateful to these people who take the time and effort to share their knowledge, and I'm sorry I did not come to this realization earlier in my career.

--

1: https://pragprog.com/titles/tpdsl/language-implementation-pa...

2: https://eli.thegreenplace.net/2018/type-inference/


I agree with you. I’ve gotten about halfway and I don’t like the style. It lacks some much needed explanations. And while I can appreciate levity, too much of this book is frivolous chatter, IMO.


I've been reading through this and greatly enjoy it - up until chapter 9, which is like a giant boulder that tumbled to block a nice hiking trail. The definition of replace is very weirdly presented compared to everything that has come before, and in reality it isn't even very difficult to understand! It's just a way of rewriting expressions using known/existing statements of equality, which is a very basic operation in dependently-typed proof languages like Lean. A few other reviews I've found online also mention difficulty with chapter 9. Maybe I'll write a blog post that re-writes the introduction to this chapter so others aren't stopped in their tracks like I was. If I weren't on sabbatical and had other work concerns I probably would have quit reading the book here.

Chapters 1-6 I only needed to read once, chapter 7 (introducing induction) I had to read twice, and chapter 8 (introducing inductive proofs of equality) I also read twice. Chapter 9 I'll probably read at least twice (although getting started reading it was the hard part) and allegedly the book is much less difficult after this point.

I'm using this book to deepen my understanding of the Lean theorem proving language. It's a testament to how well the dependent type checking/theorem proving equivalence works that I've used Lean a fair bit without understanding at all that I was just writing a program to construct a value whose type is the theorem I am trying to prove.


Yup - I had a similar experience [1].

Chapter 9 is certainly the weakest part of the book but it picks up very quickly thereafter.

[1] https://www.thatgeoguy.ca/blog/2021/03/07/review-the-little-...


Well, I did it. I wrote an extended dialogue explaining the replace function as a blog post: https://ahelwer.ca/post/2022-10-13-little-typer-ch9/

I'd found your review a week or so ago which was nice since it reassured me I wasn't the only person having trouble with it.


Idris has a replace function, and it's hard to understand and use there too. I consider its use to be a code smell, and in my experience replace expressions can always be rewritten in other ways to be clearer and less verbose.
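
For reference, it has roughly this type (given a proof that x equals y, it turns a p x into a p y):

  replace : {p : a -> Type} -> x = y -> p x -> p y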


That's interesting, coming from the Lean language the analogous function - rw, or rewrite - is a real workhorse. It's often very useful to manipulate expressions when proving things. Why do you think it's a code smell?


Dependent types are very cool and very frustrating. Once you get it, it opens up a whole new way to express program behavior; but until then, it all seems like pointless busywork. The Little Typer is somewhat tougher to get through than other Little books because the benefits of the paradigm it explains aren't apparent until you understand everything.

For anyone struggling to make sense of it, I suggest looking at the end of chapter 15, where a proof that the Principle of the Excluded Middle is not false is derived step by step. I found it analogous to the derivation of the Y combinator from The Little Schemer in terms of being a "holy shit" moment.

ThatGeoGuy's review gives a nice overview of the book's contents. I also wrote a detailed review that discusses what's cool and what's frustrating about the book:

https://nickdrozd.github.io/2019/08/01/little-typer.html


I'm intrigued. Will this book do anything to help me become a better Rust developer, or will it only be valuable if I start using languages like Idris or Agda?


No and no. Err, no. :-p

IMHO learning Rust is the most effective way of learning Rust...

This is half tongue in cheek but really, this book is not even my fav when it comes to learning about type checking and type inference, simply and quickly (see my other comment in this post).

Also, I feel like the really different thing about Rust compared to other strictly typed languages is actually the borrow checker. Honestly I have no idea how it's implemented, or whether this book will help you build one. Maybe someone in here may know.


I've read through maybe half and doubt you'll find it useful for Rust development. If you're interested in learning a dependently typed language it is fantastic though.


Actually, the dependent-type view gives clarity to thinking about type-level functions, like container types parameterized over an element type.


It may help you to have a firmer grasp of how a type system and its value language interact, which is important in understanding any language with a sophisticated type system, Rust included. But it'll be fairly marginal, I suspect.


No.

In fact if you're intrigued I recommend you read the TAPL book first as a prerequisite for this book: https://www.cis.upenn.edu/~bcpierce/tapl/

Even then, it might not be suitable for you because learning to be a developer in a language is very different from being an expert in the implementation of a language (specifically the type checker). Just flip to the appendix of the book and look at those type deduction rules; do they interest you?


Related:

Book review: The Little Typer (2021) - https://news.ycombinator.com/item?id=31465368 - May 2022 (23 comments)

The Little Typer - https://news.ycombinator.com/item?id=18046745 - Sept 2018 (132 comments)


A beautiful book, written in the "The Little..." series' question and answer format. A challenging topic but just about as accessible as I can think anyone can make it. Warmly recommended if you want to know what dependent types are all about.


A number of modern languages (F*, Lean, Coq, Agda, Idris, …) use dependent types. They enable you to prove code correct using just the type system. Very cool.

I highly recommend the book “Type Driven Development”. I prefer that to “The Little Typer”.


This is just a marketing site, yeah? Nothing that I can sample?


Here's a Strange Loop talk where one of the authors covers some of the content of the book: https://www.youtube.com/watch?v=VxINoKFm-S4


Not even a description anywhere that I can find on mobile.


Click "Look inside" on the book's Amazon page.


    This space reserved for JELLY STAINS!


Pie reference docs, in case you'd skim something before committing to the book: https://docs.racket-lang.org/pie/


(2018)


Is there a book like this that's for people who know absolutely nothing, but has more explanations?


For dependent types? I'm really fond of Type Driven Development With Idris.


The Little Schemer is probably this.


Hmm, I don't think so. This book is about dependent types, The Little Schemer is more about introducing Lisp/Scheme fundamentals.

FWIW I found The Little Schemer remarkably devoid of essential context; I found myself wondering "that's interesting, but why is it important?" far too often.


I've run into something similar going through these books. For me to be successful in going through these and understanding the importance of each piece, I think I'd need a project per-chapter to help me gel what I've learned.


Yeah, I read this; Scheme doesn't involve types.


There is a large sample of this available on Amazon's "Look Inside"


HoTT?


What is this?



From [1]

> Homotopy type theory is a new branch of mathematics that combines aspects of several different fields in a surprising way. It is based on a recently discovered connection between homotopy theory and type theory. It touches on topics as seemingly distant as the homotopy groups of spheres, the algorithms for type checking, and the definition of weak ∞-groupoids. Homotopy type theory offers a new “univalent” foundation of mathematics, in which a central role is played by Voevodsky’s univalence axiom and higher inductive types. The present book is intended as a first systematic exposition of the basics of univalent foundations, and a collection of examples of this new style of reasoning — but without requiring the reader to know or learn any formal logic, or to use any computer proof assistant. We believe that univalent foundations will eventually become a viable alternative to set theory as the “implicit foundation” for the unformalized mathematics done by most mathematicians.

This sounds fascinating, especially the idea of swapping out set theory as a foundational idea.


Homotopy Type Theory.

No idea beyond that.



