Haskell's kind system: a primer

fmap · on Oct 18, 2018

That's a really beautiful article!

Using kinds to keep track of the concrete representation of a reference to a type (rather than the concrete representation of the type itself) is one of the real gems in GHCs design.

However, one problem with it is that a polymorphic type, such as [] of kind "* -> *", or a function like map of type "forall a b. (a -> b) -> [a] -> [b]" really only works with boxed types. I always wondered how expensive it would be to monomorphize everything at the level of kinds and have what GHC calls levity polymorphism by default.

Languages like Rust and C++ always monomorphize everything and while this does get slow it is still not the exponential blowup that theory suggests. In fact, there is an ML compiler (MLton) which does the same and it is still not too bad. Doing it at the level of reference representation - which is where the generated code actually needs to change - should be much cheaper and definitely a practical default.

The only real problem I see is that you could in principle write code that does "kind polymorphic" recursion, which couldn't be compiled away. But that sounds seriously crazy and I see no reason to allow it in the first place...

Athas · on Oct 18, 2018

> However, one problem with it is that a polymorphic type, such as [] of kind "* -> ", or a function like map of type "forall a b. (a -> b) -> [a] -> [b]" really only works with boxed types.

Can you explain why this is, or provide a reference to literature where it was investigated? I still (even after getting a PhD in compiling functional languages!) do not understand why you could not create code parameterised over the sizes of the arguments. For example, a 'cons' cell could be represented as first the size of the 'car' in bytes, then the 'car' value inline/unboxed, then a pointer to the 'cdr' cell. The Sixten language[0] does this, although it's still very early in its development. I'm not saying this is necessarily a good solution, I just wonder why I have never seen it seriously considered (and perhaps rejected).

In practice, I do believe that monomorphisation is very often a good strategy.

[0]: https://github.com/ollef/sixten

gpderetta · on Oct 18, 2018

> why you could not create code parameterised over the sizes of the arguments

If I understand correctly, that's what BitC was trying to do, but the extreme complexity of the compilation model was cited as one of the reasons the project failed.

The postmortem is here I think, but I can't access it right now:

http://www.coyotos.org/pipermail/bitc-dev/2012-March/003300....

lmm · on Oct 18, 2018

> The only real problem I see is that you could in principle write code that does "kind polymorphic" recursion, which couldn't be compiled away. But that sounds seriously crazy and I see no reason to allow it in the first place...

Wouldn't you want kind-polymorphic recursion for implementing recursion-schemes like patterns at the type level?

divan · on Oct 18, 2018

The article is exemplary – it's reasonably short, concise, beautifully visualized and explains a complex topic with simple terms.

I'm confused with "type promotion" in the `ConnectionStatus` example, though. When we promote `Open` and `Closed` from terms to types, we make it easy to create a variable that can be either `Open` or `Closed`, but we're also breaking the mental model of the whole thing. `Open`/`Closed` is a property – not a "type" in a traditional way of thinking about types. For me, the type should map onto some abstraction in your mental world map, and `Open`/`Closed` is definitely not a good fit here.

How do Haskell programmers address this problem? Do they think about types more in mathematical terms, rather than as tools for describing mental maps of the problem domain?

tiniuclx · on Oct 18, 2018

Open/Closed would still conceptually act as properties, but by promoting them to types you enforce the property at compile time rather than runtime. Note that they are not used on their own, but to describe a `Connection`. This lets you statically enforce the fact that you must only send data through an open connection, for example.

divan · on Oct 18, 2018

Right, that's the obvious benefit for the compiler (statically enforcing behaviour, etc), but I'm struggling with not seeing here a detremential effect for the human.

To me, when you write code, you're encoding your understanding of the problem domain (mental map) into the code, describing it in a way that permits to restore it back. By reading a code your coworker or you 6 month later should be able to restore the same mental map you had while writing the code. And using type system to describe properties (there is another way to do it already in place – fields in structs), seems like complicating this task and it eventually should hurt readability.

almostdeadguy · on Oct 18, 2018

The advantage is you basically don't have to think about the provenance of data (in this example what state is a connection in? was it closed before being given to me? etc.). It allows the compiler to ensure the user of this code can't perform operations that lead to error states by allowing the programmer to express the state transitions of a connection in their function declarations, as the provided example shows:

  newConnection     :: Address           -> Connection Closed
  openConnection    :: Connection Closed -> Connection Open
  closeConnection   :: Connection Open   -> Connection Closed
  connectionAddress :: Connection s      -> Address

From a readability perspective, I would love this, it's effectively a state machine diagram explained by your function types.

thedufer · on Oct 18, 2018

Statically enforcing behavior helps the human as well - in this example, it would make it more clear what types of things you can do with an open connection vs a closed one just by inspecting the types, whereas otherwise you would have to either find documentation or test it and see.

We use type tricks like this (although not exactly in this way, as we don't really have a concept of kinds) in OCaml, and we find that in practice people new to the language aren't actually confused by it much - it actually makes readability better, not worse.

lmm · on Oct 18, 2018

> When we promote `Open` and `Closed` from terms to types, we make it easy to create a variable that can be either `Open` or `Closed`, but we're also breaking the mental model of the whole thing. `Open`/`Closed` is a property – not a "type" in a traditional way of thinking about types. For me, the type should map onto some abstraction in your mental world map, and `Open`/`Closed` is definitely not a good fit here.

I don't think I follow? Something open is different from something closed; the construction of an open thing is different from the construction of a closed thing. The operations you can do on them are different. They're different varieties of thing, they should have different types.

Edit: is your point that you wouldn't ever have a term that's just Open or Closed, only a Connection[Open] or Connection[Closed]? I guess that's true but it doesn't seem important. If you're comfortable with parameterised types at all then you're already comfortable with types that aren't directly applied to terms - e.g. you'll never have a term that's a List, only a term that's a List[Int] or what have you.

divan · on Oct 18, 2018

> Something open is different from something closed

Right, but that's just a property of "something", not a type of it. When someone asks you "What type of door you have?" we don't say "I have an open door", rather "I've a timber door" or "glass door". This is how we understand the world after all, and to me type system should reflect our mental and cognitive processes.

> If you're comfortable with parameterised types at all

I think I have the same issue with parametrized types as well. Having `Vector<int>` is good for compiler, but not necessarily for humans. When you need to put a beads onto the string (the closest real-world analogy I came up with vector, haha), you don't think "hey, I need to find a string of type "beads" and then put beads onto it". You just take a string, and it implicitly "works" for beads. If you try to put something weird (why would you, btw?) onto it, you'll break the whole thing or it just didn't work. I'm not implying that we need to do everything as in offline world, but at least try to minimize discrepancies in the reasoning about the world in the code.

uryga · on Oct 18, 2018

> When someone asks you "What type of door you have?" we don't say "I have an open door", rather "I've a timber door" or "glass door".

I like analogies, but I don't think bringing the real world into this helps – irl, "is a" relationships are pretty fuzzy and arbitrary. What does it mean for something to "be" a door? If I'm using a window as a door, does it become a door? "Being a door" is just another property. And if I have a door and start chopping away pieces of it, at what point does it stop being a door? "things" often don't have well defined boundaries, so even the notion of "objects" becomes meaningless. (see also: Ship of Theseus). A lot of it is up to a person's perception too... So in general, I think it's better to keep the real world separate from formal/mathematical concepts.

---

Encoding some "properties" in the type system is a tool for ensuring (some degree of) correctness. Sometimes, a property is important enough that you want to put it in a type variable, as in `data Connection (status :: ConnectionStatus)`. But if you'd prefer your properties to be "disconnected" from the actual type, you might enjoy dependently typed languages[1] like Idris, where you can have statically enforced "proofs" that a value has some property. In such a system, a function `readData` could take a `Connection` and a proof that the connection is open. It'd look something like this:

    readData : (conn: Connection) -> IsOpen conn -> Bytes

Note that the proof, `IsOpen conn`, refers to the value of the first argument. How do we obtain those proofs is up to the connection library. For example, `openConnection` could return a connection along with a proof that the connection is open:

    openConnection : Address -> ((conn: Connection), IsOpen conn)

The library could also supply a "smart predicate" function that gives you proofs of a connection's status:

    status : (conn: Connection) -> Either (IsOpen conn) (IsClosed conn)

Take this with a grain of salt though – I haven't actually used Idris, just read a few tutorials.

lmm · on Oct 18, 2018

> Right, but that's just a property of "something", not a type of it. When someone asks you "What type of door you have?" we don't say "I have an open door", rather "I've a timber door" or "glass door".

I think that's just because a given door can change between open and closed and be the same door? We're used to persistent global state and monotonic wall-clock time, but those are bad for programming and we should take the trouble to cast off those intuitions.

> I think I have the same issue with parametrized types as well. Having `Vector<int>` is good for compiler, but not necessarily for humans. When you need to put a beads onto the string (the closest real-world analogy I came up with vector, haha), you don't think "hey, I need to find a string of type "beads" and then put beads onto it". You just take a string, and it implicitly "works" for beads.

The point isn't that the string itself has a type, it's that a string of beads is a different kind of thing from a string of charms or a string of lights (at least, to the extent that a bead is a different kind of thing from a charm or a light). And you can't have a "string of" on its own - it's always a string of something - but there are things that it makes sense to do to a string of things without regard to the specific type of thing that's on it (e.g. count how many items are on it).

Any kind of abstract concept has that issue. You can't ever get just "a square" in the real world - you can get a square of wood or a square of plastic or a square of cake, but it'll always be a square of something. Or you can't get "three" on its own - you can only ever get three of something. If we ask what type of thing you're sitting on, maybe it's a wheeled chair or a three-legged stool, and those types make sense even though it wouldn't make sense to say that you're sitting on "a wheeled" or "a three-legged".

Fellshard · on Oct 18, 2018

Domain modeling is fungible. In most cases, I'd expect it to be a property. In some cases, it could be it's better represented as separate types, especially if it makes both domain and software logic simpler to reason about and change.

tiniuclx · on Oct 18, 2018

I've been writing a lot of Haskell recently. I had to switch to embedded C and the difference between the two was striking. The C program I was editing compiled, but failed to do anything useful. To fix that, I had to reorder the stateful functions and #define the right things without much help from the compiler.

Had I been using Haskell and a reasonably well designed library, these mistakes could've been caught at compile time, with the compiler telling me exactly what I did wrong. The kind system described in the article is how you would implement such a library, or something even more advanced.

Sure, comparing embedded C to Haskell is disingenuous, but it does show how far programming languages have come. And I think my experience could definitely happen in many other mainstream languages.

m_mueller · on Oct 18, 2018

What exactly is it that stops Haskell from being used in embedded? GHC has an LLVM backend, and LLVM should support most embedded architectures, am I mistaken there? I have some suspicions of what it could be, but it would be interesting to read your opinion/experience first.

tiniuclx · on Oct 18, 2018

Sure, you could likely compile a Haskell program to an embedded architecture, but Haskell is a garbage collected language. Allocating memory takes a long, nondeterministic amount of time on a microcontroller. And microcontroller are often used in control systems, where latency kills.

Haskell also suffers from space leaks due to its lazyness. Microcontrollers have memory in the order of kilobytes, sometimes even less than that. You simply can't risk running out of memory, which is why micros are programmed in C or assembly, where you have a vice-like grip on how much RAM you use.

For a higher-level look at the problem, consider that embedded systems are a high volume product. Saving a penny by using a cheaper, weaker micro adds up very quickly when you're shipping tens, or hundreds of thousands of devices. At that scale, it's profitable to spend more on non-recurring engineering (programmer time) in order to reduce the cost of manufacturing.

If you want a functional-style language that is likely very well suited to embedded devices, look no further than Rust. While I haven't had the time to learn it (maybe next year), the language looks very promising due to its memory model.

calebh · on Oct 18, 2018

I've worked extensively with functional programming on tiny microcontrollers like the Arduino. I created a research language called Juniper[1], which is a Haskell/F#-like language targeting the Arduino.

As you mentioned, the memory size and the program size are the biggest two constraints. There's no space for a garbage collector, and everything has to be allocated on the stack. It turns out that it is possible to make the entire thing stack-free. The two biggest hurdles are function closures, arrays, and recursive data types.

Function closures are typically allocated on the heap since it's possible to return functions from other functions. The solution is to include the closure as part of the function type, which means that they can then be allocated on the stack. Arrays are a bit more tricky to allocate on the stack. My solution is to statically size arrays with type level natural numbers, which means that the sizes are known at compile time.

Recursive datatypes are a problem that I have not dealt with yet. In most Arduino projects recursive datatypes are not really used, so I just don't allow them. It might be possible to use type level natural numbers to put a bound on recursion depth. Another idea I had was to use run-timed size datatypes. For example, let's say you have a function F1 which calls F2. The frame pointers for their corresponding frames are at S1 and S2. When F2 returns a value of size N, the solution is to copy the value to a scratch space, return to F1, decrement the frame pointer, and then copy the value back into the stack frame. I'm not sure if this is really suitable for embedded systems since it's a lot of code/machinery which takes up precious program space.

Rust is a great language which can work around these issues due to its linear types. I'm looking forward to using Rust once it starts to support the Arduino microcontrollers.

[1]: http://www.juniper-lang.org/

naasking · on Oct 18, 2018

Very cool! Hadn't heard of Juniper before. Do you use something like regions to group and inline memory allocations?

calebh · on Oct 18, 2018

In the next version of Juniper everything will be moved over to purely stack based allocation. Currently only function closures and refs are allocated on the heap. The refs are only created once when the program initializes itself, so that isn't an issue. The closures will become part of the function type. The FRP model works very well for the Arduino platform since you're essentially modelling a data-flow from input to output devices. Currently the compilation target is C++, which means that the Juniper compiler doesn't have a ton of control over the final assembly code.

naasking · on Oct 19, 2018

> The closures will become part of the function type.

Can you elaborate? Do you mean something like closures and function pointers will be distinguished, with closures essentially being something like an existential package paired with a function pointer? ie. ∃'x.'x * 'a -> 'b

m_mueller · on Oct 18, 2018

Thanks! That‘s pretty much exactly what I thought and I agree Rust looks promising for this purpose. On the other hand this is a variation of the two-language-problem and I wonder if it couldn’t in theory be solved with an approach like in Julia (even though it hasn’t so far in practice).

naasking · on Oct 18, 2018

You don't need to actually run Haskell on the target, you could just use Haskell as a imperative DSL which compiles to C. Copilot does this for hard realtime programs in avionics.

setr · on Oct 18, 2018

There’s not really any reason you couldn’t have a language that eventually offer almost all of haskell’s compile-level safety features, while keeping the vice grip, is there? Most of haskell’s benefits comes from its “sufficiently advanced compiler”, without reliance on gc benefits, I would think. Other than laziness, is haskell-the-language actually that dependent on a gc?

m_mueller · on Oct 18, 2018

A language is not just a set of possible semantics, it‘s also an ecosystem that herds people towards using it in a certain way. If that ecosystem breaks down under certain conditions the language becomes unusable. I think that aspect of language design is often underappreciated and maybe the main reason why we don’t have (and maybe will never) a one language to rule them all.

pjmlp · on Oct 18, 2018

While you are kind of right, when comparing managed languages with modern micro-controllers, people should go back and check the hardware capabilities of those 70's mainframes running Lisp binaries.

A PIC won't do it, but there are others that will surely be able to.

pflanze · on Oct 18, 2018

The PIC18 can run the following Scheme implementation: https://github.com/stamourv/picobit

(The (last of the original) main author(s) has stopped maintaining it, although there are many forks, some of which might be maintained, I haven't checked.)

pjmlp · on Oct 18, 2018

Thanks for the link.

kccqzy · on Oct 18, 2018

Mostly the large runtime. Which requires pretty extensive OS features.

But check out Haskell-based unikernels.

dllthomas · on Oct 18, 2018

Note, though, that even in C you can get some super useful help from the compiler if you structure your code for it.

MaxBarraclough · on Oct 18, 2018

I had no idea Haskell used 'kind' to refer to something codified and concrete in this way -- usually we see the word 'type' used to refer to the concrete type-systems stuff, and 'kind' to refer to the fuzzier, less precise idea of, uh, what it is the data is.

Type theorists say things like "Type systems let us reify what kind of data we are dealing with".

It's a bit like when Java took the fuzzy term 'interface' and made it a concrete technical concept, or when Apple added closures to Objective-C and called them 'blocks', which was previously a fuzzy term.

Will we have to start using another word to refer to the fuzzier concept, if not 'kind'? 'Sort' is already taken. 'Category' is already taken. Perhaps the 'nature' of the data?

hiker · on Oct 18, 2018

Going further one will need a word for "the kind of a kind", "the kind of a kind of a kind" and so on.

One solution is TypeInType, that is the type of a type is another type (not kind or something else). That's where Haskell is going with the work of adding Dependent Types to the language.

The problem with TypeInType is that it is logically inconsistent - it leads to Girard's paradox, analog to Russel's paradox in set theory.

The correct solution is type universes where every type lives in some type universe indexed by a natural number. For example terms are of type 'Type 1', 'Type 1' types are of type 'Type 2' and so on. The usual 'Type' is then just the first universe level 'Type 1' and 'Kind' is 'Type 2'.

Type universes is what most proof assistants employ, e.g. Agda, Coq, Lean. Idris uses TypeInType which is easier to use and there's no need to massage your program until all type universes type check but is logically inconsistent so one has to be careful to not reproduce Girard's paradox in some form or another.

lmm · on Oct 18, 2018

Kinds are a distinct concept from types, and that concept needs a name, and it's best to choose a name where the intuitive meaning of the name aligns with the technical one (indeed Haskell attracts a lot of criticism when it uses names like "monad" that don't have a "fuzzy" meaning).

It goes even further: types-of-kinds are called "sorts", though in Haskell there's only one possible sort of kind.

There are only two hard problems in computer science, after all.

Intermernet · on Oct 18, 2018

>There are only two hard problems in computer science, after all.

Including off-by-one errors, there are four.

MaxBarraclough · on Oct 18, 2018

Maybe we should add 'induction' to the list.

jacobush · on Oct 18, 2018

Naming and ??

(Computability?)

hiker · on Oct 18, 2018

"There are only two hard things in Computer Science: cache invalidation and naming things." -- Phil Karlton

But I do like "Computability and naming things" better.

gpderetta · on Oct 18, 2018

There are only two hard things in Computer Science: cache invalidation, naming things and off by one errors. -- Anonymous

kccqzy · on Oct 18, 2018

Are you just saying you don't like the term "kind" or something else? The name has been used since the last century.

MaxBarraclough · on Oct 18, 2018

I suppose I wasn't clear.

In my experience, in computer science, "kind" is generally used in an informal way, whereas "type" is used to refer to the technical concept of type in type systems.

If Haskell gives a precise meaning to "kind", I'm going to have be careful which word I use to refer to the informal concept.

As I said, type theorists say things like "Type systems let us reify what kind of data we are dealing with".

kccqzy · on Oct 18, 2018

I see. I think it's fine to overload an English word to have both a precise meaning or an informal one.

hiiq · on Oct 18, 2018

Objective-C called closures blocks because that is what they are called in Smalltalk.

Objective-C was modelled on Smalltalk.

imh · on Oct 19, 2018

The same happens with all sorts of collectiony words. You could have a set of things, or a group of them, or a variety or a category or a kind or a sort or a class or type. We're running out of collection words without formal definitions!

mlazos · on Oct 18, 2018

actually I’ve seen sort used for kinds of kinds in some PL papers that I’ve read. Not sure how official it is, some PL researchers I’ve talked to say that they don’t go higher than sorts.

ainar-g · on Oct 18, 2018

This was the clearest introduction into types vs kinds in Haskell that I have ever read. Definitely going to recommend to my friends. Are there other good articles by the author?

rnavarro · on Oct 18, 2018

His typeclasses article is also fantastic: https://diogocastro.com/blog/2018/06/17/typeclasses-in-persp...

calebh · on Oct 18, 2018

Once you understand kinds, it's not so hard to understand typeclasses and monads. Type classes are a way to overload functions in a controlled way, by defining a set of function signatures where the overall typeclass is parameterized by a type constructor of arbitrary kind. I cover this in my blog post here: http://www.calebh.io/Monads-as-Programmable-Semicolons/

state_less · on Oct 18, 2018

I always thought it was beautiful that Haskell turned type construction into just another function on a different domain. What do people call this abstraction?

kccqzy · on Oct 18, 2018

Type-level functions are called type families. They are really not just another function. But if you appreciate that beauty, check out a dependently types language like Idris where normal functions can be used on the type-level as well. So you might have a vector type with a length computed by a complicated expression.

danharaj · on Oct 18, 2018

Pronounced System F Omega.

https://en.wikipedia.org/wiki/System_F#System_F%CF%89

mlevental · on Oct 18, 2018

what is this blog format/static site generator thing?