Ironically, I'd point at GHC Haskell and Scala as the prime examples of what's s...

lmm · on Jan 3, 2018

It's the industry that only wants small improvements that it can digest one at a time. Look at how much anti-intellectual hostility Haskell and Scala attract (even on this very site) for the really quite small steps they're taking. It's not the academics' fault.

catnaroek · on Jan 6, 2018

I am not criticizing the academics behind Haskell and Scala for “not doing enough”. I am criticizing them for doing counterproductive “work”.

(0) Haskell and Scala are already very complicated, and they are becoming even more complicated at a very fast rate. This will only make life harder when we finally bite the bullet and admit that we need a formal semantics for the mess we have created.

(1) Moreover, this complexity only yields at best marginal increases in the extent to which these languages aid software verification. Effort would be better spent trying to figure out how much verification you can wrangle out of relatively simple type systems like ML's. I have actually tried this, and obtained (IMO) promising results: core types prove some simple things, proper use of modules proves a few more simple things, but most importantly, they make complicated things easier to (formally!) prove by other means.

(2) On more philosophical grounds, the extended community behind typed functional languages draws inspiration from the wrong source. Martin-Löf type theory is doubtlessly a beautiful foundation for mathematics, but it has a fatal flaw for programming purposes: it makes no distinction between values (which are already there, you just use them) and computations (which you need to run to produce values, and might have effects during the process). As a result, a programming language based on Martin-Löf type theory has to be crippled in unacceptable ways (e.g. totality, IOW, “fixing a prior notion of acceptable recursion”) to prevent it from being type-unsound.

(3) There is ample theoretical and practical evidence that no single language can be good for everything, so there will be situations in which we need to compose program fragments written in different languages. Alas, type theory is conspicuously silent about FFIs. Is this perhaps because of a refusal to acknowledge that computation is a prior notion, which types can help organize, but by no means define?

lmm · on Jan 8, 2018

> Haskell and Scala are already very complicated, and they are becoming even more complicated at a very fast rate. This will only make life harder when we finally bite the bullet and admit that we need a formal semantics for the mess we have created.

It's very hard to know what we need formal semantics for if we don't simultaneously try to write practical programs. There's a tick-tock pattern at work here: start with a clean formal language, figure out what extensions are needed to write practical programs in it, at first in an ad-hoc way, then figure out a way to make that extension in a sound, compatible way, then finally form a new clean formal language that accommodates these extensions. If you just want a language with formal semantics, ML or Haskell 98 exist already; Scala or GHC Haskell extend these to languages we can write industrial programs with, and then Dotty or Idris are clean formalisable designs with the lessons learned from this.

> Moreover, this complexity only yields at best marginal increases in the extent to which these languages aid software verification. Effort would be better spent trying to figure out how much verification you can wrangle out of relatively simple type systems like ML's. I have actually tried this, and obtained (IMO) promising results: core types prove some simple things, proper use of modules proves a few more simple things, but most importantly, they make complicated things easier to (formally!) prove by other means.

Not my experience. I've found HKT and dependent types extremely useful in practice; they give me a simple way to directly encode the business properties I care about verifying. Of course if you've come up with something useful I'm interested, and I'm sure many others would be, but I'd strongly disagree with the notion that what the academics are doing isn't useful.

> As a result, a programming language based on Martin-Löf type theory has to be crippled in unacceptable ways (e.g. totality, IOW, “fixing a prior notion of acceptable recursion”) to prevent it from being type-unsound.

Not at all convinced that e.g. Idris-style totality is "unacceptable". What's the problem with it?

> There is ample theoretical and practical evidence that no single language can be good for everything, so there will be situations in which we need to compose program fragments written in different languages.

Again not at all convinced. Certainly my practical Scala experience is that it's good for everything.

catnaroek · on Jan 8, 2018

> figure out what extensions are needed to write practical programs in it, at first in an ad-hoc way, then figure out a way to make that extension in a sound, compatible way, then finally form a new clean formal language that accommodates these extensions.

This is not possible, and the history of Haskell, Scala, and to a lesser extent OCaml, is precisely the evidence for it.

> Not my experience. I've found HKT and dependent types extremely useful in practice;

In practice, most uses of HKTs or even term-level HOFs are just bad workarounds for

(0) The lack of a proper hierarchical module system, allowing type components to be shared by fibration.

(1) The lack of properly designed data structure libraries with decoupled concerns. For example, you need traversals and folds to manipulate sets and maps, but you don't need them to manipulate search trees (i.e., trees that allow you to do binary search, rather than ADTs that might use binary search under the covers).

> they give me a simple way to directly encode the business properties I care about verifying.

Do you actually verify them? How exactly do you verify Scala programs in the absence of a formal semantics for Scala?

> Not at all convinced that e.g. Idris-style totality is "unacceptable". What's the problem with it?

“There is no largest class of recognizably terminating programs.”

https://mail.haskell.org/pipermail/haskell-cafe/2003-May/004...

> Certainly my practical Scala experience is that it's good for everything.

One thing that I find useful, due to my current job, is numerical linear algebra without array bounds checks at runtime. Have fun doing this in Scala.

lmm · on Jan 9, 2018

> This is not possible, and the history of Haskell, Scala, and to a lesser extent OCaml, is precisely the evidence for it.

Then how do you explain the existence of e.g. Idris or Dotty?

> In practice, most uses of HKTs or even term-level HOFs are just bad workarounds for

> (0) The lack of a proper hierarchical module system, allowing type components to be shared by fibration.

They don't feel like a workaround. They feel like a nice way of modelling my problems that has the properties that I want. Maybe there are other ways, but if those other ways work then why aren't we seeing them in industry?

> Do you actually verify them? How exactly do you verify Scala programs in the absence of a formal semantics for Scala?

Any verification method has a false-negative rate. I use a combination of trusting the compiler's typechecking, sticking to the scalazzi safe subset and enforcing with linters etc., manual review of recursion (limiting the cases where it occurs as far as possible) and fast and loose reasoning.

Maybe I'm missing something, because there always seems to be some disconnect in these conversations: as far as I can see any method that lets you associate logical deductions to your program and mechanically verify that those deductions are legitimate is verification, and I can certainly do that with propositions-as-types in the Scala type system. I don't verify the faithfulness of the compiled code, but many verification systems don't do that either. I don't have a proof of soundness, but plenty of verification systems don't have that.

> “There is no largest class of recognizably terminating programs.”

Sure. But not all recognisably terminating programs are important or useful. To form a terminating language we have to have a finite limit on how many levels of nested 'eval' we can do at compile time, sure (though I see no reason we can't form a parameterized family of languages and pass the "depth" as a compiler argument). That's irritating, but it's a big jump to "unacceptable"

> One thing that I find useful, due to my current job, is numerical linear algebra without array bounds checks at runtime. Have fun doing this in Scala.

"Without array bounds checks at runtime" is not a real requirement. The real requirement would be something like "using x amount of CPU" or, ultimately, "at cost of no more than $y". FWIW I did numerical linear algebra in Scala in production at a previous job; the benefits outweighed the costs.

More to the point, there's nothing fundamental about Scala that makes it impossible to incorporate linear algebra without array bounds checks at runtime into the language. Yes the current implementation doesn't do it, but that's a contingent fact about the current implementation, not something fundamental.

catnaroek · on Jan 9, 2018

> Then how do you explain the existence of e.g. Idris or Dotty?

Have fun convincing people to port their existing Haskell and Scala programs to Idris and Dotty, respectively. In particular, Idris is not even remotely similar to Haskell beyond the syntax - it even uses a different evaluation strategy!

> They don't feel like a workaround. They feel like a nice way of modelling my problems that has the properties that I want.

The trouble with Haskell and Scala-like higher-kinded type constructors is that they are merely parameterized by other type constructors, rather than by entire structures. (It is as if, in mathematics, all functors were functors from Set!) Haskell and Scala attempt to remedy this situation using type classes, i.e., identifying every type constructor `T` with a canonical structure whose carrier is `T`. But, what if you need to use two different algebraic structures, both having `T` as their carrier? For example, say you need to use maps whose keys are lexicographically ordered strings, as well as maps whose keys are colexicographically ordered strings. Do you create a `newtype` wrapper as in Haskell? Or do you use different implicits in different contexts, and risk conflating maps created in one context with maps created in the other?

This problem doesn't exist in ML, because functors allow you to parameterize entire structures by other entire structures, not just their carriers.

> Maybe there are other ways, but if those other ways work then why aren't we seeing them in industry?

What do I know. Ask the industry, not me.

> I use a combination of trusting the compiler's typechecking,

Trust is neither here nor there. Without concrete type safety theorems (“such and such doesn't happen in a typable program”), you don't even know what type checking is supposed to buy you.

> sticking to the scalazzi safe subset and enforcing with linters etc., manual review of recursion (limiting the cases where it occurs as far as possible) and fast and loose reasoning.

That's a perfectly fine set of software development practices, but it's not verification.

> I don't have a proof of soundness, but plenty of verification systems don't have that.

Then all you have is a fancy linter.

> More to the point, there's nothing fundamental about Scala that makes it impossible to incorporate linear algebra without array bounds checks at runtime into the language.

Um, the inability to statically verify array index manipulation using Scala's type system?

> "Without array bounds checks at runtime" is not a real requirement.

Not paying the runtime cost of superfluous runtime checks is absolutely a real requirement in my current job.

> More to the point, there's nothing fundamental about Scala that makes it impossible to incorporate linear algebra without array bounds checks at runtime into the language.

It can't be implemented as an ordinary library in the existing Scala language without relying on unsafe features (e.g., a C FFI). Doing this safely requiers bona fide dependent types - in spite of their name, so-called “path-dependent types” are just fancy first-class existentials.

lmm · on Jan 9, 2018

Thanks for the reply

> Without concrete type safety theorems (“such and such doesn't happen in a typable program”), you don't even know what type checking is supposed to buy you.

What kind of safety theorem should I be looking for? I mean, what I generally want is something like "if an expression of type A evaluates, the result will be a valid thing of type A i.e. a value constructed by one of the constructors I defined for values of type A". Or perhaps it would be better expressed via elimination - "if I provide eliminators for all my constructors for type A then I can eliminate any expression of type A". I kind of want a "the system will never magic up an A out of nowhere" result, but AIUI I can never get that because I can't have a proof of consistency of the system, no?

> Doing this safely requiers bona fide dependent types - in spite of their name, so-called “path-dependent types” are just fancy first-class existentials.

What is it that we can't do currently? I mean, I can define a natural number type, I can convert between values and types in both directions. I can't use ad-hoc polymorphism with a type defined at runtime (the types I form at runtime must be phantom) but I wouldn't expect to do that in well-behaved code in the first place? The only way I can see for a type to ever get into code, in any language, is either through a literal or a cast, and both those approaches are open to me in Scala. What am I missing?

catnaroek · on Jan 9, 2018

> What kind of safety theorem should I be looking for? I mean, what I generally want is something like "if an expression of type A evaluates, the result will be a valid thing of type A i.e. a value constructed by one of the constructors I defined for values of type A".

This is probably sufficient for a Coq-style proof assistant, but from a general-purpose programming language I expect more. See below.

> I kind of want a "the system will never magic up an A out of nowhere" result, but AIUI I can never get that because I can't have a proof of consistency of the system, no?

A type system can and should guarantee that you can't create values of a (positive) type other than by using the type's introuction forms.

> What is it that we can't do currently?

(0) The obvious: Be sure your program won't attempt to read from or write to a close file.

(1) Obtain an arbitrary integer `n`. Test once whether it is a valid array index. If the test succeeds, use `n` arbitrarily often as an array index, without any further bounds-checking.

(2) Obtain three `foo.Index` values `a,b,c`. Test once whether the arrays `foo` and `bar` have the same length. Use `a,b,c` arbitrarily often as values of type `bar.Index`, without any further bounds-checking.

etc. etc. etc.

> The only way I can see for a type to ever get into code, in any language, is either through a literal or a cast, and both those approaches are open to me in Scala.

Are those casts safe without runtime checks?

---

In any case, at some point using types for program verification offers diminishing returns. We have to accept the possibility that maybe it's us, not type checkers, who have to verify our programs.

lmm · on Jan 10, 2018

> (0) The obvious: Be sure your program won't attempt to read from or write to a close file.

I can do that in Scala with monadic regions, which only need existentials/rank-2 types.

> (1) Obtain an arbitrary integer `n`. Test once whether it is a valid array index. If the test succeeds, use `n` arbitrarily often as an array index, without any further bounds-checking. > (2) Obtain three `foo.Index` values `a,b,c`. Test once whether the arrays `foo` and `bar` have the same length. Use `a,b,c` arbitrarily often as values of type `bar.Index`, without any further bounds-checking.

I see no reason these things wouldn't be possible to implement in the language, though one would have to pass around the (phantom) evidence that these things are true to the point where it's used, which is cumbersome and something newer languages fix.

I'm not saying any of this is as easy as it should be, I'm asking how "real" dependent types would be different from what Scala currently has.

> Are those casts safe without runtime checks?

No, we'd make the runtime check at the point of casting. What I mean is: the only way something I'd want to type as an array of length n could ever get into my program is (ultimately, once we flatten out all the arithmetic from e.g. concatenating arrays of length p and q), either a literal array of length n, or a manually checked cast of a dynamic-length array received at runtime.

> In any case, at some point using types for program verification offers diminishing returns. We have to accept the possibility that maybe it's us, not type checkers, who have to verify our programs.

My experience has been that types are more than good enough. My defect rate is much, much lower than it needs to be, lower than I can really justify putting any more work into, and I don't feel like I'm close to the limits of what's possible with types.

catnaroek · on Jan 10, 2018

> I can do that in Scala with monadic regions, which only need existentials/rank-2 types.

Monadic regions are inflexible. As I have previously told you on lobste.rs, try implementing braided allocations:

(0) Allocate A

(1) Allocate B, do something with A and B together, deallocate A

(2) Allocate C, do something with B and C together, deallocate B

(3) etc.

Turns out, you can't, because monadic regions are strictly LIFO. Lest you think this is just a theoretical exercise, this allocation pattern is used in the implementation of concurrent database indices.

> I see no reason these things wouldn't be possible to implement in the language, though one would have to pass around the (phantom) evidence that these things are true to the point where it's used, which is cumbersome and something newer languages fix.

The only way you could do it is if the design of the array data type anticipates every use case (including ones not in my original list). To give you an analogy, your response is exactly the same as if someone said “I don't need sum types. I can anticipate every single way in which a sum will be case-analyzed, and turn it into virtual method of a base abstract class.” This is completely ridiculous.

> No, we'd make the runtime check at the point of casting.

Then it's effectively dynamically typed, don't you think?

> My experience has been that types are more than good enough.

Types are good enough for some things. I certainly wouldn't want to program in a completely typeless language - parametric polymorphism and algebraic data types are just too nice to give up. But I also wouldn't want to use types to prove everything. Nor would you, as evidenced by your follow-up.

> My defect rate is much, much lower than it needs to be

Lucky you. Mine has to be zero, but it's always positive, sadly.

lmm · on Jan 11, 2018

> Monadic regions are inflexible. As I have previously told you on lobste.rs, try implementing braided allocations:

Earlier in this thread you said "this complexity only yields at best marginal increases in the extent to which these languages aid software verification. Effort would be better spent trying to figure out how much verification you can wrangle out of relatively simple type systems like ML's" - aren't monadic regions exactly a case of wrangling more verification out of a relatively simple type system rather than building something more complex?

There will always be things we don't know how to do; monadic regions themselves seemed impossible until they were implemented. I don't think we've put anywhere near enough effort in yet to conclude that it's impossible to extend that kind of approach to support braided allocations; indeed that's the sort of thing I'd expect the academic process you deride to come up with.

> The only way you could do it is if the design of the array data type anticipates every use case (including ones not in my original list).

The Scala type system can implement Turing-complete computation, you can do arithmetic with type-level numbers, so I don't see that the array design would needs to anticipate anything? Accessing a specific element could demand evidence that that element was within the array's bounds, operations that form new arrays would need to provide evidence about that array's bounds. But the user is free to connect those two ends together in arbitrary ways.

> To give you an analogy, your response is exactly the same as if someone said “I don't need sum types. I can anticipate every single way in which a sum will be case-analyzed, and turn it into virtual method of a base abstract class.” This is completely ridiculous.

Um, I do say this. In a language that has first-class functions etc. I don't need any language-level support for sum types (e.g. pattern matching); Either#fold covers all the same use case. Language support is a nice syntactic convenience but nothing more than that AFAICS.

> Then it's effectively dynamically typed, don't you think?

No, because the type system is still checking that my deductions are valid, that my conclusions follow from my premises (e.g. if I concatenate an array that I've dynamically checked to be of size n with one that I've dynamically checked to be of size m, I statically know that the result is of size n + m). I'm just adopting an extra premise of "this particular runtime size check is correctly implemented"; of course if I adopt a false premise then I'll deduce nonsense, but that's on me. I don't see any qualitative difference between having the (outside array of unknown size -> array typed as being of a given size) function be part of the system versus implemented by me.

> Types are good enough for some things. I certainly wouldn't want to program in a completely typeless language - parametric polymorphism and algebraic data types are just too nice to give up. But I also wouldn't want to use types to prove everything. Nor would you, as evidenced by your follow-up.

I don't prove everything, but I feel like I could; since I'm going to be using types anyway, I don't feel the need to pursue non-type ways of proving things unless and until I hit something that I want to prove that's harder to prove with types than it should be, that feels like an easier way to prove it should exist. So far I haven't.

catnaroek · on Jan 11, 2018

> Earlier in this thread you said "this complexity only yields at best marginal increases in the extent to which these languages aid software verification. Effort would be better spent trying to figure out how much verification you can wrangle out of relatively simple type systems like ML's"

Substructural types are a much, much, simpler addition to a type system than rank-N types, rest assured. For instance, substructural types don't impede global type inference. In fact, adding substructural types to HM require minimal changes to how the type checker works. Read ATTaPL, chapter 1!

> The Scala type system can implement Turing-complete computation, you can do arithmetic with type-level numbers, so I don't see that the array design would needs to anticipate anything?

By itself, a Turing-complete type system does not give you the ability to enforce arbitrary program properties. It only guarantees that you can wait arbitrarily long for the type checker to complete. Kind of impressive, but not useful.

> Um, I do say this. In a language that has first-class functions[sic] etc. I don't need any language-level support for sum types (e.g. pattern matching); Either#fold covers all the same use case. Language support is a nice syntactic convenience but nothing more than that AFAICS.

Encoding data as particular higher-order procedures only works in an effect-free language. (Nontermination counts as an effect!) Mmm, like Scala:

    trait ChurchBool {
        def pick[A](x: A, y: A): A
    }
    
    object ChurchTrue extends ChurchBool {
        def pick[A](x: A, y: A) = x
    }
     
    object ChurchFalse extends ChurchBool {
        def pick[A](x: A, y: A) = y
    }
 
    object ChurchWTF extends ChurchBool {
        def pick[A](x: A, y: A) = pick(x, y)
    }

Sorry, did you say anything?

> I don't see any qualitative difference between having the (outside array of unknown size -> array typed as being of a given size) function be part of the system versus implemented by me.

Here is an important qualitative difference: A runtime check that checks something that is statically knowable introduces an unreachable control flow path.

> I don't prove everything, but I feel like I could

Whereas I actually prove everything, and in the process have found a practical boundary between what (IMO) is reasonable to prove with types, and what I'd rather prove using other technology.

lmm · on Jan 12, 2018

> Encoding data as particular higher-order procedures only works in an effect-free language. (Nontermination counts as an effect!) Mmm, like Scala:

Scala pervasively allows nontermination, sure. All the program properties I have are up to "if it terminates ...". (I mean, I would claim I work in a fragment of Scala that disallows definitions like ChurchWTF#pick, but I can't really justify that without formalising it). Language-level support for sum types doesn't change that; programs written in Scala carry just as much risk of nontermination as programs written in Scala-without-pattern-matches.

> Here is an important qualitative difference: A runtime check that checks something that is statically knowable introduces an unreachable control flow path.

I'm not talking about having runtime checks for things that are statically knowable, I'm talking about runtime checks for things that aren't statically knowable like the size of an array received from an external source. Again, how is that any different from what you would do in a language with "true" dependent types?

catnaroek · on Jan 12, 2018

> Language-level support for sum types doesn't change that; programs written in Scala carry just as much risk of nontermination as programs written in Scala-without-pattern-matches.

Admitting nonterminating computations is actually a good thing. What's a bad thing is conflating computations with the values they produce. In other words, the types `Bool` and `ChurchBool` are not isomorphic.

> I'm not talking about having runtime checks for things that are statically knowable, I'm talking about runtime checks for things that aren't statically knowable like the size of an array received from an external source. Again, how is that any different from what you would do in a language with "true" dependent types?

There's no difference here. But you totally glossed over my point that your abstract type of arrays with a member type of indices would have to be bloated with every operation imaginable on indices.

simon_o · on Jan 3, 2018

But those language communities also have to do some self-reflection.

People are often hostile to things they don't know. Not taking this into account when trying to introduce new things is a recipe for disaster.

This means ...

– picking up developers from where they are, not telling potential adopters "if you write Java-like Scala we don't want you here".

– meaningful documentation, not answering efforts to improve documentation with "it's not necessary, because everyone already knows Scala".

– describing and explaining design choices like companion objects, type classes, context bounds, and so on; not expecting that beginners will understand their superiority through osmosis.

– making technologies accessible to people on their platforms, not treating backends such as Scala.js, Scala on Android or Scala-Native as add-ons for existing Scala users (or worse).

– taking tooling seriously, and not releasing major versions of the language without tooling support. Or breaking compatibility in minor versions like in Scala 2.12.

– clean-ups that removes unnecessary baggage from the language and its libraries. Not lecturing people about the number of keywords or counting the lines in the language's grammar in comparison to other languages.

Haskell does some things better in this regard, but both are far from a convincing value proposition over "worse is better" approaches like Kotlin (which will probably swipe the floor with Scala in the future, because they address the bad parts of Scala).

catnaroek · on Jan 6, 2018

Neither Kotlin nor any other industry language addresses my original point, namely:

> What we need is “a formal semantics for a language, with practical proof principles for designing correct programs”.

However, I'm not disappointed in industry languages, because I don't actually expect anything from them.