
Are unsound type systems wrong? - mpweiher
http://frenchy64.github.io/2018/04/07/unsoundness-in-untyped-types.html
======
nothrabannosir
It would be nice to have some examples of what these terms mean, for type
system dilettantes like me. I must say, I was a little lost :)

Re: TypeScript:

 _> TypeScript challenges the status quo, claiming that lowering the cognitive
overhead of using types is more important than type soundness. They argue that
JavaScript programmers, their main audience, would find it easier to use
TypeScript if they deviated from traditional norms._

I'm not sure about the official mission statement, but as far as I understand,
the purpose of their typesystem is to allow static description of the way
existing JavaScript is typed, _de facto_ , in the wild. This explains why e.g.
literal strings are a type so you can do:

    
    
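        interface Eventor {
            // Overloads keyed on string literal types: the event name selects the handler type.
            on(event: 'foo', handler: FooHandler): void;
            on(event: 'bar', handler: BarHandler): void;
        }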
    

Not because it’s nice to have a type that "foo" satisfies and not "bar", but
because there is actual code out there that does this, today. Same for
structural typing (TypeScript would be absolute horror to interface with
existing code without it). And, I assume, covariant arrays.
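
To make the literal-string case concrete, here is a minimal sketch of how such an interface could be implemented and called. `FooHandler` and `BarHandler` are not defined anywhere above, so their shapes here are assumptions, and `SimpleEventor` is a hypothetical class that exists only to exercise the overloads.

```typescript
// Assumed handler shapes; the original comment does not define them.
type FooHandler = (count: number) => void;
type BarHandler = (message: string) => void;

interface Eventor {
  on(event: "foo", handler: FooHandler): void;
  on(event: "bar", handler: BarHandler): void;
}

// Hypothetical implementation, only here to exercise the interface.
class SimpleEventor implements Eventor {
  private fooHandlers: FooHandler[] = [];
  private barHandlers: BarHandler[] = [];

  on(event: "foo", handler: FooHandler): void;
  on(event: "bar", handler: BarHandler): void;
  on(event: "foo" | "bar", handler: FooHandler | BarHandler): void {
    // The overloads guarantee the pairing for callers; the body still has to cast.
    if (event === "foo") {
      this.fooHandlers.push(handler as FooHandler);
    } else {
      this.barHandlers.push(handler as BarHandler);
    }
  }

  emitFoo(count: number): void {
    this.fooHandlers.forEach((h) => h(count));
  }

  emitBar(message: string): void {
    this.barHandlers.forEach((h) => h(message));
  }
}

const seen: string[] = [];
const eventor = new SimpleEventor();
eventor.on("foo", (count: number) => seen.push(`foo:${count + 1}`));
eventor.on("bar", (message: string) => seen.push(`bar:${message.toUpperCase()}`));
// eventor.on("baz", () => {}); // rejected at compile time: "baz" matches no overload
eventor.emitFoo(1);
eventor.emitBar("hi");
```

The point is that existing `on('foo', ...)` style APIs can be described as they are, without asking library authors to restructure them.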

TypeScript was not created in a void, as a fresh, new typesystem. It was
created to formalise the way existing JavaScript was informally typed. It
would have been nice to see a little reference to or acknowledgement of this.

Or an example of how this could have been achieved better; preserving type
soundness.

~~~
Tarean
There are basically two important results for type systems, soundness and
completeness.

Soundness means that every program that typechecks is valid. Completeness
means that every valid program can be typed.

So basically TypeScript's type system doesn't match the language semantics, so
type-checked programs may crash at runtime.
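
As one illustration of a program that type checks and still fails at runtime, here is a small sketch using TypeScript's covariant arrays. The `Animal` and `Dog` types and the `tryBark` helper are made up for the example.

```typescript
// Hypothetical types for illustration only.
interface Animal {
  name: string;
}
interface Dog extends Animal {
  bark(): string;
}

const dogs: Dog[] = [{ name: "Rex", bark: () => "woof" }];

// Accepted: arrays are treated covariantly, so Dog[] is assignable to Animal[].
const animals: Animal[] = dogs;

// Also accepted: a plain Animal is a valid element of Animal[].
// But `animals` and `dogs` are the same array at runtime.
animals.push({ name: "Tom" });

// The checker still believes every element of `dogs` has bark().
function tryBark(d: Dog): string {
  try {
    return d.bark();
  } catch {
    return "runtime TypeError";
  }
}

const results = dogs.map(tryBark);
```

No cast and no `any` appears anywhere, yet the second element of `dogs` has no `bark` method, so the call fails at runtime.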

~~~
s_kilk
I'm curious, given Gödel's incompleteness theorem, isn't it impossible to have
both soundness and completeness in the same language?

~~~
karatinversion
Gödel's incompleteness is for proof systems for languages which contain enough
arithmetic to express their own syntax. Lots of things don't go this far - first
order predicate logic is sound and complete, for example (see Gödel's
completeness theorem(!)).

~~~
pas
Could you please expand on the soundness and completeness of first order
logic? Isn't that false generally? Because you need a deductive/axiom system,
so that you can ask the question of completeness/soundness. And ZFC is too
strong, hence cannot prove its soundness and correctness at the same time. But
you can formalize weaker systems in first order logic, that are provably
complete and sound.

~~~
gpm
(It's been a year and a half since I took a course on this, so I apologize for
any mistakes)

You do need to define a deductive/axiom system before you can ask the
question. We can use a few different standard deductive systems. We require
that the axioms are sound. Then the system is complete. (I think one should
also always be able to prove false if the axioms aren't sound... but I don't
recall proving that)

ZFC isn't too strong. That sounds like a contradiction, but it isn't, because
we don't have a rich enough vocabulary in first order logic to state the
problematic statements that make it either inconsistent or incomplete in
stronger logic systems. Every true statement you can state about ZFC _in first
order logic_ is provable.

~~~
indexerror
> It's been a year and a half since I took a course on this

Can you please recommend a MOOC/resource for learning more about this?

~~~
gpm
My only real exposure to it is the course I took, csc438 at UofT taught by
Stephen Cook.

There are pretty complete notes for the course as well as assignments here
[0], but not videos or planned lessons. You certainly could learn about it by
reading them, but I don't know if it would be the most efficient way.

[0]
[http://www.cs.toronto.edu/~sacook/csc438h/](http://www.cs.toronto.edu/~sacook/csc438h/)

------
evmar
I've worked a lot with TypeScript after a background in more type-y languages
like O'Caml and Haskell and have eventually gained a real appreciation for the
engineering tradeoff they made. This article sorta frames soundness like it's
a feature -- one could have soundness or not, so why not have it -- when in
reality it's a specific technical term about a property of a language, one that
programmers often don't understand (as evidenced by the other comments on
here) and which has specific consequences.

Instead, I now view type systems as a tool that can make tradeoffs to get the
job done, in the same space as "IDE" or "test-driven design". With those it's
easy to make statements about the specific programmer tradeoffs, such as
"test-driven design means you spend more time writing tests but gives you more
confidence about the results". With a given type system, you can evaluate it
like "type system X lets you get autocomplete in your editor", or "type system
Y catches the common error of Z".

Meanwhile if you only focus on soundness you end up with counterintuitive
results, like Java being sound but still having NullPointerExceptions
everywhere, or languages like Rust where whole data structures and algorithms
are disallowed because you can't express their correctness to the compiler.
(To be clear I appreciate exactly why Rust must do this, but again you can
evaluate that tradeoff as "preserves memory safety without GC, disallows
doubly-linked lists" rather than "is the type system 'correct'".)

One view that really opened my eyes on this is Gilad Bracha's talk on pluggable
and optional type systems:
[http://lambda-the-ultimate.org/node/1311](http://lambda-the-ultimate.org/node/1311).

~~~
jjnoakes
> whole data structures and algorithms are disallowed

Nothing is disallowed in rust. You may need a specialized crate or a small
'unsafe' block to work beyond the compiler's ability to prove your code meets
rust's guarantees, but that's not the same as inexpressible.

~~~
quotemstr
Unsafe rust is a superset of safe Rust. They're two different languages. The
OP is saying, correctly, that certain data structures are inexpressible in
safe Rust. That's fine. Certain control flow structures are inexpressible in
goto-less C.

~~~
irundebian
> Certain control flow structures are inexpressible in goto-less C.

I'm not sure about that. C contains while loops. And every GOTO-computable
program should be expressible as a WHILE program.

See (slide 21):
[http://ai.cs.unibas.ch/_files/teaching/fs16/theo/slides/theo...](http://ai.cs.unibas.ch/_files/teaching/fs16/theo/slides/theory-d03.pdf)

~~~
stormbrew
I don't think "certain control flow structures are inexpressible" is the same
assertion as "certain programs are inexpressible"...

After all, any turing complete system will have a similar set of computable
programs as another, but they will not all have the same control structures.
For example, you can have a turing complete language with no loops, only
recursion. Both can express the same programs but not in the same way.

That said, it's also a bit of a tautology so.
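
A tiny sketch of the loops-versus-recursion point: the two hypothetical functions below compute the same result with different control structures.

```typescript
// Sum 1..n with a loop.
function sumLoop(n: number): number {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

// The same computation with recursion only, no loop construct.
function sumRec(n: number): number {
  return n <= 0 ? 0 : n + sumRec(n - 1);
}
```

Both express the same program, but a language with only one of the two constructs cannot write the other form directly.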

------
RyanCavanaugh
Unsoundness is important because soundness and expressiveness come at a
trade-off.

Here's some code which is syntactically legal in both TypeScript and Flow, but
the languages disagree on whether a type error is present:

    
    
      type T = { // Legal decl in Flow, illegal in TS
        x: number;
        [s: string]: boolean;
      };
      const m: T = { x: 1, xz: true };
      m['xz'.substr(0, 1)] = true;
      m.x.toFixed(); // Crashes
    

TypeScript is actually _more_ sound than Flow in this example because it
disallows the declaration of T in the first place. Flow allows this code and
it fails due to a type error at runtime.

Yet in JavaScript this is an extremely common pattern - an object bag where
some known property keys have a certain type, but "all the others" have a
different type. You can't represent this in a bolt-on type system without
forcing the developer to jump through enormous hoops proving that any lookup
key _isn't_ one of the "known" keys. By enforcing soundness at all costs, you
lose the expressiveness needed to ergonomically describe this object.

Since most people don't directly aim guns at their feet, the unsoundness here
is unimportant in practice. This is one place where I think Flow made the
right trade-off to be unsound for the sake of practical usability for a common
pattern.

~~~
quotemstr
Why wouldn't you just make your untyped bag a field of the larger and
completely specified object then?
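
A minimal sketch of what I mean, assuming you control the shape of the object. `Settings` and `extras` are hypothetical names.

```typescript
// The known key lives on the outer object; the open-ended bag is a separate field.
interface Settings {
  x: number;
  extras: Record<string, boolean>;
}

const m: Settings = { x: 1, extras: { xz: true } };

// The same computed key as in the example above: "xz".substring(0, 1) is "x".
const key = "xz".substring(0, 1);

// This write lands on m.extras.x, so it cannot clobber m.x.
m.extras[key] = true;

// m.x is still a number, so this call is safe.
const formatted = m.x.toFixed();
```

The cost is that callers have to write `m.extras[...]` instead of `m[...]`, which only works if the API is yours to design.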

~~~
RyanCavanaugh
"Will it be easy for a computer to prove that my API is being called
correctly?" is not a question that JavaScript library authors seem to give any
thought whatsoever to

------
xg15
To be honest, after reading this article, I don't really have more information
why you'd want to use an unsound type system at all.

If you have a sound type system, it's harder to write code but the compiler
and IDE can infer useful invariants and give you guidance in return.

If you have no type system, you're on your own with error detection, but
writing code may be a lot faster.

(Though I personally think a modern language with sound types and useful type
inference can solve a lot of the tedium as well)

If you have an unsound type system you get the worst of both worlds: You have
the cognitive overhead of a typed language but at the same time don't get any
of the guarantees and invariants.

~~~
rb808
A big advantage of unsound types is versioning, i.e. you can add some member
variables/methods to a structure/class and users of that library that were
written with the old version still work without trouble, recompilation or
redeploying. Of course if you remove the variable they're in trouble but
removing can be avoided for that reason.

~~~
galaxyLogic
In this sense typing is like unit-testing, it would seem. You can make your
program "better" by adding more unit-tests to it. But the more tests you have,
the more work it will be to modify your tests to accommodate the changes in your
program.

The more strict your type-system is the more things you can prove and thus
understand about your program. But if you need to change your program it may
no longer compile with the existing types. Does your type-system then make it
easy to change your types so they reflect your new program?

Now presumably when you are developing a new program it is changing all the
time. If you fix its types early doesn't it actually make it more work to
create your program?

Once your application is done and works and is big it is essential that it is
"typed" so that you can see what changes don't require changes in the types
vs. what changes would require a lot of rewriting of the types all over the
place.

I think the fact that many useful programs are written in dynamically typed
languages speaks for this. Small programs don't need types.

Does this mean that optional typing is a good feature?

~~~
pas
I usually find a lot more serious bugs in quick and dirty scripts, than in big
systems, regardless of language. (Naturally, big systems have already survived
the small PoC phase, plus usually big things are viewed as complex, whereas
small scripts are dangerously innocent looking.) So in that way things like
Ammonite (easy scripting using Scala) seem very useful - even if the runtime
requirements are a lot heavier, than using bash (or python).

Furthermore, usually development uses type system as a tool, so it'd be hard
to add typing when it's done. (I mean, usually functional doneness is checked
or "asserted" in a large part through types, and unit/integration tests help
with the rest.) So in this sense typing doesn't add more development time. It
just makes problems a lot more explicit, so you know you're not done yet.

~~~
galaxyLogic
Explicitly declared types are like anchors. They make it harder to change the
program to be a different program. In many cases this is good since it tells
whether a change breaks things or not. But it does add to the "mass" of your
program, makes it harder to change its direction.

Your declared types are the shared "language" under your program. It is much
harder to change the language than what is said in it. Changing the language
would actually change the MEANING of what has already been said.

------
devit
Yes: your program fails at runtime, and you have no idea of where the problem
is, because the place where the type invariants are violated is not related to
where an exception is thrown or an incorrect result is produced.

It's basically like using and debugging C/C++ code, where it looks like there
is a "type system" but in fact you are just arbitrarily manipulating a big
byte array that contains all your data and the call stack (except with
TypeScript it's a graph of untyped arrays and dictionaries instead).

~~~
jeremyjh
It is absolutely no different from debugging Javascript. In Javascript you
have an expectation of types in your mind, that are not guaranteed. The same
is true of Typescript. The difference is, Typescript will catch most
exceptions to the type system at compile time. It is not sound, but is
extremely _useful_.

~~~
acjohnson55
"Most" isn't necessarily a good thing, because it can lead to a false sense of
security. For example, if you start thinking, "no need to check for null here,
because the types won't allow it", you're open to being surprised when a null
sneaks in at run-time. So you have to deal with the overhead of both types and
run-time guards.

~~~
jeremyjh
This is the sort of thought people have when they haven't actually used the
language much. It is also an example of something Typescript is really good
at. If you try to use an `any` value - or any nullable typed value - that
hasn't been checked for null, it is a type error. Sure you could write a type
definition for a Javascript function that is a lie and say that it cannot
return a null when it can, and then guess what? You'll get a runtime error and
debugging it is identical to Javascript.
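
A sketch of the "type definition that is a lie" case. `legacyLookup` is a stand-in for some untyped JavaScript function, not a real library, and the signature given to `lookup` is deliberately wrong.

```typescript
// Stand-in for an untyped JavaScript function (hypothetical, not a real library).
const legacyLookup: any = (key: string) => (key === "a" ? "alpha" : null);

// A hand-written signature that lies: it promises a string and never null.
const lookup: (key: string) => string = legacyLookup;

function shout(key: string): string {
  // Type checks because of the signature above, so no null check is demanded.
  return lookup(key).toUpperCase();
}

// Helper that reports what actually happens at runtime.
function tryShout(key: string): string {
  try {
    return shout(key);
  } catch (e) {
    return e instanceof TypeError ? "TypeError at runtime" : "other error";
  }
}
```

`tryShout("a")` goes through fine, while `tryShout("b")` hits the null and fails exactly the way plain JavaScript would.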

In fact I do not know of any language that is actually safe at its FFI
boundaries. Rust isn't, Haskell isn't. You could get a value back from a C
function that you were told would be allocated 64-bytes and actually it was
only 32, so you'll have an overflow somewhere later. The Typescript /
Javascript boundary is similar, though obviously not as dangerous.

~~~
acjohnson55
The problem is in just how much boundary there tends to be, since many
TypeScript programs are heavily mixed with ordinary JS. I would agree with you
for programs that exclusively use TS for application code.

------
toolslive
It's an interesting question. It seems that developers are quite able to cope
with weird type systems, moreover most of them don't even think about it or
even care.

A simple example of python2 that would make developers brought up on a healthy
diet of haskell cringe is this:

    
    
       >>> for x in range(100): print type(1<<x)
    

And yet, Pythonistas seem neither to mind nor to bother.

~~~
pjc50
This kind of thing is why I'd favour two different types of numeric types:

\- "Number", which would be the generic DWIM arithmetic type. In your example
both x and the literals would be Number.

\- Hyper-specified arithmetic types. Not just number of bits and signedness,
but also overflow and trap behaviour. Less convenient to use - you can't
easily add a "32bit signed wrap on overflow" to "16 bit unsigned saturate on
overflow", but have the advantage that there is no unspecified behaviour. This
also allows the user to make use of machine-provided saturate and overflow
behaviour where present.
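
A rough sketch of the second category, written in TypeScript only because that is what the rest of the thread uses. The branded types `WrapI32` and `SatU16` and their helpers are invented for illustration; a real design would cover far more cases.

```typescript
// Branded number types: the brand exists only at compile time.
type WrapI32 = number & { readonly __brand: "WrapI32" };
type SatU16 = number & { readonly __brand: "SatU16" };

// 32-bit signed, wraps on overflow (`| 0` truncates to a signed 32-bit integer).
const wrapI32 = (n: number): WrapI32 => (n | 0) as WrapI32;

// 16-bit unsigned, saturates at the bounds instead of wrapping.
const satU16 = (n: number): SatU16 =>
  Math.min(65535, Math.max(0, Math.trunc(n))) as SatU16;

function addWrapI32(a: WrapI32, b: WrapI32): WrapI32 {
  return wrapI32(a + b);
}

function addSatU16(a: SatU16, b: SatU16): SatU16 {
  return satU16(a + b);
}

const wrapped = addWrapI32(wrapI32(2147483647), wrapI32(1));
const saturated = addSatU16(satU16(65535), satU16(1));
// addWrapI32(wrapI32(1), satU16(1)); // rejected at compile time: the brands differ
```

Mixing the two kinds requires an explicit conversion, which is the inconvenience described above, but the overflow behaviour is never left unspecified.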

~~~
mannykannot
I see where you are coming from, but C offers a cautionary tale: It offers
more choices than most languages, and that creates a number of traps for the
unwary. Maybe, however, that is only because it allows mixed-type expressions?

~~~
pjc50
Choice for the programmer, or choice for the implementer? Part of the problem
of C is the amount of stuff which was left as a choice for the implementers on
a particular platform and therefore could not be safely assumed to be
universal. Forcing the programmer to enumerate what they actually want from
their "fast machine integers" avoids that.

C offers a fairly small choice of types to the programmer, certainly compared
to one of the ML/Haskell style of language. It doesn't even have a proper
"string" type!

C doesn't exactly allow mixed-type expressions, it just performs a lot of type
coercion and reinterpretation. Which is another one of those things that
trades safety for convenience in so many languages.

Type flexibility is responsible for most of the wtf type moments in
Javascript:
[https://www.destroyallsoftware.com/talks/wat](https://www.destroyallsoftware.com/talks/wat)

------
superice
Unsound typesystems are not necessarily a problem, but consider the problem
the other way around: given a sound typesystem, how do we improve this in such
a way that a) the average Joe will understand it b) it decreases the overhead
of writing the types c) the type system helps the programmer write better
programs

Haskell's approach to this keeps the constraint of soundness:
[https://youtu.be/re96UgMk6GQ?t=52m5s](https://youtu.be/re96UgMk6GQ?t=52m5s)

The whole video is very interesting even if you know nothing about Haskell,
but are just interested in language design in general.

To summarize the part of the video I just linked: Their approach so far has
been to try and design the type system in such a way that you reduce
programmer pain while not sacrificing soundness at all. It is a painful
constraint for language designers, but it allows you to gradually add
functionality to the type system from observing real world examples. The goal
can be described as: Design a typesystem such that every working program can
be typed in both a sound and correct way, while minimizing the amount of
programs that can be typed, but are not working as intended.

------
LeanderK
I don't really think that unsound type systems are wrong, because I think it
depends on the role of the types in the programming languages. Typescript,
Java etc. don't have that sophisticated type systems and you can just check
the types for yourself very quickly. You rely on the type-system to correct
obvious mistakes, but not more.

In languages like Haskell or fully dependently typed languages it would
be morally wrong (in my opinion) to have an unsound type-system. Here the role
of the types is different: you use types to compute types and need to trust
the compiler in complicated situations that this is sound, just like you trust
the compiler to do correct floating-point arithmetic. Checking the types by hand
may be a real challenge, for example if you're writing super polymorphic code.
Sometimes you even have to prove that your code fits the type-signature (by
induction etc.); here you want guarantees.

------
cakoose
> The way TypeScript was designed is in fact more promising than older systems
> like Java. Those systems inadvertently broke type soundness instead of
> making it a deliberate action.

Not sure what the article is referring to here...

Java soundness issues like array covariance, final field mutability, unchecked
generics casts, etc were definitely surprises to many users, but the Java
language designers were well aware. They added runtime checking to make sure
those soundness holes didn't cause the JVM to crash or violate the security
guarantees.

Sure, there were some additional surprises, but I'm sure TypeScript has had
many more. (Partly because the TypeScript type system is more complicated, but
also because it's not important to get it 100% right -- JavaScript runtimes
don't perform optimizations that rely on the soundness of TypeScript.)

------
eximius
_Wrong?_ That's a bit of an odd question. It's only _wrong_ if the context is,
"what is an example of a sound type system?"

Now, there are distinct disadvantages to unsound type systems, but that
doesn't make them "wrong". I love Python and sometimes wish I could turn on a
switch in the interpreter that made it statically typed, but there is no
denying there is a freedom and (initial) productivity boost in ducktyping.

~~~
icebraining
Python does have static typing nowadays :) Check out Mypy, and also Dropbox's
PyAnnotate, which helps statically type existing sources by recording types
during runtime and then adding them to the code.

------
a_turnip
Dunno about "wrong" but I really do like having what the author calls "open-
world" soundness in Typed Racket.

Other than that it really depends where. How bad unsoundness is depends on how
sneaky the resulting errors are and how hard they are to debug.

------
zengid
I think some of the recent developments in run-time type checking systems like
the one in Mobx-state-tree [1] and Clojure's spec [2] are interesting. I don't
know enough about either to make any witty remarks but I'd like to see the
impact they have on the field.

[1] [https://github.com/mobxjs/mobx-state-tree#thanks](https://github.com/mobxjs/mobx-state-tree#thanks) see tcomb

[2] [https://clojure.org/guides/spec](https://clojure.org/guides/spec)

------
MrBuddyCasino
Most readable introduction to the topic I've read so far. Nicely outlines the
dichotomy between unit testing and types as both serving as different kinds of
verification tools.

Types as "approximation of the result of a computation" is also a nice and
concise way to describe them, haven't seen that definition before.

------
ridiculous_fish
TypeScript cannot be sound because its model of the runtime environment
necessarily diverges from the runtime truth. A simple example: some properties
were removed in ES6, so type checking against an ES5 runtime will result in
type errors when run in an ES6 browser, and vice versa.

Soundness only provides a guarantee if you control when and how your code will
be executed, and that's a tough constraint. To prove is not to foresee the
future.

~~~
johnny_reilly
> some properties were removed in ES6

Which properties are you referring to? My understanding is that ES6 is
backwards compatible, so that can't be the case. There's "use strict" but that's
a different kettle of fish; a pragma that changes runtime behaviour.

------
k__
People argue that type systems like TypeScript are useful, because even if
they don't catch all errors, they still catch the most common ones.

The most common JavaScript error I encounter is accessing fields of undefined
objects.

Can't we simply have a type-system for JavaScript that only prohibits
undefined field access and nothing else?

If we accept that the system will be unsound anyway, why not focus on the main
benefit and throw all the baggage out?

~~~
seanmcdirmid
> The most common JavaScript error I encounter is accessing fields of
> undefined objects.

TypeScript has optional null checking if you want it. Given that it supports
union types, null/undefined tracking fit nicely into the language.
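
A small sketch of what that looks like, assuming the `strictNullChecks` compiler option is turned on. `User`, `users`, `findUser` and `userName` are made-up names.

```typescript
interface User {
  name: string;
}

const users: Record<number, User> = { 1: { name: "alice" } };

// The union type records that the lookup may come back empty.
function findUser(id: number): User | undefined {
  return users[id];
}

function userName(id: number): string {
  const user = findUser(id);
  // return user.name; // rejected under strictNullChecks: 'user' is possibly 'undefined'
  if (user === undefined) {
    return "(unknown)";
  }
  // After the check, the type is narrowed to User.
  return user.name;
}
```

The field access only type checks after the `undefined` case has been handled, which covers the "accessing fields of undefined objects" class of error.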

~~~
k__
Can I type stuff as any but still non nullable?

~~~
seanmcdirmid
I think any in TypeScript's context means "not in the type system", and all
type rules bounce off it. So if k's type is any, and you type k.foo, it
doesn't bother checking it. I assume the same applies to any null checking.

~~~
k__
The "object" type is probably what I was searching for.

~~~
spiralx
The object type is the type of all non-primitive types, what you want is the
Object type instead. Yes, it's confusing, especially as there's also the {}
type! This article goes over the differences:

[https://blog.mariusschulz.com/2017/02/24/typescript-2-2-the-...](https://blog.mariusschulz.com/2017/02/24/typescript-2-2-the-
object-type)
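
A quick sketch of the difference, assuming `strictNullChecks` is enabled. `kindOf` and `keyCount` are hypothetical helpers.

```typescript
// `{}` accepts any value except null and undefined, primitives included.
function kindOf(value: {}): string {
  return typeof value;
}

// `object` accepts only non-primitive values.
function keyCount(value: object): number {
  return Object.keys(value).length;
}

const kinds = [kindOf(42), kindOf("hi"), kindOf({ a: 1 })];
const count = keyCount({ a: 1, b: 2 });
// kindOf(null);   // rejected under strictNullChecks
// keyCount(42);   // rejected: number is a primitive
```

So `{}` is the closest thing to "anything, but not nullable", while `object` additionally rules out numbers, strings and booleans.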

------
jacinabox
I feel like a distinction needs to be made between soundness and precision. A
sound type system never tells you something wrong while a precise type system
tells you everything that needs to be said about a value. Gradual type systems
are imprecise, because they could type something at some * type. There's
nothing wrong with that. Unsoundness, on the other hand, is bad: it keeps you
from omitting dynamic type checks in places where you would want to in your
compiler. A natural thing to want to do is include dynamic checks expressly
where you are downcasting from a * type, omitting them where the type is known
precisely enough. That is how you reap a performance gain from your type
system, and unsoundness prevents you doing that.

(Downcasting from * in a gradual type system is analogous to how C lets you
cast from void* to any pointer type implicitly, except that RTTI checks that
the cast is safe.)
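
A sketch of that checked downcast, using TypeScript's `unknown` as a stand-in for the * type. That correspondence is my own analogy, and `asNumber` and `double` are hypothetical names.

```typescript
// Downcast from the imprecise type: this is the one place a dynamic check is needed.
function asNumber(value: unknown): number {
  if (typeof value !== "number") {
    throw new TypeError(`expected number, got ${typeof value}`);
  }
  return value;
}

// The argument type is already precise, so no dynamic check is needed here.
function double(n: number): number {
  return n * 2;
}

const inputs: unknown[] = [21, "oops"];
const ok = double(asNumber(inputs[0]));

let failure = "none";
try {
  double(asNumber(inputs[1]));
} catch (e) {
  failure = e instanceof TypeError ? e.message : "other error";
}
```

If the type system were sound, a compiler could trust `double` to skip its check entirely and keep checks only at `asNumber`-style boundaries.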

------
linkmotif
Flow has a good page on this subject:
[https://flow.org/en/docs/lang/](https://flow.org/en/docs/lang/)

------
catnaroek
> Programmers often use other verification tools like unit testing or
> contracts to get strong assurances of similar properties.

Neither unit testing nor contracts are verification tools.

------
cjonas
Is typescript wrong? Maybe, but if it is wrong, I don't want to be right.

After switching all my JavaScript dev to TS a couple of years ago, I can't imagine
how I would get by without it.

------
dmitriid
I tend to call Typescript's type system "pragmatic typing":

\- type inference

\- structural types

\- union and intersection types

\- _important_ : easy escape hatches such as trivial cast to `any` or the
ability to write your own type definitions for almost any random piece of code
including third party libs
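
A compact sketch touching each of those points. All names (`Point`, `norm`, `labelled`, `showId` and so on) are made up for the example.

```typescript
interface Point {
  x: number;
  y: number;
}

function norm(p: Point): number {
  return Math.sqrt(p.x * p.x + p.y * p.y);
}

// Type inference: no annotation, inferred as { x: number; y: number; label: string }.
const labelled = { x: 3, y: 4, label: "v" };

// Structural types: accepted because the shape fits, no `implements` needed.
const vectorLength = norm(labelled);

// Union type, narrowed with typeof.
type Id = number | string;
function showId(id: Id): string {
  return typeof id === "number" ? `#${id}` : id;
}

// Intersection type: a Point that also carries a label.
type LabelledPoint = Point & { label: string };
const lp: LabelledPoint = labelled;

// Escape hatch: after a cast to any, the checker stops asking questions.
const untyped = labelled as any;
const z: number = untyped.z; // type checks, but is undefined at runtime
```

The last two lines are the unsound part: `z` is declared as a number and nothing stops it from being `undefined`.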

Type purists may scoff at Typescript's type system, but it's there, it mostly
works, it's very easy to get started with, it's reasonably fast. I would
personally wish for exhaustive type checking, and I wouldn't be surprised if I
get my wish sometime in the future.

Another rather pragmatically typed language is OCaml/ReasonML

------
hota_mazi
That's an unusual use for "type soundness". To me, type soundness has more to
do with whether all the functions in a program are total, i.e. defined across
their entire range of inputs (can't throw exceptions, can't crash, always
return values belonging to the return type they declare).

~~~
s_ngularity
The author's use of "type soundness" matches the precise, widely-accepted
mathematical definition which is used in the literature. Type soundness does
not require total functions, although they certainly make it simpler.

It is possible to encode the semantics of exceptions such that a reasonable
definition of soundness is still provable. See, for instance, Chapter 14 of
Types and Programming Languages by Pierce.

------
z3t4
What is a high level type system anyway? Annotations to help feed the
intellisense engine, to warn about misspelled variables, to make sure you don't
use an apple where there should be a banana? Or are types just objects with
limited states!?

------
retor
I miss MeeGo

------
tree_of_item
> Interestingly, fixing these areas in TypeScript has been a fruitful source
> of new research

That's not interesting. TypeScript did things in way Y, different from the
usual way X, and some academics realize they get a free paper out of
reimagining TypeScript doing X.

------
domnomnom
Many things in the computer science fields are wrong. If people want to have
blockchain and turn one security problem into six, I won’t stop them.

Let us synchronize ALL of the computers, while we’re at it.

Also, suddenly cryptography and computers are cool :)

