
What is type safety? - tenslisi
http://www.pl-enthusiast.net/2014/08/05/type-safety/
======
nmrm
I think "type safety" is one of those things that's so well-defined in theory
that there's always a huge disconnect between people who know the theory and
people who don't. Therefore, I always like to explain the "progress +
preservation" view whenever describing type safety, because it illustrates the
term really has a pretty precise meaning at least in the simple cases.

In the simplest case, think of your 'interpreter' as a program that translates
programs into values. For example, 1 is a value and \x.x+1 (the lambda
expression) is a value, but (\x.x+1)1 is not a value because it evaluates to 2
(or, at least, to 1+1).

Let's say that your interpreter does this a single small step at a time; so
for instance, if you have* f(x,y) = x+y, then f(1,2) |--> (\x. x+2)(1) |-->
(1+2) |--> 3.

Type safety goes like this: If x has type T then either x |--> x' or x is a
value. Also, if x has type T and x |--> x' then x' also has type T.

The first property is "well-typed programs don't go wrong" and the second
property closes a loophole ("if you're a well-typed program, then you're never
going to evaluate to anything that isn't also well-typed", so we never escape
from the first property by evaluating).

* using the product syntax because some people will get confused by f 1 2, but that's in fact what I mean.
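The small-step picture above can be sketched as a toy interpreter. This is an illustrative sketch only (the `Expr` type and `step` function are made-up names, not any particular formalism): a well-typed expression either is a value or takes a step, and a step never changes its type.

```haskell
-- A toy expression language: numbers, addition, one-argument lambdas,
-- and application. Values are numbers and lambdas; everything else steps.
data Expr = Num Int
          | Add Expr Expr
          | Lam (Expr -> Expr)  -- higher-order abstract syntax, for brevity
          | App Expr Expr

isValue :: Expr -> Bool
isValue (Num _) = True
isValue (Lam _) = True
isValue _       = False

-- One small step of evaluation. Progress says a well-typed expression is
-- either a value or `step` succeeds; preservation says the result of a
-- step has the same type as its input.
step :: Expr -> Maybe Expr
step (Add (Num a) (Num b)) = Just (Num (a + b))
step (Add a b)
  | isValue a              = Add a <$> step b
  | otherwise              = (`Add` b) <$> step a
step (App (Lam f) a)
  | isValue a              = Just (f a)
  | otherwise              = App (Lam f) <$> step a
step (App f a)             = (`App` a) <$> step f
step _                     = Nothing  -- values don't step

-- Iterate `step` until we reach a value: (\x. x+1) 1 |--> 1+1 |--> 2
eval :: Expr -> Expr
eval e = maybe e eval (step e)
```

Running `eval (App (Lam (\x -> Add x (Num 1))) (Num 1))` takes exactly the two steps from the example above and lands on the value `Num 2`.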

~~~
tomp
But this type-theoretical definition is not really well-suited for the real
world; either you have to include exceptions in the set of values, which could
include even things like Segmentation Fault, making ASM well-typed, or you
don't allow exceptions as values, which means that not even OCaml and Haskell
are well-typed with their DivisionByZero and ArrayOutOfBounds exceptions.

~~~
Nav_Panel
To add on to tel's response, another method of supporting exceptions is to
rewrite progress as "if x : t then either x |--> x', x is a value, or x is an
error".

To do this, you need to define another decently large set of judgments for
error propagation (such as "if x is an error and y : t then (x,y) is an error"
and vice versa), but ultimately you can maintain type safety via progress and
preservation while still accounting for exceptions as we know them.
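That three-way progress statement can be made concrete with a small sketch. In this toy fragment (all names are illustrative, not from any paper), `Err` is a distinguished non-value form, and the error-propagation judgments become extra cases of `step`:

```haskell
-- A toy language with division, an explicit error form, and the
-- error-propagation rules the revised progress statement needs.
data Expr = Num Int
          | Div Expr Expr
          | Err               -- "x is an error"
          deriving (Eq, Show)

isValue :: Expr -> Bool
isValue (Num _) = True
isValue _       = False

-- Progress now has three outcomes: a value, an error, or a step.
step :: Expr -> Maybe Expr
step (Div (Num _) (Num 0)) = Just Err             -- raise the error
step (Div (Num a) (Num b)) = Just (Num (a `div` b))
step (Div Err _)           = Just Err             -- propagate from the left
step (Div _ Err)           = Just Err             -- ... and from the right
step (Div a b)
  | isValue a              = Div a <$> step b
  | otherwise              = (`Div` b) <$> step a
step _                     = Nothing              -- values and Err are stuck

eval :: Expr -> Expr
eval e = maybe e eval (step e)
```

With this, `eval (Div (Num 1) (Div (Num 1) (Num 0)))` steps the inner division to `Err` and then propagates it outward, so type safety holds even though the program "fails".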

On the topic of whether it would make ASM well-typed, I figure that the lack
of array bounds checking would be one reason why ASM would have problems, but
I haven't thought it through fully. However, I found a neat paper that tries
to create a type-safe assembly language pretty similar to x86:
[http://www.cis.upenn.edu/~stevez/papers/MCGG99.pdf](http://www.cis.upenn.edu/~stevez/papers/MCGG99.pdf)

~~~
nmrm2
wrt ASM, I think tomp meant something like what Robert Harper is getting at in
his post on the linked article.

edit: got grandparent's username wrong.

------
lmm
I think it's worth talking about the ability to propagate user-defined
invariants. To me the most useful aspect of a type system is that it enables
you to declare that all Xs have some property, and then the language
guarantees that this holds. "Has a method named .foo(...)", while useful,
shouldn't be the end of what we try to encode, but there's a big gap between
allowing things like "an X is either a Y or a Z" and allowing arbitrary type-
properties, LiquidHaskell-style, which the article just glosses over.

Which is a shame, because it's precisely this range - the gap between non-
liquid Haskell or OCaml and Java - which is most relevant to today's
programmers making choices of programming language.

~~~
colanderman
Yes, I've long wanted to be able to attach arbitrary predicates to types. This
is great for functional state machines; say the state machine type is sm, you
can specify functions like

        fun initialize(sm) -> sm'
        where sm.state == uninitialized, sm'.state != uninitialized

or

        fun ping_clients(sm) -> sm' where has_clients(sm)

in concert with

        fun has_clients(sm) -> bool

The compiler can then track these, and ensure e.g. that "has_clients" is
called and returns "true" before "ping_clients" is called on a given value.

An even smarter compiler can, if it knows the definitions of these functions,
determine statically whether a given predicate is guaranteed to produce the
same result on a modified object. (e.g. simply by inspecting which fields are
inspected and modified)

~~~
tel
You can get this example working with indexed monads in Haskell. You can also
do it quite nicely in dependently typed languages like Idris/Agda.
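A simplified sketch of the first example, using a phantom type parameter rather than full indexed monads (all names here are made up for illustration): the machine's state lives in its type, so calling `pingClients` on an uninitialized machine is rejected at compile time.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

-- The state-machine state, promoted to the type level via DataKinds.
data MachineState = Uninitialized | Initialized

-- A state machine indexed by its (type-level) state.
newtype SM (s :: MachineState) = SM [String]

-- Only an uninitialized machine can be initialized, and the result
-- is typed as initialized.
initialize :: SM 'Uninitialized -> SM 'Initialized
initialize (SM clients) = SM clients

-- Only callable once the type says the machine is initialized;
-- applying pingClients to an SM 'Uninitialized is a type error.
pingClients :: SM 'Initialized -> [String]
pingClients (SM clients) = map ("pinged " ++) clients
```

The `has_clients` predicate from the example is harder: tracking the result of a runtime check in a type needs refinement or dependent types, which is where Idris and Agda come in.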

However, _arbitrary_ predicates are a different matter. Even if you have a
language which allows arbitrary computation in types, there exist undecidable
predicates which may weaken your desire for arbitrary ones.

~~~
AnimalMuppet
It seems to me that you probably don't want to do arbitrary computation in
types at compile time. Even "this type is an integer that must be non-
negative" is a problem if that type allows subtraction. Now your compiler
needs to know the value of all such types at the time that subtraction was
applied to them, to know if the operation is valid. That's not a good position
to put your compiler in.

It seems to me to be better to make this a runtime check, _which the compiler
makes sure to call every time subtraction is applied to a value of that type_.

~~~
jerf
"Even "this type is an integer that must be non-negative" is a problem if that
type allows subtraction. Now your compiler needs to know the value of all
such types at the time that subtraction was applied to them, to know if the
operation is valid."

This is one of the common misconceptions about the stronger-typed languages. I
know because I had it before I learned Haskell. To say that you have a
guaranteed non-negative number does not _necessarily_ mean that if you
subtract two such numbers, the compiler must be able to prove on the spot
that the result is non-negative.

What exactly it means depends on the language. Haskell is not particularly
capable of representing that type (or at least, _simple_ Haskell is not), so
it'll happily let you write:

        subtract :: NonNegative -> NonNegative -> NonNegative

and then simply propagate the incorrectness if you do end up with a negative
number, doing whatever the underlying system does. But a programmer will
notice this, and a correct subtract implementation may be:

        subtract :: NonNegative -> NonNegative -> Maybe NonNegative

in which case the compiler isn't doing anything, really; the implementation
will examine the two numbers and return Nothing if the subtraction is invalid.
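A minimal sketch of that second signature, assuming `NonNegative` is a simple newtype with a smart constructor (the names are illustrative; `subtract'` is primed to avoid clashing with the Prelude's `subtract`):

```haskell
newtype NonNegative = NonNegative Int
  deriving (Eq, Show)

-- Smart constructor: the only sanctioned way to build a NonNegative.
nonNegative :: Int -> Maybe NonNegative
nonNegative n
  | n >= 0    = Just (NonNegative n)
  | otherwise = Nothing

-- The type admits failure, so the caller must pattern-match on the
-- Maybe before using the result; the check cannot be silently skipped.
subtract' :: NonNegative -> NonNegative -> Maybe NonNegative
subtract' (NonNegative a) (NonNegative b) = nonNegative (a - b)
```

Here `subtract' (NonNegative 3) (NonNegative 5)` is `Nothing`, and there is no way to use the successful case without unwrapping the `Just`.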

In Haskell, this _is_ a runtime check... it is simply _one you can not
ignore_. It isn't an exception (very easy to forget to check for), and you
can't use the result (if there is one) until you unwrap it, and the language
syntax to do so tends to strongly encourage you to handle the error. You can
not subtract two such numbers without having it shoved in your face that it
may fail. You may then choose to fail to check it, since the language can't
actually stop you, but you will be doing so with full knowledge of the fact
that you are doing so.

And in general, that's how a lot of the "bad computations" that a type system
prevents are handled... it isn't "prevented" so much as "you are forced to deal
with violations". For instance, imagine you are deserializing input from an
external source... in a strongly-typed language, all the deserialization
routines will have Maybe (or perhaps Either with errors) pervasively used,
since deserialization routines are able to fail at pretty much any point. The
language doesn't prevent failures from happening, because it can't. It
prevents you from sweeping failures under the rug, accidentally or otherwise.

Thus, in Haskell, the first subtract example up above, while the compiler
won't actually stop you from writing it, is incorrect; there will exist no
implementation that doesn't behave badly for some inputs. In a dependently
typed language, there exist safe implementations, but you will have to provide
some sort of proof that it works. However, since in general it's difficult to
prove, that would probably involve some local context, and it's not hard to
imagine that in practice for such a simple thing you might still end up using
the equivalent of the Haskell function. It just depends on what you have in
hand.

In a way, programming in something like Go has a very similar feel in this way
to programming in Haskell, where the fact that the error is constantly shoved
in your face and you are forced to deal with it before moving on is the way it
generally works. Haskell has a wide variety of clever ways of dealing with it,
and Go mostly uses brute programmer force, but the feel is quite similar,
IMHO.

~~~
tel
It's also worth noting semantically what the type of `subtract` reads as. It
says that for any two NonNegative typed values you can produce a new value in
the type `Maybe NonNegative` which is the type of NonNegatives adjoined with
one special "failure" value.

So this is a perfectly well-defined notion of subtraction on non-negative
numbers. It reflects the obvious fact that NonNegatives are not _closed_ under
subtraction.

------
freyrs3
> As such, we can think of these languages as type safe according to the null
> type system

I think a term like _vacuously type safe_ is a good description of what
Python/Ruby are under this set of definitions. So as to distinguish from
languages where type safety actually means something slightly stronger than
"all syntactically valid programs are accepted by the null type system".

~~~
vinceguidry
I don't think it's quite right to denigrate the actual safety you get in
dynamic languages by referring to their type systems as such.

They are indeed safe: you won't get undefined behavior in the vein of C/C++.
In fact, if they weren't, there wouldn't be any point to using them over
more performant architectures.

The only time as a Ruby developer I've ever encountered a segfault is when a
library calls out to a C extension; the problem is fixed by switching to a
different gem. I'm safe in that whenever my program fails, the bug is in
something I've written. I've never found an actual bug in Ruby.

Sure, I'm giving up a hell of a lot of performance for this safety, but it's
more than worth it for my company's use case.

In fact, I can't even imagine a form of safety you could add to Ruby that
isn't a Very Hard Problem, e.g. safety from race conditions. TDD in Ruby gives
you a kind of certainty about the correctness of your code that is hard to
get in other languages. If something breaks, just add another test.

~~~
freyrs3
You're putting words in my mouth by implying I denigrated Ruby as unsafe using
a much stronger definition of safety. Ruby is type safe under the article's
definitions, but I suggest that this claim doesn't carry much information
content because a conclusion about the null type system is such a weak claim,
like a claim about the null set.

~~~
vinceguidry
Reducing dynamic typing's safety to a null set claim is exactly the sort of
denigration I'm arguing doesn't make sense. Type doesn't disappear just
because you aren't encoding it as a first-class PL design concern.

~~~
freyrs3
This discussion is just going to reduce down to a _runtime tag_ vs _formal
type_ definition. This particular article is using the term type to mean
_formal type_ , and my comment is also under this assumption. Under that
definition Ruby does have a null type system. That's not a moral judgement of
its design; it's just a fact that follows from the definitions.

Rather than debate this for the gazillionth time, I'll just point you at tel's
article which breaks it down very nicely:
[http://tel.github.io/2014/07/08/all_you_wanted_to_know_about...](http://tel.github.io/2014/07/08/all_you_wanted_to_know_about_types_but_were_afraid_to_ask/)

------
AnimalMuppet
"Well typed programs don't go wrong" only for sufficiently narrow definitions
of "go wrong". They don't go wrong by doing undefined behavior. They can still
go horribly wrong by doing unintended behavior, though. Type safety doesn't
eliminate all bugs, just a subset of them.

~~~
andolanra
Depending on your type system, though, the subset that is eliminated can be
larger or smaller. In Java, it's a smaller subset; in Haskell, it's a larger
subset (because e.g. the type system statically rules out the possibility of
null), and in a dependently typed language like Coq or Agda, it's a
ridiculously large subset of bugs. It depends greatly on your particular type
system.

------
tel
I tried to write a post a little bit similar to this one. I'd like to hear any
feedback:
[http://tel.github.io/2014/07/08/all_you_wanted_to_know_about...](http://tel.github.io/2014/07/08/all_you_wanted_to_know_about_types_but_were_afraid_to_ask/)

------
emillon
It is also interesting to consider the PL definition of type safety: the
progress and preservation theorems.

Progress: if an expression is well typed then either it is a value, or it can
be further evaluated (in the sense that "an appropriate evaluation rule
exists").

Preservation: if an expression e has type t, and is evaluated to e', then e'
has type t.

That's it, really (well, you can do a few extensions to allow errors such as
division by zero, or take memory into account).

If your language has both of these properties, they tell you that typed
expressions will behave nicely at run time.

~~~
tel
That's _it_ for the definition, but it's far from _it_ for the meaning.
Python/Ruby make a great example when they're interpreted as vacuously type
safe—clearly this definition needs other components in order to be meaningful.

And that's really the rub. You want a language which is type safe _and_
possessing expressive types. Getting both is challenging.

~~~
mnarayan01
Saying that e.g. Ruby is "vacuously type safe" strikes me as fairly
misleading. Consider an extension to Ruby which defines the ++ operator as
casting its argument as an integer (e.g. Fixnum) and incrementing it.
"Properly" defined, this operator would break type safety in a way that __can
not__ currently be done in Ruby (i.e. modifying a string from "abcdefgh" to
"bbcedefgh", "abcdefgi", or some other nonsense). This is clearly a type
error, and it is one that Ruby does not allow, so describing Ruby's type
_safety_ as vacuous strikes me as wrong.

Additionally, Ruby prevents e.g. calling an undefined method on an object.
Given the way the interpreter works this comes trivially, but that does not
mean that it does not affect type safety.

To summarize: there are certain classes of type errors related to e.g. casting
which would be dumb to allow in a (non-toy) dynamically typed language and
others which come with no particular cost. This does not mean the type safety
of the language is "vacuous" (though it may negate the need for an explicit
"type checker").

~~~
tel
What you're referring to may be both (a) true and (b) valuable, but it simply
_isn't_ "type safety" as is being discussed. It's unfortunately confusing to
talk about this stuff since "type" is an overloaded word.

------
Ixiaus
To those that know more than I: my understanding before this article was that
the only "type-safe" languages were ones like Haskell, OCaml, Idris, Agda,
etc...

If that's not the case, is "type-safe, pure, and functional" a good epithet to
extend Haskell's "type-safe" tag with, covering the many other features that
weave well with its type system?

~~~
mnarayan01
I believe Haskell would be considered to have "strong static typing".
Unfortunately, "strong" in this context is less precise than one might like:
both Haskell and Java are generally considered "strongly typed", but Java
allows e.g. type casting in a way that will not be statically type checked.

Haskell also uses type inference rather than requiring type declarations, but
that is not _directly_ related.

~~~
tel
Type casting has to be modeled (more or less) as (a) upcasting only, (b) a
failure of type safety, or (c) an admission that the type system is entirely
vacuous.

~~~
AnimalMuppet
No, you can have downcasting that is checked for validity at runtime, a la
Java (or C++ with RTTI). This is not a failure of type safety, since only
valid casts will be performed. It's not vacuous, either - it's not just one
big type.

~~~
tel
What happens when a cast is invalid?

You have type-safe casts in Haskell, for instance, where the failure is
reflected at runtime.

        cast :: (Typeable a, Typeable b) => a -> Maybe b

Those are alright.
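For reference, that `cast` is `Data.Typeable.cast` from the `base` library; the result type the caller asks for decides whether the cast succeeds, and failure is just `Nothing`:

```haskell
import Data.Typeable (cast)

-- A cast to the value's own type succeeds...
sameType :: Maybe Int
sameType = cast (3 :: Int)

-- ...while a cast to a different type fails safely, with no exception.
differentType :: Maybe String
differentType = cast (3 :: Int)
```

So unlike a Java downcast, a failed `cast` cannot blow up at the call site: the `Maybe` forces the caller to handle the failure case.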

~~~
AnimalMuppet
In Java, IIRC, the cast fails with a ClassCastException. That seems to me to
be "failure is reflected at runtime".

In C++ with RTTI, IIRC it returns null. That's also "failure at runtime" (of a
sort), but you have to check it.

~~~
jeorgun
RTTI returns nullptr when casting a pointer, but it throws an exception
(std::bad_cast) when casting references. Both of these behaviors are well-
defined, although the first is kind of questionable.

------
kazinator
Article claims that, for Python and Ruby, "language semantics gives meaning to
all programs, so the well defined and all circles of our diagram coincide".

How can that be so, if Python and Ruby provide escape hatches that allow C
libraries to be called?

Perhaps what is meant is that, in the absence of certain features, the core
semantics is well-defined.

------
cousin_it
So type safety is just a synonym for "no undefined behavior"? What about code
like "i=i++", which leads to undefined behavior but doesn't seem to have any
type errors? I thought type safety referred to only a subset of undefined
behaviors, namely using a value of type A as though it had type B.

~~~
AnimalMuppet
This seems to be C/C++ code. The article said that C and C++ were not type
safe, so "doesn't seem to have any type errors" within C/C++ doesn't mean that
it's type safe.

That said, the article did say that Java was type safe, and you can write that
same expression in Java. I don't recall whether the result is undefined in
Java.

~~~
cousin_it
As far as I know, Java has no undefined behavior in the C/C++ sense (nasal
demons), and "i=i++" is well defined in Java.

------
igravious
> A classic English example is Chomsky’s “Colorless green ideas sleep
> furiously”—perfectly syntactically correct, but meaningless.

Interesting. But is it meaningless? Isn't its very meaninglessness its
meaning, since it has come to be known, by being bandied about far and wide,
as an example of meaninglessness?

Maybe it's something more. Clearly, for all X, X cannot be at one and the same
time both green and colorless. Similarly, ideas do not _literally_ belong to
the class of things which can sleep. Finally, sleeping as an activity is never
done furiously: fury signifies intent, and whatever else may be true when one
is asleep, it is certainly true that one lacks intent when asleep.

Thus packed into that sentence are three ways in which it _cannot_ be the case
or refer to something _actual_ in the world. Is it in this way that it fails
to have meaning and so will always continue to fail to have meaning?

postscript:

> Let’s deconstruct this phrase and define its parts, considering the second
> part first.

Let's use the word analyse rather than deconstruct when that is the word
needed.

~~~
tel
These are deeply philosophical questions, but ones epistemologists dispatched
a long time ago.

1\. "Isn't its very meaninglessness its meaning" is a confusion of reference
and sense. The _quoted_ sentence "Colorless green ideals sleep furiously"
interpreted as an object in its own right holds meaning as a common example of
syntax and semantics but if you "remove the quotes" and interpret its sense
then we run into problems... and obviously that's what everyone is talking
about when they bring it up.

2\. "Is it in this way that it fails to have meaning and so will always
continue to fail to have meaning?" Yep. If we put this sentence forward as an
assertion or judgement about reality, then we would like to evaluate its
"truth". Even without getting too mystical, it's a long-established sense of
the definition of truth that it ought to somehow
_correspond_ with reality and that any notion of truth which does not is
hypothetical or meaningless. This is encapsulated in Dummett's C and discussed
by Per Martin-Löf here: [https://michaelt.github.io/martin-lof/Truth-and-
Knowability-...](https://michaelt.github.io/martin-lof/Truth-and-Knowability-
On-the-Principles-C-and-K-of-Michael-Dummett-1998.pdf)

