> an "unsound type system" is a technical, precise notion
Yup. Milner's "can't go wrong", progress and preservation, etc.
> You cannot just reply "well there are a few runtime checks so it's all good."
Sure I can. I really like how Shriram Krishnamurthi describes soundness in Programs and Programming Languages [1]. I can't think of a better definition for soundness than:
"The central result we wish to have for a given type-system is called soundness. It says this. Suppose we are given an expression (or program) e. We type-check it and conclude that its type is t. When we run e, let us say we obtain the value v. Then v will also have type t."
The "we obtain the value v" part is critical. If an expression of type e doesn't produce a value at all (it terminates or throws an exception), then we have also satisfied soundness.
Indeed, note that he also says:
"Any rich enough language has properties that cannot be decided statically (and others that perhaps could be, but the language designer chose to put off until run-time to reduce the burden on the programmer to make programs pass the type-checker). When one of these properties fails—e.g., the array index being within bounds—there is no meaningful type for the program. Thus, implicit in every type soundness theorem is some set of published, permitted exceptions or error conditions that may occur. The developer who uses a type system implicitly signs on to accepting this set."
A term like "soundness" for a programming language should be useful. We could, for example, define "evenality" as a property of programming languages where we say that a language whose built-in atomic types have names that are all an even number of letters has evenality and other languages don't. That's a well-defined concept and we could neatly partition extant languages into whether they have evenality or not. But who cares?
When it comes to soundness, the above definition from PAPL is useful for (at least) two concrete reasons:
1. When a user is reading code, if they see an expression has some type T, they can safely reason that any value the expression evaluates to will have type T and when they are reasoning about code surrounding that expression, they can rely on that fact.
2. Likewise, when a compiler is compiling code, it can safely assume that if an expression has type T, then all subsequent code that depends on the value of that expression can assume it has type T. The compiler can optimize safely and correctly based on that assumption.
Neither of these properties require that all type checks are performed at compile time. If the runtime throws an exception on out of bounds array indices, that still correctly preserves the soundness invariant that the type of an array element access is the type of the array element. The reader might have to think about the fact that the expression could throw. But they don't have to think about it evaluating to the wrong type.
If that's not your definition of soundness and you require a sound language to have zero runtime checks, then I'm not aware of any widely-used language that meets that requirement, nor do I see how it's a particularly useful term.
Note that it's not the case that every language is sound according to the above definition. C, C++, TypeScript, and Dart 1.0 (but not 2.0 and later) are all unsound. In the first two, it's possible to completely reinterpret memory as another type which leads to the majority of software security issues in the world. In the latter two, the only reason that doesn't happen is because the underlying execution environment doesn't rely on the static types of expressions at all.
Yup. Milner's "can't go wrong", progress and preservation, etc.
> You cannot just reply "well there are a few runtime checks so it's all good."
Sure I can. I really like how Shriram Krishnamurthi describes soundness in Programs and Programming Languages [1]. I can't think of a better definition for soundness than:
"The central result we wish to have for a given type-system is called soundness. It says this. Suppose we are given an expression (or program) e. We type-check it and conclude that its type is t. When we run e, let us say we obtain the value v. Then v will also have type t."
The "we obtain the value v" part is critical. If an expression of type e doesn't produce a value at all (it terminates or throws an exception), then we have also satisfied soundness.
Indeed, note that he also says:
"Any rich enough language has properties that cannot be decided statically (and others that perhaps could be, but the language designer chose to put off until run-time to reduce the burden on the programmer to make programs pass the type-checker). When one of these properties fails—e.g., the array index being within bounds—there is no meaningful type for the program. Thus, implicit in every type soundness theorem is some set of published, permitted exceptions or error conditions that may occur. The developer who uses a type system implicitly signs on to accepting this set."
A term like "soundness" for a programming language should be useful. We could, for example, define "evenality" as a property of programming languages where we say that a language whose built-in atomic types have names that are all an even number of letters has evenality and other languages don't. That's a well-defined concept and we could neatly partition extant languages into whether they have evenality or not. But who cares?
When it comes to soundness, the above definition from PAPL is useful for (at least) two concrete reasons:
1. When a user is reading code, if they see an expression has some type T, they can safely reason that any value the expression evaluates to will have type T and when they are reasoning about code surrounding that expression, they can rely on that fact.
2. Likewise, when a compiler is compiling code, it can safely assume that if an expression has type T, then all subsequent code that depends on the value of that expression can assume it has type T. The compiler can optimize safely and correctly based on that assumption.
Neither of these properties require that all type checks are performed at compile time. If the runtime throws an exception on out of bounds array indices, that still correctly preserves the soundness invariant that the type of an array element access is the type of the array element. The reader might have to think about the fact that the expression could throw. But they don't have to think about it evaluating to the wrong type.
If that's not your definition of soundness and you require a sound language to have zero runtime checks, then I'm not aware of any widely-used language that meets that requirement, nor do I see how it's a particularly useful term.
Note that it's not the case that every language is sound according to the above definition. C, C++, TypeScript, and Dart 1.0 (but not 2.0 and later) are all unsound. In the first two, it's possible to completely reinterpret memory as another type which leads to the majority of software security issues in the world. In the latter two, the only reason that doesn't happen is because the underlying execution environment doesn't rely on the static types of expressions at all.
[1] https://papl.cs.brown.edu/2020/safety-soundness.html#%28part...