
Untyped Programs Don’t Exist - gbrown_
https://www.williamjbowman.com/blog/2018/01/19/untyped-programs-don-t-exist/
======
porpoisely
A couple of points.

JavaScript, Racket, etc. that he listed aren't "untyped" languages. They are
typed: weakly/dynamically typed. Although I guess some people equate the two,
so let's go with that.

Also, division by zero is a runtime operation error; it is not really a type
error. The division operation doesn't fail because it violated the type
system, it fails because you can't mathematically divide by zero.

Finally, x86 assembly doesn't really have what we call a type system in the
general sense.

But assuming his premises and logic, the conclusion should be "Typed Programs
Don't Exist". If x86 checks its types at runtime (division by zero), then it
is a dynamic language. Dynamic languages are untyped languages like JavaScript
and Racket. And since all programs are ultimately assembly programs, and all
assembly programs are created by dynamic languages, then it must mean that all
programs are untyped. Hence, no typed programs exist.

Ultimately, the author's argument is that a runtime division-by-zero error is
proof of x86's type system. But if that is true, x86 assembly is a dynamic
(aka untyped) language, and his conclusion is wrong.

~~~
naasking
> They are typed. Weakly/dynamically typed.

Unityped, as in, they have only one type. You can assign any value to any
variable, so all values have a single type.

> Also, division by zero is an runtime operation error, it is not really a
> type error. The division operation doesn't fail because it violated the type
> system, it fails because you can't mathematically divide by zero.

That is a type error in some type systems. The general notion of a type is a
logical proposition. "Non-zero number" is a proposition, so you can represent
it with types. The type of division is then `Num -> (Num != 0) -> Num`.
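For illustration, the "non-zero divisor" proposition can be approximated even
without dependent types, via a smart constructor. A TypeScript sketch (the
`NonZero` and `safeDiv` names are mine, not from the article) that makes an
unchecked divisor a compile-time error:

```typescript
// A "branded" number type: the brand is a compile-time-only tag marking
// values that were checked to be non-zero at construction time.
type NonZero = number & { readonly __brand: "NonZero" };

// Smart constructor: the only sanctioned way to obtain a NonZero.
// Returns null when the check fails, so callers must handle that case.
function nonZero(n: number): NonZero | null {
  return n !== 0 ? (n as NonZero) : null;
}

// Division now has, roughly, the shape Num -> (Num != 0) -> Num:
// the type checker rejects a plain number in the divisor position.
function safeDiv(num: number, den: NonZero): number {
  return num / den;
}

const d = nonZero(4);
if (d !== null) {
  console.log(safeDiv(12, d)); // 3
}
```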

> Dynamic languages are untyped languages like javascript and racket.

This is not true. All languages are typed, either with rich types, or they are
unityped (like the lambda calculus or JS).

Machine language is typed, and the fundamental types are (roughly) word and
double. These are the registers of machine language, which you can roughly
think of as variables in programming languages.

~~~
int_19h
Values in JS still have different types (as evidenced by their different
behavior). _Variables_ could all be argued to have one type in JS, which is
the root type (which everything else subtypes).

So no, JS is still not untyped, merely dynamically typed. An example of a
language that also has only one data type for _values_ would be something like
Tcl, where the only type is a string, and everything else is just an
optimized way to store certain kinds of strings that could be removed without
affecting the semantics of any program.
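The distinction is easy to see at a JS/TS prompt: values carry distinguishable
runtime type tags and operators dispatch on them, even though a variable may
hold any of them over time. A small sketch:

```typescript
// Each value reports a runtime type tag via typeof.
const samples: unknown[] = [1, "1", true, undefined];
const tags = samples.map((v) => typeof v);
console.log(tags); // ["number", "string", "boolean", "undefined"]

// The same variable holds values of different types over time,
// and + dispatches on the value's type: addition vs. concatenation.
let x: any = 1;
const asNumber = x + x; // 2
x = "1";
const asString = x + x; // "11"
```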

~~~
naasking
> Values in JS still have different types (as evidenced by their different
> behavior).

Behaviour is based on values, not types. 1+2 will print out a different value
than 1+1, but these expressions have the same type (typically).

> So no, JS is still not untyped, merely dynamically typed.

I said JS was unityped, not untyped.

~~~
int_19h
Values also have types; it's not something that's exclusive to expressions and
bindings.

------
madmax96
I'm going to be a little nitpicky here:

> Definition. An expression is a symbol or sequence of symbols given some
> interpretation.

> Definition. A language is a collection of expressions.

I immediately reject these definitions. A language is a subset of the free
monoid over some alphabet [1]. This definition is _widely_ used, and the
author's changing it here is significant. Next, because the author requires
that each expression be given an interpretation, the author conflates language
(e.g. valid strings) with semantics (e.g. the interpretation of strings).
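To make the standard definition concrete: the free monoid over an alphabet Σ
is the set of all finite strings over Σ (including the empty string), and a
language is just a subset of it, with no interpretation attached. A small
TypeScript sketch (the `freeMonoid` helper name is mine):

```typescript
// Enumerate the free monoid Σ* over an alphabet, up to a given length.
function freeMonoid(alphabet: string[], maxLen: number): string[] {
  let layer = [""]; // start from ε, the empty string (the monoid identity)
  const out = [...layer];
  for (let n = 1; n <= maxLen; n++) {
    layer = layer.flatMap((w) => alphabet.map((c) => w + c));
    out.push(...layer);
  }
  return out;
}

// A language is a subset of Σ*: here, strings with an even number of 'a's.
// No interpretation is attached; membership is all there is.
const language = freeMonoid(["a", "b"], 3).filter(
  (w) => [...w].filter((c) => c === "a").length % 2 === 0
);
console.log(language.includes("aab")); // true: two 'a's
console.log(language.includes("ab"));  // false: one 'a'
```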

> The result of this sequence of symbols is undefined in C.

...

> Definition. Undefined behavior is the result of interpreting a non-
> expression.

The author is mixing up a few definitions of "undefined behavior". Of
course, C compilers will _gladly_ compile programs containing expressions that
are total nonsense. The resulting behavior of the program has no meaningful
semantics, but the program itself consists _only of valid expressions._
Because a program may contain expressions that exhibit UB while still being
well-formed under the _standard definitions_ (e.g. the definitions of
"expression" and "language" ubiquitously used when discussing programming
languages and compilers), the proof is incorrect.

The immediate implication of this result is that technically a C program with
UB is free to modify itself non-deterministically and thereby prevent any
existing proof system from predicting its behavior.

However, definitions aren't really "right" or "wrong." The proof is otherwise
sound. My critique is that real systems aren't built using the definitions the
author uses and therefore the resulting theorem isn't really applicable in any
practical circumstance.

[1]
[https://en.wikipedia.org/wiki/Formal_language](https://en.wikipedia.org/wiki/Formal_language)

------
goto11
I find it a bit of a strawman. "Untyped" is not a very common term for
programming languages. "Dynamically typed" is much more common in my
experience. I'd call something like CSV untyped.

~~~
tom_mellior
Certain subsets of the statically typed functional programming research crowd
(i.e., Haskell/OCaml/ML aficionados) do like to refer to dynamically typed
languages as "untyped". For them it is a common term of derision.
Unfortunately they do not know or care that the rest of the world uses the
term differently.

~~~
Tade0
Derision is generally worryingly common among these people.

This is of course anecdotal on my part but as someone working with JS on a
daily basis I've received more than my share of insults based on my choice of
career.

~~~
olau
I've seen the same.

Yet good user interfaces are typically warty behind the scenes and full of
exceptions, exactly the kind of thing where spending a lot of time on
specifications/typing offers little reward. Au contraire, it can put you into
a mindset where you'd rather fight the users than put in another difficult-to-
handle-typewise exception.

Some time ago I had a look at old Visual Basic which I've never personally
used, and lo and behold, if you squint it's not actually that far from
Javascript.

~~~
Tade0
Ha! I remember when _VBScript_ was still a thing. What a different world we
would be living in had Microsoft had its way.

------
myWindoonn
And most type systems don't actually reflect the subtyping relation inherent
in the topos correspondent to their logic. As a result, there is a massive
disconnect between type systems, which let us catch errors in programs without
running them, and type theory, an essential and unavoidable part of doing
maths.

I think that we should take "typed", "untyped", etc. away from the dynamic-
language haters and force them to justify their positions without this
needlessly pejorative attitude. "Types lend themselves to strong automated
reasoning, automatically eliminate large classes of errors, and simplify the
job of whoever is reasoning about the programs," writes the author, ignoring
that there are piles of similar features in the language design space, like
highlighting, typesetting, brace-matching, whitespace, capitalization
conventions, docstrings, named variables, named procedures, 2D/spatial layout,
ASCII/JIS/Unicode art, and the list goes on and on and on.

~~~
AnaniasAnanas
> most type systems don't actually reflect the subtyping relation inherent in
> the topos correspondent to their logic

Would you mind elaborating? With examples if possible.

> ignoring that there are piles of similar features in the language design
> space, like highlighting, typesetting, brace-matching, whitespace,
> capitalization conventions, docstrings, named variables, named procedures,
> 2D/spatial layout, ASCII/JIS/Unicode art, and the list goes on and on and
> on.

I am interested to see how any of these would protect against bugs that say a
type system with dependent types would protect against.

~~~
myWindoonn
The most infamous example is that of Haskell. Reasoning in Haskell is not the
same as reasoning in the category Hask [1], whose objects and arrows purport
to be Haskell types and functions; many folks are actually reasoning in an
idealized, simplified Hask. Papers like "Fast and Loose Reasoning is Morally
Correct" [0] justify this stance. (There is, to be fair, work on determining
Haskell's actual categorical properties. [2]) This attitude is popular across
the ML family, but spikes in the Haskell community for some reason.

I don't have a good citation for categorical subtypes, but it is well-known
category-theory folklore that often objects have subobjects and that a given
arrow `f : X -> Y` has fixed source object X and target object Y, and
therefore that f likely embodies not just categorical structure but also
subobject structure inside X. There may be many distinct arrows of type `X ->
Y`, each with different subobject behavior. This is where concepts like
dependent typing shine, right?

I'm going to take the other branch of your implicit dilemma. Let's protect
against the same bugs that dependent types protect against, but without
dependent types. This is valid E [3] or Monte [4], and languages in the E
family are what Haskellers would call "unityped" or "untyped". The
specification is that `y >= x` and also that the return value `rv >= y`. The
compiler is not obligated to infer corollaries like `x >= 0`.

    
    
        def f(x :Int, y :(Int >= x)) :(Int >= y) { return x + y }
    

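Since E/Monte guards are checked at runtime rather than proven statically,
roughly the same contract can be sketched in any dynamic setting. This
TypeScript approximation (my own, not Monte's actual semantics) enforces
`y >= x` on entry and `rv >= y` on exit:

```typescript
// A runtime "guard" approximating Monte's (Int >= bound) annotation:
// it returns the value unchanged if the check passes, else throws.
function guardAtLeast(bound: number, v: number, what: string): number {
  if (!Number.isInteger(v) || v < bound) {
    throw new TypeError(`${what} must be an Int >= ${bound}, got ${v}`);
  }
  return v;
}

// f(x :Int, y :(Int >= x)) :(Int >= y), with the contract checked dynamically.
function f(x: number, y: number): number {
  guardAtLeast(x, y, "y");              // entry check: y >= x
  const rv = x + y;
  return guardAtLeast(y, rv, "return"); // exit check: rv >= y
}

console.log(f(2, 3)); // 5
// f(-1, 5) passes the entry check but fails the exit check at runtime,
// since nothing inferred the corollary x >= 0.
```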
It is not accidental that Monte has a categorical semantics [5] at the level
of values and not types. It seems to be the case that, even if the category of
types is trivial, the category of values is rich. At the same time, studying
only the category of types and never examining the values is insufficient!
Ponder this Haskell REPL session:

    
    
        $ nix run nixpkgs.ghc -c ghci
        GHCi, version 8.2.2: http://www.haskell.org/ghc/  :? for help
        Prelude> :t reverse
        reverse :: [a] -> [a]
        Prelude> :t const [] :: [a] -> [a]
        const [] :: [a] -> [a] :: [a] -> [a]
        Prelude> :t cycle
        cycle :: [a] -> [a]
        Prelude> :t map id
        map id :: [b] -> [b]
        Prelude> :t tail
        tail :: [a] -> [a]
    

Studying the types alone, we are helpless. We need to know something of the
values as well.

[0] [http://www.cse.chalmers.se/~nad/publications/danielsson-et-al-popl2006.pdf](http://www.cse.chalmers.se/~nad/publications/danielsson-et-al-popl2006.pdf)

[1] [https://wiki.haskell.org/Hask](https://wiki.haskell.org/Hask)

[2]
[http://www.cs.gunma-u.ac.jp/~hamana/Papers/cpo.pdf](http://www.cs.gunma-u.ac.jp/~hamana/Papers/cpo.pdf)

[3] [http://erights.org/](http://erights.org/)

[4] [https://monte.readthedocs.io/](https://monte.readthedocs.io/)

[5]
[https://monte.readthedocs.io/en/latest/category.html](https://monte.readthedocs.io/en/latest/category.html)

------
tabtab
On C2-dot-com there used to be a long-running debate about "tag based" dynamic
typing versus tag-free dynamic typing. Tag-free is similar to "untyped", but
that may depend heavily on which definitions are used. This debate included
whether tag-free is "good", and how to define it, since "tag" implies an
implementation and not really behavior. One may also use tags for compression
and speed "under the hood" even under "tag-free", but the programmer may not
otherwise have to care as a language user (outside of performance issues).
Thus, how does one know as a language user that the language is "tag free"?

I lean toward tag-free; it makes life simpler in my opinion. Interesting but
thready debate. A battle of philosophy and definitions.

------
mlthoughts2018
This is totally backwards. _Typed_ programming languages don’t exist, as in at
runtime it’s just instructions applied to sequences of bits. A type is just a
convention about an agreed interpretation of those bits. It’s a temporary
fiction that doesn’t exist at runtime.

~~~
klodolph
There's some inconsistency in that argument. If a type is just a convention
about an agreed interpretation of bits, then isn't it equally true that bits
are just a convention about an agreed interpretation of voltages, charges, and
magnetic domains? But worse yet, voltages, charges, and magnetic domains are
just interpretations of the physical world, and they don't really "exist"
either...

~~~
Tade0
The difference with bits is that it's a convention realised in the hardware.

0 and 1 are _physically_ different from each other. A 64-bit float could
easily be seen as an array of bytes without any change to the information it
contains.

Bottom line: types are external to the information something contains.
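The float-as-bytes point is easy to demonstrate: one 8-byte buffer, and the
identical bits read back either as an IEEE 754 double or as raw bytes; only
the external interpretation differs. A TypeScript sketch:

```typescript
// One 8-byte buffer; the identical bits under two interpretations.
const buf = new ArrayBuffer(8);
const view = new DataView(buf);

view.setFloat64(0, 1.5); // write the bit pattern of the double 1.5

const asFloat = view.getFloat64(0);              // read back as a float
const asBytes = Array.from(new Uint8Array(buf)); // or as 8 raw bytes

console.log(asFloat); // 1.5
console.log(asBytes); // [63, 248, 0, 0, 0, 0, 0, 0] (big-endian IEEE 754)
```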

~~~
klodolph
Whether a particular voltage represents a 0 or a 1 is an interpretation
external to the device. Internally, there are just voltages. Whether a
particular voltage represents a 0 or a 1 will not in general be consistent,
even within a device, and whether a particular piece of the device represents
a bit or not is a matter of interpretation.

What tells you that a particular location in a chip represents a bit and what
voltages correspond to 0 and 1? You have to bring external knowledge of the
particular device.

~~~
Tade0
Entropy. I can measure the number of steady states of the device and discover
the actual, real, physical information contained within.

Also through this I can discover which voltage corresponds to the high and low
state.

Meanwhile a series of bytes can be a float, a string, a table of numbers or
even a function pointer - you can't tell which without external information.

~~~
klodolph
It's a bit frustrating to see a comment that starts with a single,
incomprehensible, verbless statement like "entropy." I know what entropy is,
and I could give you three different senses for the word, but it's not a
particularly illuminating thing to think about here and I'm clueless about how
this connects to the rest of the comment.

I think what you're doing is taking your existing understanding of how a
processor works, and saying that you could figure it out, given an actual
processor. But that's begging the question. You have a definition for what a
"bit" is, and what a "byte" is, and since you know what they are you can go on
a hunt within a physical system and label things as "bit" and "byte". But I
could come in and start labeling certain things "function pointer" or "float",
and I don't see a particularly compelling reason why that's not allowed, but
labeling things as "bit" or "byte" is allowed.

Or in other words, why is, "I know how synchronous digital circuits work and I
can figure out a processor" acceptable, but "I know how compilers work and I
can reverse engineer a program" prohibited?

And I'll still say that the statement that, "Typed programming languages don’t
exist," is not even wrong. It's just a cute thing that you can say, to
demonstrate that you can choose a definition for "exist" or "typed programming
languages" that makes it needlessly difficult to have a conversation.

