
Should Your Specification Language Be Typed? (1999) [pdf] - pron
http://research.microsoft.com/en-us/um/people/lamport/pubs/lamport-types.pdf
======
ktRolster
The opinion piece (that's what it is) starts with an assertion that types make
it easier to find bugs. I think the reality is more nuanced.

When types were pushed into the C language (function prototypes, etc.), they
fixed a lot of bugs. The compiler would stop you from, say, passing a short
where an int was expected. Because those were common and hard-to-debug
mistakes, the experience increased the popularity of types.

Typing of higher level objects is less useful. When you pass an object into
the wrong function, the code usually doesn't even pretend to work (unlike an
integer passed in as a float parameter). Also, because objects are so
different, it's rare to accidentally pass an object as something else. The
only time I've seen that happen is in very convoluted, multi-layered code
bases.

We can see something similar in languages like Ruby or Perl, where it's
actually kind of nice that you don't have to worry about whether a variable
holds a string or integer or whatever. (For some reason, JavaScript's
implementation of the same idea seems to be a bit more annoying, but that
could just be me.)

So strong type systems are not necessarily a boon. (Of course, on the other
hand, we have Haskell, where the type system can help reduce bugs).

~~~
dllthomas
Types help you enforce invariants. When you need to care about representation,
representational types are quite useful. But it turns out, when we don't care
about information interchange or a whole lot about performance, we usually
don't care about representation as long as it's handled correctly, and a lot
of automatic systems are okay (or better) at handling representation.

More of what we care about is domain relevant, and that's a place where
enforcing invariants can be important. As an incredibly simple example,
passing a "bid" price where an "ask" is expected could be an expensive error,
and is unlikely to trigger an exception.

A type system like Haskell can be a big win, but we can enforce surprising
things in something as simple as C. I had a project where I had several
threads devoted to different activities, operating on their own static data.
With a little creativity, I was able to track at compile time which threads a
function could be run from and detect when I accessed data from the wrong
thread. Without adding any run-time overhead, and while leaving me a free hand
in organizing my data and functions in both my source files and my executable.
It was tremendously convenient, when changes in specifications required me to
move functionality from one thread to another, to be told everything that
needed to change along with that. Over the course of the project, I can count
on one hand the number of concurrency issues found in testing and only one - a
complicated livelock - ever made it into production.

~~~
versteegen
That sounds really fascinating and educational. Could you possibly provide
some more details or examples of how you accomplished that in C?

~~~
dllthomas
I'm a bit sleep deprived (new baby) but I'd like to try to respond before the
reply link goes away.

There are a few layers to this.

I threaded knowledge of which thread a function could run on through as the
first argument. It was always called `th` and took the form of an _empty_
struct. It turns out, if you pass an empty struct by value and don't do
anything but pass it along, modern C compilers seem to generate machine code
identical to what you'd get if you removed the argument entirely (obviously,
this isn't something you could strictly rely on, but it affected only
efficiency, not correctness, and if you're at the level where you care about
that scale of efficiency you should be verifying all manner of assumptions
when moving between compilers!).

So now at any given point in any given thread-pinned function, you've got a
variable in scope called `th`. Calling another thread-pinned function on the
same thread looks like `foo(th, other, args)`. If that function isn't supposed
to run on this thread, you get an error about trying to pass something of type
`server_thread_t` as an argument of type `main_thread_t` (or whatever).

Controlling data access could be done with getters and setters. I took a
slightly different approach (which I'm not convinced is better, but it didn't
cause me any pain this project), which was to define a macro used to declare
global data that was bound to a thread, which would mangle the name and also
create a no-op checking function. I had another macro that would take the
simple name, check the thread, and resolve to the thing with the mangled name
(as an lvalue). Having called that macro `L`, I now can treat `L(foo)` just as
if it were a global variable named `foo`, and at runtime that's all it is, but
at compile time my access to it is checked.

I had a few other macros for putting a "lying" `th` in scope for those places
I really needed to cross threads (`CROSSTH` for "I am accessing thread X's
data from thread Y", `BEFORETH` for "I am touching this thread's data before
any threads were spawned", &c).

Oh, the definition of the `th` argument in functions was baked into a macro so
it looked like `int foo(TH(main), int something, char whatever)`. `TH` also added
some attributes - I know there was `unused`, there might have been others...

Hopefully that gives some sense of it. Let me know if you have further
questions! :)

~~~
versteegen
Thank you for that write-up! Sorry that my response is so slow. (Seems I can
still reply several days later, wonder what the limit actually is)

That's definitely both a nice solution to the problem, and I found it pretty
enlightening (having sadly not used anything as strongly typed as Haskell).

In the case of global variables, if performance didn't matter then it would be
possible to just pass around a pointer to a struct of the thread-local/thread-
locked data for use directly. I spent a while trying to come up with
alternatives to or generalisations of your approach, and found it was really
hard to match the same guarantees without a run-time cost. One idea I got from
your description is putting access to certain data/functions in scope in a
type-checked way. So you could put multiple such contexts (pointers or other
access methods like functions or even macros) in scope at the top of the
function/block, rather than having to type-check every access throughout the
function using an implicit control (e.g. the `th` variable). But really,
that's just equivalent to using the L macro like so:

    
    
      void inc_foo(video_thread_t th) {
         video_thread_data_t *vid = &L(video_data);  // checked global access
         vid->foo++;
         vid->bar++;
      }
    

I think something like this looks more natural though.

------
ScottBurson
(Somewhat OT, but...)

 _Constructive logic accepts the truth of_ every integer is either even or odd
_, but only because we have an effective means of determining which
alternative holds for any integer. It does not accept the statement_ every
real number is either rational or irrational _; given a real number, say as a
convergent series, we have no effective means of determining whether or not it
is rational._

I don't understand this. Consider the smallest number N so that any N points
in the coordinate plane in general position contain a subset of 7 points that
form the vertices of a convex polygon (taken from [0], recently discussed here
on HN): is this number, which is definitely an integer, even or odd? We don't
know, because we don't know what the number is. Isn't that the same situation
we are in with regard to the convergent series? What's the difference? If we
actually computed the decimal expansion of the value of the convergent series,
we would be able to look at it and see whether it was rational or irrational.
There is the minor difficulty that the expansion is in most cases going to be
infinite, but so what? We're talking about mathematics here, not computation.

[0] https://liorpachter.wordpress.com/2015/09/20/unsolved-problems-with-the-common-core/ , "Fifth grade" section

~~~
spdionis
I don't think that the expansion being sometimes infinite is a minor
difficulty. In fact it sounds pretty major, similar to the halting problem.
Unless you can prove a priori whether or not it's infinite.

I'm not knowledgeable about this topic but the quote suggests you can't prove
that.

~~~
dllthomas
_" In fact it sounds pretty major, similar to the halting problem."_

I think it's _identical_ to the halting problem. The issue is that, for
irrational numbers, there is _never_ a point at which you can stop and say
"okay, we know it is irrational now".

In fact, if we had an algorithm that could _always_ tell us whether a sequence
converged to a rational number, we could solve the Halting Problem:

For any Turing Machine you wish to check for halting-ness, produce a sequence
S by successively appending a full encoding (any will do) of the Turing
Machine state at every step to the decimal expansion of the previous value in
S. Apply your algorithm to S. If S is irrational, then we know it never halts.
If S is rational, then we know it halts or repeats after a finite number of
digits - so we can simply run it until it does one or the other (this may take
arbitrarily long, and an arbitrarily large amount of storage, but _finitely_
in both cases).

~~~
ScottBurson
Everyone has gotten distracted by the infinite expansion and has missed the
point of my question. Let me try again, with no infinite expansion.

The constructivist position is to refuse to accept the statement "every real
number is either rational or irrational" because they're not in possession of
proofs, for each possible real-valued expression, that its value is rational
or, alternatively, a proof that its value is irrational. And yet, according to
the quoted paragraph, they accept the statement "every integer is either even
or odd", quote, "because we have an effective means of determining which
alternative holds for any integer". My argument is that they should reject
this statement on the same grounds: I named an integer whose value is not
known, and therefore it is not known whether it is even or odd. There is not,
in fact, an effective procedure for determining the evenness or oddness of the
value of an arbitrary integer-valued expression. It's only after one actually
knows the value that answering the question becomes easy.

~~~
pavpanchekha
Constructivists don't accept that you've named an integer. They would say: an
integer is any member of the smallest _constructive_ set that contains 0 and
one more than any member of that set. Since the set is a constructive set, if
you have anything you claim to be an integer, you must be able to either prove
that it is 0 or one more than an integer. The number you named isn't zero, so
presumably it must be more than some integer ("one less than the number you
named"). But if I then ask you whether that number is one more than an
integer, and then the same about that one, and so on, at some point you will
be unable to answer me. Since you can't provide the evidence that qualifies
something as an integer, there's no reason for me to think it is an integer.
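That inductive reading can be written down directly in a proof assistant. A
rough Lean 4 sketch (my own, not from the thread; the `omega` tactic is
assumed available in the core library):

```lean
-- The naturals as the smallest set containing zero and closed under
-- successor; every inhabitant carries its own construction.
inductive N where
  | zero : N
  | succ : N → N

-- The "effective means" for parity: a total function by structural
-- recursion over that construction.
def even : N → Bool
  | .zero   => true
  | .succ n => !(even n)

-- Constructively, every natural is even or odd, with an explicit witness.
theorem even_or_odd (n : Nat) :
    (∃ k, n = 2 * k) ∨ (∃ k, n = 2 * k + 1) := by
  induction n with
  | zero => exact Or.inl ⟨0, rfl⟩
  | succ n ih =>
    cases ih with
    | inl h => cases h with
      | intro k hk => exact Or.inr ⟨k, by omega⟩
    | inr h => cases h with
      | intro k hk => exact Or.inl ⟨k + 1, by omega⟩
```

A non-constructive "smallest N such that..." definition provides no such
witness, which is exactly the objection above.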

~~~
ScottBurson
I see. The difference, then, between that situation and the situation where I
provide a real number named as the sum of an infinite series is that in the
latter case I may be able to provide also the modulus of convergence, thus
proving that the number is a constructive real number [0]. This is different
from a situation where I have a number defined non-constructively and no
series is known that converges on the number.

[0]
https://en.wikipedia.org/wiki/Constructivism_%28mathematics%29

------
Kutta
A bit quaint in the light of the past 15 years. I find it amusing that the
paper uses the phrase "type-based formalisms" instead of "type theory".

