
Let the Type System do the Work - Garbage
http://javadocmd.com/blog/let-the-type-system-do-the-work/
======
benaiah
The following is based on conjecture, as I'm not old or well-read enough to be
sure of this, but it seems to me that the original purpose of type systems got
muddled up by the tremendous popularity, mostly in Windows systems, of the
"Hungarian" variable naming style. When your "type system" consists of adding
three letters to the beginning of a variable name, you don't have a way to
make descriptive types. Java, in most of the ways I've seen it used, is
essentially Hungarian notation enforced as a language feature.

I realize that variable naming and type systems are two different things, but
it seems many programmers never realized the point of type systems (expressing
the type for the _programmer's_ sake) because they only ever saw it used to
distinguish abstract primitives. For a long time, I had trouble understanding
that Hungarian notation wasn't a type system, because they seemed to do
precisely the same thing - that's how limited my understanding of types was.
The kind of style seen in this article was alien to me for a long time, but it
was enlightening to realize that the type system is there to _help me out_ ,
not just make me type a bunch of unnecessary crap.

tl;dr: I really need to learn Haskell.

~~~
jarrett
I think what you're getting at is the distinction between this:

    
    
      float x;
    

and this:

    
    
      kilogram x;
    

The former, as you say, only tells you about the underlying representation. It
says "this is a float, so the computer should store it in such and such a
way." That's fine, but it doesn't tell us enough.

The latter example is far more useful. In my imaginary language, the
declaration implicitly tells the computer to store the value as a float,
because the kilogram type has been defined as such elsewhere. But that's not
all it does! It tells us and the compiler that this float _represents a
real-world quantity measured in kilograms._ It prevents us from mistakenly
passing kilograms where pounds were expected, or seconds where kilograms were
expected.
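
A rough Python sketch of the same idea, assuming hypothetical `Kilograms` and `Pounds` wrapper classes and a made-up `shipping_cost` function (none of these names come from the article). Python won't reject the bad call before the program runs the way the imaginary language would; a static checker such as mypy would flag it from the annotations, and the `isinstance` guard here is only a runtime stand-in for that check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kilograms:
    """A mass in kilograms; the float is only the representation."""
    value: float


@dataclass(frozen=True)
class Pounds:
    """A mass in pounds; same representation, different meaning."""
    value: float


def shipping_cost(mass: Kilograms) -> float:
    """Hypothetical function that only makes sense for kilograms."""
    if not isinstance(mass, Kilograms):
        raise TypeError(f"expected Kilograms, got {type(mass).__name__}")
    return 2.5 * mass.value


cost = shipping_cost(Kilograms(4.0))  # accepted: the unit matches

try:
    shipping_cost(Pounds(4.0))  # wrong unit: rejected instead of silently computed
    rejected = False
except TypeError:
    rejected = True
```

Both wrappers hold the same float, yet the two values never compare equal and cannot be swapped for each other by accident.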

On the Haskell front, you might be interested in the Dimensional library,
which does just that. It also works elegantly with multiplying and dividing
units. E.g. if you have a miles value and an hours value, you can divide to
get a miles per hour value.

~~~
steveklabnik
You can even take this a step further: using `float` instead of `kilogram` is
leaking an implementation detail. It's (probably) too low-level of information
to actually be useful.

Luckily, more and more programming languages are making creating these simple
types easier. Haskell:

    
    
        newtype Kilogram = Integer
    

Rust:

    
    
        struct Kilogram(i64);
    

In fact, this blog post, while a bit outdated, shows an application of this
idea, to solve string encoding issues for HTML templating:
[http://bluishcoder.co.nz/2013/08/15/phantom_types_in_rust.ht...](http://bluishcoder.co.nz/2013/08/15/phantom_types_in_rust.html)

~~~
sparkie
Nitpick: the Haskell version should be

    
    
        newtype Kilogram = Kilogram Integer
    

The issue with doing this, of course, is that it's mostly useless for
computation. You lose the ability to use mathematical operators because you're
no longer an instance of Num, and even if you create the Num instance, or use
-XGeneralizedNewtypeDeriving, you can't multiply a Kilogram by a Meter/Second
for example, since the arguments to (*) must be of the same type. One would
need to use a generic "Measure" type instead, where the unit is some metadata
attached to it, and the Num instance implements the typechecking on units.
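
To make the generic "Measure" idea concrete, here is a minimal sketch in Python rather than Haskell. The `Measure` class and its `dims` tuple of (kg, m, s) exponents are assumptions made for illustration, and the unit checks run at runtime, so this only approximates what a Num instance or a library like Dimensional would give you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """A value plus unit metadata, stored as exponents of (kg, m, s)."""
    value: float
    dims: tuple  # e.g. (1, 0, 0) for kg, (0, 1, -1) for m/s

    def __add__(self, other: "Measure") -> "Measure":
        # Addition only makes sense between identical units.
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self.dims} to {other.dims}")
        return Measure(self.value + other.value, self.dims)

    def __mul__(self, other: "Measure") -> "Measure":
        # Multiplication is allowed across units: the exponents add up.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Measure(self.value * other.value, dims)


mass = Measure(2.0, (1, 0, 0))       # 2 kg
velocity = Measure(3.0, (0, 1, -1))  # 3 m/s
momentum = mass * velocity           # 6 kg*m/s

try:
    mass + velocity  # kg + m/s is meaningless
    mixed_add_allowed = True
except TypeError:
    mixed_add_allowed = False
```

The trade-off is the one described above: multiplication works across units because both operands share one type, but a unit mismatch in addition is only discovered when the code runs.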

~~~
jarrett
You're quite right about the limitations imposed when you make your own
dimensional types like that. It turns out the problem isn't trivial. Which is
why I prefer to use a library like Dimensional. With Dimensional, you get
plenty of units out-of-the-box, you can define your own if needed, and you can
perform arithmetic in a fairly natural way.

------
rossjudson
The type system is what the compiler (or interpreter) knows about your
program. Sophisticated type systems know a lot, and can tell you a lot.
Sophisticated does not mean typing a lot.

When building code for the long term, you really want to focus on techniques
that make it _impossible_ to misuse an API, whenever it's possible to do that.
This article nicely calls out one such technique.

Generic typing can be used to (try to) force the consumer of an API into
correct usage. Judicious use of _final_ and _abstract_ in the land of Java can
be used to force/guide eventual overriders of an abstract class down the right
path -- they'll get compilation errors if they don't at least implement the
right methods.
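
For comparison, a rough Python analogue using the standard `abc` module. The `Shape`, `Square` and `Broken` classes are hypothetical, and the failure shows up when the incomplete subclass is instantiated rather than at compile time as it would in Java.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical base class: subclasses must say how to compute area."""

    @abstractmethod
    def area(self) -> float:
        ...

    def describe(self) -> str:
        # Shared behaviour that relies on the subclass-provided method.
        return f"{type(self).__name__} with area {self.area():.1f}"


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


class Broken(Shape):
    """Forgets to implement area()."""


description = Square(2.0).describe()

try:
    Broken()  # abstract method missing: instantiation is refused
    instantiated = True
except TypeError:
    instantiated = False
```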

Whenever you find yourself writing an assert method, take a beat and figure
out if the type system could have turned that into a compilation error
instead.

~~~
dllthomas
_"The type system is what the compiler (or interpreter) knows about your
program."_

Compilers know a lot more than just what is represented in "the type system"
(as it's typically thought of, anyway). Control flow (basic block analysis,
&c) for instance. Which isn't really taking away from your larger point -
certainly, things represented in the type system are things the compiler
_will_ know about.

_"Whenever you find yourself writing an assert method, take a beat and
figure out if the type system could have turned that into a compilation error
instead."_

Possibly with a `_Static_assert` (or `static_assert`) as of C11 (/C++11)!
Obviously there are substantial limitations still, but it's great to be able
to pull more to compile time cleanly.

------
sgarlatm
This discussion reminds me of a 2005 article from Joel Spolsky called "Making
Wrong Code Look Wrong". It has a similar point to this article: there are
things we can do as programmers to make our code less error prone. It's
definitely worth a read if you're interested in this topic and haven't read
Joel's article before.

[http://www.joelonsoftware.com/articles/Wrong.html](http://www.joelonsoftware.com/articles/Wrong.html)

------
exabrial
Type systems in languages make me less productive.

Unless I have to provide support, add features, scale my application, write
bug free code, hire additional developers, or explain my thought process to
anyone else.

~~~
axman6
Right, it's well known that type systems are essentially useless, except when
you need to write high quality code, and do all those things you mentioned.
Basically they're unnecessary (but only if you use the term unnecessary very
literally).

------
hcarvalhoalves
> val player = new Player(new Vector2(100, 100), new Vector2(50, 50))

> No problems here. The first noticeable crack in the system comes after a few
> weeks vacation away from this code. You come back, and you want to change
> the starting point of the player. Will you remember which Vector2 to change?
    
    
        player = Player(position=Vector2(100, 100), size=Vector2(50, 50))
    

The problem here is not typing; it's the lack of named arguments.

~~~
TheLoneWolfling
I'd disagree with you: named arguments are good, but they don't save you in a
lot of cases.

For example when you have one function returning (pos, size) and another
function expecting (size, pos).
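
A small Python sketch of that case, assuming hypothetical `Position` and `Size` types plus made-up `load_layout` and `make_player` functions. With two distinct types the swapped call becomes detectable; a static checker would report it from the annotations, and the `isinstance` guard stands in for that check at runtime.

```python
from typing import NamedTuple, Tuple


class Position(NamedTuple):
    x: float
    y: float


class Size(NamedTuple):
    width: float
    height: float


def load_layout() -> Tuple[Position, Size]:
    """Hypothetical producer: returns (position, size)."""
    return Position(100, 100), Size(50, 50)


def make_player(size: Size, position: Position) -> dict:
    """Hypothetical consumer: expects (size, position)."""
    if not isinstance(size, Size) or not isinstance(position, Position):
        raise TypeError("arguments are in the wrong order")
    return {"size": size, "position": position}


pos, size = load_layout()
player = make_player(size, pos)  # correct order

try:
    make_player(*load_layout())  # (position, size) passed as (size, position)
    swapped_ok = True
except TypeError:
    swapped_ok = False
```

With two bare `Vector2` values, both calls would have been accepted.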

~~~
hcarvalhoalves
Named arguments are syntactic sugar for packing/unpacking data structures in
some languages, so you can leverage that. In Python:

    
    
        >>> from collections import namedtuple
        >>> Vector2 = namedtuple('Vector2', ('x', 'y'))
        >>> PlayerOptions = namedtuple('PlayerOptions', ('position', 'size'))
        >>> opt = PlayerOptions(position=Vector2(100, 100), size=Vector2(50, 50))
        >>> Player(**opt._asdict())
    

The order in which arguments are passed doesn't matter anymore. You can also
enforce this calling convention with a constructor signature like this (in
Python 3):

    
    
        >>> class Player(object):
        ...     def __init__(self, *, size, position):
        ...         self.size = size
        ...         self.position = position

~~~
q845712
I agree that in languages without compiler-defined types, but with other tools
such as named arguments, the class of errors described here can to some extent
be avoided by making good use of those named arguments. I'm gonna do this more
often in my Python code!

------
sgt101
Yus.

Dynamically typed languages are expressive, quick and fun.

But strict static typing is like a seat belt, it's annoying, but it just might
save your life.

Also ceremony code is there for a reason; that reason is not to annoy you, it
is to make sure that those who come after you know what it is that you have
done and why.

Also: comments.

Also: documentation.

Also: UML.

~~~
masklinn
> But strict static typing is like a seat belt, it's annoying, but it just
> might save your life.

The problem with saying "static typing" without further precision is that on
one axis it ranges from C, where the seatbelt is made of paper, to ATS or Idris,
where the "seatbelt" is a zero-zero ejection seat, and on another axis it
ranges from Java, where you have to braid the seatbelt from raw fibers any
time you sit down, to MLs or Haskell, where the seatbelt magically appears
around you.

~~~
seanmcdirmid
> MLs or Haskell where the seatbelt magically appears around you.

Well, only if you don't need nominal subtypes.

~~~
DanWaterworth
Honest question; Could you give an example where you need nominal subtypes?

~~~
seanmcdirmid
If you like doing real object-oriented programming, then nominal subtypes are
mighty useful.

~~~
spacemanaki
Well here's another honest question: what's real object-oriented programming
and where can I learn more about it?

~~~
seanmcdirmid
There is an entire lineage of real object-oriented languages with nominal
types, starting from the dynamically typed Smalltalk (research on which is
where we learned that H&M was horribly ill-suited to OOP in the first place).
Or if you prefer, just look at how you use your natural language to name
things and reason about the world.

~~~
acjohnson55
On the other hand, so many people have abandoned the most common forms of OOP.
See for example the story from yesterday [1]. Many writers have written about
how brittle object hierarchies are in anything nontrivial, [2] for example.

I will say though that newer models of OOP, focusing on interfaces and
composability of behavior, seem to have quite a bit of steam left in them.

[1]
[https://news.ycombinator.com/item?id=7618933](https://news.ycombinator.com/item?id=7618933)

[2] [http://raganwald.com/2014/03/31/class-hierarchies-dont-do-
th...](http://raganwald.com/2014/03/31/class-hierarchies-dont-do-that.html)

~~~
seanmcdirmid
Don't count OOP as being dead yet, the FP stuff is useful (I use it myself),
but we have our own better abstractions coming out, like having more fun with
mixin-style inheritance. The FP + OOP crowd is pretty vocal, I'll give you
that.

------
205guy
One language that enforces strong typing is Ada. It has compile-time and
run-time type checking. It also has many other features to make the code less
error-prone and more maintainable. It was used by the DoD and also in critical
applications, such as nuclear power plants. The early versions of Java
reminded me very much of Ada.

~~~
axman6
As far as I know, there's no runtime type checking of Ada, but there is
runtime value checking to ensure values are within the specified ranges/adhere
to the specified predicates (a very cool feature; you can specify that a type
is always even, and encountering an odd value will cause an exception).
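
For readers who don't know Ada, a rough runtime analogue of the "always even" predicate in Python. The `EvenInt` class is hypothetical, and unlike Ada the check here only runs when a value is constructed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvenInt:
    """Hypothetical wrapper whose invariant is 'the value is even'."""
    value: int

    def __post_init__(self) -> None:
        # The predicate is checked once, when the value is constructed.
        if self.value % 2 != 0:
            raise ValueError(f"{self.value} is not even")


ok = EvenInt(4)

try:
    EvenInt(3)  # odd value: construction fails
    odd_accepted = True
except ValueError:
    odd_accepted = False
```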

~~~
darkestkhan
Oh, there is runtime type checking, though usually it is optimized out at
compile time: it is for subtypes (which is mostly range checks) and checks of
tags for tagged types (usually in class wide subprograms). Though both of them
are often optimized out.

------
bglazer
From a performance point of view, is there a significant penalty associated
with these "small types"?

For example, kilogram as wrapper around float?

~~~
kasey_junk
The way these are implemented, the short answer is yes. There will be more
memory pressure on the garbage collector in Scala. The long answer is "mmmm
maybe, you should test it", because with escape analysis, eden generation
collection speed, JIT, etc., it can get very complicated very fast. It can
also be very dependent on what you mean by "significant penalty" and
"performance".

As an aside, Scala does allow for compile-time-only small types in the form of
AnyVals. They won't help in the examples in the article, but in the simpler
case of a kilogram wrapper around float they will, and there will be no
performance penalty as the wrapper will be elided by the compiler.

~~~
Terr_
Or to reuse an adage: "It's easier to optimize correct code than to correct
optimized code."

You can always find the slow part and change it to use raw floats/ints/etc.

------
Serow225
See also this blog post about using single-member structs in C to
achieve similar goals: [http://spin.atomicobject.com/2014/03/25/c-single-
member-stru...](http://spin.atomicobject.com/2014/03/25/c-single-member-
structs/)

~~~
NAFV_P
I came across this a few weeks ago, I think the first example is to do with
arrays being second class.

On the other hand, if you give data to a C-programmer, they can locate its
position in RAM. Give an assembly-programmer a simple data-type, they can
point to a specific register(s) in the processor.

------
charlieflowers
The general principle makes a ton of sense: let each line of your code express
as much of your intent as possible. That way, when you have a _huge_ codebase,
there are all these useful little "hooks" throughout it that will be useful
when you need to refactor.

Your codebase will "say" more, and therefore your tools will "know" more, and
therefore your tools can _do_ more for you.

~~~
charlieflowers
Oh, but one more thing. It's very nice if the abstractions you use to do this
are resolved down to nothing at runtime. So there's no penalty at runtime for
the precision you added to the code. Let the compiler do the hard work so that
you can say "Position", but at runtime, the code is as if you had written
"Float".
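
Python's `typing.NewType` is one small illustration of this, sketched here with a hypothetical `Position` type and `move_to` function. A static checker treats `Position` as distinct from `float`, while at runtime calling `Position(...)` simply hands back the float it was given.

```python
from typing import NewType

# Hypothetical name; a static checker treats Position as distinct from float.
Position = NewType("Position", float)


def move_to(target: Position) -> float:
    """Hypothetical function that should only ever receive a Position."""
    return target * 2.0


p = Position(1.5)    # at runtime this is just the float 1.5
result = move_to(p)
```

The call to `Position(...)` itself is still a function call, so this is close to free rather than literally free.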

------
ZenoArrow
F# allows you to do these 'small types' in a few different ways, including
unit annotations... [http://fsharpforfunandprofit.com/posts/units-of-
measure/](http://fsharpforfunandprofit.com/posts/units-of-measure/)

------
jackcarter
I've seen this paradigm called "Tiny Types" before:
[http://darrenhobbs.com/2007/04/11/tiny-
types/](http://darrenhobbs.com/2007/04/11/tiny-types/)

~~~
darkestkhan
I'm actually surprised that people are only learning about this now... It is
quite a standard (and very common) thing to do in Ada. But then we have been
able to write "subtype Strength is Natural range Natural'First .. 32;" since
the 80s (a numeric type with all operators defined and a permitted range of
values from 0 to 32; range checks are inserted by the compiler and performed
at runtime, but most of them are optimized out anyway, so the penalty for them
is in the single-digit percents of performance). Or if you prefer an example
from that site: "type First_Name is new String;"
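
For those who don't read Ada, here is roughly what the Strength subtype buys you, sketched in Python as a hypothetical `Strength` class built on `int`. It is a loose analogue only: the range check runs at construction, and arithmetic results fall back to plain `int` instead of being re-checked the way Ada would on assignment.

```python
class Strength(int):
    """Hypothetical analogue of `subtype Strength is Natural range 0 .. 32`."""
    FIRST, LAST = 0, 32

    def __new__(cls, value: int) -> "Strength":
        # The range check happens at construction time, at runtime.
        if not cls.FIRST <= value <= cls.LAST:
            raise ValueError(f"{value} not in {cls.FIRST}..{cls.LAST}")
        return super().__new__(cls, value)


s = Strength(30)
total = s + 1  # arithmetic is inherited from int, but yields a plain int

try:
    Strength(total + 2)  # 33 is out of range
    in_range = True
except ValueError:
    in_range = False
```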

------
iaskwhy
While I agree with the idea of the article, what's the logic for not using
"small types" for all parameters? Or, to put it another way, how do you decide
a given method should have "small types" used for parameters?

Also, although it makes the code much more verbose, named parameters (as used
in, for example, Objective-C) fix any eventual bug with refactoring methods'
signatures, right?

~~~
Terr_
> Or, to put it another way, how do you decide a given method should have
> "small types" used for parameters?

I think a good rule of thumb is to ask yourself what the "dimension" or "unit"
the parameter has. (See also: Dimensional Analysis.) You never want to pass
4.33f radians into a function expecting 4.33f newtons!

A few more examples:

Distance in meters, Distance in feet, Speed in m/s, speed in yards/minute,
Acceleration in m/s^2, Acceleration in cm/s^2, Mass in grams, Weight in
pounds, Force in newtons, Force in dynes, Temperature in Celsius, Temperature
in Fahrenheit, Angles in radians, Angles in degrees....

"But I don't do physics programming!", you say? Well, there's plenty more:

Time in seconds, Time in milliseconds, Distance in pixels, Distance in inches,
Numeric ID of a Foo, Numeric ID of a Bar, Bytes in UTF8, Bytes in ASCII, US
currency in dollars, US currency in cents, Euros, interest rate (yearly),
interest rate (monthly)...

And that's not even getting into industry-specific dimensions that might
exist.

~~~
dragonwriter
> Distance in meters, Distance in feet

That brings up an interesting issue -- is the "right way" to do that to have
different types for these, or one type for distance with different factory
functions/methods for different units? E.g.: should "meters" and "feet" be
different types, or should the "measurement" module have both a "meters" and a
"feet" method that return the same ("distance") type?

It seems to me the latter is more conceptually clean.

OTOH, math might be easier to express with the former (particularly in a
language that allows you to define implicit type conversions, so that if you
pass an "inch" value to a function expecting a "cm" parameter, it gets
automatically converted to the appropriate "cm" value.)
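
A minimal sketch of the "one type, several factories" option in Python. The `Distance` class and its method names are assumptions made for illustration; the value is stored in meters internally and converted only at the edges.

```python
from dataclasses import dataclass

METERS_PER_FOOT = 0.3048  # exact, by definition of the international foot


@dataclass(frozen=True)
class Distance:
    """One type for the dimension; the unit only matters at the edges."""
    _meters: float

    @classmethod
    def meters(cls, value: float) -> "Distance":
        return cls(value)

    @classmethod
    def feet(cls, value: float) -> "Distance":
        return cls(value * METERS_PER_FOOT)

    def in_meters(self) -> float:
        return self._meters

    def in_feet(self) -> float:
        return self._meters / METERS_PER_FOOT

    def __add__(self, other: "Distance") -> "Distance":
        # No unit check needed: both operands are already normalized.
        return Distance(self._meters + other._meters)


total = Distance.meters(1.0) + Distance.feet(10.0)
```

Mixed-unit math needs no implicit conversions with this design, at the cost of always paying for normalization when a value is created or read back in another unit.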

~~~
Terr_
Well, there are really three different kinds of information involved. For
example, the variable "target_resist" might have: Computational type (float),
Dimension (resistance), and Unit (milliohms).

Most built-in type-systems (sensibly!) only try to tackle the broadly-
applicable computational-type, since the rest is context-specific.

However, in a contractual sense, all of those are important: if _any_ are not
as expected, you'll get bugs.

------
be5invis
According to the Curry-Howard correspondence, types are propositions and
programs are proofs of those propositions. Every program corresponds to a
deduction procedure, such as a Hilbert-style proof or a natural deduction proof.

------
th3iedkid
Some of the best pieces on type systems (and model theory too!) were from
Benjamin Pierce and his book on type systems.

