
Make the Type System Do the Work - NathanWong
http://nathan.ca/2014/02/type-rich-programming/
======
chrisaycock
This article uses _inheritance_ in C++ to ensure the types line up, but I
have really taken a shine to _algebraic data types_ for my firm's data feed
handlers.

Here is an example in C++11 that implements a market-data spec line-for-line.

    
    
      #pragma pack(push,1)
      
      struct Quote {
        char     symbol[16];
        uint16_t bid;
        uint32_t bidsize;
        uint16_t ask;
        uint32_t asksize;
      };
    
      static_assert(sizeof(Quote) == 28, "Quote size wrong");
      
      struct Trade {
        char     symbol[16];
        uint16_t price;
        uint32_t size;
      };
    
      static_assert(sizeof(Trade) == 22, "Trade size wrong");
      
      struct Data {
        enum class DataType : char {
          kTrade = 't',
          kQuote = 'q'
        };
    
        uint64_t sequence_number;
        DataType data_type;
        union {
          Quote quote;
          Trade trade;
        };
      };
      
      #pragma pack(pop)
      

Now because I've disabled padding in the structs, the types will map directly
to the proper byte locations as specified by the data vendor. There is no need
for me to manipulate bytes!

And I can get away with a single cast:

    
    
      Data* data = reinterpret_cast<Data*>(raw_buffer);
      switch (data->data_type) {
        case Data::DataType::kTrade: {
          Trade& trade = data->trade;
          std::cout << trade.symbol << ' ' << trade.price << ' '
                    << trade.size   << std::endl;
          break;
        }
      
        case Data::DataType::kQuote: {
          Quote& quote = data->quote;
          std::cout << quote.symbol << ' ' << quote.bid << ' '
                    << quote.ask    << std::endl;
          break;
        }
      
        default:
          std::cerr << "Unknown message type: "
                    << static_cast<char>(data->data_type) << std::endl;
      }
    

So now the code resembles a match statement in OCaml, but will actually
compile directly onto the underlying bytes. No data marshaling or anything
like that.

~~~
com2kid
This is incredibly common in embedded programming. Often you'll have some
binary packet come over the wire; the first field is a packet-type specifier,
and you can then just cast the rest of your byte buffer to the proper type.

It is _very_ unfortunate that C (and until recently C++) did not have a way to
specify the size of enums. This has resulted in crappy #defines being used for
far too many years.

I always called what you did above "tagged types". A number of type systems
actually work like that under the covers, objects get a type ID, and then a
lookup function figures out what that type ID maps to.

~~~
qznc
The more common term is "tagged union".
[http://en.wikipedia.org/wiki/Tagged_union](http://en.wikipedia.org/wiki/Tagged_union)

------
dlubarov
I prefer to have a single class which stores an SI unit, as in

    
    
      final class Temperature {
        private final double kelvin;
    
        public static Temperature fromKelvin(double kelvin) { ... }
        public static Temperature fromCelsius(double celsius) { ... }
        public static Temperature fromFahrenheit(double fahrenheit) { ... }

        public double getKelvin() { ... }
        public double getCelsius() { ... }
        public double getFahrenheit() { ... }
      }
    

This avoids the quadratic explosion of conversion methods.

A more complete example -
[https://github.com/dlubarov/daniel/blob/master/data/src/dani...](https://github.com/dlubarov/daniel/blob/master/data/src/daniel/data/unit/Duration.java)

~~~
hrjet
But this solution doesn't solve the original problem described in the article.
With your API, it is easy for a programmer to use a temperature in Celsius as
a temperature in Kelvin and the type system can't catch it.

~~~
dlubarov
Well, most code can just pass around Temperature objects without dealing with
any particular units. There's no opportunity to mix up units there.

When some code needs to actually get a number out, say for the purpose of
logging, it's possible to screw up with either approach.

With the author's classes, you could do

    
    
      void printTemperature(DegCelsius temperature) {
        log("Temperature in Kelvin: %f", temperature.getDegrees());
      }
    

With this Temperature class, you could do

    
    
      void printTemperature(Temperature temperature) {
        log("Temperature in Kelvin: %f", temperature.getCelsius());
      }
    

As the author says, "wrong code should look wrong." I think both of these look
pretty wrong.

~~~
hrjet
With the author's code you can define a print() function inside each
temperature class, so that there is no mixup of units during printing. Or let
the class supply its suffix instead of hardcoding it in an external print()
function. It might not always be possible to follow that technique, but in
general it is better (according to the article, and I agree) to retain as
much information at compile time as possible.

------
rcfox
The Boost.Units[1] library takes this even further, allowing you to do things
like divide a length by a time quantity and get a velocity, again with
compile-time type checking.

[1]:
[http://www.boost.org/doc/libs/1_55_0/doc/html/boost_units.ht...](http://www.boost.org/doc/libs/1_55_0/doc/html/boost_units.html)

~~~
EdwardDiego
F# has something similar:
[http://en.wikibooks.org/wiki/F_Sharp_Programming/Units_of_Me...](http://en.wikibooks.org/wiki/F_Sharp_Programming/Units_of_Measure)

------
eloff
Did anyone else recoil in horror at how a simple function is turned into a
type hierarchy? This is everything that's wrong with Java to me. Classes are
all fine and well, but don't take it to extremes. Maybe the examples were just
too trivial, but it really is a far greater evil than just using well named
variables and a simple conversion function.

~~~
tomlu
It's nice how in Go you can alias a type to int, yet it is its own distinct
type with compile time checks and explicit conversions. Definitely takes away
all the boilerplate.

~~~
pjmlp
> Definitely takes away all the boilerplate.

Until you need to write generic code. Then it is casts everywhere and
boilerplate to satisfy interfaces.

~~~
drbawb
>... and boilerplate to satisfy the interfaces.

Go actually provides facilities for embedding types which helps remove some of
the unnecessary boilerplate from satisfying an interface.[1]

Through embedding you can leverage the implementation of an existing type to
satisfy an interface w/o writing methods to dispatch to that underlying type.

[1]:
[http://golang.org/doc/effective_go.html#embedding](http://golang.org/doc/effective_go.html#embedding)

~~~
pjmlp
I used Go quite a lot until version 1.0 got released.

You are not explaining anything new to me.

------
coolsunglasses
To take this to the current state of the art, look at Haskell, Agda, and
Idris. The latter two are dependently typed.

Steps for learning Haskell:

[https://gist.github.com/bitemyapp/8739525](https://gist.github.com/bitemyapp/8739525)

~~~
MaxGabriel
Any recommendations for web programming in Haskell?

~~~
tel
I highly recommend taking a look at the Happstack Tutorial. There's both a
Lite version [1] and a full Crash-Course [2].

[1] [http://happstack.com/page/view-page-slug/9/happstack-lite](http://happstack.com/page/view-page-slug/9/happstack-lite)

[2] [http://happstack.com/docs/crashcourse/index.html](http://happstack.com/docs/crashcourse/index.html)

------
skybrian
This is a good trick but it needs to be used judiciously or you'll end up with
lots of boilerplate.

I particularly like Go's type system, though, because it makes it very simple:

    
    
      type celsius float64
    

There's no overhead and it doesn't attempt to prevent you from doing any
calculations as a float64; the type checking just prevents direct assignments
from a celsius type to another type.

~~~
hdevalence
On the other hand, Go doesn't have a way to write code that abstracts over
types, and my (albeit limited) experience with Go is that most of the
boilerplate is due to the language's poor type system. If you use a language
with a better type system (Haskell is the first that comes to mind, though
there are others), you can actually do this sort of thing without the
boilerplate code.

~~~
sparkie
Haskell still has some boilerplate - actually quite a lot if you don't use
-XGeneralizedNewtypeDeriving (which has known problems when used with other
extensions). With newtype deriving, this is about as simple as it gets:

    
    
        {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    
        class Degrees deg where
          toK :: deg -> K
          fromK :: K -> deg
    
        newtype K = K { unK :: Double } deriving (Show)
        instance Degrees K where
          toK = id
          fromK = id
    
        newtype C = C { unC :: Double } deriving (Show)
        instance Degrees C where
          toK = K . (+ 273.15) . unC
          fromK = C . (flip (-) 273.15) . unK
    
        newtype F = F { unF :: Double } deriving (Show)
        instance Degrees F where
          toK = K . (+ 273.15) . (* (5/9)) . (flip (-) 32) . unF
          fromK = F . (+ 32) . (* (9/5)) . (flip (-) 273.15) . unK
    

Without newtype deriving (or other generic deriving extensions), one would
need to implement instances of {Num, Real, Fractional, RealFrac, Floating,
RealFloat} for all three types, which do nothing but map to the Double
equivalents.

~~~
tel
GeneralizedNewtypeDeriving's problems are fixed in the newest GHC which just
had its first RC a few days ago. It introduces a "roles" system which tracks
which types are allowed to use GND without breaking abstraction boundaries.
Roles are mostly invisible to end users as well.

------
jeffdavis
"'Dog is-a Mammal' ... actually fairly sound"

I disagree in the context of OOP inheritance.

(This example uses shapes, because the idea of mutating mammals gets a little
strange.)

Let's say you have a Circle class and an Ellipse class. A Circle inherits from
an Ellipse, of course, because a Circle is-a Ellipse. By specializing as a
Circle, we get extra reader methods such as getRadius(). Great.

But what about mutators? For an Ellipse, it may have a mutator called squash()
that holds the major axis constant and halves the minor axis. But that leaves
us with a problem, because circles can't be squashed while remaining circles.

Let's say we have a program like:

    
    
        void f1() {
          Circle myCircle(10.0);
          f2(myCircle);
          cout << myCircle.getRadius() << "\n";
        }
        void f2(Ellipse &e) {
          if (...)
            e.squash();
        }
    

Inheritance says that f2 must also accept a Circle. But what happens when we
try to squash it? We can either throw a runtime error there, or we could let
the squash succeed and the getRadius() in f1() will fail.

Although a Circle _value_ is an Ellipse _value_ , a Circle _variable_ is not
an Ellipse _variable_. In fact, inheritance for variables should go in the
opposite direction as inheritance for values, because an Ellipse variable can
certainly hold a Circle value.

~~~
zvrba
The circle vs ellipse problem is rather artificial and, if anything, it shows
that modeling with types and everyday intuition are two different things.

If the rest of your program can handle general ellipses, it should also be
able to handle ellipses having the same minor and major radius (i.e.,
circles). The obvious solution is to not have the Circle class and _maybe_
equip Ellipse with an IsCircle() method. (Though, why would you care?!)

See the C++ FAQ Lite, items 21.6 -- 21.8. (A quote from 21.8: "Here's how to
make good inheritance decisions in OO design/programming: recognize that the
derived class objects must be substitutable for the base class objects. ")

~~~
judk
The circle/ellipse problem is real and shows how mutability ruins people's
intuitions about models and relationships. Note that mutability ruins programs
in the same way, introducing subtle inconsistencies and invariant violations.

------
EdwardDiego
> The only problem is that you end up with a variable that fails to describe
> itself better than “I’m a number”.

Our application servers need to know about updated entities, so we broadcast
notifications when an entity has been updated - but we prefer to leave it up
to the individual applications to retrieve the updated entity as they see fit.

So we're often dealing with numerical ids of entities, and we've had a few
bugs arise from code like so

    
    
       public void something(long accountId, long websiteId)
    

When people have accidentally used the wrong long in the wrong place. So we've
now moved to using simple wrapper types around aforementioned longs -

    
    
        public void something(AccountId accountId, WebsiteId websiteId)
    

It does come at a little bit of cost as we're passing these things around a
messaging system, so now we have to think about how they are serialized (Java
serialization is problematic, and there are occasional religious wars about
GPB vs. JSON), but it leverages the type system to make it very explicit what
this number represents.

~~~
eropple
This method also creates a lot of object churn. For reasonably small systems
that can be okay, but you have to be mindful of it.

~~~
gizmo686
Would it be possible to do this as a type-alias type thing that disappears at
compile time? As far as I am aware, Java has no support for such type
aliasing, but it should be possible to add a pre-compiler phase to your build
process that replaces these types with what they are aliases of. The only part
of this that seems complicated is the type checker, which would need to be
aware of the difference between AccountId and long.

~~~
yareally
I could see it being done with annotations + processing them perhaps, as some
frameworks do for database properties such as index/uniqueness. An object for
every scalar datatype is just overkill, even if it solves a legitimate
problem.

~~~
eropple
You could make it a pre-compile step in Maven, maybe. But it'd be gross.

------
clord
The big problem (other than verbose code) with rolling up stuff into classes
in C++ is that performance suffers. Many platforms will not pass small structs
efficiently.

Inspired by Haskell's newtype, I drafted a proposal for C++14 which would have
introduced native newtype to C++, but it got rolled into an omnibus paper
(n3635) and appears to have been forgotten.

The proposal is modelled on strong enums, but generalizes them to all types.
They are essentially strong typedefs with none of the weaknesses of wrapper
types, restrict pointers, typedefs, and pragma disjoint.

    
    
        newtype NotInt : int; 
        NotInt k(5); // call ‘default’
        NotInt bar(NotInt);
        bar(k); // passes exactly as an ABI int would
        int baz(int&);
        baz(k); // Error! no conversion
    	

While drafting this proposal, I realized it was a much more C++-like way of
bifurcating aliases than 'restrict' keyword:

    
    
        newtype D1 : double; 
        newtype D2 : double; 
    	
        void biz(D1 *x, D2 *y) {
            // Being separate, D1/D2 can be
            // assumed not to alias by optimizers
    	...
        }
    	

While playing with it, we came up with this proposed syntax (although I don't
think this would get through the committee):

    
    
        [template <typename T>]
        newtype [NotT] : [(public|private)] T [= (default|explicit|deleted)] [{
          // only non-virtual member functions, no data or reference members (something like POD)
          // this->value has all of the operations of T, but only aliases with the new type, and implicitly converts to/from T prvalues.
        }] [optional_variable_name];

~~~
gordaco
I don't think there's going to be any performance difference at all compared
to the basic case, which actually is another reason why this is so good.
Ultimately, all these structs just hold a double, which means that in memory
they're exactly a double and nothing more, there's no type tag or anything
like that. And all the code is static so there is no vtable, just the double
value. The compiler does the rest, not the runtime.

(Still, I'd love to see newtypes in C++)

~~~
comex
It can make a difference due to weird ABIs. For example, in the ARM procedure
call standard, 64-bit values can be returned in r0 and r1, but 64-bit
structures can't (no matter whether they contain two 32-bit values or just one
64-bit value). However, I'd call that fairly niche - it's hard to imagine an
application that would notice a significant performance difference.

~~~
com2kid
This is very nice to know actually, and quite relevant to code I am working on
right now!

Disappointing though.

> it's hard to imagine an application that would notice a significant
> performance difference.

One hopes any function call made often enough to be a perf hit from this would
also be designed to be easily inlined!

~~~
clord
Scientific computing will hit this sort of problem just because the kernel is
so large it actually needs to be broken into sub-procedures for icache
reasons. But also, the penalty is large, because there is a full store/load
cycle involved, which will clear caches and increase memory bandwidth
requirements.

------
SuddsMcDuff
The best write up I ever heard for this approach (often called "primitive
obsession" - [http://sourcemaking.com/refactoring/primitive-obsession](http://sourcemaking.com/refactoring/primitive-obsession))
was in Object Calisthenics, an essay by Jeff Bay in The ThoughtWorks Anthology
([http://www.amazon.co.uk/The-ThoughtWorks-Anthology-Technology-Programmers/dp/193435614X](http://www.amazon.co.uk/The-ThoughtWorks-Anthology-Technology-Programmers/dp/193435614X)):

 _"an int on its own is just a scalar, so it has no meaning. When a method
takes an int as a parameter, the method name needs to do all of the work of
expressing the intent. If the same method takes an Hour as a parameter, it’s
much easier to see what’s going on. Small objects like this can make programs
more maintainable, since it isn’t possible to pass a Year to a method that
takes an Hour parameter. With a primitive variable the compiler can’t help you
write semantically correct programs. With an object, even a small one, you are
giving both the compiler and the programmer additional info about what the
value is and why it is being used.

Small objects like Hour or Money also give us an obvious place to put behavior
that would otherwise have been littered around other classes."_

------
carlob
It might be because I only use C family languages for very low level highly
optimized stuff, but I found it very weird that a blog post about types would
force all those useless casts from int to double.

9 / 5 in lieu of 9.0 / 5.0 (or 1.8) will come back and bite you!

------
taeric
I think the major problem this runs into is that it doesn't necessarily help
with the actually hard parts of programming.

That is, sure, mistakes have been made with units. More mistakes are made in
other areas, though. Usually amusingly more basic areas.

The question seems to be whether encoding at the type level will help with
these areas. It seems the conflict is the idea that fully proving a solution
is superior to partially proving it.

Laudable, to be sure, but anyone that has attempted to prove that quick sort
works knows, it isn't easy. Just understanding the shorter versions of some
programs takes a fair bit of brain power as it is. Understanding the fully
specified version can be orders of magnitude more complicated.

So, I am left wondering if it isn't enough to stop at "passes all the tests."
Especially if you have the correct tests.

And eventually one wonders whether the "correct tests" includes more stringent
types. Maybe it does. Maybe it doesn't. Likely depends on the situation, is my
bet.

~~~
moron4hire
Type checking always happens at some point in every language, dynamic or
static. At some point, a routine gets executed on some form of data, and the
success of that routine is predicated on the data being of the right type.

In dynamic languages, you end up writing a whole raft of tests that are not
much more than type assertions that are otherwise done for you by the compiler
in a statically typed language.

For statically typed languages, the type check _is_ a type of test that just
so happens to be very succinct and easier to write than it is to skip. In
dynamic languages, it's harder to write a test for a type assertion than it is
to trust the duck is a duck, give it a punch, and hope it quacks.

It makes sense, especially for a reusable library, to try to offload as much
to the static type checker as possible. It always runs and can't be skipped.
As you say, "especially if you have the right tests," a static type checker
forces you to have the "right tests" for at least a small part of your code,
leaving you to write fewer tests for the rest of it.

~~~
taeric
I don't think you really wrote anything I did not acknowledge. Yes, types are
a form of complete test over the shape/type of data that they encode. If you
can encode your data with the correct type, it will go a long way to making
sure you don't have errors in the types.

However, I take issue with this statement: "For statically typed languages,
the type check is a type of test that just so happens to be very succinct and
easier to write than it is to skip. " For the simple cases, I agree with this.
But when you see the type gymnastics that some programs go through to
guarantee some traits, it is hard to agree with any claim of "easier to
write." Consider a "true" quicksort in Haskell[1].

My assertion was that the "hard part" of programming is often in the logic,
not necessarily the shape/type of the data. When you see attempts at getting
the logic of a program into the type system, things start to take a turn for
the worse. Really worse when you find you have many types that all encode the
same type of data. (Consider the metric system debate.)

My point about "having the right tests" is that sometimes you only care/have
the knowledge to think about/whatever certain scenarios that are either likely
or guaranteed to happen. Pushing everything in the type system implies that
you had the energy to think of and consider everything. My assertion is often
you don't have that time/energy.

So, sure, you will have the "right tests" in that scenario to an extent, since
you will have all tests. However, I wonder if most programs couldn't work out
with a much smaller subset of those tests.

A direct analogy for the point I am making is physics. Classical/Newtonian
physics break down and don't work at a level. Worse, they rely on a lot of
simplifications and actually describe what is happening moreso than how/why.
Yet, for a large portion of calculations that people will do, they work. I
grant that encoding all of the assumptions into the calculations will make
them more accurate, but often you don't need that level of accuracy. And they
definitely make them harder to understand.

[1]
[http://www.haskell.org/haskellwiki/Introduction/Direct_Trans...](http://www.haskell.org/haskellwiki/Introduction/Direct_Translation)

~~~
tel
The true quicksort example you linked has very little to do with Haskell's
type system and everything to do with purity. The qsort version at the bottom
of that page is also pretty close to the version used in any mutable language,
just with a bit more noise in the syntax. A direct translation to Python might
make the syntax easier to understand, though it should be stated that this is
not supposed to be pretty Python code; it's literally just a syntactic swap.

    
    
        def lscan(ary, p, h, i):
          if (i < h):
            if (p < ary[i]):
              return i
            else:
              return lscan(ary, p, h, i+1)
          else:
            return i

        def rscan(ary, p, l, i):
          if (l < i):
            if (ary[i] < p):
              return i
            else:
              return rscan(ary, p, l, i-1)
          else:
            return i

        def swap(ary, i, j):
          ary[i], ary[j] = ary[j], ary[i]

        def sloop(ary, p, l, h):
          if (l < h):
            l1 = lscan(ary, p, h,  l)
            h1 = rscan(ary, p, l1, h)
            if (l1 < h1):
              swap(ary, l1, h1)
              return sloop(ary, p, l1, h1)
            else:
              return l1
          else:
            return l

        def myqsort(ary, lo, hi):
          if (lo < hi):
            piv = ary[hi]
            i = sloop(ary, piv, lo, hi)
            swap(ary, i, hi)
            myqsort(ary, lo, i-1)
            myqsort(ary, i+1, hi)

~~~
taeric
My point is pretty much about the "a bit more noise in the syntax." And qsort
will only work on array types, something that is typically shunned by people
trying to "do all of the work with the type system." Consider the types for
"NonEmptyList" and such.

Is it neat when these work? Certainly. Do I think they pay obvious dividends
in the effort increase? Not so much. I am open to the argument, but I have yet
to see anything compelling. And with so much traction in other areas, I am
doubtful that it is anything close to the silver bullet that advocates make it
out to be.

~~~
tel
Again, that noise is entirely an artifact of purity and syntactic choices, not
the types.

I'm not sure exactly what you're arguing for here, but I can explain a use
case of NEL: when you consume it you don't have to check it for emptiness.
That means that you can eliminate an entire branch of code, either locally or
in exception catching, or an entire failure mode.

Frequently people just throw those checks in and then assume, trusting their
own eye or documentation to minimize the emptiness checking, but without
expressing that in the type the compiler won't be able to help you not
repeat that check. It's optimal to check it exactly once and people are liable
to check it either 0 or many times.

~~~
taeric
My argument is that as you get more types that cover states or behavior of
your program, one typically finds more code.

Is there an argument that these are beneficial? Of course there is. I just
have not been privy to an example where it carried its weight.

Take the not-empty list. Typically the "cost of checking for empty" is so
negligible that to push that thinking into the compiler just doesn't lead to
any gains. Not even just marginal gains. It is nigh unmeasurable.

Compound this with similar types over the state of any other object, and you
have an explosion of types that becomes unwieldy. Account, NotEmptyAccount,
OverdrawnAccount, DrawingInterestAccount... It gets to be just crazy.
Especially in areas where you don't care about any of those details, but now
have to account for an inheritance tree.

And then you get into the fun of having to logic not just about how the actual
program works, but how the interaction of various types works.

So, you keep saying the syntax I was referring to was due to purity. Which is
kind of the point. These languages use syntax and types to help ensure purity.
Which makes them more to handle for some algorithms. Less for others. And then
just hard to really know the true computational cost for a lot.

Which is all to say that it is a design tradeoff. And should be undertaken as
an informed decision. Not a "let the type system do the work" kind of one.
More of one that "this is so important we need to make sure we can afford the
cost of pushing it into the type system." Or, if you would prefer "this
particular instance is so cheap to push to the type system, we should do so."

~~~
tel
I agree completely that it's a tradeoff. I also only bring up purity to
suggest that other strongly typed, impure languages could solve this with less
trouble.

NEL is a particularly interesting point of compromise. I don't think it
carries its weight very frequently at all, though I have used it very
effectively in a few cases. I also think the burden of type hierarchies as
you're describing isn't much felt in Haskell. You typically don't build large
taxonomies, or taxonomies at all, in a language without subtyping.

The only place where I see taxonomies forming at all is in `lens` and there
they are used to great effect.

~~~
taeric
Apologies for not seeing this. I really need to get better at following HN
threads. (Or not, depending on your take.)

I think we are mostly in agreement. I just have not had the fortune to work in
an area where these have panned out that well. Hopefully I get a chance to
change that someday. (Though, I am currently under the song of the lisp
sirens.)

------
minikomi
Some talk on the racket mailing list was had for "numbers with dimensions[1]"
in which contracts were being bent to do calculations which respected units
[2]. The Frink language was also mentioned [3].

[1] [https://groups.google.com/forum/#!topic/racket-users/mnId0uxRaTU](https://groups.google.com/forum/#!topic/racket-users/mnId0uxRaTU)

[2] [https://github.com/Metaxal/measures#4-dimensions-and-contracts](https://github.com/Metaxal/measures#4-dimensions-and-contracts)

[3]
[http://futureboy.us/frinkdocs/#SampleCalculations](http://futureboy.us/frinkdocs/#SampleCalculations)

------
moron4hire
I did just this very thing in C# a few months ago[1]. The problem I had was
that it gets out of hand very quickly. I wanted to be able to divide distance
by time and get speed out of it, but it is extremely difficult to satisfy all
of the M-to-N relationships between types and still end up being usable.

I ended up abandoning it for the project I'm building, as I wasn't too clear
on what the type of a secant of an angle in degrees should be and nobody on
the interwebs seemed to care enough to take notice or answer my questions.

[1]
[https://github.com/capnmidnight/UnitsOfMeasure](https://github.com/capnmidnight/UnitsOfMeasure)

~~~
platz
F# has this, I wonder how they did it?

~~~
magic_haze
They couldn't do it. The F# compiler does static verification that, say, a
value of type Meter divided by a value of type Second results in a value of
type Meter/Second, where '/' is a custom concept built into the F# compiler.
The .NET runtime knows nothing about these custom types.

~~~
moron4hire
Yeah, that's pretty much exactly the issue I ran into. The problem is that the
standard .NET type semantics can only infer "up" the inheritance hierarchy.

I had gotten really close. I had types like

    
    
        Base
        Length: Base
        Meters: Length
        Feet: Length
        
        Compound<Base, Base>
        Area<T>: Compound<T, T> where T: Length
        SquareMeters: Area<Meters>
        SquareFeet: Area<Feet>
    

and had defined all of my math as generic extension methods of the Base class
so that the types would compose:

    
    
        Compound<T, U> Multiply<T, U>(T a, U b) where T : Base where U : Base
        T Divide<T, U>(Compound<T, U>, U b) where T : Base where U : Base
        

That went a long way towards getting fairly concrete types out of only a few
lines of code. But I couldn't get it the rest of the way. C# explicitly
forbids user-defined typecasts between types in an inheritance chain with each
other, so while multiplying two Meters got me a Compound<Meters, Meters>, it
was not then possible to take a Compound<Meter, Meter> and convert it to
anything like Area<Meters> or SquareMeters automatically.

~~~
platz
Interesting problem! I was wondering why even define things like SquareMeters,
but as you say the issue is reducing the terms that have different forms but
mean the same thing.

~~~
moron4hire
Having a SquareMeters type was meant to reduce complexity. Instead of always
dealing with Compound<Meters, Meters> all the time, it was meant to be a
shortcut.

Hrm, what about namespace aliasing? "using SquareMeters = Compound<Meters,
Meters>"? Since the problem is not the composition of the type but the
verbosity of the deeply composited types, then perhaps it's just about making
a shortcut for the name.

~~~
platz
things can be composed different ways. m/s^2 should equal (m/s)/s

~~~
moron4hire
Okay, I've definitely convinced myself there is no good way to do this without
having to write way too much code over and over again. The problem is
basically equivalent to inserting items into a sorted list when you only have
one operation to do it with. So you are either limited to simple types, or
your code for operations on types explodes trying to define all of the ways
to compare types (in a generic sense) and all the different scenarios for
combinations of types.

This requires a Turing-complete type system.

------
mwsherman
Might be interesting to compare to F#'s units of measure:
[http://msdn.microsoft.com/en-us/library/dd233243.aspx](http://msdn.microsoft.com/en-us/library/dd233243.aspx)

------
imsofuture
To take this even further, look at OCaml's strong, inferred typing.

------
jeffdavis
Does static typing have inherent costs?

For instance, there are some desirable properties often present in dynamic
languages like hot code loading, or the ability to make code updates
independently. Not many statically-typed systems are good at these things, or
at least the static typing doesn't work well across boundaries (like a
network, etc.).

Can these be reconciled? Can something have all of the benefits of, say,
Haskell and Erlang at once?

~~~
tel
There's Cloud Haskell which is an attempt to do Erlang style distributed
concurrency in Haskell. For non-distributed concurrency, you can already get
away with using Haskell's green threads and channels. Furthermore, there's the
beginning of a hot code swapping system in GHC now, but it's not terrifically
popular at the moment---it was built for a particular use at Facebook and
hasn't gotten too far from there.

It's worth talking about why Cloud Haskell is tough, though. The primary issue
is that Haskell doesn't have any kind of transparency into functions---if you
are given a type of `(a -> b)` there's no way to examine what it is on the
inside, no concept of source that could be sent to another node.

This limits the kinds of messages and spawning that can happen in a
distributed Haskell system. The Cloud Haskell project is working to remove as
much of this limitation as possible, though.

------
edderly
In some environments beware of creating new types and objects:
[http://developer.android.com/training/articles/perf-tips.html#ObjectCreation](http://developer.android.com/training/articles/perf-tips.html#ObjectCreation)

~~~
jimmaswell
This seems to paint the JIT as something that hurts performance instead of
helping it. Is that true on Android?

~~~
edderly
It relates to the behavior of the garbage-collection scheme used by Dalvik,
which is a mark-and-sweep approach. When the GC runs, that's all the VM is
doing. It's pointed out that a generational garbage collector would be more
efficient at allocating memory; presumably there is a trade-off in terms of
the overall memory footprint.

------
midas007
Static type checking is double-entry bookkeeping of "what I am/have is what
you are/want." Erlang and Haskell have the most interesting approaches to
typing: optional and extensible, respectively.

------
pointernil
I often wonder how the "tool" named MONADS could help deal with this quite
old problem of what I call

meta-types or semantic types

(I'm sure there is a proper name for this; I'm not a CS/type-theory
professional ;)

Time handling, for example, and often physical units would benefit from much
stronger guarantees at compile time and runtime. And no, OOP and its tools
are not sufficient to handle this... in theory maybe, not in day-to-day
development.

We need to translate the core ideas of a monad into a more "mainstream"
syntax and runtime environment to prevent the next incident of:

I'll just add the integer 365 to my certificate valid-date within our core
cloud infrastructure code...

I'm not necessarily an MS fan, but I would love to see how Anders Hejlsberg,
for example, would tackle this kind of language-design challenge ;)

~~~
tel
I don't think this has much at all to do with monads.

~~~
pointernil
Care to explain in a little more detail? Is a calendar or date-time monad
really such a far-fetched idea?

~~~
tel
That's just a state monad. The thread, and what I (perhaps incorrectly)
assumed you were talking about, was about overloading math operations to work
on dimensionalized quantities like feet and seconds and the like. You can
dimensionalize time as well; take a look at the time or thyme Haskell
packages for some fascinating examples.

Doing computation in the context of the current time is absolutely monadic.
If you're handling time abstractly, it's the state monad. If you're handling
it for real then you need some kind of monad protecting communication with
NTP/your local CPU clock. That's IO in Haskell.

~~~
pointernil
Yup, you assumed incorrectly. I can see the _qualitative_ difference between
operator overloading and monads. That's why I suggested transferring the core
idea of monads into more mainstream languages to provide more "semantic"
safety, so to speak.

Thanks for the clarification.

------
ternaryoperator
What he's talking about is avoiding primitive obsession. Always good advice,
although I find its greatest benefit comes in typing collections rather than
just using raw types like List or ArrayList.

------
wwwtyro
Interesting. What about more complex units, e.g. acceleration - m/s^2?

------
NigelTufnel
Maybe it's just something I don't understand, but I would go with Celsius
everywhere and convert to Fahrenheit/Kelvin when showing/grabbing temperature
to/from the user.

~~~
ygra
The Celsius/Fahrenheit example isn't the best. Generally all examples based on
units of measurement are probably flawed because sane systems can just
standardise on SI and be done with it.

However, different coordinate systems that should never be mixed _are_ useful
to keep separate. At work we deal with maps and geographic information, so
there are always at least three coordinate systems: Screen, projection and
geographic coordinates. Having a type system that catches mistakes there would
be so much nicer than just always passing P2D around and trying to never mix
different ones.

------
mapcar
Sorry, what language are these examples given in?

~~~
MBlume
C++

------
dschiptsov
Looks rather like type-system abuse.

btw, there is Haskell for that.)

~~~
alipang
He claims it's to make wrong code look wrong. He uses the type system to
prevent bugs; how is that abuse? Is the type system only there for
performance in your opinion, or what's your angle?

------
greatsuccess
This is basically the premise of OOP on display, and fairly basic at that.
I'm not sure what's to be learned here except that in the real world, your
representations will always need to be extended, and subclassing is usually a
bad hammer.

Interfaces are better but extend poorly through subclassing as well.

Composition is infinite but difficult to define over time when things change
(it usually resorts to more subclassing or more injection strategies).

I wish this article were valuable for solving the actual problem, but it's
just a primer on OOP vs. using primitives, which is not particularly
instructive or useful for solving real-world problems.

------
saltylicorice
No.

