
Can Your Static Type System Handle Linear Algebra? (2014) - kushti
http://yosefk.com/blog/can-your-static-type-system-handle-linear-algebra.html
======
moron4hire
People keep talking about "you don't want to add meters to feet", but that's
exactly the opposite of what you want to do. You _want_ to add inches to yards
or furlongs or flea wings, and you want to get it _right_.

What you don't want to do is add length to weight. The true type of your value
is not inches, it's length, and expressing it in inches is a formatting issue.
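
A runtime sketch of that idea (a hypothetical `Length` class of my own, with the standard conversion factors): store the canonical quantity once, treat units as formatting on input and output, and reject additions across incompatible quantities.

```python
class Length:
    """Store length in one canonical unit (meters); inches, yards,
    etc. are just formatting on the way in and out."""
    _TO_METERS = {"m": 1.0, "in": 0.0254, "ft": 0.3048, "yd": 0.9144}

    def __init__(self, value, unit):
        self.meters = value * self._TO_METERS[unit]

    def __add__(self, other):
        # Adding anything that isn't a length (a weight, a bare
        # number) is a type error; adding inches to yards is fine.
        if not isinstance(other, Length):
            raise TypeError("can only add Length to Length")
        return Length(self.meters + other.meters, "m")

    def to(self, unit):
        return self.meters / self._TO_METERS[unit]

total = Length(36, "in") + Length(1, "yd")   # 36 in = 1 yd
assert abs(total.to("yd") - 2.0) < 1e-9
```

Inches plus yards comes out right because both collapse to the same canonical representation; length plus weight never typechecks at all.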

~~~
ADRIANFR
Also, ideally, you don't want to add length to width, or Hertz to Becquerel
(both have [1/sec] unit). This was my humble attempt to address this with a
Scala Units library:
[https://github.com/adrianfr/scalau](https://github.com/adrianfr/scalau)

~~~
coldtea
If you want to find the perimeter of something you DO want to add length to
width...

~~~
Someone
No, you add the lengths of the sides. That's type-safe and generalizes from
rectangles with edges parallel to the x and y axes to general polygons:

    
    
      perimeter = poly.edges.map(length).sum()
    

Height and width have (implicit) direction; length doesn't.

But yes, this will get hairy. For convenience, you want the ability to add
heights, but that only makes sense in some cases. Two 100m high skyscrapers
only make a 200m high skyscraper if you place one on top of the other.

~~~
marvy
Or maybe you want to find the height of an average skyscraper so you want to
add and then divide. This is going too far maybe. At some point, one has to
accept that not all logic bugs are type errors, and not all type errors are
common enough in practice to worry about. I have no idea where that point is,
but I'm sure it exists.

------
taeric
Don't skip the comments. I think there is more there than in the article,
honestly.

~~~
kragen
Of course there is. What there isn’t, as far as I can see, is a working
solution.

~~~
marvy
There is, more or less. Mine, in fact. But it's ugly. It's repetitive. If
someone gave me that solution, I would think "this is way too unreadable to
ever verify its correctness". My mistake was using C++98. I think C++11 would
handle this better, but I never got around to trying.

------
TTPrograms
There's no reason type(X'X) should be fixed. If you allow yourself to
consider the case where each entry in the matrix has a different type,
everything is fine. You just end up with matrix product rules like:

[type(a) type(b)][type(c); type(d)] = type(a)type(c) + type(b)type(d)

which put constraints on the types of the various entries in the product, but
don't restrict them to being the same within each matrix or vector. This is
totally reasonable - in Ax + b = y I'm restricted to type(A)*type(x) = type(b)
= type(y) for scalars. Why wouldn't we have a similar generalization of
constraints in the matrix case? The model that arrays of objects must all be
the same type simply doesn't fit the analysis in this case. The author
suggests this:

> But I just can't imagine the monstrous types of products of the various
> elements ever eventually cancelling out as nicely as they apparently should
> – not in a real flesh-and-blood programming language.

Well that's how it should work. You just have to imagine it :)
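
That per-entry rule can at least be sketched at runtime (my own encoding, units as dicts of dimension exponents): give every entry its own unit, and have the multiply insist that all products summed into one output entry agree, which is exactly the [type(a) type(b)][type(c); type(d)] constraint above.

```python
def mul_units(a, b):
    """Multiply two units, e.g. {'mA':1,'V':-1} * {'V':1} -> {'mA':1}."""
    out = {}
    for d in set(a) | set(b):
        e = a.get(d, 0) + b.get(d, 0)
        if e:
            out[d] = e
    return out

def matmul(A, B):
    """Matrix product over (value, unit) entries. Output entry (i, j)
    sums products A[i][t]*B[t][j]; every such product must carry the
    same unit, or the sum is ill-typed."""
    n, k, m = len(A), len(B), len(B[0])
    out = []
    for i in range(n):
        row = []
        for j in range(m):
            total, unit = 0.0, None
            for t in range(k):
                (va, ua), (vb, ub) = A[i][t], B[t][j]
                p = mul_units(ua, ub)
                if unit is None:
                    unit = p
                elif unit != p:
                    raise TypeError(f"unit clash in entry ({i},{j})")
                total += va * vb
            row.append((total, unit))
        out.append(row)
    return out

# [mA/V  mA/Pa] * [V; Pa] -> [mA]: both products land in mA, so it checks.
A = [[(2.0, {'mA': 1, 'V': -1}), (3.0, {'mA': 1, 'Pa': -1})]]
x = [[(1.5, {'V': 1})], [(0.5, {'Pa': 1})]]
assert matmul(A, x)[0][0] == (4.5, {'mA': 1})

# Swap the Pa entry for a V entry and the product no longer typechecks:
bad = [[(1.5, {'V': 1})], [(0.5, {'V': 1})]]
try:
    matmul(A, bad)
    assert False
except TypeError:
    pass
```

Note that nothing forces the entries of A to share a type; the constraints fall out of the sums, just as described.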

------
dmlorenzetti
_Perhaps the sanest approach is we type every column of X and Y separately..._

Typing every column of a matrix is not sufficiently general. Each element of
the matrix can have its own type.

Suppose you're solving a linear algebra problem in two unknowns. Say it's an
electromechanical system, and we want to find a voltage and a pressure that
gives us some desired flow rates. So let x [=] (V, Pa), for example, and let y
[=] (mA, kg/s).

Then the matrix-vector product A*x = y requires the units:

    
    
        [ mA/V   , mA/Pa   ] * ( V  ) = ( mA   )
        [ kg/V/s , kg/Pa/s ]   ( Pa )   ( kg/s )
    

Of course those units in A have nicer reductions, but it doesn't teach us
anything to figure them out. The point is that the matrix A represents the
system design, so the units don't have to have familiar names, let alone be
consistent within a column. If the system involves Johnny reading a value off
a meter, and then phoning somebody up to tell them to pedal a stationary bike
a little faster, that process has some kind of units associated with it, and
the matrix A must include those units in order to represent the system.
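
The unit pattern in that example is mechanical: each A[i][j] must carry y_i / x_j. A small sketch (units as dicts of dimension exponents, my notation):

```python
def div_units(num, den):
    """Unit quotient: exponents of num minus exponents of den."""
    out = dict(num)
    for d, e in den.items():
        r = out.get(d, 0) - e
        if r:
            out[d] = r
        else:
            out.pop(d, None)
    return out

# x [=] (V, Pa), y [=] (mA, kg/s): each A[i][j] must carry y_i / x_j.
x_units = [{'V': 1}, {'Pa': 1}]
y_units = [{'mA': 1}, {'kg': 1, 's': -1}]
A_units = [[div_units(yi, xj) for xj in x_units] for yi in y_units]

assert A_units[0][0] == {'mA': 1, 'V': -1}            # mA/V
assert A_units[1][1] == {'kg': 1, 's': -1, 'Pa': -1}  # kg/Pa/s
```

Whatever exotic process a matrix entry stands for (Johnny and his stationary bike included), its unit is pinned down by the input and output units alone.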

~~~
porges
Each element of your matrix doesn't really have its own independent type;
rather, each row and column has a type:

    
    
                                1
                              +---
                           V  | V
                          Pa  | Pa
    
              V^-1    Pa^-1   
             +--------------- +-----
        mA   | mA/V   mA/Pa   | mA
        kg/s | kg/V/s kg/Pa/s | kg/s
    

I can imagine a system that types this quite well.
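
That factorization can even be checked mechanically. A sketch (units as dicts of dimension exponents, my notation) that tries to recover row and column types from an element-unit matrix, normalizing the first row type to dimensionless:

```python
def mul_units(a, b):
    """Multiply two units given as dicts of dimension exponents."""
    out = {}
    for d in set(a) | set(b):
        e = a.get(d, 0) + b.get(d, 0)
        if e:
            out[d] = e
    return out

def div_units(a, b):
    return mul_units(a, {d: -e for d, e in b.items()})

def row_col_types(A):
    """Factor an element-unit matrix as A[i][j] = row[i] * col[j]
    (row[0] normalized to dimensionless); None if no factoring exists."""
    col = [dict(u) for u in A[0]]
    row = [div_units(A[i][0], A[0][0]) for i in range(len(A))]
    for i in range(len(A)):
        for j in range(len(A[0])):
            if mul_units(row[i], col[j]) != A[i][j]:
                return None
    return row, col

good = [[{'mA': 1, 'V': -1}, {'mA': 1, 'Pa': -1}],
        [{'kg': 1, 'V': -1, 's': -1}, {'kg': 1, 'Pa': -1, 's': -1}]]
assert row_col_types(good) is not None

bad = [[{'mA': 1, 'V': -1}, {'mA': 1, 'Pa': -1}],
       [{'kg': 1, 'V': -1, 's': -1}, {'m': 1}]]   # one entry off
assert row_col_types(bad) is None
```

Any matrix that maps one unit-typed vector to another factors this way, which is why row and column types suffice and per-element types are never more general in practice.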

------
jjoonathan
I think this is the strongest argument for dynamic typing I have ever read.

~~~
alexknvl
Wow, if that is so, then I am completely sold on static typing. I do not get
how dynamic typing gets into the picture.

Having a type system that can catch some errors at compile time but not all is
better than having a type system that doesn't catch any.

Having a static type system doesn't mean that you can't use dynamic typing as
much as you want (e.g. in Scala you can make all variables of type `Any` and
use type refinements whenever you want to call a method).

~~~
jjoonathan
You seem to be under the impression that I'm a proponent of using dynamic
typing to the exclusion of static typing. That's not the case (I'm an ObjC
fan). I just hadn't seen a situation before where the argument for one vs the
other in a particular situation hinged on something so fundamental rather than
on a design artifact.

EDIT: Removed comment about scala because I was confusing it with clojure. My
bad.

~~~
alexknvl
That's the thing. There is no _argument for one vs the other_.

There is a statement of the following fact: "It is hard to express
heterogeneous array types in most statically typed programming languages and
verify operations at compile time." Well, duh. AFAIK this would require
dependent types* (not something you get for free in mainstream languages).

But can we check types at runtime? Yes. Does that mean it can only be done in
a dynamic language? No. It can easily be done in any language that has
integers and arrays. Even x86 assembly is enough.

* Scala's type system should be enough, if we use shapeless.HList's for column and row types and then a ton of implicits to resolve the resulting type and check that everything is sound.
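
To make the "integers and arrays" point literal: fix a basis of dimensions and encode every unit as a plain tuple of integer exponents (my encoding, nothing language-specific); the checks are then just integer comparisons.

```python
# Fix a dimension basis (m, kg, s, A); a unit is a tuple of ints.
M, KG, S = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
PER_S = (0, 0, -1, 0)

def umul(a, b):
    """Unit product = elementwise sum of exponents."""
    return tuple(x + y for x, y in zip(a, b))

def checked_add(p, q):
    """(value, unit) addition: the unit check is a tuple comparison."""
    (v1, u1), (v2, u2) = p, q
    if u1 != u2:
        raise TypeError("adding incompatible units")
    return (v1 + v2, u1)

def checked_mul(p, q):
    (v1, u1), (v2, u2) = p, q
    return (v1 * v2, umul(u1, u2))

speed = checked_mul((3.0, M), (2.0, PER_S))   # m * s^-1 = m/s
assert speed == (6.0, (1, 0, -1, 0))

try:
    checked_add((1.0, M), (1.0, KG))          # length + mass: rejected
    assert False
except TypeError:
    pass
```

Nothing here needs a dynamic language; the same tuples-of-ints scheme translates directly to C structs or even registers.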

------
dllthomas
What's striking about this is that this constraint _is essentially a type
constraint_. To the degree that existing type systems (and therefore their
checkers) make this ugly, difficult, or impossible to express, that surely
points to an opportunity for improvement.

Note that, contrary to some on this page, I read this as speaking entirely
about computation whose structure is known at compile time. Obviously, if the
structure is based on run-time information then you probably need dependent
types to guarantee statically that nothing can go wrong.

------
evanpw
Are there dynamically-typed systems that actually do this? That check the
units of matrix and vector elements at runtime without checking every single
time you want to multiply two numbers together?

~~~
im3w1l
I imagine most languages with a matrix library use opaque types implemented in
C or a similarly fast language, so only one typecheck would need to be done.

~~~
evanpw
I'm sure they check that the matrix doesn't contain a string, or that you're
not mixing 32-bit and 64-bit floats. But checking user-defined units as in the
article, where each row or column may have a different one, seems almost as
hard in a dynamically-typed language, unless you do the check at the very
bottom level. And that would be unimaginably slow.

~~~
im3w1l
Oh ok, yeah, I was referring to 32/64-bit floats / ints. I didn't like the
notation in the article, so I didn't bother understanding what he was talking
about.

------
chimeracoder
I'm a statistician by training, though these days I do a variety of work as an
engineer.

I'm a huge proponent of static typing, but I have to say that there's a reason
that the predominant languages used by statisticians and scientists are
generally dynamically-typed (Python, R, MATLAB, etc.) - and it's _not_ that
these people simply "don't know how to deal with static types"[0].

I haven't used Haskell in a few years, so I'm a bit rusty, but I'll use it as
an example, since it's the most widely-known strict static type system.
Hindley-Milner type systems like Haskell's often don't allow dependent
types[1], which means that a 2D vector and a 3D vector are the same type
_unless_ their dimensions are known at compile time (which is often not the
case). This is a _big problem_ , because I may want to run a regression on an
unknown number of variables, but also constrain alternative regressions to
use the same number of variables.

In Haskell, you might solve this problem with a tuple, except then (1) your
vector elements may be of the wrong type (bad!), and (2) you can only have
vectors of length 63 or less.[2]

By contrast, in MATLAB, it's trivial to declare a number that may also be a
vector, but act like a scalar for certain operations, but also be correctly
"incompatible" with vectors of incorrect dimensions[3].

Yeah, you can do this in Haskell/Scala/Go/<insert-language-here>. But it's
really, really annoying.

There is one assignment that I had in school which was done in MATLAB, and
just for fun, I tried reimplementing it in a few other languages (Haskell,
Python, Go, and a few others). I ended up giving up in frustration with _all_
of them, whereas the original MATLAB assignment only took a few hours (most
of which was figuring out the math behind it, not writing the code). My
Python attempt was the one that came the closest, but even then it was just
so hideous that I couldn't bear to work on it any longer[4].

[0] I haven't used Julia, so I can't say whether the optional typing there
solves this problem or not.

[1] At least, Haskell didn't as of a couple of years ago, which was the last
time I checked. Though it seems this is still an open problem:
[https://wiki.haskell.org/Dependent_type#Dependent_types_in_H...](https://wiki.haskell.org/Dependent_type#Dependent_types_in_Haskell_programming)

[2]
[https://downloads.haskell.org/~ghc/7.0.4/docs/html/libraries...](https://downloads.haskell.org/~ghc/7.0.4/docs/html/libraries/ghc-prim-0.2.0.0/src/GHC-Tuple.html#%28%2C%29)

[3] If you think that this sounds easy in your language, try generalizing this
where `v` can be an element in _any_ vector space, not just the sorts of
vector spaces that we're used to dealing with as computer scientists.

[4] Yes, I am aware of the irony in saying that this Python code was
more hideous than the equivalent MATLAB code. MATLAB is awful in so many ways,
but at the end of the day, it's a domain-specific language, and it tackles its
use case in a way that no general-purpose programming language I've seen has.

~~~
dragonwriter
> I'll use Haskell as an example, since it's the most widely-known strict
> static type system. Hindley-Milner type systems like Haskell often don't
> allow dependent types, which means that a 2D vector and 3D vector are the
> same type unless their dimensions are known at compile-time (which is
> oftentimes not the case).

Implementation of dependent types in Haskell, with the specific example of
dimension-parameterized vectors, is addressed in [0].

[0] [https://www.fpcomplete.com/user/konn/prove-your-haskell-for-...](https://www.fpcomplete.com/user/konn/prove-your-haskell-for-great-safety/dependent-types-in-haskell)

------
kxyvr
This isn't really a problem with the type system. Basically, the author
figured out that the normal equations with Vandermonde matrices
([https://en.wikipedia.org/wiki/Vandermonde_matrix](https://en.wikipedia.org/wiki/Vandermonde_matrix))
are horribly ill-conditioned and a terrible tool for interpolation. Though, I
actually like the author's example, because it's a nice way to give physical
units to what sometimes causes ill-conditioning. But, in short, I don't think
there's anything wrong or complicated about a type system tracking units in
this case. Vandermonde matrices will absolutely have radically different units
in their elements, which is a huge disincentive to use them.

Now, what's a better way to interpolate a series of points? Basically, use a
better polynomial basis. In the author's example, we use the polynomial basis
[x 1], or, in the higher-order case, [x^4 x^3 x^2 x 1], etc. A better way is
to use a polynomial basis that's more stable, such as Legendre, Chebyshev, or
Bernstein polynomials. They all have different properties, but if we look at
something like the second-order Bernstein polynomials, [(1-x)^2 2x(1-x) x^2],
we notice that each polynomial has the same units. Then, if we do form the
normal equations, we don't have such a huge disparity in the units of the
elements.
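
A tiny pure-Python illustration of that disparity (my numbers, not the article's): the monomial design matrix has entries spanning many orders of magnitude, mirroring their mismatched units, while a degree-2 Bernstein basis on the normalized interval keeps every entry in [0, 1].

```python
def monomial_design(xs, deg):
    """Vandermonde-style design matrix with columns [x^deg, ..., x, 1]."""
    return [[x ** k for k in range(deg, -1, -1)] for x in xs]

def bernstein_design(xs):
    """Degree-2 Bernstein basis on x normalized to t in [0, 1]."""
    a, b = min(xs), max(xs)
    ts = [(x - a) / (b - a) for x in xs]
    return [[(1 - t) ** 2, 2 * t * (1 - t), t ** 2] for t in ts]

def entry_spread(M):
    """Ratio of largest to smallest nonzero |entry|."""
    vals = [abs(v) for row in M for v in row if v != 0]
    return max(vals) / min(vals)

xs = [100.0, 200.0, 300.0, 400.0]   # say, measurements in some unit
V = monomial_design(xs, 2)
B = bernstein_design(xs)

assert entry_spread(V) == 160000.0          # x^2 column vs the 1s column
assert abs(entry_spread(B) - 9.0) < 1e-9    # everything stays O(1)
```

Normalizing x to a dimensionless t is exactly what makes every basis function carry the same (trivial) units.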

In addition, forming the normal equations is a terrible way to solve these
problems. Basically, if we want to find a least-squares solution, we're much
better off using something like a QR factorization, because things like Givens
rotations are much more numerically stable. Alternatively, we can use
iterative methods like LSQR or LSMR, which also compute the least-squares
solution without explicitly forming the normal equations, thus avoiding the
condition number problems.
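
The classic textbook demonstration of that point is the Läuchli matrix (my example, not the author's): with eps small enough that 1 + eps^2 rounds to 1 in float64, the normal-equations matrix A^T A is exactly singular, while Householder QR still recovers the least-squares solution. A pure-Python sketch:

```python
import math

def householder_lstsq(A, b):
    """Least-squares solution of A x ~ b via Householder QR,
    never forming the normal-equations matrix A^T A."""
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    y = b[:]
    for k in range(n):
        # Build the reflector that zeroes R[k+1:][k].
        col = [R[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in col))
        alpha = -math.copysign(norm, col[0])
        v = col[:]
        v[0] -= alpha             # v = col - alpha*e1 (no cancellation)
        vnorm2 = sum(t * t for t in v)
        if vnorm2 == 0.0:
            continue
        # Apply H = I - 2 v v^T / (v^T v) to R[k:, k:] and to y[k:].
        for j in range(k, n):
            s = sum(v[i - k] * R[i][j] for i in range(k, m))
            c = 2.0 * s / vnorm2
            for i in range(k, m):
                R[i][j] -= c * v[i - k]
        s = sum(v[i - k] * y[i] for i in range(k, m))
        c = 2.0 * s / vnorm2
        for i in range(k, m):
            y[i] -= c * v[i - k]
    # Back-substitute the triangular system R x = y.
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i] - sum(R[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = s / R[i][i]
    return sol

eps = 1e-8                         # 1 + eps**2 rounds to 1.0 in float64
A = [[1.0, 1.0], [eps, 0.0], [0.0, eps]]
b = [2.0, eps, eps]                # exact least-squares solution is (1, 1)

# Forming the normal equations destroys the problem in float64:
G = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(2)]
     for i in range(2)]
assert G[0][0] * G[1][1] - G[0][1] * G[1][0] == 0.0   # exactly singular

sol = householder_lstsq(A, b)
assert abs(sol[0] - 1.0) < 1e-6 and abs(sol[1] - 1.0) < 1e-6
```

The normal equations square the condition number; QR works with A directly, which is the whole advantage.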

Finally, even if someone wants to be neurotic about units (which is generally
good practice, but a pain to do every time), we're almost always better off
doing dimensional analysis at the operator level and not the element level.
For example, if I was solving the heat equation, I might form a matrix with
the finite difference operator or a stiffness matrix. However, I would know that
the stiffness matrix, in this context, would have units of L(m^2/s,K/s) where
L means that we have a linear operator from something with units of m^2/s to
K/s. Technically, we can work backwards from these two to see that our
elements have units of K/m^2, but that's not as useful as the operator
formulation. And, I claim this because if we write things out in the operator
form, it makes it easier for us to figure out the units of the adjoint. In the
simple case with the L^2 norm, we know that the adjoint has units of
L(K/s,m^2/s). However, if the inner product we use has units, this answer will
absolutely be different. Personally, I find that easier to track when I write
things out like this.

Anyway, I do think dimension/unit analysis is important, but static type
systems are sufficient to track it and it's really not too bad to do so.

------
_random_
F# units of measure to the rescue:

[https://msdn.microsoft.com/en-us/library/dd233243.aspx](https://msdn.microsoft.com/en-us/library/dd233243.aspx)

    let convertCtoF (temp : float<degC>) =
        9.0<degF> / 5.0<degC> * temp + 32.0<degF>

~~~
kvb
F# units can certainly handle the problem given a fixed matrix size, but for a
general method handling arbitrary dimensions you'd need variadic measure
arguments, which aren't supported (and would probably require quite a bit of
bookkeeping for this particular problem, in any language).

------
sjtrny
Look at the "Eigen" [1] library for C++. It's totally possible to do linear
algebra in a statically typed environment. You just need to use generic types.
However, it isn't as pleasant as doing it in a dynamically typed environment
like MATLAB.

[1]
[http://eigen.tuxfamily.org/dox/group__TutorialLinearAlgebra....](http://eigen.tuxfamily.org/dox/group__TutorialLinearAlgebra.html)

~~~
gugagore
As far as I know, Eigen is type-safe only with regard to the dimensions of
vector spaces. It can check at compile time that a dot product is taken over
two compatibly-sized vectors, or that a matrix-matrix multiply makes sense
dimensionally.

It doesn't have a notion of vector spaces with units. It can't keep track (not
Eigen, at least) of the fact that a vector [m, kg] cannot be inner-producted
with itself unless it's through a norm that produces the correct dimension;
otherwise the units are m^2 + kg^2.

~~~
sjtrny
Isn't that the default behaviour for most linear algebra systems? (e.g.
MATLAB). Maybe I'm missing something but I don't really care about compile
time checking and it seems an impossible task anyway with variable/dynamic
sized inputs.

~~~
gugagore
Yes, it is impossible for a compiler to know the size of a matrix whose size
is determined at runtime. But a lot of times you do know the size of matrices
and not only is it more type-safe to have the type of a matrix be parametrized
by its size, it also allows for more efficient code generation. If you don't
know the size of a matrix and vector, you necessarily need a loop. If you know
your matrix is 3 by 2, you may just, at compile time, unroll the loop
completely.

I don't know which default behavior you mean.

~~~
dllthomas
It's impossible for a compiler to know the size of a matrix whose size is
determined at runtime, but it's not impossible for a compiler to prove that
whatever the size your logic is still correct (or at least doesn't break any
of the invariants you've established). Depending on the nature of those
invariants and the complexity of your logic, you may need dependent types.

------
nbouscal
I can't answer the question directly, because as a web engineer I haven't once
in my career actually had to do linear algebra in my code. That said, the
entire mindset behind this article strikes me as fundamentally wrong-headed.
The mindset is “either you have static types, or you don't.” From that
perspective, if you can think of any situation in which you can't encode the
problem into static types, then you might as well throw the whole idea out the
window and just hope things work at run-time. In the real world, this isn't at
all how it works.

The last couple of decades have been a story of type systems slowly but surely
expanding the class of problems for which type-safety is practical and
worthwhile. There are still problems for which this _isn 't_ the case, and
there will likely still be problems like this for a long time. Proponents of
static typing aren't arguing against this fact; they're arguing that for all
the myriad cases where static types _can_ help, we should use them. For the
cases we can't figure out, fine, just use ints. But for those functions, write
a lot of tests, and be sure to rewrap those ints in semantic types as soon as
possible.

In short, whether or not my static type system can handle linear algebra is
completely irrelevant to me. To be worth using, my type system doesn't have to
solve _every_ problem. It only has to solve enough problems to provide more
benefit than cost. In my experience, it does that by multiple orders of
magnitude.

~~~
angersock
The problem is that linear algebra is probably one of the most basic things
you can do with a computer. If static typing isn't capable of offering useful
help or even just has a few weird wrinkles, then something is _very odd_.

For what it's worth, I've seen LA done in C++ with various templates and
libraries, and so it's certainly possible in various ways.

~~~
nbouscal
It wouldn't surprise me at all if static typing was capable of offering useful
help. If it can't today, I wouldn't be surprised to see us make progress over
the next few years. I think interesting conversations could be had about why
this problem is harder to model using type systems than other problems are.
Where I get confused is when this is used as an argument against using static
typing _in general_. By analogy:

“Hammering nails is probably one of the most basic things you can do as a
carpenter. If lathes aren't capable of offering useful help, then something is
_very odd_.”

I don't mean to mock, but it doesn't seem odd to me at all that not all tools
are good for all purposes. What seems odd to me is trying to argue against the
use of a tool that has proven itself in many different situations by pointing
out that there exist situations it isn't useful in. Sure, that's true, but so
what?

~~~
metaphorm
but a programming language isn't a hammer or a lathe. it's a workshop that is
used to produce hammers and lathes and every other tool you will need.

the criteria for judging the workshop are rather different from the criteria
for judging the lathe.

~~~
Solarsail
PL's as tools is a half-accurate analogy. If you're going to stretch it that
far, shouldn't you use something more accurate? Or better, not use an analogy
for something that detailed?

Say, PL's are materials. They are what useful things ("tools") are fashioned
out of. The properties of a material affect the entire thing built
out of it, in modest but pervasive ways. Some programs are built in more than
one language, this could be seen as an object built with different materials
for different parts. Now, whether mainstream languages are analogous to
different iron alloys, wood, concrete or plastic is a different discussion.
(And maybe another exercise in taking an analogy too far...)

Not sure where I saw this analogy first, since I can't find it on LtU...

