

Don’t invert that matrix - KonaB
http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/

======
ewjordan
Here's another piece of advice that covers a lot more than matrix inversion:
don't solve math problems yourself in code if you can in any way avoid it.

Use a library instead, and use the most specific methods that are available:
if there are specialized routines to solve Ax=b, use them; don't use the
matrix inversion methods (which are probably also there), or even the LU
decomposition ones, even if Real Math tells you that the result is
equivalent. In numerical computation, mathematical equivalence is a very
different thing from what "real" mathematicians concern themselves with.
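
To make that concrete, here's a minimal Matlab/Octave sketch (the test matrix
and sizes are invented for illustration) of the specificity ladder for a
symmetric positive definite system:

      n = 2000;
      A = randn(n); A = A'*A + n*eye(n);    % contrived SPD test matrix
      b = randn(n, 1);
      x1 = inv(A)*b;                  % general inverse: slowest, least accurate
      [L, U] = lu(A);  x2 = U\(L\b);  % generic factorization: better
      R = chol(A);     x3 = R\(R'\b); % exploits symmetry: about half the work

Each step down the ladder tells the library more about your matrix, and it
repays you in speed and accuracy.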

Seriously. I don't care if you're a math PhD, I don't even care if you know a
decent amount about numerical methods. Whatever code you're going to put
together to do the job will be inferior to what a mature library can do,
simply because the library has been put through many iterations and has had a
lot more time to get patched up to handle all those special cases (usually
related to finite precision or discretization) that Real Math doesn't have to
worry about. And it will probably be vastly more optimized than anything
you'll put together, because it's spent years of heavy use in some critical
research environments.

The only time you should be writing your own math code (beyond simple
arithmetic) is if no mature libraries are available to you. In which case you
should allocate at the very least twice the amount of time it will take to
research and implement the methods, and probably a lot more, because you're
going to have a lot of fiddling to do.

~~~
DrJokepu
Another reason would be (self-)education. There's no better way to gain a
deeper understanding of a subject or learn the ins-and-outs of optimization
than writing some code and then possibly trying to improve it. If you always
blindly depend on third party libraries, you will remain ignorant and dumb.

Then, you can go on and use third party libraries in production while feeling
satisfied that you know what's going on inside the library.

~~~
gaius
_If you always blindly depend on third party libraries, you will remain
ignorant and dumb._

Depends on whether your goal is to become an aeronautical engineer or an
assembly language hacker, really. Neither one is "better" than the other, but
the former won't benefit much from understanding exactly how to optimize on
SPARC vs x64... nor will the latter need to worry about his compiler's
stalling speed :-)

------
brettnak
Are people interested enough in this for a blog post? I studied numerical
analysis and would be happy to post some of the basics if people are
interested.

~~~
nimrody
What is usually missing from these articles is even a simplified error
analysis.

We can all count floating-point operations, and these days caches often
influence matrix algorithms more than raw flop counts do. The problem gets
more interesting when the matrix is ill conditioned and different algorithms
produce very different results.
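
A quick Matlab/Octave illustration (the exact error values will vary by
platform): the Hilbert matrix is a classic ill-conditioned example, and you
can watch the algorithms diverge as the conditioning worsens:

      >> A = hilb(12); x = ones(12,1); b = A*x;
      >> cond(A)               % around 1e16: at the limit of double precision
      >> norm(inv(A)*b - x)    % error from explicit inversion
      >> norm(A\b - x)         % error from pivoted elimination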

------
imurray
A Matlab/Octave example of inv being slower and less accurate:

    
    
      >> A = randn(100); x = randn(100,1); b = A*x;
      >> tic; x1 = inv(A)*b; toc
      Elapsed time is 0.142381 seconds.
      >> tic; x2 = A\b; toc     
      Elapsed time is 0.000736 seconds.
      >> max(abs(x-x1))
      ans =
         7.1054e-14
      >> max(abs(x-x2))
      ans =
         4.8406e-14
    

Edit: Actually this is a warning about timing stuff. The inv function hadn't
"warmed up". On a second run both times drop, and the difference is much
smaller:

    
    
      >> tic; x1 = inv(A)*b; toc
      Elapsed time is 0.000737 seconds.
      >> tic; x2 = A\b; toc
      Elapsed time is 0.000471 seconds.

~~~
ibsulon
What happens when you run it 1000 times, skipping the first?

~~~
imurray
If I use this:

[http://www.mathworks.com/matlabcentral/fileexchange/18798-ti...](http://www.mathworks.com/matlabcentral/fileexchange/18798-timeit-benchmarking-function)

which warms up the function, averages over repetitions, and tries to account
for various overheads, the ratio stays about the same (1:1.5 ish). I would
normally use timeit(), but for a quick post I was being sloppy.
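
The crude version of the same idea, for anyone who wants the 1000-run loop
(the warm-up calls and N are arbitrary):

      inv(A)*b; A\b;              % warm-up calls, results discarded
      N = 1000;
      tic; for k = 1:N, x1 = inv(A)*b; end; t1 = toc/N;
      tic; for k = 1:N, x2 = A\b;      end; t2 = toc/N;
      [t1 t2 t1/t2]               % per-call times and their ratio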

------
jacobolus
Does anyone trying to solve problems involving matrices with millions of
entries not have some idea of the computational complexity of various
operations they can do on them, given the nature of their matrices, etc.? I’d
expect that either (a) someone has a small enough problem that the
computational time is irrelevant and they should just arrive at the solution
in whatever manner is conceptually clearest, given their current mathematical
knowledge, or (b) the problem is going to take a lot of computation time, in
which case it’s worth exploring to learn exactly what the fastest method will
be.

~~~
lliiffee
You can solve amazingly large problems with a sparse matrix without
understanding what you are doing, as long as you use direct (factorization)
methods. If you type A\b with some gigantic sparse A, Matlab will apply
reordering heuristics that often just work, and that are actually incredibly
hard to improve upon by manual fiddling. When you get to really, _really_ big
problems, you need to use iterative methods (like conjugate gradient), and
for those you have to have a _very_ firm understanding of everything. The gap
between direct and indirect methods is huge.
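
For example (sizes arbitrary; gallery('poisson', n) is a standard sparse test
matrix):

      A = gallery('poisson', 300);  % sparse 90000x90000 2-D Laplacian, SPD
      b = ones(size(A,1), 1);
      x = A\b;                      % backslash chooses a fill-reducing
                                    % reordering and factorization on its own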

~~~
jedbrown
Matlab's sparse backslash is the GPL'd Umfpack (relicensed from the author).
The most important thing to know about iterative solvers is that the Krylov
method is a minor component; preconditioning is the hard part.

~~~
whyenot
You need to have a basic understanding of the different Krylov subspace
methods to be able to use them. If you are using CG on a matrix that isn't
SPD (symmetric positive definite), then you are doing something very wrong.
Same goes for the other methods
-- they all have their strengths, weaknesses and gotchas, and it's pretty
important to know what they are if you are going to use them! Preconditioning
can greatly reduce the number of iterations you have to do once you have
chosen a method, but it's not particularly helpful if you are using the wrong
method.
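
A sketch of matching method to matrix in Matlab/Octave (matrix, tolerances,
and iteration counts are all arbitrary):

      A = gallery('poisson', 100);  b = ones(size(A,1), 1);
      x = pcg(A, b, 1e-8, 500);         % CG is fine here: A is SPD
      B = A + 0.1*sprandn(A);           % perturbed copy: no longer symmetric
      y = gmres(B, b, 20, 1e-8, 50);    % so use a non-symmetric method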

~~~
jedbrown
With CG, the preconditioner also needs to be SPD (or have suitable factors).
You can get by pretty well using only GMRES and spend all your time thinking
about preconditioners. Once you have some good candidate preconditioners, you
can try other Krylov methods, but the gains are rarely very large. I have
frequently found SPD systems that are more efficiently solved with GMRES and a
non-symmetric preconditioner, and even more so for symmetric indefinite
systems where popular wisdom says to use MINRES, CR, or SQMR.
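
In Matlab/Octave terms, that workflow looks something like this (the ILU
settings are illustrative, not tuned, and A is assumed sparse):

      setup = struct('type', 'ilutp', 'droptol', 1e-4);
      [L, U] = ilu(A, setup);                % incomplete-LU preconditioner
      x = gmres(A, b, 30, 1e-8, 100, L, U);  % preconditioned restarted GMRES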

For amusement, it's really worth knowing about

[http://www.comlab.ox.ac.uk/people/nick.trefethen/publication...](http://www.comlab.ox.ac.uk/people/nick.trefethen/publication/PDF/1992_52.pdf)

which shows that each of the popular nonsymmetric iterations can beat all
others by a huge factor.

But this is all justification that you should use a library like PETSc for
your linear solvers. Then all aspects of the solve become run-time (typically
command-line) options, including an algebra for composing preconditioners and
other components, with a plugin architecture for every component.
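
For instance (the executable name here is hypothetical, but the options are
standard PETSc options), swapping the whole solver stack is a command-line
change:

      mpiexec -n 8 ./mysolver -ksp_type gmres -pc_type asm -sub_pc_type ilu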

------
memetichazard
I'm a bit rusty on my linear algebra. Without inverting the matrix, what's the
algorithm for solving Ax=b? Factorization is mentioned but I'm not sure what
it means in this case.

~~~
JabavuAdams
Gaussian elimination. Remember where you subtract multiples of one row from
another, in order to get zeros below the diagonal? That way you can back-solve
a system of equations by reading off solutions starting with the bottom-right
corner.

The result of carrying out the elimination steps is a triangular matrix with
all zeros below the main diagonal. Call it U, for "upper".

The elimination steps themselves can be encoded as a matrix, rather than
describing them in words. For instance, "subtract 2 times row 1 from row 2" is
the matrix:

    [ 1 0 0]
    [-2 1 0]
    [ 0 0 1]
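
(You can check this directly in Matlab/Octave; magic(3) is just a stand-in
for any 3x3 matrix:)

      >> E21 = [1 0 0; -2 1 0; 0 0 1];
      >> A = magic(3);
      >> E21*A    % row 2 of the result is row2(A) - 2*row1(A)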

So, to solve the system, you can perform elimination on A x = b, which gives
you two matrices (U, and a record of the elimination steps, which will become
L). In some sense, you've factored A into L and U.

Example:

Using Strang's terminology, let's call the matrix that "eliminates (i.e. sets
to 0) the (i,j) entry" Eij

So, to solve a 3x3 system A x = b, you want to do elimination as: (E32 E31
E21) A = U

I.e.

1) Take A and subtract some multiple of row 1 from row 2, to put 0 in (2,1)

2) Take the result and subtract some multiple of row 1 from row 3, to put 0
in (3,1)

3) Take the result and subtract some multiple of row 2 from row 3, to put 0
in (3,2)

Now the goal of elimination is to get an upper-triangular matrix U. I.e. if
you were solving some system of equations in x, y, z, then because U is
triangular, you could just read off the answer to z. Then plug that into the
equation for row 2 to solve for y, then plug that into the equation for row 1
to solve for x. That's back-substitution.

BUT, let's return to our operations. We have:

    E32 E31 E21 A = U
    (E32 E31 E21) A = U
    inverse(E32 E31 E21) (E32 E31 E21) A = inverse(E32 E31 E21) U
    I A = inverse(E32 E31 E21) U

Now, it just so happens that the inverse of (E32 E31 E21) is lower-triangular.
I.e. all entries above the diagonal are zero, so let's call it L.

    I A = L U
    A = L U

LU-decomposition. Or "factorization", since we've factored A into two
matrices: L, and U.
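
In Matlab/Octave, the whole story above is one library call and two
triangular solves (ignoring pivoting, which the reply below gets into):

      [L, U] = lu(A);   % factor once:  A = L*U
      y = L\b;          % forward-substitute:  L*y = b
      x = U\y;          % back-substitute:     U*x = y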

~~~
whyenot
Yes, but it's a little more complicated than that. In most cases you cannot
trust the solution of straight Gaussian elimination. You need to take
additional measures to avoid excessive error growth, such as using partial
pivoting.

see <http://en.wikipedia.org/wiki/Pivoting> and
<http://en.wikipedia.org/wiki/Gaussian_elimination>
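
With partial pivoting, the factorization becomes P*A = L*U and the solve
permutes b first (Matlab/Octave):

      [L, U, P] = lu(A);     % P is a permutation matrix: P*A = L*U
      x = U \ (L \ (P*b));   % permute, forward-substitute, back-substitute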

