
Clojure Linear Algebra Refresher: Eigenvalues and Eigenvectors - dragandj
http://dragan.rocks/articles/17/Clojure-Linear-Algebra-Refresher-Eigenvalues-and-Eigenvectors
======
peatmoss
Anyone know what the status of Incanter is these days? It seems like Incanter
could benefit mightily from the Neanderthal library referenced in the article.

I have to confess that, after the initial excitement around Incanter, I
haven't kept tabs on it.

~~~
hencq
As far as I understand it, there's core.matrix, which is intended as a sort
of universal Clojure API for linear algebra. There are different
implementations of the core.matrix API, like the pure-JVM vectorz-clj, the
BLAS-based clatrix, and even a ClojureScript one, aljabr. Incanter uses
core.matrix and can therefore use these different implementations. However,
the Neanderthal library does not support core.matrix, so Incanter can't use
it.

As for why Neanderthal doesn't support core.matrix, I don't know. I see the
author is active in this thread, so he can probably give a much better answer,
but from reading the mailing lists I gather he mainly created Neanderthal to
scratch his own itch, so core.matrix support wasn't a priority.

~~~
rux
Disclaimer: my brother is mikera, maintainer of core.matrix

I asked why neanderthal wasn't part of core.matrix, and Mike replied that
he'd like it to be 'for the sake of the ecosystem'. He offered some code and
tests last year to get this started, but it wasn't taken in. It's still there
from last October, probably a bit out of date by now...

[https://github.com/mikera/neanderthal/network](https://github.com/mikera/neanderthal/network)

------
btilly
The first mistake that I found is, _All eigenvectors that correspond to one
eigenvalue lie on the same line, but have different magnitudes._

The eigenvectors of [[1 0] [0 1]] span a 2-dimensional space, not a line.

~~~
dragandj
Can you give an example of two eigenvectors that do not lie on the same line?
(I am curious, I am not a mathematician)

~~~
btilly
[1 0] and [0 1] are both eigenvectors of the identity matrix, yet not on the
same line.
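
To make that concrete in plain Clojure (a throwaway sketch, no linear algebra
library needed):

        ;; the identity matrix [[1 0] [0 1]] as a function on 2-D vectors
        (defn identity-transform [[x y]]
          [(+ (* 1 x) (* 0 y))
           (+ (* 0 x) (* 1 y))])

        (identity-transform [1 0]) ;=> [1 0], eigenvalue 1
        (identity-transform [0 1]) ;=> [0 1], also eigenvalue 1, yet not on the same line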

I highly recommend
[http://www.axler.net/DwD.html](http://www.axler.net/DwD.html) for developing
a good intuition about what eigenvalues and eigenvectors actually are.

~~~
dragandj
I obviously formulated this wrong and will rephrase. But doesn't the
intuition generally hold that once you find an eigenvector for one
eigenvalue, you can construct an infinite number of eigenvectors by scaling
that one?

~~~
kgwgk
The problem arises when there is degeneracy (some eigenvalues are repeated).
See for example
[https://www.ch.ntu.edu.tw/~byjin/course/ChemMath/handouts/de...](https://www.ch.ntu.edu.tw/~byjin/course/ChemMath/handouts/degeneracy.pdf)

[edit: the rest of my original comment was not correct]

If an eigenvalue is unique, the corresponding eigenvectors are on a line. If
it appears twice, the corresponding eigenvectors are [edit: could be] on a
plane. In general, they span [edit: could span] a subspace of dimension equal
to the multiplicity.

~~~
neutronicus
The dimension of the space spanned by the eigenvectors associated with a
degenerate eigenvalue is bounded above by the algebraic multiplicity of the
eigenvalue and bounded below by one.
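
A standard example where the lower bound is hit is the shear matrix:

        A = [1  1]      A - 1*I = [0  1]
            [0  1]                [0  0]

The eigenvalue 1 has algebraic multiplicity 2, but (A - I)(x, y) = (y, 0), so
requiring (y, 0) = (0, 0) forces y = 0, and the eigenvectors are exactly the
line (t, 0): geometric multiplicity 1.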

------
mannko
The essence of linear algebra:
[https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2x...](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)
This was already posted on HN, but people in this thread might be interested.
Lots of insights about linear algebra, thanks to a geometric approach.

~~~
orthoganol
I really appreciate that series, but my one criticism is that it does not
connect concepts to how they are used. Maybe it's just me, but I get a much
stronger intuition when I imagine what's happening as I apply the concepts.
Taking a few minutes here and there to walk through the details of an
application would be low-hanging fruit for him, and it could enshrine the
series as possibly one of the best resources out there.

~~~
mannko
Yes, you're right, if we take this course as standalone. It becomes very
interesting when used alongside a more "traditional" course with
applications. As a student watching this, you can easily feel guilty,
thinking you should focus on what's been specifically taught to you (during
your 'official' lectures) and not on the conceptual ideas behind it. But it
does make learning easier, and maybe encourages you to come back to the
subject later, because you no longer see the subject only as a tool, and you
get a feeling of familiarity.

------
vonnik
We wrote an eigen* intro here that some may appreciate:
[https://deeplearning4j.org/eigenvector](https://deeplearning4j.org/eigenvector)

------
hota_mazi
> how do I find eigenvalues and eigenvectors? The function is called ev!

Why not call that function eigen-vectors instead of ev?

Some library authors need to be smacked on the head with a programming book
and learn how to design sensible APIs.

Seriously, Clojure is typically easy on the eyes but this is just garbage:

    
    
(require '[uncomplicate.neanderthal
           [core :refer [col entry nrm2 mv scal axpy copy mm dia]]])
    

axpy? mm? dia? nrm2?

~~~
btilly
You are walking into a classic argument about concise notation for experts
versus self-explanatory names for generalist programmers. See
[https://news.ycombinator.com/item?id=14239228](https://news.ycombinator.com/item?id=14239228)
for a previous iteration of this argument on this site.

Suffice it to say that anyone with enough mathematical expertise to actually
make sense of this stuff, and who is going to do anything interesting with
it, will want to use concise names. And yes, the names will be the first
thing that a programmer who doesn't know the domain complains about, but they
are far from the largest problem you have to face in using this stuff. And if
you work through the actually important barriers to doing useful work with
this knowledge, you'll probably come to appreciate the concise notation.
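
For what it's worth, most of those names come straight from BLAS, the
decades-old Fortran linear algebra interface: axpy is "a times x plus y",
nrm2 is the Euclidean (2-)norm, mv and mm are matrix-vector and matrix-matrix
multiply, scal scales by a constant, and dia takes the diagonal. A minimal
sketch of how they read in practice (assuming Neanderthal's native
constructors dge and dv; check the docs for the exact signatures):

        (require '[uncomplicate.neanderthal.core :refer [mv scal axpy nrm2]]
                 '[uncomplicate.neanderthal.native :refer [dge dv]])

        (def a (dge 2 2 [-4 3 -6 5])) ; the article's matrix, column-major
        (def x (dv 1 -1))

        (mv a x)              ; matrix-vector product a*x => (2.0, -2.0)
        (scal 2.0 x)          ; 2*x
        (axpy 2.0 x (dv 1 1)) ; BLAS axpy: 2*x + y
        (nrm2 x)              ; Euclidean norm of x, sqrt(2)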

~~~
marmaduke
It's still an interesting idea to try to expose an unfamiliar domain through
a well-thought-out API that takes care of some of those dirty details. An ORM
would be an example. Should a linear algebra API be the raw numerical
algorithms (Fortran-like), or something dressed up that throws exceptions on
failed convergence?

(This is probably orthogonal to your comment, btw.)

~~~
btilly
Any abstraction layer built on top of linear algebra intended for non-
mathematicians should be given mnemonic names that are domain specific for
what that abstraction layer is aimed at.

Any abstraction layer built with the purpose of making it easy to do linear
algebra in software should use names that are domain specific to linear
algebra: the short, concise names that make sense to those with mathematical
expertise, no matter what generalist programmers think of them.

------
dragandj
There is also part 1 of this tutorial:
[http://dragan.rocks/articles/17/Clojure-Linear-Algebra-Refresher-Vector-Spaces](http://dragan.rocks/articles/17/Clojure-Linear-Algebra-Refresher-Vector-Spaces)

------
nerdponx
Previous discussion:
[https://news.ycombinator.com/item?id=14480559](https://news.ycombinator.com/item?id=14480559)

~~~
dmix
Just to clarify, that's part 1 of the series and this is part 2.

~~~
nerdponx
Thanks for clarifying. That should probably be reflected in the title.

------
deft
This is nice, but I wish he showed it in mathematical syntax too. I don't
understand some of the calculations because I don't know much about Clojure.
Or Linear Algebra.

~~~
drostie
So the transform is what we'd write as the matrix:

    
    
        [-4  -6]
        [ 3   5]
    

This matrix corresponds to the following linear transform: if you hand in the
pair (x, y) it gives you the pair T(x, y) = (-4x - 6y, 3x + 5y). The word
"linear" means that T(x1 + x2, y1 + y2) = T(x1, y1) + T(x2, y2) for all x1,
x2, y1, y2 and that T(k x, k y) = k T(x, y) for all k. This has a nice
interpretation in terms of the "vector space" underneath; a linear transform
"distributes over vector addition" and "commutes with scalar multiplication."

In fact vector spaces are super-general. The set of infinite sequences of real
numbers `[a0, a1, a2, ...]` is a very nice vector space, and the Fibonacci
recurrence F[n] = F[n - 1] + F[n - 2] is linear in such infinite sequences. It
can be solved by F[n] = p^n for two bases p=b1, p=b2, and then you can see
that any other sequence obeying that recurrence relation must have the form
F[n] = A b1^n + B b2^n for some constants A, B. It's the same principles of
linear algebra which get you there. (Exercise: solve the Fibonacci recurrence
for b1, b2 and find A and B such that F[0] = 0, F[1] = 1, the normal Fibonacci
numbers.)

Anyway, back to the matrix. As you can see, the trace of this matrix (sum of
diagonal entries) is 1, and the determinant (for a 2x2 matrix, the product of
the diagonal entries minus the product of the other two entries) is -20 + 18
= -2. It turns out that the trace is always the sum of the eigenvalues and
the determinant is always the product of them, which gives a nice way to find
the eigenvalues here: +2 and -1 are the only two numbers that multiply
together to get -2 and sum together to get +1.
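
Equivalently, the eigenvalues are the roots of the characteristic polynomial
built from exactly those two numbers:

        k^2 - (trace) k + (det)  =  k^2 - k - 2  =  (k - 2)(k + 1)  =  0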

If we know that these are the eigenvalues, then we calculate the eigenvectors
by looking for a non-trivial nullspace: T(x) = k x implies that T2(x) = T(x)
- k x maps this nontrivial vector x to 0. Since subtracting k x corresponds
to adding the diagonal matrix diag(-k, -k), for k = +2 we are looking at the
matrix

    
    
        [-4  -6]  +  [-2   0] = [-6  -6]
        [ 3   5]     [ 0  -2]   [ 3   3]
    

Now that looks like a very degenerate transform! T(x, y) = (-6 x - 6 y, 3 x +
3 y); what gets mapped to (0, 0)? Just x = -y will do perfectly nicely. So
(1, -1) is one representative eigenvector for an entire eigendirection
(t, -t) for all t. Plugging that into the original we also see T(1, -1) =
(-4 + 6, 3 - 5) = (2, -2), confirming that this is all 100% self-consistent.

The other eigenvector, for k = -1, comes from

    
    
        [-4  -6]  +  [+1   0] = [-3  -6]
        [ 3   5]     [ 0  +1]   [ 3   6]
    

This takes a little more thinking: the eigenvector is (2, -1), representing
the eigendirection (2t, -t) for all t.
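
(If you want to sanity-check both eigenpairs in Clojure itself, here is a
throwaway sketch with the transform written out by hand:)

        (defn t [[x y]]
          [(+ (* -4 x) (* -6 y))
           (+ (* 3 x) (* 5 y))])

        (t [1 -1]) ;=> [2 -2], which is  2 * [1 -1]
        (t [2 -1]) ;=> [-2 1], which is -1 * [2 -1]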

Now what's the basic reason that you're doing all of this? Here's what: the
_eigenbasis_. These vectors are not orthogonal, but they do span the space
with a skewed coordinate system. Any vector can be decomposed as

    
    
        (x, y) = a (2, -1) + b (1, -1)
    

Call these new components {a, b} with curly braces.

To solve for it explicitly, notice that {1, -1} = (1, 0) and {1, -2} = (0, 1).
So x (1, 0) + y (0, 1) = x {1, -1} + y {1, -2} = {x + y, -x - 2y}, and we can
find that a = x + y, b = -x - 2y.

Now if you accept the pain of using these skewed coordinates to begin with, then
the matrix has a very nice representation in these coordinates: T{a, b} = {-a,
2b}. It just treats both of these coordinates independently, scaling them
without mixing them. If you want to repeat it n times, it will just be T^n {a,
b} = {(-1)^n a, 2^n b}. (Exercise: one of those roots b1, b2 for the
Fibonaccis, let's say b2, has absolute magnitude less than 1, hence its
exponentiations drift towards 0. Program an algorithm to calculate the Nth
Fibonacci as `round(A*pow(b1, n))` and see how it does.)
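
(A sketch of that exercise in Clojure, assuming b1 is the golden ratio
(1 + sqrt 5)/2 and A = 1/sqrt 5; the rounding stays exact until doubles run
out of precision, somewhere around n = 70:)

        (defn fib [n]
          (let [b1 (/ (+ 1 (Math/sqrt 5)) 2) ; dominant root of p^2 = p + 1
                a  (/ 1 (Math/sqrt 5))]      ; chosen so that F(0) = 0, F(1) = 1
            (Math/round (* a (Math/pow b1 n)))))

        (map fib (range 10)) ;=> (0 1 1 2 3 5 8 13 21 34)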

What complicates things a little is that usually your vector space is also an
inner product space, and often that inner product has a nice structure (a, b)
. (x, y) = a x + b y, in the orthogonal coordinates. It loses this structure
in the skewed coordinates. Fortunately the physicists have a very nice
notation for these cases coming from their work in relativity: it is to write
vectors with both "upper indices" and "lower indices". The idea is that if
we're using the basis (2, -1), (1, -1) then for each vector of the basis, we
find a vector which has two properties: first, it's orthogonal to all of the
other vectors of the basis; second, its dot product with the vector that it
corresponds to is 1. These vectors form the "dual basis".

So the vectors perpendicular to (1, -1) have the shape (t, t) for all t; we
dot (2, -1) . (t, t) = 2t - t = t, so we choose t = 1 and find that the dual
vector to (2, -1) is (1, 1). Similarly the vectors perpendicular to (2, -1)
have shape (t, 2t), dotting this with (1, -1) gives t - 2 t = -t, setting that
to 1 says that t = -1, so the dual vector to (1, -1) is (-1, -2). [These
mirror the expression {a, b} = {x + y, -x - 2y} above, and for good reason.]
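
(Quick sanity check: for (x, y) = (1, 0), dotting with the dual vectors gives
a = (1, 0) . (1, 1) = 1 and b = (1, 0) . (-1, -2) = -1, which matches
{1, -1} = (1, 0) from before.)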

So the idea is that we represent every vector in two ways: with its
"contravariant components" v^1, v^2 such that v^1 (2, -1) + v^2 (1, -1) is
our vector, and with its "covariant components" v_1, v_2 such that v_1 (1, 1)
+ v_2 (-1, -2) is also our vector. When we do this, we discover that dot
products come back to a nice diagonal form even in a skewed coordinate
system; that form is u_1 v^1 + u_2 v^2 + ... = u^1 v_1 + u^2 v_2 + ... .

If you've got a nice crystal structure in physics, you might find, say, that
the electrical conductivity really "wants" to be expressed in these skewed
coordinates, where if you go down any of the natural directions of the
crystal the material responds via Ohm's law. If it has slightly different
resistances in slightly different crystal directions, then because the
crystal is a skewed coordinate system, applying an electric field in an
arbitrary physical direction will in general create a current in a slightly
different physical direction. But in the skewed coordinates it just looks
like `J = {s1 E1, s2 E2, s3 E3}`, corresponding to a diagonal conductivity
tensor `diag(s1, s2, s3)` ... it's just that outside the crystal's own
coordinates you can't quite see that this material wants to flow in a couple
of skewed directions preferentially.

~~~
deft
Wow, thank you for this! I appreciate it and am going through it now :)

------
Piccollo
oh god no... the memories

