Eigenvectors and eigenvalues explained visually (setosa.io)
523 points by vicapow on Jan 20, 2015 | 52 comments

The pretty animations are nice, and the ability to manipulate the vectors is very nice; however, I am sorry to say (and I do not mean this negatively) that there's not much "explanation".

The first sentence just describes the utility of the Eigens (so no explanation there). The next lays out the setting for the diagram. And the third says, "if we can do X, then v is an eigenvector and \lambda an eigenvalue". But... what if you can't do "X"? What if v, (0,0) and Av are not collinear?

The skeleton of a great explanation is there, but the meat isn't there yet. A few more sentences would go a long way in making this better.

I appreciate the OP's effort, and I hope this will come across as constructive criticism.

WRITE THOSE SENTENCES! Don't leave us hanging!

My take:

The eigenvectors are the “axes” of the transformation represented by the matrix.

Consider spinning a globe (the universe of vectors): every location faces a new direction, except the poles.

An “eigenvector” is an input that doesn’t change direction when it’s run through the matrix (it points “along the axis”). And although the direction doesn’t change, the size might. The eigenvalue is the amount the eigenvector is scaled up or down when going through the matrix.
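For instance, a minimal numerical check of that picture (a sketch in Python/numpy; the matrix is just an illustrative choice, not one from the article):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])       # an illustrative matrix

    # np.linalg.eig returns the eigenvalues and, as columns of V, unit eigenvectors
    eigvals, V = np.linalg.eig(A)    # the eigenvalues here are 3 and 1

    v, lam = V[:, 0], eigvals[0]     # pair the first eigenvector with its eigenvalue

    # Av points along v, only scaled by lambda: Av = lambda * v
    assert np.allclose(A @ v, lam * v)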

(Shameless plug, more here: http://betterexplained.com/articles/linear-algebra-guide/)

Be careful using this to interpret complex eigenvalues, though! (One could argue that a complex eigenvector points along the same complex axis, but that probably doesn't help much to understand the eigenvalues of a real matrix like ( ( 0 1 ) ( -1 0 ) ).)

Who says ((0 1) (-1 0)) is a real matrix? It's a complex matrix whose entries all happen to have 0 as the value of their imaginary part :)

I didn't say that it isn't complex, only that it is real—and as such can be visualised as a transformation of the real plane, in which context one might still strive to understand its (complex) eigendecomposition.
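A quick sketch of that complex eigendecomposition (numpy; a 90-degree rotation is the illustrative real matrix here):

    import numpy as np

    # a 90-degree rotation of the real plane: no real direction is left unchanged
    R = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])

    eigvals, V = np.linalg.eig(R)
    print(eigvals)   # the eigenvalues are +1j and -1j, purely imaginary

    # the eigenvectors are complex, so they are not visible "axes" of the real
    # plane, even though R itself has only real entries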

What's a kernel?


If you have a linear transformation from one vector space to another, the kernel is the part of the domain that maps to the zero vector in the range. Intuitively, if you have a function that satisfies certain properties, the kernel is the part that it collapses to "zero".
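A small illustration of that (a sketch; the rank-1 matrix below is an arbitrary choice, and scipy's null_space does the computation):

    import numpy as np
    from scipy.linalg import null_space

    # a rank-1 matrix: it collapses a whole line of inputs onto the zero vector
    A = np.array([[1.0, 2.0],
                  [2.0, 4.0]])

    K = null_space(A)              # orthonormal basis for the kernel (null space)
    print(K)                       # one column, proportional to (2, -1) up to sign

    assert np.allclose(A @ K, 0)   # everything in the kernel maps to zero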

This from wikipedia is where it started to click for me:

"In the 18th century Euler studied the rotational motion of a rigid body and discovered the importance of the principal axes. Lagrange realized that the principal axes are the eigenvectors of the inertia matrix".

So the eigenvectors are like the directions that describe the orientation of an airplane: roll, pitch and yaw.

This point of view was used in a recent HN post: https://news.ycombinator.com/item?id=8904089 .

Matrices represent linear transformations of spaces as described in the article. Not all matrices are well-behaved however. We'll talk first about diagonalizable matrices.

A diagonalizable matrix is a matrix which functions much like a diagonal one---one which is non-zero only on its diagonal. To consider a small example, multiplying an arbitrary 3-dimensional vector by a 3x3 diagonal matrix gives

    [ b1 0  0        [ x1          [ b1 * x1
      0  b2 0    *     x2     =      b2 * x2
      0  0  b3 ]       x3 ]          b3 * x3 ]
What we observe is that a diagonal matrix affects each element of the vector independently. Since a vector represented as a list of numbers merely reads out how "far along" the vector is with each basis vector of the space, this means that a diagonal matrix merely "stretches" vectors along the space's bases.

It's important to remember that vectors do not have a canonical representation. The vector [1 2 3] might be equivalent to [3 2 1]---we could simply be changing the bases, in this case by rearrangement. With this intuition in hand, (most) diagonalizable matrices are merely diagonal matrices in some other basis.

A diagonalizable matrix M is thus one such that there exist two matrices A and P such that M = P A inv(P), where A is diagonal and P corresponds to a "rotation". In other words, to act upon a vector x by M, to compute Mx, we first rotate with inv(P), apply a basis scaling, and then rotate back with P---we've effectively "stretched" the vector along a different basis.

If you think carefully about it, vectors which already lie along that "other" basis will be purely scaled, affected by only one component of the diagonal matrix. This satisfies the definition of eigenvector and the corresponding diagonal entry of A is the eigenvalue.
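A minimal sketch of that decomposition (numpy; the symmetric matrix below is an illustrative choice, for which P really is a rotation):

    import numpy as np

    M = np.array([[3.0, 1.0],
                  [1.0, 3.0]])        # an illustrative diagonalizable matrix

    eigvals, P = np.linalg.eig(M)     # columns of P are the eigenvectors
    A = np.diag(eigvals)              # the purely diagonal "stretching" part

    # M = P A inv(P): rotate into the eigenbasis, stretch, rotate back
    assert np.allclose(M, P @ A @ np.linalg.inv(P))

    # a vector already lying along an eigenvector is purely scaled
    v = P[:, 0]
    assert np.allclose(M @ v, eigvals[0] * v)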


So far this has been a description of "nice" matrices---nice linear transforms which can be fully described by scaling (or reflecting, consider a negative eigenvalue) along some choice of basis which may or may not be the one we're currently writing our vectors in. Matrices which are not "nice" are called "defective". We might call non-square matrices defective (since a diagonal along a rectangle doesn't "go all the way").

Defective matrices still have eigenvalues and eigenvectors, but they are non-unique or incomplete in some way that makes the transformation "confusable". In particular, there may be multiple decompositions M = P A inv(P) which all work. Or none at all. As a very simple example, consider
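A minimal sketch of such a case (assuming the intended example is the standard 2x2 shear, the usual defective matrix):

    import numpy as np

    # shear: (x, y) -> (x + y, y). Eigenvalue 1 appears twice, but there is
    # only one independent eigenvector direction, along (1, 0).
    S = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    eigvals, V = np.linalg.eig(S)
    print(eigvals)    # [1., 1.]
    print(V)          # both columns (numerically) point along (1, 0)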


Technically (0, 0) isn't an eigenvector---for this matrix there is only one eigenvector direction, and so it can't "scale" in two dimensions as it would need to in order to fit the intuition laid out above.


Defective matrices are rare. If you pick a random matrix it is highly unlikely (almost impossible) to be defective. Instead, you can think of these as ones where things line up "just right" to eliminate some information.

Geometrically, what occurs is that a defective matrix "shears" the space. Since we cannot describe a shearing motion in terms of a rotation+scaling+reflection, we no longer get the simple eigenvalue picture above.


It's worth noting that you can go quite far without leaving the world of matrices which have nice or mostly nice diagonalizations. As stated, random matrices are nearly always diagonalizable. You're more likely to see them in graph theory where structures of certain graphs induce "shearing".

That said, defective matrices can be nice still. For instance, diagonalizability is not necessary for invertibility---shearing transformations can be invertible. That's still a very "nice" matrix.

To consider this further, think about what happens to a diagonalizable matrix if you take one of the eigenvalues to 0. Suddenly, one "axis" of the stretch merely compresses all information away. We're now non-invertible.
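A quick check of that (a sketch; the matrix below is deliberately built with a zero eigenvalue):

    import numpy as np

    # build a matrix with eigenvalues 2 and 0: one axis is stretched,
    # the other is flattened away entirely
    P = np.array([[1.0, 1.0],
                  [1.0, -1.0]]) / np.sqrt(2)   # an orthonormal eigenbasis
    M = P @ np.diag([2.0, 0.0]) @ np.linalg.inv(P)

    print(np.linalg.det(M))          # ~0: singular, i.e. non-invertible
    print(np.linalg.matrix_rank(M))  # 1: a whole direction has been lost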

I really like your explanation of diagonalizable matrices! I've always seen explanations of how to determine if a matrix is diagonalizable, but hadn't run across the idea that they are just scaled diagonal matrices for a different basis.

I'm glad it's useful!

One piece of information that helped me understand eigenvectors and values was to learn that eigen is a German word meaning own as in characteristic or distinct.

"But... what if you can't do 'X' ? What if v, (0,0) and Av are not colinear?"

That's the point of the simulation -- to wiggle things and see what happens. You drag stuff in and out of colinearity on the third one down and see how repeated steps get pulled along the eigenspace, getting your steady state (I'd love to see an example using these sims to explain underdamped/overdamped systems).
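That pull toward the eigenspace is just power iteration; a small sketch of it (numpy; the matrix and starting point are illustrative):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])     # dominant eigenvalue 3, eigenvector along (1, 1)

    v = np.array([1.0, 0.0])       # an arbitrary starting point
    for _ in range(20):
        v = A @ v                  # one "step" of the transformation
        v = v / np.linalg.norm(v)  # renormalise so the vector doesn't blow up

    print(v)   # roughly (0.707, 0.707): pulled onto the dominant eigendirection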

I think the meat isn't in the words, but in the play.

> That's the point of the simulation -- to wiggle things and see what happens.

Unfortunately, playing with the inputs to the simulation does not reveal to me any simple mental model of what's happening. About all I can figure out from the simulation is that the X- and Y-coordinates of each input and output are correlated (if I move a2 to the right, Av will move to the right, and so on). The relationship between A1, A2 and S1, S2 is not clear to me.

I have to agree with the parent that it doesn't explain much. It only states things and allows you to visualize the mathematical consequences. The visualization however does not assist me in understanding the model. I'm limited to understanding it from the pure mathematics.

A core part of explanation is allowing people to tie new knowledge to existing knowledge. Visualizations can be helpful because they allow people to take advantage of the inherent human visual system. For example, you might demonstrate how matrices can be used to transform object coordinates in a 3D system by displaying a 3D cube and allowing people to fiddle with the parameters. You could provide individual controls for translation and pitch/yaw/roll and show how those feed into matrix cells. By dragging one parameter, a person would see the cube begin to rotate, for example. That's an example of the kind of intuitive explanation that this page is missing; I can't connect the visualizations to any preexisting mental model of what should be happening.
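Something along these lines (a sketch only; the Euler-angle construction and the cube are illustrative, not the article's code):

    import numpy as np

    def rotation(yaw, pitch, roll):
        """3x3 rotation matrix from yaw/pitch/roll angles (radians)."""
        cy, sy = np.cos(yaw), np.sin(yaw)
        cp, sp = np.cos(pitch), np.sin(pitch)
        cr, sr = np.cos(roll), np.sin(roll)
        Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])    # yaw (about z)
        Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])    # pitch (about y)
        Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])    # roll (about x)
        return Rz @ Ry @ Rx

    # unit-cube corners as columns; dragging a slider would just rebuild R
    cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)]).T
    rotated = rotation(0.3, 0.1, 0.0) @ cube   # every corner moves together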

True, but it doesn't supply the analytical tools so that you can find the correct A to produce an eigenvector without iteration to reduce error. Having said that, you can reduce the linear system by hand if you are reasonably au fait with matrices etc. and derive it yourself; but I suspect this "tutorial" is probably most beneficial for those who are struggling, i.e. the very people who wouldn't be able to leap to a solution on their own.

Very beautiful graphs, but I don't think it's going to make people understand anything. I would start with the problem: easily compute sums of dependent values, then show a naive computation, then use matrices, vectors and eigenvalues to come to a solution, and only then show a graphical representation of the steps performed.

I'm surprised that this post isn't following this method, because I've come to think it's the standard way of explaining scientific things in the US.

Eigenvalues and eigenvectors are one of those things that pop up in a million places because they're so useful, but to recognize where they may be useful you need intuition as to what they're doing.

One of my biggest hurdles learning linear algebra was getting that intuition. This "standard way of explaining scientific things" never built that intuition for me, only forced the mechanics of the computation into pencilized muscle memory.

Don't get me wrong, you need that muscle memory in practice. But without the intuition, your muscle memory is always going to be inferior to a couple commands in matlab. Stuff like this visualization builds the intuitive knowledge -- you see, you explore, you wiggle a few things and see what happens when you go in and out of the sweet spot.

Play (and simulation) is an extremely effective way to build intuition, and one I'd love to see more. These guys are doing an awesome job at these kinds of simulations -- their markov chain one was fantastic too http://setosa.io/ev/markov-chains/

>Stuff like this visualization builds the intuitive knowledge -- you see, you explore, you wiggle a few things and see what happens when you go in and out of the sweet spot.

Disagree. The linked article does a very poor job of explaining anything, much less conveying intuition. The lambda values remain even if the 3 points are not collinear, thus contradicting the first part of the article.

"Very beautiful graphs, but i don't think it's going to make people understand anything."

Incorrect. This is the first time I've ever really understood eigen*s

From what you've said I think you mean that it isn't the first time you've tried to understand them, or been exposed to their definition. In which case yes, absolutely the graphical representation of basis change presented here is very helpful. I was more talking about people who would start with this page. What I meant was that the fact that the page starts right away with basis change, and only introduces practical uses of it later, seems awkward.

Insanity. 90% of linear algebra can be understood from graphically examining affine transforms. Doing visuals last is what leads people to conceive of matrix-vector multiplication as a set of meaningless dot products as opposed to just multiplying the basis vectors by the matching coordinate and adding them all up.

I think you're missing the first step, which is formalizing a problem using matrices and vectors in the first place.

Correct me if I'm wrong, but I can easily imagine that people had been solving those types of problems manually for centuries before finding the "shortcut" of representing transforms using matrices and figuring out the rules of matrix and vector multiplication.

Still, I think you're making a good point. So maybe the correct process would be: problem statement, naive "manual" solution, graphical representation, and then matrix and vector formalization?

Well, I guess every teacher ever loses the real first step, which is getting a problem to solve.

Once you need to make a basis change, or you need to discover a "natural basis", all of linear algebra becomes easy and very intuitive.

Very cool, but as a layman I was very confused by the description of eigenspaces and the S1/S2 lines. I'm just guessing here (reasoning below) but I'd like to suggest phrasing like:

"Eigenspaces are special lines, where any starting-point along them yields an eigenvalue that lands back on the same line. In these examples two exist, labeled, S1 and S2."

"Eigenspaces show where there is 'stability' from repeated applications of the eigenvector. Some act like 'troughs' which attract nearby series of points (S1) while others are like hills (S2) where any point even slightly outside the stable peak yields eigenvalues further away.


Original post / detailed-reaction:

> First, every point on the same line as an eigenvector is another eigenvector. That line is an eigenspace.

At first I thought this statement-of-fact meant that the whole tweakable quadrant of the X/Y plot (at a minimum) is an unbroken 2D Eigenspace, because every point within it can be "covered" by a dashed line (a 2D "vector") if I pick the appropriate start-point.

However, the last sentence also says eigenspaces are (despite the "space" in their name) lines, which throws the earlier interpretation into doubt.

> As you can see below, eigenspaces attract this sequence

S1 and S2 were displayed earlier, but not explained; now this section implies that those lines are the Eigenspaces? If so, what is the difference between S1 and S2? Playing with the chart, I assume they are the "forward" and "reverse" for repeated applications of the transformation.

hey yeah good point. I'll add a reference to the labels, which were added last minute.

I have no idea what eigenvectors or eigenvalues are, so this just confused me more. To be fair, I think the author does assume some basic math understanding beforehand, though.

I like the visualization. But there seems to be an error: the non-diagonal elements of the Markov matrix need to be interchanged. You can see this by setting p=1 and q=0. Their formula would result in a total population of 2*California after one step, which is clearly larger than California+New York.

good catch my friend

The interactive graph in the section "Complex eigenvalues" has a repeatable crash bug in Chrome 39 on Win 7. There are a number of ways to trigger it, the easiest of which is to adjust a1 and a2 such that both have positive x and y values and the resulting line from v to Av has a slope of approximately 1.

This wikipedia graphic gives a pretty good graphical explanation of what eigenvalues do and what eigenvectors are:


First time I've actually sort of understood eigenvectors. Linear algebra was actually the class that made me hate math, after years of loving it in secondary education. Not everyone has the benefit of a good teacher, and the tools that exist now don't help you to self-learn much.

As a counterpoint to most of the comments, let me just say: this is fantastic.

(Nothing wrong with constructive criticism, which most comments are, but it's also nice to just say thanks as well.)

I have to admit I hated the term Eigenvector for two semesters of college and it nearly caused me to drop mathematics altogether. This explanation is very good and helps visualize some of the things I was missing. Apologies to the fantastic professors I had who were talking over my head for 16 weeks.

>> "It turns out that a matrix like A, whose rows add up to zero (try it!), is called a Markov matrix, ..."

Oops, you mean the rows add to one.

I hate to nitpick, but, additionally, numbers in the matrix can't be negative.

Also, it's not just that 1 is an eigenvalue, it is that 1 is the largest eigenvalue. This is significant, because it implies that all other components will die out in time.

The cited source says that in a Markov matrix the columns sum to 1, which Wikipedia reports is a "left stochastic matrix", whilst the rows summing to 1 makes it a "right stochastic matrix". I'm not sure on the definition of "Markov matrix" specifically, but then it doesn't seem that http://mathworld.wolfram.com/StochasticMatrix.html knows either. The term returns no search results at http://www.encyclopediaofmath.com/.

http://blog.stata.com/2011/03/09/understanding-matrices-intu... came up here before and helped me visualise eigen-stuff.

Whether it's rows vs. columns is just a matter of definition, depending on whether you want to propagate the state by left multiplication or right multiplication. The typical definition (in the context of Markov chains) has the rows summing to one. But how to do it is up to the author of the page. Just be consistent.

In my own coursework, that matrix is called a "stochastic matrix", btw, not a "Markov matrix", but again, it's just definitional and not of interest in a simple article like this.

My main point is that summing to one is what you want, and summing to zero is crazy.
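A quick sketch of that convention point (a hypothetical 2-state chain with made-up probabilities):

    import numpy as np

    # row-stochastic: each ROW sums to 1, propagate with the state on the left
    P_row = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
    x = np.array([0.5, 0.5])
    x_next = x @ P_row             # total stays 1.0

    # column-stochastic: each COLUMN sums to 1, propagate with the matrix on the left
    P_col = P_row.T
    y_next = P_col @ np.array([0.5, 0.5])

    assert np.allclose(x_next, y_next)       # same chain, different convention
    assert np.isclose(x_next.sum(), 1.0)     # summing to one is preserved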

Indeed, I've heard of stochastic matrices - just wasn't sure they were coterminous in definition with "Markov matrix". Thanks for clarifying though.

Wow, almost no positive feedback here? I think the article assumes a certain audience, and for me, this brought a great insight I never got in our college courses.

The graphics are nice but the explanation is just terrible. There is a huge gap between the first part explaining vectors and the part explaining eigenvectors.

"If you can draw a line through (0,0), v and Av, then Av is just v multiplied by a number λ; that is, Av=λv."

That makes no sense. How do you draw a line through a point to "v and Av"? What does "v and Av" even mean in that context?

If you can draw a line through the three points: (0,0), v, and Av

That assumption should have been stated in the explanation and the graphics need to reflect that constraint, right? Or am I still missing something here?

When I see faces from different people that look alike, I wonder if they have similar eigenfaces?

I would not be surprised if FB had a class called Eigenperson, somewhere in their code or database.

I'm pretty sure they do: http://en.wikipedia.org/wiki/Eigenface

Did you send this to Malcolm Gladwell? :)

Igon send it to him if you can't :)

It takes me an enormous amount of effort to read this font. I need to squint my eyes and had to zoom my browser window to about 200% and then scroll horizontally to make my way through the paragraphs.

Try changing the font-weight instead.

I absolutely hate it when websites use fonts this thin, too.

Are you talking about doing that through the CSS? What method do you use? I really want to read this content but I find it difficult.

Yeah, I changed the CSS with the inspector. I don't know if it's because I have a low resolution screen, but I can't actually read it at all without doing this.

I just tried to figure out the simplest rigorous explanation of linear transformations. Here's one in terms of straight lines. Let's say we have a transformation of the 2D plane, i.e. a mapping from points to points. We will call that a "linear transformation" if these conditions are satisfied:

1) The point (0, 0) gets mapped to itself.

2) Straight lines get mapped to straight lines, though maybe pointing in a different direction.

3) Pairs of parallel straight lines get mapped to pairs of parallel straight lines.

Hence the name "linear transformation" :-) We can see that all straight lines going through (0, 0) get mapped to straight lines going through (0, 0). Let's consider just those straight lines going through (0, 0) that get mapped to themselves. There are four possibilities:

1) There are no such lines, e.g. if the transformation is a rotation.

2) There is one such line, e.g. if the transformation is a skew.

3) There are two such lines, e.g. if the transformation is a stretch along some axis.

4) There are more than two such lines. In this case, you can prove that in fact all straight lines going through (0, 0) are mapped to themselves, and the transformation is a scaling.

Now let's consider what happens within a single such line that gets mapped to itself. You can prove that within a single such line, the transformation becomes a scaling by some constant factor. (That factor could also be negative, which corresponds to flipping the direction of the line.) Let's call these factors the "eigenvalues", or "own values" of the transformation.

Now let's define the "eigenspaces", or "own spaces" of the transformation, corresponding to each eigenvalue. An eigenspace is the set of all points in the 2D plane for which the transformation becomes scaling by an eigenvalue. Let's see what happens in each of the cases:

1) In case 1, there are no eigenspaces and no eigenvalues.

2) In case 2, there is only one eigenspace, which is the straight line corresponding to the single eigenvalue.

3) In case 3, it pays off to be careful! First we need to check what happens if the two eigenvalues are equal. If that happens, it's easy to prove that we end up in case 4 instead. Otherwise there are two different eigenvalues, and their eigenspaces are two different straight lines.

4) In case 4, the eigenspace is the whole 2D plane.

In this way, eigenvalues and eigenspaces are unambiguously geometrically defined, and don't require coordinates or matrices.

Now, what are "eigenvectors", or "own vectors" of the transformation? Let's say that an "eigenvector" is any vector for which our transformation is a scaling. In other words, an "eigenvector" is a vector from (0, 0) to any point in an eigenspace. The disadvantage is that it involves an arbitrary choice. The advantage is that eigenvectors can be specified by coordinates, so you can find them by computational methods.

Does that make sense?
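A quick numerical check of the four cases (a sketch; the example matrices are illustrative choices):

    import numpy as np

    examples = {
        "rotation (no invariant line)":       np.array([[0., -1.], [1.,  0.]]),
        "skew / shear (one invariant line)":  np.array([[1.,  1.], [0.,  1.]]),
        "stretch (two invariant lines)":      np.array([[2.,  0.], [0., 0.5]]),
        "scaling (every line invariant)":     np.array([[3.,  0.], [0.,  3.]]),
    }

    for name, M in examples.items():
        eigvals, _ = np.linalg.eig(M)
        print(name, "-> eigenvalues:", np.round(eigvals, 3))

    # rotation: complex eigenvalues, so no real invariant line (case 1)
    # skew:     eigenvalue 1 repeated, a single eigenvector direction (case 2)
    # stretch:  eigenvalues 2 and 0.5, two distinct directions (case 3)
    # scaling:  eigenvalue 3 repeated, every direction is an eigenvector (case 4)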

