The first sentence just describes the utility of the Eigens (so no explanation there). The next lays out the setting for the diagram. And the third says, "if we can do X, then v is an eigenvector and λ an eigenvalue". But... what if you can't do "X"? What if v, (0,0), and Av are not collinear?
The skeleton of a great explanation is there, but the meat isn't there yet. A few more sentences would go a long way in making this better.
I appreciate the OP's effort, and I hope this will come across as constructive criticism.
The eigenvectors are the “axes” of the transformation represented by the matrix.
Consider spinning a globe (the universe of vectors): every location faces a new direction, except the poles.
An “eigenvector” is an input that doesn’t change direction when it’s run through the matrix (it points “along the axis”). And although the direction doesn’t change, the size might. The eigenvalue is the amount the eigenvector is scaled up or down when going through the matrix.
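A minimal numerical sketch of that definition (my own example matrix, nothing from the linked article), assuming NumPy:

    import numpy as np

    # A matrix that stretches by 3 along x and by 2 along y.
    A = np.array([[3.0, 0.0],
                  [0.0, 2.0]])

    v = np.array([1.0, 0.0])   # points "along an axis" of the transformation
    w = np.array([1.0, 1.0])   # does not

    print(A @ v)   # [3. 0.] -- same direction as v, just scaled by 3 (the eigenvalue)
    print(A @ w)   # [3. 2.] -- direction changed, so w is not an eigenvector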
(Shameless plug, more here: http://betterexplained.com/articles/linear-algebra-guide/)
If you have a linear transformation from one vector space to another, the kernel is the part of the domain that maps to the zero vector in the range. Intuitively, if you have a function that satisfies certain properties, the kernel is the part that it collapses to "zero".
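A small illustration of that collapse (the matrix is my own example; scipy.linalg.null_space computes a basis for the kernel):

    import numpy as np
    from scipy.linalg import null_space

    # This map collapses the z-axis: every vector (0, 0, z) goes to zero.
    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0]])

    print(null_space(A))                    # one basis vector, spanning the z-axis
    print(A @ np.array([0.0, 0.0, 5.0]))    # [0. 0. 0.] -- collapsed to "zero"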
"In the 18th century Euler studied the rotational motion of a rigid body and discovered the importance of the principal axes. Lagrange realized that the principal axes are the eigenvectors of the inertia matrix".
So the eigenvectors are like the directions that describe the orientation of an airplane: roll, pitch, and yaw.
A diagonalizable matrix is a matrix which functions much like a diagonal one—one which is non-zero only on its diagonal. To consider a small example, multiplying an arbitrary 3-dimensional vector by a 3x3 diagonal matrix simply scales each component:

    [ b1  0   0  ]   [ x1 ]   [ b1 * x1 ]
    [ 0   b2  0  ] * [ x2 ] = [ b2 * x2 ]
    [ 0   0   b3 ]   [ x3 ]   [ b3 * x3 ]
It's important to remember that vectors do not have a canonical representation. The vector [1 2 3] might be equivalent to [3 2 1]---we could simply be changing the basis, in this case by rearrangement. With this intuition in hand, (most) diagonalizable matrices are merely diagonal matrices in some other basis.
A diagonalizable matrix M is thus one such that there exist two matrices A and P such that M = P A inv(P), where A is diagonal and P corresponds to a "rotation". In other words, to act upon a vector x by M, to compute Mx, we first rotate with inv(P), apply the diagonal scaling, and then rotate back with P---we've effectively "stretched" the vector along a different basis.
If you think carefully about it, vectors which already lie along that "other" basis will be purely scaled, affected by only one component of the diagonal matrix. This satisfies the definition of eigenvector and the corresponding diagonal entry of A is the eigenvalue.
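A quick sketch of that picture with NumPy (the numbers are mine; P here is a plain rotation, so inv(P) is just its transpose):

    import numpy as np

    theta = 0.5
    P = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # the "rotation"
    A = np.diag([3.0, 0.5])                           # the diagonal part
    M = P @ A @ np.linalg.inv(P)                      # M = P A inv(P)

    # A vector already lying along the "other" basis (a column of P) is purely scaled.
    v = P[:, 0]
    print(M @ v)        # equals 3.0 * v, up to rounding
    print(3.0 * v)

    # np.linalg.eig recovers the same eigenvalues.
    print(np.linalg.eig(M)[0])   # 3.0 and 0.5, in some order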
So far this has been a description of "nice" matrices---nice linear transforms which can be fully described by scaling (or reflecting, consider a negative eigenvalue) along some choice of basis which may or may not be the one we're currently writing our vectors in. Matrices which are not "nice" are called "defective". We might call non-square matrices defective (since a diagonal along a rectangle doesn't "go all the way").
Defective matrices still have eigenvalues and eigenvectors, but they are non-unique or incomplete in some way that makes the transformation "confusable". In particular, there may be multiple decompositions M = P A inv(P) which all work. Or none at all. As a very simple example, consider a shear (sketched below).
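A minimal check, taking the classic 2x2 shear as the example (my choice of matrix, in the spirit of the "shearing" discussed below):

    import numpy as np

    M = np.array([[1.0, 1.0],
                  [0.0, 1.0]])   # a shear: defective

    eigvals, eigvecs = np.linalg.eig(M)
    print(eigvals)   # [1. 1.] -- a repeated eigenvalue
    print(eigvecs)   # both columns are (numerically) along the x-axis:
                     # only one independent eigendirection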
Technically (0, 0) isn't an eigenvector---for this matrix there is only one eigenvector direction, and so it can't "scale" in two dimensions as it would need to in order to fit the intuition laid out above.
Defective matrices are rare. If you pick a random matrix it is highly unlikely (almost impossible) to be defective. Instead, you can think of these as ones where things line up "just right" to eliminate some information.
Geometrically, what occurs is that a defective matrix "shears" the space. Since we cannot describe a shearing motion in terms of a rotation+scaling+reflection then we no longer get the simple eigenvalue picture above.
It's worth noting that you can go quite far without leaving the world of matrices which have nice or mostly nice diagonalizations. As stated, random matrices are nearly always diagonalizable. You're more likely to see them in graph theory where structures of certain graphs induce "shearing".
That said, defective matrices can be nice still. For instance, diagonalizability is not necessary for invertibility---shearing transformations can be invertible. That's still a very "nice" matrix.
To consider this further, think about what happens to a diagonalizable matrix if you take one of the eigenvalues to 0. Suddenly, one "axis" of the stretch merely compresses all information away. We're now non-invertible.
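Continuing the earlier sketch (same assumed rotation P), sending one eigenvalue to 0 looks like this numerically:

    import numpy as np

    theta = 0.5
    P = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    M = P @ np.diag([3.0, 0.0]) @ np.linalg.inv(P)   # one eigenvalue taken to 0

    print(np.linalg.det(M))           # ~0: not invertible
    print(np.linalg.matrix_rank(M))   # 1 -- a whole axis of information is gone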
That's the point of the simulation -- to wiggle things and see what happens. You drag stuff in and out of collinearity on the third one down and see how repeated steps get pulled along the eigenspace, getting your steady state (I'd love to see an example using these sims to explain underdamped/overdamped systems).
I think the meat isn't in the words, but in the play.
Unfortunately, playing with the inputs to the simulation does not reveal to me any simple mental model of what's happening. About all I can figure out from the simulation is that the X- and Y- coordinates of each input and output are correlated (if I move a2 to the right, Av will move to the right, and so on). The relationship between A1, A2 and S1, S2 is not clear to me.
I have to agree with the parent that it doesn't explain much. It only states things and allows you to visualize the mathematical consequences. The visualization however does not assist me in understanding the model. I'm limited to understanding it from the pure mathematics.
A core part of explanation is allowing people to tie new knowledge to existing knowledge. Visualizations can be helpful because they allow people to take advantage of the inherent human visual system. For example, you might demonstrate how matrices can be used to transform object coordinates in a 3D system by displaying a 3D cube and allowing people to fiddle with the parameters. You could provide individual controls for translation and pitch/yaw/roll and show how those feed into matrix cells. By dragging one parameter, a person would see the cube begin to rotate, for example. That's an example of the kind of intuitive explanation that this page is missing; I can't connect the visualizations to any preexisting mental model of what should be happening.
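As a tiny sketch of that idea (yaw only; the yaw_matrix helper and the cube coordinates are my own, hypothetical):

    import numpy as np

    def yaw_matrix(angle):
        """Rotation about the z-axis; the slider value feeds directly into four cells."""
        c, s = np.cos(angle), np.sin(angle)
        return np.array([[c,  -s,  0.0],
                         [s,   c,  0.0],
                         [0.0, 0.0, 1.0]])

    # The 8 corners of a unit cube, one per row.
    cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                    dtype=float)

    # "Dragging" the yaw parameter rotates every corner at once.
    rotated = cube @ yaw_matrix(np.pi / 6).T
    print(rotated.round(3))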
I'm surprised that this post isn't following this method, because I've come to think it's the standard way of explaining scientific things in the US.
One of my biggest hurdles learning linear algebra was getting that intuition. This "standard way of explaining scientific things" never built that intuition for me, only forced the mechanics of the computation into pencilized muscle memory.
Don't get me wrong, you need that muscle memory in practice. But without the intuition, your muscle memory is always going to be inferior to a couple commands in matlab. Stuff like this visualization builds the intuitive knowledge -- you see, you explore, you wiggle a few things and see what happens when you go in and out of the sweet spot.
Play (and simulation) is an extremely effective way to build intuition, and one I'd love to see more. These guys are doing an awesome job at these kinds of simulations -- their markov chain one was fantastic too http://setosa.io/ev/markov-chains/
Disagree. The linked article does a very poor job of explaining anything, much less conveying intuition. The lambda values remain even if the 3 points are not collinear, thus contradicting the first part of the article.
Incorrect. This is the first time I've ever really understood eigen*s
Correct me if I'm wrong, but I can easily imagine that people had been solving those types of issues manually for centuries before finding the "shortcut" of representing transforms using matrices and figuring out the rules of matrix & vector multiplication.
Still, I think you're making a good point. So maybe the correct process would be: problem statement, naive "manual" solution, graphical representation, and then matrix & vector formalism?
Once you need to make a basis change, or you need to discover a "natural basis", all of linear algebra becomes easy and very intuitive.
"Eigenspaces are special lines, where any starting-point along them yields an eigenvalue that lands back on the same line. In these examples two exist, labeled, S1 and S2."
"Eigenspaces show where there is 'stability' from repeated applications of the eigenvector. Some act like 'troughs' which attract nearby series of points (S1) while others are like hills (S2) where any point even slightly outside the stable peak yields eigenvalues further away.
Original post / detailed-reaction:
> First, every point on the same line as an eigenvector is another eigenvector. That line is an eigenspace.
At first I thought this statement-of-fact meant that the whole tweakable quadrant of the X/Y plot (at a minimum) is an unbroken 2D Eigenspace, because every point within it can be "covered" by a dashed line (a 2D "vector") if I pick the appropriate start-point.
However, the last sentence also says eigenspaces are (despite the "space" in their name) lines, which throws the earlier interpretation into doubt.
> As you can see below, eigenspaces attract this sequence
S1 and S2 were displayed earlier but not explained; now this section implies that those lines are the Eigenspaces? If so, what is the difference between S1 and S2? Playing with the chart, I assume they are the "forward" and "reverse" for repeat-applications of the transformation.
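A small numerical sketch of that guess (the matrix and starting point are my own, not the article's A or S1/S2): repeated application pulls an arbitrary point onto the dominant eigenline, which is what an "attracting" eigenspace looks like.

    import numpy as np

    A = np.array([[1.0, 0.3],
                  [0.3, 1.0]])     # eigenvalues 1.3 and 0.7

    v = np.array([1.0, -0.8])      # an arbitrary starting point
    for _ in range(20):
        v = A @ v                  # repeat-application of the transformation
    print(v / np.linalg.norm(v))   # essentially parallel to the eigenvector for 1.3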
(Nothing wrong with constructive criticism, which most comments are, but it's also nice to just say thanks as well.)
Oops, you mean the rows add to one.
I hate to nitpick, but, additionally, numbers in the matrix can't be negative.
Also, it's not just that 1 is an eigenvalue, it is that 1 is the largest eigenvalue. This is significant, because it implies that all other components will die out in time.
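A quick check of both points (the numbers are my own example of a row-stochastic matrix):

    import numpy as np

    # Non-negative entries, each row sums to 1.
    P = np.array([[0.9, 0.1],
                  [0.5, 0.5]])

    print(np.linalg.eigvals(P))   # 1.0 and 0.4 -- the largest eigenvalue is exactly 1

    # Repeated application kills the 0.4-component, leaving only the
    # steady state associated with eigenvalue 1.
    x = np.array([1.0, 0.0])      # start entirely in state 0
    for _ in range(50):
        x = x @ P                 # row vector times matrix, since rows sum to 1
    print(x)                      # ~[0.833 0.167], the stationary distribution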
http://blog.stata.com/2011/03/09/understanding-matrices-intu... came up here before and helped me visualise eigen-stuff.
In my own coursework, that matrix is called a "stochastic matrix", btw, not a "Markov matrix", but again, it's just definitional and not of interest in a simple article like this.
My main point is that summing to one is what you want, and summing to zero is crazy.
"If you can draw a line through (0,0), v and Av, then Av is just v multiplied by a number λ; that is, Av=λv."
That makes no sense. How do you draw a line through a point to "v and Av"? What does "v and Av" even mean in that context?
Igon send it to him if you can't :)
I absolutely hate it when websites use fonts this thin, too.
1) The point (0, 0) gets mapped to itself.
2) Straight lines get mapped to straight lines, though maybe pointing in a different direction.
3) Pairs of parallel straight lines get mapped to pairs of parallel straight lines.
Hence the name "linear transformation" :-) We can see that all straight lines going through (0, 0) get mapped to straight lines going through (0, 0). Let's consider just those straight lines going through (0, 0) that get mapped to themselves. There are four possibilities:
1) There are no such lines, e.g. if the transformation is a rotation.
2) There is one such line, e.g. if the transformation is a skew.
3) There are two such lines, e.g. if the transformation is a stretch along some axis.
4) There are more than two such lines. In this case, you can prove that in fact all straight lines going through (0, 0) are mapped to themselves, and the transformation is a scaling.
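One example matrix per case (my own choices) makes the distinction easy to poke at numerically: the rotation has no real eigenvalues, the skew has a repeated one with a single invariant line, and so on.

    import numpy as np

    cases = {
        "rotation (no invariant lines)":   np.array([[0.0, -1.0], [1.0,  0.0]]),
        "skew / shear (one line)":         np.array([[1.0,  1.0], [0.0,  1.0]]),
        "stretch (two lines)":             np.array([[2.0,  0.0], [0.0,  0.5]]),
        "scaling (every line invariant)":  np.array([[3.0,  0.0], [0.0,  3.0]]),
    }

    for name, M in cases.items():
        print(name, "-> eigenvalues:", np.linalg.eigvals(M))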
Now let's consider what happens within a single such line that gets mapped to itself. You can prove that within a single such line, the transformation becomes a scaling by some constant factor. (That factor could also be negative, which corresponds to flipping the direction of the line.) Let's call these factors the "eigenvalues", or "own values" of the transformation.
Now let's define the "eigenspaces", or "own spaces" of the transformation, corresponding to each eigenvalue. An eigenspace is the set of all points in the 2D plane for which the transformation becomes scaling by an eigenvalue. Let's see what happens in each of the cases:
1) In case 1, there are no eigenspaces and no eigenvalues.
2) In case 2, there is only one eigenspace, which is the straight line corresponding to the single eigenvalue.
3) In case 3, it pays off to be careful! First we need to check what happens if the two eigenvalues are equal. If that happens, it's easy to prove that we end up in case 4 instead. Otherwise there are two different eigenvalues, and their eigenspaces are two different straight lines.
4) In case 4, the eigenspace is the whole 2D plane.
In this way, eigenvalues and eigenspaces are unambiguously geometrically defined, and don't require coordinates or matrices.
Now, what are "eigenvectors", or "own vectors" of the transformation? Let's say that an "eigenvector" is any vector for which our transformation is a scaling. In other words, an "eigenvector" is a vector from (0, 0) to any point in an eigenspace. The disadvantage is that it involves an arbitrary choice. The advantage is that eigenvectors can be specified by coordinates, so you can find them by computational methods.
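For instance, with NumPy (my own example matrix), np.linalg.eig hands back one such arbitrary choice per eigenvalue, and you can check that each is purely scaled:

    import numpy as np

    M = np.array([[2.0, 1.0],
                  [1.0, 2.0]])   # a stretch along the two diagonal lines

    eigvals, eigvecs = np.linalg.eig(M)
    for lam, v in zip(eigvals, eigvecs.T):   # eigenvectors are the columns
        print(lam, v, M @ v)                 # M @ v is just lam * v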
Does that make sense?