After reading the article, I have but one question: What are eigen values?
A quick lookup on Wikipedia reveals that they are used in linear algebra and, for example, matrix transformations. I think this article failed by trying to relate eigenvalues to everyday scenarios, similar to when beginning calculus books cite the total pressure on an underwater dam door or the mass of a rod as examples of the usefulness of definite integrals. In those cases, you end up calculating the area under a curve, but the curve happens to be the graph of a constant function or a first-order polynomial. Calculating such areas can easily be done by using the elementary formula for the area of a trapezoid.
In linear algebra, for a given linear transformation M (a certain kind of matrix which generally represents some operation), an eigenvalue of M is a scalar c such that Mv = cv for some non-zero vector v (called an eigenvector), where the multiplication on the left is matrix multiplication and the multiplication on the right is multiplication of v by the scalar c.
The definition is significant because it says that for a certain vector v, the transformation M, no matter how complicated it may be, just scales v by a factor of c. This is useful, for instance, if you want to determine an axis by which to evaluate the range of the transformation, because by choosing an eigenvector, you are choosing a "simple" or "natural" perspective from which to evaluate the range.
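If seeing the definition executed helps, here is a minimal sketch (Python and numpy assumed; neither is part of the original article) that just verifies Mv = cv for a small symmetric matrix:

    import numpy as np

    # A small symmetric matrix; its eigenvalues are 3 and 1, with eigenvectors
    # along the diagonals y = x and y = -x.
    M = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    values, vectors = np.linalg.eig(M)   # eigenvectors come back as columns

    for c, v in zip(values, vectors.T):
        # The defining property: applying M only scales v by the factor c.
        assert np.allclose(M @ v, c * v)
        print(f"eigenvalue {c:.1f}, eigenvector {np.round(v, 3)}")

The assert is exactly the sentence above: whatever M does to other vectors, these particular v's only get scaled.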
This is the problem I had with Linear Algebra. The first half of the class felt like I was just being drilled on definitions. But once all the definitions click, it's rather simple.
The best thing about my linear algebra class was that the professor warned us about this up front. Yet I still was not prepared for the onslaught of new words, and the bad part of the class was that they were all defined in terms of more mathematical words, not concepts like this rubber band example.
That was the point of the "What are eigen values?" article. It was to give the intuition behind the scary mathematical definition.
I find this helpful, as it answers the all-important questions: 1) Why should I care? and 2) What is the basic problem that the scary math is trying to solve?
With answers to those questions in hand, you can return to the scary math and work out how it maps to those answers.
Ok, others have tried a little bit of the math, but this is a difficult format for that. So here is a very intuitive (i.e. hand-wavy) way to think about the type of relationships we are talking about.
Suppose you have a (finite) cloud of points in three dimensions. For example, a bunch of GPS measurements. Let's just imagine that they are roughly elliptical, stretched out like a football. For simplicity we'll subtract the mean, so our co-ordinates are measured from the middle of this cloud.
I can take this set of (x,y,z) points and find the eigen (which means "characteristic") vectors and values for this set of points. The first eigenvector will point along the long axis of the "football" (right through the pointy bits) because this is, in some sense, the strongest single direction that is represented. It won't be perfectly aligned, since we only have a finite number of points...
The second and third eigenvectors will between them form a plane perpendicular to this. So essentially what you've done is find a co-ordinate system that is aligned with the data (modulo some details, this is what a procedure called Principal Component Analysis (PCA) does).
Now that's the vectors; what about the values? The eigenvalues tell you the relative importance of each direction. If you stretch the "football" out, the first eigenvalue, the one associated with the first eigenvector pointing along the long axis, will be larger relative to the others. If you use a basketball, they'll all be the same magnitude.
Does that help?
The idea generalizes a lot from what I've told you, but there are a ton of places you can look up the details if you'd like.
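To make the hand-waving concrete, here's a minimal numpy sketch of the football cloud (the 5x stretch and the sample size are invented for illustration):

    import numpy as np

    rng = np.random.default_rng(0)

    # A "football": a Gaussian blob stretched 5x along the x axis.
    points = rng.normal(size=(1000, 3)) * [5.0, 1.0, 1.0]

    points = points - points.mean(axis=0)      # subtract the mean
    cov = np.cov(points, rowvar=False)         # 3x3 covariance of the cloud
    values, vectors = np.linalg.eigh(cov)      # eigh because covariances are symmetric

    order = np.argsort(values)[::-1]           # sort from strongest direction down
    print("relative importance:", np.round(values[order], 2))   # roughly 25, 1, 1
    print("long axis:", np.round(vectors[:, order[0]], 2))      # roughly [1, 0, 0], up to sign

That's essentially the PCA mentioned above: the eigenvectors are the new axes and the eigenvalues are how stretched the cloud is along each of them.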
So there's one part of the explanation on the page that I don't quite get: she uses the example of a coin being rotated 360 degrees about some axis as leaving all possible vectors as eigenvectors:
"If you rotate a coin by 360 degrees you preserve all directions and so each direction is an eigenvector. Because no stretching has occurred, all of these eigenvectors have eigenvalue 1. Rotating the coin by 60 degrees destroys all directions and this transformation has no eigenvectors or eigenvalues at all."
But in the 60-degree case, why isn't the axis of rotation an eigenvector? Its direction isn't "destroyed," is it?...
Because the original vector is not pointing in the same direction as the final vector (it is at an angle of 60 degrees to it). If you do a 360 degree rotation you get back to where you started. So any number of successive full rotations will have eigenvectors, each with eigenvalue 1. If you do a 180 degree rotation, the resulting vector will be pointing in the opposite direction, and will have eigenvalue -1.
You are doing a rotation in 3 dimensions. Surely the original example meant two dimensions. In 3 dimensions, the axis of rotation is indeed an eigenvector.
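Both cases are easy to check numerically; here's a quick sketch (numpy assumed, and taking the 3D rotation to be about the z axis):

    import numpy as np

    def rot2d(deg):
        t = np.radians(deg)
        return np.array([[np.cos(t), -np.sin(t)],
                         [np.sin(t),  np.cos(t)]])

    # In 2D: 60 degrees gives a complex-conjugate pair (no real eigenvectors),
    # 180 degrees gives -1, and 360 degrees is the identity, so everything is 1.
    for deg in (60, 180, 360):
        print(deg, np.round(np.linalg.eigvals(rot2d(deg)), 3))

    # In 3D, a 60-degree rotation about the z axis leaves the axis itself fixed,
    # so [0, 0, 1] is an eigenvector with eigenvalue 1.
    R = np.eye(3)
    R[:2, :2] = rot2d(60)
    values, vectors = np.linalg.eig(R)
    print(np.round(values, 3))   # 0.5 +/- 0.866i, plus exactly 1
    print(np.round(vectors[:, np.isclose(values, 1)].real.ravel(), 3))  # the rotation axis, up to sign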
Eigenvectors are "important" directions with respect to some linear transform (operator). The eigenvalues are constant scaling factors that relate input vectors (inputs to the operator) to output vectors.
Knowing that an operator merely scales (but doesn't rotate) certain input vectors is an important property to characterize it.
Imagine some matrix describing the stresses (strains?) in a continuous medium of material, about a certain point. If you can find special directions from that point where the forces are just scaling, then you know you're just dealing with tension/compression in those directions, not bending.
The same thing applies to fluid flows: it can be useful to find out where the flow is just speeding up / slowing down, but not twisting.
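As a concrete illustration of the stress picture (the matrix below is entirely made up, numpy assumed):

    import numpy as np

    # An invented symmetric stress matrix about some point (units arbitrary).
    # Symmetry is what guarantees real eigenvalues and perpendicular eigenvectors.
    stress = np.array([[ 50.0,  30.0,  0.0],
                       [ 30.0, -20.0,  0.0],
                       [  0.0,   0.0, 10.0]])

    principal_stresses, principal_directions = np.linalg.eigh(stress)

    # Along each principal direction the material is only stretched or squeezed
    # (positive = tension, negative = compression); the shear has been "rotated away".
    for s, d in zip(principal_stresses, principal_directions.T):
        print(f"{s:+8.2f} along {np.round(d, 3)}")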
I found this to be an incredibly clear and simple description of what eigenvalues are. I completely missed the point in University. I agree that it was missing a discussion of what they're used for.
The thing is, at least for me, we learned these concepts in college, but it was only in terms of getting the right answer on the test. They only provided the abstraction, not its basis in reality.
This would have been very helpful to have a grasp of during my research in university, where I had to explain eigenvalues and eigenvectors to fellow computer science people. It probably also would have helped me understand what the hell my own code did.
Wow, that's the best explanation I've read. Very intuitive. Not to say I didn't know the original definition at some point but having such a concept tied to a concrete example makes it much easier to remember.
It makes me wish that most mathematical concepts were explained with some kind of real world analogy.
This is ridiculous. I was just thinking last week that even though I took linear algebra and did fairly well, I never really grasped eigenvalues and eigenvectors; they come up fairly frequently as far as math concepts go, and I don't instantly have an intuitive grasp of them. Literally the next week I see this posted on HN.
I first encountered eigenmagic in machine learning --- we were interested in the eigenvectors of the adjacency matrix of a graph. When the space that the matrix lives in is so abstract, 'real world' examples don't make it any easier to visualize what's happening.
If you really want to understand them intuitively, I recommend plotting a few hundred matrices and their respective eigenvalues and vectors. When you feel like you can predict what the results will look like at each frequency, then the stories about rubber bands and shiny coins will make sense.
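If anyone wants to try that, here's roughly the kind of plot I mean (Python with numpy and matplotlib assumed): draw what each random matrix does to the unit circle, and overlay its real eigenvectors scaled by their eigenvalues.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    theta = np.linspace(0, 2 * np.pi, 200)
    circle = np.stack([np.cos(theta), np.sin(theta)])   # points on the unit circle

    fig, axes = plt.subplots(3, 3, figsize=(9, 9))
    for ax in axes.flat:
        A = rng.normal(size=(2, 2))                     # a random 2x2 matrix
        values, vectors = np.linalg.eig(A)

        ax.plot(*(A @ circle), lw=1)                    # what A does to the circle
        if np.isrealobj(values):                        # skip complex eigenvalues
            for c, v in zip(values, vectors.T):
                # each eigenvector, drawn scaled by its eigenvalue
                ax.plot([0, c * v[0]], [0, c * v[1]], lw=2)
        ax.set_title(str(np.round(values, 2)), fontsize=8)
        ax.set_aspect("equal")

    plt.tight_layout()
    plt.show()

After a few pages of these, the "stretching along special directions" picture stops being abstract.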
Or maybe you're faster, and the stories helped you make the leap --- in which case, ignore me!
That was a very nice explanation. I love the rubber band analogy. But one minor qualm: a transformation, when applied, typically has several eigenvalues and eigenvectors associated with it. The "first" eigenvector (usually selected as the largest, or the "principal component") and its associated eigenvalue is really what's being discussed. There are (or at least, can be) more of them! We could think about stretching the elastic along its width as well.
For a linear system (of any kind of equivalence) within an N-dimensional vector space, the Eigenvalues represent the scaling factors across those dimensions when the system's state is represented by an NxN sparse-diagonalized matrix (i.e. all values are 0 except for the main diagonal).
Those non-zero values along the main diagonal are its Eigenvalues and its rows are Eigenvectors.
For the common 3D isometric (e.g. xyz) coordinate system, the Eigenvalues can be thought of as a kind of multiplier across the unit vectors (Eigenvectors) [[1,0,0][0,1,0][0,0,1]]. This is the "stretching" analog mentioned in the article.
FWIW, Eigenvalues are not just a salient property of linear systems (i.e. matrices), but also of higher-order tensors.
Finding (or more often approximating) these "characteristic scaling states" is a critical step in numerical analysis in everything from quantum mechanics to financial hedging strategies to consumer product marketing plans.
If you've ever represented a system as a series of Markov probability chains, every row of the "convergent/dominant" (if any) state contains an Eigenvalue.
1. A linear mapping is not a "kind of equivalence" by any reasonable definition. For instance, the function that maps every vector to 0 is a linear mapping, and it has plenty of eigenvectors. (All with eigenvalue 0.)
2. The eigenvectors are not the rows of the diagonalized matrix. They are the rows (or columns, depending on just how you define things) of the matrix that does the coordinate transformation to diagonalize your matrix.
3. I think the paragraph beginning "For the common 3D isometric ..." is rather confused. I certainly am when reading it. Perhaps the problem is in my brain; what exactly do you mean? (Here's the nearest true thing I can think of to what that paragraph says: For many, but not all, linear transformations from a space to itself, there is an orthogonal coordinate system with respect to which the transformation's matrix is diagonal; then the eigenvectors are the axes of that coordinate system, and the eigenvalues are the amounts by which vectors along those axes get stretched.)
4. Tensors are just as linear as matrices. (You can do nonlinear things with a tensor, but then you can with a matrix too.)
5. The probabilities in a Markov chain's stationary state are not eigenvalues. (Well, I'm sure they're eigenvalues of something; any set of numbers can be made the eigenvalues of something; but they aren't, e.g., eigenvalues of the transition matrix.) What you may be thinking of is: a stationary state of a Markov chain is an eigenvector of the transition matrix, with eigenvalue 1; all eigenvalues have absolute value at most 1; if the Markov chain is ergodic (i.e., can get from any state to any other), then there is exactly one stationary state, exactly one eigenvector of eigenvalue 1, and all the other eigenvalues are strictly smaller. This is enough to guarantee convergence to the stationary state.
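A tiny numpy illustration of point 5, with a made-up three-state ergodic chain (rows of P hold the transition probabilities out of each state, so the stationary distribution is a left eigenvector, i.e. an ordinary eigenvector of the transposed matrix):

    import numpy as np

    # Made-up ergodic chain: P[i, j] = probability of moving from state i to j.
    P = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.3, 0.5]])

    values, vectors = np.linalg.eig(P.T)
    print(np.round(values, 3))        # exactly one eigenvalue is 1; the rest are smaller

    stationary = np.real(vectors[:, np.argmax(np.isclose(values, 1))])
    stationary /= stationary.sum()    # scale so the entries are probabilities
    print(np.round(stationary, 3))    # the long-run fraction of time spent in each state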
Also: If eigenvalues are as important in dealing with higher-order tensors as they are for the second-order case (i.e., matrices) then that's news to me. Tell me more?
Incidentally: that Markov chain thing? Suppose you make a Markov chain out of all the world's web pages, where each step moves you from a page to a randomly selected page it links to, or (occasionally for a page that has links, and unconditionally for a page with no links) to a completely random page.
Then the entries in that unique eigenvector -- equivalently, the long-term probabilities of landing on each page -- are basically the Google pagerank. (I'm sure Google's pagerank computation has lots of tweaks in it, and they certainly consider things other than pagerank to determine their search results. But pagerank was their original key idea.)
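If you want to see that in miniature, here's a toy sketch of the power-iteration version (the four-page "web" and the usual 0.85 damping factor are invented for illustration, and I've ignored dangling pages; real pagerank has far more going on):

    import numpy as np

    # A tiny invented web: page i links to the pages listed in links[i].
    links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
    n = len(links)
    damping = 0.85              # probability of following a link vs. jumping randomly

    # Column-stochastic matrix: column j spreads page j's rank evenly over its links.
    M = np.zeros((n, n))
    for j, outs in links.items():
        M[outs, j] = 1.0 / len(outs)

    # Power iteration: keep applying the damped transition until the rank settles.
    # The limit is the eigenvalue-1 eigenvector of that transition matrix.
    rank = np.full(n, 1.0 / n)
    for _ in range(100):
        rank = (1 - damping) / n + damping * (M @ rank)

    print(np.round(rank, 3))    # long-run probability of landing on each page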
Days late, but worth the response for accuracy's sake since I just popped in on HN now:
> 1. A linear mapping is not a "kind of equivalence" by any reasonable definition.
By definition, ANY mapping is an equivalence relation--even if that relation results in a 0 or NAN value.
> 2. The eigenvectors are not the rows of the diagonalized matrix...
Correct. It depends on one's orientation (columns vs. rows), but the matrix which does the coordinate transforms contains the eigenvectors. Bad wording on my part. The key behind diagonalization is it removes any orientation issues from the relationship by establishing eigenweights across a given vector space.
> 3. I think the paragraph beginning "For the common 3D isometric ..."
Your interpretation is correct and, indeed, my choice of verbiage was poor. I really should have used the words "ortho-normal to a given vector space" which collapses to the common XYZ unit vectors in a linearly partitioned vector space of 6 degrees of freedom (3 translations & 3 rotations)--e.g. classic Cartesian space.
> Tensors are just as linear as matrices.
Most tensor fields are modeled using linear approximations (i.e. matrices of n dimensions), but the very fact that a tensor itself is being used in the characteristic equations is typically indicative of non-linear behavior in the overall system. For example, in the fluid dynamics used to model airflow across a wing, or in boundary-value problems when the Cauchy stress tensor is used for structures undergoing plastic deformation.
I believe that you are conflating Manifolds with Tensors. The latter is a refinement and/or characteristic relation defined upon the former. A non-linear Tensor is defined upon a manifold with one or more non-linear relations. Perhaps you are used to dealing exclusively with metric tensors?
> The probabilities in a Markov chain's stationary state are not eigenvalues.
Here you are spot-on. Indeed it's the eigenvalues of the transition matrix (or convergence for ergodic ones) to which I was referring. Thanks (again) for the correction and further clarification.
> 1. A linear mapping is not a "kind of equivalence" by any reasonable definition. For instance, the function that maps every vector to 0 is a linear mapping, and it has plenty of eigenvectors. (All with eigenvalue 0.)
Linear mappings are the homomorphisms between vector spaces. So they are a "kind of equivalence".
I know what an equivalence class is, thank you. How does that make the map from (say) R^3 to itself that sends everything to 0 a "kind of equivalence"?
It gives rise to an equivalence relation on R^3 in a fairly natural way, as indeed any function gives rise to an equivalence relation on its domain: x~y iff f(x)=f(y). But that doesn't mean that it is a "kind of equivalence".
Now, obviously, "kind of" is vague enough that saying "an endomorphism of a vector space simply Is Not a 'kind of equivalence'" would be too strong. But I would like to know what, exactly, you mean by calling something a "kind of equivalence", because I'm unable to think of any meaning for that phrase that (1) seems sensible to me and (2) implies that endomorphisms of vector spaces are "kinds of equivalence".
(Looking back at what bonsaitree wrote, I see s/he said "of any kind of equivalence", which I unconsciously typo-corrected to "or any kind of equivalence". But perhaps I misunderstood and bonsaitree meant something else, though I can't think what it might be. bt, if you're reading this: my apologies if I misunderstood, and would you care to clarify if so?)
Eigenvectors are also used extensively in pattern recognition. Notably, in facial recognition: http://en.wikipedia.org/wiki/Eigenface
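For anyone curious what that looks like in code, here's a minimal sketch of the eigenface idea (random stand-in data instead of real face images, numpy assumed):

    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in data: 200 grayscale "faces", each flattened to 64*64 = 4096 pixels.
    faces = rng.random((200, 64 * 64))

    mean_face = faces.mean(axis=0)
    centered = faces - mean_face

    # The eigenfaces are the top eigenvectors of the pixel covariance matrix;
    # the SVD hands them to us without ever building the 4096x4096 covariance.
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = components[:20]              # keep the 20 strongest directions

    # Each face is now described by ~20 weights instead of 4096 pixels,
    # and those weights are what you actually compare for recognition.
    weights = centered @ eigenfaces.T
    reconstruction = mean_face + weights[0] @ eigenfaces
    print(weights.shape, reconstruction.shape)   # (200, 20) (4096,)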