

Making sense of principal component analysis, eigenvectors and eigenvalues - ColinWright
http://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues

======
arh68
I felt most of the analogies were lacking, so here goes. I'd tell my grandma, "the PCA
for a loaf of bread is actually listed right on the back, cleverly labeled
'Ingredients'."

If I took some sort of microscope to a loaf of bread and saw exactly how many
atoms of each element were in that loaf, the ingredients would be the
eigenvectors and the ratios of ingredients would be the eigenvalues. (Each
ingredient has a very specific and unique combination of elements, leaving a
certain signature)

It all depends on the data, though, because that's how you frame reality.
Maybe I look through the microscope a hundred times for a hundred different
loaves, and maybe each ingredient-list might start to look like a datapoint.
So now the patterns change. Here's where it gets interesting, because the very
notion of what the eigenvectors are gets blurry.

Things that used to be distinct (flour and sugar) now basically show up everywhere
together. They blend into the walls. Every loaf is going to be on average the
same. Now there's 1 main pattern, the average-loaf-of-bread, and several
trailing patterns, like extra-raisins or extra-sunflower-seeds. But remember
the one thing special about eigenvectors: in the model, they're mathematically
independent (orthogonal, so the patterns are uncorrelated). So if two things
are dependent, they will be globbed together into the same eigenvector. The
pattern extra-raisins might come up in a loaf full of cranberries (as long as
they are chemically similar).
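
To make the "globbed by the same eigenvector" point concrete, here's a rough
numpy sketch. Everything in it is made up for illustration: the four
ingredients, the gram amounts, and the two hidden factors (loaf size and
fruitiness) are my assumptions, not real bread data. Note that textbook PCA
subtracts the average loaf first, so the components describe how loaves
deviate from the average.

```python
import numpy as np

rng = np.random.default_rng(0)
n_loaves = 100

# Two hidden factors (assumed for illustration): loaf size and fruitiness.
size = rng.normal(0.0, 1.0, n_loaves)
fruit = rng.normal(0.0, 1.0, n_loaves)

# Made-up ingredient amounts in grams. Flour and sugar follow loaf size;
# raisins and cranberries follow fruitiness. Each gets a little noise.
flour = 500 + 40 * size + rng.normal(0.0, 2.0, n_loaves)
sugar = 50 + 20 * size + rng.normal(0.0, 1.0, n_loaves)
raisins = 30 + 10 * fruit + rng.normal(0.0, 1.0, n_loaves)
cranberries = 20 + 8 * fruit + rng.normal(0.0, 1.0, n_loaves)

# One row per loaf, one column per ingredient.
X = np.column_stack([flour, sugar, raisins, cranberries])

# Subtract the average loaf, then eigendecompose the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order

# Sort patterns from most to least variance.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Share of the total variance carried by each pattern.
explained = eigvals / eigvals.sum()

# Each loaf re-expressed as amounts of each pattern.
scores = Xc @ eigvecs

print("patterns (columns):")
print(np.round(eigvecs, 2))
print("share of variance:", np.round(explained, 3))
```

In this toy setup flour and sugar should land in the first column and raisins
and cranberries in the second, since each pair moves together. The columns of
eigvecs are orthogonal and the scores are uncorrelated, which is the
"independent" part. The eigenvalues are the variance along each pattern, so
dividing by their sum gives the share each pattern accounts for.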

If grandma is still listening, I'd wrap up: PCA reduces complex but uniform
data to mathematically independent patterns or ingredients. The eigenvectors
are the patterns and the eigenvalues are the ratios, i.e. how much of the
variation each pattern accounts for.

