
Principal Component Analysis - keyboardman
https://leimao.github.io/article/Principal-Component-Analysis/
======
roenxi
One of the topics that could really be improved with a 3Blue1Brown treatment;
but a diagram would do at a pinch - such as what Wikipedia has [0]. Looks like
a good writeup nonetheless.

The Wikipedia picture doesn't quite convey the message though - PCA is a
dimensionality reduction technique. The data in the Wiki picture could be
reduced from 2 dimensions (as shown) to 1 dimension (distance along the axis
that you'd intuitively pick). PCA lets you pick the axis that loses the least
information, i.e. the one along which the data varies most.

PCA is most useful when the data is really generated by a small number of
underlying variables (say, plant species & soil nutrients in a controlled
experiment) but the data collected is high-dimensional (length, weight, colour,
smell, no. pests, flavour rating by a team of chefs). PCA will tell you that
there are really 2 important variables and how to construct them from the
observed data - but it won't tell you what they are.
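
A rough sketch of that situation, assuming NumPy and entirely made-up data:
2 hidden factors are mixed into 6 observed measurements plus a little noise.
The names (latent, mixing, observed) and the noise level are assumptions for
illustration only. PCA here is done the plain way, by eigendecomposing the
covariance matrix of the centred data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 2 hidden factors drive 6 observed measurements.
n_samples, n_latent, n_observed = 500, 2, 6
latent = rng.normal(size=(n_samples, n_latent))
mixing = rng.normal(size=(n_latent, n_observed))  # unknown in a real experiment
noise = 0.05 * rng.normal(size=(n_samples, n_observed))
observed = latent @ mixing + noise

# PCA: centre the data, then eigendecompose its covariance matrix.
centered = observed - observed.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]       # re-sort so the largest comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of total variance carried by each principal component.
explained = eigvals / eigvals.sum()

# The 2 constructed variables: projections onto the top 2 components.
scores = centered @ eigvecs[:, :2]

print(np.round(explained, 3))
```

The first two entries of explained should carry nearly all of the variance,
which is the "there are really 2 important variables" part. The columns of
scores are some rotation and scaling of the hidden factors, not the factors
themselves, which is the "won't tell you what they are" part.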

> Positive Semi-Definite Matrix

Aka, it has a real square root. Except it isn't related to the concept of
square roots or reals in any obviously meaningful way. The concept annoys me,
somehow. It is a pity we have to learn algebra before matrices are meaningful.

Caveat emptor, I'm bad at stats, corrections welcome.

[0]
[https://en.wikipedia.org/wiki/File:GaussianScatterPCA.svg](https://en.wikipedia.org/wiki/File:GaussianScatterPCA.svg)

~~~
MaxBarraclough
A perverse thought about the matrix square root [0]: I see the notation
M^(1/2). Could the idea be meaningfully extended to define, say, M^(0.7),
similar to the way this is meaningful with real numbers?

[0]
[https://en.wikipedia.org/wiki/Definite_symmetric_matrix#Squa...](https://en.wikipedia.org/wiki/Definite_symmetric_matrix#Square_root)

~~~
salty_biscuits
Sure, you can muck around with matrix exponentials and matrix logarithms to
think about such things.

[https://en.wikipedia.org/wiki/Matrix_exponential](https://en.wikipedia.org/wiki/Matrix_exponential)
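
For the symmetric positive semi-definite case specifically, a minimal sketch
(assuming NumPy; psd_power is a made-up helper name, not a library function)
is to take the eigendecomposition M = Q diag(w) Q^T and raise the eigenvalues
to the power p. For positive definite M this should agree with exp(p log(M)).
It only makes sense as written for p > 0 and symmetric input.

```python
import numpy as np

def psd_power(m, p):
    """Hypothetical helper: m**p for a symmetric PSD matrix m and real p > 0."""
    eigvals, eigvecs = np.linalg.eigh(m)   # m = Q diag(w) Q^T
    eigvals = np.clip(eigvals, 0.0, None)  # clamp tiny negative round-off to 0
    # Scale each eigenvector column by w**p, then rotate back: Q diag(w**p) Q^T
    return (eigvecs * eigvals**p) @ eigvecs.T

a = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric, eigenvalues 1 and 3

root = psd_power(a, 0.5)     # the matrix square root
m07 = psd_power(a, 0.7)
m03 = psd_power(a, 0.3)

print(np.allclose(root @ root, a))  # square root squared gives back a
print(np.allclose(m07 @ m03, a))    # exponents add, as with real numbers
```

SciPy also ships scipy.linalg.fractional_matrix_power for general square
matrices, which is probably the better choice outside the symmetric PSD case.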

------
throwaway4747l
Together with factor analysis, the most abused concept in the biological and
social sciences.

