
How are PCA and SVD related? - celerity
https://intoli.com/blog/pca-and-svd/
======
twelfthnight
For those looking for a more succinct answer:
[https://stats.stackexchange.com/questions/134282/relationshi...](https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca)

And here is another interesting connection between PCA and ridge regression:
[https://stats.stackexchange.com/questions/81395/relationship...](https://stats.stackexchange.com/questions/81395/relationship-between-ridge-regression-and-pca-regression)

------
vcdimension
I don't understand why people create these webpages just re-explaining stuff
that can be read in a book, in lecture notes (usually freely available
online), or on Wikipedia. It just adds more noise to the internet. Is it a
kind of marketing thing, to show their customers that they know what they are
doing?

~~~
howscrewedami
There's value in explaining things in a different/more understandable way.
Wikipedia articles and book chapters on statistics can be hard to understand.

------
gabrielgoh
6-word answer

PCA is the SVD of A'A

~~~
lottin
Actually it's the eigendecomposition of A'A, or the SVD of A, is it not?

~~~
stephencanon
The SVD of A'A _is_ its eigendecomposition (since A'A is symmetric positive
semi-definite, the two factorizations coincide).

It is closely related to the SVD of A: if A = USV', then A'A = (USV')'(USV') = VSU'USV' = VS^2V'.
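
A quick numerical check of this relation in numpy (the matrix below is just
random illustrative data, not anything from the article):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 5))

    # SVD of A: A = U S V'
    U, S, Vt = np.linalg.svd(A, full_matrices=False)

    # Eigendecomposition of the symmetric matrix A'A
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)

    # eigh returns eigenvalues in ascending order; reverse to match S^2
    print(np.allclose(eigvals[::-1], S**2))                 # True
    print(np.allclose(A.T @ A, Vt.T @ np.diag(S**2) @ Vt))  # True: A'A = V S^2 V'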

------
thanatropism
PCA is a statistical model -- the simplest factor model there is. It deals
with variances and covariances in datasets. It returns a transformed dataset
that's linearly related to the original one, but whose first variable has the
highest variance, the second the next highest, and so on.

SVD is a matrix decomposition. It generalizes the idea of representing a
linear transformation (with the same dimensions in domain and codomain) in
the basis of its eigenvectors, which gives a diagonal matrix representation
and a formula like A = VDV'.

SVD is like this, but for rectangular matrices, so you have two different
orthogonal matrices in the factorization: A = UDV'.

That SVD even performs PCA, as noted in the algorithms, is a theorem, albeit
a simple one usually given as an exercise. But hey, even OLS regression can
be programmed with SVD if you want to.
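
For instance, here's a minimal numpy sketch of that last point, solving OLS
through the pseudoinverse built from the SVD (the data is made up purely for
illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(100)

    # OLS via the SVD: beta = V S^{-1} U' y (the Moore-Penrose pseudoinverse)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    beta_svd = Vt.T @ ((U.T @ y) / S)

    # Agrees with the standard least-squares solver
    beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(np.allclose(beta_svd, beta_lstsq))  # True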

------
kiernanmcgowan
I've always understood PCA as SVD on a whitened matrix. Is this too simplistic
of a view to take wrt implementation?

[https://en.m.wikipedia.org/wiki/Whitening_transformation](https://en.m.wikipedia.org/wiki/Whitening_transformation)

~~~
celerity
I actually touch on the relation to whitening toward the bottom of the
article. You can whiten your dataset using the left singular matrix U, which
is directly related to the PCs. Thanks for reading!
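
A minimal numpy sketch of that idea, assuming a centered data matrix with SVD
X = USV' (the data here is synthetic):

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.standard_normal((500, 3)) @ rng.standard_normal((3, 3))  # correlated columns

    Xc = X - X.mean(axis=0)  # center first
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

    # PCA scores are U*S (equivalently Xc @ V); dividing out S whitens them,
    # so the whitened data is just a rescaled U
    n = Xc.shape[0]
    X_white = np.sqrt(n - 1) * U

    # Covariance of the whitened data is (numerically) the identity
    print(np.allclose(np.cov(X_white, rowvar=False), np.eye(3)))  # True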

------
popcorncolonel
The connection between these two has always been hazy to me. I often mixed
them up when talking about either one on its own.

This article was well written, exactly precise enough, and cleared up the
confusion. Thanks for sharing!

------
eggie5
SVD is the decomposition of a matrix into orthogonal and diagonal factors.

PCA is the analysis of a set of eigenvectors. Those eigenvectors can come
from the SVD factors or from a covariance matrix.

source: [http://www.eggie5.com/107-svd-and-pca](http://www.eggie5.com/107-svd-and-pca)
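
To illustrate the two routes with a quick numpy sketch (synthetic data;
eigenvectors are only determined up to sign, hence the abs):

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.standard_normal((200, 4))
    Xc = X - X.mean(axis=0)

    # Route 1: right singular vectors of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)

    # Route 2: eigenvectors of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    eigvecs = eigvecs[:, ::-1]  # eigh sorts ascending; flip to descending

    # The two sets of directions agree up to the sign of each vector
    print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))  # True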

------
foxh0und
Great article. The lecture comparing the two from Johns Hopkins, part of the
Data Science Specialization on Coursera, also offers a great explanation.

------
finknotal
"Because vectors are typically written horizontally, we transpose the vectors
to write them vertically". Is there a typo in this sentence or is to just too
early in the morning for me to read this?

~~~
zeapo
No typo there. When we talk about vectors we mean column vectors. Since
they're easier to read horizontally (and take up less space on the page),
most of the time we write x^T = (a, b, c) rather than writing the entries in
a column.
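
For concreteness, the same vector written both ways (in LaTeX-style
notation):

    % as a column,
    x = \begin{pmatrix} a \\ b \\ c \end{pmatrix},
    % and inline, via a transpose,
    x^T = (a, b, c)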

