
Implementing a Principal Component Analysis In Python (2014) - nafizh
http://sebastianraschka.com/Articles/2014_pca_step_by_step.html
======
mjfl
If anyone is interested, I also wrote a post on PCA this past weekend, but it
uh... didn't get any upvotes :(

[http://michaeljflynn.net/2017/02/06/a-tutorial-on-principal-...](http://michaeljflynn.net/2017/02/06/a-tutorial-on-principal-component-analysis/)

~~~
TheAlchemist
One more proof that python > R :)

I guess this kind of tutorial is more popular in Python, since the R / Matlab
people come from a statistics background and don't really need tutorials on
PCA.

------
Xcelerate
PCA is basically a projection of data onto a low-rank Euclidean subspace
(where the principal axis accounts for the bulk of the variation in the data
and each successive axis is orthogonal and accounts for the remaining
variation). Notably though, most data do not actually lie in a Euclidean
vector space; they often lie on a lower-dimensional manifold embedded in such
a space. More recent work considers low-rank decompositions where you assume
_constraints_ on the factors, i.e., general priors. This work is quite recent
and may be of interest to someone. You can essentially perform PCA with
constraints using a message-passing algorithm (originally derived from
physics applications):

[https://arxiv.org/abs/1701.00858](https://arxiv.org/abs/1701.00858)

------
ajtulloch
If anyone is interested, a much faster technique is to use the randomized
methods from
[https://arxiv.org/abs/0909.4061](https://arxiv.org/abs/0909.4061), see e.g.
[http://scikit-learn.org/stable/modules/generated/sklearn.dec...](http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)
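
scikit-learn exposes this as a solver option on PCA. A minimal sketch,
assuming a data matrix X:

    from sklearn.decomposition import PCA

    # randomized SVD solver (Halko et al.) is much faster on large matrices
    pca = PCA(n_components=2, svd_solver='randomized', random_state=0)
    scores = pca.fit_transform(X)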

------
DonaldFisk
I recently implemented PCA by computing the covariance matrix and using power
iteration to obtain the eigenvectors. I found these pages useful for
understanding how PCA works:

[http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch18.pdf](http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch18.pdf)

[https://en.wikipedia.org/wiki/Power_iteration](https://en.wikipedia.org/wiki/Power_iteration)
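
For reference, a minimal numpy sketch of that approach (power iteration plus
deflation on the sample covariance matrix; the function name is just for
illustration):

    import numpy as np

    def pca_power_iteration(X, k=2, n_iter=100):
        Xc = X - X.mean(axis=0)            # centre the data
        C = Xc.T @ Xc / (len(Xc) - 1)      # sample covariance matrix
        components = []
        for _ in range(k):
            v = np.random.randn(C.shape[0])
            for _ in range(n_iter):
                v = C @ v                  # power iteration step
                v /= np.linalg.norm(v)
            # deflate so the next run converges to the next eigenvector
            C = C - (v @ C @ v) * np.outer(v, v)
            components.append(v)
        return np.array(components)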

I've used it to analyse the Voynich Manuscript:
[http://web.onetel.com/~hibou/voynich/VoynichPagesPCA.html](http://web.onetel.com/~hibou/voynich/VoynichPagesPCA.html)

------
bigger_cheese
Mildly curious question: I've read a few papers recently that talk about
'Projection to Latent Structures' (PLS). I know PLS and PCA are related, but
I'm struggling to understand the differences and when you should use one or
the other. Are there any good references people could recommend?

~~~
leecarraher
I believe PLS is related to PLSR (PLS regression); the difference from PCA is
that you are trying to maximize the covariance between the predictor
variables and the response variables, rather than just the variance of the
predictors.
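
A quick way to see the difference with scikit-learn; the toy data here is
made up purely for illustration:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.RandomState(0)
    X = rng.randn(100, 5)
    y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(100)

    # PCA directions depend only on the variance structure of X
    pca_scores = PCA(n_components=2).fit_transform(X)
    # PLS directions also use y, maximising covariance with the response
    pls_scores = PLSRegression(n_components=2).fit(X, y).transform(X)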

------
lottin
It's good for educational purposes, but I would say there isn't much to
"implement" since PCA is simply the eigen-decomposition of a matrix.

~~~
_pius
Big difference between knowing the math in theory and actually using it to do
something useful with your data.

~~~
yowlingcat
Agreed, but that sword cuts both ways. The thing you do with your data that
starts out useful can and will easily erode into infrastructural debt if it
isn't built on sound foundations, and those foundations can't simply be
pulled off the shelf from the standard scientific computing libraries that
have been around for decades.

------
leecarraher
you can skip sklearn and other packages, use numpy (which is supported in
pypy now), and just do an SVD of the input matrix:

    import numpy as np

    Xc = X - X.mean(axis=0)  # centre the data; PCA assumes zero-mean columns
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    D = np.diag(d)
    Xhat = U[:, :2].dot(D[:2, :2])  # scores on the first two principal components

~~~
leecarraher
in addition if you want a way to project new vectors into a reduced svd space:

l=2 # desired vector space dimension

U,d,Vt = linalg.svd(X,full_matrices=False)

P=(U*(1/d))[:,:l]

xnew = dot(x,P)

------
folli
Does anyone know of a similarly understandable introduction to Independent
Component Analysis (ICA)?

~~~
conjectures
One issue is that ICA covers a few different implementations.

My favourite would be Infomax '95 paper by Bell and Sejnowski. Under this
setup:

ICA is a single layer neural network that maps N inputs to N outputs. The loss
function being maximised is the mutual info between input and output.

------
platz
ICA is also pretty neat

~~~
Xcelerate
I agree. I'm surprised by the number of situations where people use PCA when
ICA would really be more appropriate. I suppose that's the difference between
knowing the tools and knowing the math behind the tools, though.

~~~
nextos
The late David MacKay said this about PCA:

 _"Principal Component Analysis is a dimensionally invalid method that gives
people a delusion that they are doing something useful with their data. If you
change the units that one of the variables is measured in, it will change all
the "principal components"! It's for that reason that I made no mention of PCA
in my book. I am not a slavish conformist, regurgitating whatever other people
think should be taught. I think before I teach."_

—[https://www.amazon.com/review/R16RJ2PT63DZ3Q](https://www.amazon.com/review/R16RJ2PT63DZ3Q)

~~~
Scea91
I wouldn't read a book by an author who reasons and writes like that.

The issue he has with PCA can be solved by normalizing the data: standardize
each variable to unit variance first and the dependence on units drops out.
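
A minimal sketch with scikit-learn, assuming a data matrix X:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    # standardising first means rescaling a variable (metres -> feet, say)
    # no longer changes the principal components
    pca = make_pipeline(StandardScaler(), PCA(n_components=2))
    scores = pca.fit_transform(X)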

~~~
conjectures
McKay did foundational research in machine learning; was a chief scientific
advisor to the UK government's department for climate change; created eye
tracking user interface software for paralysed people; and wrote an excellent
text book on information theory. I.e. a proper Olympian. So when you read a
statement like that, you should take it seriously and possibly realign your
world-view.

------
DennisP
Let's say I'm building something with naive Bayes but some of my inputs are
correlated. Could PCA help me find an independent set of inputs?

~~~
eggie5
Your describing collinearity. Yes, The PCA reconstruction components will not
be correlated.
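
A sketch of that idea with scikit-learn: PCA with whiten=True outputs
decorrelated, unit-variance features, which better matches the naive Bayes
independence assumption.

    from sklearn.datasets import load_iris
    from sklearn.pipeline import make_pipeline
    from sklearn.decomposition import PCA
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    # whiten=True rescales the decorrelated components to unit variance
    clf = make_pipeline(PCA(whiten=True), GaussianNB()).fit(X, y)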

