Principal Component Analysis explained visually (2015) (setosa.io)
176 points by spking on Oct 29, 2022 | 13 comments



Here is a much better explanation of PCA: https://stats.stackexchange.com/questions/2691/making-sense-...

The key insight many people miss is that PCA solves a series of optimization problems: reconstructing the data from the first k PCs gives the best k-dimensional approximation in terms of squared error. Even better, this is equivalent to assuming that the data really lives in a k-dimensional subspace and only becomes truly high-dimensional because of normally distributed noise that spills into every direction (dimension).
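To make this concrete, here is a minimal numpy sketch (mine, not from the linked answer; the data setup is made up) of the first point: the squared error of reconstructing mean-centered data from the first k PCs equals the sum of the discarded squared singular values, which is the minimum achievable by any rank-k approximation (Eckart-Young).

  import numpy as np

  rng = np.random.default_rng(0)

  # Made-up data: points near a 2-D subspace of R^5, plus a little Gaussian noise.
  n, d, k = 200, 5, 2
  basis = rng.normal(size=(k, d))                  # the "true" low-dimensional subspace
  X = rng.normal(size=(n, k)) @ basis + 0.1 * rng.normal(size=(n, d))

  Xc = X - X.mean(axis=0)                          # PCA works on mean-centered data
  U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

  # Project onto the first k PCs (rows of Vt) and reconstruct.
  Xk = Xc @ Vt[:k].T @ Vt[:k]

  # Squared reconstruction error = sum of the discarded squared singular values.
  print(np.sum((Xc - Xk) ** 2), np.sum(s[k:] ** 2))   # the two numbers agree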


Principal components is a wonderful concept, together with the sister concepts of eigenvalues/eigenvectors and orthogonality. I wish I could force everyone I talk to to internalize these ideas so that I could have more useful discussions with them.

That said, yeah, not everything is linearly separable.


I really like the way Harrell uses PCA to build regression analysis in Regression Modeling Strategies

https://link.springer.com/book/10.1007/978-3-319-19425-7


Indeed cool stuff!


Best thing I’ve ever read on PCA is Madeleine Udell’s PhD thesis [1]. It extends PCA in many directions and shows that well-known techniques fit into the developed framework. (I was also impressed that a 138-page math thesis is this readable. Quite the achievement.)

[1] https://people.orie.cornell.edu/mru8/doc/udell15_thesis.pdf


Indeed, this seems worth a deep read, as it especially addresses the main PCA shortcomings (heterogeneous data, non-numerical data, etc.). Thanks mate, I've definitely found a way to keep myself busy this weekend.


It’s kind of crazy that so many people have read this thesis, but it’s really good. I came across it independently a few years ago when I was trying to understand some stuff, but ended up saving it as a reference because I liked it so much.


This is some hot stuff! Thanks for sharing. Very lucid writing; clearly she has a deep understanding of the subject matter to be able to write it down so eloquently.


Related:

Principal Component Analysis Explained Visually - https://news.ycombinator.com/item?id=27017675 - May 2021 (44 comments)

Principal Component Analysis Explained Visually (2015) - https://news.ycombinator.com/item?id=14405665 - May 2017 (25 comments)

Principal component analysis explained visually - https://news.ycombinator.com/item?id=9040266 - Feb 2015 (22 comments)


Also see

- Markov Chains (https://setosa.io/ev/markov-chains/)

- Image Kernels (https://setosa.io/ev/image-kernels/)

- Bus Bunching (https://setosa.io/bus/)

Wish these guys kept producing more visualizations!


In the UK eating example, it would be better to examine the feature-space singular vector associated with the first singular value instead of instructing the reader to "go back and look at the data in the table". PCA has already done that work; no additional (error-prone, subjective) interpretation is needed.
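A small numpy sketch of what that looks like (the numbers below are made up, not the actual UK table): the first right singular vector of the centered matrix already ranks which foods drive the separation, so there is nothing left to eyeball.

  import numpy as np

  foods = ["alcohol", "cheese", "fish", "potatoes"]          # hypothetical features
  # Rows = countries, columns = foods; made-up consumption figures.
  X = np.array([
      [375., 105., 147., 193.],
      [458., 103., 122., 235.],
      [270.,  66.,  93., 209.],
      [175.,  41.,  83., 245.],
  ])

  Xc = X - X.mean(axis=0)                                    # center each feature
  U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

  # Loadings of the first principal component: one weight per feature.
  for food, w in sorted(zip(foods, Vt[0]), key=lambda t: -abs(t[1])):
      print(f"{food:10s} {w:+.2f}")                          # largest |weight| = most influential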


Agreed, it kind of defeats the purpose of PCA to have to go back and analyze everything "by hand"...


I'm not sure this is an explanation as much as an introductory demo. Nice visualizations though.



