In the case of radio direction finding, PCA allows you to distinguish multiple overlapping signals on the same frequency if they're hitting the antenna array from different directions.
If you have any other interesting applications in your own field, I'd like to hear them, too.
The interesting part was that he used PCA and then assigned a slider to each component, sorted by importance, so the first slider would affect the image the most.
Your objective is to classify points on a map that have similar weather conditions.
You can do PCA on these geographical points at a given time. Your matrix will have the weather parameters as columns and the different geographical points as rows.
PCA will determine how to best represent the variation in the data in fewer dimensions. If you are lucky, 80% of the variation can be plotted in a 2D plot as in OP's post.
Points in the x/y Cartesian plane that have similar weather will plot closer to each other.
This process reduces dimensionality. For example, temperature and solar radiation are strongly correlated and will tend to "push" in the same direction, so you can replace those parameters with a single virtual parameter. (For weather data, that would tend to be the x-axis in the PCA biplot, as temperature and similar parameters are often sufficient to classify a climate for the purposes I have encountered in agriculture.)
In a nutshell, this is a real-world example of when one can use PCA.
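To make the matrix layout concrete, here is a minimal sketch in Python. The parameter names, the synthetic data, and the use of scikit-learn are my own illustration, not from the original example:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # Rows: geographical points; columns: weather parameters at one point in time.
    # All values below are made up for illustration.
    rng = np.random.default_rng(0)
    n_points = 500
    temperature = rng.normal(20, 8, n_points)
    solar_radiation = 0.9 * temperature + rng.normal(0, 2, n_points)  # strongly correlated
    rainfall = rng.gamma(2.0, 30.0, n_points)
    wind_speed = rng.normal(10, 3, n_points)
    X = np.column_stack([temperature, solar_radiation, rainfall, wind_speed])

    # Standardise so no parameter dominates just because of its units.
    X_std = StandardScaler().fit_transform(X)

    pca = PCA(n_components=2)
    scores = pca.fit_transform(X_std)      # 2D coordinates for each geographical point
    print(pca.explained_variance_ratio_)   # ideally ~80% of the variation in two components
    # Points with similar weather now plot close together in the scores[:, 0] / scores[:, 1] plane.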
PCA works under the assumption that you're dealing with a low-dimensional signal that was projected into a high-dimensional space and then corrupted by low-magnitude high-dimensional noise. By taking only the top k components, you filter out the noise and get a better signal. But if you have a low-magnitude high-dimensional signal corrupted by highly-correlated noise, then naively applying PCA will filter out the signal and leave you with only noise.
So whether PCA makes your data more meaningful or less really depends on whether that assumption is satisfied or not. Dimensionality reduction is no silver bullet.
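To illustrate that assumption (a toy construction of my own, not the parent's data): a 2-dimensional signal projected into 50 dimensions and corrupted by small isotropic noise is recovered well by keeping only the top 2 components.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(1)

    # Low-dimensional signal: 2 latent factors projected into 50 dimensions...
    latent = rng.normal(size=(1000, 2))
    projection = rng.normal(size=(2, 50))
    signal = latent @ projection

    # ...corrupted by low-magnitude, high-dimensional noise.
    noise = 0.1 * rng.normal(size=(1000, 50))
    X = signal + noise

    pca = PCA(n_components=2).fit(X)
    print(pca.explained_variance_ratio_.sum())  # close to 1: the top 2 components capture the signal

    # If instead the noise were large and highly correlated (e.g. a strong common-mode drift),
    # the leading components would capture that drift and the real signal would be discarded.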
I don't understand what "highly-correlated noise" means. I thought noise was uncorrelated to the signal by definition.
Also, I understand that PCA is not scale invariant, but I'm having trouble relating that to "low-magnitude, high-dimensional signal".
I used that example because something similar actually happened (although there's no indication PCA was used): https://semiengineering.com/training-a-neural-network-to-fal...
(Of course, a better approach if possible would be baseline normalisation.)
Take a list of 100 questions all answered on a 1-5 scale (disagree to agree). Sometimes PCA can help you group these statements into like concepts, which means that instead of having to look at variation across all 100 statements you only need to look at a handful.
However, in doing so, you might miss out on an interesting pattern in a subset of the data where a given subset of respondents are consistently answering a single choice differently. Maybe in aggregate that signal doesn’t show up, and so you would wash it out in PCA. Obviously, in this case, each of the 100 questions is a dimension.
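A rough sketch of that setup (the respondent counts and data are invented, and with purely random answers the groupings are meaningless, but the mechanics are the same): inspect the component loadings to see which of the 100 statements tend to vary together.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(2)

    # 1000 respondents answering 100 Likert-scale questions (1 = disagree ... 5 = agree).
    responses = rng.integers(1, 6, size=(1000, 100)).astype(float)

    pca = PCA(n_components=5).fit(responses)

    # Each row of components_ is a direction in "question space"; questions with large
    # loadings on the same component vary together, i.e. form a "like concept".
    for i, component in enumerate(pca.components_):
        top_questions = np.argsort(np.abs(component))[-5:]
        print(f"Component {i + 1}: questions {sorted(top_questions.tolist())}")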
However, as explained by another comment, it does not always become "meaningless". In fact, it becomes variable. For example, if you are interested in secondary structure in your data, you can plot factors 2, 3, 4 or higher against another factor and often the change in pairwise distance at different factors is meaningful.
When one examined the data, some parameters were in 40 square kilometer boxes and others in more like 0.5 square kilometer boxes. Factor 1 would tell you which parameters are in the 40 square kilometer boxes.
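For example, plotting components 2 and 3 against each other instead of 1 and 2 is just a matter of picking different columns of the scores (a minimal sketch with placeholder data; X stands in for your own standardised matrix):

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA

    # X is assumed to be your standardised data matrix (rows = observations, columns = parameters).
    X = np.random.default_rng(3).normal(size=(300, 10))

    scores = PCA(n_components=4).fit_transform(X)

    # Components 2 and 3 (0-indexed columns 1 and 2) can expose secondary structure
    # that the dominant first component would otherwise hide.
    plt.scatter(scores[:, 1], scores[:, 2])
    plt.xlabel("Component 2")
    plt.ylabel("Component 3")
    plt.show()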
Another interesting example: one can use this to counteract the effects of inflation on sales prices.
Principal is kinda mathematics' way of saying something is important or critical.
Component basically refers to parts.
Analysis basically refers to this method's ability to help you understand what's going on with your data.
So the main idea here is like actually pretty simple.
Let's say you want to understand what makes a difference to a person's SAT score. You track the number of hours of study, maybe the school's average SAT score, the person's age, and the average of their mock SAT scores.
Now you know that some of the columns of data are more important than others, and you want to know which ones.
So in regression what you would do is fit a line that best lies in the middle of these points by playing around with the weights or 'importance' of the columns till you get the least distance from the points.
What PCA does is it tells you which variables are important and gives you a sense of ranking of these variables.
So it has the ability to select variables, rank their importance and tell you which columns of data matter more than others.
Ergo it tells you which components are principal and helps you analyse them through their importance ranking.
Because this method does so many things you can do a ton of cool stuff.
If you have a ton of data, this method can tell you to focus on the 3 or 4 columns which have the biggest impact, so it can help you prioritize.
Second, if you are looking at optimizing your system, let's say SAT scores, it can tell you that a better school can make a bigger difference than just brutal hours of practice.
In networks like social networks, it can tell you who is the most important/prestigious/coolest person by looking at friendship or social messaging links between people.
So to sum up, it gives you an idea of what is important in your data, gives a sense of the quantum of its importance and hence gives a deeper feel for what's going on.
One big point with PCA is that it's a linear method. Which means variables which have exponential impact on your study will not get signalled well. So transformation and processing your data is critical for this method to work.
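If it helps, here is a minimal sketch of the SAT example above. The column names and numbers are made up, and standardising (plus transforming skewed variables, e.g. with np.log1p) is the "processing" step mentioned above:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(4)
    n = 200

    # Made-up columns mirroring the SAT example.
    hours_of_study = rng.normal(15, 5, n)
    school_avg_sat = rng.normal(1100, 100, n)
    age = rng.normal(17, 1, n)
    mock_sat_avg = 20 * hours_of_study + 0.5 * school_avg_sat + rng.normal(0, 50, n)

    X = np.column_stack([hours_of_study, school_avg_sat, age, mock_sat_avg])

    # PCA is linear, so standardise (and transform exponential/skewed variables first).
    X_std = StandardScaler().fit_transform(X)
    pca = PCA().fit(X_std)

    print(pca.explained_variance_ratio_)  # how much of the variation each component carries
    print(pca.components_[0])             # loadings: which columns drive the first component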
Hope this helped.
You can reduce dimensions by saying some of the eigenvalues are "noise" and discarding them.
It is a means of defining a new coordinate system that is ordered by dimensions of decreasing variance.
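In code, that coordinate change is just an eigendecomposition of the covariance matrix. A bare-numpy sketch of the idea, not production code:

    import numpy as np

    def pca_coordinates(X, k):
        """Project X onto the k directions of largest variance."""
        X_centered = X - X.mean(axis=0)
        cov = np.cov(X_centered, rowvar=False)
        eigenvalues, eigenvectors = np.linalg.eigh(cov)   # returned in ascending order
        order = np.argsort(eigenvalues)[::-1]             # reorder by decreasing variance
        top_k = eigenvectors[:, order[:k]]                # keep k axes, discard the "noise" ones
        return X_centered @ top_k

    X = np.random.default_rng(5).normal(size=(100, 6))
    print(pca_coordinates(X, 2).shape)  # (100, 2)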
So a component is one of those shapes. In the case of PCA, the shapes are ellipses. For SOM, the shapes could be more complex, like a zigzag. For k-means, the shapes would be like Voronoi cells.
In most cases where PCA is used for an ML algorithm, this is very important to take into account, since you can lose a feature which is quite discriminating but which gets squeezed into a discarded axis if its coordinate is too small.
There are obviously methods to avoid this, like whitening the input data, but that doesn't cut it completely.
Another trick for datasets with high dimensionality is that you can randomly project the data to a lower-dimensional space using random Gaussian vectors. By the Johnson-Lindenstrauss lemma, the projection approximately preserves pairwise distances, so the projected dataset will have statistically similar properties under PCA.
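A minimal sketch of that trick, with the dimensions chosen arbitrarily: project with a random Gaussian matrix, then run PCA on the much smaller projected data.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(6)

    X = rng.normal(size=(1000, 10_000))   # high-dimensional dataset

    # Random Gaussian projection down to 300 dimensions; by the Johnson-Lindenstrauss
    # lemma, pairwise distances are approximately preserved.
    d_target = 300
    R = rng.normal(size=(X.shape[1], d_target)) / np.sqrt(d_target)
    X_proj = X @ R

    pca = PCA(n_components=10).fit(X_proj)  # much cheaper than PCA on the original 10,000 columns
    print(pca.explained_variance_ratio_)

(scikit-learn also ships this projection step as sklearn.random_projection.GaussianRandomProjection, if you'd rather not roll your own.)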