Machine Learning Resources

What is Principal Component Analysis (PCA), and how does it differ from clustering?

Principal Component Analysis (PCA) is a dimensionality reduction technique that summarizes the variability in a multi-feature dataset through linear combinations of the original features. Each linear combination is called a principal component, and the components are mutually orthogonal, which means they are uncorrelated with one another.
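As a rough illustration of the orthogonality property, the sketch below computes principal components with NumPy by eigen-decomposing the feature covariance matrix. The `pca` helper and the synthetic data are assumptions made for this example, not part of the original answer, and eigen-decomposition is only one of several ways to compute PCA.

```python
import numpy as np

def pca(X):
    """PCA via eigen-decomposition of the covariance matrix.

    X has one row per observation and one column per feature.
    Returns (components, variances, scores), ordered from the
    highest-variance component to the lowest.
    """
    X_centered = X - X.mean(axis=0)          # center each feature
    cov = np.cov(X_centered, rowvar=False)   # k x k feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    variances = eigvals[order]
    components = eigvecs[:, order]           # each column is one principal component
    scores = X_centered @ components         # data expressed in the new coordinates
    return components, variances, scores

# Synthetic data (an assumption for illustration): 200 observations, 3 features,
# where the first two features are driven by the same latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X = np.column_stack([
    latent + 0.1 * rng.normal(size=200),
    2.0 * latent + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])

components, variances, scores = pca(X)

# Orthogonality: the component vectors form an orthonormal set,
# so this product should be close to the identity matrix.
gram = components.T @ components

# Uncorrelatedness: the covariance of the scores should be close to diagonal,
# with the component variances on the diagonal.
score_cov = np.cov(scores, rowvar=False)

print(np.round(gram, 6))
print(np.round(score_cov, 6))
```

Each column of `components` holds the weights of one linear combination of the original features, and the near-zero off-diagonal entries of `score_cov` are what "uncorrelated" means in practice.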

The first principal component explains the largest share of the variability among the features, and each subsequent component explains no more than the one before it. If there are k original features, up to k principal components can be created, but because PCA is a reduction technique, the number retained is usually much smaller. That number can be chosen with a heuristic similar to the elbow plot in k-means clustering; for PCA, the plot is based on the cumulative proportion of variance explained.

The main difference between clustering and PCA is that clustering attempts to find groupings among the observations, or rows, whereas PCA performs reduction among the features, or columns. The two are nonetheless similar in that both are unsupervised learning methods that require user interpretation to derive practical meaning from the results.
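The cumulative-variance heuristic can be sketched in a few lines of NumPy. The helper names, the synthetic data, and the 0.95 threshold below are all assumptions chosen for illustration; in practice the cutoff is a judgment call, often made by inspecting the plotted curve.

```python
import numpy as np

def explained_variance(X):
    """Return centered data, component variances (descending), and components."""
    X_centered = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X_centered, rowvar=False))
    # eigh returns ascending order, so reverse to put the largest variance first.
    return X_centered, eigvals[::-1], eigvecs[:, ::-1]

def choose_n_components(cum_ratio, threshold=0.95):
    """Smallest number of components whose cumulative ratio reaches threshold."""
    n = int(np.searchsorted(cum_ratio, threshold)) + 1
    return min(n, len(cum_ratio))  # guard against floating-point shortfall

# Synthetic data (an assumption for illustration): 300 observations of
# 5 features that are generated from only 2 underlying factors plus noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(300, 2))
loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 1.0]])
X = factors @ loadings + 0.05 * rng.normal(size=(300, 5))

X_centered, variances, components = explained_variance(X)
cum_ratio = np.cumsum(variances) / variances.sum()  # cumulative proportion explained

n_components = choose_n_components(cum_ratio, threshold=0.95)
reduced = X_centered @ components[:, :n_components]  # keep only the leading components

print("cumulative proportion:", np.round(cum_ratio, 4))
print("components kept:", n_components)
print("shape before:", X.shape, "shape after:", reduced.shape)
```

With this synthetic data, two components should be enough to pass the threshold, so `reduced` keeps all 300 rows but fewer columns. A clustering method applied to the same `X` would instead return one group label per row and leave the columns as they are.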
