What is Principal Component Analysis (PCA), and how does it differ from clustering?

Principal Component Analysis (PCA) is a dimension reduction technique that summarizes the variability across many features using linear combinations of the original features. Each new linear combination is called a principal component. The component directions are mutually orthogonal, so the resulting component scores are uncorrelated with one another.
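As a rough illustration of the definition above, here is a minimal sketch of PCA computed with a singular value decomposition in NumPy. The function name `pca` and the synthetic data are assumptions made for the example and are not part of the original answer. In practice a library implementation would normally be used instead.

```python
import numpy as np

def pca(X):
    """Return (components, scores, explained_variance_ratio) for X with n rows and k features."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt                              # each row holds the weights of one linear combination
    scores = Xc @ Vt.T                           # observations expressed in component coordinates
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return components, scores, ratio

# Synthetic data (an assumption for illustration): 200 rows, 3 features,
# where the first two features are strongly correlated.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X = np.column_stack([
    z + 0.1 * rng.normal(size=200),
    2 * z + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])

components, scores, ratio = pca(X)

# The component directions are orthonormal, and the scores are uncorrelated.
gram = components @ components.T
score_cov = np.cov(scores, rowvar=False)
```

Here `gram` should be close to the identity matrix and the off-diagonal entries of `score_cov` should be close to zero, which is the orthogonality property described above.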

The first principal component explains the largest share of the total variance, and each subsequent component explains no more than the one before it. If there are k original features, up to k principal components can be created. Because PCA is a reduction technique, the number of components retained is usually much smaller than k. It can be chosen with a heuristic similar to the elbow plot in k-means clustering, in PCA's case based on the cumulative proportion of variance explained.

The main difference between clustering and PCA is that clustering attempts to find groupings among the observations, or rows, whereas PCA performs reduction among the features, or columns. The two are also similar in important ways: both are unsupervised learning methods, and both require user interpretation to derive practical meaning from the results.
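The selection heuristic can be sketched as follows. This is only an illustration: the helper name `n_components_for`, the 0.95 threshold, and the synthetic data are assumptions, and the threshold in particular is a judgment call rather than a fixed rule.

```python
import numpy as np

def n_components_for(X, threshold=0.95):
    """Smallest number of components whose cumulative variance share reaches `threshold`."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # center each feature
    S = np.linalg.svd(Xc, compute_uv=False)       # singular values, largest first
    ratio = S**2 / np.sum(S**2)                   # proportion of variance per component
    cumulative = np.cumsum(ratio)
    m = int(np.searchsorted(cumulative, threshold)) + 1
    return min(m, len(cumulative)), cumulative

# Synthetic data (an assumption for illustration): 5 features driven by 2 latent factors.
rng = np.random.default_rng(1)
f1 = rng.normal(size=300)
f2 = rng.normal(size=300)
noise = 0.05 * rng.normal(size=(300, 5))
X = np.column_stack([f1, f1, f2, f2, f1 + f2]) + noise

m, cumulative = n_components_for(X, threshold=0.95)

# Project onto the first m components: the rows are kept, the columns are reduced.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:m].T
print(m, X.shape, X_reduced.shape)
```

Note that the number of rows is unchanged by the projection. A clustering method applied to the same data would instead leave the columns alone and assign a group label to each row.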