### What is Principal Component Analysis (PCA), and how does it differ from clustering?

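Principal Component Analysis (PCA) is a dimensionality reduction technique: it replaces the original features with a smaller set of uncorrelated linear combinations of them, the principal components, ordered by how much of the data's variance each one captures. Clustering, in contrast, groups observations; it assigns each row to a cluster and leaves the feature set unchanged.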
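As a rough illustration of what "dimension reduction" means here, the following is a minimal sketch of PCA using NumPy's singular value decomposition. The function name `pca`, the toy data, and the choice of two components are assumptions made for this example only.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    # PCA is defined on centered data, so subtract each feature's mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each direction, as a share of the total variance.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    scores = X_centered @ components.T
    return scores, components, ratio

rng = np.random.default_rng(0)
# Toy data (an assumption for this sketch): 200 rows, 3 features,
# where the second feature is almost a copy of the first.
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=200), rng.normal(size=200)])

scores, components, ratio = pca(X, n_components=2)
print(scores.shape)   # same number of rows, fewer columns
print(ratio)          # share of variance kept by each component
```

Note the shape of the result: every observation is still present, only described by fewer columns. A clustering algorithm run on the same data would instead return one cluster label per row.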

Pros: Has the ability to find more local clusters that K-Means would not be able to differentiate

K-Means aims to minimize the Within Cluster Sum of Squares, while EM aims to maximize the likelihood of an underlying probability distribution.
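To make the two objectives concrete, here is a small sketch on one-dimensional data, assuming a two-cluster toy sample. The helper names `kmeans_1d` and `gmm_em_1d`, the starting values, and the iteration counts are all hypothetical choices for illustration; a real project would more likely use a library implementation.

```python
import numpy as np

def kmeans_1d(x, centers, n_iter=50):
    """Lloyd's algorithm: hard assignments that reduce the within-cluster sum of squares."""
    centers = np.asarray(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its points (keep it if the cluster is empty).
        centers = np.array([x[labels == k].mean() if np.any(labels == k) else centers[k]
                            for k in range(len(centers))])
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    wcss = float(((x - centers[labels]) ** 2).sum())
    return centers, labels, wcss

def gmm_em_1d(x, means, n_iter=200):
    """EM for a 1-D Gaussian mixture: soft assignments that increase the log-likelihood."""
    means = np.asarray(means, dtype=float)
    k = len(means)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    log_likelihoods = []
    for _ in range(n_iter):
        # E-step: weighted density of every point under every component.
        dens = (weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
                / np.sqrt(2 * np.pi * variances))
        total = dens.sum(axis=1, keepdims=True)
        log_likelihoods.append(float(np.log(total).sum()))
        resp = dens / total  # responsibilities (soft cluster memberships)
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
    return means, variances, weights, log_likelihoods

rng = np.random.default_rng(0)
# Toy data (an assumption): two Gaussian groups centered at -3 and 3.
x = np.concatenate([rng.normal(-3.0, 1.0, 150), rng.normal(3.0, 1.0, 150)])

centers, labels, wcss = kmeans_1d(x, centers=[-2.0, 2.0])
means, variances, weights, log_likelihoods = gmm_em_1d(x, means=[-2.0, 2.0])
print("K-Means centers:", centers, "WCSS:", round(wcss, 2))
print("EM means:", means, "final log-likelihood:", round(log_likelihoods[-1], 2))
```

K-Means reports a sum of squared distances and a hard label per point, while EM reports a likelihood and a probability of membership in each component.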

A Gaussian Mixture Model describes an underlying distribution that is composed of multiple individual Gaussian distributions.

Spectral Co-Clustering is an implementation of Co-Clustering that models the input data as a bipartite graph.

Bi-Clustering, or Co-Clustering, seeks to simultaneously cluster both the observations (rows) and the columns of a dataset.

Spectral clustering is an alternative clustering technique that is rooted in graph theory.

Density-based clustering approaches, such as DBSCAN, tend to perform better than partitioning methods like K-Means when clusters are non-globular.

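One problem with clustering high-dimensional data is that common distance metrics, such as Euclidean distance, become less discriminating: as the number of dimensions grows, the distances from a point to its nearest and farthest neighbors tend to become similar.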
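The effect can be checked with a quick simulation. The sketch below assumes uniformly random points in a unit hypercube and measures, for a random query point, how much farther the farthest point is than the nearest one. The function name `relative_contrast` and the chosen dimensions are illustrative assumptions.

```python
import numpy as np

def relative_contrast(n_points, n_dims, rng):
    """(max distance - min distance) / min distance from a random query to random points."""
    points = rng.uniform(size=(n_points, n_dims))
    query = rng.uniform(size=n_dims)
    d = np.linalg.norm(points - query, axis=1)  # Euclidean distances to the query
    return (d.max() - d.min()) / d.min()

rng = np.random.default_rng(0)
contrast = {dims: relative_contrast(1000, dims, rng) for dims in (2, 10, 100, 1000)}
for dims, value in contrast.items():
    print(dims, round(value, 3))
```

A large value means the nearest point is much closer than the farthest one; a value close to zero means all points are roughly equally far away, which leaves a distance-based clustering algorithm little to work with.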

K-Modes is a modification of K-Means for datasets in which all features are categorical; it clusters observations based on matches and mismatches across the features.
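The two building blocks this implies can be sketched in a few lines: a dissimilarity that counts mismatched features, and a cluster "center" made of the most frequent category in each column. This is only a sketch of those pieces, not the full K-Modes algorithm; the function names and the toy rows are assumptions for illustration.

```python
from collections import Counter

def matching_dissimilarity(a, b):
    """Number of features on which two categorical rows disagree."""
    return sum(x != y for x, y in zip(a, b))

def cluster_mode(rows):
    """Column-wise most frequent category, playing the role of the K-Means centroid."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*rows))

# Toy cluster (an assumption): three rows with three categorical features.
rows = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
mode = cluster_mode(rows)
distances = [matching_dissimilarity(row, mode) for row in rows]
print(mode)
print(distances)
```

In a full run, each row would be assigned to the cluster whose mode it mismatches least, and the modes would then be recomputed, mirroring the assign-and-update loop of K-Means.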
