What is Principal Component Analysis (PCA), and how does it differ from clustering?
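Principal Component Analysis (PCA) is a dimension reduction technique: it projects the features onto a smaller set of uncorrelated components that retain as much of the variance as possible. Clustering, by contrast, groups the observations rather than reducing the number of features.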
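As a rough illustration of the difference, the sketch below computes PCA by hand with NumPy's SVD on made-up data (the data, seed, and variable names are assumptions for illustration only). Note that every observation is kept; only the number of columns shrinks, whereas a clustering method would instead attach a group label to each row.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 3 features, where the third
# feature is almost a linear combination of the first two.
base = rng.normal(size=(200, 2))
third = base @ np.array([0.5, -1.0]) + 0.01 * rng.normal(size=200)
X = np.column_stack([base, third])

# PCA via SVD of the mean-centered data matrix.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Variance captured by each principal component, as a share of the total.
explained_variance = S**2 / (len(X) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Keep the two leading components: same 200 rows, fewer columns.
X_reduced = X_centered @ Vt[:2].T

print("explained variance ratio:", np.round(explained_ratio, 4))
print("shape before:", X.shape, "after:", X_reduced.shape)
```

Because the third feature is nearly redundant here, the first two components should account for almost all of the variance.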
Pros: It can find more local clusters that K-Means would not be able to differentiate.
K-Means aims to minimize the Within-Cluster Sum of Squares (WCSS), while Expectation-Maximization (EM) aims to maximize the likelihood of an underlying probability distribution.
A Gaussian Mixture Model (GMM) describes an underlying distribution that is composed of multiple individual Gaussian distributions.
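The two objectives can be written down directly. The sketch below is only an illustration on made-up 1-D data: `wcss` and `gmm_log_likelihood` are hypothetical helper names, not library functions, and no fitting is performed. It evaluates the quantity K-Means tries to make small and the quantity EM tries to make large for a mixture of Gaussians.

```python
import numpy as np

def wcss(X, labels, centers):
    """Within-Cluster Sum of Squares: the quantity K-Means minimizes."""
    diffs = X - centers[labels]
    return float(np.sum(diffs**2))

def gmm_log_likelihood(x, weights, means, stds):
    """Log-likelihood of 1-D data under a Gaussian mixture: the quantity EM maximizes."""
    x = x[:, None]  # shape (n, 1) so it broadcasts against the k components
    component_pdf = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    mixture_pdf = component_pdf @ weights  # weighted sum of the individual Gaussians
    return float(np.sum(np.log(mixture_pdf)))

rng = np.random.default_rng(0)
# Hypothetical 1-D data drawn from two Gaussians centered at -3 and +3.
x = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])

# K-Means view: hard labels plus one center (the mean) per cluster.
labels = (x > 0).astype(int)
centers = np.array([[x[labels == 0].mean()], [x[labels == 1].mean()]])
print("WCSS:", wcss(x[:, None], labels, centers))

# EM view: how likely the data is under a given set of mixture parameters.
weights = np.array([0.5, 0.5])
good = gmm_log_likelihood(x, weights, np.array([-3.0, 3.0]), np.array([1.0, 1.0]))
bad = gmm_log_likelihood(x, weights, np.array([-1.0, 1.0]), np.array([1.0, 1.0]))
print("log-likelihood (generating parameters):", good)
print("log-likelihood (shifted means):", bad)
```

Parameters close to the ones that generated the data should score a higher log-likelihood, which is the direction EM moves in at each iteration.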
Spectral Co-Clustering is an implementation of Co-Clustering that models the input data as a bipartite graph linking rows to columns.
Bi-Clustering, or Co-Clustering, seeks to simultaneously cluster both the rows (observations) and the columns (features) of a dataset.
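A minimal sketch of co-clustering, assuming scikit-learn is available: `make_biclusters` generates a synthetic matrix with planted blocks, and `SpectralCoclustering` assigns a cluster label to every row and every column at once. The matrix shape, noise level, and number of clusters are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters

# Hypothetical 30 x 20 matrix with 3 planted row/column blocks.
data, true_rows, true_cols = make_biclusters(
    shape=(30, 20), n_clusters=3, noise=1.0, random_state=0
)

model = SpectralCoclustering(n_clusters=3, random_state=0)
model.fit(data)

# One label per row AND one label per column.
print("row labels:   ", model.row_labels_)
print("column labels:", model.column_labels_)

# Sorting rows and columns by their labels groups each co-cluster into a block.
reordered = data[np.argsort(model.row_labels_)][:, np.argsort(model.column_labels_)]
print("reordered shape:", reordered.shape)
```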
Spectral clustering is an alternative clustering technique that is rooted in graph theory.
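To make the graph-theory connection concrete, here is a deliberately simplified two-cluster sketch in NumPy. It is an assumption-laden illustration, not how a library implements it: the data (two concentric rings) is made up, the affinity is a Gaussian kernel with a fixed width, and the split uses the sign of a single eigenvector, whereas typical implementations run K-Means on several eigenvectors.

```python
import numpy as np

# Hypothetical data: two concentric rings of 100 points each (radii 1 and 5).
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
ring = np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([1.0 * ring, 5.0 * ring])
true_labels = np.repeat([0, 1], 100)

# 1. Similarity graph: Gaussian (RBF) affinity between every pair of points.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists)
np.fill_diagonal(W, 0.0)  # no self-loops

# 2. Unnormalized graph Laplacian L = D - W, where D is the degree matrix.
L = np.diag(W.sum(axis=1)) - W

# 3. Eigenvector of the second-smallest eigenvalue (the Fiedler vector).
#    np.linalg.eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]

# 4. Two-way partition of the graph by the sign of the Fiedler vector.
labels = (fiedler > 0).astype(int)

print("inner ring labels:", set(labels[:100].tolist()))
print("outer ring labels:", set(labels[100:].tolist()))
```

The rings are only weakly connected to each other in the similarity graph, so the partition follows the graph structure rather than distance to a centroid.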
Density-based clustering approaches, such as DBSCAN, tend to perform better than partitioning methods like K-Means when clusters are non-globular, that is, not roughly spherical.
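A small comparison on a non-globular shape, assuming scikit-learn is installed. The data (two noise-free concentric rings) and the `eps` / `min_samples` values are illustrative assumptions chosen to suit this toy dataset, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import adjusted_rand_score

# Hypothetical data: two concentric rings of 100 points each (radii 1 and 5).
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
ring = np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([1.0 * ring, 5.0 * ring])
true_labels = np.repeat([0, 1], 100)

# DBSCAN links points that are densely connected, so it can follow each ring.
dbscan_labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)

# K-Means assigns each point to its nearest centroid, which splits the plane
# with a straight boundary and cannot separate one ring from the other.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Adjusted Rand Index: 1.0 means perfect agreement with the ring membership.
dbscan_ari = adjusted_rand_score(true_labels, dbscan_labels)
kmeans_ari = adjusted_rand_score(true_labels, kmeans_labels)
print("DBSCAN ARI: ", round(dbscan_ari, 3))
print("K-Means ARI:", round(kmeans_ari, 3))
```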
One problem with clustering high-dimensional data is that common distance metrics, such as Euclidean distance, become less informative: distances between points tend to concentrate, so near and far neighbors are hard to tell apart.
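The effect can be checked empirically. The sketch below draws uniform random points (an assumption made purely for illustration; `relative_contrast` is a hypothetical helper) and measures how much farther the farthest point is than the nearest one, relative to the nearest, as the number of dimensions grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_contrast(dim, n_points=500):
    """(max - min) / min of Euclidean distances from one query point to random points."""
    points = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# A large value means near and far neighbors are clearly distinguishable.
contrasts = {dim: relative_contrast(dim) for dim in (2, 10, 100, 1000)}
for dim, value in contrasts.items():
    print(f"dim={dim:>4}  relative contrast={value:.3f}")
```

As the dimension increases the contrast should shrink toward zero, which is why distance-based clustering struggles without dimension reduction or a different similarity measure.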
K-Modes is a modification of K-Means for datasets in which all features are categorical; it uses modes instead of means as cluster centers and clusters observations based on the number of matches/mismatches across the features.
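A hand-rolled sketch of the idea in plain Python. It is not a library API: the function names, the toy data, and the choice of the first and fourth rows as initial modes are all assumptions for illustration. Dissimilarity is the count of mismatched features, and each cluster center is the per-column mode.

```python
from collections import Counter

def mismatches(a, b):
    """Simple matching dissimilarity: number of features where a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """K-Modes cluster center: the most frequent category in each column."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def k_modes(data, init_modes, n_iter=10):
    modes = list(init_modes)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each row goes to the mode with the fewest mismatches.
        labels = [
            min(range(len(modes)), key=lambda k: mismatches(row, modes[k]))
            for row in data
        ]
        # Update step: recompute each mode from the rows assigned to it.
        new_modes = []
        for k in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == k]
            new_modes.append(column_modes(members) if members else modes[k])
        if new_modes == modes:  # converged: the modes no longer change
            break
        modes = new_modes
    return labels, modes

# Hypothetical all-categorical dataset: (color, size, shape).
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]

labels, modes = k_modes(data, init_modes=[data[0], data[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(modes)   # [('red', 'small', 'round'), ('blue', 'large', 'square')]
```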