What are some of the pros and cons of GMMs?
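Pros: Can separate overlapping or elongated clusters that K-Means would not be able to differentiate, because each component has its own covariance and points receive soft (probabilistic) assignments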
K-Means aims to minimize the Within Cluster Sum of Squares, while EM aims to maximize the likelihood of an underlying probability distribution.
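Written out, the two objectives look roughly as follows. The notation is assumed here for illustration: C_k is the set of points assigned to cluster k, mu_k is its centroid (or component mean), and pi_k and Sigma_k are the mixture weight and covariance of component k.

```latex
\text{K-Means:}\quad
\min_{C_1,\dots,C_K}\; \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2
\qquad
\text{EM (GMM):}\quad
\max_{\pi,\,\mu,\,\Sigma}\; \sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k \,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)
```

The first is a sum over hard assignments; the second sums over all components for every point, which is where the soft assignments come from.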
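When used for clustering, any of the standard clustering evaluation metrics is appropriate: internal ones such as the Silhouette Score or Dunn Index, or, when ground-truth labels are available, external ones such as the Rand Index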
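As a minimal sketch of one such metric, the Silhouette Score can be computed directly from pairwise distances. The function name, the toy points, and the two labelings below are illustrative assumptions; the sketch uses Euclidean distance and expects at least two clusters.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:
            scores.append(0.0)  # singleton cluster: silhouette is defined as 0
            continue
        a = mean_dist[own]                                        # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)      # separation
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight groups far apart
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_score = silhouette_score(points, [0, 0, 0, 1, 1, 1])  # labels match the groups
bad_score = silhouette_score(points, [0, 1, 0, 1, 0, 1])   # labels mix the groups
print(good_score, bad_score)
```

A labeling that matches the visible groups should score close to 1, while one that mixes them scores much lower.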
A Gaussian Mixture Model describes an underlying distribution that is composed of multiple individual Gaussian distributions
Expectation-Maximization refers to a two-step, iterative process that is often used when latent or unobserved variables underlie the data-generating process.
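The two steps can be sketched for a one-dimensional Gaussian mixture, where the latent variable is the component each point came from. This is a minimal illustration, not a production implementation: the function names, the synthetic data, and the starting values are all assumptions made for the example.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, n_iter=50):
    """Run EM on a 1-D Gaussian mixture; returns parameters and log-likelihood history."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    history = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        resp, log_lik = [], 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)
        # M-step: re-estimate parameters from responsibility-weighted data
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsing component
    return means, variances, weights, history

# Synthetic data (assumed for illustration): two Gaussians centered at -2 and 3
rng = random.Random(0)
data = [rng.gauss(-2, 1) for _ in range(300)] + [rng.gauss(3, 1) for _ in range(300)]
means, variances, weights, history = em_gmm_1d(
    data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print(sorted(means), weights)
```

Each iteration should leave the log-likelihood in `history` no lower than the previous one, which is the property that makes EM a likelihood-maximizing procedure.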
Spectral Co-Clustering is an implementation of Co-Clustering that models the input data as a bipartite graph
Bi-Clustering, or Co-Clustering, seeks to cluster the rows (observations) and the columns (features) of a dataset simultaneously.
Spectral clustering is an alternative clustering technique that is rooted in graph theory.
Density-based clustering approaches, such as DBSCAN, tend to perform better than partitioning methods like K-Means when clusters are non-globular
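A small sketch of the idea, using two concentric rings as the non-globular case. The `dbscan` function below is a simplified, brute-force version written for illustration (it is not the scikit-learn implementation), and the ring data, `eps`, and `min_pts` values are assumptions chosen for the example.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one label per point, with -1 marking noise."""
    n = len(points)
    # Brute-force eps-neighborhoods (each neighborhood includes the point itself)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # only core points expand the cluster
                frontier.extend(neighbors[j])
    return labels

def ring(radius, count):
    """Evenly spaced points on a circle centered at the origin."""
    return [(radius * math.cos(2 * math.pi * t / count),
             radius * math.sin(2 * math.pi * t / count)) for t in range(count)]

def centroid(pts):
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

inner, outer = ring(1.0, 40), ring(4.0, 100)
labels = dbscan(inner + outer, eps=0.6, min_pts=3)
print(set(labels[:40]), set(labels[40:]))
print(centroid(inner), centroid(outer))
```

Both rings have essentially the same centroid, so a centroid-based partition has no way to give each ring its own center, whereas the density-based pass follows each ring around and keeps them apart.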
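One problem with clustering high-dimensional data is that common distance metrics, such as Euclidean distance, become less informative: as the number of dimensions grows, distances between points tend to concentrate, so the nearest and farthest neighbors of a point end up almost equally far away.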
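This effect can be observed with a quick simulation. The sketch below assumes points drawn uniformly from the unit hypercube and measures the relative contrast, (max - min) / min, of Euclidean distances from one random query point; the function name and the sample sizes are illustrative choices.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of Euclidean distances from one random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, [rng.random() for _ in range(dim)])
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)

low_dim = relative_contrast(2)       # nearest and farthest points differ a lot
high_dim = relative_contrast(1000)   # nearest and farthest points are nearly equidistant
print(low_dim, high_dim)
```

The contrast should be far smaller in 1000 dimensions than in 2, which is why distance-based clustering often benefits from dimensionality reduction first.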