Pros: Can find more localized clusters that K-Means cannot distinguish
K-Means aims to minimize the within-cluster sum of squares (WCSS), while EM aims to maximize the likelihood of the data under an assumed probability distribution.
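To make the K-Means objective concrete, here is a minimal sketch of the within-cluster sum of squares for a fixed assignment. The function name `wcss` and the toy points are illustrative assumptions, not part of any library.

```python
def wcss(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        center = centroids[label]
        total += sum((a - b) ** 2 for a, b in zip(point, center))
    return total


# Hypothetical toy data: two tight groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]

good_labels = [0, 0, 1, 1]  # each point assigned to its nearest centroid
bad_labels = [0, 1, 0, 1]   # two points assigned to the far centroid

print(wcss(points, good_labels, centroids))  # 4.0
print(wcss(points, bad_labels, centroids))   # 204.0
```

K-Means alternates between reassigning labels and recomputing centroids so that this quantity never increases.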
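When used for clustering, any of the standard clustering evaluation metrics (Silhouette Score, Dunn Index, Rand Index, etc.) is appropriate; external metrics such as the Rand Index additionally require ground-truth labels.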
A Gaussian Mixture Model describes an underlying distribution that is composed of multiple individual Gaussian distributions
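Written out, the standard mixture density looks as follows. The symbols are notation introduced here for illustration: K is the number of components, and pi_k, mu_k, and Sigma_k are the weight, mean, and covariance of component k.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1
```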
Expectation-Maximization (EM) is a two-step, iterative procedure that is often used when latent (unobserved) variables underlie the data-generating process.
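The two steps can be sketched for a one-dimensional, two-component Gaussian mixture, where the latent variable is the component that generated each point. This is a minimal illustration under assumed toy data and starting values, not a production implementation; all function names and the variance floor are choices made here.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances):
    """One EM iteration: E-step (responsibilities), then M-step (parameter updates)."""
    k = len(weights)
    # E-step: posterior probability that each component generated each point.
    resp = []
    for x in data:
        joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: re-estimate parameters from responsibility-weighted points.
    new_weights, new_means, new_vars = [], [], []
    for j in range(k):
        n_j = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / n_j
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / n_j
        new_weights.append(n_j / len(data))
        new_means.append(mean)
        new_vars.append(max(var, 1e-6))  # assumed floor to avoid a collapsed variance
    return new_weights, new_means, new_vars


# Hypothetical toy data: one group near -2, another near 3.
data = [-2.1, -1.9, -2.0, -1.8, -2.2, 3.0, 3.2, 2.8, 3.1, 2.9]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])  # assumed initial guess

history = [log_likelihood(data, *params)]
for _ in range(20):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))

weights, means, variances = params
print("means:", [round(m, 3) for m in means])
print("log-likelihood went from", round(history[0], 3), "to", round(history[-1], 3))
```

The log-likelihood recorded in `history` should not decrease from one iteration to the next, which is the defining property of EM.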
Pros: No need to specify the number of clusters before running the algorithm
A dendrogram is a tree-like diagram that shows the order in which clusters are merged or split by a hierarchical clustering algorithm.
The two approaches to hierarchical clustering are agglomerative (bottom-up) and divisive (top-down).
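As a rough sketch of the agglomerative (bottom-up) approach, the code below repeatedly merges the two closest clusters and records each merge. That merge sequence, with its distances, is the information a dendrogram draws. Single linkage (closest pair of members) and the 1-D toy points are assumptions made here; the notes do not name a linkage.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    in the order the merges happen.
    """
    clusters = [[p] for p in points]  # start with every point in its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


merges = single_linkage([1.0, 1.5, 5.0, 5.2, 9.0])
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 2))
```

Cutting the merge history at a chosen distance yields a flat clustering, which is why the number of clusters does not have to be fixed up front.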
Pros: Easy to implement
Cons: Must specify number of clusters in advance
Because many clustering algorithms are distance-based, outliers can have several undesired effects on the quality of the clusters produced.
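One such effect can be sketched with a single cluster: an outlier drags the centroid away from the bulk of the points and inflates the sum of squared distances. The helper names and the toy points are illustrative assumptions.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sse(points, center):
    """Sum of squared Euclidean distances from the points to a center."""
    return sum(sum((a - b) ** 2 for a, b in zip(p, center)) for p in points)


# Hypothetical toy data: a compact cluster plus one distant outlier.
cluster = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]
outlier = (22.0, 22.0)

clean_center = centroid(cluster)                # (2.0, 2.0)
shifted_center = centroid(cluster + [outlier])  # (6.0, 6.0)

print(clean_center, sse(cluster, clean_center))                # (2.0, 2.0) 8.0
print(shifted_center, sse(cluster + [outlier], shifted_center))  # (6.0, 6.0) 648.0
```

Here the centroid ends up outside the original group entirely, so it no longer represents any of the points it summarizes.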