What is Expectation-Maximization (EM)?

Expectation-Maximization (EM) is an iterative, two-step procedure for maximum-likelihood estimation when the data-generating process involves latent (unobserved) variables. It provides the framework commonly used to fit a Gaussian Mixture Model, which is widely applied in unsupervised learning. The algorithm alternates between the E-step, which uses the current parameter estimates to compute the probability that each observation belongs to each underlying distribution, and the M-step, which re-estimates the parameters of each distribution to maximize the expected log-likelihood under those probabilistic assignments. The two steps repeat until convergence, meaning the assignment probabilities and the parameter estimates (and therefore the likelihood) no longer change appreciably from one iteration to the next.
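To make the two steps concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture, written in plain Python. The function names (`normal_pdf`, `fit_gmm_em`), the quantile-based initialization, the tolerance, and the small numerical floors are all assumptions chosen for illustration. They are not part of any particular library, and a production implementation would typically work in log space and handle degenerate components more carefully.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, max_iter=500, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles,
    # a shared overall variance, and equal mixing weights.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, 1e-6)
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: responsibility of each component for each observation,
        # i.e. the posterior probability of membership under current parameters.
        resp = []
        ll = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = max(sum(joint), 1e-300)  # floor guards against underflow
            resp.append([p / total for p in joint])
            ll += math.log(total)

        # Convergence check: stop once the log-likelihood of the current
        # parameters has stopped improving by more than tol.
        if ll - prev_ll < tol:
            break
        prev_ll = ll

        # M-step: responsibility-weighted updates of weights, means, variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids a collapsed component

    return weights, means, variances


# Synthetic data from two well-separated Gaussians (purely illustrative).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances = fit_gmm_em(data, k=2)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

On data like this, the estimated means should land near the two generating means and the mixing weights near one half each. Note that the assignments stay soft throughout: each observation keeps a probability of membership in every component rather than being moved outright from one distribution to another.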