Pros:
- Can separate local clusters that K-Means cannot distinguish when the clusters overlap heavily at a global scale
- Provides probability estimates of belonging to each cluster (soft clustering)
- More flexible, because the covariance structure of the components can be specified
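
To make the soft-clustering and covariance points concrete, here is a minimal sketch in plain Python. It assumes the model under discussion is a Gaussian mixture, reduced to one dimension so that the "covariance structure" is just a per-component standard deviation. The weights, means, and spreads are hypothetical values chosen for illustration, not fitted parameters. The two components share the same center, so a purely distance-based method such as K-Means has nothing to separate them by, while the mixture still assigns different membership probabilities.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, means, stds):
    """Posterior probability that x was generated by each component."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical mixture: a narrow cluster sitting inside a broad one.
weights = [0.5, 0.5]
means = [0.0, 0.0]   # identical centers
stds = [0.5, 3.0]    # different spreads (1-D stand-in for covariance)

for x in (0.2, 1.5, 4.0):
    probs = responsibilities(x, weights, means, stds)
    # Each entry is a soft membership; the entries sum to 1.
    print(x, [round(p, 3) for p in probs])
```

Points near the shared center lean toward the narrow component and points far from it lean toward the broad one, which is the kind of distinction the first and third bullets describe.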
Cons:
- Not guaranteed to converge to the global optimum: if the algorithm is run multiple times on the same data set with the same number of components, the cluster assignments may differ (as with K-Means)
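
The dependence on the starting point can be explored with a small experiment. The sketch below assumes a 1-D Gaussian mixture fitted with a hand-written EM loop. The function `fit_gmm_1d`, the synthetic data, and the seeds are all hypothetical choices for illustration, not part of any library. Each seed picks different initial means, so the runs may end at different solutions with different log-likelihoods. Comparing the runs and keeping the one with the highest log-likelihood is one common way to reduce the effect.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(data, k, seed, n_iter=50, min_std=0.1):
    """Fit a k-component 1-D Gaussian mixture with EM from a random start.

    Returns (log_likelihood, sorted_means) after n_iter iterations.
    """
    rng = random.Random(seed)
    means = rng.sample(data, k)   # initial means drawn from the data
    stds = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: soft assignment of every point to every component.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, stds)]
            total = max(sum(joint), 1e-300)   # guard against underflow
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and spreads.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), min_std)   # floor avoids collapse
    log_lik = sum(
        math.log(max(sum(w * normal_pdf(x, m, s)
                         for w, m, s in zip(weights, means, stds)), 1e-300))
        for x in data
    )
    return log_lik, sorted(means)


# Synthetic data: three groups centered near 0, 5, and 10.
data_rng = random.Random(0)
data = [data_rng.gauss(c, 1.0) for c in (0.0, 5.0, 10.0) for _ in range(60)]

# Same data, same number of components, different starting points.
runs = [fit_gmm_1d(data, 3, seed) for seed in range(5)]
for seed, (log_lik, means) in enumerate(runs):
    print(seed, round(log_lik, 2), [round(m, 2) for m in means])

# Keep the run with the highest log-likelihood.
best = max(runs, key=lambda r: r[0])
```

Fixing the seed makes a single run reproducible, but it does not make that run the global optimum.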