Machine Learning Resources

How does K-Means Work?

  • K-Means starts by selecting initial centroids for the k clusters, typically by randomly choosing k observations. Different initializations can produce different cluster assignments, so the algorithm is often run several times, and the run that produces the most compact clusters (the lowest within-cluster sum of squared distances) is ultimately chosen. The researcher must decide on a value for k ahead of time, but as with hyperparameters in supervised machine learning, it can be tuned by comparing results across several candidate values. 
  • It then assigns each observation to the cluster of its nearest centroid. Standard K-Means measures nearness with (squared) Euclidean distance; other measures, such as Mahalanobis distance, appear only in variants of the algorithm. Because the algorithm is distance-based, all of the features should be scaled to a similar range as part of the data preprocessing. 
  • Next, it recalculates each cluster's centroid by moving it to the mean of all observations currently belonging to that cluster. 
  • With the centroids updated, it re-assigns any observation that is now closer to a different centroid, which makes the clusters more compact than on the previous step. 
  • It continues alternating between re-assignment and centroid updates until no more reassignments result in further improvement of the clustering, that is, until the assignments stop changing. 
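
The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not a reference implementation: it assumes Euclidean distance, random selection of k observations as the initial centroids, and standardization as the scaling step. The function name `kmeans`, its parameters (`n_init`, `max_iter`, `seed`), and the toy data are all made up for this example.

```python
import numpy as np


def kmeans(X, k, n_init=10, max_iter=100, seed=0):
    """Illustrative K-Means with random restarts.

    Returns (centroids, labels, inertia) for the most compact run.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        # Initialization: pick k distinct observations as starting centroids.
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        labels = None
        for _ in range(max_iter):
            # Assignment: squared Euclidean distance from each point to each centroid.
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            new_labels = dists.argmin(axis=1)
            if labels is not None and np.array_equal(new_labels, labels):
                break  # no observation changed cluster, so the run has converged
            labels = new_labels
            # Update: move each centroid to the mean of its members
            # (assumption: an empty cluster simply keeps its previous centroid).
            centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
        # Compactness: within-cluster sum of squared distances (lower is better).
        inertia = ((X - centroids[labels]) ** 2).sum()
        if best is None or inertia < best[2]:
            best = (centroids, labels, inertia)
    return best


# Toy data (hypothetical): three well-separated 2-D blobs of 50 points each.
data_rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + data_rng.normal(size=(50, 2)) for c in blob_centers])

# Scale every feature to zero mean and unit variance before clustering.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

centroids, labels, inertia = kmeans(X_scaled, k=3)
```

Because k is an input here, tuning it would amount to calling `kmeans` for several values of k and comparing the returned `inertia`, keeping in mind that inertia always tends to decrease as k grows.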
