Supervised learning and unsupervised learning are two of the main categories of machine learning algorithms. The primary differences between them are:
- the presence of labeled data for supervised methods vs the absence of labeled data for unsupervised methods
- the ease of objective evaluation and comparison of models in supervised learning as opposed to (somewhat) subjective evaluations in unsupervised learning
To explain further: in supervised learning, the algorithms are trained on labeled datasets with the goal of learning a mapping function that can predict the output for new, unseen data. In unsupervised learning, by contrast, the algorithms are trained on unlabeled datasets with the goal of finding patterns and relationships in the data.
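The contrast can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the points, the labels, and the helper names `predict_1nn` and `kmeans` are all made up for the example. The supervised function needs the labels to answer a query, while the clustering function only ever sees the points.

```python
import math

# Toy 2-D points (made up for illustration).
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.8), (4.9, 5.3)]
labels = ["a", "a", "a", "b", "b", "b"]  # used only in the supervised case


def predict_1nn(train_x, train_y, query):
    """Supervised: return the label of the closest labeled training point."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[nearest]


def kmeans(data, k, iterations=10):
    """Unsupervised: group points into k clusters without using any labels."""
    centroids = list(data[:k])  # naive initialisation: the first k points
    assignment = [0] * len(data)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in data
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(data, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment


supervised_prediction = predict_1nn(points, labels, (4.7, 5.0))
cluster_ids = kmeans(points, k=2)
print("predicted label:", supervised_prediction)
print("cluster ids:", cluster_ids)
```

Note that the cluster ids are arbitrary integers: the algorithm can tell that the first three points belong together, but it has no way of knowing that the group is called "a".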
Because supervised methods use labeled datasets, models can be evaluated and compared objectively by splitting the data into training and test sets and measuring predictions against the held-out labels. For unsupervised methods, however, the lack of a definitive output variable makes a truly objective model evaluation difficult.
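As a rough sketch of what such an objective comparison looks like, the snippet below holds out part of a toy labeled dataset and scores two models on it. The data, the two models, and the function names are assumptions chosen for illustration; in practice the split is usually randomized rather than taken at fixed positions.

```python
# Toy labeled data (made up): one feature x, label 1 when x >= 10.
data = [(x, int(x >= 10)) for x in range(20)]

# Hold out every 4th example as the test set; train on the rest.
test_set = data[::4]
train_set = [row for i, row in enumerate(data) if i % 4 != 0]


def nearest_neighbor_model(train):
    """Return a predictor that copies the label of the closest training x."""
    def predict(x):
        return min(train, key=lambda row: abs(row[0] - x))[1]
    return predict


def majority_model(train):
    """Baseline predictor that always returns the most common training label."""
    train_labels = [label for _, label in train]
    majority = max(set(train_labels), key=train_labels.count)
    return lambda x: majority


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


nn_accuracy = accuracy(nearest_neighbor_model(train_set), test_set)
baseline_accuracy = accuracy(majority_model(train_set), test_set)
print(f"1-NN: {nn_accuracy:.2f}, baseline: {baseline_accuracy:.2f}")
```

Both models are scored against the same held-out labels, so the comparison does not depend on anyone's judgment. An unsupervised model has no such labels to score against, which is where the subjectivity comes in.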
The following table summarizes the differences between Supervised and Unsupervised learning:
| | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Input Data | Labeled training data | Unlabeled training data (no output variable) |
| Objective | Learn a function to predict the output variable, which is either a continuous variable (Regression) or a categorical variable (Classification) | Find hidden patterns, relationships, and associations in the data: group similar data points together (Clustering), find outliers in the data (Anomaly Detection), reduce the number of features (Dimensionality Reduction) |
| Model Evaluations | Objective evaluation and model comparison methods | Lack of a definitive output variable makes it difficult to do truly definitive evaluations and model comparisons |
| Example Applications | Predict the selling price for a house; classify an image into a dog image or a cat image | Group similar news articles together; identify fraudulent financial transactions; visualize high-dimensional datasets |
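To make the regression case concrete, here is a minimal sketch of the house-price example using ordinary least squares on a single feature. The sizes and prices are invented toy numbers, and `fit_line` is a hypothetical helper written for this illustration; a real model would use many more features and examples.

```python
# Made-up toy data: house size in square metres, selling price in thousands.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)

# Predict the price of a house size the model has not seen.
predicted_price = slope * 100.0 + intercept
print(f"predicted price for 100 m^2: {predicted_price:.1f}")
```

The labeled prices are what let the model learn the mapping from size to price, and the same labels could later be used to measure how far its predictions are off.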
This image sourced from Recro.io succinctly describes the difference between supervised and unsupervised learning:
The following video from bigml.com provides a good overview of supervised and unsupervised learning methods and compares the two using various examples.