In unsupervised learning, algorithms are not given any labeled data; instead, the goal is to find patterns and relationships hidden in the input data. The algorithms learn such patterns by modeling the underlying structure and distribution of the data.
Unlike supervised learning, where a response variable y is present, unsupervised learning has no response variable y to predict. Thus, there is no function to learn that maps the feature space X to a target variable y. Instead, the goal is to automatically learn hidden patterns and intrinsic structures in the data.
Unsupervised learning can be broadly classified into three categories:
Clustering: groups similar data points together.
Example: Customer segmentation based on users’ purchase history, wages, and demographic information. The resulting clusters only indicate which customers belong together. They do not tell whether a particular group is a preferred group.
Anomaly Detection: also known as outlier detection, identifies items or events in a dataset that deviate significantly from the majority of the data.
Example: Fraud detection – detecting unusual patterns in financial transactions.
Dimensionality Reduction: reduces the number of features in a dataset while retaining as much information as possible.
Example: Visualizing high-dimensional data – If a dataset has more than 3 dimensions, it is difficult to visualize. Dimensionality reduction can bring the number of dimensions down to 2 while retaining as much of the underlying structure as possible, thereby allowing the data to be visualized.
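To make the three categories above more concrete, here is a minimal Python sketch that uses only the standard library. It is an illustration under stated assumptions, not a reference implementation: the function names (`kmeans`, `zscore_outliers`, `project_to_1d`), the toy data, the farthest-first initialization, and the z-score threshold of 2.5 are all choices made for this example. For brevity, the dimensionality reduction step projects onto a single principal component rather than the two dimensions mentioned above. Note that none of the functions receives a label y; each one works from the inputs alone.

```python
import math
import statistics


def kmeans(points, k, iterations=10):
    """Clustering: Lloyd's algorithm with a deterministic farthest-first start."""
    # Assumption for this sketch: start from the first point, then repeatedly
    # add the point that is farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return [nearest(p) for p in points], centroids


def zscore_outliers(values, threshold=2.5):
    """Anomaly detection: indices of values far from the mean, in std-dev units."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing deviates
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def project_to_1d(rows, iterations=100):
    """Dimensionality reduction: project rows onto the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly applying the covariance matrix pulls the
    # vector toward the direction of greatest variance. This sketch assumes
    # the data has nonzero variance and the start vector is not orthogonal
    # to that direction.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(a * b for a, b in zip(row, v)) for row in centered]


# Hypothetical toy data: two features per customer, forming two obvious groups.
customers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(customers, k=2)
print(labels)  # → [0, 0, 0, 1, 1, 1]

# Hypothetical transaction amounts with one unusually large value.
amounts = [20, 22, 19, 21, 23, 20, 18, 24, 21, 500]
print(zscore_outliers(amounts))  # → [9]

# Hypothetical 3-feature rows that vary along a single direction.
rows = [(1.0, 2.0, 0.0), (2.0, 4.0, 0.0), (3.0, 6.0, 0.0), (4.0, 8.0, 0.0)]
print(project_to_1d(rows))  # one coordinate per row instead of three
```

As in the customer segmentation example, the cluster labels 0 and 1 only say which points belong together; they carry no meaning about which group is preferable. In practice, a library such as scikit-learn would typically be used instead of hand-written routines like these.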
The following infographic from Booz Allen Hamilton provides a pictorial explanation of what unsupervised learning is:
In this video, Andrew Ng describes what unsupervised learning is and gives several real-life applications (Runtime: 9 mins)
Other recommended videos
- As a follow-up to the video recommended above, Andrew Ng details the different types of unsupervised learning methods, including clustering, anomaly detection, and dimensionality reduction: https://www.youtube.com/watch?v=u7Y_b04upmQ (Runtime: 3:40 mins)
- This video from AssemblyAI provides a short, self-contained introduction to unsupervised learning. The explanation includes examples of clustering, anomaly detection, and autoencoders (one of the techniques for dimensionality reduction): https://www.youtube.com/watch?v=yteYU_QpUxs (Runtime: 5:30 mins)