
AIML.com — Machine Learning Resources

What is Underfitting?


Related Questions:
What is Overfitting?
How to mitigate Underfitting?
What is the Bias-Variance Tradeoff?

Underfitting occurs when a machine learning model is too simple to capture the complexity of the underlying data. Essentially, the model fails to learn the underlying patterns and relationships in the data, and as a result it performs poorly on both the training data and the test data.

For an underfitted model, both training and test errors tend to be high, as shown in the image below. On a positive note, this makes underfitting easier to detect and mitigate, because training error is one of the first indicators model developers use to measure model performance.
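As an illustrative sketch (not part of the original article; the data and model choice are made up), the following NumPy snippet fits a straight line to data generated from a sine curve. The line is too simple for the curve, so both training and test error stay high — the signature of underfitting described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Nonlinear ground truth: y = sin(x) plus a little noise
x_train = np.linspace(0, 2 * np.pi, 80)
x_test = np.linspace(0, 2 * np.pi, 40)
y_train = np.sin(x_train) + rng.normal(0, 0.1, x_train.size)
y_test = np.sin(x_test) + rng.normal(0, 0.1, x_test.size)

# A straight line (degree-1 polynomial) is too simple for a sine wave
coeffs = np.polyfit(x_train, y_train, deg=1)
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

print(f"train MSE: {train_mse:.3f}, test MSE: {test_mse:.3f}")
```

Both errors come out far above the noise floor (noise variance here is 0.01), and they are close to each other — unlike overfitting, where training error is low while test error is high.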

Underfitting: High Training and Test Error (Source: Al-Behadili et al. Rule pruning techniques in the ant-miner classification algorithm and its variants: A review)

What causes Underfitting?

Underfitting can occur for several reasons, including:

  1. Insufficient Data: If the amount of data available for training the model is too small, the model may not be able to capture the complexity of the underlying patterns in the data.
  2. Over-regularization: Regularization is a technique used to prevent overfitting, but if the regularization parameter is set too high, it can lead to underfitting.
  3. Poor Feature Selection: If the model is not given access to enough relevant features, it may not be able to capture the underlying patterns in the data.
  4. Insufficient Model Complexity: If the model is too simple, it may not be able to capture the underlying complexity in the data.

How to mitigate underfitting?

The best way to avoid underfitting is to ensure that the model is complex enough to capture the underlying patterns in the data while also being regularized enough to avoid overfitting. This can be achieved by:

  1. Increasing the complexity of the model: Adding more layers or neurons (in a neural network), or choosing a more expressive model class, makes the model better able to capture the underlying patterns in the data.
  2. Reducing the regularization parameter: With less regularization, the model is less constrained and can fit the underlying patterns more closely.
  3. Adding more relevant features: With access to more informative features, the model has more signal to learn from.
  4. Collecting more data: With more training examples, the model can estimate the underlying patterns more reliably.
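As a small illustration of point 1 above (the data and degree choices are made up for this sketch), the snippet below fits polynomials of increasing degree to the same sine-shaped data. Training error drops sharply once the model is expressive enough for the curve:

```python
import numpy as np

rng = np.random.default_rng(2)

# Sine-shaped data: a straight line underfits, a higher-degree polynomial does not
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.normal(0, 0.1, x.size)

def fit_mse(deg):
    """Fit a polynomial of the given degree and return its training MSE."""
    coeffs = np.polyfit(x, y, deg)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

for deg in (1, 3, 5):
    print(f"degree {deg}: train MSE {fit_mse(deg):.3f}")
```

Note the flip side: pushing complexity far beyond what the data requires trades underfitting for overfitting, which is why point 1 is paired with appropriate regularization in the paragraph above.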

Visual Explanation

The following infographic showcases the decision boundaries of underfitted regression and classification models:

Decision boundaries for underfitted Regression and Classification models (Source: Wonseok Shin)

Video Explanation

In this brilliant and amusing video, Cassie Kozyrkov explains what underfitting is and, more importantly, why people don't worry about underfitting as much as they do about overfitting (Runtime: 2:15 min):

Why don’t model developers worry about underfitting as much as they do about overfitting?
  • For a comprehensive comparison of underfitting with overfitting, and what the underfitted regression and classification decision boundaries look like, please see this video from Andrew Ng (Runtime: 12 mins)
