

What does L2 regularization (Ridge) mean?


L2, or Ridge, regularization is a form of regularization in which the penalty is based on the squared magnitude of the coefficients. The L2 cost function is as follows, where, just as in LASSO, lambda is the parameter that controls the amount of regularization applied:

Cost = Σᵢ (yᵢ − ŷᵢ)² + λ Σⱼ βⱼ²
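As a concrete illustration, the Ridge cost can be sketched in a few lines of NumPy. The data, coefficients, and lambda values below are made up purely for the example:

```python
import numpy as np

def ridge_cost(X, y, beta, lam):
    """L2 (Ridge) cost: sum of squared residuals plus
    lambda times the squared magnitude of the coefficients."""
    residuals = y - X @ beta
    return np.sum(residuals ** 2) + lam * np.sum(beta ** 2)

# Toy data (hypothetical): three observations, two features.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([0.5, 0.5])

# With lam = 0 the cost is the plain sum of squared errors;
# a larger lambda adds a larger penalty for the same coefficients.
print(ridge_cost(X, y, beta, lam=0.0))  # → 0.5
print(ridge_cost(X, y, beta, lam=1.0))  # → 1.0 (adds 1.0 * (0.25 + 0.25))
```

Note that only the coefficients are penalized, never the residual term itself, which is why lambda trades off fit against coefficient size.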

The major difference between Ridge and LASSO is that Ridge never shrinks coefficients all the way to 0, so it has no built-in variable-selection capability. However, the least important predictors can end up with coefficients very close to 0. In both LASSO and Ridge regression, the smaller a coefficient's magnitude after shrinkage, the smaller that predictor's influence on the model's predictions.
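This difference is easy to see by fitting both models to the same synthetic data with scikit-learn. The data below is simulated (only the first two of five features actually matter), and the alpha values are arbitrary choices for the illustration:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features drive y; the other three are pure noise.
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# LASSO sets the noise features' coefficients exactly to 0;
# Ridge only shrinks them close to (but never exactly) 0.
print(np.round(ridge.coef_, 4))
print(np.round(lasso.coef_, 4))
```

Running this shows LASSO's built-in variable selection (exact zeros) versus Ridge's uniform shrinkage (small but nonzero coefficients for the noise features).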

An important data pre-processing step in regularized regression is to scale the features before fitting the model, using a technique such as standardization or min-max scaling. The need for feature scaling arises from the second (penalty) term in the loss function, which is where the shrinkage occurs. If a model contains one feature measured on a much larger scale than the others, regularization will not be able to sufficiently shrink its influence even if it is not an important predictor. To apply an equal magnitude of regularization to all of the features, they should first be converted to a similar scale.

