What is Regularization?

Regularization is a technique commonly used in machine learning that involves adding a penalty for complexity to the model objective function by introducing a small degree of bias with the goal of improving a model’s generalization performance. It is usually performed for the purpose of reducing overfitting in complex models. When regularization is applied to a regression problem, it is referred to as penalized regression, where the penalty has the effect of shrinking the magnitude of regression coefficients and thus producing a simpler model. In regularized regression, the objective is no longer only to minimize the sum of squared residuals but also with the constraint that the coefficients are as small as possible. If there are two models that perform equivalently well on a training dataset, regularization ensures the simpler model is chosen. While regularization often improves a model’s performance on unseen data, too much regularization can result in underfitting. Thus, it is usually necessary to tune the regularization parameter during the cross-validation process.