What is Feature Standardization (or Z-Score Normalization), and why is it needed?

Feature Standardization, also known as ‘Z-Score Normalization’, is a technique for pre-processing numerical raw data during the creation of your training data. 

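It consists of centering and scaling each Feature so that it has a mean of zero and a standard deviation of 1, i.e. the same mean and standard deviation as a Standard Normal Distribution. Note that standardization only shifts and rescales the values; it does not change the shape of a Feature's distribution, so non-Gaussian data stays non-Gaussian.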
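
As a minimal sketch of the centering and scaling described above, each value x is replaced by z = (x - mean) / standard deviation. The function name `standardize` and the house-size values are illustrative assumptions, and the population standard deviation is used here as one common convention:

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values, mu)  # population standard deviation (an assumed convention)
    if sigma == 0:
        # A constant feature has no spread to rescale; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical feature: house sizes in square feet (made-up values).
sizes = [850.0, 1200.0, 1500.0, 2100.0, 3000.0]
z_scores = standardize(sizes)

print(z_scores)
print(fmean(z_scores), pstdev(z_scores))  # mean close to 0, standard deviation close to 1
```

In a training pipeline, the mean and standard deviation would normally be computed on the training data only and then reused to transform validation and test data.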

Feature Standardization is: –

Essential for some algorithms (esp. those utilizing a Euclidean distance function): –

  • Support Vector Machines utilizing a Radial Basis Function (RBF) kernel, and RBF Neural Networks
  • Clustering Algorithms whose Features do not have comparable scales

Recommended for: –

  • Any Algorithm that assumes a Gaussian Distribution
  • Multi-Layer Perceptron Neural Networks (arguably beneficial)
  • Clustering Algorithms whose Features have roughly comparable scales

Less important for: –

  • Tree Based algorithms (high robustness to Un-standardized Data)
  • Logistic Regression (when fitted without regularization; regularized fits are sensitive to Feature scale)

You can expect that, within a Real World Dataset, the features will have different units, different ranges of values and different types of statistical distributions. Outliers may also be present. All of these can introduce modeling errors if not dealt with correctly.

Without feature standardization, predictive models may perform badly, because features with larger magnitudes and ranges can incorrectly overshadow those with smaller ones, even if the latter offer better predictive capability.
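
To illustrate how a large-scale feature can dominate a Euclidean distance, here is a small sketch using three made-up customers described by age (years) and annual income (dollars). The data and the helper name `standardize_columns` are assumptions for illustration only:

```python
from math import dist
from statistics import fmean, pstdev

# Hypothetical customers: (age in years, annual income in dollars).
customers = [(25.0, 50_000.0), (60.0, 51_000.0), (26.0, 51_500.0)]


def standardize_columns(rows):
    """Standardize each feature (column) to mean 0 and standard deviation 1."""
    scaled_columns = []
    for column in zip(*rows):
        mu = fmean(column)
        sigma = pstdev(column, mu)  # assumes no constant column (sigma > 0)
        scaled_columns.append([(x - mu) / sigma for x in column])
    return list(zip(*scaled_columns))


a, b, c = customers
raw_ab = dist(a, b)  # 35-year age gap, $1,000 income gap
raw_ac = dist(a, c)  # 1-year age gap, $1,500 income gap

sa, sb, sc = standardize_columns(customers)
std_ab = dist(sa, sb)
std_ac = dist(sa, sc)

print(raw_ab < raw_ac)  # raw distances are driven almost entirely by income
print(std_ac < std_ab)  # after standardization the age gap counts as well
```

On the raw values, the 35-year age gap between the first two customers is practically invisible next to the income differences, so the distance is little more than the income gap. After standardization both features contribute on a comparable scale, and in this made-up example the nearest neighbour of the first customer changes.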