When preparing our ‘Training Data’, two basic pre-processing techniques applicable to Numerical Features are ‘Centering’ and ‘Scaling’. These are usually applied together and may be necessary to transform raw numerical data into a format that is suitable for the algorithms of choice.
Centering our data means that we alter the position of its mean by adding a constant to, or subtracting a constant from, each data point, which shifts the whole distribution up or down without changing its shape. The objective, in Standardization, is to achieve a mean that is equal to zero, which is done by subtracting the Feature’s mean from every value. When the data is only ‘Centered’, its variance and the differences between data points remain the same, as does the unit; only the mean is altered.
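As a minimal sketch of ‘Centering’, the snippet below subtracts the mean from every value of a single Feature and leaves everything else alone. The function name `center` and the `heights_cm` values are made up for illustration, and only Python’s standard library is assumed.

```python
from statistics import fmean, pvariance


def center(values):
    """Subtract the mean from every value so the result has a mean of zero."""
    mean = fmean(values)
    return [v - mean for v in values]


# Hypothetical example Feature (illustrative values only).
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
centered = center(heights_cm)

print(centered)                                      # [-20.0, -10.0, 0.0, 10.0, 20.0]
print(fmean(centered))                               # 0.0
print(pvariance(heights_cm), pvariance(centered))    # 200.0 200.0
```

The variance is identical before and after, and the values are still in centimetres; only the mean has moved.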
Scaling our data means that it is transformed, typically by dividing by a constant, so that it has a specific spread or fits within a specific range. It is a technique that is useful to ensure that different Features can be compared without those that have a larger range overshadowing the others. It is common to scale Features, as in Standardization, so that they have a Standard Deviation of 1, which is done by dividing by the Feature’s Standard Deviation. Rescaling a Feature so that its min and max values lie at 0 and 1 (or at -1 and 1, a range sometimes chosen when negative values are present) is instead performed during ‘Min-Max Scaling’.
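The two kinds of ‘Scaling’ described above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names `standardize` and `min_max_scale`, the `new_min` / `new_max` parameters and the `incomes` values are all hypothetical, the population Standard Deviation is used, and the Feature is assumed not to be constant (otherwise the divisions would fail).

```python
from statistics import fmean, pstdev


def standardize(values):
    """Center to mean 0, then divide by the standard deviation (result has SD 1)."""
    mean = fmean(values)
    std = pstdev(values)  # population SD; assumes the values are not all equal
    return [(v - mean) / std for v in values]


def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly map the smallest value to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


# Hypothetical example Feature (illustrative values only).
incomes = [20.0, 30.0, 40.0, 50.0, 60.0]

z_scores = standardize(incomes)              # mean 0, Standard Deviation 1
unit_range = min_max_scale(incomes)          # [0.0, 0.25, 0.5, 0.75, 1.0]
signed_range = min_max_scale(incomes, -1.0, 1.0)  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

In practice these transformations are usually taken from a library; scikit-learn, for example, provides `StandardScaler` and `MinMaxScaler`, which also remember the statistics learned from the Training Data so that the same transformation can be applied to new data.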