What is the difference between Feature Engineering and Feature Selection?

Feature Engineering is the process of using domain knowledge to extract numerical representations from raw data. Broadly speaking, Feature Engineering has two steps: 1) Feature Identification, and 2) Feature Transformation.

  1. In Feature Identification, we use domain expertise to identify characteristics that we think might have a predictive effect on the outcome. For example, if the goal is to predict the selling price of a house, then lot size, number of bedrooms, year of construction, installed appliances, etc. can all affect the final price, and thus should be considered when building a Machine Learning model.
  2. Feature Transformation, on the other hand, takes these identified features and converts them into numerical representations that a Machine Learning model can use. For example, installed appliances is a categorical feature that can be represented using one-hot encoding, whereas lot size is already numerical and can be used as is. These transformations also often involve mathematical operations such as logarithms and square roots. Once all the numerical features are extracted, they are concatenated into a vector, which is a numerical representation of the raw data and can be fed to Machine Learning models.
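The transformation step above can be sketched in a few lines of pandas. This is a minimal illustration with made-up housing data; the column names and values are assumptions, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical raw housing data (illustrative values only).
df = pd.DataFrame({
    "lot_size": [5000, 12000, 7500],          # numerical feature
    "bedrooms": [2, 4, 3],                    # numerical feature
    "appliances": ["dishwasher", "none", "washer"],  # categorical feature
})

# One-hot encode the categorical feature.
one_hot = pd.get_dummies(df["appliances"], prefix="appl")

# Apply a mathematical transformation (logarithm) to a skewed
# numerical feature; keep 'bedrooms' as is.
log_lot = np.log(df[["lot_size"]]).rename(columns={"lot_size": "log_lot_size"})

# Concatenate everything into one feature matrix: one vector per house.
features = pd.concat([log_lot, df[["bedrooms"]], one_hot], axis=1)
X = features.to_numpy()
```

Each row of `X` is now the numerical vector representation of one house, ready for any standard Machine Learning model.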

Feature Selection is the process of identifying the subset of features that are most predictive of the outcome. Not every feature that a domain expert believes is predictive turns out to be truly predictive, and the effect of one feature is often masked by the presence of others. For example, a bigger lot size can be positively correlated with the number of bedrooms, and thus might mask the effect of the 'number of bedrooms' feature. There are several principled ways of identifying important features, including Mutual Information measures, ML models like Random Forest, and regularization methods like LASSO.
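Two of the selection methods mentioned above, Mutual Information and LASSO, can be sketched with scikit-learn. This is a hedged example on synthetic data (the dataset and hyperparameters are assumptions for illustration, not a prescription):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import Lasso

# Synthetic regression data: 10 candidate features, only 3 of which
# actually drive the outcome (an assumed setup for demonstration).
X, y = make_regression(n_samples=500, n_features=10, n_informative=3,
                       noise=1.0, random_state=0)

# Mutual Information: a higher score suggests a feature carries more
# information about the outcome.
mi_scores = mutual_info_regression(X, y, random_state=0)

# LASSO: L1 regularization shrinks the coefficients of weakly
# predictive features exactly to zero, performing selection implicitly.
lasso = Lasso(alpha=1.0).fit(X, y)
selected = np.flatnonzero(lasso.coef_ != 0)  # indices of surviving features
```

Features with near-zero mutual information, or with coefficients driven to zero by LASSO, are candidates for removal; the two methods will not always agree, which is why comparing several selection criteria is good practice.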