Classification is a form of supervised learning in which the objective is to classify observations into a discrete number of categories based on their input features. In general, classification is used when the outcome is categorical.
The most common classification scenario is when the target has only two classes, where one level represents a “success”, and the other a “failure”. Real world examples of such situations include predicting customer turnover, spam detection, or the presence of a disease. It is convention to label the occurrence of the event as the success, even if it would not be considered a success in regular life, such as in the case of a disease.
Classification can also be extended to an outcome with more than two levels, both if there is an inherent ordering present in the categories or not. Classification largely proceeds in the same manner as regression when it comes to feature engineering and data preprocessing. However, many of the evaluation metrics used in regression are no longer relevant, and the evaluation process and inference is reflective of the different setup