How Does Naive Bayes Work?

Naive Bayes applies Bayes' theorem, together with the assumption that the predictors are conditionally independent of one another given the class, to assign class labels to observations. It combines two quantities: the prior probability of each class and the conditional probability, or likelihood, of the observed feature values under that class's distribution. The prior probabilities are usually estimated from the class proportions in the training data. For example, if 80 observations fall within class A and 20 within class B, the prior probabilities of class A and B would be 0.8 and 0.2, respectively. The score for each class is calculated by multiplying the prior probability by the likelihood of each feature value under that class's distribution. This product is proportional to the posterior probability of the class given the observed features. The decision rule then assigns the observation to the class with the highest score.
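
The scoring and decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes each feature follows a normal (Gaussian) distribution within each class, which the text does not specify, and the function names (fit, gaussian_pdf, class_scores, predict) and the toy data are hypothetical. The toy data reuses the 80/20 class split from the example.

```python
import math
from collections import defaultdict


def fit(X, y):
    """Estimate each class's prior and per-feature (mean, variance)."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        # Prior = observed share of this class in the training data.
        prior = len(rows) / len(y)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant avoids a zero variance (an assumption of this sketch).
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution; the assumed per-class feature model."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def class_scores(model, x):
    """Prior times the product of per-feature likelihoods, for each class."""
    scores = {}
    for label, (prior, stats) in model.items():
        score = prior
        for value, (mean, var) in zip(x, stats):
            score *= gaussian_pdf(value, mean, var)
        scores[label] = score
    return scores


def predict(model, x):
    """Decision rule: pick the class with the highest score."""
    scores = class_scores(model, x)
    return max(scores, key=scores.get)


# Toy data: 80 observations in class A (values near 1.2), 20 in class B (near 5.2).
X = [[1.0 + 0.1 * (i % 5)] for i in range(80)] + [[5.0 + 0.1 * (i % 5)] for i in range(20)]
y = ["A"] * 80 + ["B"] * 20

model = fit(X, y)
label_near_a = predict(model, [1.2])
label_near_b = predict(model, [5.1])
```

Multiplying many small likelihoods can underflow when there are many features, so practical implementations usually sum log-probabilities instead; the direct product is kept here only because it mirrors the description in the text.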