Pros:
- Computational efficiency
- Less prone to overfitting than more flexible classifiers, since it estimates relatively few parameters
- If the assumption of independence between features holds, the algorithm often outperforms other classification techniques
- Highly suitable when all features are categorical, such as in text classification
Cons:
- The independence assumption is unrealistic for many data sets; when it is violated, the algorithm suffers from high bias
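
The trade-off in the list above can be illustrated with a small sketch. It assumes the algorithm under discussion is a naive Bayes classifier over categorical features; the function names (`train`, `predict_proba`), the Laplace smoothing parameter `alpha`, and the toy spam/ham data are all hypothetical choices made for illustration. Training is a single counting pass, which is where the efficiency comes from. The second half copies the same feature three times, a deliberate violation of the independence assumption, to show how the model's probability estimate becomes distorted.

```python
from collections import Counter, defaultdict
import math


def train(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with one counting pass over the data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab, alpha


def predict_proba(model, row):
    """Return P(label | row), treating features as independent given the label."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for j, value in enumerate(row):
            # Laplace-smoothed estimate of P(value | label)
            num = value_counts[(j, label)][value] + alpha
            den = n + alpha * len(vocab[j])
            score += math.log(num / den)
        log_scores[label] = score
    # Normalise in log space for numerical stability
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp_scores.values())
    return {k: v / z for k, v in exp_scores.items()}


# Toy data (made up): one feature, "does the message contain a given word?"
words = ["yes", "yes", "yes", "no", "yes", "no", "no", "no"]
labels = ["spam"] * 4 + ["ham"] * 4

# One feature: no independence violation is possible.
single = predict_proba(train([(w,) for w in words], labels), ("yes",))

# The same feature copied three times: perfectly correlated features,
# yet the model multiplies the evidence as if each copy were new information.
tripled = predict_proba(train([(w, w, w) for w in words], labels),
                        ("yes", "yes", "yes"))

print(round(single["spam"], 3))   # 0.667
print(round(tripled["spam"], 3))  # 0.889
```

The duplicated columns add no information, yet the estimated probability of spam rises from about 0.667 to about 0.889. This is one concrete way a violated independence assumption shows up: correlated evidence is counted more than once, so the estimates are systematically off no matter how much data is available.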