Logistic regression is a statistical technique used to model the relationship between a categorical dependent variable (most often binary, i.e. dichotomous) and one or more independent variables. The major assumptions of logistic regression are summarized in the following table:
| No. | Assumption | Description |
|---|---|---|
| 1 | Binary Outcome | In binary logistic regression, the dependent variable must take only two possible values, such as 0 or 1, or Yes or No. |
| 2 | Linearity | There should be a linear relationship between each continuous independent variable and the log-odds of the dependent variable. |
| 3 | Independence of Observations | Each observation should be independent of all other observations. |
| 4 | No Multicollinearity | The independent variables should not be highly correlated with each other. High multicollinearity can lead to unstable and unreliable coefficient estimates. |
| 5 | Large Sample Size | Logistic regression needs a large sample to produce reliable estimates. A common rule of thumb is at least 10-15 cases of the less frequent outcome for each independent variable in the model. |
| 6 | No Outliers | Outliers can have a significant impact on the coefficient estimates in logistic regression. Therefore, it is important to check for outliers and address them if necessary. |

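
Two of the assumptions above (sample size and multicollinearity) can be screened with a few lines of code before fitting a model. The sketch below is only an illustration under stated assumptions: the data are hypothetical toy values, the helper names `events_per_variable` and `high_correlations` are made up for this example, the sample-size rule is counted against the rarer outcome class (the stricter reading of the rule of thumb), and the 0.8 correlation cutoff is an arbitrary choice. Pairwise correlation is also a cruder screen than a variance inflation factor, so treat the result as a first look rather than a formal test.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, zs):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, mz = sum(xs) / len(xs), sum(zs) / len(zs)
    cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    spread = sqrt(sum((x - mx) ** 2 for x in xs) * sum((z - mz) ** 2 for z in zs))
    return cov / spread


def events_per_variable(y, n_predictors):
    """Count of the rarer outcome class divided by the number of predictors."""
    ones = sum(y)
    rarer = min(ones, len(y) - ones)
    return rarer / n_predictors


def high_correlations(columns, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    flagged = []
    for (name_a, xs), (name_b, zs) in combinations(columns.items(), 2):
        r = pearson(xs, zs)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical toy data: a 0/1 outcome and three predictors.
y = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predictors = {
    "age":    [23, 31, 45, 28, 52, 60, 35, 48, 26, 30, 55, 33],
    "income": [25, 33, 47, 29, 55, 61, 36, 50, 27, 31, 58, 34],  # closely tracks age
    "visits": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8],
}

epv = events_per_variable(y, len(predictors))
flagged = high_correlations(predictors)
print(f"events per variable: {epv:.2f}")
print("highly correlated pairs:", flagged)
```

With this toy data the events-per-variable figure falls far below 10, and `age` and `income` are flagged as a highly correlated pair, so both the sample-size and the multicollinearity assumptions would need attention before trusting the coefficients.
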
Overall, logistic regression is a powerful tool for analyzing binary outcomes, but it is important to carefully consider the assumptions and limitations of the model when interpreting the results.
The following two videos are recommended for understanding the assumptions of logistic regression:
1. [Recommended] In this video from Learn2Stats, Prof. Ryan explains the assumptions of logistic regression in detail (Runtime: 3:47 mins).
2. [Recommended] Once you understand the assumptions of logistic regression, you can follow this video from Hannah at the University of Liverpool to learn how to test for them (Runtime: 6:30 mins): https://www.youtube.com/watch?v=jILEwqg2p3k
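
As a complement to the videos, the linearity assumption can be eyeballed by binning a continuous predictor and checking whether the empirical log-odds of the outcome change roughly linearly from bin to bin. The following is a minimal sketch, not the procedure shown in either video: the function name `empirical_logits`, the toy data, the choice of three bins, and the 0.5 continuity correction are all assumptions made for illustration.

```python
from math import log


def empirical_logits(x, y, n_bins=3):
    """Sort by x, split into equal-sized bins, and return (mean x, log-odds) per bin."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    result = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any leftover observations.
        end = len(pairs) if b == n_bins - 1 else start + size
        chunk = pairs[start:end]
        events = sum(label for _, label in chunk)
        # Adding 0.5 to both counts keeps the logit finite when a bin is all 0s or all 1s.
        logit = log((events + 0.5) / (len(chunk) - events + 0.5))
        mean_x = sum(value for value, _ in chunk) / len(chunk)
        result.append((mean_x, logit))
    return result


# Hypothetical toy data: one continuous predictor and a 0/1 outcome.
x = [21, 24, 27, 30, 35, 38, 41, 44, 50, 53, 56, 59]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1]

bins = empirical_logits(x, y)
for mean_x, logit in bins:
    print(f"mean x = {mean_x:5.1f}   empirical log-odds = {logit:+.3f}")
```

If the printed log-odds rise or fall by roughly equal steps as the bin means increase, the linearity assumption looks plausible for that predictor; a clear curve or reversal suggests a transformation or an additional term may be needed. With real data, more bins and many more observations per bin would be required for the pattern to mean anything.
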