What is Logistic Regression?

Logistic regression is a special case of the generalized linear model (GLM), used when the response variable is binary. Despite having “regression” in its name, logistic regression is usually used for classification problems, where the goal is to assign each observation to its appropriate class (usually framed as success or failure) based on one or more independent variables. Logistic regression relates a binary response to a set of predictors through a logit, or log-odds, transformation in order to model the probability of the outcome occurring for each observation. The model formulation is as follows, where $p_i$ is the probability of success for observation $i$:

$$\operatorname{logit}(p_i) = \ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_k X_{ik}$$

In the logistic regression equation above, the random component comes from the fact that each $y_i$ is assumed to be drawn from a Bernoulli distribution parameterized by $p_i$, the probability of success for that observation. The systematic component is everything on the right side of the equation, which is the linear combination of β’s and X’s. The link function is the logit link, or the log-odds of success, $\ln(p_i / (1 - p_i))$, which maps a probability in (0, 1) onto the whole real line.
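
To make the three components concrete, here is a minimal Python sketch using only the standard library. The coefficient values, the single observation, and the helper names (`logit`, `inverse_logit`, `linear_predictor`) are illustrative assumptions, not taken from any fitted model or library API. It is a sketch of the idea, not a numerically hardened implementation: `inverse_logit` can overflow for very large negative inputs.

```python
import math
import random


def logit(p):
    """Link function: log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Inverse link: map a linear predictor eta back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_predictor(beta, x):
    """Systematic component: beta[0] + beta[1]*x[0] + beta[2]*x[1] + ..."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))


# Hypothetical coefficients (intercept first) and one observation
# with two predictors; these numbers are made up for illustration.
beta = [-1.0, 0.8, 0.5]
x_i = [2.0, -1.0]

eta_i = linear_predictor(beta, x_i)  # systematic component
p_i = inverse_logit(eta_i)           # probability of success

# Random component: y_i is a single Bernoulli draw with success probability p_i.
rng = random.Random(0)
y_i = 1 if rng.random() < p_i else 0

print(f"linear predictor = {eta_i:.3f}")
print(f"P(success)       = {p_i:.3f}")
print(f"simulated y_i    = {y_i}")
```

Applying `logit` to `p_i` recovers `eta_i` (up to floating-point error), which is exactly the statement that the logit link connects the mean of the Bernoulli response to the linear combination of predictors.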