Machine Learning Resources

What problems would arise from using a regular linear regression to model a binary outcome?

  • Predicted values would not be constrained to the range of [0,1], resulting in predictions that are not valid probabilities. 
  • The outcome only takes on values of 0 and 1, and thus it is not a continuous, normal distribution. Thus, a regression line would seek to fit something that connects the observations with a label of 0 to those having a label of 1, which is a poor representation for such a setup.
  • Since the theoretical variance of a Bernoulli distribution is p(1-p), it is a function of the mean, and thus it does not exhibit constant variance (one of the assumptions for linear regression model to hold true)

Find out all the ways
that you can