# Linear Regression

### What problems would arise from using a regular linear regression to model a binary outcome?

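Predicted values would not be constrained to the range [0, 1], so they could not be interpreted as valid probabilities. The errors would also violate the usual assumptions: with a binary outcome they cannot be normally distributed, and their variance depends on the predicted probability, so it is not constant.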
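As a minimal sketch of the range problem, the snippet below fits ordinary least squares to a small made-up binary dataset using NumPy. The data values and variable names are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical data: a binary outcome that switches from 0 to 1 as x grows.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

# Design matrix with an intercept column, fitted by ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict at the smallest and largest observed x, and at a more extreme x.
x_new = np.array([-3.0, 4.0, 10.0])
preds = np.column_stack([np.ones_like(x_new), x_new]) @ beta
print(preds)
```

The fitted line drops below 0 at the low end and rises above 1 at the high end, so its outputs cannot be read as probabilities.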

### What is non-negative least squares, and when is it used?

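Non-Negative Least Squares (NNLS) adds a constraint to the least squares problem: all coefficient estimates must be greater than or equal to zero. It is used when negative coefficients would have no meaningful interpretation, for example when the coefficients represent physical quantities such as concentrations or mixing proportions.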
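A small sketch comparing unconstrained least squares with NNLS, assuming SciPy is available and using its `scipy.optimize.nnls` solver. The matrix `A` and vector `b` are made-up values picked so that the unconstrained solution has a negative coefficient.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical design matrix and targets.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([2.0, -1.0, 1.0])

# Unconstrained least squares is free to return negative coefficients.
coef_ols, *_ = np.linalg.lstsq(A, b, rcond=None)

# NNLS solves the same problem subject to every coefficient being >= 0.
coef_nnls, residual_norm = nnls(A, b)

print("OLS: ", coef_ols)
print("NNLS:", coef_nnls)
```

Here the constraint pins the second coefficient at zero, and the first is re-estimated with that restriction in place.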

### What are potential problems encountered in Linear Regression?

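If any of the assumptions of linear regression are violated, the model may not be reliable for either inference or prediction. The usual assumptions are a linear relationship between the predictors and the response, independent errors, constant error variance (homoscedasticity), normally distributed errors, and no perfect multicollinearity among the predictors.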

### What is a high influence point?

High influence points are observations that have a disproportionate effect on the fitted regression equation: removing one of them noticeably changes the coefficient estimates.
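One way to see influence directly is to refit the model with each observation left out and measure how much the fit moves. The sketch below does this for the slope of a simple regression. The data and the helper `fit_line` are hypothetical, and in practice summary measures such as Cook's distance are often used instead of explicit refitting.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope) from an ordinary least squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Five points lying exactly on y = x, plus one point that is far from
# both the other x values and the trend (hypothetical data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 15.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 0.0])

full_slope = fit_line(x, y)[1]

# Refit with each observation left out and record how far the slope moves.
slope_shift = np.array([
    abs(fit_line(np.delete(x, i), np.delete(y, i))[1] - full_slope)
    for i in range(len(x))
])
most_influential = int(np.argmax(slope_shift))
print(full_slope, slope_shift.round(3), most_influential)
```

Dropping the last observation changes the slope far more than dropping any other point, which is what marks it as highly influential.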

### What is a high leverage point?

A high leverage point is an observation whose predictor values are extreme relative to the rest of the data in the feature space.
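Leverage is commonly quantified by the diagonal of the hat matrix, H = X(X'X)^{-1}X'. The sketch below computes it with NumPy for made-up data. The cutoff of twice the average leverage is a rule of thumb assumed here for illustration, not a formal test.

```python
import numpy as np

# One predictor; the last observation sits far from the rest in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 15.0])
X = np.column_stack([np.ones_like(x), x])

# Leverage values are the diagonal entries of H = X (X'X)^{-1} X'.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

# Rule of thumb (an assumption): flag points above twice the mean leverage p / n.
n, p = X.shape
flagged = np.where(leverage > 2 * p / n)[0]
print(leverage.round(3), flagged)
```

Leverage depends only on the predictor values, so a point can have high leverage whether or not its response follows the trend of the other points.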

### What is an outlier?

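Outlier is a general term for an observation that is far away from most other data points. In regression, it usually refers to an observation whose response value is far from the value the model predicts, i.e. one with a large residual.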
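In a regression setting, one common way to make "far away" concrete is to scale each residual by its estimated standard deviation and look for large values. The following is a minimal sketch on made-up data. The choice of internally studentized residuals is an assumption for illustration.

```python
import numpy as np

# Points exactly on the line y = 2x + 1, with one response shifted off it.
x = np.arange(1.0, 11.0)
y = 2.0 * x + 1.0
y[4] += 10.0  # hypothetical outlier in the response

X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

# Internally studentized residuals: each residual divided by its
# estimated standard deviation, sigma * sqrt(1 - leverage).
n, p = X.shape
leverage = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
sigma = np.sqrt(residuals @ residuals / (n - p))
studentized = residuals / (sigma * np.sqrt(1.0 - leverage))

outlier_index = int(np.argmax(np.abs(studentized)))
print(studentized.round(2), outlier_index)
```

The shifted observation has by far the largest studentized residual, even though its x value is unremarkable.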

### What is the difference between outliers, high leverage points, and high influence points?

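Outlier is a general term for an observation that is far away from most other data points. In regression, it typically means an unusual response value given the predictors (a large residual). A high leverage point is extreme in its predictor values, regardless of its response. A high influence point is one whose removal substantially changes the fitted model. An observation is most likely to be influential when it has both high leverage and a large residual.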

### What is the difference between Regression and ANOVA?

ANOVA is a special case of regression in which all of the independent variables are categorical.
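The equivalence can be checked numerically. The sketch below computes the one-way ANOVA F statistic from group sums of squares, then computes the overall F statistic from a regression on dummy variables. The group data and names are made up for illustration.

```python
import numpy as np

# Hypothetical responses for three levels of a single categorical factor.
groups = {
    "a": np.array([4.0, 5.0, 6.0]),
    "b": np.array([7.0, 8.0, 9.0]),
    "c": np.array([1.0, 2.0, 3.0]),
}
y = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])
n, k = len(y), len(groups)

# Classical one-way ANOVA: between-group vs. within-group mean squares.
grand_mean = y.mean()
ss_between = sum(len(v) * (v.mean() - grand_mean) ** 2 for v in groups.values())
ss_within = sum(((v - v.mean()) ** 2).sum() for v in groups.values())
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# Regression: intercept plus dummy columns for levels "b" and "c".
X = np.column_stack([np.ones(n), labels == "b", labels == "c"]).astype(float)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = ((y - X @ coef) ** 2).sum()
tss = ((y - grand_mean) ** 2).sum()
f_regression = ((tss - rss) / (k - 1)) / (rss / (n - k))

print(f_anova, f_regression)
```

The two F statistics agree up to floating-point error, because the dummy-variable regression fits each group by its mean, which is exactly the ANOVA model.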

### Why does multicollinearity result in poor estimates of coefficients in linear regression?

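In matrix form, the vector of coefficient estimates is $\hat{\beta} = (X'X)^{-1}X'Y$, where $X$ is the design matrix whose rows correspond to observations and whose columns correspond to features, and $Y$ is the vector of target values. When features are highly correlated, the columns of $X$ are nearly linearly dependent, so $X'X$ is close to singular and its inverse is numerically unstable. Small changes in the data then produce large changes in the coefficient estimates, and the variances of those estimates become inflated.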
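A rough numerical illustration, assuming synthetic Gaussian features: under the standard OLS assumptions the covariance of the coefficient estimates is proportional to the inverse of X'X, so the diagonal of that inverse shows how much each coefficient's variance is inflated. The helper `xtx_diagnostics` and the noise scale are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2_independent = rng.normal(size=n)
# Nearly a copy of x1: the two columns are almost linearly dependent.
x2_collinear = x1 + 1e-3 * rng.normal(size=n)

def xtx_diagnostics(a, b):
    """Return the condition number of X'X and the diagonal of its inverse."""
    X = np.column_stack([np.ones(len(a)), a, b])
    XtX = X.T @ X
    return np.linalg.cond(XtX), np.diag(np.linalg.inv(XtX))

cond_indep, var_indep = xtx_diagnostics(x1, x2_independent)
cond_coll, var_coll = xtx_diagnostics(x1, x2_collinear)

print("condition number:", cond_indep, "vs", cond_coll)
print("variance factor for x1:", var_indep[1], "vs", var_coll[1])
```

With the near-duplicate feature, both the condition number and the variance factor for `x1` grow by several orders of magnitude, which is the instability described above.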