Briefly discuss other models that fall within the scope of GLM.
Gamma Regression: The gamma distribution is used to model strictly positive data that has an inherent right skew, such as income. It also describes waiting times in a Poisson process: the time until the k-th event follows a Gamma distribution with shape k. A special case of the Gamma distribution is the Exponential, which models the time until the first occurrence of an event, or equivalently the time between consecutive events.
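The link between Poisson events, Exponential gaps, and Gamma waiting times can be checked with a small simulation. This is only an illustrative sketch using Python's standard library; the function name `waiting_time_to_kth_event` and the values of `k`, `rate`, and `n` are assumptions chosen for the example, not part of any particular GLM workflow.

```python
import random
import statistics

def waiting_time_to_kth_event(k, rate, rng):
    """Time until the k-th event of a Poisson process with the given rate.

    Built by summing k independent exponential gaps between events.
    """
    return sum(rng.expovariate(rate) for _ in range(k))

rng = random.Random(0)
k, rate, n = 3, 2.0, 20_000  # hypothetical values for illustration

# Waiting times assembled from exponential gaps between events.
from_gaps = [waiting_time_to_kth_event(k, rate, rng) for _ in range(n)]

# Direct draws from a Gamma with shape k and scale 1/rate, for comparison.
from_gamma = [rng.gammavariate(k, 1.0 / rate) for _ in range(n)]

theory_mean = k / rate        # mean of Gamma(shape=k, scale=1/rate)
theory_var = k / rate ** 2    # variance of the same distribution

print(statistics.fmean(from_gaps), statistics.fmean(from_gamma), theory_mean)
print(statistics.variance(from_gaps), statistics.variance(from_gamma), theory_var)
```

Both samples should show roughly the same mean and variance as the theoretical Gamma values. Setting `k = 1` reduces the comparison to the Exponential special case.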
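Beta Regression: The beta distribution is used to model proportion data, as its support is the open interval (0, 1). Unlike binary classification, where the labels take only the values 0 or 1, beta regression applies to continuous data that falls strictly between 0 and 1; observations exactly equal to 0 or 1 must be adjusted or handled by a separate model component. It is often useful in modeling rates, such as a win or hit ratio. Strictly speaking, the beta distribution is not a member of the exponential dispersion family that defines classical GLMs, but beta regression uses the same structure of a linear predictor and a link function.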
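To make the model concrete, here is a minimal sketch of the likelihood a beta regression would optimize. It assumes the common mean-precision parameterization (shape parameters a = mu * phi and b = (1 - mu) * phi) with a logit link; the helper names, the single-feature setup, and the toy data are all hypothetical and chosen only for illustration.

```python
import math

def inv_logit(eta):
    """Map a linear predictor on the real line to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_shapes(mu, phi):
    """Convert a mean mu in (0, 1) and precision phi > 0 to shapes (a, b)."""
    return mu * phi, (1.0 - mu) * phi

def beta_log_density(y, mu, phi):
    """Log density of Beta(a, b) at y, defined only for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("the beta density is only defined for 0 < y < 1")
    a, b = beta_shapes(mu, phi)
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log1p(-y))

def negative_log_likelihood(intercept, slope, xs, ys, phi):
    """Objective that a one-feature beta regression would minimize."""
    return -sum(beta_log_density(y, inv_logit(intercept + slope * x), phi)
                for x, y in zip(xs, ys))

# Hypothetical toy data: hit ratios that rise with a single feature x.
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [0.20, 0.45, 0.70, 0.90]

nll_good = negative_log_likelihood(0.0, 1.0, xs, ys, phi=10.0)
nll_bad = negative_log_likelihood(0.0, -1.0, xs, ys, phi=10.0)  # wrong-sign slope
print(nll_good, nll_bad)
```

The slope with the correct sign yields the lower negative log-likelihood. A full fit would pass this objective to a numerical optimizer and usually estimate `phi` as well. The `ValueError` reflects that exact 0s and 1s fall outside the beta distribution's support.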
Tweedie Regression: For power parameters between 1 and 2, the Tweedie distribution is a compound Poisson-Gamma distribution: it has a point mass at exactly 0 and a continuous, right-skewed density over positive values. Analogous to Zero-Inflated Poisson regression in the discrete case, the Tweedie can be used for continuous data that contains many exact zeros. A common use case is modeling the pure premium of insurance claims, or total claim amount per exposure, which combines the frequency of claims (count data with many 0’s) and the amount per claim (continuous, right-skewed data). One approach is to model the frequency of claims with a Poisson-like model and the amount per claim with a Gamma-like model separately, then multiply the two predictions to obtain the pure premium. The Tweedie distribution handles such cases with a single model and removes the need to model the individual components separately. When performing a Tweedie regression, the user must specify a power parameter that determines the underlying target distribution (0 gives the Normal, 1 the Poisson, 2 the Gamma, and values between 1 and 2 the compound Poisson-Gamma); this parameter can be tuned using cross-validation.
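The pure premium setting can be illustrated by simulating a Poisson number of Gamma-sized claims per exposure. This is a hedged sketch using only the standard library: the helper names and the frequency and severity parameters (`lam`, `shape`, `scale`) are hypothetical, and the Poisson sampler simply counts exponential gaps within one time unit.

```python
import math
import random
import statistics

def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count by counting exponential gaps in one time unit."""
    count, elapsed = 0, rng.expovariate(lam)
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(lam)
    return count

def sample_pure_premium(lam, shape, scale, rng):
    """Total claim amount for one exposure: a Poisson number of Gamma claims."""
    n_claims = sample_poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))

rng = random.Random(0)
lam, shape, scale = 0.3, 2.0, 500.0  # hypothetical frequency and severity
totals = [sample_pure_premium(lam, shape, scale, rng) for _ in range(50_000)]

zero_share = sum(t == 0 for t in totals) / len(totals)
mean_total = statistics.fmean(totals)

# Tweedie power implied by a compound Poisson-Gamma with this Gamma shape.
implied_power = (shape + 2.0) / (shape + 1.0)

print(zero_share, math.exp(-lam))        # share of exact zeros vs. P(no claims)
print(mean_total, lam * shape * scale)   # simulated vs. theoretical pure premium
print(implied_power)                     # lies strictly between 1 and 2
```

The simulated data has a large share of exact zeros plus a right-skewed positive part, which is the shape a Tweedie model with power between 1 and 2 is meant to capture. In practice a library estimator would be fit instead of simulating, for example scikit-learn's `TweedieRegressor`, which takes a `power` argument.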