A linear model gets its name from the requirement that the target be expressed as a linear combination of the coefficients; it places no such restriction on the predictors themselves. Many non-linear relationships can therefore be converted into linear ones through logarithmic and power transformations.
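As a minimal sketch of this idea, the hypothetical power-law relationship y = 2x^1.5 below becomes linear after taking logarithms of both sides, so an ordinary straight-line fit recovers both the exponent and the coefficient (the data and constants are illustrative, not from the text):

```python
import numpy as np

# Hypothetical power-law data: y = 2 * x**1.5 (a nonlinear relationship)
x = np.linspace(1, 10, 50)
y = 2 * x**1.5

# Taking logs of both sides linearizes it: log(y) = log(2) + 1.5 * log(x),
# so a degree-1 polynomial fit in log space is an ordinary linear regression
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)

print(round(slope, 3))              # recovered exponent
print(round(np.exp(intercept), 3))  # recovered coefficient
```

Because the transformed relationship is exactly linear here, the fit recovers the exponent 1.5 and coefficient 2.0; with real, noisy data the estimates would only approximate them.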
When power transformations are applied to predictor variables, such as taking the square, the result is referred to as polynomial regression. When creating polynomial, or higher-order, terms, it is conventional to leave the original first-order term for that predictor in the model. With a nonlinear relationship, the first-order term may not show up as significant; if a higher-order term is, the variable still carries significance as a predictor. However, it is important to refrain from adding unnecessary complexity in the form of higher-order terms, as doing so makes interpretation more difficult and introduces the risk of overfitting. Finally, alternative algorithms such as decision tree ensembles and deep learning are designed and optimized to capture complex, nonlinear relationships in the data.
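The convention of keeping the first-order term alongside a squared term can be sketched with a least-squares fit on hypothetical quadratic data (the coefficients 1.0, 0.2, and 2.0 below are invented for illustration; note the weak first-order effect):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
# Hypothetical quadratic relationship with noise; the first-order
# coefficient (0.2) is deliberately small relative to the squared term
y = 1.0 + 0.2 * x + 2.0 * x**2 + rng.normal(scale=0.5, size=x.size)

# Design matrix keeps the intercept and first-order term alongside the
# squared term, following the convention described above
X = np.column_stack([np.ones_like(x), x, x**2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

intercept, b1, b2 = coefs
print(intercept, b1, b2)  # estimates near 1.0, 0.2, and 2.0
```

Dropping the `x` column from the design matrix would force the fitted curve's turning point to the origin, which is why the first-order term is retained even when its own coefficient tests as insignificant.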