Suppose the null hypothesis (H_{0}) is:

β_{1} = β_{2} = … = β_{p} = 0

And the alternative hypothesis (H_{a}) is:

At least one β_{i} is not equal to 0

Best approach: To find out whether any of the p predictors are helpful in predicting y, use the F-statistic. (This approach works when p < n. For p > n, the least squares model cannot be fit, so other high-dimensional methods are needed.)
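As a rough illustration, the overall F-statistic can be computed directly from the sums of squares. The sketch below assumes the standard definition F = ((TSS − RSS)/p) / (RSS/(n − p − 1)), where TSS is the total sum of squares and RSS is the residual sum of squares of the least squares fit. The function name `f_statistic`, the simulated data, the seed and the sizes n = 500, p = 20 are all illustrative choices, not part of the original text.

```python
import numpy as np

def f_statistic(X, y):
    """Overall F-statistic for H0: all slope coefficients are zero.

    X is an (n, p) array of predictors without an intercept column,
    y is a length-n response vector.
    """
    n, p = X.shape
    if n <= p + 1:
        raise ValueError("need n > p + 1 for the F-statistic to be defined")
    # Add an intercept column and fit by ordinary least squares.
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    rss = float(residuals @ residuals)        # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())  # total sum of squares
    return ((tss - rss) / p) / (rss / (n - p - 1))

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
n, p = 500, 20                  # illustrative sizes with p < n
X = rng.normal(size=(n, p))
y_null = rng.normal(size=n)                    # unrelated to X, so H0 is true
y_signal = 2.0 * X[:, 0] + rng.normal(size=n)  # the first predictor matters

print("F when H0 is true:", f_statistic(X, y_null))
print("F with one real predictor:", f_statistic(X, y_signal))
```

When H_{0} is true the F-statistic is expected to be close to 1, while a real relationship pushes it well above 1. Turning the value into a p-value requires the F distribution with p and n − p − 1 degrees of freedom, which is left out here to keep the sketch dependent on NumPy only.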

*Side note: individual t-statistics can be misleading in this scenario*

Suppose p is large, say p = 200, and none of the variables (x_{1}, …, x_{p}) is predictive of the response variable y (i.e. the null hypothesis above is true). Even so, about 5% of the p-values associated with the individual variables will fall below 0.05 purely by chance. These variables with low p-values have no real predictive power. Therefore, if we rely on the individual t-statistics and p-values to decide whether the variables have predictive power, we are very likely to draw the wrong conclusion.
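The arithmetic behind this can be checked in a few lines. The sketch below uses the values from the text (p = 200, threshold 0.05) and makes one simplifying assumption that is not in the original: the 200 individual tests are treated as independent, which real predictors rarely are exactly.

```python
alpha = 0.05  # significance threshold from the text
p = 200       # number of predictors from the text

# Expected number of predictors flagged by chance when H0 is true.
expected_false_positives = alpha * p

# Chance that at least one p-value falls below alpha,
# ASSUMING the p individual tests are independent.
prob_at_least_one = 1 - (1 - alpha) ** p

print("expected false positives:", expected_false_positives)
print("P(at least one false positive):", prob_at_least_one)
```

Under that assumption about 10 predictors are expected to look significant, and the chance of seeing at least one spurious result is above 99.99%, so finding a few small p-values among 200 says very little on its own.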

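Because the F-statistic adjusts for the number of predictors, it does not suffer from this problem: if H_{0} is true, there is only a 5% chance that the F-statistic yields a p-value below 0.05, no matter how many predictors there are.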