**Multiple Linear Regression - Model Testing**

The following tests should be carried out:

- Tests of parameter significance for all parameters *β_{0}, β_{1}, β_{2}, ..., β_{k}*. These tests can also be accomplished by finding the corresponding confidence intervals.
- Test of significance of the overall regression.
- Test of the *Ordinary Least Squares (OLS) assumptions*.

**Parameter Tests**

All parameter tests are *t-tests* of the same form as in simple
linear regression. With *k+1* parameters there are *k+1* *t-tests*
to be conducted, one for each parameter *β_{j}*, *j = 0, 1, ..., k*:

- Hypotheses
  *H_{0}: β_{j} = 0* versus *H_{1}: β_{j} ≠ 0*
- Assume the level of significance *α*
- Evaluate the test statistic
  *T_{calc} = b_{j} / s(b_{j})*
- Determine the critical region (rejection region)
  (Degrees of freedom for the *t*-distribution, *ν = n-k-1*, and critical region *T_{calc} < -t_{α/2, ν}* or *T_{calc} > +t_{α/2, ν}*)
- Conclusion:
  If *T_{calc}* falls into the critical region, then reject *H_{0}*, and conclude that *β_{j}* is significantly different from zero (i.e. significant). Otherwise accept *H_{0}*, and conclude that *β_{j}* is not significantly different from zero (i.e. not significant).
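As a numerical sketch of these steps, the example below fits a two-variable model to simulated data (the coefficients, sample size, and seed are made-up assumptions for illustration) and computes each *T_{calc}* from the normal equations:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated data (hypothetical): y = 2 + 3*x1 + noise; x2 is deliberately irrelevant
n, k = 30, 2
x1 = rng.uniform(0.0, 10.0, n)
x2 = rng.uniform(0.0, 10.0, n)
y = 2.0 + 3.0 * x1 + rng.normal(0.0, 1.0, n)

X = np.column_stack([np.ones(n), x1, x2])      # design matrix with intercept column
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                          # least-squares estimates b_0, b_1, b_2

resid = y - X @ b
s2 = resid @ resid / (n - k - 1)               # error variance estimate, nu = n - k - 1
se = np.sqrt(s2 * np.diag(XtX_inv))            # standard errors s(b_j)

t_calc = b / se                                # T_calc for H0: beta_j = 0
t_crit = stats.t.ppf(1 - 0.05 / 2, n - k - 1)  # two-sided critical value, alpha = 0.05

for j in range(k + 1):
    print(f"b_{j} = {b[j]:7.3f}, T_calc = {t_calc[j]:8.3f}, "
          f"significant: {abs(t_calc[j]) > t_crit}")
```

With this setup *b_{1}* should come out clearly significant, while *x_{2}*, which plays no role in generating *y*, should typically not.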

**Parameter Confidence Intervals**

As an alternative to the above parameter tests you may want to find
a confidence interval for each parameter. All parameter confidence interval
formulae are of the same form. They can be derived from the *T-test statistic*:

*b_{j} - t_{α/2, ν} s(b_{j}) ≤ β_{j} ≤ b_{j} + t_{α/2, ν} s(b_{j})*

**Note:** Here you simply look at the signs of the interval endpoints. If the
left endpoint is *negative* and the right endpoint is *positive*, then zero is
included in the interval. Hence, the parameter is *not significant*, because it
may be zero. If, on the other hand, both endpoints have the same sign, either
*both negative* or *both positive*, then zero is not included in the interval,
and the parameter is significant (significantly different from zero).
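A minimal sketch of the sign rule, using a hypothetical estimate, standard error, and degrees of freedom (all numbers are illustrative, not from a real data set):

```python
from scipy import stats

# Hypothetical estimates (illustrative numbers only)
b_j, se_bj = 1.20, 0.45          # parameter estimate and its standard error
nu, alpha = 12, 0.05             # degrees of freedom n-k-1, significance level

t_half = stats.t.ppf(1 - alpha / 2, nu)          # t_{alpha/2, nu}
lower = b_j - t_half * se_bj
upper = b_j + t_half * se_bj

# Sign rule: if the endpoints have opposite signs, zero is inside the interval
significant = (lower > 0) == (upper > 0)
print(f"{1 - alpha:.0%} CI: ({lower:.3f}, {upper:.3f})  significant: {significant}")
```

Here both endpoints are positive, so zero is excluded and the parameter would be judged significant.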

**Test of the Significance of the Overall Regression**

This is an *F-test* like in simple linear regression. The test uses
the total variability *SST* in the data. This variability is divided
into variability due to regression *SSR* and variability due to
randomness *SSE*. If all variability in the data is only randomness
then *SST=SSE*. If all variability in the data can be accounted for
by the relationship between the dependent variable and the independent
variables then *SST=SSR*. Because some randomness will always exist in the
data, *SST=SSR+SSE*. Now two independent estimators of the
error variance *σ^{2}* are obtained, one using *SSR* and the other
using *SSE*.

The error variance estimator from *SSR* becomes

*s_{1}^{2} = SSR / k*

The error variance estimator from *SSE* becomes

*s_{2}^{2} = SSE / (n-k-1)*

It can be shown that both estimators are unbiased if the data
contains only randomness and there is no relationship between the
dependent variable *y* and the independent variables
*x_{j}*, i.e.

*E(s_{1}^{2}) = E(s_{2}^{2}) = σ^{2}*

and the random variable

*F = s_{1}^{2} / s_{2}^{2}*

follows the *F-distribution* with *ν_{1}=k* and *ν_{2}=n-k-1* degrees
of freedom. However, if there is a relationship
between the dependent variable and the independent variables then the
estimator *s_{1}^{2}* is a biased estimator,

*E(s_{1}^{2}) > σ^{2}*

and the *F* random variable does not follow the *F-distribution*.
For significant regression one hopes that *s_{1}^{2}*
would be a biased estimator, and hence, that a significant amount of the
variability in the data would be contained in *SSR*.

**Note:** You may recall that an estimator is said to be unbiased if the
expected value of the estimator equals the population parameter. That is all
the notation above (*E(...) = ...*) stands for. You don't need to worry about
this here. It is just to remind you why and how we came up with the formulae.

**Steps for the Overall Significance Test using the ANOVA Approach**

- Hypotheses
  *H_{0}*: The regression is not significant
  *H_{1}*: The regression is significant
- Assume the level of significance *α*
- Evaluate the test statistic
  *F_{calc} = s_{1}^{2} / s_{2}^{2} = MSR / MSE*
- Determine the critical region (rejection region)
  *F_{calc} > f_{α, ν_{1}, ν_{2}} = f_{α, (k, n-k-1)}*
- ANOVA Summary Table

  | Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F |
  | --- | --- | --- | --- | --- |
  | Regression | *SSR* | *k* | *MSR = SSR/k* | *F_{calc} = MSR/MSE* |
  | Error | *SSE* | *n-k-1* | *MSE = SSE/(n-k-1)* | |
  | Total | *SST* | *n-1* | | |

  **Note:** This type of ANOVA summary table is part of the regression output of many software packages capable of statistical analysis, including *MS Excel*.
- Conclusion
  If *F_{calc} > f_{α, ν_{1}, ν_{2}} = f_{α, (k, n-k-1)}*, then reject *H_{0}* and conclude that a significant amount of the variability in the data is explained by the regression model.
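The decomposition *SST=SSR+SSE* and the F-test steps can be sketched on simulated data (the coefficients, sample size, and seed are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Simulated data (hypothetical) with a genuine linear relationship,
# so the overall regression should come out significant
n, k = 25, 2
X = np.column_stack([np.ones(n), rng.uniform(0.0, 5.0, (n, 2))])
y = X @ np.array([1.0, 2.0, -1.5]) + rng.normal(0.0, 1.0, n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b

sst = np.sum((y - y.mean()) ** 2)        # total variability
ssr = np.sum((y_hat - y.mean()) ** 2)    # variability due to regression
sse = np.sum((y - y_hat) ** 2)           # variability due to randomness

s1_sq = ssr / k                          # s_1^2, nu_1 = k (= MSR)
s2_sq = sse / (n - k - 1)                # s_2^2, nu_2 = n-k-1 (= MSE)
f_calc = s1_sq / s2_sq
f_crit = stats.f.ppf(1 - 0.05, k, n - k - 1)

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"F_calc = {f_calc:.2f}, f_crit = {f_crit:.2f}, "
      f"significant: {f_calc > f_crit}")
```

Note that *SST = SSR + SSE* holds exactly (up to floating-point error) once the model includes an intercept.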

**Test of the Ordinary Least Squares (OLS) Assumptions**

This test is carried out as in simple linear regression by plotting the
residuals *e_{i} = y_{i} - ŷ_{i}* against the estimated
(or fitted) values *ŷ_{i}*. Under the OLS assumptions the residuals should
scatter randomly around zero with constant variance and no visible pattern.
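A minimal sketch of such a residual plot on simulated data (the model coefficients, sample size, and file name are illustrative assumptions):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")                    # non-interactive backend: write plot to a file
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)

# Simulated data (hypothetical) for a two-variable model
n = 40
X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, (n, 2))])
y = X @ np.array([0.5, 1.0, 2.0]) + rng.normal(0.0, 1.0, n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b                            # fitted values
e = y - y_hat                            # residuals e_i = y_i - y_hat_i

# Under the OLS assumptions the points scatter randomly around zero:
# a funnel shape suggests non-constant variance, curvature a wrong model form.
plt.scatter(y_hat, e)
plt.axhline(0.0, linestyle="--")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.savefig("residual_plot.png")
```

With an intercept in the model the residuals always sum to zero, so the interesting information in the plot is their spread and shape, not their level.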

**Note:** For practice, please repeat the manual computations
for a multiple linear regression case with two independent variables,
and create the corresponding *MS Excel* table (with formulae). This
will help you eliminate any and all 'magic' from regression.
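As a starting point for that exercise, the sketch below solves the normal equations for a small made-up data set with two independent variables (the numbers are illustrative only):

```python
import numpy as np

# Small hypothetical data set (made-up numbers) with two independent variables,
# mirroring the kind of table you would build in an Excel sheet
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([4.1, 5.9, 9.2, 10.8, 15.1, 16.0])

n = len(y)
X = np.column_stack([np.ones(n), x1, x2])

# Normal equations (X'X) b = X'y -- the same sums of squares and cross
# products you would accumulate cell by cell in a spreadsheet
XtX = X.T @ X
Xty = X.T @ y
b = np.linalg.solve(XtX, Xty)

y_hat = X @ b
sse = np.sum((y - y_hat) ** 2)
print("b =", np.round(b, 4), " SSE =", round(sse, 4))
```

Comparing *b* and *SSE* from this script against your hand-built spreadsheet is a quick way to check the manual computations.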