
7 Data Regression

Example 7.5

Q: Consider the ART(PRT) linear regression in Example 7.1. Is it valid to reject the null hypothesis of a linear fit through the origin at a 5% level of significance?

A: The results of the respective t test are shown in the last two columns of Figure 7.3. Taking into account the value of p (p ≈ 0 for t* = −9.1), the null hypothesis is rejected. This is a somewhat strange result, since one expects a null area corresponding to a null perimeter. As a matter of fact, an ART(PRT) linear regression without intercept is also a valid data model (see Exercise 7.3).
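The intercept test above can be reproduced by inspecting the coefficient table of the fitted model. The following is a minimal R sketch, assuming the measurements are available in a data frame cork with columns ART and PRT (hypothetical names; adapt them to the actual data set).

    # Minimal sketch: a data frame 'cork' with columns ART and PRT is assumed.
    fit <- lm(ART ~ PRT, data = cork)

    # Coefficient table: estimate, standard error, t value and p-value.
    # The '(Intercept)' row carries the t test of H0: beta0 = 0.
    summary(fit)$coefficients

    # Regression through the origin (no intercept), as in Exercise 7.3:
    fit0 <- lm(ART ~ PRT - 1, data = cork)
    summary(fit0)$coefficients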

7.1.3.3 Inferences About Predicted Values

Let us assume that one wants to derive interval estimators of $E[\hat{Y}_k]$, i.e., one wants to determine which value would be obtained, on average, for a predictor variable level $x_k$ if repeated samples (or trials) were used.

The point estimate of $E[\hat{Y}_k]$, corresponding to a certain value $x_k$, is the computed predicted value:

$$\hat{y}_k = b_0 + b_1 x_k.$$
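In R, the coefficients $b_0$, $b_1$ and the corresponding point estimate for a chosen level $x_k$ can be read off the fitted model directly. A sketch, reusing the hypothetical fit object from above and an illustrative level PRT = 400 (not taken from the text):

    # Point estimate of E[Y_k] at a chosen predictor level.
    b   <- coef(fit)                 # b[1] = b0, b[2] = b1
    x_k <- 400                       # illustrative PRT level (assumed)
    y_hat_k <- b[1] + b[2] * x_k     # y_hat_k = b0 + b1 * x_k

    # Equivalent, using predict():
    predict(fit, newdata = data.frame(PRT = x_k))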

The $\hat{y}_k$ value is a possible value of the random variable $\hat{Y}_k$, which represents all possible predicted values. The sampling distribution for the normal regression model is also normal (since it is a linear combination of observations), with:

– Mean: $E[\hat{Y}_k] = E[b_0 + b_1 x_k] = E[b_0] + x_k E[b_1] = \beta_0 + \beta_1 x_k$;

– Variance: $V[\hat{Y}_k] = \sigma^2 \left( \dfrac{1}{n} + \dfrac{(x_k - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)$.

Note that the variance is affected by how far $x_k$ is from the sample mean $\bar{x}$. This is a consequence of the fact that all regression estimates must pass through $(\bar{x}, \bar{y})$. Therefore, values $x_k$ far away from the mean lead to higher variability in the estimates.

Since $\sigma$ is usually unknown, we use the estimated variance:

$$s^2[\hat{Y}_k] = \text{MSE}\left( \frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right). \tag{7.17}$$
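Formula 7.17 can be checked numerically against the standard error that R reports for a fitted mean value. A sketch, under the same hypothetical cork/fit assumptions as above:

    # Estimated variance of the mean response at x_k, per formula 7.17.
    x   <- cork$PRT
    n   <- length(x)
    MSE <- sum(residuals(fit)^2) / (n - 2)
    x_k <- 400                                    # illustrative level (assumed)
    var_Yhat_k <- MSE * (1/n + (x_k - mean(x))^2 / sum((x - mean(x))^2))

    # predict() reports the square root of this quantity as a standard error:
    se <- predict(fit, newdata = data.frame(PRT = x_k), se.fit = TRUE)$se.fit
    all.equal(sqrt(var_Yhat_k), unname(se))       # should be TRUE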

Thus, in order to make inferences about $\hat{Y}_k$, we use the studentised statistic:

$$t^* = \frac{\hat{y}_k - E[\hat{Y}_k]}{s[\hat{Y}_k]} \sim t_{n-2}. \tag{7.18}$$
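From 7.18, a 95% confidence interval for $E[\hat{Y}_k]$ is $\hat{y}_k \pm t_{n-2,\,0.975}\, s[\hat{Y}_k]$, which R computes directly with predict(..., interval = "confidence"). A sketch, still under the hypothetical cork/fit names used above:

    # 95% confidence interval for the mean response at x_k, from formula 7.18.
    x_k <- 400                                    # illustrative level (assumed)
    p <- predict(fit, newdata = data.frame(PRT = x_k), se.fit = TRUE)
    t_crit <- qt(0.975, df = p$df)                # t quantile with n - 2 d.f.
    c(lower = p$fit - t_crit * p$se.fit,
      upper = p$fit + t_crit * p$se.fit)

    # The same interval, obtained in one call:
    predict(fit, newdata = data.frame(PRT = x_k),
            interval = "confidence", level = 0.95)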
