01.03.2013 Views

Applied Statistics Using SPSS, STATISTICA, MATLAB and R


7.1 Simple Linear Regression

This sampling distribution allows us to compute confidence intervals for the predicted values. Figure 7.4 shows with dotted lines the 95% confidence interval for the cork-stopper Example 7.1. Notice how the confidence interval widens as we move away from $(\bar{x}, \bar{y})$.

Example 7.6

Q: The observed value of ART for PRT = 1612 is 882. Determine the 95% confidence interval of the predicted ART value using the ART(PRT) linear regression model derived in Example 7.1.

A: Using the MSE and $s_{PRT}$ values as described in Example 7.2, and taking into account that $\overline{PRT} = 710.4$, we compute:

$$(x_k - \bar{x})^2 = (1612 - 710.4)^2 = 812882.6; \qquad \sum (x_i - \bar{x})^2 = 19439351;$$

$$s^2[\hat{Y}_k] = MSE\left(\frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right) = 73.94.$$

Since $t_{148,0.975} = 1.976$ and the standard error is $\sqrt{73.94} \approx 8.60$, we obtain $\hat{y}_k \in [882 - 17, 882 + 17]$ with 95% confidence level. This corresponds to the 95% confidence interval depicted in Figure 7.4.
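The arithmetic of Example 7.6 can be retraced with a short Python sketch. It uses only the figures quoted in the example; the sample size n = 150 is inferred from the 148 degrees of freedom (n − 2 for a simple linear regression), and the function and variable names are illustrative, not taken from the book. The MSE itself is not quoted on this page, so the sketch evaluates only the bracketed factor of the variance formula and then starts from the stated variance of 73.94.

```python
import math

def mean_response_factor(n, x_k, x_mean, sxx):
    """Factor multiplying MSE in the variance of the fitted value at x_k:
    1/n + (x_k - x_mean)^2 / sum((x_i - x_mean)^2)."""
    return 1.0 / n + (x_k - x_mean) ** 2 / sxx

# Values quoted in Example 7.6 (n is inferred from t with 148 degrees of freedom).
n = 150
x_k = 1612.0          # PRT value of interest
x_mean = 710.4        # mean PRT
sxx = 19439351.0      # sum of squared deviations of PRT
var_fit = 73.94       # estimated variance of the fitted value, as stated
t_quantile = 1.976    # t_{148, 0.975}, as stated
center = 882.0        # interval center used in the example

factor = mean_response_factor(n, x_k, x_mean, sxx)
half_width = t_quantile * math.sqrt(var_fit)
lower, upper = center - half_width, center + half_width

print(f"factor multiplying MSE: {factor:.5f}")
print(f"half-width: {half_width:.2f}")
print(f"95% interval: [{lower:.1f}, {upper:.1f}]")
```

The half-width comes out close to 17, in agreement with the interval given in the example. The factor grows with the squared distance of x_k from the mean, which is why the confidence band widens away from the centre of the data.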

7.1.3.4 Prediction of New Observations<br />

Imagine that we want to predict a new observation, that is, an observation for new predictor values independent of the original n cases. The new observation on y is viewed as the result of a new trial. To stress this point we call it $Y_{k(new)}$.

If the regression parameters were perfectly known, one would easily find the confidence interval for the prediction of a new value. Since the parameters are usually unknown, we have to take into account two sources of variation:

– The location of $E[Y_{k(new)}]$, i.e., where one would locate, on average, the new observation. This was discussed in the previous section.

– The distribution of $Y_{k(new)}$, i.e., how to assess the expected deviation of the new observation from its average value. For the normal regression model, the variance of the prediction error for the new prediction can be obtained as follows, assuming that the new observation is independent of the original n cases:

$$V_{pred} = V[Y_{k(new)} - \hat{Y}_k] = \sigma^2 + V[\hat{Y}_k].$$
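The decomposition of the prediction variance can be checked numerically. The following Python sketch is a Monte Carlo illustration on synthetic data, not the cork-stopper data set: the predictor grid, the true coefficients, sigma, and all function names are assumptions chosen for the illustration. It repeatedly fits a least-squares line, draws an independent new observation at x_k, and compares the empirical variance of the prediction error with σ² + V[Ŷ_k], where V[Ŷ_k] = σ²(1/n + (x_k − x̄)²/Σ(x_i − x̄)²).

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1

def simulate_prediction_errors(xs, x_k, beta0, beta1, sigma, trials, seed=0):
    """Simulate Y_new - Y_hat at x_k over repeated samples (synthetic data)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        b0, b1 = fit_line(xs, ys)
        y_hat = b0 + b1 * x_k
        # New observation: independent of the n cases used for the fit.
        y_new = beta0 + beta1 * x_k + rng.gauss(0.0, sigma)
        errors.append(y_new - y_hat)
    return errors

def theoretical_prediction_variance(xs, x_k, sigma):
    """sigma^2 + V[Y_hat_k] for a simple linear regression with known sigma."""
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    var_fit = sigma ** 2 * (1.0 / n + (x_k - x_mean) ** 2 / sxx)
    return sigma ** 2 + var_fit

# Hypothetical setup: 20 equally spaced predictor values, x_k outside their range.
xs = [float(i) for i in range(1, 21)]
x_k = 25.0
sigma = 2.0

errors = simulate_prediction_errors(xs, x_k, beta0=1.0, beta1=0.5,
                                    sigma=sigma, trials=20000)
empirical = statistics.pvariance(errors)
theoretical = theoretical_prediction_variance(xs, x_k, sigma)

print(f"empirical variance of prediction error: {empirical:.3f}")
print(f"sigma^2 + V[Y_hat_k]:                   {theoretical:.3f}")
```

With a fixed seed the two values should agree to within Monte Carlo error. In practice σ² is unknown and would be replaced by an estimate such as the MSE, which this sketch does not do.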
