01.03.2013 Views

Applied Statistics Using SPSS, STATISTICA, MATLAB and R


7.3 Building and Evaluating the Regression Model

[Figure 7.14, panel a: normal probability plot, Expected Normal Value against Residuals; panel b: histogram, No of obs against Residuals.]

Figure 7.14. Distribution of the residuals for the foetal weight example: a) Normal probability plot; b) Histogram.

7.3.3.2 Evaluating the Linear Model

Distribution of the Residuals

In order to assess whether the errors can be assumed normally distributed, one can use graphical inspection, as in Figure 7.14, and also perform the distribution fitting tests described in chapter 5. In the present case, the assumption of normal distribution for the errors seems a reasonable one.
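As a rough illustration of the graphical check, the sketch below computes the coordinates of a normal probability plot and the correlation between the ordered residuals and their expected normal values. It is only a sketch under stated assumptions: the residuals are synthetic (not the foetal weight data), the expected normal values use Blom's plotting positions (one common convention, not necessarily the one used for Figure 7.14), and the function names `normal_scores` and `probability_plot_correlation` are hypothetical.

```python
import random
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected normal values for n ordered observations.

    Assumption: Blom's plotting positions (i - 0.375) / (n + 0.25).
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def probability_plot_correlation(residuals):
    """Pearson correlation between sorted residuals and their normal scores."""
    x = sorted(residuals)
    z = normal_scores(len(x))
    mx, mz = mean(x), mean(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / (sxx * szz) ** 0.5


# Synthetic residuals for illustration only (NOT the foetal weight data).
rng = random.Random(0)
residuals = [rng.gauss(0.0, 250.0) for _ in range(414)]

r = probability_plot_correlation(residuals)
print(f"probability plot correlation: {r:.4f}")
```

A correlation close to 1 corresponds to points lying near a straight line in the normal probability plot. This is an informal complement to the distribution fitting tests, not a replacement for them.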

The constancy of the residual variance can be assessed using the following modified Levene test:

1. Divide the data set into two groups: one with the predictor values comparatively low and the other with the predictor values comparatively high. The objective is to compare the residual variance in the two groups. In the present case, we divide the cases into the two groups corresponding to observed weights below and above 3000 g. The sample sizes are $n_1 = 118$ and $n_2 = 296$, respectively.

2. Compute the medians of the residuals $e_i$ in the two groups: $\mathrm{med}_1$ and $\mathrm{med}_2$. In the present case $\mathrm{med}_1 = -182.32$ and $\mathrm{med}_2 = 59.87$.

3. Let $d_{i1} = |e_{i1} - \mathrm{med}_1|$ and $d_{i2} = |e_{i2} - \mathrm{med}_2|$ represent the absolute deviations of the residuals around the medians in each group. We now compute the respective sample means, $\bar{d}_1$ and $\bar{d}_2$, of these absolute deviations, which in our study case are: $\bar{d}_1 = 187.37$, $\bar{d}_2 = 221.42$.

4. Compute:

$$t^* = \frac{\bar{d}_1 - \bar{d}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n-2}, \qquad (7.52)$$
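The steps above can be sketched in Python as follows. This is a minimal sketch, not the book's implementation. The excerpt does not show how $s$ is defined, so the code assumes the usual pooled standard deviation of the absolute deviations, $s^2 = \left[\sum_i (d_{i1} - \bar{d}_1)^2 + \sum_i (d_{i2} - \bar{d}_2)^2\right] / (n - 2)$ with $n = n_1 + n_2$. The function name `modified_levene` and the small data set are hypothetical and serve only to exercise the arithmetic.

```python
from statistics import mean, median


def modified_levene(residuals_low, residuals_high):
    """Return (t_star, degrees_of_freedom) for two groups of residuals."""
    n1, n2 = len(residuals_low), len(residuals_high)

    # Step 2: medians of the residuals in each group.
    med1, med2 = median(residuals_low), median(residuals_high)

    # Step 3: absolute deviations around the group medians and their means.
    d1 = [abs(e - med1) for e in residuals_low]
    d2 = [abs(e - med2) for e in residuals_high]
    d1_bar, d2_bar = mean(d1), mean(d2)

    # Assumption: s is the pooled standard deviation of the absolute
    # deviations, with n1 + n2 - 2 degrees of freedom.
    ss = sum((d - d1_bar) ** 2 for d in d1) + sum((d - d2_bar) ** 2 for d in d2)
    df = n1 + n2 - 2
    s = (ss / df) ** 0.5

    # Step 4: the test statistic, referred to a t distribution with df
    # degrees of freedom.
    t_star = (d1_bar - d2_bar) / (s * (1 / n1 + 1 / n2) ** 0.5)
    return t_star, df


# Hypothetical residuals for illustration only (NOT the foetal weight data).
low = [-3.0, -1.0, 0.0, 1.0, 3.0]
high = [-6.0, -2.0, 0.0, 2.0, 6.0]

t_star, df = modified_levene(low, high)
print(f"t* = {t_star:.3f} with {df} degrees of freedom")
```

The absolute value of `t_star` would then be compared with the critical value of the Student's t distribution with `df` degrees of freedom. A large absolute value suggests that the residual variance differs between the two groups.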
