01.03.2013 Views

Applied Statistics Using SPSS, STATISTICA, MATLAB and R


7 Data Regression

the same way as we performed feature selection in Chapter 6. The search procedure must also use an appropriate criterion for predictor selection. Many such criteria have been published in the literature; we indicate just a few here:

– SSE (minimisation)
– R square (maximisation)
– t statistic (maximisation)
– F statistic (maximisation)
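As a rough illustration of how these four criteria can be computed for one candidate model, the following Python sketch fits an ordinary least-squares model with NumPy. The function name `regression_criteria` and the synthetic data are assumptions made for this illustration only; they do not come from the text or from any of the packages it discusses.

```python
import numpy as np

def regression_criteria(X, y):
    """Fit y = b0 + X b by least squares and return SSE, R square, F and t.

    Hypothetical helper for illustration; X is an (n, p) array of predictors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    sse = float(resid @ resid)                    # error sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    ssr = sst - sse                               # regression sum of squares
    mse = sse / (n - p - 1)                       # error mean square
    r2 = ssr / sst                                # R square
    f = (ssr / p) / mse                           # overall F = MSR / MSE
    cov = mse * np.linalg.inv(A.T @ A)            # coefficient covariance matrix
    t = beta / np.sqrt(np.diag(cov))              # t statistic per coefficient
    return {"SSE": sse, "R2": r2, "F": f, "t": t[1:]}  # t excludes the intercept

# Synthetic data (assumed, not from the book): one predictor plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(size=50)
crit = regression_criteria(x.reshape(-1, 1), y)
```

With a single predictor the F statistic equals the square of the t statistic, so for one-variable-at-a-time comparisons the two criteria rank candidates identically.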

When building the model, these criteria can be used in a stepwise manner, the same way as we performed sequential feature selection in Chapter 6: either by adding consecutive variables to the model (the so-called forward search method) or by removing variables from an initial set (the so-called backward search method).

For instance, a very popular method is forward stepwise model building using the F statistic, as follows:

1. The first variable to enter, say $X_1$, is the one with maximum $F_k = \mathrm{MSR}(X_k)/\mathrm{MSE}(X_k)$, which must be above a certain specified level.
2. The next variable added is the one with maximum $F_k = \mathrm{MSR}(X_k \mid X_1)/\mathrm{MSE}(X_k, X_1)$, which must also be above a certain specified level.
3. The Step 2 procedure is repeated until no variable has a partial F above the specified level.
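The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the algorithm implemented by STATISTICA or any other package: the names `sse_of`, `forward_stepwise` and `f_to_enter` are hypothetical, the model always includes an intercept, and the data are synthetic. The partial F of a candidate is taken as the reduction in SSE it produces (one degree of freedom) divided by the MSE of the enlarged model.

```python
import numpy as np

def sse_of(columns, y):
    """SSE of the least-squares fit of y on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_stepwise(X, y, f_to_enter=1.0):
    """Forward search: repeatedly add the predictor with the largest partial F.

    Hypothetical illustration. Returns the selected column indices in order of
    entry and a history of (index, partial F) pairs.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    selected, history = [], []
    sse_current = sse_of([], y)                 # intercept-only model: SSE = SST
    while len(selected) < p:
        best = None
        for k in range(p):
            if k in selected:
                continue
            cols = [X[:, j] for j in selected + [k]]
            df_err = n - len(cols) - 1          # error degrees of freedom
            if df_err <= 0:
                continue
            sse_new = sse_of(cols, y)
            msr = sse_current - sse_new         # extra sum of squares, 1 df
            mse = sse_new / df_err
            f_k = msr / mse if mse > 0 else float("inf")
            if best is None or f_k > best[1]:
                best = (k, f_k, sse_new)
        if best is None or best[1] <= f_to_enter:
            break                               # no candidate exceeds the level
        selected.append(best[0])
        history.append((best[0], best[1]))
        sse_current = best[2]
    return selected, history

# Synthetic data (assumed): column 2 is a strong predictor, column 0 a weak
# one, columns 1 and 3 are unrelated to y.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 1.0 + 1.0 * X[:, 0] + 5.0 * X[:, 2] + rng.normal(size=200)
selected, history = forward_stepwise(X, y, f_to_enter=1.0)
```

Raising `f_to_enter` makes the procedure stop earlier; with a very large level no variable enters at all.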

Example 7.17<br />

Q: Apply the forward stepwise procedure to the foetal weight data (see Example 7.13), using as initial predictor sets {BPD, CP, AP} and {MW, MH, BPD, CP, AP, FL}.

A: Figure 7.11 shows the evolution of the model when the forward stepwise method is applied to {BPD, CP, AP}. The first variable to be included, the one with the highest F, is AP. The variables included next have a decreasing F contribution, but one still higher than the specified “F to Enter” level, equal to 1. These results confirm the findings on partial correlation coefficients discussed in section 7.2.5 (Table 7.4).

Figure 7.11. Forward stepwise regression (obtained with STATISTICA) for the foetal weight example, using {BPD, CP, AP} as initial predictor set.
