Applied Statistics Using SPSS, STATISTICA, MATLAB and R


6 Statistical Classification

In a backward search, the process starts with the whole feature set and, at each step, the feature that contributes the least to class discrimination is removed. The process goes on until the merit criterion for every remaining candidate feature is above a specified threshold.
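The backward elimination loop can be sketched as follows. This is a minimal illustration, not the STATISTICA or SPSS procedure: the `merit` function and its per-feature scores are made-up stand-ins for a real criterion (such as an ANOVA F computed from labelled training data).

```python
def merit(features):
    # Stand-in criterion for illustration only: pretend each feature adds a
    # fixed amount of class discrimination (a real criterion would score the
    # feature subset on labelled data).
    scores = {"ART": 4.2, "PRM": 2.1, "NG": 1.5, "RAAR": 1.2, "N": 0.4}
    return sum(scores.get(f, 0.0) for f in features)

def backward_search(features, threshold=1.0):
    """Remove, at each step, the feature contributing least to the merit,
    stopping once every remaining feature contributes above the threshold."""
    selected = list(features)
    while len(selected) > 1:
        # Contribution of each feature = drop in merit when it is removed.
        contrib = {f: merit(selected) - merit([g for g in selected if g != f])
                   for f in selected}
        worst = min(contrib, key=contrib.get)
        if contrib[worst] >= threshold:
            break  # all remaining features pass the merit threshold
        selected.remove(worst)
    return selected

print(backward_search(["ART", "PRM", "NG", "RAAR", "N"]))
```

Each pass discards the weakest feature; the search stops as soon as every survivor's contribution exceeds the threshold.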

2. Sequential search (dynamic)

The problem with the previous search methods is the possible existence of “nested” feature subsets that are not detected by direct sequential search. This problem is tackled in a dynamic search by performing a combination of forward and backward searches at each level, known as “plus l-take away r” selection.
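A minimal sketch of this alternating scheme, here with l = r = 1 (a “plus 1-take away 1” selection). The `GAIN` table and the shrinking-contribution rule are invented purely for illustration; a real implementation would re-score each candidate subset on labelled data at every step.

```python
# Hypothetical per-feature merit values, for illustration only.
GAIN = {"ART": 4.2, "PRM": 2.1, "NG": 1.5, "RAAR": 1.2, "N": 0.4, "PRT": 0.3}

def contribution(feature, others):
    # Stand-in merit: a feature's contribution shrinks as the subset grows,
    # mimicking redundancy among correlated features.
    return GAIN[feature] / (1 + 0.1 * len(others))

def stepwise(candidates, f_enter=1.0, f_remove=1.0):
    selected = []
    while True:
        # Plus-1 step: enter the best remaining candidate, if it qualifies.
        remaining = [f for f in candidates if f not in selected]
        if not remaining:
            break
        best = max(remaining, key=lambda f: contribution(f, selected))
        if contribution(best, selected) < f_enter:
            break
        selected.append(best)
        # Take-away-1 step: drop any previously entered feature whose
        # contribution, given the others, has fallen below f_remove.
        for f in list(selected):
            others = [g for g in selected if g != f]
            if contribution(f, others) < f_remove:
                selected.remove(f)
    return selected

print(stepwise(["ART", "PRM", "NG", "RAAR", "N", "PRT"]))
```

The backward step after each insertion is what allows the search to escape “nested” subsets that a purely forward search would lock in.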

Direct sequential search methods can be applied using STATISTICA and SPSS, the latter affording a dynamic search procedure that is in fact a “plus 1-take away 1” selection. As merit criterion, STATISTICA uses the ANOVA F (for all selected features at a given step) with a default value of one. SPSS allows the use of other merit criteria such as the squared Bhattacharyya distance (i.e., the squared Mahalanobis distance of the means).

It is also common to set a lower limit to the so-called tolerance level, T = 1 – r², which must be satisfied by all features, where r is the multiple correlation factor of one candidate feature with all the others. Highly correlated features are therefore removed. One must be quite conservative, however, in the specification of the tolerance. A value at least as low as 1% is common practice.
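The tolerance check can be illustrated as follows. For simplicity a single already-selected feature is used, so the multiple correlation r reduces to the ordinary Pearson correlation; the data values are made up for the example.

```python
from math import sqrt

def pearson(x, y):
    # Ordinary Pearson correlation coefficient.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up data: a candidate feature nearly collinear with a selected one.
selected_feature = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate = [1.1, 1.9, 3.2, 3.8, 5.1]

r = pearson(selected_feature, candidate)
tolerance = 1 - r * r          # T = 1 - r^2

# Reject the candidate if its tolerance falls below the preset limit.
T_MIN = 0.01                   # the conservative 1% value mentioned above
accept = tolerance >= T_MIN
print(tolerance, accept)
```

Even this nearly collinear candidate only just clears a 1% limit, which shows why the tolerance threshold must be set so low: any larger value would reject many usable features.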

Example 6.12

Q: Consider the first two classes of the Cork Stoppers’ dataset. Perform forward and backward searches on the available 10-feature set, using default values for the tolerance (0.01) and the ANOVA F (1.0). Evaluate the training set errors of both solutions.

A: Figure 6.21 shows the summary listing of a forward search for the first two classes of the cork-stopper data obtained with STATISTICA. Equal priors are assumed. Note that variable ART, with the highest F, entered the model in “Step 1”. The Wilks’ lambda, initially 1, decreased to 0.42 due to the contribution of ART. Next, in “Step 2”, the variable with the highest F contribution for the model containing ART enters the model, decreasing the Wilks’ lambda to 0.4. The process continues until there are no variables with an F contribution higher than 1. The listing also indicates an approximate F for the model, based on the Wilks’ lambda. Figure 6.21 shows that the selection process stopped with a highly significant (p ≈ 0) Wilks’ lambda. The four-feature solution {ART, PRM, NG, RAAR} corresponds to the classification matrix shown before in Figure 6.14b.

Using a backward search, a solution with only two features (N and PRT) is obtained. It has the performance presented in Example 6.2. Notice that the backward search usually needs to start with a very low tolerance value (in the present case T = 0.002 is sufficient). The dimensionality ratio of this solution is
