16.11.2012 Views

Data Mining Methods and Models

CHAPTER 6 GENETIC ALGORITHMS

4. Press the Choose button next to classifier.
5. Select Classifiers → Bayes → Naive Bayes from the navigation hierarchy.
6. Click OK to close the WrapperSubsetEval dialog. The evaluation method for AttributeSelection is now specified.
7. On the AttributeSelection dialog, press the Choose button next to search.
8. Select AttributeSelection → GeneticSearch from the navigation hierarchy.
9. Press OK to close the AttributeSelection dialog.

The evaluator and search methods for attribute selection have been specified; however, our changes haven’t yet been applied to our data set.

10. Press the Apply button on the right side of the Explorer panel.

After processing the command, WEKA displays updated results in the Explorer panel. In particular, under Attributes, notice that the list now shows seven predictor attributes. That is, the two attributes single-cell-size and mitoses have been removed from the attribute list. Let’s reclassify the records using naive Bayes with 10-fold cross-validation; however, this time only seven attributes are input to the classifier.

1. Select the Classify Tab.
2. Under Classifier, press the Choose button.
3. Select Classifiers → Bayes → Naive Bayes from the navigation hierarchy.
4. Under Test options, confirm that Cross-validation is specified.
5. Click Start.
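To make the cross-validation setting concrete, the following is a minimal sketch of how 10-fold cross-validation partitions the 683 records into test folds. It is only an illustration: the function name `k_fold_indices` is hypothetical, and the sketch neither shuffles nor stratifies the records, both of which WEKA's own procedure may do.

```python
def k_fold_indices(n_records, k=10):
    """Yield (train, test) index lists for k folds of nearly equal size."""
    base, extra = divmod(n_records, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional record.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_records))
        yield train, test
        start += size


folds = list(k_fold_indices(683, k=10))
sizes = [len(test) for _, test in folds]
print(sizes)  # three folds of 69 records, then seven folds of 68
```

Each record appears in exactly one test fold, so the reported accuracy is computed over all 683 records, with every prediction made by a model that did not see that record during training.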

Now, naive Bayes reports 96.78% (661/683) classification accuracy, which indicates that the second model outperforms the first model by about 0.44 percentage points (96.78% versus 96.34%). That is, classification accuracy has increased when only seven of the nine attributes are specified as input. Although these results do not show a dramatic improvement in accuracy, this simple example has demonstrated how WEKA’s Genetic Search algorithm can be included as part of an attribute selection approach.
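The idea of pairing a subset evaluator with a genetic search can be sketched in a few lines of Python. This is not WEKA's implementation: the `fitness` function below is a hypothetical stand-in for the wrapper evaluator (which in the WEKA run would be the cross-validated accuracy of naive Bayes on the chosen attributes), the `USEFUL` set is arbitrary, and the population size, generation count, and mutation rate are illustrative values rather than WEKA settings.

```python
import random

N_ATTRIBUTES = 9  # nine predictor attributes, as in the example data set

# Assumption for illustration only: an arbitrary set of "helpful" attribute indices.
USEFUL = {0, 1, 2, 3, 5, 6, 7}


def fitness(mask):
    """Toy score for a subset: +1 per helpful attribute, -0.5 per other attribute."""
    score = 0.0
    for i, bit in enumerate(mask):
        if bit:
            score += 1.0 if i in USEFUL else -0.5
    return score


def crossover(a, b, rng):
    """Single-point crossover of two attribute masks."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(mask, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return tuple(1 - bit if rng.random() < rate else bit for bit in mask)


def genetic_search(pop_size=20, generations=20, seed=1):
    """Return the best attribute mask found (1 = keep attribute, 0 = drop it)."""
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(N_ATTRIBUTES))
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Shift scores so that selection weights are strictly positive.
        scores = [fitness(m) for m in population]
        low = min(scores)
        weights = [s - low + 1e-6 for s in scores]
        next_gen = [best]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            next_gen.append(mutate(crossover(p1, p2, rng), rng))
        population = next_gen
        best = max(population, key=fitness)
    return best


best_mask = genetic_search()
print(best_mask, fitness(best_mask))
```

Each candidate in the population is a string of nine bits, one per attribute, so the search explores subsets of attributes rather than individual attributes. Replacing the toy `fitness` with a real classifier evaluation is what turns this into a wrapper approach, at the cost of training the classifier once per candidate subset.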

Let’s further examine the results reported by WEKA’s Genetic Search method, where the characteristics of the candidate population are described. The following procedures should look similar to those performed above. This time, however, we’re invoking the attribute selection filter from WEKA’s Select attributes Tab, which provides detailed output.

1. Return to the Preprocess Tab on the Explorer panel.
2. Press the Undo button (top right). This removes the filter we applied to the data set earlier.
3. Select the Select attributes Tab from the Explorer panel.
4. Under Attribute Evaluator, press the Choose button.
5. Select AttributeSelection → WrapperSubsetEval from the navigation hierarchy.
6. Click on the text “WrapperSubsetEval . . .” to open the WrapperSubsetEval dialog.
