Data Mining: Practical Machine Learning Tools and ... - LIDeCC


7.5 COMBINING MULTIPLE MODELS

[Figure 7.10: Simple option tree for the weather data. The root tests outlook: "= overcast" leads to a yes leaf, and "≠ overcast" leads to an option node. The option node has two options, a split on humidity (= high: no; = normal: yes) and a split on windy (= true: no; = false: yes).]

...overall classification. This can be done simply by voting, taking the majority vote at an option node to be the prediction of the node. In that case it makes little sense to have option nodes with only two options (as in Figure 7.10) because there will only be a majority if both branches agree. Another possibility is to average the probability estimates obtained from the different paths, using either an unweighted average or a more sophisticated Bayesian approach.

Option trees can be generated by modifying an existing decision tree learner to create an option node if there are several splits that look similarly useful according to their information gain. All choices within a certain user-specified tolerance of the best one can be made into options. During pruning, the error of an option node is the average error of its options.

Another possibility is to grow an option tree by incrementally adding nodes to it. This is commonly done using a boosting algorithm, and the resulting trees are usually called alternating decision trees instead of option trees. In this context the decision nodes are called splitter nodes and the option nodes are called prediction nodes. Prediction nodes are leaves if no splitter nodes have been added to them yet. The standard alternating decision tree applies to two-class problems, and with each prediction node is associated a positive or negative numeric value. To obtain a prediction for an instance, filter it down all applicable branches and sum up the values from any prediction nodes that are encountered; predict one class or the other depending on whether the sum is positive or negative.
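
The prediction procedure for an alternating decision tree can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (PredictionNode, SplitterNode), the function names (score, classify), and every numeric value in the example tree are hypothetical and are not taken from the text or from any learned model. Treating a sum of exactly zero as the negative class is likewise an arbitrary choice made here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Instance = Dict[str, object]


@dataclass
class PredictionNode:
    """Holds a numeric value; it is a leaf while it has no splitter nodes."""
    value: float
    splitters: List["SplitterNode"] = field(default_factory=list)


@dataclass
class SplitterNode:
    """A two-way test; each outcome leads to its own prediction node."""
    test: Callable[[Instance], bool]
    if_true: PredictionNode
    if_false: PredictionNode


def score(node: PredictionNode, instance: Instance) -> float:
    """Sum the values of every prediction node the instance reaches."""
    total = node.value
    # All splitters attached to a prediction node apply, so follow each one.
    for splitter in node.splitters:
        branch = splitter.if_true if splitter.test(instance) else splitter.if_false
        total += score(branch, instance)
    return total


def classify(root: PredictionNode, instance: Instance) -> str:
    """Predict by the sign of the sum (zero goes to "no": an assumption)."""
    return "yes" if score(root, instance) > 0 else "no"


# Hypothetical tree over the weather attributes; the values are made up.
not_overcast = PredictionNode(
    -0.5,
    [SplitterNode(lambda x: x["humidity"] == "high",
                  PredictionNode(-0.75), PredictionNode(0.5))],
)
root = PredictionNode(
    0.25,
    [
        SplitterNode(lambda x: x["outlook"] == "overcast",
                     PredictionNode(1.0), not_overcast),
        SplitterNode(lambda x: x["windy"],
                     PredictionNode(-0.25), PredictionNode(0.25)),
    ],
)

sunny_humid = {"outlook": "sunny", "humidity": "high", "windy": False}
overcast_windy = {"outlook": "overcast", "humidity": "high", "windy": True}

print(score(root, sunny_humid), classify(root, sunny_humid))        # -0.75 no
print(score(root, overcast_windy), classify(root, overcast_windy))  # 1.0 yes
```

For the first instance the path collects 0.25 (root), -0.5 and -0.75 (the outlook branch and its humidity splitter), and 0.25 (the windy splitter), giving -0.75. Unlike an ordinary decision tree, the instance follows both splitters under the root rather than a single path.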
