Data Mining: Practical Machine Learning Tools and ... - LIDeCC

…eralization is. Salzberg (1991) suggested that generalization with nested exemplars can achieve a high degree of classification accuracy on a variety of different problems, a conclusion disputed by Wettschereck and Dietterich (1995), who argued that these results were fortuitous and did not hold in other domains. Martin (1995) explored the idea that it is not the generalization but the overgeneralization that occurs when hyperrectangles nest or overlap that is responsible for poor performance, and demonstrated that if nesting and overlapping are avoided, excellent results are achieved in a large number of domains. The generalized distance function based on transformations is described by Cleary and Trigg (1995).

Exemplar generalization is a rare example of a learning strategy in which the search proceeds from specific to general, rather than from general to specific as in the case of tree or rule induction.
There is no particular reason why specific-to-general searching should necessarily be handicapped by forcing the examples to be considered in a strictly incremental fashion, and batch-oriented approaches exist that generate rules using a basic instance-based approach. Moreover, it seems that the idea of producing conservative generalizations and coping with instances that are not covered by choosing the "closest" generalization is an excellent one that will eventually be extended to ordinary tree and rule inducers.

6.5 Numeric prediction

Trees that are used for numeric prediction are just like ordinary decision trees, except that at each leaf they store either a class value that represents the average value of instances that reach the leaf, in which case the tree is called a regression tree, or a linear regression model that predicts the class value of instances that reach the leaf, in which case it is called a model tree. In what follows we describe model trees, because regression trees are really a special case.

Regression and model trees are constructed by first using a decision tree induction algorithm to build an initial tree. However, whereas most decision tree algorithms choose the splitting attribute to maximize the information gain, for numeric prediction it is appropriate instead to minimize the intra-subset variation in the class values down each branch. Once the basic tree has been formed, consideration is given to pruning the tree back from each leaf, just as with ordinary decision trees. The only difference between regression tree and model tree induction is that, for the latter, each node is replaced by a regression plane instead of a constant value. The attributes that serve to define that regression are precisely those that participate in decisions in the subtree that will be pruned, that is, in nodes underneath the current one.
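To make the induction procedure concrete, here is a minimal sketch, in Python, of building a regression tree over a single numeric attribute under the scheme just described: each candidate split is scored by the weighted intra-subset variance of the class values it produces, and each leaf stores the average class value of the instances reaching it. The function names (`variance`, `best_split`, `build_tree`, `predict`) are illustrative, not from the book; a real system also handles multiple attributes, pruning, and smoothing, and a model tree would fit a linear regression at each leaf instead of storing a constant.

```python
def variance(ys):
    """Variance of a list of class values (0 for fewer than two values)."""
    if len(ys) < 2:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_split(xs, ys):
    """Return (score, threshold) minimizing the weighted intra-subset
    variance of the class values, or None if no split is possible."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # midpoint between adjacent attribute values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * variance(left) +
                 len(right) * variance(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, t)
    return best

def build_tree(xs, ys, min_size=2):
    """Split recursively while variance is reduced; a leaf is just the
    average class value (for a model tree, fit a linear model instead)."""
    if len(ys) <= min_size or variance(ys) == 0.0:
        return sum(ys) / len(ys)            # leaf: average class value
    split = best_split(xs, ys)
    if split is None or split[0] >= variance(ys):
        return sum(ys) / len(ys)            # no variance reduction: leaf
    _, t = split
    left = [i for i, x in enumerate(xs) if x <= t]
    right = [i for i, x in enumerate(xs) if x > t]
    return (t,
            build_tree([xs[i] for i in left], [ys[i] for i in left], min_size),
            build_tree([xs[i] for i in right], [ys[i] for i in right], min_size))

def predict(tree, x):
    """Descend internal nodes (threshold, left, right) to a leaf value."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree
```

For example, given attribute values `[1, 2, 3, 10, 11, 12]` with class values `[1.0, 1.0, 1.0, 5.0, 5.0, 5.0]`, the variance-reduction criterion places the single split at 6.5 and the two leaves predict the subset averages 1.0 and 5.0.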
