6.5 NUMERIC PREDICTION

…in Section 3.3, and sometimes the structure can be expressed much more concisely using a set of rules instead of a tree. Can we generate rules for numeric prediction? Recall the rule learner described in Section 6.2 that uses separate-and-conquer in conjunction with partial decision trees to extract decision rules from trees. The same strategy can be applied to model trees to generate decision lists for numeric prediction.

First build a partial model tree from all the data. Pick one of the leaves and make it into a rule. Remove the data covered by that leaf; then repeat the process with the remaining data. The question is how to build the partial model tree, that is, a tree with unexpanded nodes. This boils down to the question of how to pick which node to expand next. The algorithm of Figure 6.5 (Section 6.2) picks the node whose entropy for the class attribute is smallest. For model trees, whose predictions are numeric, simply use the variance instead. This is based on the same rationale: the lower the variance, the shallower the subtree and the shorter the rule. The rest of the algorithm stays the same, with the model tree learner's split selection method and pruning strategy replacing the decision tree learner's. Because the model tree's leaves are linear models, the corresponding rules will have linear models on the right-hand side.

There is one caveat when using model trees in this fashion to generate rule sets: the smoothing process that the model tree learner employs. It turns out that using smoothed model trees does not reduce the error in the final rule set's predictions. This may be because smoothing works best for contiguous data, but the separate-and-conquer scheme removes data covered by previous rules, leaving holes in the distribution. Smoothing, if it is done at all, must be performed after the rule set has been generated.

Locally weighted linear regression

An alternative approach to numeric prediction is the method of locally weighted linear regression. With model trees, the tree structure divides the instance space into regions, and a linear model is found for each of them. In effect, the training data determines how the instance space is partitioned. Locally weighted regression, on the other hand, generates local models at prediction time by giving higher weight to instances in the neighborhood of the particular test instance. More specifically, it weights the training instances according to their distance to the test instance and performs a linear regression on the weighted data. Training instances close to the test instance receive a high weight; those far away receive a low one. In other words, a linear model is tailor-made for the particular test instance at hand and used to predict the instance's class value.
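
To make the rule-generation procedure described above concrete, here is a minimal sketch in Python/NumPy. It is not the actual model-tree rule learner: the partial model tree is collapsed to a single variance-minimizing split, there is no pruning or smoothing, and the names best_split, linear_model, learn_rules, and predict are illustrative inventions. It shows only the separate-and-conquer loop, the variance-based choice of which leaf becomes a rule, and the linear models on the rules' right-hand sides.

```python
import numpy as np

def best_split(X, y):
    # Choose the (feature, threshold) pair minimizing the summed
    # within-subset variance -- variance plays the role here that
    # entropy plays in the decision-rule learner of Section 6.2.
    best, best_score = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            if left.sum() < 2 or (~left).sum() < 2:
                continue
            score = left.sum() * y[left].var() + (~left).sum() * y[~left].var()
            if score < best_score:
                best_score, best = score, (j, t)
    return best

def linear_model(X, y):
    # Least-squares fit; this becomes the rule's right-hand side.
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def learn_rules(X, y, min_covered=10):
    # Separate-and-conquer: grow a (here, one-split) partial model tree,
    # turn the lower-variance leaf into a rule, remove the instances it
    # covers, and repeat on the remaining data.
    rules = []
    while len(y) > 2 * min_covered:
        split = best_split(X, y)
        if split is None:
            break
        j, t = split
        left = X[:, j] <= t
        # Lower variance -> shallower subtree -> shorter rule,
        # so the lower-variance leaf is the one turned into a rule.
        if y[left].var() <= y[~left].var():
            covered, cond = left, (j, "<=", t)
        else:
            covered, cond = ~left, (j, ">", t)
        rules.append((cond, linear_model(X[covered], y[covered])))
        X, y = X[~covered], y[~covered]
    rules.append((None, linear_model(X, y)))  # default rule for leftovers
    return rules

def predict(rules, x):
    # Decision list: the first rule whose condition fires makes the
    # prediction with its linear model.
    for cond, coef in rules:
        if cond is None:
            break
        j, op, t = cond
        if (x[j] <= t) if op == "<=" else (x[j] > t):
            break
    return float(np.append(x, 1.0) @ coef)
```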
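
A similar sketch for locally weighted linear regression follows. The Gaussian kernel and the bandwidth parameter tau are common but assumed choices (the text says only that weight should fall off with distance from the test instance), and locally_weighted_predict is an illustrative name, not a library function. Note how the model is fitted fresh at prediction time for each query point.

```python
import numpy as np

def locally_weighted_predict(X, y, x_query, tau=1.0):
    # Weight each training instance by its distance to the query:
    # close instances get weight near 1, distant ones near 0
    # (Gaussian kernel with bandwidth tau -- one possible choice).
    d2 = ((X - x_query) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * tau ** 2))
    # Weighted least squares, solved by scaling rows by sqrt(w):
    # minimizing ||sqrt(w) * (A @ coef - y)||^2 gives the
    # weighted-regression coefficients.
    A = np.column_stack([X, np.ones(len(y))])   # add intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
    # The local linear model predicts the query's class value.
    return float(np.append(x_query, 1.0) @ coef)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X[:, 0]) + rng.normal(0.0, 0.1, size=200)
    # Each query gets its own tailor-made local model:
    print(locally_weighted_predict(X, y, np.array([1.0]), tau=0.5))
```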
