every possible location in the tree is considered for addition, and a node is added according to a performance measure that depends on the particular boosting algorithm employed. However, heuristics can be used instead of an exhaustive search to speed up the learning process.

Logistic model trees

Option trees and alternating trees yield very good classification performance based on a single structure, but they may still be difficult to interpret when there are many option nodes because it becomes hard to see how a particular prediction is derived. However, it turns out that boosting can also be used to build very effective decision trees that do not include any options at all. For example, the LogitBoost algorithm has been used to induce trees with linear logistic regression models at the leaves. These are called logistic model trees and are interpreted in the same way as the model trees for regression described in Section 6.5.

LogitBoost performs additive logistic regression. Suppose that each iteration of the boosting algorithm fits a simple regression function by going through all the attributes, finding the simple regression function with the smallest error, and adding it into the additive model. If the LogitBoost algorithm is run until convergence, the result is a maximum likelihood multiple logistic regression model. However, for optimum performance on future data it is usually unnecessary to wait for convergence, and to do so is often detrimental. An appropriate number of boosting iterations can be determined by estimating the expected performance for a given number of iterations using cross-validation and stopping the process when performance ceases to increase.

A simple extension of this algorithm leads to logistic model trees. The boosting process terminates when there is no further structure in the data that can be modeled using a linear logistic regression function. However, there may still be structure that linear models can fit if attention is restricted to subsets of the data, obtained, for example, by a standard decision tree criterion such as information gain. Then, once no further improvement can be obtained by adding more simple linear models, the data is split and boosting is resumed separately in each subset. This process takes the logistic model generated so far and refines it separately for the data in each subset. Again, cross-validation is run in each subset to determine an appropriate number of iterations to perform there.

The process is applied recursively until the subsets become too small. The resulting tree will surely overfit the training data, and one of the standard methods of decision tree learning can be used to prune it. Experiments indicate that the pruning operation is very important: using a strategy that chooses the right tree size by cross-validation, the algorithm produces small but very accurate trees with linear logistic models at the leaves.
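To make the additive fitting step concrete, here is a minimal sketch of two-class LogitBoost using simple one-attribute linear regressions as the base learners, as described above. It assumes NumPy, numeric attributes, and class labels coded as a 0/1 integer array; the function names are illustrative, not Weka's API.

import numpy as np

def fit_simple_regression(x, z, w):
    """Weighted least-squares fit of z ~ a + b*x for a single attribute.
    Returns the intercept, slope, and weighted squared error."""
    w_sum = w.sum()
    x_mean = (w * x).sum() / w_sum
    z_mean = (w * z).sum() / w_sum
    var = (w * (x - x_mean) ** 2).sum()
    b = (w * (x - x_mean) * (z - z_mean)).sum() / var if var > 0 else 0.0
    a = z_mean - b * x_mean
    err = (w * (z - (a + b * x)) ** 2).sum()
    return a, b, err

def logitboost(X, y, n_iter=50):
    """Two-class LogitBoost: each iteration adds the simple regression
    function (over a single attribute) with the smallest weighted error."""
    n, d = X.shape
    F = np.zeros(n)                              # output of the additive model
    model = []                                   # committee of (attribute, intercept, slope)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-2.0 * F))       # current estimate of P(y=1 | x)
        w = np.clip(p * (1.0 - p), 1e-10, None)  # working weights
        z = np.clip((y - p) / w, -4.0, 4.0)      # working responses, clipped for stability
        # search all attributes for the simple regression with the smallest error
        a, b, _, j = min((fit_simple_regression(X[:, j], z, w) + (j,)
                          for j in range(d)), key=lambda t: t[2])
        F += 0.5 * (a + b * X[:, j])             # add it into the additive model
        model.append((j, a, b))
    return model

def predict_proba(model, X):
    """Probability of class 1 under the fitted additive logistic model."""
    F = np.zeros(X.shape[0])
    for j, a, b in model:
        F += 0.5 * (a + b * X[:, j])
    return 1.0 / (1.0 + np.exp(-2.0 * F))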
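The cross-validation-based stopping rule can be sketched as follows, reusing logitboost from above. Each training fold is boosted for max_iter iterations, the held-out error of every prefix of the committee is accumulated, and the iteration count with the lowest pooled error is returned. This simple version picks the argmin rather than literally halting when improvement ceases; choose_iterations is an illustrative name.

def choose_iterations(X, y, max_iter=200, folds=5, seed=0):
    """Estimate held-out error after each boosting iteration via
    cross-validation and return the best iteration count."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errors = np.zeros(max_iter)
    for k in range(folds):
        test = idx[k::folds]                     # every folds-th example held out
        train = np.setdiff1d(idx, test)
        model = logitboost(X[train], y[train], n_iter=max_iter)
        F = np.zeros(len(test))
        for m, (j, a, b) in enumerate(model):    # evaluate each committee prefix
            F += 0.5 * (a + b * X[test, j])
            errors[m] += ((F > 0).astype(int) != y[test]).mean()
    return int(np.argmin(errors)) + 1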
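Finally, a compact sketch of the recursive tree-growing step, again reusing fit_simple_regression from above. Each node resumes boosting from the additive model inherited from its parent, splits on the attribute and threshold with the highest information gain, and recurses until the subsets become too small. Pruning, which the text stresses is essential, and the per-node cross-validation are omitted for brevity, so a fixed n_iter stands in for the cross-validated count.

def entropy(y):
    """Entropy of a 0/1 label array, in bits."""
    p = np.bincount(y, minlength=2) / len(y)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def best_split(X, y):
    """Choose the (attribute, threshold) pair with the highest information gain."""
    base, n = entropy(y), len(y)
    best_gain, best = 0.0, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:        # candidate thresholds
            left = X[:, j] <= t
            nl = left.sum()
            gain = base - (nl * entropy(y[left]) + (n - nl) * entropy(y[~left])) / n
            if gain > best_gain:
                best_gain, best = gain, (j, t)
    return best

def boost_node(X, y, F, n_iter):
    """Resume LogitBoost from the additive model output F inherited
    from the parent node, refining it on this node's subset only."""
    stages, F = [], F.copy()
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-2.0 * F))
        w = np.clip(p * (1.0 - p), 1e-10, None)
        z = np.clip((y - p) / w, -4.0, 4.0)
        a, b, _, j = min((fit_simple_regression(X[:, j], z, w) + (j,)
                          for j in range(X.shape[1])), key=lambda t: t[2])
        F += 0.5 * (a + b * X[:, j])
        stages.append((j, a, b))
    return stages, F

def grow_lmt(X, y, F=None, min_size=30, n_iter=10):
    """Grow a logistic model tree: boost, split, and recurse (no pruning)."""
    if F is None:
        F = np.zeros(len(y))
    stages, F = boost_node(X, y, F, n_iter)      # n_iter: CV-chosen in the full algorithm
    node = {"stages": stages, "split": None, "children": None}
    if len(y) < min_size or len(np.unique(y)) < 2:
        return node                              # subset too small, or already pure
    split = best_split(X, y)
    if split is None:
        return node
    j, t = split
    left = X[:, j] <= t
    node["split"] = (j, t)
    node["children"] = (grow_lmt(X[left], y[left], F[left], min_size, n_iter),
                        grow_lmt(X[~left], y[~left], F[~left], min_size, n_iter))
    return node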
