28.08.2014 Views

An Introduction to Recursive Partitioning Using the ... - Mayo Clinic

An Introduction to Recursive Partitioning Using the ... - Mayo Clinic

An Introduction to Recursive Partitioning Using the ... - Mayo Clinic

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

1=Improved, 2=No change, 3=Worse<br />

X 3 = initial response <strong>to</strong> drugs<br />

1=Improved, 2=No change, 3=Worse<br />

The o<strong>the</strong>r 11 variables did not appear in <strong>the</strong> nal model. This procedure seems<br />

<strong>to</strong> work especially well for variables such asX 1 , where <strong>the</strong>re is a denite ordering,<br />

but spacings are not necessarily equal.<br />

The tree is built by <strong>the</strong> following process: rst <strong>the</strong> single variable is found which<br />

best splits <strong>the</strong> data in<strong>to</strong> two groups (`best' will be dened later). The data is<br />

separated, and <strong>the</strong>n this process is applied separately <strong>to</strong> each sub-group, and so on<br />

recursively until <strong>the</strong> subgroups ei<strong>the</strong>r reach a minimum size (5 for this data) or until<br />

no improvement can be made.<br />

The resultant model is, with certainty, <strong>to</strong>o complex, and <strong>the</strong> question arises as it<br />

does with all stepwise procedures of when <strong>to</strong> s<strong>to</strong>p. The second stage of <strong>the</strong> procedure<br />

consists of using cross-validation <strong>to</strong> trim back <strong>the</strong> full tree. In <strong>the</strong> medical example<br />

above <strong>the</strong> full tree had ten terminal regions. A cross validated estimate of risk was<br />

computed for a nested set of subtrees; this nal model, presented in gure 1, is <strong>the</strong><br />

subtree with <strong>the</strong> lowest estimate of risk.<br />

2 Notation<br />

The partitioning method can be applied <strong>to</strong> many dierent kinds of data. We will<br />

start by looking at <strong>the</strong> classication problem, which isoneof <strong>the</strong> more instructive<br />

cases (but also has <strong>the</strong> most complex equations). The sample population consists<br />

of n observations from C classes. A given model will break <strong>the</strong>se observations in<strong>to</strong><br />

k terminal groups; <strong>to</strong> each of <strong>the</strong>se groups is assigned a predicted class (this will be<br />

<strong>the</strong> response variable). In an actual application, most parameters will be estimated<br />

from <strong>the</strong> data, such estimates are given by formulae.<br />

i i =1; 2; :::; C Prior probabilities of each class.<br />

L(i; j) i =1; 2; :::; C Loss matrix for incorrectly classifying<br />

an i as a j. L(i; i) 0:<br />

A<br />

Some node of <strong>the</strong> tree.<br />

Note that A represents both a set of individuals in<br />

<strong>the</strong> sample data, and, via <strong>the</strong> tree that produced it,<br />

a classication rule for future data.<br />

4

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!