14.03.2014 Views

Modeling and Multivariate Methods - SAS


Chapter 13: Recursively Partitioning Data


Table 13.2 Validation Methods (Continued)

Validation Column

Uses a column’s values to divide the data into parts. The column is assigned using the Validation role on the Partition launch window. See Table 13.1.

The column’s values determine how the data is split, and what method is used for validation:

• If the column’s values are 0, 1, and 2, then:
  – Rows with 0 are assigned to the Training set
  – Rows with 1 are assigned to the Validation set
  – Rows with 2 are assigned to the Test set
• If the column’s values are 0 and 1, then only Training and Validation sets are used.
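The assignment rule above can be illustrated outside the product with a short Python sketch. This is not the Partition platform's own code: the function name `split_by_validation_column`, the column name `validation`, and the sample rows are all hypothetical. Value patterns other than {0, 1} and {0, 1, 2} are not described in this excerpt, so the sketch simply rejects them.

```python
from collections import defaultdict

# Mapping from validation-column codes to set names, following the
# 0/1/2 convention described in the table above.
SET_NAMES = {0: "Training", 1: "Validation", 2: "Test"}


def split_by_validation_column(rows, column):
    """Group rows into Training/Validation/Test by the code in `column`.

    Assumption: only the two patterns described in the text are handled;
    any other set of codes raises ValueError.
    """
    codes = {row[column] for row in rows}
    if codes not in ({0, 1}, {0, 1, 2}):
        raise ValueError(f"unsupported validation codes: {codes!r}")
    parts = defaultdict(list)
    for row in rows:
        parts[SET_NAMES[row[column]]].append(row)
    return dict(parts)


# Hypothetical data: `y` is a response, `validation` holds the codes.
rows = [
    {"y": 3.1, "validation": 0},
    {"y": 2.7, "validation": 0},
    {"y": 4.0, "validation": 1},
    {"y": 3.6, "validation": 2},
]
parts = split_by_validation_column(rows, "validation")
for name, members in parts.items():
    print(name, len(members))
```

With codes 0 and 1 only, the same function returns just the Training and Validation groups, which matches the second bullet.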

Graphs for Goodness of Fit<br />

The graph for goodness of fit depends on which type of response you use. The Actual by Predicted plot is for continuous responses, and the ROC Curve and Lift Curve are for categorical responses.

Actual by Predicted Plot<br />

For continuous responses, the Actual by Predicted plot shows how well the model fits the data. The predicted value for every row in a leaf is that leaf’s mean response, so the x-coordinates are these leaf means. The actual values form a vertical scatter of points at each leaf mean. A diagonal line marks where predicted and actual values are equal. For a perfect fit, all the points would lie on this diagonal.
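As a rough illustration of how the plotted points arise, the following Python sketch computes the (predicted, actual) coordinates from leaf memberships. It is only a sketch under stated assumptions: the leaf labels, the response values, and the function name `actual_by_predicted` are hypothetical, and no tree is actually fit here.

```python
from statistics import mean


def actual_by_predicted(leaf_ids, actual):
    """Return (predicted, actual) pairs, one per row.

    Each row is predicted with the mean response of its leaf, so the
    x-coordinate is the leaf mean and the y-coordinate is the actual value.
    """
    by_leaf = {}
    for leaf, y in zip(leaf_ids, actual):
        by_leaf.setdefault(leaf, []).append(y)
    leaf_mean = {leaf: mean(ys) for leaf, ys in by_leaf.items()}
    return [(leaf_mean[leaf], y) for leaf, y in zip(leaf_ids, actual)]


# Hypothetical leaf assignments and responses for five rows.
leaf_ids = ["A", "A", "B", "B", "B"]
actual = [1.0, 3.0, 10.0, 12.0, 14.0]

points = actual_by_predicted(leaf_ids, actual)

# Points on the diagonal are rows whose actual value equals the leaf mean.
on_diagonal = [(x, y) for x, y in points if x == y]
print(points)
print(on_diagonal)
```

Plotting `points` as a scatter with the line y = x overlaid reproduces the layout described above: one vertical column of points per leaf, centered on the diagonal.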
