13.07.2015 Views

Boosted Regression (Boosting): An introductory tutorial and a Stata ...


has a lower value. Both values are much lower than the value obtained by boosted logistic regression (test R² = 0.27).

Because there are only two response values (0 and 1), I use a different plot for calibration than the scatter plot shown in Figure 5. If the predicted values are accurate, one would expect that the predicted values are roughly the same as the fraction of response values classified as “1” that give rise to a given predicted value. The fraction of response values classified as “1” can be estimated by averaging or smoothing over response values with similar predictions. In Stata I use a lowess smoother to compare the predictions from the boosted logistic regression and the linear logistic regression:

twoway (lowess y logit_pred, bwidth(0.2)) (lowess y boost_pred, bwidth(0.2)) (lfit straight y), xtitle("Actual Values") legend(label(1 "Logistic Regression") label(2 "Boosting") label(3 "Fitted Values=Actual Values")) xsize(4) ysize(4)

Calibration plots for the test data are shown in Figure 8. The near horizontal line for logistic regression in the test calibration plot implies that logistic regression classifies 50% of the observations correctly regardless of the actual predicted value. The logistic regression model does not generalize well.
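The idea of averaging over response values with similar predictions can also be checked numerically rather than graphically. The following is a minimal sketch, not part of the original analysis: it simulates its own data, and the variable names pred, bin, mean_pred and frac_ones are hypothetical. It groups predictions into ten equal-sized bins and compares the mean prediction in each bin with the observed fraction of responses equal to 1.

```stata
* Sketch only: simulated data, all variable names are hypothetical
clear
set seed 12345
set obs 1000

* Stand-in for predicted probabilities from some fitted model
generate pred = runiform()

* Binary response that is well calibrated by construction: P(y = 1) = pred
generate y = runiform() < pred

* Group observations with similar predictions into 10 equal-sized bins
xtile bin = pred, nquantiles(10)

* Within each bin, average the predictions and the 0/1 responses
collapse (mean) mean_pred=pred frac_ones=y, by(bin)

* For calibrated predictions the two columns should be close in every bin
list bin mean_pred frac_ones, clean
```

With real data, one would presumably replace the simulated pred with a prediction variable such as boost_pred or logit_pred and keep the observed y. A flat frac_ones column across bins would correspond to the near horizontal lowess line described above.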
