13.07.2015

Boosted Regression (Boosting): An introductory tutorial and a Stata ...



[Figure 4 plot: vertical axis "R squared (on a test data set)", ticks from .935 to .938; horizontal axis "Number of interactions", 1 to 5.]

Figure 4: Scatter plot of the R² (computed on a test data set) versus the number of interactions. Note the scale on the vertical axis.

I often want to confirm that this model indeed works better than linear regression. Directly comparing the R² value for the boosted regression and the linear regression is not a fair comparison: the boosted regression R² is computed on a test data set, whereas the linear regression R² is computed on the training data set. Using equation (1), it is possible to compute an R² on a test data set for the linear regression. For the following set of Stata commands I assume that the first trainn observations in the data set constitute the training data and the remainder the test data. The predictions from the linear regression (or any other predictions) are denoted regress_pred.
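As a minimal sketch of that test-set calculation, assume equation (1) has the usual form R² = (Var(y) - MSE) / Var(y), with both quantities evaluated on the test observations only. The response name y, the value assigned to trainn, and the helper variable sq_err are illustrative assumptions and are not taken from the author's listing.

```stata
* Sketch only. Assumes y (response) and regress_pred (predictions) already exist,
* and that the first trainn observations are the training data.
local trainn = 800                       // hypothetical training-set size

* Squared prediction error, defined for the test observations only
generate double sq_err = (y - regress_pred)^2 if _n > `trainn'

* Mean squared error on the test set (summarize skips the missing training rows)
quietly summarize sq_err
local mse = r(mean)

* Variance of the response on the test set
quietly summarize y if _n > `trainn'
local var_y = r(Var)

* Test-set R-squared: share of the test variance explained by the predictions
display "test R2 = " (`var_y' - `mse') / `var_y'
```

Note that r(Var) divides by n - 1 while the mean squared error divides by n, so the two terms differ slightly in their denominators; with a reasonably large test set the effect on the result is negligible. Because regress_pred can hold any predictions, the same lines apply to other models as well.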
