13.07.2015 Views

Boosted Regression (Boosting): An introductory tutorial and a Stata ...


compared to linear regression. Linear regression is a good reference model because many scientists initially fit a linear model. I simulated data from the following model

$$y = 30(x_1 - 0.5)^2 + 2x_2^{-0.5} + x_3 + \varepsilon$$

where $\varepsilon \sim \text{uniform}(0,1)$ and $0 \le x_i \le 1$ for $i \in \{1, 2, 3\}$. To keep things relatively simple, the model has been chosen to be additive without interactions. It is quadratic in $x_1$, nonlinear in $x_2$, and linear with a small slope in $x_3$. The nonlinear contribution of $x_2$ is stronger than the linear contribution of $x_3$ even though their slopes are similar. A fourth variable, $x_4$, is unrelated to the response but is used in the analysis in an attempt to confuse the boosting algorithm. Scatter plots of y versus x1 through x4 are shown in Figure 3.

Figure 3: Scatter plots of y versus x1 through x4

I chose the following parameter values: shrink=0.01, bag=0.5, and maxiter=4000. My rule of thumb in choosing the maximal number of iterations is that the shrinkage factor times the maximal number of iterations should be roughly between 10 and 100 (here, 0.01 × 4000 = 40).
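The simulation model and the iteration rule of thumb can be sketched in a few lines of Python. This is only an illustration, not the code used for the tutorial: the function names `simulate` and `iterations_in_range`, the sample size, the seed, and the choice to draw all four predictors independently from a uniform distribution are assumptions. The text only states that the predictors lie between 0 and 1 and that the noise is uniform(0,1).

```python
import random


def simulate(n, seed=0):
    """Draw n rows (y, x1, x2, x3, x4) from the simulation model in the text.

    Assumptions not stated in the text: the sample size n, the seed, and
    independent uniform draws for every predictor, including x4.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # 1 - rng.random() lies in (0, 1], which avoids x2 = 0,
        # where x2 ** -0.5 would be undefined.
        x1, x2, x3, x4 = (1.0 - rng.random() for _ in range(4))
        eps = rng.random()  # uniform noise, as described in the text
        # Quadratic in x1, nonlinear in x2, linear in x3; x4 plays no role.
        y = 30 * (x1 - 0.5) ** 2 + 2 * x2 ** -0.5 + x3 + eps
        rows.append((y, x1, x2, x3, x4))
    return rows


def iterations_in_range(shrink, maxiter, low=10, high=100):
    """Check the rule of thumb: shrink * maxiter roughly between low and high."""
    return low <= shrink * maxiter <= high


if __name__ == "__main__":
    data = simulate(1000)
    print("rows:", len(data))
    print("rule of thumb satisfied:", iterations_in_range(0.01, 4000))
```

With shrink=0.01 and maxiter=4000 the product is 40, which lies inside the suggested range. A much smaller maxiter at the same shrinkage, such as 100, would fall below it.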
