11.07.2015 Views

statisticalrethinkin..

5.3. WHEN ADDING VARIABLES HURTS

            bl ~ dnorm( 0 , 1 ) ,
            sigma ~ dunif( 0 , 10 )
        ) ,
        data=d ,
        start=list(a=mean(d$kcal.per.g),bf=0,bl=0,sigma=sd(d$kcal.per.g)) )
    precis( m5.12 , digits=3 )

            Mean  StdDev    2.5%   97.5%
    a      1.007   0.200   0.615   1.399
    bf     0.002   0.002  -0.003   0.007
    bl    -0.009   0.002  -0.013  -0.004
    sigma  0.061   0.008   0.045   0.077

Now the posterior means of both bf and bl are closer to zero. And the standard deviations for both parameters are twice as large as in the bivariate models (m5.10 and m5.11). In the case of percent fat, the posterior mean is essentially zero.

What has happened here? This is the same phenomenon as in the leg length example. What has happened is that the variables perc.fat and perc.lactose contain much of the same information. They are substitutes for one another. As a result, when you include both in a regression, the posterior distribution ends up describing a long ridge of combinations of bf and bl that are equally plausible.

In the case of the fat and lactose, these two variables form essentially a single axis of variation. The easiest way to see this is to use a pairs plot:

R code 5.37

    pairs( ~ kcal.per.g + perc.fat + perc.lactose ,
        data=d , col=rangi2 )

FIGURE 5.10. A pairs plot of the total energy, percent fat, and percent lactose variables from the primate milk data. Percent fat and percent lactose are strongly negatively correlated with one another, providing mostly the same information. (Panels: kcal.per.g, perc.fat, perc.lactose.)
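
The effect described above can be reproduced with simulated data. The sketch below is an illustration only. It does not use the primate milk data. The variables x_fat, x_lac, and y are hypothetical stand-ins, and base R's lm() is swapped in for the model-fitting call used in the text. It builds two strongly negatively correlated predictors, then compares the standard error of one slope when it is fit alone and when both predictors are included.

```r
# Simulated stand-ins, NOT the real primate milk measurements (assumption).
set.seed(5)
n <- 200
x_fat <- rnorm(n, mean = 35, sd = 10)
# Second predictor is nearly a linear function of the first,
# so the two are strongly negatively correlated.
x_lac <- 80 - 0.9 * x_fat + rnorm(n, sd = 2)
# Outcome depends on the single shared axis of variation.
y <- 0.3 + 0.01 * x_fat + rnorm(n, sd = 0.08)

# Correlation between the two predictors
r <- cor(x_fat, x_lac)

# Least-squares fits stand in for the fits in the text (assumption).
fit_fat  <- lm(y ~ x_fat)
fit_both <- lm(y ~ x_fat + x_lac)

# Standard error of the x_fat slope, alone and with x_lac included
se_fat_alone <- summary(fit_fat)$coefficients["x_fat", "Std. Error"]
se_fat_both  <- summary(fit_both)$coefficients["x_fat", "Std. Error"]

# Correlation between the two slope estimates: a value near 1 or -1
# indicates a long ridge of nearly equally good slope combinations.
slope_cor <- cov2cor(vcov(fit_both))["x_fat", "x_lac"]

print(r)
print(c(alone = se_fat_alone, both = se_fat_both))
print(slope_cor)
```

With predictors this strongly correlated, the slope for x_fat should be estimated much less precisely once x_lac enters the model, and the two slope estimates should themselves be strongly correlated. That is the ridge of equally plausible combinations in another form.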
