11.07.2015 Views

statisticalrethinkin..

6.4. DEVIANCE INFORMATION CRITERION

[Figure 6.7: two panels, (a) and (b), each plotting deviance (vertical axis) against mu (horizontal axis).]

FIGURE 6.7. How DIC estimates the overfitting penalty. In each plot, samples from the posterior density of $\mu$, mu, are shown against deviance at each value of $\mu$. The vertical black dashed line marks the posterior mean, $E\,\mu$. The horizontal black dashed line marks its deviance, $\hat{D}$. The blue dashed line marks the average deviance, $\bar{D}$. The difference $\bar{D} - \hat{D}$ estimates the effective number of parameters. This difference is shown by the gray line segment in the right margin of each plot, labeled by its value. (a) A flat prior on $\mu$, resulting in $\bar{D} - \hat{D} \approx 1$. (b) An informative $\mu \sim \text{Normal}(0, 1)$ prior, resulting in $\bar{D} - \hat{D} \approx 0.5$.

The informative prior has this effect because priors act like previously observed data. In m6.10, the prior is equivalent to exactly one previous observation, located at $y_i = 0$. Go ahead and fit a model to one observation at $y_i = 0$, using a flat prior, and you'll see that the posterior density of $\mu$ will have mean 0 and standard error 1. So m6.10 updates that posterior in light of another observation at $y_i = 1$. Since the prior and the sample are equally strong, the posterior mean ends up right between them, at $\mu = 0.5$. And likewise, since the prior and sample influence the posterior equally, the effective number of parameters is exactly half the number of parameters. In terms of the distance $\bar{D} - \hat{D}$, the prior effectively doubles the sample size, concentrating the posterior more around its mean. If the posterior is more concentrated, then the average deviance (blue dashed line) must be closer to the deviance of the mean (black dashed line).

Typically, you'll work with much larger samples, so any informative prior won't be nearly as influential as in the pedagogical example in FIGURE 6.7. But the phenomenon is quite general. Informative priors reduce the flexibility of the model, that is, how freely it can encode the sample.
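
The two-panel example can be reproduced numerically. What follows is a minimal sketch in Python (not the book's own R code), under assumptions the text implies but does not spell out as code: a single observation at y = 1, a Normal likelihood with known standard deviation 1, and the conjugate normal posterior for mu. The function names `deviance`, `posterior`, and `dic_penalty` are hypothetical helpers invented for this illustration.

```python
import math
import random


def deviance(y, mu, sigma=1.0):
    """Deviance, -2 * log-likelihood, of one observation y under Normal(mu, sigma)."""
    return math.log(2 * math.pi * sigma**2) + ((y - mu) / sigma) ** 2


def posterior(y, prior_mean=None, prior_sd=None, sigma=1.0):
    """Posterior (mean, sd) of mu after one observation y (conjugate normal update).

    prior_sd=None stands for a flat prior, so the posterior is Normal(y, sigma).
    """
    if prior_sd is None:
        return y, sigma
    precision = 1 / prior_sd**2 + 1 / sigma**2
    mean = (prior_mean / prior_sd**2 + y / sigma**2) / precision
    return mean, math.sqrt(1 / precision)


def dic_penalty(y, post_mean, post_sd, n_samples=100_000, seed=1):
    """Estimate D-bar minus D-hat from posterior samples of mu."""
    rng = random.Random(seed)
    samples = [rng.gauss(post_mean, post_sd) for _ in range(n_samples)]
    # D-bar: deviance averaged over the posterior samples
    d_bar = sum(deviance(y, m) for m in samples) / n_samples
    # D-hat: deviance evaluated at the posterior mean of mu
    d_hat = deviance(y, sum(samples) / n_samples)
    return d_bar - d_hat


y = 1.0  # assumed single observation, as in the text's example
p_flat = dic_penalty(y, *posterior(y))
p_informative = dic_penalty(y, *posterior(y, prior_mean=0.0, prior_sd=1.0))

print(f"flat prior:        D-bar - D-hat = {p_flat:.2f}")
print(f"Normal(0,1) prior: D-bar - D-hat = {p_informative:.2f}")
```

Under these assumptions the two estimates should land close to 1 and 0.5, matching panels (a) and (b): the informative prior halves the posterior variance of mu, and in this simple model the difference between average deviance and deviance at the mean is exactly that variance.
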
Reduced flexibility tends to mean less risk of overfitting.

To see how well DIC does at estimating $D_{\text{test}}$, let's repeat the simulation strategy we developed for AIC (FIGURE 6.6, panels c and d). Now, however, we'll use Normal(0, 1) priors on every regression slope (but not the intercept) in the multivariate models. So the models
