…previously observed 100 heights with mean value 165. That's a pretty strong prior. In contrast, the former Normal(165, 10) prior implies n = 1/10² = 0.01, a hundredth of an observation. This is an extremely weak prior. But of course exactly how strong or weak either prior is will depend upon how much data is used to update it.

4.3.6. Sampling from a map fit. The above explains how to get a MAP quadratic approximation of the posterior, using map. But how do you then get samples from the quadratic approximate posterior distribution? The answer is rather simple, but non-obvious, and it requires recognizing that a quadratic approximation to a posterior distribution with more than one parameter dimension (µ and σ each contribute one dimension) is just a multi-dimensional Gaussian distribution.

As a consequence, when R constructs a quadratic approximation, it calculates not only standard deviations for all parameters, but also the covariances among all pairs of parameters. Just as a mean and standard deviation (or its square, a variance) are sufficient to describe a one-dimensional Gaussian distribution, a list of means and a matrix of variances and covariances are sufficient to describe a multi-dimensional Gaussian distribution. To see this matrix of variances and covariances for model m4.1, use:

R code 4.29
vcov( m4.1 )

                mu        sigma
mu    0.1695256052 0.0003866944
sigma 0.0003866944 0.0849067256

The above is a VARIANCE-COVARIANCE matrix. It is the multi-dimensional glue of a quadratic approximation, because it tells us how each parameter relates to every other parameter in the posterior distribution. A variance-covariance matrix can be factored into two elements: (1) a vector of variances for the parameters and (2) a correlation matrix that tells us how changes in any parameter lead to correlated changes in the others. This decomposition is usually easier to understand. So let's do that now:

R code 4.30
diag( vcov( m4.1 ) )
cov2cor( vcov( m4.1 ) )

        mu      sigma
0.16952561 0.08490673

              mu      sigma
mu    1.00000000 0.00322314
sigma 0.00322314 1.00000000

The two-element vector in the output is the list of variances. If you take the square root of this vector, you get the standard deviations that are shown in precis output. The two-by-two matrix in the output is the correlation matrix. Each entry shows the correlation, bounded between −1 and +1, for each pair of parameters. The 1's indicate a parameter's correlation with itself. If these values were anything except 1, we would be worried. The other entries are typically closer to zero, and they are very close to zero in this example. This indicates that learning µ tells us nothing about σ, and likewise that learning σ tells us nothing about µ.
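As a quick check of that claim about precis, you can recover the standard deviations directly from the variance-covariance matrix. This snippet is an illustrative addition, not one of the book's numbered code boxes:

# square roots of the variances are the parameters' standard deviations;
# these should match the StdDev column of precis( m4.1 )
sqrt( diag( vcov( m4.1 ) ) )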
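And to answer the question that opened this subsection: with a vector of means and the variance-covariance matrix in hand, sampling from the quadratic approximation is just drawing from a multivariate Gaussian. Here is a minimal sketch, assuming the fit m4.1 also exposes its MAP estimates through coef() (only vcov() appears above, so treat that as an assumption), using mvrnorm from the MASS package:

library(MASS)   # provides mvrnorm, a multivariate normal sampler

# draw 10,000 joint samples of ( mu , sigma ) from the quadratic
# approximation; coef(m4.1) is assumed to return the MAP estimates
post <- mvrnorm( n=1e4 , mu=coef(m4.1) , Sigma=vcov(m4.1) )
head(post)

Each row of post is a joint sample of µ and σ, so any summary computed from these rows automatically respects the (here negligible) correlation between the two parameters.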
