11.07.2015 Views

statisticalrethinkin..


136 5. MULTIVARIATE LINEAR MODELS

You can interpret these estimates as saying:

    Once we know median age at marriage for a State, there is little or no additional predictive power in also knowing the rate of marriage in that State.

Note that this does not mean that there is no value in knowing marriage rate. If you didn’t have access to age-at-marriage data, then you’d definitely find value in knowing the marriage rate.

But how did the model achieve this result? To answer that question, we’ll draw some pictures.

5.1.3. Plotting multivariate posteriors. Visualizing posterior distribution in simple bivariate regressions (the previous chapter) is easy. There’s only one predictor variable, so a single scatterplot can convey a lot of information. And so in the previous chapter we used scatters of the data. Then we overlaid regression lines and intervals to both (1) visualize the size of the association between the predictor and outcome and (2) to get a crude sense of the ability of the model to predict the individual observations.

With multivariate regression, you’ll need more plots. There is a huge literature detailing a variety of plotting techniques that all attempt to help one understand multiple linear regression. None of these techniques is suitable for all jobs, and most do not generalize beyond linear regression. So the approach I take here is to instead help you compute whatever you need from the model. I offer three types of interpretive plots:

(1) Predictor residual plots. These plots show the outcome against residual predictor values.
(2) Counterfactual plots. These show the implied predictions for imaginary experiments in which the different predictor variables can be changed independently of one another.
(3) Posterior prediction plots. These show model-based predictions against raw data, or otherwise display the error in prediction.

Each of these plot types has its advantages and deficiencies, depending upon the context and the question of interest. In the rest of this section, I show you how to manufacture each of these in the context of the divorce data.

5.1.3.1. Predictor residual plots. A predictor variable residual is the average prediction error when we use all of the other predictor variables to model a predictor of interest. That’s a complicated concept, so we’ll go straight to the example, where it will make sense. The benefit of computing these things is that, once plotted against the outcome, we have a bivariate regression of sorts that has already “controlled” for all of the other predictor variables. It just leaves in the variation that is not expected by the model of the mean, µ, as a function of the other predictors.

In our multivariate model of divorce rate, we have two predictors: (1) marriage rate (Marriage.s) and (2) median age at marriage (MedianAgeMarriage.s). To compute predictor residuals for either, we just use the other predictor to model it. So for marriage rate,
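
As a rough illustration of the predictor-residual idea described above, here is a minimal Python sketch. It is not the model specification from the text: it swaps in an ordinary least-squares line for the model fit, and the six data points are invented toy values, not the divorce data. The function names (`standardize`, `fit_line`, `predictor_residuals`) are hypothetical helpers introduced only for this example.

```python
from statistics import mean


def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m = mean(values)
    sd = (sum((v - m) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return [(v - m) / sd for v in values]


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y ~ a + b * x.

    Assumption: OLS stands in for the model fit here, purely for illustration.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predictor_residuals(target, other):
    """Residuals of `target` after using `other` to predict it."""
    a, b = fit_line(other, target)
    return [t - (a + b * o) for t, o in zip(target, other)]


# Invented toy values (NOT the divorce data), one entry per hypothetical State.
median_age = [25.3, 25.2, 25.8, 24.3, 26.8, 25.7]
marriage_rate = [20.2, 26.0, 20.3, 26.4, 19.1, 23.5]

age_s = standardize(median_age)      # plays the role of MedianAgeMarriage.s
rate_s = standardize(marriage_rate)  # plays the role of Marriage.s

# Marriage-rate residuals: the part of marriage rate that median age
# at marriage does not predict.
resid = predictor_residuals(rate_s, age_s)
```

A positive residual marks a State whose marriage rate is higher than expected given its median age at marriage, and a negative residual marks one whose rate is lower than expected. Plotting the outcome against `resid` gives the kind of predictor residual plot listed as type (1).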
