11.07.2015 Views

statisticalrethinkin..


192 6. MODEL SELECTION, COMPARISON, AND AVERAGING

There are generalizations of AIC that address this problem, as well as the informative prior problem more generally. The most common and already in wide use is DIC, the DEVIANCE INFORMATION CRITERION. [81] More general yet is WAIC, known as the WIDELY APPLICABLE INFORMATION CRITERION [82] or rather the WATANABE-AKAIKE INFORMATION CRITERION. [83] We'll consider DIC later in this very chapter, to handle informative priors. WAIC will make an appearance much later, in Chapter 13.

6.3.1.4. Other prediction frameworks. The AIC gambit entails predicting a test sample of the same size and nature as the training sample. This most certainly does not mean AIC can only be used when we plan to predict a sample of the same size as training. For example, AIC approximates some forms of cross-validation. [84]

But AIC's implied prediction task is hardly representative of everything we might wish to do with models. For example, some statisticians prefer to evaluate predictions using a PREQUENTIAL framework, in which models are judged on their accumulated learning error over the training sample. [85] And once you start using multilevel models, "prediction" is no longer uniquely defined, because the test sample can differ from the training sample in ways that forbid use of some of the parameter estimates. We'll worry about that issue in Chapter 13.

Rethinking: Uniformitarianism and AIC. The AIC gambit described on page 190 pulls the test sample from the same process as the training sample. This is a kind of uniformitarian assumption, in which future data are expected to come from the same process as past data and have the same rough range of values. This can cause problems. For example, suppose we fit a regression that predicts height using body weight. The training sample comes from a poor town, in which most people are pretty thin. The relationship between height and weight turns out to be positive and strong. Now also suppose our prediction goal is to guess the heights in another, much wealthier, town. Plugging the weights from the wealthy individuals into the model fit to the poor individuals will predict outrageously tall people. The reason is that, once weight becomes large enough, it has essentially no relationship with height. AIC will not automatically recognize nor solve this problem. Nor will any other isolated procedure. But over repeated rounds of model fitting, attempts at prediction, and model criticism, it is possible to overcome this kind of limitation. As always, statistics is no substitute for science.

6.3.2. Simulating AIC. The gambit that leads to AIC is complicated. So let's walk through a simulation of this procedure. Hirotugu Akaike (1927–2009) himself originally did this entirely with formal mathematics. But we'll use an expedient simulation, so it'll be easier to understand and to modify. This section is not necessary for using AIC. But it may be necessary for understanding it, both with respect to its derivation and its scope. It will also provide an example structure for how to study forecasting for any unique modeling problem you might encounter in the future.

Let's begin by writing a function that implements the Akaike gambit on page 190. Here's the code, followed by explanation.

R code 6.11
sim.train.test
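As a rough illustration of the train-and-test gambit described in this section, here is a minimal sketch in base R. It is written independently of the book's sim.train.test listing and should not be read as the author's code. The function name sim_gambit, the data-generating process (a single true slope of 0.5 with unit noise), the sample size, and the number of replications are all assumptions chosen only for illustration.

```r
set.seed(11)

# Hypothetical helper (NOT the book's sim.train.test): simulate one training
# sample and one test sample of the same size from the same process, fit a
# linear regression to the training sample, and score it on both.
sim_gambit <- function(n = 20, n_pred = 2) {
  # Assumed true process: y depends only on the first predictor.
  make_sample <- function() {
    x <- matrix(rnorm(n * n_pred), nrow = n,
                dimnames = list(NULL, paste0("x", seq_len(n_pred))))
    data.frame(y = rnorm(n, mean = 0.5 * x[, 1], sd = 1), x)
  }
  train <- make_sample()
  test  <- make_sample()

  fit <- lm(y ~ ., data = train)
  # Maximum likelihood estimate of sigma, matching what logLik() uses for lm
  sigma_mle <- sqrt(mean(residuals(fit)^2))

  # Deviance = -2 * log-likelihood, in sample and out of sample
  dev_train <- as.numeric(-2 * logLik(fit))
  dev_test  <- -2 * sum(dnorm(test$y, mean = predict(fit, newdata = test),
                              sd = sigma_mle, log = TRUE))

  # AIC(fit) is dev_train + 2 * (number of coefficients + 1 for sigma)
  c(train = dev_train, test = dev_test, aic = AIC(fit))
}

# Repeat the gambit many times and average each quantity
sims <- replicate(1000, sim_gambit())
rowMeans(sims)
```

Under these assumptions, the average AIC should land closer to the average test deviance than the average training deviance does, although the match is only approximate in samples this small. Changing n or n_pred is a quick way to see how the gap between training and test deviance responds.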
