11.07.2015 Views

statisticalrethinkin..


3.3. SAMPLING TO SIMULATE PREDICTION

[Figure 3.7: two histograms of simulated frequency (0 to 3000). Left panel: longest run length (2 to 8). Right panel: number of switches (0 to 8).]

FIGURE 3.7. Alternative views of the same posterior predictive distribution (see FIGURE 3.6). Instead of considering the data as the model saw it, as a sum of water samples, now we view the data as both the length of the maximum run of water or land (left) and the number of switches between water and land samples (right). Observed values highlighted in each. While the simulated predictions are consistent with the run length (3 water in a row), they are much less consistent with the frequent switches (6 switches in 9 tosses).

tosses, and indeed skilled individuals can influence the outcome of a coin toss, by exploiting the physics of it.[51]

So with the goal of seeking out aspects of prediction in which the model fails, let's look at the data in two different ways. Recall that the sequence of nine tosses was W L W W W L W L W. First, consider the length of the longest run of either water or land. This will provide a crude measure of correlation between tosses. So in the observed data, the longest run is 3 W's. Second, consider the number of times in the data that the sample switches from water to land or from land to water. This is another measure of correlation between samples. In the observed data, the number of switches is 6. There is nothing special about these two new ways of describing the data. They just serve to inspect the data in new ways. In your own modeling, you'll have to imagine aspects of the data that are relevant in your context, for your purposes.

FIGURE 3.7 shows the simulated predictions, viewed in these two new ways. On the left, the length of the longest run of water or land is plotted, with the observed value of 3 highlighted by the bold line. Again, the true observation is the most common simulated observation, but with a lot of spread around it. On the right, the number of switches from water to land and land to water is shown, with the observed value of 6 highlighted in bold. Now the simulated predictions appear less consistent with the data, as the majority of simulated observations have fewer switches than were observed in the actual sample. This is consistent with a lack of independence between tosses of the globe, in which each toss is negatively correlated with the last.
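The two summaries described above can be sketched in a few lines of Python. This is only an illustration, not the text's own code: the function names `longest_run`, `count_switches`, and `simulate_sequence` are hypothetical, the 10,000 simulated sequences are an arbitrary choice, and the posterior for the proportion of water is assumed to be Beta(7, 4), which is what a flat prior combined with 6 water in 9 tosses would give. Each simulated sequence draws one proportion from that posterior and then tosses independently, which is the independence assumption being checked.

```python
import random
from itertools import groupby


def longest_run(seq):
    """Length of the longest run of identical consecutive outcomes."""
    return max(len(list(group)) for _, group in groupby(seq))


def count_switches(seq):
    """Number of adjacent pairs whose outcomes differ."""
    return sum(a != b for a, b in zip(seq, seq[1:]))


def simulate_sequence(rng, n_tosses=9, a=7, b=4):
    """One posterior predictive sequence of independent tosses.

    Assumption: the posterior for the proportion of water is Beta(a, b),
    i.e. a flat prior updated with 6 water in 9 tosses.
    """
    p = rng.betavariate(a, b)
    return ["W" if rng.random() < p else "L" for _ in range(n_tosses)]


observed = "WLWWWLWLW"
obs_run = longest_run(observed)          # 3, as stated in the text
obs_switches = count_switches(observed)  # 6, as stated in the text

rng = random.Random(2015)  # fixed seed so the sketch is repeatable
sims = [simulate_sequence(rng) for _ in range(10_000)]  # hypothetical count
sim_runs = [longest_run(s) for s in sims]
sim_switches = [count_switches(s) for s in sims]

# Share of simulated sequences with fewer switches than the observed 6.
share_fewer = sum(s < obs_switches for s in sim_switches) / len(sims)
print("observed longest run:", obs_run)
print("observed switches:", obs_switches)
print("share of simulations with fewer switches:", round(share_fewer, 3))
```

Tabulating `sim_runs` and `sim_switches` gives the kind of histograms that FIGURE 3.7 displays, and `share_fewer` puts a number on how often the simulations switch less often than the observed sequence did.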
