11.07.2015 Views

statisticalrethinkin..


184 6. MODEL SELECTION, COMPARISON, AND AVERAGING

reflected in any reasonable measure of distance from the target, because by adding another type of event, the target has gotten harder to hit.

It’s like taking a two-dimensional archery bullseye and forcing the archer to hit the target at the right time (a third dimension) as well. Now the possible distance between the best archer and the worst archer has grown, because there’s another way to miss. And with another way to miss, one might also say that there is another way for an archer to impress. As the potential distance between the target and the shot increases, so too does the potential improvement and ability of a talented archer to impress us.

Rethinking: What is a true model? It’s hard to define “true” probabilities, because all models are false. So what does “truth” mean in this context? It means the right probabilities, given our state of ignorance. Our state of ignorance is described by the model. The probability is in the model, not in the world. If we had all of the information relevant to producing a forecast, then rain or sun would be deterministic, and the “true” probabilities would be just 0’s and 1’s. Absent some relevant information, as in all modeling, outcomes in the small world are uncertain, even though they remain perfectly deterministic in the large world. Because of our ignorance, we can have “true” probabilities between zero and one.

An example might help. Suppose you toss the globe, as in Chapter 2. Before you catch it, the outcome is uncertain. There is a “true” probability of observing water, conditional on our assumed model. But if we had enough information about the globe toss (initial conditions, angular momentum vector, and such) then the outcome would be knowable with certainty. No two tosses are ever exactly alike, so the “true” probability of observing water must average over the unknown differences to describe the relative plausibility of water compared to land.
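The globe-toss argument can be illustrated with a small simulation. This is only a sketch under loud assumptions: the function names, the one-dimensional "globe", and the arbitrary multiplier standing in for the physics are all hypothetical and do not come from the text. The only number taken from the text is the 70% water figure. Each toss is fully deterministic given its initial condition, yet averaging over initial conditions we do not observe yields a probability strictly between zero and one.

```python
import random

WATER_FRACTION = 0.7  # proportion of water; the 70% figure quoted in the text


def toss_outcome(spin: float) -> str:
    """Toy deterministic 'physics' (an assumption, not real mechanics).

    The globe's surface is flattened to the interval [0, 1), with the first
    70% counted as water. The same spin always gives the same outcome.
    The multiplier 1000.0 is arbitrary; it only makes tiny differences in
    spin change where the globe lands.
    """
    landing_point = (spin * 1000.0) % 1.0
    return "W" if landing_point < WATER_FRACTION else "L"


def estimate_water_probability(n_tosses: int, seed: int = 1) -> float:
    """Average outcomes over initial conditions that we treat as unknown."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    outcomes = [toss_outcome(rng.random()) for _ in range(n_tosses)]
    return outcomes.count("W") / n_tosses


if __name__ == "__main__":
    # Knowing the spin exactly removes all uncertainty about a single toss.
    print(toss_outcome(0.1234), toss_outcome(0.1234))
    # Ignorance of the spin leaves a proportion close to WATER_FRACTION.
    print(estimate_water_probability(100_000))
```

With a large number of tosses, the estimated proportion should sit close to 0.7, even though no individual toss in the sketch is random once its spin is known.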
There is a right answer to the question this model poses: 70% water. It’s our ignorance of the physics of the globe toss that leads us to use it as a way to estimate the amount of water on the surface.

6.2.2. Information and uncertainty. One solution to the problem of how to measure distance of a model’s accuracy from a target was provided in the late 1940s.[76] Originally applied to problems in communication of messages, such as telegraph, the field of INFORMATION THEORY is now important across the basic and applied sciences, and it has deep connections to Bayesian inference. And like many successful fields, information theory has spawned a large number of bogus applications, as well.[77]

The basic insight is to ask: how much is our uncertainty reduced by learning an outcome? Consider the weather forecasts again. Forecasts are issued in advance and the weather is uncertain. When the actual day arrives, the weather is no longer uncertain. The reduction in uncertainty is then a natural measure of how much we have learned, how much “information” we derive from observing the outcome. So if we can develop a precise definition of “uncertainty,” we can provide a baseline measure of how hard it is to predict, as well as how much improvement is possible. The measured decrease in uncertainty is the definition of information in this context.

Information: The reduction in uncertainty derived from learning an outcome.

To use this definition, what we need is a principled way to quantify the uncertainty inherent in a probability distribution. So suppose again that there are two possible weather events on any particular day: either it is sunny or it is rainy. Each of these events occurs with some probability, and these probabilities add up to one. What we want is a function that uses
