182 6. MODEL SELECTION, COMPARISON, AND AVERAGING

Rethinking: Bias and variance. The underfitting/overfitting dichotomy is often described as the BIAS-VARIANCE TRADEOFF.⁷⁴ “Bias” here refers to underfitting, while “variance” refers to overfitting. These terms are confusing, though, because they are used in many different ways in different contexts, even within statistics. The term “bias” also sounds like a bad thing, even though increasing bias often leads to better predictions. For these reasons, this book prefers underfitting/overfitting, but you should expect to see the same concepts discussed as bias/variance.

6.2. Information theory and model performance

So how do we navigate between the hydra of overfitting and the vortex of underfitting? Many common approaches begin by defining a precise criterion of model performance. What do you want the model to do well at? It can be hard to define “fit” in even a simple, uncontroversial way, however. This is because accuracy depends upon the definition of the target, and there is no unique best target.

In defining a target, there are two major dimensions to worry about:

(1) Cost-benefit analysis. How much does it cost when we’re wrong? How much do we win when we’re right? Most scientists never ask these questions in any formal way, but applied scientists must routinely answer them.

(2) Accuracy in context. Some prediction tasks are inherently easier than others. So even if we ignore costs and benefits, we still need a way to judge “accuracy” that accounts for how much a model could possibly improve prediction.

It will help to explore these two dimensions in an example.

6.2.1. Firing the weatherperson. Suppose in a certain city, a certain weatherperson issues uncertain predictions for rain or shine on each day of the year.⁷⁵ The predictions are in the form of probabilities of rain. The currently employed weatherperson predicted these chances of rain over a ten-day sequence, with the actual outcomes shown below each prediction:

Day         1     2     3     4     5     6     7     8     9     10
Prediction  1     1     1     0.6   0.6   0.6   0.6   0.6   0.6   0.6
Observed    rain  rain  rain  sun   sun   sun   sun   sun   sun   sun

A newcomer rolls into town, and this newcomer boasts that he can best the current weatherperson by always predicting sunshine. Over the same ten-day period, the newcomer’s record would be:

Day         1     2     3     4     5     6     7     8     9     10
Prediction  0     0     0     0     0     0     0     0     0     0
Observed    rain  rain  rain  sun   sun   sun   sun   sun   sun   sun

“So by rate of correct prediction alone,” the newcomer announces, “I’m the best person for the job.”

The newcomer is right. Define hit rate as the average chance of a correct prediction. On a rainy day, the chance of a correct prediction is the predicted probability of rain; on a sunny day, it is one minus that probability. So the current weatherperson gets 3 × 1 + 7 × 0.4 = 5.8 hits in 10 days, for a rate of 5.8/10 = 0.58 correct predictions per day. In contrast, the newcomer gets 3 × 0 + 7 × 1 = 7 hits, for a rate of 7/10 = 0.7 correct predictions per day. The newcomer wins.
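
The hit-rate arithmetic can be checked with a short Python sketch. It is only an illustration, not code from the book: the function name `hit_rate` is hypothetical, and the observed outcomes (rain on days 1 to 3, sunshine on days 4 to 10) are an assumption inferred from the sums 3 × 1 + 7 × 0.4 and 3 × 0 + 7 × 1 in the text.

```python
def hit_rate(p_rain, rained):
    """Average probability assigned to the outcome that actually occurred."""
    # On a rainy day the chance of being right is p; on a sunny day it is 1 - p.
    chances = [p if wet else 1.0 - p for p, wet in zip(p_rain, rained)]
    return sum(chances) / len(chances)


# Assumed observations, inferred from the arithmetic in the text:
# rain on days 1-3, sunshine on days 4-10.
rained = [True] * 3 + [False] * 7

current = [1.0] * 3 + [0.6] * 7   # current weatherperson's probabilities of rain
newcomer = [0.0] * 10             # newcomer always predicts sunshine

current_rate = hit_rate(current, rained)
newcomer_rate = hit_rate(newcomer, rained)

print(f"current weatherperson: {current_rate:.2f}")
print(f"newcomer:              {newcomer_rate:.2f}")
```

Under these assumptions the two rates come out at 0.58 and 0.70, matching the figures above.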
