
394 Good health: Statistical challenges

flict tamoxifen’s side effects on women with little chance of gaining from the experience.

Example 2. Risk models are used to improve the cost-efficiency of preventive interventions. For instance, screening with breast magnetic resonance imaging (MRI) detects more breast cancers but costs more and produces more false positive scans, compared to mammography. Costs and false positives can be reduced by restricting MRI to women whose breast cancer risk exceeds some threshold (Plevritis et al., 2006). For this type of application, a good risk model should give a classification rule that assigns mammography to those truly at low risk (i.e., has a low false positive rate), but also assigns MRI to those truly at high risk (i.e., has a high true positive rate).

Example 3. Risk models are used to facilitate personal health care decisions. Consider, for instance, a postmenopausal woman with osteoporosis who must choose between two drugs, raloxifene and alendronate, to prevent hip fracture. Because she has a family history of breast cancer, raloxifene would seem a good choice, since it also reduces breast cancer risk. However, she also has a family history of stroke, and raloxifene is associated with increased stroke risk. To make a rational decision, she needs a risk model that provides accurate information about her own risks of developing three adverse outcomes (breast cancer, stroke, hip fracture), and the effects of the two drugs on these risks.

The first two examples involve classifying people into “high” and “low” categories; thus they require risk models with low false positive and/or false negative rates. In contrast, the third example involves balancing one person’s risks for several different outcomes, and thus it requires risk models whose assigned risks are accurate enough at the individual level to facilitate rational healthcare decisions.
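The threshold rule of Example 2 can be illustrated with a minimal sketch. It assumes that each woman has a model-assigned risk and an observed outcome (1 if breast cancer developed, 0 otherwise), and it uses the observed outcome as a stand-in for being "truly at high risk", which is a simplification. The function name `threshold_rates`, the threshold of 0.20, and the six data points are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def threshold_rates(
    risks: Sequence[float],
    outcomes: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for the rule
    'assign MRI when the assigned risk exceeds the threshold'.

    Assumption: outcomes are coded 1 (developed the disease) or 0.
    """
    tp = fp = fn = tn = 0
    for risk, outcome in zip(risks, outcomes):
        assigned_mri = risk > threshold
        if outcome == 1 and assigned_mri:
            tp += 1  # case correctly sent to MRI
        elif outcome == 1:
            fn += 1  # case left on mammography
        elif assigned_mri:
            fp += 1  # non-case sent to MRI
        else:
            tn += 1  # non-case left on mammography
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Hypothetical assigned risks and observed outcomes for six women.
example_risks = [0.02, 0.05, 0.30, 0.25, 0.10, 0.40]
example_outcomes = [0, 0, 1, 0, 1, 1]

tpr, fpr = threshold_rates(example_risks, example_outcomes, threshold=0.20)
print(f"true positive rate = {tpr:.2f}, false positive rate = {fpr:.2f}")
```

Raising the threshold in this sketch sends fewer women to MRI, which tends to lower the false positive rate at the cost of a lower true positive rate; the choice of threshold is therefore part of the classification rule being evaluated.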
It is common practice to summarize a model’s calibration and discrimination with a single statistic, such as a chi-squared goodness-of-fit test. However, such summary measures do not reveal subgroups whose risks are accurately or inaccurately pegged by a model. This limitation can be addressed by focusing on subgroup-specific performance measures. Evaluating performance in subgroups also helps assess a model’s value for facilitating personal health decisions. For example, a woman who needs to know her breast cancer risk is not interested in how a model performs for others in the population; yet summary performance measures involve the distribution of covariates in the entire population to which she belongs.

35.4 How do we estimate model performance measures?

Longitudinal cohort studies allow comparison of actual outcomes to model-assigned risks. At entry to a cohort, subjects report their current and past co-
