Decision curve analysis
Decision curve analysis
Andrew Vickers
Departments of Epidemiology and Biostatistics
Memorial Sloan-Kettering Cancer Center
What am I doing here?
• Prediction models are wonderful!
• How do you know that they do more good than harm?
How do you know that prediction models do more good than harm?
• Biostatistical measures:
– Sensitivity, specificity
– PPV, NPV
– Brier score
– Youden index, AUC
– NRI
Model
• Data generated with a logit model
• Betas of:
– 1.8 for test 1
– 2.0 for test 2
– 1.9 for the interaction
– -3.5 for the intercept
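A minimal sketch of this data-generating process in Python. The slides specify only the coefficients; the sample size, random seed, and the 50% marginal positive rate for each binary test are my own assumptions.

```python
import math
import random

random.seed(42)  # assumption: seed and sample size are not from the slides

def simulate(n=10000):
    """Generate (test1, test2, outcome) triples from the logit model:
    logit(p) = -3.5 + 1.8*test1 + 2.0*test2 + 1.9*test1*test2."""
    rows = []
    for _ in range(n):
        t1 = int(random.random() < 0.5)  # assumed binary test, 50% positive
        t2 = int(random.random() < 0.5)
        logit = -3.5 + 1.8 * t1 + 2.0 * t2 + 1.9 * t1 * t2
        p = 1.0 / (1.0 + math.exp(-logit))  # inverse logit
        d = int(random.random() < p)        # draw the binary outcome
        rows.append((t1, t2, d))
    return rows

rows = simulate()
print("simulated prevalence:", sum(d for _, _, d in rows) / len(rows))
```

With these coefficients the cell probabilities range from about 3% (both tests negative) to about 90% (both positive).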
What if Test 2 were a scan that was expensive and inconvenient?
Results

Marker   Sensitivity   Specificity   PPV   NPV   Brier   AUC    Youden   NRI
Test 1   56            86            39    92    .1862   .706   .42      -
Test 2   72            76            34    94    .2441   .741   .48      .070
Both     40            99            91    91    .0923   .695   .39      -.023
Either   88            63            28    97    .3380   .753   .51      .093
Principle
• Take statistical method m for decision d
• m is insufficient for d if extra information i is needed
• For m to be sufficient for d:
– i must be parameterized
– A method must be available to incorporate i in m
Principle
• Sensitivity / specificity are insufficient to determine which test should be used:
– “It depends on whether sensitivity or specificity is more important”
• We need:
– A parameter specifying the relative importance of sensitivity vs. specificity
– A way to incorporate that parameter in a statistical methodology
What is the “depends” parameter?
Youden Index
Sensitivity + Specificity – 1
• Implicitly assumes that sensitivity and specificity are equally valuable
AUC / C-index
The probability of correctly identifying a discordant pair.
• Classifying a patient without an event as high probability is penalized the same as classifying a patient with an event as low probability:
– Pr = 50%, D = 0 vs. Pr = 10%, D = 1
– Pr = 50%, D = 1 vs. Pr = 90%, D = 0
Brier Score
(D – P)²
• Errors for D = 1 and D = 0 are scored equally
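A one-line Python check of this symmetry (illustrative only; the function and its inputs are not from the slides):

```python
def brier(pairs):
    """Mean squared error between outcome D (0/1) and predicted probability P."""
    return sum((d - p) ** 2 for d, p in pairs) / len(pairs)

# A non-event predicted at 90% and an event predicted at 10%
# are penalized identically: both contribute (0.9)^2 = 0.81.
assert abs(brier([(0, 0.9)]) - brier([(1, 0.1)])) < 1e-12
print(brier([(0, 0.9), (1, 0.1)]))
```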
Net Reclassification Index
Treats reclassifying down a patient who has the event as equally harmful as reclassifying up a patient who does not have the event
When is it ever the case in medicine that…
Harm of a missed treatment = harm of an unnecessary treatment?
Prostate cancer
• Which is worse?
– Failing to biopsy a patient with cancer (missed treatment = false negative)
– Biopsying a patient without cancer (unnecessary treatment = false positive)
Conclusion 1
Prediction models are wonderful! How do you know that they do more good than harm?
• Traditional biostatistical techniques for evaluating models, markers and tests do not incorporate clinical consequences
• Accordingly, they cannot inform clinical practice
[Decision tree figure]
Apply model:
• Biopsy (model positive) → cancer, probability p1 (outcome a); no cancer, p2 (outcome b)
• No biopsy (model negative) → cancer, p3 (outcome c); no cancer, 1 - (p1 + p2 + p3) (outcome d)
Biopsy all:
• Cancer, p1 + p3 (outcome a); no cancer, 1 - (p1 + p3) (outcome b)
Biopsy none:
• Cancer, p1 + p3 (outcome c); no cancer, 1 - (p1 + p3) (outcome d)
Problems with traditional decision analysis
• The p’s require a cut-point to be chosen
• Extra data are needed on the health value of outcomes a-d:
– Harms of biopsy
– Harms of delayed diagnosis
– Harms may vary between patients
Threshold probability
• Probability of disease is p̂
• Define a threshold probability of disease as pt
• Patient accepts treatment if p̂ ≥ pt
If P(D = 1) = pt, the patient is indifferent between treatment and no treatment:

pt × a + (1 – pt) × b = pt × c + (1 – pt) × d

which rearranges to:

(a – c) / (d – b) = (1 – pt) / pt

[Decision tree: Treatment → disease, pt (outcome a); no disease, 1 – pt (outcome b). No treatment → disease, pt (outcome c); no disease, 1 – pt (outcome d)]
Intuitively
• The threshold probability at which a patient will opt for treatment is informative of how the patient weighs the relative harms of false-positive and false-negative results
• Nothing new:
– This has been a decision-analytic result since the 1970s
Decision curve analysis
1. Select a pt
2. Define a positive test as p̂ ≥ pt
3. Calculate “clinical net benefit” as:
   (True positive count / n) – (False positive count / n) × pt / (1 – pt)
4. Vary pt over an appropriate range

Vickers & Elkin, Med Decis Making 2006;26:565–574
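The four steps can be sketched in Python. The published `dca` commands for Stata and R implement this; the function below is an illustrative re-implementation on toy data, not those packages' API.

```python
def decision_curve(risks, outcomes, thresholds):
    """Net benefit of 'treat if predicted risk >= pt' at each threshold pt:
    TP/n - FP/n * pt/(1 - pt)."""
    n = len(risks)
    curve = []
    for pt in thresholds:
        # Step 2: a positive test is a predicted risk at or above pt
        tp = sum(1 for r, d in zip(risks, outcomes) if r >= pt and d == 1)
        fp = sum(1 for r, d in zip(risks, outcomes) if r >= pt and d == 0)
        # Step 3: weight false positives by the odds at the threshold
        curve.append((pt, tp / n - (fp / n) * pt / (1 - pt)))
    return curve

# Step 4: vary pt over a range (toy data: a model that ranks perfectly)
risks = [0.9, 0.8, 0.2, 0.1]
outcomes = [1, 1, 0, 0]
for pt, nb in decision_curve(risks, outcomes, [0.1, 0.2, 0.3]):
    print(f"pt={pt:.1f}  net benefit={nb:.3f}")
```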
Illustrative example
• Men with raised PSA are referred for prostate biopsy
• Free PSA was measured
• Does free PSA help predict the outcome of biopsy?
Worked example at pt = 20% (N = 753)

Strategy               Negative   True positive   False positive
Biopsy if risk ≥ 20%   384        149             220
Biopsy all men         0          192             561

Net benefit calculation:
• Biopsy if risk ≥ 20%: (149 – 220 × (0.2 ÷ 0.8)) / 753 = 0.1248
• Biopsy all men: (192 – 561 × (0.2 ÷ 0.8)) / 753 = 0.0687
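The arithmetic in this worked example can be checked directly:

```python
def net_benefit(tp, fp, n, pt):
    """Clinical net benefit: TP/n - FP/n * pt/(1 - pt)."""
    return tp / n - (fp / n) * (pt / (1 - pt))

n, pt = 753, 0.20
print(round(net_benefit(149, 220, n, pt), 4))  # 0.1248 (biopsy if risk >= 20%)
print(round(net_benefit(192, 561, n, pt), 4))  # 0.0687 (biopsy all men)
```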
Free PSA
[decision curve figure]
Net benefit has a simple clinical interpretation
• Net benefit of 0.118 at a pt of 20%
• Using the model is equivalent to a strategy that identified 11.8 cancers per 100 patients with no unnecessary biopsies
Net benefit has a simple clinical interpretation
• Difference between the model and treating all at a pt of 20%: 0.0524
• Divide by the weighting: 0.0524 / 0.25 = 0.21
– 21 fewer false positives per 100 patients for an equal number of true positives
– E.g. 21 fewer unnecessary biopsies with no missed cancers
Introductory example

Marker   Sensitivity   Specificity   PPV   NPV   Brier   AUC    Youden   NRI
Test 1   56            86            39    92    .1862   .706   .42      -
Test 2   72            76            34    94    .2441   .741   .48      .070
[Decision curves for Test 1 and Test 2]
[Decision curves for “Either” and “Both”]
Incorporating the harm of a test

(True positive count / n) – (False positive count / n) × pt / (1 – pt) – Test harm

Test harm = the reciprocal of the number of patients you would be willing to test in order to find one case, assuming the test were perfectly accurate
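A sketch of the extended formula, reusing the counts from the earlier worked example together with the talk's scan harm of 0.05 (combining the two is my own illustration, not a result from the slides):

```python
def net_benefit_with_harm(tp, fp, n, pt, test_harm=0.0):
    """Net benefit minus a fixed per-patient harm of performing the test:
    TP/n - FP/n * pt/(1 - pt) - test_harm."""
    return tp / n - (fp / n) * (pt / (1 - pt)) - test_harm

# Worked-example counts (149 TP, 220 FP, n=753, pt=20%) with scan harm 0.05:
print(round(net_benefit_with_harm(149, 220, 753, 0.20, test_harm=0.05), 4))  # 0.0748
```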
What if Test 2 were a scan?
• This is a difficult question to answer! If the test were perfect, we would probably perform no more than 20 scans to find one cancer
• Test harm = 1 / 20 = 0.05
                        Traditional            Traditional              Decision curve
                        statistical analysis   decision analysis        analysis
Mathematics             Simple                 Can be complex           Simple
Additional data         Not required           Patient preferences,     Informal, general
                                               costs or effectiveness   estimates
Endpoints               Binary or continuous   Continuous endpoints     Binary or continuous
                                               problematic
Assess clinical value?  No                     Yes                      Yes
Software for Decision Curve Analysis
Very easy to use software for Stata and R:

    . dca event m1 m2 both either, xstop(.4)
Extensions to decision curve analysis
• Cross validation
• 95% confidence intervals
• Time-to-event data, including competing risks
• Evaluation of published models
• Joint and conditional models
In summary…
Prediction models are wonderful! How do you know that they do more good than harm?
• Traditional performance measures do not incorporate clinical consequences and cannot inform clinical practice
• There are very simple ways of incorporating clinical consequences, using threshold probability as a decision-analytic weight