Decision curve analysis

Andrew Vickers
Departments of Epidemiology and Biostatistics
Memorial Sloan-Kettering Cancer Center


What am I doing here?

Prediction models are wonderful! How do you know that they do more good than harm?


How do you know that prediction models do more good than harm?

• Biostatistical measures:
  – Sensitivity, specificity
  – PPV, NPV
  – Brier score
  – Youden index, AUC
  – NRI


Model

• Data generated with a logit model.
• Betas of:
  – 1.8 for test 1
  – 2.0 for test 2
  – 1.9 for the interaction
  – −3.5 for the intercept
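A minimal sketch of this simulation. The slide gives only the betas, so the 30% positivity rate assumed for each test below is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Binary test results; the 30% positivity rate is an assumption,
# since the slide does not state how the tests were distributed.
test1 = rng.binomial(1, 0.3, n)
test2 = rng.binomial(1, 0.3, n)

# Logit model with the betas from the slide.
logit = -3.5 + 1.8 * test1 + 2.0 * test2 + 1.9 * test1 * test2
p = 1 / (1 + np.exp(-logit))
disease = rng.binomial(1, p)
```

From `disease` and the two tests, all of the accuracy metrics on the following slides can be computed.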


Results

Marker   Sensitivity   Specificity   PPV   NPV   Brier   AUC    Youden/NRI
Test 1   56            86            39    92    .1862   .706
Test 2   72            76            34    94    .2441   .741   .070


Results

Marker   Sensitivity   Specificity   PPV   NPV   Brier   AUC    Youden/NRI
Test 1   56            86            39    92    .1862   .706
Test 2   72            76            34    94    .2441   .741   .070

What if Test 2 was a scan that was expensive and inconvenient?


Results

Marker   Sensitivity   Specificity   PPV   NPV   Brier   AUC    Youden/NRI
Test 1   56            86            39    92    .1862   .706
Test 2   72            76            34    94    .2441   .741   .070
Both     40            99            91    91    .0923   .695   −.023
Either   88            63            28    97    .3380   .753   .093


Principle

• Take statistical method m for decision d
• m is insufficient for d if extra information i is needed
• For m to be sufficient for d:
  – i must be parameterized
  – A method must be available to incorporate i into m


Principle

• Sensitivity / specificity are insufficient to determine which test should be used:
  – "Depends on whether sensitivity or specificity is more important"
• We need:
  – A parameter specifying the relative importance of sensitivity vs. specificity
  – A way to incorporate that parameter into a statistical methodology


What is the "depends" parameter?


Youden Index

Sensitivity + Specificity – 1


Youden Index

Sensitivity + Specificity – 1

Sensitivity and specificity are valued equally


AUC / C index

Probability of correctly identifying a discordant pair


AUC / C index

Probability of correctly identifying a discordant pair.

Classifying a patient without an event as high probability is scored the same as classifying a patient with an event as low probability:
– Pr: 50%, D=0; Pr: 10%, D=1
– Pr: 50%, D=1; Pr: 90%, D=0


Brier Score

(D − P)²


Brier Score

(D − P)²

• Errors on D=1 and D=0 are scored equally
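As a minimal sketch of the formula and its symmetry between events and non-events:

```python
import numpy as np

# Brier score: mean of (D - P)^2 over patients. An error on an event
# (D=1) and an equal-sized error on a non-event (D=0) contribute
# identically to the score.
def brier(d, p):
    d = np.asarray(d, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((d - p) ** 2))

# Toy example: errors of 0.2, 0.2, 0.4, 0.4 give a mean squared error of 0.1
print(brier([1, 0, 1, 0], [0.8, 0.2, 0.6, 0.4]))
```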


Net Reclassification Index


Net Reclassification Index

It is counted as as harmful to reclassify down a patient who has the event as to reclassify up a patient who doesn't have the event


When is it ever the case in medicine that…

Harm of missed treatment = Harm of unnecessary treatment?


Prostate cancer

• What is worse?
  – To fail to biopsy a patient with cancer (missed treatment = false negative)
  – To biopsy a patient without cancer (unnecessary treatment = false positive)


Conclusion 1

Prediction models are wonderful! How do you know that they do more good than harm?

• Traditional biostatistical techniques for evaluating models, markers and tests do not incorporate clinical consequences
• Accordingly, they cannot inform clinical practice


Apply model:
  Biopsy:     Cancer (p1) → a     No cancer (p2) → b
  No biopsy:  Cancer (p3) → c     No cancer (1 − (p1 + p2 + p3)) → d

Biopsy all:   Cancer (p1 + p3) → a     No cancer (1 − (p1 + p3)) → b
No biopsy:    Cancer (p1 + p3) → c     No cancer (1 − (p1 + p3)) → d




Problems with traditional decision analysis

• The p's require a cut-point to be chosen
• Extra data are needed on the health value of outcomes (a – d):
  – Harms of biopsy
  – Harms of delayed diagnosis
  – Harms may vary between patients


Threshold probability

The probability of disease is p̂.
Define a threshold probability of disease as p_t.
The patient accepts treatment if p̂ ≥ p_t.


If P(D=1) = p_t

Treatment:     Disease (p_t) → a     No disease (1 − p_t) → b
No treatment:  Disease (p_t) → c     No disease (1 − p_t) → d

At the threshold the two strategies have equal expected value, p_t a + (1 − p_t) b = p_t c + (1 − p_t) d, so:

(a − c) / (d − b) = (1 − p_t) / p_t


Intuitively

• The threshold probability at which a patient will opt for treatment is informative of how that patient weighs the relative harms of false-positive and false-negative results.
• Nothing new:
  – A decision-analytic result since the 1970s


Decision curve analysis

1. Select a p_t
2. Define a positive test as p̂ ≥ p_t
3. Calculate "Clinical Net Benefit" as:

   (TruePositiveCount / n) − (FalsePositiveCount / n) × p_t / (1 − p_t)

4. Vary p_t over an appropriate range

Vickers & Elkin, Med Decis Making 2006;26:565–574
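Steps 1–4 can be sketched as a short function. The toy outcomes and predicted risks below are illustrative only, not data from the talk:

```python
import numpy as np

def net_benefit(y, p_hat, p_t):
    """Clinical net benefit of the rule 'treat if p_hat >= p_t'."""
    n = len(y)
    positive = p_hat >= p_t          # step 2: positive test
    tp = np.sum(positive & (y == 1)) # true positives
    fp = np.sum(positive & (y == 0)) # false positives
    return tp / n - (fp / n) * p_t / (1 - p_t)

# Step 4: vary p_t over an appropriate range (toy data).
y = np.array([1, 0, 1, 0, 0, 1, 0, 0])
p_hat = np.array([0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05])
for p_t in (0.1, 0.2, 0.3):
    print(p_t, round(net_benefit(y, p_hat, p_t), 3))
```

Plotting net benefit against p_t for each model, alongside "treat all" and "treat none", gives the decision curve.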


Illustrative example

• Men with raised PSA are referred for prostate biopsy
• Free PSA is measured
• Does free PSA help predict the outcome of biopsy?


Worked example at p_t = 20% (N = 753)

Strategy               Negative   True positive   False positive
Biopsy if risk ≥ 20%   384        149             220
Biopsy all men         0          192             561

Net benefit calculation:
Biopsy if risk ≥ 20%:  (149 − 220 × (0.2 ÷ 0.8)) / 753 = 0.1248
Biopsy all men:        (192 − 561 × (0.2 ÷ 0.8)) / 753 = 0.0687
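The arithmetic on this slide reproduces directly:

```python
# Net benefit at p_t = 20% from the counts on the slide.
n = 753
w = 0.2 / (1 - 0.2)  # the weight p_t / (1 - p_t) = 0.25

nb_model = (149 - 220 * w) / n  # biopsy if risk >= 20%
nb_all = (192 - 561 * w) / n    # biopsy all men

print(round(nb_model, 4))  # 0.1248
print(round(nb_all, 4))    # 0.0687
```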


Free PSA


Net benefit has a simple clinical interpretation

• Net benefit of 0.118 at p_t of 20%
• Using the model is equivalent to a strategy that identified 11.8 cancers per 100 patients with no unnecessary biopsies


Net benefit has a simple clinical interpretation

• Difference between the model and treat-all at p_t of 20%: 0.0524
• Divide by the weighting: 0.0524 / 0.25 = 0.21
  – 21 fewer false positives per 100 patients for an equal number of true positives
  – E.g. 21 fewer unnecessary biopsies with no missed cancers
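The conversion above is one division:

```python
# Converting a net-benefit difference into "fewer false positives per
# patient": divide by the weight p_t / (1 - p_t).
nb_difference = 0.0524  # model vs. treat-all at p_t = 20% (from the slide)
w = 0.2 / (1 - 0.2)     # 0.25
print(round(nb_difference / w, 2))  # 0.21 -> 21 fewer per 100 patients
```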


Introductory example

Marker   Sensitivity   Specificity   PPV   NPV   Brier   AUC    Youden/NRI
Test 1   56            86            39    92    .1862   .706
Test 2   72            76            34    94    .2441   .741   .070


[Decision curves: Test 1 vs. Test 2]


[Decision curves: Both vs. Either]


Incorporating the harm of a test

(TruePositiveCount / n) − (FalsePositiveCount / n) × p_t / (1 − p_t) − Test harm

Test harm = the reciprocal of the number of patients you would give the test to in order to find one case, assuming the test were perfectly accurate


What if test 2 were a scan?

• This is a difficult question to answer! If the test were perfect, we would probably perform no more than 20 scans to find a cancer
• Test harm = 1/20 = 0.05
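A minimal sketch of the adjusted formula, applied here to the worked-example counts from the free PSA slide (149 TP, 220 FP, n = 753 at p_t = 20%):

```python
# Net benefit minus a fixed per-patient test harm. harm = 0.05 encodes
# that we would accept at most 1 / 0.05 = 20 tests per case found.
def net_benefit_with_harm(tp, fp, n, p_t, harm):
    return tp / n - (fp / n) * p_t / (1 - p_t) - harm

print(round(net_benefit_with_harm(149, 220, 753, 0.2, 0.05), 4))  # 0.0748
```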


                         Traditional statistical analysis   Traditional decision analysis
Mathematics              Simple                              Can be complex
Additional data          Not required                        Patient preferences, costs or effectiveness
Endpoints                Binary or continuous                Continuous endpoints problematic
Assess clinical value?   No                                  Yes


                         Traditional statistical analysis   Traditional decision analysis                 Decision curve analysis
Mathematics              Simple                              Can be complex                                Simple
Additional data          Not required                        Patient preferences, costs or effectiveness   Informal, general estimates
Endpoints                Binary or continuous                Continuous endpoints problematic              Binary or continuous
Assess clinical value?   No                                  Yes                                           Yes


Software for Decision Curve Analysis


Very easy to use software for Stata and R:

dca event m1 m2 both either, xstop(.4)


Extensions to decision curve analysis

• Cross validation
• 95% confidence intervals
• Time-to-event data, including competing risks
• Evaluation of published models
• Joint and conditional models


In summary…

Prediction models are wonderful! How do you know that they do more good than harm?

• Traditional performance measures do not incorporate clinical consequences and cannot inform clinical practice
• There are very simple ways of incorporating clinical consequences, using threshold probability as a decision-analytic weight
