Evaluating non-randomised intervention studies - NIHR Health ...

Empirical evaluation of the ability of case-mix adjustment methodologies to control for selection bias

regression models described in (3) model how covariates relate to outcome, propensity score methods model how the same covariates relate to treatment allocation. 145–147 The principle is that the propensity score summarises the manner in which baseline characteristics are associated with treatment allocation, so that selection bias is removed when comparisons are made between groups with similar propensity scores. 146 The method involves calculating, for each individual in the dataset, the propensity probability that estimates their chance of receiving the experimental intervention given their baseline characteristics. In an RCT, where there should be no relationship between baseline characteristics and treatment assignment, this probability will be the same for each participant (e.g. 0.5 if the groups are of equal size). In non-randomised studies, it is likely that treatment assignment does depend on baseline covariates, and that propensity scores will vary between individuals. For example, patients with more severe disease may be more likely to receive one treatment than the other. Propensity scores in such a situation would relate to disease severity, and the average propensity score in the experimental group will differ from the average in the control group. Estimation of propensity scores is typically undertaken using multiple logistic regression models, the outcome variable being treatment allocation.

What evidence is there that these methods actually adjust for selection bias in non-randomised studies? Although there is plenty of literature demonstrating that case-mix adjustment can change estimates of treatment effects, none of the texts that we have consulted cite any empirical evidence demonstrating that case-mix adjustment, on average, reduces bias.
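The propensity score calculation described above (a logistic regression with treatment allocation as the outcome) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the report: the data are simulated, there is a single hypothetical covariate (`severity`, scaled 0 to 1), the allocation rule is invented, and the model is fitted by plain gradient ascent rather than by a statistics package. It contrasts a randomised study, where fitted scores are close to 0.5 in both arms, with a non-randomised one, where the average score differs between arms.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ts, steps=500, lr=2.0):
    """Fit P(treated | x) = sigmoid(a + b*x) by gradient ascent
    on the mean log-likelihood (one covariate, for brevity)."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            resid = t - sigmoid(a + b * x)
            grad_a += resid
            grad_b += resid * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def simulate(n, depends_on_severity, rng):
    """Hypothetical data: severity is uniform on [0, 1]; allocation either
    ignores it (RCT-like) or becomes more likely as severity rises."""
    severity = [rng.random() for _ in range(n)]
    treated = []
    for s in severity:
        p = sigmoid(-2.0 + 4.0 * s) if depends_on_severity else 0.5
        treated.append(1 if rng.random() < p else 0)
    return severity, treated

def mean_scores_by_arm(severity, treated):
    """Estimate each participant's propensity score, then average by arm."""
    a, b = fit_logistic(severity, treated)
    scores = [sigmoid(a + b * s) for s in severity]
    exp = [p for p, t in zip(scores, treated) if t == 1]
    ctl = [p for p, t in zip(scores, treated) if t == 0]
    return sum(exp) / len(exp), sum(ctl) / len(ctl)

rng = random.Random(0)
rct_exp, rct_ctl = mean_scores_by_arm(*simulate(1000, False, rng))
nrs_exp, nrs_ctl = mean_scores_by_arm(*simulate(1000, True, rng))

print("randomised:     experimental %.3f, control %.3f" % (rct_exp, rct_ctl))
print("non-randomised: experimental %.3f, control %.3f" % (nrs_exp, nrs_ctl))
```

In practice a multiple logistic regression over all available baseline covariates would be fitted with standard statistical software; the single-covariate fit here only keeps the sketch self-contained.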
Broader searches of the methodological literature related to case-mix likewise did not identify supporting empirical evidence.

To the contrary, there are hints in the literature that case-mix adjustment methods may not adequately perform the task to which they are applied. Of the eight reviews in Chapter 3, two compared adjusted and unadjusted estimates of treatment effects from non-randomised studies with the results of similar RCTs, and noted that there was little evidence that adjustment consistently moved the estimates of treatment effects from non-randomised studies towards those of RCTs. 25,27

In an attempt to provide empirical evidence of the value of case-mix adjustment, we have used the same non-randomised studies generated by resampling participants from the IST 132 and ECST 133 data sets (described in Chapter 6) to evaluate the performance of eight different case-mix adjustment methods in controlling for selection bias. As explained previously, resampling of data from the trials was used to generate randomised and non-randomised studies of different designs in such a way that the differences between their results with respect to location and spread could only be attributed to selection bias. Case-mix adjustment methods were then applied to each resampled non-randomised study making use of the available baseline data (different in the two trials – see Table 20). The distribution of results of the adjusted non-randomised studies was then compared with both the distribution of unadjusted results and the distribution of the results of the RCTs, to see whether there was any evidence that adjustment had reduced or removed the selection bias.

In our investigations we evaluate the ability of case-mix adjustment to control for three different types of bias:

1. Naturally occurring systematic bias, where results are consistently either all overestimates or all underestimates of the treatment effect owing to some naturally occurring unknown allocation mechanism (as was observed in the ECST historical comparisons in Chapter 6).
2. Naturally occurring unpredictable bias, where results are biased variably with different magnitudes in different directions leading to both underestimates and overestimates of treatment effects (as was observed in the IST concurrent comparisons in Chapter 6).
3. Bias arising where allocation to treatment is probabilistically linked to a prognostic variable, such that participants are more likely or less likely to receive treatment according to their observed characteristics. This in fact is a model of practice in much of medicine, where treatment decisions are made according to the conditions and characteristics that each patient displays (allocation by indication). Although we have no realistic way of mimicking such a mechanism for the two trial data sets, we generate two artificial scenarios in which allocation is related (a) to the value of a single prognostic covariate, and, more realistically, (b) to an unknown function of a set of prognostic covariates.
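The third type of bias, allocation by indication linked to a single prognostic covariate, can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not the resampling procedure applied to the IST and ECST data: the population is simulated, the covariate is a hypothetical binary indicator (`severe`), the risks and allocation probabilities are invented, and the treatment is given no true effect. It shows how a crude comparison becomes biased when sicker participants are more likely to be treated, and what stratifying on that one fully observed covariate does in this idealised setting. It says nothing about how adjustment performs on real data, where the allocation mechanism is unknown.

```python
import random

def simulate_study(n, p_treat_severe, p_treat_mild, rng):
    """Return (severe, treated, died) records. The treatment has NO true
    effect: risk of death depends on severity only (hypothetical values)."""
    records = []
    for _ in range(n):
        severe = rng.random() < 0.5
        p_treat = p_treat_severe if severe else p_treat_mild
        treated = rng.random() < p_treat
        p_death = 0.40 if severe else 0.10
        died = rng.random() < p_death
        records.append((severe, treated, died))
    return records

def risk_difference(records):
    """Risk of death in the treated minus risk in the untreated."""
    treated = [d for _, t, d in records if t]
    control = [d for _, t, d in records if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

def stratified_risk_difference(records):
    """Within-stratum risk differences, averaged with weights
    proportional to stratum size."""
    total = 0.0
    for level in (True, False):
        stratum = [r for r in records if r[0] == level]
        total += len(stratum) * risk_difference(stratum)
    return total / len(records)

rng = random.Random(1)
randomised = simulate_study(20000, 0.5, 0.5, rng)      # allocation ignores severity
by_indication = simulate_study(20000, 0.8, 0.2, rng)   # severe cases treated more often

rd_rct = risk_difference(randomised)
rd_crude = risk_difference(by_indication)
rd_strat = stratified_risk_difference(by_indication)

print("randomised study, crude risk difference:     %.3f" % rd_rct)
print("allocation by indication, crude:             %.3f" % rd_crude)
print("allocation by indication, stratified:        %.3f" % rd_strat)
```

Scenario (b) in the list, where allocation depends on an unknown function of several covariates, is the harder case: the analyst does not know which strata to form, which is why the evaluation of the adjustment methods is an empirical question.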
Health Technology Assessment 2003; Vol. 7: No. 27

TABLE 20 Baseline covariates used in case-mix adjustment models for the IST and ECST

IST
  Binary covariates: Sex (male/female); Symptoms noted on waking; Consciousness; Atrial fibrillation
  Continuous covariates: Age; Delay to presentation; Systolic blood pressure
ECST
  Binary covariates: Sex (male/female); Residual neurological signs; Previous MI; Angina; Current prophylactic aspirin use
  Continuous covariates: Age; Degree of stenosis
Unordered categorical variables: Infarct visible on CT scan; Type of stroke
Ordered categorical variables: Neurological deficit score (7 categories); Presenting stroke (4 categories)

The investigative method provides an opportunity to make comparisons between different case-mix adjustment strategies: matching baseline groups, stratification, regression and propensity score methods.

Methods

Generation of samples

The principles of our resampling methodology were discussed in detail in Chapter 6. Studies of fixed sample sizes with randomised and non-randomised designs (described below) were generated for each region in each of the IST and ECST data sets by selectively sampling participants, the whole process being repeated 1000 times. In each trial baseline data on important prognostic variables had been recorded for each participant at the point of recruitment. These variables (we will refer to them as covariates) were used in the analyses to adjust for differences in case-mix. Details of the covariates available for each study are given in Table 20 and Appendix 8.

Samples with ‘naturally’ occurring biases

Historically controlled and concurrently (non-randomised) controlled studies were generated from the IST and ECST data sets as described in Chapter 6. We thus obtained results for:

1. 14,000 historically controlled studies based on the 14 international regions from the IST and 14,000 corresponding RCTs, all of sample size 200 (100 per arm)
2. 14,000 concurrently controlled studies based on the 14 international regions from the IST and 14,000 corresponding RCTs, all of sample size 200 (100 per arm)
3. 10,000 concurrently controlled studies based on the 10 UK cities within the IST and 10,000 corresponding RCTs, all of sample size 200 (100 per arm)
4. 8000 historically controlled studies based on the eight international regions from the ECST and 8000 corresponding RCTs, all of sample size 80 (40 per arm)
5. 8000 concurrently controlled studies based on the eight international regions from the ECST and 8000 corresponding RCTs, all of sample size 80 (40 per arm).

Different case-mix adjustment methods were applied individually to each of these 54,000 non-randomised studies, and the results were compared with the results from the corresponding 54,000 RCTs.

Samples with bias related to ‘known’ differences in case-mix (allocation by indication)

In addition to the standard historically and concurrently controlled designs, for the purposes of evaluating the performance of case-mix adjustment we have included two further designs in which bias relates to known relationships with prognostic variables. This has been done for two reasons. First, we wished to evaluate how well case-mix methods work in situations when we have direct knowledge of the bias-inducing mechanism that they are trying to correct. Second, we wished to mimic crudely clinical database-type studies, in

© Queen’s Printer and Controller of HMSO 2003. All rights reserved.