Evaluating non-randomised intervention studies - NIHR Health ...

Empirical estimates of bias associated with non-random allocation

TABLE 19 Impact of observed increased variability with sample size

                                                 IST – 14 regions   IST – 10 UK cities   ECST – 8 regions
Observed ratio of SDs for concurrent controls          2.5                1.8                 1.01
Increase in variance in log OR attributable
  to non-random allocation                             0.607              0.570               0.014

Total sample size                                Multipliers to confidence interval width to give correct coverage
    100                                                1.9                1.5                 1.0
    200                                                2.5                1.8                 1.0
    500                                                3.8                2.6                 1.1
   1000                                                5.2                3.5                 1.2
   2000                                                7.3                4.8                 1.3
   5000                                               12                  7.5                 1.7
  10000                                               16                 11                   2.2
  20000                                               23                 15                   2.9
  50000                                               36                 24                   4.4

... studies may be an order of magnitude too narrow to describe correctly the true uncertainty in their results, but that there are differences in the adjustments that are needed in different situations. For example, the confidence interval calculated from a concurrently controlled study of 1000 participants may be five times too narrow to describe the true uncertainty for regional IST-type comparisons, three times too narrow to describe the true uncertainty in UK city IST-type comparisons, but only 20% too narrow for regional ECST-type comparisons. For sample sizes of 10,000 the confidence intervals are estimated to be more than 10 times too narrow for the IST situations and half the width needed for the ECST situation. Of course, in practice one would not know to what extent the standard confidence interval under-represented the true uncertainty.

Generalisability and limitations of the findings

The value of these findings and estimates depends on the generalisability of the results obtained from the IST and ECST and the degree to which the slightly artificial methodology and samples used in these evaluations are representative of the reality of non-randomised studies.

Generalisability

The IST and ECST were chosen for this investigation as (a) they were large trials, (b) they had an outcome which was not rare, (c) they were multicentre trials and (d) the trialists were willing to provide reduced and anonymised data sets suitable for our analyses.
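As a brief aside on Table 19, the following minimal sketch shows one way its multipliers might be applied to a reported result. It assumes that a multiplier scales the half-width of a conventional 95% confidence interval on the log odds ratio scale, which this section does not state explicitly. The function name `widen_ci` and the example odds ratio and standard error are hypothetical; only the multiplier 5.2 (regional IST-type comparison, total sample size 1000) is taken from the table.

```python
import math


def widen_ci(odds_ratio, se_log_or, multiplier, z=1.96):
    """Return (lower, upper) limits for an odds ratio.

    The interval is built on the log odds ratio scale and its half-width
    is scaled by `multiplier` (an assumed reading of Table 19).
    """
    log_or = math.log(odds_ratio)
    half_width = z * se_log_or * multiplier
    return math.exp(log_or - half_width), math.exp(log_or + half_width)


# Hypothetical study result, not taken from the report:
# odds ratio 0.70 with a standard error of 0.15 for the log odds ratio.
naive = widen_ci(0.70, 0.15, multiplier=1.0)

# Multiplier from Table 19: IST, 14 regions, total sample size 1000.
adjusted = widen_ci(0.70, 0.15, multiplier=5.2)

print(f"conventional 95% CI: {naive[0]:.2f} to {naive[1]:.2f}")
print(f"widened 95% CI:      {adjusted[0]:.2f} to {adjusted[1]:.2f}")
```

In this hypothetical case the conventional interval excludes an odds ratio of 1 whereas the widened interval includes it, which is the practical consequence the table is intended to convey.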
Other than the fact that both trials relate to stroke medicine, the trials differ considerably. One is a trial of pharmacological agents (aspirin and heparin) whereas the other is a trial of a surgical procedure (carotid endarterectomy). The treatment in one is acute, being given immediately after the patients have suffered a severe stroke, whereas in the other it is preventive, being given to high-risk patients. It is difficult to argue that these trials can be regarded as representative and therefore that the results are generalisable. However, their results should be regarded as being indicative of the biases associated with the use of non-random controls. Ideally these resampling study methods should be repeated in more trials. In the case of this project, the time required to generate the resampling studies and the difficulty in obtaining data sets from multicentre clinical trials prevented additional evaluations being undertaken.

It is important also to consider whether the time trend observed in the ECST is likely to be typical of those that may be observed in other areas of healthcare, especially as it is in agreement with the trends observed by Sacks and colleagues in their review across six clinical contexts.²⁷ The trend is one of patient outcomes improving over time. It is consistent with a general pattern of average outcomes improving with progress in medical care, which may apply across all medical specialities. However, this argument assumes that the case-mix of patients being treated is stable, which may not be the case. In some circumstances changes in case-mix over time, for good reason, may lead to apparent increases in adverse outcomes. For example, if medical information leads to knowledge that the treatment is not suited to patients at low risk, then a change to excluding lower risk patients from receiving that treatment may lead to increases in average event rates. Historically controlled studies undertaken in such a situation may be prone to underestimating the benefits of treatment and may even falsely conclude that treatment does more harm than good.

Health Technology Assessment 2003; Vol. 7: No. 27

The lack of systematic bias with the use of geographical controls is based on the presumption that geographical differences act in a haphazard manner, and are as likely to lead to overestimates of treatment effects as to underestimates. The random manner in which concurrent control groups were selected in the resampling exercise ensured that across a large number of studies these differences would be seen to balance each other out, albeit possibly increasing unpredictability. This result does not indicate that geographically controlled studies are unbiased. In reality, a single comparison between two areas is likely to be biased, as are meta-analyses of several studies, although the direction in which the bias acts may be unknown. In addition, if an investigator chose a geographical control group with knowledge of the likely differences in case-mix, it would be possible for the selection to be manipulated (consciously or subconsciously) in such a way as to introduce a particular bias, akin to the bias observed in RCTs when treatment allocation is not concealed.²⁰

Similarly, we should consider whether the mechanisms leading to unpredictability in bias, especially in studies generated from the IST, are likely to apply widely across different clinical areas. Tables 38 and 40 in Appendix 8 show that the case-mix of patients recruited to the IST varied between locations, both internationally and between cities in the UK. These haphazard differences, together with differences in other unknown risk factors and aspects of patient management and outcome assessment, will have caused the unpredictability in the bias that was observed.
Evidence is available in all areas of medicine that such differences exist, and therefore it seems reasonable to conclude that the unpredictable behaviour in biases will be observed elsewhere, although the degree of unpredictability may vary.

Limitations of the resampling methodology

The resampling method used participants recruited to a randomised controlled trial to generate non-randomised studies. This, of course, is not what happens in reality, but there are reasons to believe that our approach is more likely to have led to underestimates than overestimates of bias.

The degree of bias in a non-randomised study depends on the similarity of the two groups from which treated participants and controls are drawn. Sampling these groups from the same randomised trial is likely to have reduced such differences for the following reasons:

1. All participants included in the RCT will have been judged to have been suitable for either treatment. In a non-randomised study participants who are suitable for only one of the two treatments may have been recruited to that arm: there is usually no formal assessment that they would have been considered suitable for the alternative. This difference will nearly always act to increase differences in outcome between the groups.
2. The RCT was conducted according to a protocol, describing methods for recruiting, assessing, treating and evaluating the patients. This will have reduced the variability within the trial. Although some non-randomised studies are organised using a protocol, many are not.
3. All participants included in the trial were recruited prospectively. In non-randomised studies, especially those using historical controls, participants are likely to have been 'retrospectively' included in the study, potentially introducing additional bias.

On balance, it could be argued that using randomly chosen international comparisons for selection of concurrent controls may be regarded as rather artificial and likely to have increased differences between groups.
In reality, a concurrent control group in a non-randomised study would be selected to minimise likely differences between groups, and such long-distance geographical comparisons would probably be avoided. It is perhaps more realistic to focus on the magnitude of the biases observed in the concurrent comparisons generated from the UK cities in the IST as being more representative of what might occur in reality. The unsystematic bias seen here was less than that observed in international comparisons, but still large enough to lead many studies falsely to obtain significant findings of both benefit and harm.

Importantly, we have concentrated on only one aspect of quality in non-randomised studies: there are other biases to which they are susceptible in the same way as are RCTs.

© Queen's Printer and Controller of HMSO 2003. All rights reserved.
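The idea of unsystematic bias discussed above, where differences between locations balance out on average but still produce spurious significant findings in both directions, can be illustrated with a small simulation. This is a simplified stand-in and not the resampling procedure used in the report, which drew real participants from the IST and ECST. All names and values (`simulate`, a baseline risk of 0.3, a between-site standard deviation of 0.3 on the logit scale, 500 participants per arm) are hypothetical choices made only for illustration, and the treatment is assumed to have no effect.

```python
import math
import random


def simulate(n_studies, n_per_arm, site_sd, seed=0):
    """Simulate null studies whose two arms come from different sites.

    Returns (mean log OR, fraction with significant apparent benefit,
    fraction with significant apparent harm). The outcome is an adverse
    event, so a log OR below zero counts as apparent benefit.
    """
    rng = random.Random(seed)
    log_ors, benefit, harm = [], 0, 0
    for _ in range(n_studies):
        events = []
        for _arm in range(2):  # index 0: treated arm, index 1: control arm
            # Each arm's site has its own baseline risk; treatment does nothing.
            logit = math.log(0.3 / 0.7) + rng.gauss(0.0, site_sd)
            risk = 1.0 / (1.0 + math.exp(-logit))
            events.append(sum(rng.random() < risk for _ in range(n_per_arm)))
        # Adding 0.5 to every cell guards against empty cells.
        a, c = events[0] + 0.5, events[1] + 0.5
        b, d = n_per_arm - events[0] + 0.5, n_per_arm - events[1] + 0.5
        log_or = math.log(a * d / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        log_ors.append(log_or)
        if log_or / se < -1.96:
            benefit += 1
        elif log_or / se > 1.96:
            harm += 1
    return sum(log_ors) / n_studies, benefit / n_studies, harm / n_studies


# Both arms drawn from sites with identical baseline risk.
same_site = simulate(1000, 500, site_sd=0.0)
# Arms drawn from sites whose baseline risks differ haphazardly.
different_sites = simulate(1000, 500, site_sd=0.3)

for label, (mean, ben, hrm) in [("same site", same_site),
                                ("different sites", different_sites)]:
    print(f"{label}: mean log OR {mean:+.3f}, "
          f"significant benefit {ben:.1%}, significant harm {hrm:.1%}")
```

Under these assumptions the mean log odds ratio stays close to zero in both settings, while the share of studies reaching conventional significance is much larger when the arms come from different sites, split between apparent benefit and apparent harm.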