Evaluating non-randomised intervention studies - NIHR Health ...

Evaluation of checklists and scales for assessing quality of non-randomised studies

validity approach, different reviewers may choose to list different methodological characteristics. However, one of the main advantages is that the coding of study characteristics does not require the same degree of judgement as when one is required to identify the presence of a threat to validity. For example, two reviewers might disagree on whether or not a study is low in power (threat to validity), and yet have perfect agreement when coding the separate components that make up the decision (e.g. sample size, study design, inherent power of the statistical test).45

Cooper45 advocates that the optimal strategy for categorising studies is a mix of these two approaches, that is, coding all potentially relevant, objective aspects of research design as well as specific threats to validity which may not be fully captured by the first approach. Although this strategy does not (and could not) remove all of the subjectivity from the assessment process, it may be the best way of making assessments of quality as explicit and objective as possible. It should be noted that inter-rater reliability is likely to be further diminished where a judgement regarding the ‘acceptability’ of a given feature is required as opposed to identifying its presence or absence.52

Quality assessment tools developed for the healthcare literature have followed both approaches, but the majority are of the ‘methods description’ variety, whether they took the form of a checklist or a scale. These tools provide a means of judging the overall quality of a study using itemised criteria, either qualitatively in the case of checklists or quantitatively for scales.53 Alternatively, a component approach can be taken, whereby one or more individual quality components, such as allocation concealment or blinding, are investigated.
However, as Moher and colleagues have pointed out, “assessing one component of a trial report may provide only minimal information about its overall quality”.53

A common criticism of quality assessment tools is the lack of rationale provided for the particular study features that reviewers choose to code52 and the inclusion of features unlikely to be related to study quality.44,54 This may in part be due to the lack of empirical evidence for the biases associated with inadequately designed studies (although such evidence does exist to some extent for RCTs21,55,56). A further criticism of tools for assessing RCTs is lack of attention to standard scale development techniques,44 to the extent that one scale46 which was developed using psychometric principles has been singled out from other available tools. These principles involve the following steps as laid out by Streiner and Norman:57 preliminary conceptual decisions; item generation and assessment of face validity; field trials to assess frequency of endorsement, consistency and construct validity; and generation of a refined instrument. However, as Jüni and colleagues have pointed out, following such principles does not necessarily make a tool superior to other available instruments.23

Quality assessment scales in particular have also been heavily criticised for the use of a single summary score to estimate study quality, by adding the scores for each individual item.58 Greenland argues that the practice of quality scoring is the most insidious form of bias in meta-analysis as it “subjectively merges objective information with arbitrary judgements in a manner that can obscure important sources of heterogeneity among study results”.58 It has since been empirically demonstrated that the use of different quality scales for the assessment of the same trial(s) results in different estimates of quality.21,59 Nevertheless, formal (or systematic) quality assessment, especially of RCTs, is increasingly common.
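The criticism of summary scores can be illustrated with a minimal sketch. Everything in it is hypothetical: the two studies, the three items and the item weights are invented for illustration and are not taken from any published scale. It shows only that two scales which weight the same items differently can order the same studies differently once the item scores are added up.

```python
# Hypothetical item-level assessments (1 = criterion met, 0 = not met).
studies = {
    "study_A": {"allocation_concealment": 1, "blinding": 0, "follow_up": 1},
    "study_B": {"allocation_concealment": 0, "blinding": 1, "follow_up": 1},
}

# Two hypothetical scales that weight the same items differently.
scales = {
    "scale_1": {"allocation_concealment": 3, "blinding": 1, "follow_up": 1},
    "scale_2": {"allocation_concealment": 1, "blinding": 3, "follow_up": 1},
}


def summary_score(items, weights):
    """Add the weighted item scores into a single summary score."""
    return sum(weights[name] * met for name, met in items.items())


def rank_studies(studies, weights):
    """Order study names from highest to lowest summary score."""
    return sorted(
        studies,
        key=lambda name: summary_score(studies[name], weights),
        reverse=True,
    )


rankings = {name: rank_studies(studies, weights) for name, weights in scales.items()}
for scale_name, order in rankings.items():
    print(scale_name, order)
```

In this invented example the two scales reverse the order of the studies, although the item-level information is identical. The component approach described above avoids this by reporting each item separately.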
A review by Moher and colleagues published in 1995 identified 25 scales for the quality assessment of RCTs.44 Subsequent work by Jüni and colleagues has identified several more.23,59

In spite of these criticisms, it is largely agreed that the assessment of methodological quality should be routine practice in systematic reviews and meta-analyses. Although the majority of methodological work in this area has surrounded the assessment of RCTs, it is reasonable to suggest that if formal quality assessment of randomised controlled trials is important, then it is doubly so for non-randomised studies owing to the greater degree of judgement that is required. The largely observational nature of non-randomised studies leads to a much higher susceptibility to bias than is found for experimental designs, as discussed in Chapter 1.

A review of existing quality assessment tools for non-randomised intervention studies was conducted in order to provide a description of what is available, paying particular attention to whether and how well they cover generally accepted quality domains.

Methods

Inclusion criteria

To be considered as a quality assessment tool, a list of criteria that could be (or had been) used to assess the methodological quality or validity of primary studies was required. A specific statement that the list of criteria was a scale or checklist intended to assess methodological quality was not required.

Health Technology Assessment 2003; Vol. 7: No. 27

These tools could exist either as individual publications in their own right or within the context of a systematic review or other type of review, such as methodological reviews that had used some form of tool.

The tool must have been (or must have the potential to be) applied to non-randomised studies of intended effect, that is, epidemiological studies or studies primarily aimed at the investigation of the side-effects of an intervention were excluded. Tools specifically designed to assess case–control and uncontrolled studies were excluded (case–control studies were excluded on the basis that the design is rarely used to examine intended effects). To provide as comprehensive a picture of current practice as possible, any tool that had been used to assess non-randomised studies was included in the review, even if it had been explicitly designed to assess only RCTs.

Both ‘new’ and ‘modified’ tools were included. ‘Modified’ tools were those based on a single existing tool. Tools that were stated to be based on more than one existing tool were considered to be ‘new’ tools. Tools which made no statement regarding the originality of the tool were assumed to be ‘new’ tools. This is likely to have led to an over-estimation of the number of unique tools in existence; however, it was not practical to check further on the origin of these tools.

Literature searches

In an attempt to identify the largest possible number of quality assessment tools, an extensive and comprehensive literature search from the earliest possible date up to December 1999 was carried out. This included searching a wide range of electronic databases (see Table 5).
The search strategies used were developed via an iterative process, by which a series of strategies were suggested, amended and piloted, and the search results scanned to identify the proportion of relevant papers retrieved (see Appendix 1 for the sample search strategy). Owing to the nature of the searches, and the poor indexing of the studies, it was necessary to strike a balance between strategies that were less likely to miss any relevant papers, yet retrieved a ‘manageable’ number of citations. Similar problems with searching for methodological literature have been cited by previous HTA-funded projects.60 The resulting lists of titles and abstracts were screened by two

TABLE 5 Literature search results

Source                                                              Retrieved   Selected from screening a   Met inclusion criteria
MEDLINE (1966–99)                                                   1897        149                         26
EMBASE (1974–99)                                                    639         11                          0
PsycLit (1967–99)                                                   1835        113                         4
Science Citation Index (1981–99)                                    1078        45                          5
Social Science Citation Index (1981–99)                             502         11                          0
Index to Scientific and Technical Proceedings (1990–9)              294         6                           0
Applied Social Sciences Index and Abstracts (1987–99)               262         10                          1
Educational Resource Information Centre database (ERIC) (1965–99)   699         29                          3
British Education Index (1986–99)                                   11          0                           0
Cochrane Review Groups                                                                                      3
Citation searches                                                   NA          NA                          85
Database of Abstracts of Reviews of Effectiveness (DARE) (1994–9)   1109        131                         75 b
Other c                                                             NA          NA                          11
Total                                                               8326                                    213

a Totals presented are those following de-duplication of search results, i.e. only the additional number of unique studies obtained from each source is presented.
b Number of included reviews that developed their own quality assessment tool or modified another.
c Includes the CRD and Cochrane Collaboration methodology databases, handsearching of a number of key journals (Statistics in Medicine (1984–98), Controlled Clinical Trials (1984–98), Journal of Clinical Epidemiology (1991–8), Psychological Bulletin (1994–9), Psychological Methods (1996–9) and the International Journal of Technology Assessment in Health Care (1985–99)) and contact with a number of methodological experts.

© Queen’s Printer and Controller of HMSO 2003. All rights reserved.
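Footnote a of Table 5 describes counts reported after de-duplication, so that each source is credited only with records not already obtained from another source. The sketch below shows one way such counts could be produced. It is an illustration under stated assumptions, not the procedure used in the report: the function name, the record identifiers and the idea of matching duplicates on a single identifier are all hypothetical, and the order in which sources are processed (which determines which source is credited with a shared record) is not stated in the text.

```python
def additional_unique_counts(results_by_source):
    """Count, for each source in turn, the records not seen in earlier sources.

    results_by_source maps a source name to a list of record identifiers.
    Sources are processed in insertion order (an assumption of this sketch).
    """
    seen = set()
    counts = {}
    for source, record_ids in results_by_source.items():
        new_ids = set(record_ids) - seen  # drops duplicates within and across sources
        counts[source] = len(new_ids)
        seen |= new_ids
    return counts


# Hypothetical search results; the identifiers are invented for illustration.
example = {
    "MEDLINE": ["r1", "r2", "r3"],
    "EMBASE": ["r2", "r3", "r4"],
    "PsycLit": ["r1", "r5", "r5"],
}

counts = additional_unique_counts(example)
print(counts)
print("Total unique records:", sum(counts.values()))
```

Because each record is counted once only, the per-source counts add up to the number of unique records overall, which is the property that allows the column totals in Table 5 to be read as totals of unique citations.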