Evaluation of checklists and scales for assessing quality of non-randomised studies

As with the threats to validity approach, different reviewers may choose to list different methodological characteristics. However, one of the main advantages is that the coding of study characteristics does not require the same degree of judgement as when one is required to identify the presence of a threat to validity. For example, two reviewers might disagree on whether or not a study is low in power (a threat to validity), and yet have perfect agreement when coding the separate components that make up the decision (e.g. sample size, study design, inherent power of the statistical test).45

Cooper45 advocates that the optimal strategy for categorising studies is a mix of these two approaches, that is, coding all potentially relevant, objective aspects of research design as well as specific threats to validity which may not be fully captured by the first approach. Although this strategy does not (and could not) remove all of the subjectivity from the assessment process, it may be the best way of making assessments of quality as explicit and objective as possible. It should be noted that inter-rater reliability is likely to be further diminished where a judgement regarding the 'acceptability' of a given feature is required, as opposed to identifying its presence or absence.52

Quality assessment tools developed for the healthcare literature have followed both approaches, but the majority are of the 'methods description' variety, whether they took the form of a checklist or a scale. These tools provide a means of judging the overall quality of a study using itemised criteria, either qualitatively in the case of checklists or quantitatively for scales.53 Alternatively, a component approach can be taken, whereby one or more individual quality components, such as allocation concealment or blinding, are investigated. However, as Moher and colleagues have pointed out, "assessing one component of a trial report may provide only minimal information about its overall quality".53

A common criticism of quality assessment tools is the lack of rationale provided for the particular study features that reviewers choose to code52 and the inclusion of features unlikely to be related to study quality.44,54 This may in part be due to the lack of empirical evidence for the biases associated with inadequately designed studies (although such evidence does exist to some extent for RCTs21,55,56). A further criticism of tools for assessing RCTs is a lack of attention to standard scale development techniques,44 to the extent that one scale46 which was developed using psychometric principles has been singled out from other available tools. These principles involve the following steps, as laid out by Streiner and Norman:57 preliminary conceptual decisions; item generation and assessment of face validity; field trials to assess frequency of endorsement, consistency and construct validity; and generation of a refined instrument. However, as Jüni and colleagues have pointed out, following such principles does not necessarily make a tool superior to other available instruments.23

Quality assessment scales in particular have also been heavily criticised for the use of a single summary score to estimate study quality, obtained by adding the scores for each individual item.58 Greenland argues that the practice of quality scoring is the most insidious form of bias in meta-analysis as it "subjectively merges objective information with arbitrary judgements in a manner that can obscure important sources of heterogeneity among study results".58 It has since been empirically demonstrated that the use of different quality scales for the assessment of the same trial(s) results in different estimates of quality.21,59
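To make the force of this criticism concrete, the following sketch is purely illustrative: the items, weights and scale names are invented for this example and are not taken from the report or from any published instrument. It shows how summing weighted item scores lets identical, objectively coded trial features yield quite different summary 'quality' estimates, driven entirely by each scale's arbitrary weights.

# Illustrative sketch only: hypothetical items and weights, not a published scale.
# It demonstrates the criticised practice of summing item scores into one number.

# Objective features coded for one (hypothetical) trial: 1 = present, 0 = absent.
trial = {
    "randomised": 1,
    "allocation_concealed": 0,
    "assessor_blinded": 1,
    "dropout_reported": 1,
    "sample_size_adequate": 0,
}

# Two hypothetical scales covering the same items with different, arbitrary weights.
scale_a = {"randomised": 2, "allocation_concealed": 2,
           "assessor_blinded": 1, "dropout_reported": 1,
           "sample_size_adequate": 1}
scale_b = {"randomised": 1, "allocation_concealed": 1,
           "assessor_blinded": 3, "dropout_reported": 3,
           "sample_size_adequate": 1}

def summary_score(features, weights):
    """Sum the weighted item scores and normalise to a 0-100 'quality' score."""
    achieved = sum(weights[item] * present for item, present in features.items())
    maximum = sum(weights.values())
    return 100 * achieved / maximum

for name, scale in [("Scale A", scale_a), ("Scale B", scale_b)]:
    print(f"{name}: {summary_score(trial, scale):.0f}/100")

# Prints "Scale A: 57/100" and "Scale B: 78/100" -- the same objective coding,
# but different quality estimates, produced solely by the choice of weights.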
Nevertheless, formal (or systematic) quality assessment, especially of RCTs, is increasingly common. A review by Moher and colleagues published in 1995 identified 25 scales for the quality assessment of RCTs.44 Subsequent work by Jüni and colleagues has identified several more.23,59

In spite of these criticisms, it is largely agreed that the assessment of methodological quality should be routine practice in systematic reviews and meta-analyses. Although the majority of methodological work in this area has surrounded the assessment of RCTs, it is reasonable to suggest that if formal quality assessment of randomised controlled trials is important, then it is doubly so for non-randomised studies, owing to the greater degree of judgement that is required. The largely observational nature of non-randomised studies leads to a much higher susceptibility to bias than is found for experimental designs, as discussed in Chapter 1.

A review of existing quality assessment tools for non-randomised intervention studies was conducted in order to provide a description of what is available, paying particular attention to whether and how well they cover generally accepted quality domains.

Methods

Inclusion criteria

To be considered as a quality assessment tool, a list
