Evaluating non-randomised intervention studies
Health Technology Assessment 2003; Vol. 7: No. 27

Chapter 8

Discussion and conclusions

Chapters 3–7 have reported results from five separate evaluations concerning non-randomised studies. The results have been discussed in detail in each chapter. We summarise their main findings below.

Summary of key findings

Our review of previous empirical investigations of the importance of randomisation (Chapter 3) identified eight studies that fulfilled our inclusion criteria. Each investigation reported multiple comparisons of the results of randomised and non-randomised studies. Although there was overlap in the comparisons included in these reviews, they reached different conclusions concerning the likely validity of non-randomised data. This disagreement mainly reflects weaknesses in the meta-epidemiological methodology that they all used: most notably, it could neither account for confounding factors in the comparisons between randomised and non-randomised studies nor detect anything other than systematic bias.

We identified 194 tools that could be used to assess the quality of non-randomised studies (Chapter 4). Overall the tools were poorly developed: the majority did not provide a means of assessing the internal validity of non-randomised studies, and almost no attention was paid to the principles of scale development and evaluation. However, 14 tools were identified that included items related to each of our pre-specified core internal validity criteria, which covered assessment of the allocation method, attempts to achieve comparability by design, identification of important prognostic factors and adjustment for differences in case-mix. Six of the 14 tools were considered potentially suitable for use as quality assessment tools in systematic reviews, but all require some modification to meet all of our pre-specified criteria.

Of the 511 systematic reviews we identified that included non-randomised studies, only 169 (33%) assessed study quality, and only 46% of these reported the results of the quality assessment for each study (Chapter 5). This is lower than the rate of quality assessment in systematic reviews of randomised controlled trials.131 Among the reviews that did assess study quality, a wide variety of quality assessment tools were used; some were designed only for evaluating RCTs, and many were designed by the review authors themselves. Most reviews (88%) did not assess key quality criteria of particular importance for the assessment of non-randomised studies. Sixty-nine reviews (41%) investigated the impact of quality on study results in a quantitative manner. These analyses showed no consistent pattern in the way that study quality relates to treatment effects, and they were confounded by the inclusion of a variety of study designs and studies of variable quality.

A unique 'resampling' method was used to generate multiple unconfounded comparisons between RCTs and historically controlled and concurrently controlled studies (Chapter 6).
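To make the resampling idea concrete, the following is a minimal sketch, not the report's actual procedure: it recombines arms from simulated RCTs so that a 'historically controlled' comparison can be checked against the corresponding randomised one. Every function name, parameter and number below is an invented assumption.

    # Hedged sketch: resampling trial arms to mimic historically
    # controlled studies. All values are illustrative assumptions.
    import random

    random.seed(1)
    TRUE_EFFECT = -0.10   # assumed: treatment cuts event risk by 10 points

    def simulate_trial(year, n=200):
        # One RCT; baseline event risk drifts over time (a time trend
        # in case-mix).
        baseline = 0.40 - 0.02 * (year - 2000)
        control = sum(random.random() < baseline for _ in range(n)) / n
        treated = sum(random.random() < baseline + TRUE_EFFECT
                      for _ in range(n)) / n
        return {"year": year, "control": control, "treated": treated}

    trials = [simulate_trial(y) for y in range(2000, 2010)]

    # Randomised comparison: treated vs control arms of the same trial.
    rct = [t["treated"] - t["control"] for t in trials]

    # 'Historically controlled' comparison: the treated arm of a later
    # trial vs the control arm of an earlier one, so case-mix differs.
    hist = [later["treated"] - earlier["control"]
            for earlier, later in zip(trials, trials[2:])]

    print("true effect:              %+.3f" % TRUE_EFFECT)
    print("mean RCT estimate:        %+.3f" % (sum(rct) / len(rct)))
    print("mean historical estimate: %+.3f" % (sum(hist) / len(hist)))

Because the assumed baseline risk drifts downwards over time, the historical comparisons systematically exaggerate the treatment benefit, while the within-trial comparisons scatter around the true effect.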
These empirical investigations identified two characteristics of the bias introduced by using non-random allocation. First, the use of historical controls can lead to systematic overestimation or underestimation of treatment effects, the direction of the bias depending on time trends in the case-mix of participants recruited to the study. In the studies used for the analyses, these time trends varied between study regions and were therefore difficult to predict. Second, the results of both study designs varied beyond what was expected from chance. In a very large sample of studies the biases causing this increased unpredictability cancelled each other out on average, but in individual studies the bias could be fairly large and could act in either direction. These biases again relate to differences in case-mix, but the differences are neither systematic nor predictable.

Four commonly used methods of dealing with variations in case-mix were identified: (i) discarding comparisons between groups which differ in their baseline characteristics, (ii) regression modelling, (iii) propensity score methods and (iv) stratified analyses (Chapter 7). The methods were applied to the historically and concurrently controlled studies generated in Chapter 6, and also to studies designed to mimic 'allocation by indication'. None of the methods successfully removed bias in
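As a hedged illustration of method (iii), the sketch below stratifies a simulated 'allocation by indication' data set on an estimated propensity score. It is not the report's analysis; it assumes numpy and scikit-learn are available, and every variable name and coefficient is invented.

    # Hedged sketch of propensity-score stratification (method iii); not
    # the report's analysis. All values are illustrative assumptions.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    severity = rng.normal(size=n)   # confounder (case-mix)
    # 'Allocation by indication': sicker patients are likelier to be treated.
    treated = rng.random(n) < 1 / (1 + np.exp(-1.5 * severity))
    # Outcome worsens with severity; the true treatment effect is +1.0.
    outcome = 1.0 * treated + 2.0 * severity + rng.normal(size=n)

    naive = outcome[treated].mean() - outcome[~treated].mean()

    # Estimate propensity scores and cut them into quintiles.
    X = severity.reshape(-1, 1)
    ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]
    strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))

    effects = []
    for s in range(5):
        m = strata == s
        if treated[m].any() and (~treated)[m].any():
            effects.append(outcome[m & treated].mean()
                           - outcome[m & ~treated].mean())

    print(f"naive difference:      {naive:.2f}")          # inflated by case-mix
    print(f"stratified difference: {np.mean(effects):.2f}")  # closer to 1.0

Because severity still varies within each stratum, the stratified estimate moves towards, but does not exactly recover, the true effect, in keeping with the finding that none of the methods removed bias completely.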
