29.06.2013 Views

Evaluating Patient-Based Outcome Measures - NIHR Health ...

Evaluating Patient-Based Outcome Measures - NIHR Health ...

Evaluating Patient-Based Outcome Measures - NIHR Health ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

30<br />

Criteria for selecting a patient-based outcome measure<br />

considered small, 0.5 as medium and 0.8 or greater<br />

as large (Cohen, 1977, Kazis et al., 1989).<br />

Standardised response mean<br />

An alternative measure is the SRM. It only differs<br />

from an effect size in that the denominator is the<br />

standard deviation of change scores in the group in<br />

order to take account of variability in change rather<br />

than baseline scores (Liang et al., 1990). Because<br />

the denominator in the SRM examines response<br />

variance in an instrument whereas the effect size<br />

does not, Katz and colleagues (1992) consider<br />

that the SRM approach is more informative.<br />

Modified standardised response mean<br />

A third method of providing a standardised<br />

expression of responsiveness is that of Guyatt and<br />

colleagues (1987b). As with effect sizes and SRMs,<br />

the numerator of this statistic is the mean change<br />

score for a group. In this case, the denominator is<br />

the standard deviation of change scores for individuals<br />

who are identified by other means as stable<br />

(Tuley et al., 1991). This denominator provides an<br />

expression of the inherent variability of changes in<br />

an instrument, ‘an intuitive estimate of background<br />

noise’ (Liang, 1995). Unlike the two other expressions,<br />

this method requires independent evidence<br />

that patients are indeed stable, in the form of a<br />

transition question asked at follow-up. MacKenzie<br />

and colleagues (1986b) used a transition index<br />

and the modified SRM to test the responsiveness<br />

of the SIP.<br />

Relative efficiency<br />

Another approach is to compare the responsiveness<br />

of health status instruments when used in studies<br />

of treatments widely considered to be effective, so<br />

that it is very likely that significant changes actually<br />

occur. As applied by Liang and colleagues (1985)<br />

who developed this approach, the performance<br />

of different health status instruments is compared<br />

to a standard instrument amongst patients who<br />

are considered to have experienced substantial<br />

change. Thus they asked patients to complete a<br />

number of health status questionnaires before<br />

and after total joint replacement surgery. <strong>Health</strong><br />

status questionnaires were considered most<br />

responsive that produced the largest paired t-test<br />

score for pre and post surgical assessments. Liang<br />

and colleagues (1985) produce a standardised<br />

version of the use of t-statistics (termed ‘relative<br />

efficiency’), the square of the ratio of t-statistic<br />

for two instruments being compared. As noted<br />

earlier, much of the data in this field is nonparametric,<br />

so that t statistics are not appropriate<br />

and non-parametric forms of relative efficiency<br />

need to be used.<br />

Sensitivity and specificity of<br />

change scores<br />

Another approach to assessing responsiveness<br />

is to consider the change scores of a health status<br />

instrument as if they were a screening test to<br />

detect true change (Deyo and Inui, 1984). In<br />

other words, data for a health status measure<br />

are examined in terms of the sensitivity of change<br />

scores produced by an instrument (proportion<br />

of true changes detected) and specificity of<br />

change scores (proportion of individuals who<br />

are truly not changing, detected as stable). This<br />

method requires that investigators identify<br />

somewhat arbitrarily a specific change score<br />

that will be taken as of interest to examine, say an<br />

improvement of five points between observations.<br />

The sample is then divided according to whether<br />

or not they have reported five points improvement<br />

or not. Some external standard is also needed to<br />

determine ‘true’ change; commonly it is a<br />

consensus of the patient’s and or clinician’s<br />

retrospective judgement of change (Deyo and<br />

Inui, 1984). Essentially, the extent of agreement<br />

is then examined between change as defined in<br />

this hypothetical case as more than five points<br />

change and independent evidence of whether<br />

individuals have changed. The same analysis is<br />

then provided for specificity. Scores of less than<br />

five points are counted as ‘unchanged’. Individuals<br />

scores are then examined to determine how much<br />

this classification of individuals as unchanged<br />

agrees with independent evidence<br />

that they have not changed.<br />

This approach permits an assessment of the<br />

sensitivity and specificity of different amounts of<br />

change registered by an instrument. For example,<br />

an instrument which, for the sake of simplicity,<br />

has five possible scores (1–5), when applied on<br />

two occasions over time, may either produce<br />

identical scores on both occasions or register<br />

up to four points of change (for example a<br />

change from ‘5’ to ‘1’). If one has an external<br />

standard of whether change truly occurred, such<br />

as a consensus of patient’s and doctor’s opinion,<br />

one can assess how sensitive and specific to the<br />

external evidence of change are changes of one<br />

point, two points etc. As sensitivity improves,<br />

specificity may deteriorate.<br />

Receiver-operating characteristics<br />

Deyo and Centor (1986) extend the principle<br />

of measuring the sensitivity and specificity of<br />

an instrument against an external criterion by<br />

suggesting that information be synthesised into<br />

receiver-operating characteristics. Like the simpler<br />

version just described in the previous section, this

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!