Evaluating Patient-Based Outcome Measures - NIHR Health ...
Evaluating Patient-Based Outcome Measures - NIHR Health ...
Evaluating Patient-Based Outcome Measures - NIHR Health ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
42<br />
Criteria for selecting a patient-based outcome measure<br />
instruments should be reduced in terms of<br />
length and number of items in order to increase<br />
acceptability (Burisch, 1984). Some comparative<br />
studies have shown no loss of responsiveness when<br />
such shorter instruments are used (Fitzpatrick<br />
et al., 1989; Katz et al., 1992). Thus, the SF-12 has<br />
emerged as a shorter version of the SF-36, considered<br />
to require only 2 minutes or less to complete<br />
whilst reproducing more than 90% of the<br />
variance in SF-36 scores in the general population<br />
(Ware et al., 1996). Similarly the SIP has undergone<br />
a number of attempts to reduce it from its full<br />
136-item version (McDowell and Newell, 1996).<br />
Whilst there are good reasons in particular<br />
circumstances to prefer shortened versions of<br />
instruments, attention is needed to how an instrument<br />
has been shortened. A recent structured<br />
review of 42 studies intended to shorten longer<br />
original measures found that the majority used<br />
statistical methods alone to achieve this objective,<br />
typically relying on correlations of shorter with<br />
longer versions in the same data, or methods to<br />
maximise internal consistency of the shorter<br />
version (Cronbach’s alpha) (Coste et al., 1997).<br />
The authors argue that, whilst there are obvious<br />
advantages in shortening a well developed and<br />
widely validated longer instrument, there are also<br />
methodological pitfalls. In particular, selection of<br />
items on the basis of internal consistency will<br />
further narrow the scale. They argue that the<br />
psychometric properties of the short version<br />
need to be examined as if it is a new instrument.<br />
Properties such as precision, as discussed in an<br />
earlier section of this review may also be jeopardised<br />
by an instrument with fewer items.<br />
Direct assessment of acceptability<br />
It is preferable directly to assess patients’ views<br />
about a new questionnaire. Sprangers and<br />
colleagues (1993) argue that patients’ views<br />
should be obtained at the pre-testing phase<br />
prior to formal tests for reliability etc., by means<br />
of a structured interview in which they are asked<br />
whether they found any questionnaire items<br />
difficult annoying or distressing or whether issues<br />
were omitted. When the EORTC QLQ-C30 was<br />
assessed in this way, 10% of patients reported that<br />
one or more items were confusing or difficult to<br />
answer and less than 3% that an item was upsetting,<br />
whilst more generally patients welcomed the opportunity<br />
to report their experiences (Aaronson et al.,<br />
1993). Another formal evaluation of acceptability<br />
of a questionnaire found 89% enjoyed the task<br />
of completing the COOP instrument and 97%<br />
reported understanding the questions (Nelson<br />
et al., 1990).<br />
Weinberger and colleagues (1996) directly asked<br />
patients for their preferences for different forms<br />
of administration of the SF-36. Far more positive<br />
preference was expressed for face-to-face interview<br />
compared with either self complete or telephone<br />
based administration.<br />
Not all studies of patient acceptability of instruments<br />
are positive. In a qualitative study of<br />
patients’ views of a QoL assessment that included<br />
an early form of EORTC questionnaire, patients<br />
complained about length, difficulties in understanding<br />
the format of the questionnaire and<br />
possible risks that their answers would influence<br />
subsequent treatment decisions (Bernhard<br />
et al., 1995).<br />
In general, users should expect to see evidence<br />
of acceptability being examined at the design<br />
stage. Subsequently, the most direct and easy to<br />
assess evidence is the length and response rates<br />
of questionnaires.<br />
Translation and cultural applicability<br />
One basic way in which a questionnaire may fail<br />
to be acceptable is if it is expressed in a language<br />
unfamiliar to respondents. This issue has received<br />
a large amount of attention in recent literature<br />
on patient-based outcomes, mainly because of the<br />
increasing need for clinical trials incorporating<br />
QoL measures to be conducted on a multi-national<br />
basis, especially in Europe (Kuyken et al., 1994;<br />
Orley and Kuyken, 1994; Shumaker and Berzon,<br />
1995). As a result, there are quite elaborate<br />
guidelines available intended to ensure high standards<br />
of translation of questionnaires (Bullinger<br />
et al., 1995; Leplege and Verdier, 1995). Amongst<br />
procedures to improve translation, according to<br />
such guidelines, are: use of several independent<br />
translations that are compared; back-translation;<br />
testing of the acceptability of translations to<br />
respondents. Less attention has been paid to<br />
cultural and linguistic variations within national<br />
boundaries, but it would seem that similar<br />
principles could be applied to increase cultural<br />
applicability. Presently, few patient-based outcome<br />
measures have been translated into the languages<br />
of ethnic minorities in the UK.<br />
An important issue is whether rigorous translation<br />
can by itself establish the appropriateness of an<br />
instrument to a new cultural context from the<br />
one in which it was developed. Such methods may<br />
not establish whether subjective experiences differ<br />
in terms of salience from one culture to another,<br />
or indeed may fail to identify concerns and experiences<br />
not anticipated in the culture in which an