VALIDITY AND RELIABILITY IN TESTS

… in question. Content validity is achieved by making professional judgements about the relevance and sampling of the contents of the test to a particular domain. It is concerned with coverage and representativeness rather than with patterns of response or scores. It is a matter of judgement rather than measurement (Kerlinger 1986). Content validity will need to ensure several features of a test (Wolf 1994): (a) test coverage (the extent to which the test covers the relevant field); (b) test relevance (the extent to which the test items are taught through, or are relevant to, a particular programme); (c) programme coverage (the extent to which the programme covers the overall field in question).

Criterion-related validity is where a high correlation coefficient exists between the scores on the test and the scores on other accepted tests of the same performance: this is achieved by comparing the scores on the test with one or more variables (criteria) from other measures or tests that are considered to measure the same factor. Wolf (1994) argues that a major problem facing test devisers addressing criterion-related validity is the selection of a suitable criterion measure. He cites the example of the difficulty of selecting a suitable criterion of academic achievement in a test of academic aptitude. The criterion must be: relevant (and agreed to be relevant); free from bias (i.e. where external factors that might contaminate the criterion are removed); reliable, i.e. precise and accurate; and capable of being measured or achieved.
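
To make the correlational logic concrete, the sketch below computes Pearson's r between scores on a new test and scores on an accepted criterion measure. It is a minimal illustration only: the score lists and variable names are invented, and nothing here is drawn from the authors' own procedure.

```python
# Minimal sketch of checking criterion-related validity via correlation.
# Both score lists are hypothetical, invented purely for illustration.
from statistics import correlation  # Pearson's r; available in Python 3.10+

new_test_scores = [52, 61, 70, 68, 75, 80, 58, 66, 73, 77]   # scores on the new test
criterion_scores = [50, 65, 72, 64, 78, 82, 55, 70, 71, 80]  # scores on an accepted test

r = correlation(new_test_scores, criterion_scores)
print(f"Pearson r between test and criterion: {r:.2f}")

# A high positive r would count as evidence of criterion-related validity;
# a low or negative r would cast doubt on it. What counts as 'high' is a
# judgement for the researcher, not a fixed threshold.
```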

Construct validity (e.g. the clear relatedness of a test item to its proposed construct/unobservable quality or trait, demonstrated by both empirical data and logical analysis and debate, i.e. the extent to which particular constructs or concepts can account for performance on the test): this is achieved by ensuring that performance on the test is fairly explained by particular appropriate constructs or concepts. As with content validity, it is not based on test scores, but is more a matter of whether the test items are indicators of the underlying, latent construct in question.

In this respect construct validity also subsumes content and criterion-related validity. It is argued (Loevinger 1957) that, in fact, construct validity is the queen of the types of validity because it is subsumptive and because it concerns constructs or explanations rather than methodological factors. Construct validity is threatened by under-representation of the construct, i.e. the test is too narrow and neglects significant facets of the construct, and by the inclusion of irrelevancies (excess reliable variance).
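
One simple empirical check on whether test items behave as indicators of a single latent construct is the corrected item-total correlation. The sketch below is a hedged illustration with an invented response matrix; it is not the procedure described in the text, only one conventional starting point.

```python
# Corrected item-total correlations: a crude first check that items
# hang together as indicators of one construct.
# The response matrix is hypothetical, invented purely for illustration.
from statistics import correlation

# rows = testees, columns = items (scores on an invented 0-5 scale)
responses = [
    [4, 5, 3, 4],
    [2, 1, 2, 1],
    [5, 4, 4, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]

for item in range(len(responses[0])):
    item_scores = [row[item] for row in responses]
    # Total of the *other* items, so an item is not correlated with itself.
    rest_totals = [sum(row) - row[item] for row in responses]
    r = correlation(item_scores, rest_totals)
    print(f"item {item + 1}: corrected item-total r = {r:.2f}")

# Consistently positive correlations are weak supporting evidence that the
# items tap a common construct; a low or negative r flags an item whose
# relatedness to the construct needs logical analysis and debate.
```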

Concurrent validity is where the results of the test concur with results on other tests or instruments that are testing/assessing the same construct/performance: it is similar to predictive validity but without the time dimension. Concurrent validity can be established with another instrument administered at the same time, rather than after some time has elapsed.

Face validity is where, superficially, the test appears, at face value, to test what it is designed to test.

Jury validity is an important element in construct validity, where agreement is needed on the conceptions and operationalization of an unobservable construct.

Predictive validity is where results on a test accurately predict subsequent performance; it is akin to criterion-related validity.

Consequential validity is where the inferences that can be made from a test are sound.

Systemic validity (Frederiksen and Collins 1989) is where programme activities both enhance test performance and enhance performance of the construct that is being addressed in the objective. Cunningham (1998) gives an example of systemic validity: if the test and the objective of vocabulary performance lead testees to increase their vocabulary, then systemic validity has been addressed.

To ensure test validity, then, the test must demonstrate fitness for purpose as well as addressing the several types of validity outlined above. The most difficult for researchers to address, perhaps, is construct validity, for it argues for agreement on the definition and …
