25.11.2014 Views

Developmental psychology.pdf



Tests and Measurement 357

Figure 13.11
Scattergram of Reliability. In this test-retest reliability, the results of two administrations of the same test to the same subjects show a high correlation.
[Scattergram omitted. Horizontal axis: first test administration, scaled 0 to 150 in steps of 25; the vertical axis uses the same scale.]

Whenever a test is evaluated for reliability, the same procedures must be employed on all occasions, and then the sets of scores are compared. The relationship between each subject's scores, or each judge's scores, is expressed as a degree of correlation or agreement (Figure 13.11).
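The degree of correlation between two administrations can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the six pairs of scores are hypothetical and do not come from Figure 13.11, and the Pearson coefficient is assumed here as the measure of correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six subjects on two administrations of the same test.
first_administration = [52, 75, 88, 101, 120, 143]
second_administration = [58, 71, 92, 97, 125, 139]

test_retest_r = pearson_r(first_administration, second_administration)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

Subjects who score high on the first occasion also score high on the second, so the coefficient for these invented scores comes out close to 1.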

Reliability Coefficients. Perfect agreement is seldom found, but in general the more extensive the test, the higher its consistency. As more test items are included, it becomes less likely that the final score will be disrupted by one or two items that are unsuccessful or misunderstood. However, as the time interval between testings increases, reliability declines, for the subjects' responses may be influenced by changes in mood, setting, age, and cultural factors.
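The link between test length and consistency can be made concrete with the Spearman-Brown formula. The text does not name this formula, so it is brought in here as an assumption; it also assumes that the added items are comparable in quality to the existing ones. The function name `spearman_brown` and the starting reliability of .60 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after multiplying test length by length_factor.

    Assumes the added items are comparable to the existing ones.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical test with a reliability of .60, then doubled and tripled in length.
original = 0.60
doubled = spearman_brown(original, 2)
tripled = spearman_brown(original, 3)
print(f"original {original:.2f}, doubled {doubled:.2f}, tripled {tripled:.2f}")
```

Under these assumptions each increase in length raises the predicted coefficient, but by a smaller amount each time.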

A numerical measure of a test's consistency, or reliability, is given by a reliability coefficient, which may range from .00 to ±1.00. A coefficient of zero means that the test is completely unreliable, giving a totally unpredictable result on each occasion. A coefficient of one is equally improbable, for it means that the test gives exactly the same result every time it is administered to the same subject. In practice, a worthwhile test is highly consistent but not perfect.

Reliability coefficients for tests of mechanical aptitude are commonly in the vicinity of .90. Coefficients for personality tests are lower because the results are more likely to be influenced by temporary states, such as mood. Tests of perceptual and motor skills for Air Force personnel have shown a range of reliability coefficients from .64 to .95 (Hunter, 1975; Figure 13.12).

Test                     Coefficient
Kinesthesis              .93
Perceptual speed         .84
Stress reactions         .92
Associative learning     .64
Memory                   .95
Concept identification   .81
Divided attention        .95

Figure 13.12
Reliability Coefficients. These coefficients indicate the reliability of a series of tests of perceptual and motor skills (Hunter, 1975).
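The coefficients in Figure 13.12 can be checked against the range quoted in the text with a short Python sketch. The values are those listed in the figure, paired with the tests in the order given; the variable names are hypothetical.

```python
# Reliability coefficients as listed in Figure 13.12 (Hunter, 1975).
coefficients = {
    "Kinesthesis": 0.93,
    "Perceptual speed": 0.84,
    "Stress reactions": 0.92,
    "Associative learning": 0.64,
    "Memory": 0.95,
    "Concept identification": 0.81,
    "Divided attention": 0.95,
}

lowest_value = min(coefficients.values())
highest_value = max(coefficients.values())

# More than one test can share the lowest or highest coefficient.
least_reliable = [name for name, r in coefficients.items() if r == lowest_value]
most_reliable = [name for name, r in coefficients.items() if r == highest_value]

print(f"range: {lowest_value:.2f} to {highest_value:.2f}")
print("least reliable:", ", ".join(least_reliable))
print("most reliable:", ", ".join(most_reliable))
```

The lowest and highest values match the range of .64 to .95 stated in the text.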

Assessment of Validity<br />

Perhaps Captain Holmgren, approaching some turbulence, has asked the passengers to obey the no-smoking and seat-belt signs. In any case, we now encounter some special difficulty ourselves, and it lies with the issue of validity. It is one thing for a test to be reliable, giving the same result over and over again. It is another for the test to be valid. A test that has validity measures the quality that it is intended to measure. It is a potentially successful test. This issue of validity is the most crucial in the entire field of psychological testing and the most difficult to assess satisfactorily.

A reliable test might or might not be valid, depending upon whether it measures the intended trait. A test of running speed may give a highly reliable result, but it is not a valid measure of flying ability. A test that has high validity, accurately measuring flying ability, must be reliable, giving the same result from one time to the next, unless the characteristic being measured itself changes. Certain flight-simulator tests, for example, have been shown to have high validity as pilot-selection devices (Fowler, 1981).
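The difference between a reliable test and a valid one can be illustrated with a small Python sketch. All of the numbers are invented for the purpose: six hypothetical trainees take a running-speed test twice, and their scores are compared with a hypothetical rating of flying ability. The helper `pearson_r` is likewise an assumption, using the Pearson coefficient as the measure of correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six trainees.
running_first = [10, 12, 14, 16, 18, 20]    # running-speed test, first occasion
running_second = [11, 12, 15, 15, 19, 20]   # same test, second occasion
flying_rating = [70, 55, 80, 60, 75, 62]    # separate rating of flying ability

# Reliability: does the running test agree with itself across occasions?
reliability = pearson_r(running_first, running_second)

# Validity as a measure of flying ability: does the running test
# agree with the flying rating?
validity = pearson_r(running_first, flying_rating)

print(f"reliability of running test: {reliability:.2f}")
print(f"validity for flying ability: {validity:.2f}")
```

With these invented scores the running test agrees closely with itself but shows no relation to the flying rating, which is the pattern the text describes for a reliable test that is not valid for the trait of interest.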
