International Military Testing Association

The general knowledge test consisted of 98 items. It was not separated by task or functional area, and in administration and analysis was treated as a single test.

Test Administration

The field tests were administered to 61 Radiomen, all of whom were graduates of the Class A Radioman School, were in paygrades E-2, E-3, and E-4, and had graduated from the School between 1 and 59 months prior to testing. (Of the tested population, 79% were in paygrade E-3 and 60% were in the 12 months to 35 months experience window.) Twenty-eight of the 61 sailors tested were assigned to shore installations at the time of testing and 33 were aboard ships.

Testing was conducted at two locations, about a month apart. Testing lasted 8 hours for each examinee, and the three components of the test were sequentially counterbalanced. Five hands-on scorers were used. All scorers were project staff and had received extensive task/test training and calibration. Each Radioman was scored independently by at least two scorers on each hands-on test.

Field Test Results

Although a wide variety of analyses were conducted (Ford, Doyle, Schultz, & Hoffman, 1987), this paper will focus on four main areas of interest. Specifically:

• Interrater reliability of the hands-on tests.
• Internal consistency within test methods.
• Intercorrelations among test methods.
• Assignment effect (ship vs. shore).

Interrater Reliability of the Hands-On Tests

Interrater reliability estimates were computed within a generalizability theory framework in which absolute generalizability coefficients were produced (SAS, 1982; Brennan, Jarjoura, & Deaton, 1980). Generalizability estimates were obtained both as if only one rater score were available and for the average of the two raters, as shown in Table 2.

The reliabilities are exceptionally high. This is attributed, first, to the firm control over the scorers that was possible because they were members of the project staff and, second, to the high incidence of product scoring among the tested tasks.
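The absolute generalizability coefficient described above can be illustrated with a short sketch. For a fully crossed persons x raters design, variance components for persons, raters, and the residual are estimated from the ANOVA mean squares, and the absolute coefficient treats both the rater main effect and the interaction as error. The function below is a minimal illustration under those standard G-theory formulas, not the authors' actual SAS/GENOVA analysis; the example scores are hypothetical.

```python
import numpy as np

def absolute_g_coefficient(scores, k):
    """Absolute (phi) generalizability coefficient for a fully crossed
    persons x raters design, generalizing to the average of k raters.

    scores: 2-D array, rows = persons (examinees), columns = raters.
    """
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # Sums of squares for the two main effects and the residual.
    ss_p = n_r * np.sum((person_means - grand) ** 2)
    ss_r = n_p * np.sum((rater_means - grand) ** 2)
    ss_total = np.sum((scores - grand) ** 2)
    ss_pr = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Variance components (negative estimates truncated at zero).
    var_p = max((ms_p - ms_pr) / n_r, 0.0)
    var_r = max((ms_r - ms_pr) / n_p, 0.0)
    var_pr = max(ms_pr, 0.0)

    # Absolute error includes the rater main effect and the residual.
    return var_p / (var_p + (var_r + var_pr) / k)

# Hypothetical two-rater scores for five examinees.
scores = np.array([[4.0, 5.0], [2.0, 2.0], [5.0, 4.0], [1.0, 2.0], [3.0, 3.0]])
phi_one = absolute_g_coefficient(scores, 1)  # single-rater coefficient
phi_two = absolute_g_coefficient(scores, 2)  # average-of-two-raters coefficient
```

Computing the coefficient at k = 1 and k = 2 mirrors the paper's reporting of estimates "as if only one rater score were produced" and for the average of the two raters; averaging raters always yields the higher value.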

Internal Consistency Within Test Methods

Intertask correlations were computed for the hands-on, written, and general knowledge tests (the general knowledge test was analyzed for interitem correlations, since it was treated as a single test) and are presented in Table 3. The obtained coefficients demonstrate acceptable levels of internal consistency.
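The two internal-consistency summaries mentioned above can be sketched as follows: the mean off-diagonal intertask (or interitem) correlation, and Cronbach's alpha as an overall internal-consistency index for a test treated as a single scale. This is an illustrative computation on hypothetical score matrices, not a reproduction of the Table 3 analyses.

```python
import numpy as np

def mean_intercorrelation(scores):
    """Mean off-diagonal Pearson correlation among tasks or items.

    scores: 2-D array, rows = examinees, columns = task/item scores.
    """
    r = np.corrcoef(scores, rowvar=False)
    n = r.shape[0]
    return r[~np.eye(n, dtype=bool)].mean()

def cronbach_alpha(scores):
    """Cronbach's alpha for a set of items treated as a single test."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Hypothetical two-task scores for five examinees.
tasks = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 5.0], [5.0, 4.0]])
r_bar = mean_intercorrelation(tasks)  # mean intertask correlation
alpha = cronbach_alpha(tasks)         # internal-consistency estimate
```

For a two-task test, alpha equals the Spearman-Brown stepped-up value of the single intertask correlation, which is why the two summaries tell a consistent story about internal consistency.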

