
One hundred and eighty of the most promising situations were then chosen based on their content (e.g., appropriately difficult, realistic, etc.) and the number of plausible response alternatives available. For each of these 180 situations retained, information concerning the effectiveness of the various response alternatives was collected from two groups: a group of expert NCOs and a group of NCO job incumbents from the target population. The expert NCOs were 90 students and instructors at the United States Army Sergeants Major Academy. These NCOs were among the highest ranking enlisted soldiers in the Army (rank of E-8 to E-9), and all had extensive experience as supervisors in the Army. The target NCOs were 344 second-tour soldiers (rank of E-4 to E-5) who were participating in a field test of a group of job performance measures at several Army posts in the United States and Europe. For each SJT situation, these respondents were asked to rate the effectiveness of each response alternative on a seven-point scale (1 = least and 7 = most effective). Because 180 situations remained and testing time was limited, each soldier responded to only a subset of the situations. This resulted in about 25 expert NCO and 45 incumbent NCO responses per situation.
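As a concrete illustration of how these ratings feed the later steps, a minimal aggregation sketch follows; the record layout, identifiers, and function names are illustrative assumptions, not materials from the study.

```python
from collections import defaultdict
from statistics import mean

# Each record: (situation_id, alternative_id, rater_group, rating on the 1-7 scale).
# rater_group is "expert" (Sergeants Major Academy NCOs) or "incumbent"
# (second-tour E-4/E-5 soldiers). This layout is an assumption for illustration.
ratings = [
    (12, "a", "expert", 6), (12, "a", "expert", 7), (12, "a", "incumbent", 5),
    (12, "b", "expert", 2), (12, "b", "expert", 3), (12, "b", "incumbent", 4),
]

def mean_effectiveness(records):
    """Mean effectiveness rating for each (situation, alternative, rater group)."""
    grouped = defaultdict(list)
    for situation, alternative, group, rating in records:
        grouped[(situation, alternative, group)].append(rating)
    return {key: mean(values) for key, values in grouped.items()}

print(mean_effectiveness(ratings))
# e.g. {(12, 'a', 'expert'): 6.5, (12, 'a', 'incumbent'): 5, ...}
```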

Items (situations) for the field test version of the SJT and response alternatives for these items were then selected based on these data. The following criteria were used to select 35 of these situations and three to five response alternatives for each situation: 1) the expert group had high agreement concerning the most effective response for the item; 2) the item was difficult for the incumbents (i.e., agreement was substantially lower than for the expert group); 3) the difference between the expert and the incumbent responses for each situation was judged to reflect an important aspect of supervisory knowledge; and 4) the content of the final group of situations was as representative as possible of the first-line supervisory job in the Army.
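The first two criteria are the quantitative screens in this list. The paper does not specify the agreement index that was used, so the sketch below illustrates only one plausible implementation: taking each rater's top-rated alternative as a vote and comparing expert and incumbent agreement. The cut-off values and function names are assumptions for illustration only.

```python
from collections import Counter

def agreement(top_choices):
    """Share of raters whose top-rated alternative matches the modal choice."""
    counts = Counter(top_choices)
    return counts.most_common(1)[0][1] / len(top_choices)

def passes_screen(expert_top, incumbent_top,
                  min_expert_agreement=0.70, min_gap=0.20):
    """Criterion 1: experts agree on the most effective response.
    Criterion 2: incumbent agreement is substantially lower (the item is hard).
    The cut-offs here are hypothetical; criteria 3 and 4 were judgment calls."""
    expert_agree = agreement(expert_top)
    incumbent_agree = agreement(incumbent_top)
    return expert_agree >= min_expert_agreement and \
        (expert_agree - incumbent_agree) >= min_gap

# Made-up top choices for one situation: 25 expert and 45 incumbent raters.
experts = ["b"] * 20 + ["a"] * 3 + ["c"] * 2
incumbents = ["b"] * 18 + ["a"] * 15 + ["c"] * 12
print(passes_screen(experts, incumbents))  # True under these illustrative cut-offs
```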

Field Test of the SJT

The field test of the SJT had three major objectives. The first objective was to explore different methods of scoring the SJT. The second objective was to examine and evaluate the psychometric properties of this instrument. The final objective was to obtain preliminary information concerning the construct validity of the SJT as a criterion measure of supervisory job knowledge.

The SJT was administered as part of a larger data collection effort to a sample of 1049 NCOs (most were E-4s and E-5s) at a variety of posts in the United States and Europe. For each of the 35 SJT items, these soldiers were asked to place an "M" next to the response alternative they thought was the most effective and an "L" next to the response alternative they thought was the least effective.

Scoring Procedures. Several different procedures for scoring the SJT were explored. The most straightforward was a simple number-correct score. For each item, the response alternative that had been given the highest mean effectiveness rating by the experts (senior NCOs) was designated the "correct" answer. Respondents were then scored based on the number of items for which they indicated that this "correct" response alternative was the most effective. The second scoring procedure involved weighting each response alternative chosen by soldiers as the most effective by the mean effectiveness rating given to that response alternative by the expert group. This gives respondents more credit for choosing "wrong" answers that are relatively effective than for choosing wrong answers that are very ineffective. These item-level effectiveness scores were then averaged to obtain an overall effectiveness score for each soldier. Averaging these item-level scores instead of simply summing them placed respondents' scores on the same 1 to 7 effectiveness scale as the experts' ratings and ensured that respondents were not penalized for any missing data (up to 10% missing responses were allowed).
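The two procedures described so far amount to a keyed count and a key-weighted average. The following sketch shows both, including the averaging step that keeps scores on the experts' 1-to-7 scale and tolerates a small amount of missing data; the data structures, the handling of the 10% missing-data rule, and the function names are illustrative assumptions rather than the study's actual scoring procedure.

```python
def m_correct(responses, keyed_best):
    """Number-correct score: count of items where the respondent's "M" choice
    matches the alternative the experts rated most effective."""
    return sum(
        1 for item, choice in responses.items()
        if choice is not None and choice == keyed_best[item]
    )

def mean_effectiveness_score(responses, expert_means, n_items, max_missing=0.10):
    """Weighted score: each "M" choice earns the expert mean effectiveness rating
    of the chosen alternative; item scores are averaged rather than summed, so the
    result stays on the 1-7 scale and missing items carry no penalty."""
    answered = {item: c for item, c in responses.items() if c is not None}
    if n_items - len(answered) > max_missing * n_items:
        return None  # more than 10% missing; treatment of such cases is assumed here
    item_scores = [expert_means[item][choice] for item, choice in answered.items()]
    return sum(item_scores) / len(item_scores)

# Toy key and expert mean ratings for three items (made up for illustration).
keyed_best = {1: "b", 2: "a", 3: "c"}
expert_means = {1: {"a": 2.1, "b": 6.4}, 2: {"a": 5.9, "b": 3.0}, 3: {"a": 1.8, "c": 6.7}}
responses = {1: "b", 2: "b", 3: "c"}
print(m_correct(responses, keyed_best))                              # 2
print(mean_effectiveness_score(responses, expert_means, n_items=3))  # about 5.37
```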

Scoring procedures based on respondents' choices for the least effective response to each situation were also explored. The ability to identify the least effective response alternatives might be seen as an indication of respondents' ability to avoid these very ineffective responses, or in effect to avoid "screwing up." As with the choices for the most effective response, a simple number-correct score was computed: the number of times each respondent correctly identified the response alternative that the experts rated the least effective. In order to differentiate this score from the number-correct score based on choices for the most effective response, this score will be referred to as the L-Correct score, and the score based on choices for the most effective response (described previously) will be referred to as the M-Correct score. Another score was computed by weighting respondents' choices for the least effective response alterna-

