09.12.2012 Views

I__. - International Military Testing Association


tive by the mean effectiveness rating for that response, and then averaging these item level scores to
obtain an overall effectiveness score based on choices for the least effective response alternative. This
score will be referred to as L-Effectiveness, and the parallel score based on choices for the most effective
responses (described previously) will be referred to as M-Effectiveness.

Finally, a scoring procedure that involved combining the choices for the most and the least effective
response alternative into one overall score was also explored. For each item, the mean effectiveness of
the response alternative each soldier chose as the least effective was subtracted from the mean effectiveness
of the response alternative they chose as the most effective. Because it is actually better if
respondents indicate that less effective response alternatives are the least effective, this score can be seen
as a sum or composite of the two effectiveness scores described previously (i.e., subtracting a negative
number from a positive number is the same as adding the absolute values of the two numbers). These
item level scores were then averaged together for each soldier to generate yet another score, and this
score will be referred to as M-L Effectiveness.
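The three effectiveness scores described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rating table `MEAN_EFFECTIVENESS`, the item and alternative labels, and the function name `effectiveness_scores` are hypothetical, the rating values are made up, and the sign convention for L-Effectiveness is not visible in this excerpt, so the sketch simply uses the raw mean rating (lower is better for L-Effectiveness).

```python
from statistics import mean

# Hypothetical mean effectiveness ratings for each response alternative,
# keyed by item and then by alternative label. Values are illustrative only
# and are not taken from the study.
MEAN_EFFECTIVENESS = {
    "item1": {"a": 6.0, "b": 2.0, "c": 4.0},
    "item2": {"a": 3.0, "b": 5.0, "c": 1.0},
}


def effectiveness_scores(choices, ratings=MEAN_EFFECTIVENESS):
    """Return (M-Effectiveness, L-Effectiveness, M-L Effectiveness) for one respondent.

    `choices` maps each item to a pair:
    (alternative chosen as most effective, alternative chosen as least effective).
    """
    # Mean rating of the alternative picked as MOST effective, per item.
    most = [ratings[item][m] for item, (m, _) in choices.items()]
    # Mean rating of the alternative picked as LEAST effective, per item.
    least = [ratings[item][l] for item, (_, l) in choices.items()]
    # Item-level difference: most-effective rating minus least-effective rating.
    diffs = [m - l for m, l in zip(most, least)]
    # Each score is the average of its item-level values.
    return mean(most), mean(least), mean(diffs)


respondent = {"item1": ("a", "b"), "item2": ("b", "c")}
m_eff, l_eff, ml_eff = effectiveness_scores(respondent)
```

With these made-up ratings, a respondent who picks the highest-rated alternative as most effective and the lowest-rated one as least effective on every item obtains the largest possible M-L Effectiveness score.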

Descriptive Statistics. Descriptive statistics and internal consistency reliability estimates (KR-20)
were computed for each of the five scoring procedures. Intercorrelations were also computed among the
scores generated by the five scoring procedures.
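As a rough illustration of an internal consistency estimate, the sketch below computes the Kuder-Richardson formula 20 for dichotomously scored items (such as a right/wrong score per item). It assumes that the reliability estimate named in the text is KR-20; the function name `kr20` and the tiny response matrix are hypothetical and not taken from the study, and the population variance is used for the total scores.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for a list of respondents,
    each given as a list of 0/1 item scores of equal length."""
    k = len(responses[0])          # number of items
    n = len(responses)             # number of respondents
    # Proportion of respondents scoring 1 on each item.
    p = [sum(r[i] for r in responses) / n for i in range(k)]
    # Sum of item variances p * (1 - p).
    item_var_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of the total scores.
    total_var = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Made-up 0/1 responses: four respondents, three items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(example)
```

KR-20 in this form applies only to dichotomous item scores; for the continuous effectiveness scores, a general coefficient alpha would be the analogous estimate.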

Preliminary Information Concerning Construct Validity

The data from this field test were also used to obtain preliminary information concerning the construct
validity of the SJT as a criterion measure of supervisory job knowledge. As mentioned previously, collecting
the field test data for the SJT was part of a larger data collection effort. Several other job performance
measures were administered concurrently with the SJT, including job knowledge tests, a self-report
administrative information survey, and supervisory simulation exercises (involving training a subordinate,
disciplinary counseling, and personal counseling). Performance ratings were also collected from
peers and supervisors using behavior-based rating scales. If the SJT is a valid measure of supervisory job
knowledge, certain relationships would be expected with these other measures. For example, it should
have at least moderate correlations with the scores on the supervisory simulations and performance ratings
on supervisory dimensions. Correlations of SJT scores with several of these other job performance
measures were examined.
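A correlation between SJT scores and another performance measure could be computed along the following lines. The excerpt does not say which coefficient was used, so this sketch assumes the Pearson product-moment correlation; the function name `pearson_r` and the example numbers are hypothetical and are not data from the field test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores for three soldiers on two measures.
sjt_scores = [1.0, 2.0, 3.0]
simulation_scores = [2.0, 4.0, 6.0]
r = pearson_r(sjt_scores, simulation_scores)
```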

Another type of information that was used to assess the construct validity of the SJT was the extent
to which the knowledges assessed by the SJT are learned on the job. If the SJT is a valid measure of job
knowledge, soldiers who have more experience or training would be expected, on average, to obtain
higher scores than soldiers with less experience or training. Self-report information was collected from
the soldiers in this field test sample concerning whether or not they had attended any supervisory training
and how regularly they were required to supervise other soldiers. Mean SJT scores for soldiers with
different levels of training and experience were also examined.
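The group comparison described above amounts to averaging SJT scores within each self-reported training or experience category. A minimal sketch follows, assuming each record is a (group label, score) pair; the function name `mean_score_by_group`, the labels, and the values are placeholders and not results from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(records):
    """Average the scores within each group.

    `records` is an iterable of (group_label, sjt_score) pairs.
    """
    groups = defaultdict(list)
    for label, score in records:
        groups[label].append(score)
    return {label: mean(scores) for label, scores in groups.items()}


# Placeholder records: group labels and scores are illustrative only.
sample = [
    ("trained", 18.0),
    ("trained", 20.0),
    ("untrained", 14.0),
    ("untrained", 16.0),
]
group_means = mean_score_by_group(sample)
```

If the SJT behaves as the text expects, the mean for the more trained or more experienced group would come out higher than the mean for the other group.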

Field Test Results

Results

Table 1 presents the mean score for each of the five scoring procedures. The maximum possible for
the M-Correct scoring procedure is 35 (i.e., all 35 items answered correctly), but the mean score obtained
by soldiers in this sample was only 16.25. The maximum score obtained was only 27. The mean number
of least effective response alternatives correctly identified by this group was only 14.86. Clearly the SJT
was difficult for this group of soldiers.

Table 1 also presents the standard deviation for each of the five scoring procedures, and all of the
scoring procedures resulted in a reasonable amount of variability in scores obtained by the soldiers in this
sample. Table 1 also shows that the internal consistency reliabilities for all of these scoring procedures
are quite high. The most reliable score is M-L Effectiveness, probably because this score contains more
information than the other scores (i.e., choices for both the most and least effective response).
