
I__. - International Military Testing Association


Table IV cont.

EM-4613

Paygrade      N    N    %      N    N    %      N    N    %
E-5          43   16   37     42    9   21     25    6   24
E-6          44   18   41     42   22   52     31   12   39
E-7          11    5   45      8    7   88      4    2   50
E-8 & E-9     1    0    0      1    1  100      2    2  100
TOTALS       99   39   37     93   39   42     62   22   35

Part Test Performance. In addition to evaluating any effects of randomizing the items on total test performance, it was also considered prudent to examine any effects on domain performance. As indicated in Table V below, the results are similar to those reported in Table III for total test performance. That is, the average domain scores are quite consistent across the three test administrations, with the 2-89 administration being somewhat easier for almost all domains.

Table V

Average Domain Scores

BM-0110 EM-4613

Randomized complete block design ANOVAs were computed for the domain scores across the three administrations of each test, and the results were not significant for either the BM-0110 or the EM-4613 (F[2,17] = 2.36 and F[2,17] = .015, respectively).
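As a rough illustration of the analysis described above, the following Python sketch computes the F ratio for a randomized complete block design, treating domains as blocks and test administrations as treatments. That assignment of blocks and treatments is an assumption, as are the function name `rcbd_anova` and the sample scores, which are invented for illustration and are not the study's data.

```python
def rcbd_anova(scores):
    """F test for treatments in a randomized complete block design.

    scores[b][t] is the score for block b (assumed here: a domain)
    under treatment t (assumed here: a test administration).
    Returns (F, df_treatment, df_error).
    """
    n_blocks = len(scores)
    n_treat = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n_blocks * n_treat)

    treat_means = [sum(row[t] for row in scores) / n_blocks for t in range(n_treat)]
    block_means = [sum(row) / n_treat for row in scores]

    # Partition the total sum of squares into treatment, block, and error parts.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_treat = n_blocks * sum((m - grand_mean) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block

    df_treat = n_treat - 1
    df_error = (n_treat - 1) * (n_blocks - 1)
    if ss_error <= 0:
        raise ValueError("no residual variation; F is undefined")
    f_ratio = (ss_treat / df_treat) / (ss_error / df_error)
    return f_ratio, df_treat, df_error


# Hypothetical average domain scores: 4 domains (rows) x 3 administrations (columns).
example_scores = [
    [62.0, 65.0, 61.0],
    [70.0, 74.0, 71.0],
    [55.0, 57.0, 58.0],
    [81.0, 83.0, 80.0],
]
f_ratio, df_treat, df_error = rcbd_anova(example_scores)
print(f"F[{df_treat},{df_error}] = {f_ratio:.2f}")
```

The resulting F ratio would then be compared with the critical value for the corresponding degrees of freedom at the chosen significance level.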

Common Item Comparisons. Since it was not possible to use the same items in total for each of the three test administrations, it was also necessary to evaluate the effect, if any, on the subset of common items for each paired comparison. A two-tailed t-test was used to analyze the items common to each pair of administrations, and all results for both the BM-0110 and EM-4613 were nonsignificant at the .05 level. In addition, ANOVAs were calculated for each of the three administrations of the BM-0110 and EM-4613 tests, and the results failed to reveal any significant differences at the .05 level of significance (F[2,74] = .044 and F[2,146] = .720, respectively).
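The comparison of common items between two administrations can be sketched as follows. The text does not state whether the t-test was paired or independent; this sketch assumes a paired test on the p-values of the same items in two administrations. The function name `paired_t`, the sample p-values, and the use of a tabled critical value are illustrative assumptions.

```python
import math


def paired_t(first, second):
    """Paired t statistic for two equal-length lists of item p-values.

    Returns (t, degrees_of_freedom).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = mean_diff / math.sqrt(var_diff / n)
    return t_stat, n - 1


# Hypothetical p-values for five items common to two administrations.
admin_a = [0.62, 0.48, 0.75, 0.81, 0.55]
admin_b = [0.60, 0.52, 0.71, 0.84, 0.53]
t_stat, df = paired_t(admin_a, admin_b)

# Two-tailed critical t at the .05 level for df = 4, taken from a standard t table.
CRITICAL_T_DF4 = 2.776
significant = abs(t_stat) > CRITICAL_T_DF4
print(f"t({df}) = {t_stat:.3f}, significant at .05: {significant}")
```

A two-tailed test rejects only when the absolute value of t exceeds the critical value, which is why the comparison uses `abs(t_stat)`.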

Individual Item Statistics. The issue of any effect on item statistics of varying the item’s position was investigated by comparing the item difficulty indexes (p-values) of common items in each pair of test administrations as well as the item
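An item difficulty index (p-value) is the proportion of examinees who answer the item correctly. A minimal sketch of computing it and comparing common items between two administrations follows; the response matrices, item labels, and function names are hypothetical and not drawn from the study.

```python
def item_difficulty(responses):
    """Return the p-value of each item: proportion of examinees answering correctly.

    responses[e][i] is 1 if examinee e answered item i correctly, else 0.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]


def difficulty_shift(p_first, p_second):
    """Difference in p-value (second minus first) for each common item, keyed by label."""
    common = sorted(set(p_first) & set(p_second))
    return {item: p_second[item] - p_first[item] for item in common}


# Hypothetical scored responses (rows: examinees, columns: items).
first_admin = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
second_admin = [[1, 1, 1], [0, 1, 1], [1, 0, 1], [1, 0, 0]]

# Hypothetical item labels; the same item may sit at a different position in each form.
p_first = dict(zip(["A", "B", "C"], item_difficulty(first_admin)))
p_second = dict(zip(["C", "A", "D"], item_difficulty(second_admin)))
print(difficulty_shift(p_first, p_second))
```

Keying the p-values by item label rather than by position keeps the comparison valid when item order is randomized between forms.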
