October 2007 Volume 10 Number 4 - Educational Technology ...
esponse is identified, then the score is 3; while if the subset of options identified includes the correct response and
other options, then the item score is 3 – n (n = 1, 2, or 3), where n denotes the number of other options included. If
subsets of options that do not include the correct option are identified, then the score is –n, where n is the number of
options included. SST and ET are probabilistically equivalent.
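The SST scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: the function name `sst_item_score` is hypothetical, the item is assumed to have four options (so the maximum score is 3), and the score of 0 for an empty selection is an assumption, since that case is not covered in the rule as stated here.

```python
MAX_SCORE = 3  # four-option item: identifying only the correct response scores 3


def sst_item_score(selected, correct):
    """Score one four-option item under subset selection testing (SST).

    selected: the options the student identified as possibly correct.
    correct:  the keyed correct option.
    """
    selected = set(selected)
    if not selected:
        # Assumption: an empty selection earns 0 (not stated in the text).
        return 0
    if correct in selected:
        # 3 - n, where n is the number of other options included.
        return MAX_SCORE - (len(selected) - 1)
    # The correct option was left out: -n, where n is the number of options included.
    return -len(selected)


if __name__ == "__main__":
    print(sst_item_score({"B"}, "B"))            # only the correct response
    print(sst_item_score({"A", "B"}, "B"))       # correct response plus one other
    print(sst_item_score({"A", "C", "D"}, "B"))  # three options, none correct
```

Under this rule, selecting all four options yields 0, the same as the assumed score for selecting nothing, so a student with no knowledge gains nothing by hedging across every option.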
Comparison studies of scoring methods
Coombs et al. (1956) observed that tests using ET are somewhat more reliable than NS tests and measure the same
abilities as NS scoring. Dressel & Schmid (1953) compared SST with NS using college students in a physical science
course. They observed that the reliability of the SST test, at 0.67, was slightly lower than that of the NS test at 0.70.
They also noted that, with respect to full knowledge, academically high-performing students outscored low-performing
students regardless of the difficulty of the items. Jaradat and Tollefson (1988)
compared ET with SST, using graduate students enrolled in an educational measurement course. No significant
differences in terms of reliability were observed between the methods. Jaradat and Tollefson reported that the
majority of students felt that ET and SST were better measures of their knowledge than conventional NS, but they
still preferred NS. Bradbard et al. (2004) concluded that ET scoring is useful whenever there is concern about
improving the accuracy of measuring a student’s partial knowledge. ET scoring may be particularly helpful in
content areas where partial or full misinformation can have life-threatening consequences. This study adopted ET
scoring as the measurement scheme for partial scoring.<br />
Design of multiple choice<br />
An MC item is composed of a correct answer and several distractors. The design of distractors is the greatest
challenge in constructing an MC item (Haladyna & Downing, 1989). Haladyna & Downing summarized the common<br />
rules of design found in many references. One such rule is that all the option choices should adopt parallel grammar<br />
to avoid giving clues to the correct answer. The option choices should address the same content, and the distractors<br />
should all be reasonable choices for a student with limited or incorrect information. Items should be as clear and<br />
concise as possible, both to ensure that students know what is being asked, and to minimize reading time and the<br />
influence of reading skills on performance. Haladyna and Downing recommended some guidelines for developing<br />
distractors:<br />
Employ plausible distractors; avoid illogical distractors.<br />
Incorporate common student errors into distractors.<br />
Adopt familiar yet incorrect phrases as distractors.<br />
Use true statements that do not correctly answer the items.<br />
Kehoe (1995) recommended improving tests by maintaining and developing a pool of “good” items from which<br />
future tests are drawn in part or in whole. This approach is particularly useful for instructors who teach the same course
more than once. The proportion of students answering an item correctly also affects its discrimination power. Items<br />
answered correctly (or incorrectly) by a large proportion of examinees (more than 85%) have a markedly low power<br />
to discriminate. In a good test, most items are answered correctly by 30% to 80% of the examinees. Kehoe described<br />
the following three methods for improving how well items discriminate among examinees of differing ability:
Items that correlate less than 0.15 with total test score should probably be restructured.<br />
Distractors that are not chosen by any examinees should be replaced or eliminated.<br />
Items that virtually all examinees answer correctly are unhelpful for discriminating among students and should<br />
be replaced by harder items.<br />
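The three checks above lend themselves to a simple item-analysis routine. The following Python sketch is illustrative only and is not taken from Kehoe (1995): the names `pearson` and `review_items` are hypothetical, the item-total correlation uses the uncorrected total score (the item itself is included in the total), a correlation is treated as 0.0 when an item or the totals show no variance, and the 0.85 cut-off for "virtually all examinees" is borrowed from the 85% figure mentioned earlier as an assumption.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance (assumption)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def review_items(responses, key, options="ABCD"):
    """Return a list of flags per item, following the three checks in the text.

    responses: one list of chosen options per examinee, one entry per item.
    key:       the correct option for each item.
    """
    # 1 if the examinee chose the keyed option, else 0.
    scored = [[int(choice == k) for choice, k in zip(row, key)] for row in responses]
    totals = [sum(row) for row in scored]  # uncorrected total test score
    report = []
    for i, correct in enumerate(key):
        item_scores = [row[i] for row in scored]
        chosen = Counter(row[i] for row in responses)
        flags = []
        if pearson(item_scores, totals) < 0.15:
            flags.append("low item-total correlation")
        unused = [o for o in options if o != correct and chosen[o] == 0]
        if unused:
            flags.append("unused distractors: " + ", ".join(unused))
        if sum(item_scores) / len(item_scores) > 0.85:  # assumed threshold
            flags.append("too easy")
        report.append(flags)
    return report


if __name__ == "__main__":
    # Hypothetical data: six examinees, three items.
    answers = [
        ["A", "B", "C"],
        ["A", "B", "D"],
        ["A", "A", "C"],
        ["A", "C", "A"],
        ["A", "D", "B"],
        ["A", "A", "A"],
    ]
    for number, flags in enumerate(review_items(answers, key=["A", "B", "C"]), start=1):
        print(number, flags or "no flags")
```

In the hypothetical data, every examinee answers the first item correctly, so it triggers all three flags, while the other two items pass. For short tests, a corrected item-total correlation (excluding the item from the total) may be preferable, since the uncorrected version inflates the correlation.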
Pros and cons of CBTs<br />
CBTs have several benefits:<br />
The single-item presentation is not restricted to text, is easy to read, and allows combining with pictures, voice,<br />
image, and animation.<br />