
October 2007 Volume 10 Number 4 - Educational Technology ...


response is identified, then the score is 3; if the subset of options identified includes the correct response and other options, then the item score is 3 – n (n = 1, 2, or 3), where n denotes the number of other options included. If a subset that does not include the correct option is identified, then the score is –n, where n is the number of options included. SST and ET are probabilistically equivalent.
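The scoring rule above can be sketched for a four-option item as follows; the function name and the representation of options are illustrative, not from the original paper:

```python
def sst_score(selected, correct):
    """Subset-selection (SST) score for one four-option item.

    Per the rule above: identifying only the correct option scores 3;
    identifying the correct option plus n others scores 3 - n;
    identifying a subset that omits the correct option scores
    minus the size of the subset.
    """
    selected = set(selected)
    if correct in selected:
        return 3 - (len(selected) - 1)
    return -len(selected)
```

For example, `sst_score({"A"}, "A")` returns 3, `sst_score({"A", "B"}, "A")` returns 2, and `sst_score({"B", "C", "D"}, "A")` returns -3.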

Comparison studies of scoring methods

Coombs et al. (1956) observed that tests using ET are somewhat more reliable than NS tests and measure the same abilities as NS scoring. Dressel and Schmid (1953) compared SST with NS using college students in a physical science course. They observed that the reliability of the SST test, at 0.67, was slightly lower than that of the NS test at 0.70. They also noted that, with respect to full knowledge, academically high-performing students scored better than average relative to low-performing students, regardless of item difficulty. Jaradat and Tollefson (1988) compared ET with SST using graduate students enrolled in an educational measurement course. No significant differences in reliability were observed between the methods. Jaradat and Tollefson reported that the majority of students felt that ET and SST were better measures of their knowledge than conventional NS, but they still preferred NS. Bradbard et al. (2004) concluded that ET scoring is useful whenever there is concern about improving the accuracy of measuring a student’s partial knowledge. ET scoring may be particularly helpful in content areas where partial or full misinformation can have life-threatening consequences. This study adopted ET scoring as the measurement scheme for partial scoring.

Design of multiple choice

An MC item is composed of a correct answer and several distractors. Designing the distractors is the greatest challenge in constructing an MC item (Haladyna & Downing, 1989). Haladyna and Downing summarized the common design rules found in many references. One such rule is that all the option choices should adopt parallel grammar to avoid giving clues to the correct answer. The option choices should address the same content, and the distractors should all be reasonable choices for a student with limited or incorrect information. Items should be as clear and concise as possible, both to ensure that students know what is being asked and to minimize reading time and the influence of reading skills on performance. Haladyna and Downing recommended the following guidelines for developing distractors:

Employ plausible distractors; avoid illogical distractors.
Incorporate common student errors into distractors.
Adopt familiar yet incorrect phrases as distractors.
Use true statements that do not correctly answer the items.

Kehoe (1995) recommended improving tests by maintaining and developing a pool of “good” items from which future tests are drawn, in part or in whole. This approach is particularly valuable for instructors who teach the same course more than once. The proportion of students answering an item correctly also affects its discriminating power. Items answered correctly (or incorrectly) by a large proportion of examinees (more than 85%) have markedly low power to discriminate. In a good test, most items are answered correctly by 30% to 80% of the examinees. Kehoe described the following three methods to enhance the ability of items to discriminate among abilities:

Items that correlate less than 0.15 with total test score should probably be restructured.
Distractors that are not chosen by any examinees should be replaced or eliminated.
Items that virtually all examinees answer correctly are unhelpful for discriminating among students and should be replaced by harder items.
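Kehoe's two diagnostics, item difficulty (the proportion answering correctly) and item-total correlation, can be computed from a matrix of dichotomous item scores. The following is a minimal sketch assuming 0/1 scoring; the function and variable names are hypothetical:

```python
from statistics import mean, pstdev

def item_stats(responses):
    """For each item, return (difficulty, item-total correlation).

    responses: one list of 0/1 item scores per examinee.
    Kehoe's guidelines flag items whose difficulty falls outside
    roughly 30%-80%, or whose correlation with the total test
    score is below 0.15.
    """
    totals = [sum(r) for r in responses]
    stats = []
    for j in range(len(responses[0])):
        item = [r[j] for r in responses]
        p = mean(item)  # proportion correct (difficulty)
        sx, sy = pstdev(item), pstdev(totals)
        if sx == 0 or sy == 0:
            r_it = 0.0  # constant column: correlation is undefined
        else:
            # Pearson correlation between item score and total score
            cov = mean(x * t for x, t in zip(item, totals)) - p * mean(totals)
            r_it = cov / (sx * sy)
        stats.append((p, r_it))
    return stats
```

Note that this is the uncorrected item-total correlation (the item itself contributes to the total); corrected variants subtract the item's score from the total before correlating.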

Pros and cons of CBTs

CBTs have several benefits:

The single-item presentation is not restricted to text, is easy to read, and allows text to be combined with pictures, voice, images, and animation.
