
distinguish between partial knowledge (Coombs, Milholland, & Womer, 1956) and the absence of knowledge. In conventional MC tests, students choose only one response. The number of correctly answered questions is counted, and this scoring method is called number scoring (NS). Akeroyd (1982) stated that NS makes the simplifying assumption that all of a student's wrong answers are the result of random guesses, thus neglecting the existence of partial knowledge. Coombs et al. (1956) first proposed an alternative method for administering MC tests: students are instructed to mark as many incorrect options as they can identify. This procedure is referred to as elimination testing (ET). Bush (2001) presented a multiple-choice test format that permits an examinee who is uncertain of the correct answer to select more than one answer, with incorrect selections penalized by negative marking. The aim of both the Bush and Coombs schemes is to reward examinees with partial knowledge over those who are simply guessing.
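
To make the contrast with NS concrete, the sketch below scores a single item under a simple elimination rule: credit for each distractor correctly ruled out and a negative mark if the keyed answer is eliminated. The specific weights are illustrative assumptions, not the exact values prescribed by Coombs et al. (1956) or Bush (2001).

```python
# Illustrative sketch of scoring one k-option item under elimination testing (ET).
# The credit/penalty values are assumptions chosen for illustration; the schemes
# of Coombs et al. (1956) and Bush (2001) each define their own weights.

def et_item_score(eliminated, keyed_answer, k):
    """Score one item: +1 per distractor correctly eliminated,
    a negative mark of -(k - 1) if the keyed answer is eliminated."""
    if keyed_answer in eliminated:
        return -(k - 1)
    return len(eliminated)

# A 4-option item keyed 'C'; an examinee with partial knowledge rules out 'A' and 'D'.
print(et_item_score({'A', 'D'}, 'C', 4))       # 2: partial knowledge earns partial credit
print(et_item_score({'A', 'B', 'D'}, 'C', 4))  # 3: full knowledge, maximum credit
print(et_item_score({'B', 'C'}, 'C', 4))       # -3: keyed answer eliminated, penalized
```

Under such a rule, an examinee who can eliminate only some distractors still earns more than one who guesses blindly, which is the behavior both schemes are designed to reward.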

Education researchers have been continuously concerned not only with how to evaluate students' partial knowledge accurately but also with how to reduce the number of unexpected responses. The number of correctly answered questions is composed of two parts: the number of questions to which the students actually know the answer, and the number of questions to which the students correctly guess the answer (Bradbard et al., 2004). A higher frequency of the second case indicates a less reliable evaluation of learning performance. Chan & Kennedy (2002) compared student scores on MC and equivalent constructed-response questions, and found that students do indeed score better on the constructed-response versions of particular MC questions. Although constructed-response testing produces fewer unexpected responses than the conventional dichotomous scoring method, the change of item construct increases the complexity of both creating the test and of post-test item grading and analysis, whereas ET uses the same set of MC items and makes guessing a futile effort.
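
As a simple illustration of how guessing inflates NS scores (the numbers here are hypothetical, not taken from Bradbard et al., 2004): suppose an examinee genuinely knows the answers to K of N items, each with k options, and guesses uniformly at random on the rest. The expected NS score is then K + (N - K)/k; with N = 50 four-option items and K = 30 known answers, the expected score is 30 + 20/4 = 35, and the five extra points reflect luck rather than knowledge.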

Bradbard et al. (2004) suggested that the greatest obstacle in implementing ET is the complexity of grading and the analysis of test items following traditional paper assessment. Accordingly, examiners are not very willing to adopt ET. To overcome this problem, this study provides an integrated computer-based test and item-analysis system to reduce the difficulty of grading and item analysis following testing. Computer-based tests (CBTs) offer several advantages over traditional paper-and-pencil tests (PPTs). The benefits of CBTs include reduced costs of data entry, improved rate of disclosure, ease of data conversion into databases, and reduced likelihood of missing data (Hagler, Norman, Radick, Calfas, & Sallis, 2005). Once set up, CBTs are easier to administer than PPTs. CBTs offer the possibility of instant grading and automatic tracking and averaging of grades. In addition, they are easier to manipulate to reduce cheating (Inouye & Bunderson, 1986; Bodmann & Robinson, 2004).

Most CBTs measure test-item difficulty by the percentage of correct responses: a higher percentage of correct responses implies an easier test item. This approach to item analysis disregards the relationship between examinee ability and item difficulty. For instance, if the percentage of correct responses for test item A is quite small, the item-analysis system categorizes it as "difficult." However, the statistics may also reveal that more failing examinees than passing examinees answer item A correctly. In that case the design of test item A may be inappropriate, misleading, or unclear, and the item should be studied further to help future curriculum designers compose high-quality items. To avoid the fallacy of the percentage of correct responses, this study constructs a CBT system that applies the Rasch model, based on item response theory, for dichotomous scoring and the partial credit model, based on graded item responses, for ET to estimate the examinee ability and item difficulty parameters (Baker, 1992; Hambleton & Swaminathan, 1985; Zhu & Cole, 1996; Wright & Stone, 1979; Zhu, 1996; Wright & Masters, 1982).
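
For reference, the sketch below gives the standard probability functions of the two models (following Wright & Stone, 1979, and Wright & Masters, 1982). It is an illustration only, not the estimation code of the system described here, and the example ability and difficulty values are hypothetical.

```python
import math

# Standard forms of the Rasch model and the partial credit model (PCM);
# a sketch of the probability functions only, not an estimation procedure.
# theta: examinee ability, b: item difficulty, on the same logit scale.

def rasch_p(theta, b):
    """Rasch model: probability of a correct dichotomous response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pcm_probs(theta, step_difficulties):
    """Partial credit model: probabilities of scoring 0..m on an item
    with step difficulties [delta_1, ..., delta_m]."""
    # Cumulative sums sum_{j<=x}(theta - delta_j), with the x = 0 term set to 0.
    cum = [0.0]
    for delta in step_difficulties:
        cum.append(cum[-1] + (theta - delta))
    denom = sum(math.exp(c) for c in cum)
    return [math.exp(c) / denom for c in cum]

# Example: ability 0.5 on a dichotomous item of difficulty 1.0, and on a
# three-category ET item with step difficulties -0.5 and 1.2 (hypothetical values).
print(round(rasch_p(0.5, 1.0), 3))
print([round(p, 3) for p in pcm_probs(0.5, [-0.5, 1.2])])
```

In both models the probability of a (partially) correct response depends on the gap between examinee ability and item difficulty, which is what allows the system to flag items that failing examinees answer correctly more often than passing examinees.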

Before computer-based ET is broadly adopted, we still need to examine whether any discrepancy exists between the performance of examinees who take elimination tests on paper and that of those who take them on computer. This study compares the scores of examinees tested under the dichotomous NS method and under the partial scoring of ET, using the same set of MC items in both CBT and PPT settings; the content subject is operations management. This study has the following specific goals:

1. Evaluate whether partial scoring of the MC test produces fewer unexpected examinee responses.

2. Compare examinee performance on conventional PPTs with performance on CBTs.

3. Analyze whether different question content, such as calculation and concept questions, influences examinee performance on PPTs and CBTs.

4. Investigate the relationship between examinee ability and item difficulty, to help curriculum designers compose high-quality items.
