12.07.2015 Views

awej 5 no.4 full issue 2014



AWEJ Volume.5 Number.4, 2014

Fair, Reliable, Valid: Developing a Grammar Test Utilizing

Efeotor

Mike Orr

Multiple choice item responses may also be categorised, where an examinee scores a point for choosing a distracter which is considered a better response than the other distracters. This is highlighted in the question in Figure 4.

Figure 4. A multiple choice item that could use polytomous scoring.

Q. Who was the 41st President of the United States of America?
a) George Bush b) Abraham Lincoln c) Ronald Reagan d) Tony Blair

The multiple choice item in Figure 4 could be scored using two categories. If the examinee chooses the correct answer, (a), they score one point; if they choose any of the distracters, they score zero. Alternatively, the distracters could be graded in order, to show partial success. As Wilson (2005) states, this is done when the difference between the distracters is large enough, and when “there is a way to interpret those differences with respect to the construct definition” (p.70). Subsequently, the outcome space may award points for each distracter accordingly (a=3, c=2, b=1, d=0).

Another possible scoring scheme is to award negative marks for an incorrect answer. A disadvantage of this is that it causes cautious students to refrain from answering items, even if they possibly know the correct answer. The analysis focuses on the items themselves, rather than the overall score. Therefore negative marking could have had the effect of leaving certain items with very little statistical data available. Furthermore, it was believed that the measurement model would highlight possible instances of examinee guessing.

The traditional method of scoring multiple choice items was chosen. Negative marking was rejected for the reasons outlined above. Polytomous scoring was rejected because there are no significant differences between the distracters which could be interpreted with respect to the construct. In many of the items the options are homogeneous.
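The three scoring schemes discussed for the Figure 4 item can be compared in a short Python sketch. The dichotomous and polytomous weights are those given in the text; the function names are hypothetical, and the size of the negative-marking penalty is an assumption, since the text does not state one.

```python
from typing import Dict, Optional

# Scoring keys for the Figure 4 item, using the weights given in the text.
DICHOTOMOUS_KEY: Dict[str, int] = {"a": 1, "b": 0, "c": 0, "d": 0}
POLYTOMOUS_KEY: Dict[str, int] = {"a": 3, "c": 2, "b": 1, "d": 0}

# Assumption: the text gives no penalty size, so -1 is purely illustrative.
ASSUMED_PENALTY = -1


def score_with_key(response: Optional[str], key: Dict[str, int]) -> int:
    """Score one response against a key; an unanswered item (None) scores 0."""
    if response is None:
        return 0
    return key[response]


def score_negative(response: Optional[str], correct: str = "a",
                   penalty: int = ASSUMED_PENALTY) -> int:
    """Negative marking: 1 if correct, `penalty` if wrong, 0 if left blank."""
    if response is None:
        return 0
    return 1 if response == correct else penalty


if __name__ == "__main__":
    # Compare the three schemes for every possible response, including a blank.
    for response in ["a", "b", "c", "d", None]:
        print(response,
              score_with_key(response, DICHOTOMOUS_KEY),
              score_with_key(response, POLYTOMOUS_KEY),
              score_negative(response))
```

Under the assumed penalty, a blank response scores higher than a wrong one, which is the incentive the text describes for cautious examinees to leave items unanswered.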
Figure 5 is taken from the test paper.

Figure 5. An item from the test.

Q. Ali is tall and ___________ hair is black.
a) our b) your c) her d) his

The answer to the item in Figure 5 is a possessive pronoun, as are the three distracters. Therefore, due to this homogeneity, there is no ‘better’ answer amongst the distracters. As a result, the most valid scoring scheme for the test was adopted, which was to give one point for a correct answer and zero for an incorrect answer.

The Measurement Model

The final building block, the measurement model, aims to “relate the scored outcomes from the items design and the outcome space back to the construct that was the original inspiration of the items” (Wilson, 2005, p.85). Two main approaches to measurement have emerged. The first is known as Classical Test Theory (CTT). The theory purports an examinee’s observed score

Arab World English Journal
ISSN: 2229-9327
www.awej.org
210
