Theory and Practice in Language Studies Contents - Academy ...
Theory and Practice in Language Studies Contents - Academy ...
Theory and Practice in Language Studies Contents - Academy ...
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
674<br />
© 2012 ACADEMY PUBLISHER<br />
THEORY AND PRACTICE IN LANGUAGE STUDIES<br />
speech, he/she can give a coefficient of two to the criteria before the total score of the speak<strong>in</strong>g performance is<br />
calculated.<br />
Advantages of Analytic Scor<strong>in</strong>g Schemes<br />
Analytic scor<strong>in</strong>g schemes are preferred over holistic schemes by many speak<strong>in</strong>g specialists for a number of reasons.<br />
First, as mentioned above, it provides more useful diagnostic <strong>in</strong>formation about students’ speak<strong>in</strong>g abilities. That is, it<br />
tells learners where their weaknesses are <strong>and</strong> where their strengths are. Analytic scor<strong>in</strong>g has been considered as more<br />
<strong>in</strong>terpretable scor<strong>in</strong>g approach because it accesses the exam<strong>in</strong>ee’s specific strengths <strong>and</strong> weaknesses <strong>and</strong> identifies the<br />
particular components of speak<strong>in</strong>g discourse that an exam<strong>in</strong>ee needs to develop (Down<strong>in</strong>g <strong>and</strong> Haladyna, 2006, p. 314).<br />
Holistic scores provide valuable <strong>in</strong>formation for an overall categorization of speak<strong>in</strong>g ability, but analytic scores<br />
provide more diagnostic <strong>in</strong>formation. The <strong>in</strong>formation also allows <strong>in</strong>structors <strong>and</strong> curriculum developers to tailor<br />
<strong>in</strong>structions more closely to the needs of their students. To a certa<strong>in</strong> extent, the analytic scor<strong>in</strong>g scale is a useful tool to<br />
provide teachers’ feedback for students on the areas of students’ strengths <strong>and</strong> weaknesses. Park (2004, p. 2) p<strong>in</strong>po<strong>in</strong>ted<br />
that the explicitness of the analytic scor<strong>in</strong>g scheme guides offers teachers a potentially valuable tool for provid<strong>in</strong>g<br />
speakers with consistent <strong>and</strong> direct feedback. Second, analytic scor<strong>in</strong>g schemes are particularly useful for second<br />
language learners, who are more likely to show a marked or uneven profile across different aspects of speak<strong>in</strong>g<br />
discourse. Some second language learners may have excellent speak<strong>in</strong>g skill <strong>in</strong> terms of content <strong>and</strong> organization, but<br />
may have much lower grammatical control; others may have an excellent control of sentence structure, but may not<br />
know how to organize their speech <strong>in</strong> a logical way. On this aspect, the analytic scor<strong>in</strong>g scales can show students that<br />
they have made progress over time <strong>in</strong> some or all dimensions when the same rubric categories are used repeatedly<br />
(Moskal, 2000).<br />
Analytic schemes have also been found to be particularly useful for scorers who are relatively <strong>in</strong>experienced (Weir,<br />
2005, p. 190). Weir <strong>in</strong> his 1990 research reported that a multi-trait analytic mark scheme is seen as a useful tool for the<br />
tra<strong>in</strong><strong>in</strong>g <strong>and</strong> st<strong>and</strong>ardization of new exam<strong>in</strong>ers (Weir, 2005, p. 190). Other authors ma<strong>in</strong>ta<strong>in</strong>ed that, compared to holistic<br />
scor<strong>in</strong>g schemes, analytic scor<strong>in</strong>g schemes are easier to tra<strong>in</strong> scorers to use it, an <strong>in</strong>experienced scorer may f<strong>in</strong>d it easier<br />
to work with an analytic scor<strong>in</strong>g scheme than a holistic scor<strong>in</strong>g one because they can evaluate specific textual criteria<br />
(Park, 2004, p. 2; McNamara, 1996). Thus, <strong>in</strong>experienced scorers may f<strong>in</strong>d it easier to work with an analytic scale than<br />
a holistic one.<br />
Disadvantages of Analytic Scor<strong>in</strong>g Schemes<br />
The major disadvantage of scor<strong>in</strong>g analytically is that it takes a lot of time to rate speak<strong>in</strong>g performance s<strong>in</strong>ce<br />
exam<strong>in</strong>ers are required to make more than one decision for every speak<strong>in</strong>g performance. When scor<strong>in</strong>g analytically, an<br />
exam<strong>in</strong>er has to check, consider, <strong>and</strong> score each criterion of the speak<strong>in</strong>g ability <strong>and</strong> then gives a total score depend<strong>in</strong>g<br />
on the coefficient put forward.<br />
Critics of analytic scor<strong>in</strong>g schemes also po<strong>in</strong>t out that measur<strong>in</strong>g the quality of a text by tally<strong>in</strong>g accumulated subskill<br />
scores dim<strong>in</strong>ishes the <strong>in</strong>terconnectedness of spoken discourse. At this aspect, it is thought that “the whole should<br />
be greater than the sum of its part”. Measur<strong>in</strong>g the quality of a spoken discourse by tally<strong>in</strong>g accumulated sub-skill gives<br />
the false impression that speak<strong>in</strong>g can be understood <strong>and</strong> fairly assessed by analyz<strong>in</strong>g autonomous discourse features<br />
(Park, 2004, p. 3). Hughes (1989) p<strong>in</strong>po<strong>in</strong>ted that concentration on the different aspects may divert attention from the<br />
overall effect of the speech. Inasmuch as the whole is often greater than the sum of its parts, a composite score may be<br />
very reliable but not valid (Hughes 1989, pp. 93-94). In this aspect, the analytic scor<strong>in</strong>g often has the tendencies to<br />
reduce <strong>and</strong> oversimplify the components of speak<strong>in</strong>g, <strong>and</strong> to emphasize the flaws than the strengths of speak<strong>in</strong>g.<br />
Hughes (2003) warned that <strong>in</strong> scor<strong>in</strong>g analytically, the criterion scored first may affect on subsequent criteria which<br />
are scored later, mak<strong>in</strong>g the overall effect of a speech diverted to an <strong>in</strong>dividual criterion. Futcher (2009), based on<br />
Thorndike’s idea, named this phenomenon as the Halo Effect of the analytic scor<strong>in</strong>g. Thorndike <strong>in</strong> 1920 def<strong>in</strong>ed the<br />
phenomenon as a problem that arises <strong>in</strong> data collection when there is carry-over from one judgment to another. In<br />
other words, when scorers are asked to make multiple judgments they really make one, <strong>and</strong> this affects all other<br />
judgments. If scorers are give five scales each with n<strong>in</strong>e po<strong>in</strong>ts, <strong>and</strong> they award a score of five on the first scale for a<br />
speech, it is highly likely that they will score five on the second <strong>and</strong> subsequent scale, <strong>and</strong> be extremely reluctant to<br />
move too far away from this generally. As a result what we f<strong>in</strong>d is that profiles tend to be “flat”, defeat<strong>in</strong>g the aim of<br />
provid<strong>in</strong>g <strong>in</strong>formative, rich <strong>in</strong>formation, on learner performance (Futcher, 2009). Consequently, criteria scales may not<br />
be used effectively accord<strong>in</strong>g to their <strong>in</strong>ternal criteria, result<strong>in</strong>g <strong>in</strong> a halo effect <strong>in</strong> which one criteria score may<br />
<strong>in</strong>fluence another.<br />
An additional problem with some analytic scor<strong>in</strong>g schemes is that even experienced essay judges sometimes f<strong>in</strong>d it<br />
difficult to assign numerical scores based on certa<strong>in</strong> descriptors (Hamp-Lyons, 1989). In this aspect, there are<br />
possibilities for scorers to disagree with one another. It is more difficult to achieve <strong>in</strong>tra- <strong>and</strong> <strong>in</strong>ter-rater reliability on all<br />
of the dimensions <strong>in</strong> an analytic scor<strong>in</strong>g scheme than on a s<strong>in</strong>gle score yielded by a holistic scale. Also, on the scorers’<br />
part, McNamara (1996) exposed that there are some evidences prov<strong>in</strong>g that scorers tend to evaluate grammar-related<br />
categories more harshly than they do other categories (McNamara, 1996), thereby overemphasiz<strong>in</strong>g the role of accuracy<br />
<strong>in</strong> provid<strong>in</strong>g a profile of students' proficiency. This disadvantage is <strong>in</strong>evitable, especially with un-tra<strong>in</strong>ed or unexperienced<br />
scorers. Grammar-related categories are somewhat wrong – right categories whereas other categories are<br />
judgments. Focus<strong>in</strong>g on wrong – right categories will always be easier than judgments. White (1985) added other limits