Cut-off Scores of the New Chinese Proficiency Test Based on ... - ALTE
Cut-off Scores of the New Chinese Proficiency Test Based on the Angoff Standard Setting Method<br />
Steering Committee for the Test Of Proficiency-Huayu<br />
Lin, Ling-Ying<br />
Lan, Pei-Jiun
Background<br />
Renamed:<br />
old: Test Of Proficiency-Huayu (TOP)<br />
new: Test of Chinese as a Foreign Language (TOCFL)<br />
Revised:<br />
1. based on CEFR<br />
2. 120 items → 100 items<br />
3. new item types
Purpose<br />
To identify the cut-off scores for three levels of the TOCFL using a standard setting method.
Literature Review<br />
Introduction to TOCFL<br />
Number of items by item type (Listening Comprehension: 50 items; Reading Comprehension: 50 items)<br />
Level | Listening: Short Dialogue (1 turn) | Listening: Short Dialogue (two-turn) | Listening: Long Dialogue (multiple-turn) | Listening: Monologue | Reading: Cloze | Reading: Authentic Material | Reading: Short Essay<br />
B1 | 20 | 15 | -- | 15 | 20 | 15 | 15<br />
B2 | 10 | 10 | 15 | 15 | 15 | 10 | 25<br />
C1 | -- | 10 | 20 | 20 | 15 | -- | 35
Literature Review<br />
definition of standard setting<br />
Cizek (1993)<br />
“standard setting is the proper following of a prescribed, rational system of rules or procedures resulting in the assignment of a number to differentiate between two or more states or degrees of performance.”
Literature Review<br />
definition of standard setting<br />
Kane (1994)<br />
“It is useful to draw a distinction between the passing score, defined as a point on the score scale, and the performance standard, defined as the minimally adequate level of performance for some purpose… The performance standard is the conceptual version of the desired level of competence, and the passing score is the operational version.”
Literature Review<br />
definition of standard setting<br />
Tannenbaum and Wylie (2008)<br />
“Standard setting is a general label for a number of approaches commonly used to identify test scores that support decisions about test takers’ (candidates’) level of knowledge, skill, proficiency, mastery, or readiness.”
Literature Review<br />
definition of standard setting<br />
Cizek and Bunch (2007)<br />
“To some degree, then, because standard setting necessarily involves human opinions and values, it can also be viewed as a nexus of technical, psychometric methods and policy making.”
Literature Review<br />
the Angoff method<br />
minimally acceptable person<br />
probability statements<br />
aggregating individual standards
Literature Review<br />
the Angoff method<br />
Probability that a minimally acceptable person answers an item correctly:<br />
hard items: 15%, 25%, 35%<br />
medium items: 45%, 55%, 65%<br />
easy items: 75%, 85%, 95%
Literature Review<br />
the Angoff method<br />
Judge | item1 | item2 | item3 | … | item50 | SUM<br />
J1 | 0.75 | 0.65 | 0.65 | … | 0.55 | 31.7<br />
J2 | 0.85 | 0.75 | 0.65 | … | 0.45 | 32.1<br />
… (J3 to J10)<br />
J11 | 0.75 | 0.75 | 0.65 | … | 0.45 | 31.9<br />
The cut-off estimate is the average of the judges' sums.
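The aggregation step shown in the table can be sketched as follows. This is a minimal illustration and not the committee's actual tooling: the function name `angoff_cut_score` and the three-item ratings are hypothetical. Only the arithmetic is taken from the slide: each judge's item probabilities are summed, and the sums are averaged across judges.

```python
from statistics import mean

def angoff_cut_score(ratings):
    """Return (per-judge sums, panel mean) for a judges x items rating table.

    ratings: dict mapping a judge label to that judge's list of item
    probabilities (one value per item, each between 0 and 1).
    """
    sums = {judge: sum(probs) for judge, probs in ratings.items()}
    return sums, mean(sums.values())

# Hypothetical three-item example (each TOCFL section has 50 items).
ratings = {
    "J1": [0.75, 0.65, 0.65],
    "J2": [0.85, 0.75, 0.65],
    "J3": [0.75, 0.75, 0.65],
}
judge_sums, cut = angoff_cut_score(ratings)
print(judge_sums, cut)
```

Each judge's sum can be read as the raw score that judge expects a minimally acceptable person to obtain, so the panel mean is an expected raw score on the same scale as the test.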
Method<br />
Panelists<br />
Category | Group | Number | Percentage<br />
Gender | Female | 11 | 100.0%<br />
Gender | Male | 0 | 0.0%<br />
Background | Chinese Teaching | 5 | 45.5%<br />
Background | Linguistics | 4 | 36.4%<br />
Background | Psychometrics | 2 | 18.2%
Method: procedure<br />
Panelist familiarization<br />
Definition of minimally acceptable person<br />
Two rounds of probability judgments<br />
Analysis of internal validity
Two rounds of probability judgments<br />
round 1 judgment<br />
feedback & discussion (item P value, IRT difficulty parameter)<br />
round 2 judgment
Result<br />
Cut-off scores<br />
Test | Level | Mean of 1st round (SD) | Mean of 2nd round (SD) | Cut-off score<br />
Reading Comprehension | B1 | 34.51 (2.862) | 33.89 (1.778) | 34<br />
Reading Comprehension | B2 | 32.17 (1.616) | 32.10 (1.409) | 32<br />
Reading Comprehension | C1 | 29.96 (1.905) | 30.06 (1.686) | 30<br />
Listening Comprehension | B1 | 32.18 (1.632) | 32.31 (1.129) | 32<br />
Listening Comprehension | B2 | 31.56 (1.697) | 31.84 (1.498) | 32<br />
Listening Comprehension | C1 | 32.55 (0.983) | 32.45 (0.933) | 32
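The cut-off column appears to be the second-round mean rounded to the nearest whole item, and the totals reported in the Conclusion equal the Reading cut-off plus the Listening cut-off at each level. A short check of that arithmetic follows. The variable names are illustrative only, and the rounding rule is inferred from the numbers rather than stated in the slides.

```python
# Second-round panel means copied from the cut-off score table.
second_round_means = {
    "Reading": {"B1": 33.89, "B2": 32.10, "C1": 30.06},
    "Listening": {"B1": 32.31, "B2": 31.84, "C1": 32.45},
}

# Assumed rule: round each mean to the nearest whole number of items.
# None of the means sits exactly on .5, so Python's round-half-to-even
# behaviour does not affect the result here.
cut_offs = {
    section: {level: round(m) for level, m in levels.items()}
    for section, levels in second_round_means.items()
}

# Combined cut-off per level = Reading cut-off + Listening cut-off.
totals = {
    level: cut_offs["Reading"][level] + cut_offs["Listening"][level]
    for level in ("B1", "B2", "C1")
}
print(cut_offs)
print(totals)
```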
Result<br />
Correlation analysis between the estimated results and the IRT difficulty parameters of test items<br />
Test | Level | Number of test items | 1st round | 2nd round<br />
Reading Comprehension | B1 | 47 | -0.585 ** | -0.760 **<br />
Reading Comprehension | B2 | 34 | -0.646 ** | -0.796 **<br />
Reading Comprehension | C1 | 47 | -0.669 ** | -0.764 **<br />
Listening Comprehension | B1 | 50 | -0.607 ** | -0.772 **<br />
Listening Comprehension | B2 | 47 | -0.785 ** | -0.894 **<br />
Listening Comprehension | C1 | 46 | -0.648 ** | -0.802 **
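A sketch of how such a coefficient could be computed is given below. The slides do not say which correlation coefficient was used, so Pearson's r is an assumption here, and the five items and their values are hypothetical. A negative value is the expected direction: the harder an item (higher IRT difficulty), the lower the probability the panel should assign to it.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for five items: panel-mean probability estimates
# and IRT difficulty (b) parameters for the same items.
mean_estimates = [0.85, 0.75, 0.65, 0.55, 0.45]
irt_difficulty = [-1.2, -0.5, 0.1, 0.4, 1.3]

r = pearson_r(mean_estimates, irt_difficulty)
print(round(r, 3))
```

A coefficient that moves further from zero in the second round, as in the table, suggests the panel's estimates track empirical difficulty more closely after feedback and discussion.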
Result<br />
Variation of participants and test items<br />
Each cell: variance component (percentage of total variance)<br />
Level | Source | Reading 1st round | Reading 2nd round | Listening 1st round | Listening 2nd round<br />
B1 | $\sigma^2_{i}$ | 0.004 (24.9%) | 0.007 (50.1%) | 0.002 (22.6%) | 0.003 (41.5%)<br />
B1 | $\sigma^2_{p}$ | 0.004 (23.1%) | 0.001 (9.2%) | 0.001 (10.1%) | 0.000 (5.9%)<br />
B1 | $\sigma^2_{ip}$ | 0.008 (52.0%) | 0.006 (40.7%) | 0.007 (67.3%) | 0.004 (52.6%)<br />
B2 | $\sigma^2_{i}$ | 0.007 (43.6%) | 0.008 (56.2%) | 0.002 (22.9%) | 0.003 (38.1%)<br />
B2 | $\sigma^2_{p}$ | 0.001 (6.3%) | 0.001 (5.7%) | 0.001 (12.2%) | 0.001 (10.3%)<br />
B2 | $\sigma^2_{ip}$ | 0.008 (50.1%) | 0.005 (38.2%) | 0.006 (64.9%) | 0.005 (51.6%)<br />
C1 | $\sigma^2_{i}$ | 0.004 (32.3%) | 0.005 (43.1%) | 0.003 (30.1%) | 0.004 (41.6%)<br />
C1 | $\sigma^2_{p}$ | 0.001 (11.7%) | 0.001 (10.9%) | 0.000 (3.0%) | 0.000 (2.9%)<br />
C1 | $\sigma^2_{ip}$ | 0.007 (56.0%) | 0.005 (46.0%) | 0.007 (66.9%) | 0.005 (55.5%)
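The slides do not describe how the variance components were estimated. One standard approach for a crossed items × panelists design with a single rating per cell is to solve the ANOVA expected mean squares, sketched below. Reading the subscripts as i = items, p = participants (panelists) and ip = interaction plus residual is an interpretation of the slide title, and the function name and ratings are hypothetical.

```python
def variance_components(x):
    """Estimate variance components for a crossed items x panelists design
    with one rating per cell. x[i][p] is panelist p's rating of item i.
    """
    n_i, n_p = len(x), len(x[0])
    grand = sum(map(sum, x)) / (n_i * n_p)
    item_means = [sum(row) / n_p for row in x]
    pan_means = [sum(x[i][p] for i in range(n_i)) / n_i for p in range(n_p)]

    # Sums of squares for items, panelists, and the residual.
    ss_i = n_p * sum((m - grand) ** 2 for m in item_means)
    ss_p = n_i * sum((m - grand) ** 2 for m in pan_means)
    ss_total = sum((v - grand) ** 2 for row in x for v in row)
    ss_res = ss_total - ss_i - ss_p

    ms_i = ss_i / (n_i - 1)
    ms_p = ss_p / (n_p - 1)
    ms_res = ss_res / ((n_i - 1) * (n_p - 1))

    # Expected-mean-square solutions; negative estimates are set to zero.
    comps = {
        "i": max((ms_i - ms_res) / n_p, 0.0),
        "p": max((ms_p - ms_res) / n_i, 0.0),
        "ip": ms_res,
    }
    total = sum(comps.values())
    shares = {k: v / total for k, v in comps.items()}
    return comps, shares

# Hypothetical ratings: 4 items (rows) x 3 panelists (columns).
ratings = [
    [0.85, 0.75, 0.85],
    [0.65, 0.65, 0.75],
    [0.45, 0.55, 0.45],
    [0.25, 0.35, 0.35],
]
components, shares = variance_components(ratings)
print(components, shares)
```

The `shares` values correspond to the percentage column of the table: a large item share means the panel mostly agrees on which items are hard, while a large interaction share means panelists rank the items differently.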
Result<br />
G-coefficient<br />
Level | Reading 1st round | Reading 2nd round | Listening 1st round | Listening 2nd round<br />
B1 | 0.811 | 0.917 | 0.787 | 0.897<br />
B2 | 0.897 | 0.936 | 0.795 | 0.890<br />
C1 | 0.825 | 0.904 | 0.832 | 0.892
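The slides do not give the formula behind these coefficients. As a hedged illustration only, one common form is the relative G coefficient with items as the object of measurement, where the interaction variance is divided by the number of panelists. Both that choice of formula and the function name `g_coefficient` are assumptions, and the inputs below are illustrative, so the result should not be expected to reproduce the table exactly.

```python
def g_coefficient(var_i, var_ip, n_panelists):
    """Relative G coefficient with items as the object of measurement.

    The interaction/residual variance is divided by the number of
    panelists because each item's estimate is averaged over the panel.
    """
    return var_i / (var_i + var_ip / n_panelists)

# Illustrative inputs: both variance components 0.005, 11 panelists.
g = g_coefficient(0.005, 0.005, 11)
print(round(g, 3))
```

Under this form the coefficient rises when the interaction component shrinks relative to the item component, which is the direction of change between the first and second rounds in the variance table.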
Conclusion<br />
Test Level | Cut-off score<br />
B1 | 66<br />
B2 | 64<br />
C1 | 62<br />
limitation: no male panel members<br />
future study: examine the cut-off scores in real tests