User Guide for the TIMSS International Database

CHAPTER 6: ACHIEVEMENT SCORES

ASMSTDR Standardized mathematics raw score – Population 1
ASSSTDR Standardized science raw score – Population 1
BSMSTDR Standardized mathematics raw score – Population 2
BSSSTDR Standardized science raw score – Population 2

Because of the difficulty in making any comparisons across the test booklets using only the number of raw score points obtained on a set of items, raw scores were standardized by booklet to provide a simple score which could be used in comparisons across booklets in preliminary analyses. The standardized score was computed so that the weighted mean score within each booklet was equal to 50, and the weighted standard deviation was equal to 10.
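The booklet-level standardization described above can be sketched as follows. This is an illustrative implementation, not the guide's actual code; the function and variable names are invented here:

```python
import numpy as np

def weighted_standardize(scores, weights, target_mean=50.0, target_sd=10.0):
    """Rescale scores so their weighted mean and weighted SD equal the targets.

    Sketch of the booklet-level standardization described in the text;
    names and interface are illustrative, not from the guide.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(scores, weights=weights)
    sd = np.sqrt(np.average((scores - mean) ** 2, weights=weights))
    return target_mean + target_sd * (scores - mean) / sd

# Example: raw scores and sampling weights for students in one booklet
raw = [12, 18, 25, 30, 9]
w = [1.0, 1.0, 2.0, 1.0, 1.5]
std = weighted_standardize(raw, w)
```

Applied within each booklet separately, this yields scores that are comparable across booklets in the limited sense described above.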

These standardized raw scores were used in the initial item analysis for computing the discrimination coefficients for each of the items in the test. This initial item analysis was conducted prior to scaling the test items. The standardized raw scores can be found in the Student Background data files and in the Written Assessment data files.
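The guide does not spell out the formula used for the discrimination coefficients. One common discrimination index, shown here only as a plausible sketch, is the point-biserial correlation between a dichotomously scored item and the standardized raw score:

```python
import numpy as np

def item_discrimination(item_correct, std_scores):
    """Point-biserial correlation between an item (0/1) and the
    standardized raw score. This is a common discrimination index,
    offered as a sketch; the guide does not state the exact formula used."""
    x = np.asarray(item_correct, dtype=float)
    y = np.asarray(std_scores, dtype=float)
    return np.corrcoef(x, y)[0, 1]

# Example with made-up data: students who scored higher overall
# tended to answer this item correctly, so discrimination is positive.
d = item_discrimination([1, 1, 0, 0, 1], [55.0, 60.0, 45.0, 40.0, 52.0])
```

Items with low or negative discrimination would typically be flagged for review before scaling.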

ASMNRSC National Rasch Mathematics Score (ML) – Population 1
ASSNRSC National Rasch Science Score (ML) – Population 1
BSMNRSC National Rasch Mathematics Score (ML) – Population 2
BSSNRSC National Rasch Science Score (ML) – Population 2

The national Rasch scores were also designed for preliminary analyses. These provided a basic Rasch score for preliminary analyses within countries, but could not be used for international comparisons since each country was assigned the same mean score. The national Rasch scores were computed by standardizing mathematics and science logit scores to have a weighted mean of 150 and a standard deviation of 10 within each country.

The logit scores were computed using the Quest Rasch analysis software. Quest provides maximum likelihood (ML) estimates of a scaled score, based on the Rasch model, for the performance of the students on a set of items. The computation took into account the varying difficulty of the items across test booklets, and the performance and ability of the students responding to each set of items. These logit scores were obtained using item difficulties that were computed for each country using all available item responses for the country and centering the item difficulty around zero. When computing the item difficulties, responses marked as "not reached" were treated as items that were not administered. This avoids giving inflated item difficulties to the items located at the end of the test in cases where students systematically do not reach the end of the test. These item difficulties were then used to compute logit scores for each student.
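Given calibrated item difficulties, the ML estimate of a student's logit score under the Rasch model can be obtained by Newton-Raphson iteration. The sketch below is a minimal illustration of that estimation step, not the Quest implementation; it also applies the +5/-5 convention for perfect and zero scores described later in this section:

```python
import math

def rasch_ability(responses, difficulties, tol=1e-6, max_iter=100):
    """ML estimate of a student's Rasch logit score given item difficulties
    (centered around zero). Illustrative sketch, not the Quest software.

    responses: list of 0/1 item scores for the items this student took.
    difficulties: matching list of item difficulty parameters (logits).
    """
    r = sum(responses)          # number of correct responses
    n = len(responses)
    if r == 0:
        return -5.0             # no items correct: logit set to -5
    if r == n:
        return 5.0              # all items correct: logit set to +5
    theta = math.log(r / (n - r))   # starting value from the raw proportion
    for _ in range(max_iter):
        # Rasch probability of a correct response to each item
        p = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        grad = r - sum(p)                          # score function
        info = sum(pi * (1.0 - pi) for pi in p)    # Fisher information
        step = grad / info
        theta += step
        if abs(step) < tol:
            break
    return theta
```

For a student answering one of two equally difficult items correctly, the estimate sits at the items' average difficulty, as expected under the model.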

When computing the student logit scores, the responses marked as "not reached" were treated as incorrect responses. This avoided unfairly favoring students who started answering the test and stopped as soon as they did not know the answer to a question. Logit scores for the students generally ranged between -4 and +4. Since it is not possible to obtain finite logit scores for those students who correctly answered all or none of the items, scores for these students were set to +5 and -5 logits, respectively. These logit scores were then standardized to have a weighted mean of 150 and a standard deviation of 10.
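The final within-country rescaling of logit scores to national Rasch scores can be sketched as a weighted linear transformation. Again this is illustrative; the function and variable names are not from the guide:

```python
import numpy as np

def national_rasch_scores(logits, weights):
    """Rescale student logit scores (already capped at +/-5 for perfect
    and zero scores) to a weighted mean of 150 and SD of 10 within a
    country, as described in the text. Illustrative sketch only."""
    x = np.asarray(logits, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean = np.average(x, weights=w)
    sd = np.sqrt(np.average((x - mean) ** 2, weights=w))
    return 150.0 + 10.0 * (x - mean) / sd

# Example: logit scores and sampling weights for students in one country
scores = national_rasch_scores([-1.2, 0.3, 2.0, -0.5], [1.0, 2.0, 1.0, 1.0])
```

Because the mean of 150 is imposed separately in every country, these scores carry no information about between-country differences, which is why they cannot support international comparisons.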

6-2 TIMSS DATABASE USER GUIDE
