11 READING ACHIEVEMENT IN 1991 AND 2000 - Pisa

11.3.1 The Rasch model

Georg Rasch repeatedly told a story to his students about how he was once confronted with the task of equating two different spelling tests. He had available only the students’ test scores, no actual data from the tests (items) themselves, and from these scores he drew a simple X-Y scatter plot. This is the starting point for all kinds of analysis of co-variation between two test scores, and probably the means for even non-statisticians to ‘translate’ test scores from one test to the other. Instead of concentrating on the degree of correlation between the two tests, which is characteristic of classical statistical analysis of test equating, Rasch asked for more information about the responses to the single items on which the test scores were built. Because single item response data were not available to Rasch, he proceeded theoretically by setting up a set of assumptions for these unobserved responses. He wanted the X-Y plot to reflect something test-specific, i.e. something that was independent of the population of students (who in fact, by definition, influence the correlation).

These considerations led Rasch to a mathematical formalisation, in terms of a comprehensive statistical model, for the probability of responding ‘correct’ to each of the items (spelling the words correctly) in the test. From theoretical requirements concerning the interpretation of this X-Y plot, Rasch deduced a statistical model for responding to each of the items of the test. Problems concerning equating the two spelling tests were consequently transferred from problems displayed in the X-Y plot to problems concerning the structure of single item responses.

The simple Rasch model (Rasch 1960) for two response categories assigns a probability for student No. v to answer ‘correct’ for item No. i.
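
As an illustration, this probability can be sketched in a few lines of Python. The sketch assumes the common logistic form of the Rasch model, in which the log-odds of a correct answer equal the student's ability minus the item's difficulty. The function names and all numerical values are hypothetical and serve only to show the idea.

```python
import math


def rasch_probability(sigma_v: float, theta_i: float) -> float:
    """Probability of a correct answer for a student with ability sigma_v
    on an item with difficulty theta_i (assumed logistic form)."""
    return 1.0 / (1.0 + math.exp(-(sigma_v - theta_i)))


def log_odds(p: float) -> float:
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Hypothetical abilities and item difficulties, for illustration only.
abilities = [-1.0, 0.0, 2.0]
easy_item, hard_item = -0.5, 1.5

# For every student, compare the two items on the log-odds scale.
# Under the assumed form the gap equals hard_item - easy_item,
# whatever the student's ability is.
gaps = [
    log_odds(rasch_probability(s, easy_item))
    - log_odds(rasch_probability(s, hard_item))
    for s in abilities
]
```

Under this assumed form, the comparison of the two items does not depend on which students are tested. This is the kind of test-specific, population-independent property that Rasch wanted the X-Y plot to reflect.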
In this model the individual student’s ‘ability’ emerges through a parameter σ_v, which is specific for student No. v, together with a measure specific for item No. i, a ‘difficulty’ parameter θ_i. These measures are combined to determine the probability for a correct answer. The Rasch model is a statistical model for single item responses. It has to be emphasized that, based on this model for the distinct item response, the statistical properties of the student test scores (i.e. the summed item responses) are derived as mathematical consequences from the model itself. Consequently, the statistical distribution of the test score depends on the individual parameter σ_v and the item parameters θ_i, and cannot be evaluated independently of the model. (Readers interested in an elaborated statistical background for the Rasch model in relation to test equating are referred to the literature, e.g. Allerup 1994, 2002.)

11.3.2 Test equating in the Rasch model

Under the Rasch model, test equating is defined as the process of transferring ‘true scale’ information regarding the σ ‘abilities’ from one test (test 1) to another (test 2) in such a way that it takes care of the fact that item difficulties θ_i may vary between the two tests, considering both ‘content’ related matters and item difficulties. The practical steps to ensure that this can be achieved are the following:

• First, the very existence of a σ-scale specific for test 1 is tested. This is done by exercising test statistics (Allerup 1994, 1995, 1997) for the fit of the Rasch model to the item level data for test 1. Notice that it is necessary to have access to the data at single item level.
• The same test procedure is repeated for test 2, testing the existence of a σ-scale specific for test 2. The two σ-scales need not be identical at this stage.
• On acceptance of σ-scales for each of the two tests, it is finally tested whether the two σ-scales are identical.

Under the Rasch model, test equating reflects a property of the two tests: it is hypothesized that items from test 1 can be merged with items from test 2 so that the σ-abilities measured by the combined set of items remain the same as those measured by the two tests separately.

One of the useful mathematical consequences of fit by the Rasch model to item level data is that the σ-scale can be estimated using any subset of the original items (Rasch 1960).

11.3.3 Equating IEA and PISA

Both the IEA and PISA studies conducted Rasch model analyses for the field trial data before main study data were collected. It is therefore assumed that the simple Rasch model can adequately describe the students' responses to the main study items.

Some test equating procedures have, of course, already been undertaken prior to this attempt, when information from the nine different PISA booklets was combined into one reading scale and, likewise, when the results from the two booklets of IEA were combined. In practice, test equating usually means that a set of overlapping items is spread across all booklets. This was done most rigorously in the IEA mathematics and science study, TIMSS (Beaton et al. 1996).
Here each booklet contained a common ‘core’ set of items in addition to the items specific for that booklet.

For the Danish PISA study a 10th booklet was constructed containing a subset of items from both the IEA and the PISA tests, and the Danish sample size was enlarged to accommodate this extra booklet (keeping the 10th booklet students out of the national PISA sample).

Although any selection of items from IEA and PISA can be used for the 10th booklet, it is recommended that items representing a broad range of difficulties are selected. Figure 11.1 shows the item difficulty among the selected items in booklet 10, where items are arranged from the hardest items to the easier ones. (Each score point for a 2-point item is here regarded as a separate 1-point item.) The figure shows that there are sufficient items from both tests at both