
Aggregate versus Disaggregate Data in Measuring School Quality

Francisca G.-C. Richter    Contact Author: B. Wade Brorsen
Department of Economics    Department of Agricultural Economics
Cleveland State University    Oklahoma State University
Rhodes Tower Room #1704    414 Ag Hall
Cleveland, OH 44115    Stillwater, OK 74078-6026
Phone: (216) 687-4529    Phone: (405) 744-6836
FAX: (216) 687-9206    FAX: (405) 744-8210
e-mail: f.richter16@csuohio.edu    e-mail: brorsen@okstate.edu

Francisca G.-C. Richter is a lecturer at Cleveland State University. B. Wade Brorsen is regents professor and Jean & Patsy Neustadt Chair in the Department of Agricultural Economics at Oklahoma State University.




Abstract

This article develops a measure of efficiency to use with aggregated data. Unlike the most commonly used efficiency measures, our estimator adjusts for the heteroskedasticity created by aggregation. Our estimator is compared to estimators currently used to measure school efficiency. Theoretical results are supported by a Monte Carlo experiment. Results show that for samples containing small schools (the sample average may be about 100 students per school, but the sample includes several schools with about 30 or fewer students), the proposed aggregate data estimator performs better than the commonly used OLS estimator and only slightly worse than the multilevel estimator. Thus, when school officials are unable to gather multilevel or disaggregate data, the aggregate data estimator proposed here should be used. When disaggregate data are available, the standardized value-added estimator should be used when ranking schools.

Keywords: data aggregation, error components, school quality




Over the last three decades, resources devoted to education have continuously increased while student performance has barely changed (Odden and Clune 1995). In response, several states now reward public schools that perform better than others, based on their own measures of school quality (Ladd 1996). Test scores are used not only by policymakers in reward programs but are also presented in state report cards issued to each school. More than 35 states already have comprehensive report cards reporting on a variety of issues, including test scores and a comparison of school variables with district and state averages. But often the information presented is misleading or difficult to interpret. Accurate information on school performance is needed if report cards and reform programs are to succeed in improving public schools.

Hierarchical linear modeling (HLM), a type of multilevel modeling, has been recognized by most researchers as the appropriate technique to use when ranking schools by effectiveness. As Webster argues, HLM recognizes the nested structure of students within classrooms and classrooms within schools, producing a different variance at each level for factors measured at that level. Multilevel data, also called disaggregate data, are needed to implement HLM. For example, two-level data could consist of school-level and student-level variables. The value-added framework in combination with HLM has become popular among researchers (Hanushek, Rivkin, and Taylor 1996; Goldstein 1997; Woodhouse and Goldstein 1998). Value-added regressions isolate a school's effect on test scores during a given time period by using previous test scores as a regressor. As of 1996, of the 46 states (out of 50) with accountability systems, only two used value-added models (Webster et al. 1996). Multilevel analysis has been criticized for being a complicated statistical analysis that school officials cannot understand (Ladd 1996).



Most state school evaluation systems use aggregate data. Rather than having data for each student within each school, aggregate data provide only averages over all students within a school. School administrators may be able to obtain records of each student's individual test score but may not be able to match them with their parents' income, for example. Therefore, average test scores in a school are matched to the average income in the respective school district.

To measure school quality with aggregate data, it is common to regress school mean outcome measures on the means of several demographic and school variables. The residuals from this regression are totally attributed to the school effect and, thus, are used to rank schools. Although the use of aggregate data has been widely criticized in the literature (Webster et al. 1996; Woodhouse and Goldstein 1998), many states use aggregate data. This article proposes a new and more efficient estimator of quality based on aggregate data, and compares it with the commonly used ordinary least squares (OLS) estimator as well as with the value-added disaggregate estimator. Estimators based on disaggregate data will perform better than an estimator based on aggregate data. The questions that arise are: by how much will their performances differ? Should schools be using OLS when they can use a more efficient aggregate estimate at no extra cost?

One of Goldstein's main objections to aggregate data models is that they say nothing about the effects upon individual students. Also, aggregate data do not allow studying differential effectiveness, which distinguishes between schools that are effective for low-achieving students and schools that are effective for high-achieving students. The inability to handle differential effectiveness is a clear disadvantage of aggregate as compared to disaggregate data.



Another problem with using aggregate data is that the aggregated variables may not have been obtained from the same group of students or individuals. Family income, for example, may be a county average rather than the average over the school. Average test scores and previous test scores may also come from different groups of students due to student mobility. In some school districts student mobility can be quite high (Fowler-Finn). Previous test scores are often not available for students who have changed schools. With disaggregate data, school effects are often estimated by reducing the sample to those students tested in both periods. Disaggregate data at least permit a study of mobile students using regressions without previous test scores. With aggregate data, the percentage of students not present in both periods can be included as a regressor, but that does not fully capture the measurement error in explanatory variables or the possible differential effectiveness of schools in educating mobile students.

However, when aggregate data are all that schools have, is it still possible to detect the extreme over- and under-performing schools? When using OLS on aggregate data, it has been observed that small schools are disproportionately rewarded (Clotfelter and Ladd 1996). The estimator proposed here eliminates that bias by using standardized residuals to rank schools.

Woodhouse and Goldstein (1998) argue that residuals from regressions with aggregate data are highly unstable and, therefore, unreliable measures of school efficiency. Woodhouse and Goldstein analyze an aggregate model used in a previous study and show how small changes in the independent variables, as well as the inclusion of non-linear terms, will change the rank ordering of regression residuals. However, their data set is small and they do not examine whether disaggregate data would have also led to fragile conclusions.

The past research criticizing aggregate data did not consider maximum likelihood estimation of the aggregate model. Goldstein (1995), for example, illustrates the instability of aggregate data models with an example in which he compares estimates coming from an aggregate model versus estimates from several multilevel models and shows they are different. Goldstein's (1995) aggregate model, however, does not provide an estimate of the between-student variance, which suggests that the author does not use MLE residuals to estimate school effects. Maximum likelihood estimation is possible since the form of heteroskedasticity for the aggregate model is known (Dickens 1990).

While it is expected that aggregation will attenuate the bias due to measurement error, few researchers have compared aggregate data models versus multilevel models while considering measurement error. Hanushek, Rivkin, and Taylor (1996) argue that aggregation produces an ambiguous bias on the estimated regression parameters. Thus they suggest an empirical examination of the effects of aggregation in the presence of measurement error.

Although the conventional wisdom is that aggregate data should not be used to measure school quality, the literature on which this belief is based is insufficient to support it. Research comparing aggregate with disaggregate models has used ordinary least squares rather than maximum likelihood estimators, so the validity of the criticism is unclear. Efficient estimators of school quality based on aggregate data, as well as their confidence intervals, will be developed here and compared to multilevel estimators with and without measurement error. In the process, a standardized version of the value-added multilevel estimator is proposed. Since many states use aggregate data to rank and reward schools, the relevance of this issue cannot be denied.

1. Theory

Estimators of school effects on student achievement based on disaggregate data have been developed and reviewed extensively in the education literature, and are presented only briefly here. However, little effort has been devoted to developing appropriate estimators for aggregate data.

This section consists of three parts. The first part will show how aggregation of a 2-level error components model, with a heterogeneous number of first-level units within second-level units, leads to a model with heteroskedastic error terms. Therefore, for estimators of the parameters of the model to be efficient, ML or GLS estimation is required. The aggregate data estimator is presented, as well as its standardized version.

The second part derives confidence intervals for the aggregate data estimator and presents the confidence intervals commonly used for disaggregate data. The third part introduces measurement error in the model and derives the bias of parameter estimates.

1.1. Aggregation of a Simple 2-Level Error Components Model

Consider the following model:

(1)  $Y_{ij} = (X\beta)_{ij} + u_j + e_{ij}, \qquad i = 1,\ldots,n_j,\; j = 1,\ldots,J,$

where $Y_{ij}$ is the test score of the $i$th student in the $j$th school, $(X\beta)_{ij}$ is the fixed part of the model, likely to be a linear combination of student and school characteristics, such as previous test score (for a value-added measure), parents' education, and average parents' income for each school, $u_j$ is the random effect due to school, which we are trying to estimate, and $e_{ij}$ is the unexplained portion of the test score, with distributions given by

$u_j \sim \text{iid}\ N(0, \sigma_u^2), \qquad e_{ij} \sim \text{iid}\ N(0, \sigma_e^2), \qquad \operatorname{cov}(u_j, e_{ij}) = 0.$

In matrix notation the model is:

(1.a)  $Y = X\beta + Zu + e, \qquad Zu + e \sim N(0, V),$

where

$Z = \begin{bmatrix} 1_{n_1} & & 0 \\ & \ddots & \\ 0 & & 1_{n_J} \end{bmatrix}, \qquad V = \begin{bmatrix} \sigma_e^2 I_{n_1} + \sigma_u^2 J_{n_1} & & 0 \\ & \ddots & \\ 0 & & \sigma_e^2 I_{n_J} + \sigma_u^2 J_{n_J} \end{bmatrix},$

with $1_n$ an $n \times 1$ vector of ones, $I_n$ the $n \times n$ identity matrix, and $J_n$ the $n \times n$ matrix of ones.

The random effect $u_j$ represents the departure from the overall mean effect of schools on students' scores. While the intercept contains the overall mean effect of schools, $u_j$ measures by how much school $j$ deviates from this mean.

The shrinkage estimator of $u_j$ is (Goldstein 1995):

(2)  $\hat{u}_j = \dfrac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2/n_j}\,\dfrac{\sum_{i=1}^{n_j}\hat{y}_{ij}}{n_j}, \qquad \hat{y}_{ij} = Y_{ij} - (X\hat{\beta})_{ij},$

where the $\hat{y}_{ij}$'s are called raw residuals and $\hat{\beta}$ is the MLE of $\beta$. So the school effect for school $j$ is estimated by the raw residuals, averaged over all students and "shrunken" by a factor that is a function of the variance components and the number of students in the school. The larger the number of students in a school, the closer this factor is to one. But if school size is small, there will be less information to estimate the school effect. Thus, the shrinkage factor becomes smaller, making the estimate of the school effect deviate less from the overall mean.
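As a concrete illustration, the shrinkage step in (2) can be sketched in a few lines of Python. This is a hypothetical numerical example, not the paper's code; the variance components are set to the values used later in the paper's simulation, and the raw residuals are made up:

```python
import numpy as np

def shrunken_effect(raw_resid, sigma2_u, sigma2_e):
    """Shrinkage estimator (2): mean raw residual times the shrinkage factor."""
    n_j = len(raw_resid)                                   # students in school j
    factor = sigma2_u / (sigma2_u + sigma2_e / n_j)
    return factor * np.mean(raw_resid)

rng = np.random.default_rng(0)
sigma2_u, sigma2_e = 0.07, 0.56                            # variance components
small = rng.normal(0.3, np.sqrt(sigma2_e), size=20)        # small school, 20 students
large = rng.normal(0.3, np.sqrt(sigma2_e), size=500)       # large school, 500 students

# The small school's estimate is pulled harder toward the overall mean (zero).
print(shrunken_effect(small, sigma2_u, sigma2_e))
print(shrunken_effect(large, sigma2_u, sigma2_e))
```

With the same mean raw residual, the larger school keeps an estimate closer to that mean residual, while the smaller school is shrunk more strongly toward zero.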

Now let us see how the model changes with aggregation. Adding over all students within each school,

$\sum_{i=1}^{n_j} Y_{ij} = \sum_{i=1}^{n_j}(X\beta)_{ij} + n_j u_j + \sum_{i=1}^{n_j} e_{ij},$


and dividing by the number of students in each school leads to the following model:

(3)  $Y_{.j} = (X\beta)_{.j} + u_j + e_{.j}, \qquad j = 1,\ldots,J,$

$u_j \sim \text{iid}\ N(0, \sigma_u^2), \qquad e_{.j} \sim N(0, \sigma_e^2/n_j), \qquad \operatorname{cov}(u_j, e_{.j}) = 0,$

where the dot is the common notation to denote that the variable has been averaged over the corresponding index; students in this case. The error term for the aggregated model will be $v_j \sim (0, \sigma_u^2 + \sigma_e^2/n_j)$.

Again, in matrix notation the model is:

(3.a)  $Y_a = X_a\beta + u + e_a, \qquad u + e_a \sim N(0, V_a),$

where

$X_a = \begin{bmatrix} \frac{1}{n_1}1'_{n_1} & & 0 \\ & \ddots & \\ 0 & & \frac{1}{n_J}1'_{n_J} \end{bmatrix} X,$

with $Y_a$ and $e_a$ formed from $Y$ and $e$ by the same averaging matrix, and

$V_a = \begin{bmatrix} \sigma_u^2 + \sigma_e^2/n_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_u^2 + \sigma_e^2/n_J \end{bmatrix}.$

We are interested in estimating the random effects $u_j$. For this, we estimate the MLE residuals of the error term $v_j$. We define our estimator as the conditional mean of $u_j$ given $v_j$, i.e., $\tilde{u}_j = \hat{E}(u_j \mid v_j)$. This value can be shown to be (see appendix):

(4)  $\tilde{u}_j = \dfrac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2/n_j}\left(Y_{.j} - (X\hat{\beta})_{.j}\right),$

where $\hat{\beta}$ is the MLE of $\beta$ for the aggregate model. Notice that this estimator has the same shrinkage factor as the disaggregate estimator.


However, the school effects in (4) are heteroskedastic, while the true school effects are not. Thus, to correct for heteroskedasticity, we divide the estimator by its standard deviation, obtaining the standardized estimator of the school effect:

(5)  $\tilde{u}_j^{\,s} = \dfrac{1}{\sqrt{\sigma_u^2 + \sigma_e^2/n_j}}\left(Y_{.j} - (X\hat{\beta})_{.j}\right).$

Thus, the set of $\tilde{u}_j^{\,s}$'s may also be used to rank schools. Similarly, the multilevel estimator in (2) can also be standardized to obtain:

(2.a)  $\hat{u}_j^{\,s} = \dfrac{1}{\sqrt{\sigma_u^2 + \sigma_e^2/n_j}}\,\dfrac{\sum_{i=1}^{n_j}\hat{y}_{ij}}{n_j}.$
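The difference between the shrunken estimator (4) and the standardized estimator (5) is only the scaling applied to each school's mean residual, and that difference matters for small schools. A minimal sketch, with hypothetical mean residuals and school sizes (not the paper's data):

```python
import numpy as np

def aggregate_effects(mean_resid, n, sigma2_u, sigma2_e):
    """Eq. (4): shrunken school effects computed from school-mean residuals."""
    mean_resid, n = np.asarray(mean_resid, float), np.asarray(n, float)
    return sigma2_u / (sigma2_u + sigma2_e / n) * mean_resid

def standardized_effects(mean_resid, n, sigma2_u, sigma2_e):
    """Eq. (5): the same mean residuals divided by their standard deviation."""
    mean_resid, n = np.asarray(mean_resid, float), np.asarray(n, float)
    return mean_resid / np.sqrt(sigma2_u + sigma2_e / n)

# Two schools with identical mean residuals but very different sizes:
mean_resid, n = [0.25, 0.25], [30, 300]
print(aggregate_effects(mean_resid, n, 0.07, 0.56))
print(standardized_effects(mean_resid, n, 0.07, 0.56))
```

Under either scaling, the small school's identical mean residual translates into a smaller estimated effect, so it is no longer favored in the rankings the way raw OLS residuals would favor it.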

1.2. Confidence Intervals for the Estimates of School Quality

A confidence interval for school effects is $\hat{u}_j \pm t_{1-\alpha/2}\,\sigma_{u|\hat{u}}$. Thus, it is necessary to obtain the conditional variance of the random effect given its estimator; that is, $\operatorname{Cov}(u \mid \hat{u})$.

For both the disaggregate and aggregate estimators, the covariance matrix is derived similarly. First it is necessary to obtain the joint distribution of the vector of school effects $u$ and its estimator. For this, notice that in both cases the estimator is a linear combination of the vector of dependent variables, test scores in our case. Thus the joint distribution can be derived from the joint distribution of $u$ and $Y$. Then, using a theorem from Moser (theorem 2.2.1, page 29), the conditional covariance matrix of school effects is obtained. A derivation of this covariance matrix is given in the appendix.

The conditional covariance matrix based on the disaggregate estimator is:

(6)  $\operatorname{Cov}(u \mid \hat{u}) = \sigma_u^2 I - \sigma_u^4\,Z'V^{-1}\left(V - X(X'V^{-1}X)^{-1}X'\right)V^{-1}Z.$

The conditional covariance matrix based on the aggregate estimator is:


(7)  $\operatorname{Cov}(u \mid \tilde{u}) = \sigma_u^2 I - \sigma_u^4\,V_a^{-1}\left(V_a - X_a(X_a'V_a^{-1}X_a)^{-1}X_a'\right)V_a^{-1}.$
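Formula (7) is straightforward to evaluate numerically because $V_a$ is diagonal. A sketch under hypothetical dimensions and parameter values (the design matrix here is random toy data, not real school data):

```python
import numpy as np

rng = np.random.default_rng(1)
J, H = 10, 3                              # schools and regressors (hypothetical)
sigma2_u, sigma2_e = 0.07, 0.56
n = rng.integers(20, 200, size=J)         # students per school

Xa = rng.normal(size=(J, H))              # toy aggregated design matrix
Va = np.diag(sigma2_u + sigma2_e / n)     # diagonal covariance from (3.a)
Va_inv = np.linalg.inv(Va)

# Eq. (7): conditional covariance of u given the aggregate estimator.
M = Va - Xa @ np.linalg.inv(Xa.T @ Va_inv @ Xa) @ Xa.T
cov_u = sigma2_u * np.eye(J) - sigma2_u**2 * Va_inv @ M @ Va_inv

# Normal-approximation 95% interval half-widths for each school effect:
half_width = 1.96 * np.sqrt(np.diag(cov_u))
print(half_width)
```

The diagonal of `cov_u` stays between zero and $\sigma_u^2$: conditioning on the estimator can only reduce the unconditional variance of the school effect.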

1.3. Bias in Estimation Introduced by Measurement Error

Let us consider a two-level model with measurement error. The model is:

(8)  $y_{ij} = (x\beta)_{ij} + u_j + e_{ij}, \qquad i = 1,\ldots,n_j,\; j = 1,\ldots,J,$

$Y_{ij} = y_{ij} + q_{ij},$

$X_{hij} = x_{hij} + m_{hij}, \qquad h = 1,\ldots,H,$

$\operatorname{cov}(q_{ij}, q_{i'j}) = \operatorname{cov}(m_{hij}, m_{hi'j}) = 0,$

$E(q_{ij}) = E(m_{hij}) = 0,$

$\operatorname{cov}(m_{h_1 ij}, m_{h_2 ij}) = \sigma_{(h_1,h_2)m},$

where $y_{ij}$ is the real test score for the $i$th student in the $j$th school, $q_{ij}$ is the measurement error for $y_{ij}$ with $q_{ij} \sim N(0, \sigma_q^2)$, $Y_{ij}$ is the observed test score, $x_{hij}$ is the true measure of the $h$th student or school characteristic corresponding to the $i$th student in the $j$th school, $m_{hij}$ is the measurement error for $x_{hij}$, $u_j$ is the random component for school $j$, $e_{ij}$ is the residual, and $\sigma_{(h_1,h_2)m}$ is the covariance of measurement errors from two explanatory variables, $h_1$ and $h_2$, for the same student. The covariance of measurement errors from any two variables is assumed to be equal for all students regardless of the school they attend.

Following Goldstein (1995), without measurement error, the parameters $\beta$ could be estimated by the FGLS estimator $\hat{\beta} = (x'\hat{V}^{-1}x)^{-1}(x'\hat{V}^{-1}y)$. But measurement error as defined by model (8) implies that $E(x'V^{-1}x) = (X'V^{-1}X) - E(m'V^{-1}m)$; so an unbiased estimator for $\beta$ in the presence of measurement error is proposed by Goldstein (1995) to be:

(9)  $\hat{\beta} = \left[X'\hat{V}^{-1}X - E(m'\hat{V}^{-1}m)\right]^{-1}\left(X'\hat{V}^{-1}Y\right).$

When measurement error is not taken into account, the matrix $E(m'V^{-1}m)$ is omitted.

Using Goldstein's derivation of $E(m'V^{-1}m)$ and realizing that the inverse of $V$ is also block diagonal, with elements $\dfrac{(n_j - 1)\sigma_u^2 + \sigma_e^2}{\sigma_e^2(n_j\sigma_u^2 + \sigma_e^2)}$ on the diagonal, each element $(h_1, h_2)$ of the $H \times H$ matrix $E(m'V^{-1}m)$ can be expressed as

(10)  $\sum_{j=1}^{J} n_j\left\{\frac{\sigma_u^2}{\sigma_e^2}(n_j - 1) + 1\right\}\frac{\sigma_{(h_1,h_2)m}}{n_j\sigma_u^2 + \sigma_e^2}.$

Now let us see how this omitted matrix, $E(m'V^{-1}m)$, compares with the one obtained when aggregating the model. Aggregating the true disaggregate model, we obtain:

(11)  $y_{.j} = (x\beta)_{.j} + u_j + e_{.j}, \qquad j = 1,\ldots,J,$

$Y_{.j} = y_{.j} + q_{.j},$

$X_{h.j} = x_{h.j} + m_{h.j},$

$\operatorname{cov}(q_{.j}, q_{.j'}) = 0,$

$E(q_{.j}) = E(m_{h.j}) = 0,$

$\operatorname{cov}(m_{h_1 .j}, m_{h_2 .j}) = \sigma_{(h_1,h_2)m}/n_j,$

where notation is as in model (8).

Notice how the covariance of measurement error between any two fixed explanatory variables is reduced in the aggregate model. Now the covariance matrix of the true model is a diagonal matrix with elements defined in the first part of this section, which will be denoted by $V_a$. Following a procedure analogous to Goldstein's derivation for the disaggregate model, one can obtain the following unbiased estimator of $\beta$ for the aggregate model:

(12)  $\hat{\beta}_a = \left[X_a'\hat{V}_a^{-1}X_a - E(m_a'\hat{V}_a^{-1}m_a)\right]^{-1}\left(X_a'\hat{V}_a^{-1}Y_a\right),$

where the subscript $a$ denotes aggregate data. As can be seen, the bias now will depend on $E(m_a'V_a^{-1}m_a)$, an $H \times H$ matrix whose $(h_1, h_2)$ element is

(13)  $\sum_{j=1}^{J} \frac{\sigma_{(h_1,h_2)m}}{n_j\sigma_u^2 + \sigma_e^2}.$

As can be seen by comparing (10) and (13), the bias in $\beta$ due to measurement error is attenuated in the aggregate model. Bias in the estimation of $\beta$ without accounting for measurement error is likely to affect the estimators of school effects, as suggested in (2) and (4). This result is worth considering since adjustments for measurement error are seldom made and, as Woodhouse et al. (1996) argue, different assumptions about variances and covariances of measurement error may lead to totally different conclusions (when ranking schools, for example). Therefore, when not correcting for measurement error, gains from aggregation may somewhat offset the negative consequences of aggregation. Then, at least asymptotically, aggregate estimates of school effects may be less inaccurate than researchers have claimed.

However, to examine the properties of our aggregate and disaggregate estimators of school effects in small samples, a Monte Carlo study will be necessary. The study will also allow us to compare the estimators' asymptotic and small-sample behavior.
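The attenuation can be seen concretely by evaluating the bias terms (10) and (13) for a few school sizes. The measurement-error covariance and the school sizes below are hypothetical; the variance components are the paper's simulation values:

```python
import numpy as np

sigma2_u, sigma2_e = 0.07, 0.56
sigma_m = 0.09                       # hypothetical measurement-error covariance
n = np.array([25, 60, 120, 400])     # hypothetical school sizes

# Eq. (10): (h1, h2) element of E(m'V^{-1}m) for the disaggregate model.
bias_disagg = np.sum(n * ((sigma2_u / sigma2_e) * (n - 1) + 1)
                     * sigma_m / (n * sigma2_u + sigma2_e))

# Eq. (13): the corresponding element for the aggregate model.
bias_agg = np.sum(sigma_m / (n * sigma2_u + sigma2_e))

print(bias_disagg, bias_agg)
```

Because (10) carries the extra factor $n_j\{(\sigma_u^2/\sigma_e^2)(n_j-1)+1\} \ge 1$ in each summand, the disaggregate bias term dominates the aggregate one, and the gap widens as schools grow.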



2. Data and Procedures

A Monte Carlo study was used to compare aggregate and disaggregate estimates of school effects with their true values. These values were also compared to OLS estimates with aggregate data, since this is what is most often done. The model on which the data generating process was based was taken from Goldstein's 1997 paper (table 3, page 387) because it was simple and provided estimates of the random components for school and student based on real data.

This model regresses test scores of each student against a previous test score, a dummy variable for gender, and a dummy for type of school (boys', girls', or mixed school). Test scores were transformed from ranks to standard normal deviates. The random part consists of the school effect and the student effect.

According to Goldstein, multilevel analysis provides the following estimated model:

(14)  $\widehat{Tscore}_{ij} = -0.09 + 0.52\,Pscore_{ij} + 0.14\,Girl_{ij} + 0.10\,GirlsSch_j + 0.09\,BoysSch_j, \qquad i = 1,\ldots,n_j,\; j = 1,\ldots,J.$

The estimated variance of school effects, also called between-school variance, is $\hat{\sigma}_u^2 = 0.07$, and the variance of student effects, also called within-school variance, is $\hat{\sigma}_e^2 = 0.56$.

These values and the estimates of the fixed part of the model were used to generate the disaggregate data. At each replication, $n_j$ observations were generated for each school, where $n_j$ was a random realization of a lognormal distribution. Lagged test scores were generated from a standard normal. Dummy variables were generated from binomial distributions. The random components of the model for school and student were generated using normals with zero mean and variances $\hat{\sigma}_u^2 = 0.07$ and $\hat{\sigma}_e^2 = 0.56$, respectively, and the actual test score was obtained as in equation (1). Then measurement error was introduced to the previous and actual test scores. Measurement error was assumed to be a normal random variable with a zero mean and a standard deviation of 0.3. All dummy variables are assumed measured without error.
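One replication of this data-generating step might look like the following sketch. The fixed effects and variance components are the values quoted above; the lognormal parameters for school size and the school-type probabilities are illustrative placeholders, not the study's settings:

```python
import numpy as np

rng = np.random.default_rng(42)
J = 100                                   # schools per sample
sigma2_u, sigma2_e = 0.07, 0.56           # between- and within-school variances
beta = [-0.09, 0.52, 0.14, 0.10, 0.09]    # intercept, Pscore, Girl, GirlsSch, BoysSch

def generate_school(u_j):
    # School size from a lognormal (illustrative parameters, floored at 5 students).
    n_j = max(int(rng.lognormal(mean=4.0, sigma=1.0)), 5)
    pscore = rng.standard_normal(n_j)                  # lagged test scores
    girl = rng.binomial(1, 0.5, n_j)                   # gender dummy
    girls_sch, boys_sch = [(1, 0), (0, 1), (0, 0)][rng.integers(3)]
    fixed = (beta[0] + beta[1] * pscore + beta[2] * girl
             + beta[3] * girls_sch + beta[4] * boys_sch)
    score = fixed + u_j + rng.normal(0, np.sqrt(sigma2_e), n_j)
    # Measurement error (sd 0.3) on previous and actual test scores only:
    return pscore + rng.normal(0, 0.3, n_j), score + rng.normal(0, 0.3, n_j)

u = rng.normal(0, np.sqrt(sigma2_u), J)   # true school effects
data = [generate_school(u_j) for u_j in u]
```

Each replication of the Monte Carlo study would regenerate `u` and `data`, then fit the competing estimators to the same simulated sample.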

Once a disaggregate data set is generated, estimates for school effects and variance components are obtained using multilevel analysis as provided by the MIXED procedure in SAS. Then, the disaggregate data set is aggregated by schools. Residuals as well as the two components of the variance of the error term are estimated using NLMIXED in SAS. At this point, we will have a set of 100 true school effects (since the number of schools in the sample is 100), and two sets of estimated school effects using aggregate and disaggregate data. Each set is used to rank the schools in the sample. The greater the school effect, the better the school's performance, and therefore, the higher its ranking. We also provide rankings with standardized school effects and the OLS estimate of school effects. Finally, we compute the estimated variance components under all approaches and compare them with the true values.

A comparison of the school effect estimators is done <strong>in</strong> several different ways. Estimated<br />

magnitudes of school effects are compared to the true magnitudes with the root mean squared<br />

error (RMSE). This statistic is the square root of the mean squared deviation of estimated from true school effects, so the smaller the RMSE, the better the performance of the estimator. Spearman’s

correlation coefficient is calculated for all estimators <strong>in</strong> order to measure the degree of<br />

correlation of each ranking with the true school ranking. Finally, we compare the top-ten set of schools obtained with each estimator with the true top-ten set.¹ The whole process described

above constitutes a single replication of the Monte Carlo study. Depending on the experiment, 100 or 1,000 replications were used.
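The three comparison measures can be sketched as follows: a minimal reimplementation, not the paper's SAS code. Spearman's coefficient is computed directly from ranks, which is adequate here because continuous effect estimates have essentially no ties.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation via Pearson correlation of the ranks
    (no tie correction; continuous effects have essentially no ties)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def compare_estimates(true_effects, est_effects, top_k=10):
    """RMSE against the true effects, rank correlation, and top-k overlap."""
    t = np.asarray(true_effects)
    e = np.asarray(est_effects)
    rmse = np.sqrt(np.mean((e - t) ** 2))
    rho = spearman(t, e)
    # How many of the estimated top-k schools are in the true top-k set?
    overlap = len(set(np.argsort(t)[-top_k:]) & set(np.argsort(e)[-top_k:]))
    return rmse, rho, overlap
```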



Outcomes with and without measurement error are compared <strong>in</strong> order to see if the<br />

aggregate estimator is <strong>in</strong> fact more robust to errors <strong>in</strong> measurement than the disaggregate<br />

estimator. The parameters used to randomly generate the number of students <strong>in</strong> each school are<br />

also changed, to see how variability <strong>in</strong> school size affects the performance of the estimators.<br />

3. Results<br />

Table 1 shows the first set of results for 1000 samples, each of 100 schools whose size is<br />

distributed lognormal with mean 120 and variance 50000. Accord<strong>in</strong>g to this distribution, about<br />

70% of schools have sizes between 15 and 250 students. As expected, the disaggregate estimator<br />

performs best on almost all measures. The aggregate estimator’s performance, however, is very<br />

good, and clearly above the OLS estimator’s performance. OLS tends to pick small schools as<br />

the top schools. The average school size for the top ten schools as estimated by OLS is about<br />

102, while the true average for this group is 120. OLS estimators are based on residuals whose variance is σ_u² + σ_e²/n_j, so quality estimates for small schools have a larger variance and are more likely to be at either the bottom or the top of the rankings. However, table 1 shows that

both the aggregate and disaggregate estimators tend to pick large schools as the top schools, so they too are biased predictors of top schools.

The aggregate and disaggregate estimators have a shrinkage factor that reduces the residuals of small schools. Recall that the shrinkage factor is σ_u²/(σ_u² + σ_e²/n_j). This factor is always less than one and is smaller for smaller schools, bringing down the absolute value of small-school residuals. Results in

table 1 suggest that the shr<strong>in</strong>kage factor may over-compensate for the residuals effect, and thus,<br />

leave mainly large schools in the extremes. Estimators with a smaller shrinkage factor (the factor is 1/(σ_u² + σ_e²/n_j)), such as the standardized aggregate (equation 5) and standardized disaggregate

estimators alleviate this problem. Table 1 shows how the average size for the top ten schools<br />

accord<strong>in</strong>g to the standardized estimators only differs by one student from the true top-ten group<br />

size average.<br />
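The size dependence of the shrinkage factor is easy to see numerically. A quick sketch using the simulation's variance components (illustrative, not the paper's code):

```python
# Shrinkage factor sigma_u^2 / (sigma_u^2 + sigma_e^2 / n) using the
# simulation's variance components; small schools are shrunk hardest.
SIGMA2_U, SIGMA2_E = 0.07, 0.56

def shrinkage(n):
    return SIGMA2_U / (SIGMA2_U + SIGMA2_E / n)

# The factor rises toward one as school size grows.
for n in (10, 50, 120, 500):
    print(n, round(shrinkage(n), 3))
```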

When measur<strong>in</strong>g the RMSE of the estimators with respect to the true magnitude of school<br />

effects, we f<strong>in</strong>d aga<strong>in</strong> that the disaggregate estimator performs only slightly better than the<br />

aggregate estimator. Of course, the standardized estimators are not meant to match the<br />

magnitudes of school effects, so their RMSEs are high and should not be compared to the non-standardized versions. When measuring the performance of the estimators by their ability to

match the true ranking rather than the true values of the school effects, the RMSE may not be as informative as the other measures presented in the table. However, when magnitudes are

important, the non-standardized versions of these estimators should be used.<br />

The between- and within-school variance estimates are presented in table 1. Although the aggregate point estimates are close to the true variances, the standard deviations of these estimates make clear that aggregation reduces the ability to estimate the within-school

variance as compared to the disaggregate estimator. In fact, be<strong>in</strong>g able to estimate these variance<br />

components is crucial to the performance of the aggregate estimator. The ability to estimate the<br />

variance components is determ<strong>in</strong>ed by the sample variation <strong>in</strong> school size. For the same mean of<br />

120 students and a variance of 10000 (one-fifth as large), 91% of schools would have sizes

between 15 and 250, and less than 1% would be smaller. In this case, it is almost impossible to<br />

estimate the variance components and OLS performs better than the aggregate estimator.<br />

Table 2 <strong>in</strong>troduces measurement error that is 30% of the highest possible test score. We<br />

had hypothesized that measurement error would have less effect on the aggregate estimators. Our<br />



results validate this hypothesis. However, aggregate data are more likely to suffer from errors in

measurement than disaggregate data. As stated <strong>in</strong> the <strong>in</strong>troduction, this is due to student mobility,<br />

and <strong>in</strong> general, the fact that averages are not taken over the same group of students.<br />

Table 3 shows the results for samples with mean school size of 20 and a variance of 250,<br />

which implies that 70% of schools will have sizes between 10 and 50 students. This is done to<br />

consider the case when policy makers require evaluations at the grade rather than school level.<br />

Results are as before; the aggregate estimator is better than OLS and only slightly worse than the<br />

disaggregate estimator.<br />

As school size increases, the variation in averaged residuals due to students (σ_e²/n_j)

becomes <strong>in</strong>significant and the averages come closer to their true means. This implies that<br />

aggregation becomes less of a concern for estimating school effects and heteroskedasticity is nearly negligible. The problem of small or large schools being consistently rewarded almost

disappears. In fact, table 4 shows results for a mean school size of 300 and variance of 100,000.<br />

Differences among rank<strong>in</strong>g measures have narrowed for all estimators, and OLS, the only<br />

estimator that does not rely on estimating variance components, performs at its best.

4. Conclusions<br />

Researchers argue that value-added multilevel models provide the most accurate<br />

measures of school quality. But most states continue to use aggregate data (usually not in a value-added framework) to rank and reward schools. Research criticizing aggregate models by comparing them with disaggregate models has used ordinary least squares rather than

maximum likelihood estimators. This article shows that the criticisms of aggregate models have<br />

been overstated.<br />



Results show that when many small schools are present <strong>in</strong> the data, the proposed<br />

aggregate data estimator performs better than OLS on aggregate data, and only slightly worse<br />

than the disaggregate data estimator. However, as school size increases, the three estimators perform more similarly. Even though the aggregate data estimator is only slightly worse than the

disaggregate data estimator for rank<strong>in</strong>g schools based on efficiency, we still want to encourage<br />

the collection of disaggregate data because of their ability to handle differential effects and at<br />

least partly address student mobility.<br />

Reward systems based on OLS estimators tend to reward small schools over bigger ones,<br />

as the empirical literature has shown, while the shr<strong>in</strong>kage disaggregate estimator rewards large<br />

schools. A standardized version of this estimator is presented that eliminates this problem. Thus,

when school officials are able to collect multilevel data, this study suggests they should<br />

standardize the estimates of school quality before ranking schools. However, when disaggregate data are not available and small schools are present in the sample, the standardized aggregate

estimator proposed here should be used. Note that our application is to schools, but the results<br />

are applicable to measur<strong>in</strong>g efficiency <strong>in</strong> any <strong>in</strong>dustry where aggregate data may be the only data<br />

available.<br />



Notes<br />

1 Although the aggregate estimate is theoretically unbiased, a test for bias similar to Hanushek<br />

and Taylor’s was performed. Hanushek and Taylor f<strong>in</strong>d that aggregation biases downward the<br />

estimated school effects. They reestimate the value-added equation enter<strong>in</strong>g an estimate of<br />

school quality as a fixed effect. Bias of the school effect estimate is measured by deviations of<br />

the coefficient of school quality from one. In our Monte Carlo study, rather than an estimate, the<br />

true school effects can be used, and therefore, a generated regressor problem is avoided. No<br />

evidence of bias is found and thus Hanushek and Taylor’s f<strong>in</strong>d<strong>in</strong>g of bias is apparently due to a<br />

bias <strong>in</strong> the construction of their test rather than due to aggregation.<br />



References<br />

Clotfelter, C. T., & Ladd, H. F. (1996). Recognizing and Rewarding Success in Public Schools. In H. F. Ladd (Ed.), Holding Schools Accountable: Performance-Based Reform in Education. Washington, D.C.: The Brookings Institution.

Dickens, W. (1990). Error Components <strong>in</strong> Grouped <strong>Data</strong>: Is It Ever Worth Weight<strong>in</strong>g? Review<br />

of Economics and Statistics, 328-333.<br />

Fowler-F<strong>in</strong>n, T. (2001). Student Stability vs. Mobility. Factors that Contribute to Achievement<br />

Gaps. <strong>School</strong> Adm<strong>in</strong>istrator. August 2001. Available at<br />

http://www.aasa.org/publications/sa/2001_08/fowler-f<strong>in</strong>n.htm.<br />

Goldste<strong>in</strong>, H. (1995). Multilevel Statistical Models. London: Edward Arnold.<br />

Goldste<strong>in</strong> H. (1997). Methods <strong>in</strong> <strong>School</strong> Effectiveness Research. <strong>School</strong> Effectiveness and<br />

<strong>School</strong> Improvement, 8, 369-395.<br />

Hanushek, E. A., Rivkin, S. G., & Taylor, L. L. (1990). Alternative Assessments of the Performance of Schools: Measurement of State Variations in Achievement. The Journal of Human Resources, 25, 179-201.

Hanushek, E. A., & Taylor L. L. (1996). Aggregation and the Estimated Effects of <strong>School</strong><br />

Resources. The Review of Economics and Statistics, 2, 611-627.<br />



Ladd, H. F. (1996). Catalysts for Learn<strong>in</strong>g. Recognition and Reward Programs <strong>in</strong> the Public<br />

<strong>School</strong>s. Brook<strong>in</strong>gs Review, 3, 14-17.<br />

Moser, B. K. (1996). Linear Models: A Mean Model Approach. California: Academic Press.

Odden, A., & Clune, W. (1995). Improving Educational Productivity and School Finance. Educational Researcher, 9, 6-10.

Webster, W. J., Mendro, R. L., Orsak, T. H., & Weerasinghe, D. (1996). The Applicability of

Selected Regression and Hierarchical L<strong>in</strong>ear Models to the Estimation of <strong>School</strong> and Teacher<br />

Effects. Paper presented at the annual meet<strong>in</strong>g of the National Council of Measurement <strong>in</strong><br />

Education, New York, NY.<br />

Woodhouse, G., & Goldste<strong>in</strong>, H. (1998). Educational Performance Indicators and LEA League<br />

Tables. Oxford Review of Education, 14, 301-320.<br />

Woodhouse, G., Yang, M., Goldstein, H., & Rasbash, J. (1996). Adjusting for Measurement Error

<strong>in</strong> Multilevel Analysis. Journal of the Royal Statistical Society A, 159, 201-212.<br />



Table 1. Estimates of school quality us<strong>in</strong>g aggregate vs. disaggregate data with no measurement<br />

error.<br />

Measure Type of estimator Mean Std.Dev.<br />

Spearman <strong>Disaggregate</strong> 0.8852 0.0296<br />


Std. disaggregate 0.8803 0.0317<br />

<strong>Aggregate</strong> 0.8724 0.0327<br />

Std. aggregate 0.8700 0.0342<br />

OLS 0.8603 0.0377<br />

RMSE <strong>Disaggregate</strong> 0.1182 0.0130<br />

Std. disaggregate 0.4714 0.0538<br />

<strong>Aggregate</strong> 0.1294 0.0167<br />

Std. aggregate 0.4950 0.0562<br />

OLS 0.1516 0.0219<br />

Top Ten <strong>Disaggregate</strong> 6.97 1.203<br />

Std. disaggregate 6.96 1.178<br />

<strong>Aggregate</strong> 6.67 1.284<br />

Std. aggregate 6.75 1.207<br />

OLS 6.57 1.258<br />

<strong>School</strong> Size Avg. Real Group 120.19 72.55<br />

In Top Ten Group <strong>Disaggregate</strong> 140.88 80.91<br />

Std. disaggregate 119.95 67.66<br />

<strong>Aggregate</strong> 143.21 83.90<br />

Std. aggregate 120.75 67.19


OLS 101.94 62.86<br />

Variance Estimates Dis. With<strong>in</strong> Sch. 0.560 0.008<br />

Dis. Between Sch. 0.070 0.012<br />

Agg. Within Sch. 0.556 0.444

Agg. Between Sch. 0.067 0.015

Note: Results are for 1000 simulations, each <strong>in</strong>clud<strong>in</strong>g 100 schools. The number of students per school is<br />

a lognormal random variable with mean 120 and variance 50000. Mean is the average over all<br />

simulations, RMSE is root mean squared error, Top Ten is the average number of schools ranked in the top ten by the estimator that belong to the true top-ten set. Estimators compared are the disaggregate

estimator, its standardized version, the aggregate estimator, its standardized version, and the OLS<br />

estimator of school effects. Variance estimates are also presented for the disaggregate and aggregate<br />

methods.<br />



Table 2. Estimates of school quality us<strong>in</strong>g aggregate vs. disaggregate data with measurement<br />

error.<br />

Measure Type of estimator Mean Std.Dev.<br />

Spearman <strong>Disaggregate</strong> 0.8557 0.0321<br />

Std. disaggregate 0.8503 0.0347<br />

<strong>Aggregate</strong> 0.8581 0.0326<br />

Std. aggregate 0.8544 0.0347<br />

OLS 0.8423 0.0389<br />

RMSE <strong>Disaggregate</strong> 0.1324 0.0130<br />

Std. disaggregate 0.5283 0.0552<br />

<strong>Aggregate</strong> 0.1364 0.0164<br />

Std. aggregate 0.5248 0.0567<br />

OLS 0.1667 0.0278<br />

Top Ten <strong>Disaggregate</strong> 6.50 1.197<br />

Std. disaggregate 6.45 1.188<br />

<strong>Aggregate</strong> 6.48 1.235<br />

Std. aggregate 6.53 1.185<br />

OLS 6.29 1.188<br />

<strong>School</strong> Size Avg. Real Group 116.36 64.52<br />

In Top Ten Group <strong>Disaggregate</strong> 137.19 68.01<br />

Std. disaggregate 116.00 64.28<br />

<strong>Aggregate</strong> 141.90 73.20<br />

Std. aggregate 117.93 65.74<br />



OLS 95.92 60.25<br />

Variance Estimates Dis. With<strong>in</strong> Sch. 0.670 0.009<br />

Dis. Between Sch. 0.073 0.013<br />

Agg. Within Sch. 0.654 0.484

Agg. Between Sch. 0.067 0.016

Note: Results are for 1000 simulations, each <strong>in</strong>clud<strong>in</strong>g 100 schools. The number of students per school is<br />

a lognormal random variable with mean 120 and variance 50000. Measurement error is 30% of highest<br />

test score, <strong>in</strong> actual and previous scores. Mean is the average over all simulations, RMSE is root mean<br />

squared error, Top Ten is the average number of schools ranked in the top ten by the estimator that belong to the true top-ten set.



Table 3. Estimates of school quality us<strong>in</strong>g aggregate vs. disaggregate data for small schools.<br />

Measure Type of estimator Mean Std.Dev.<br />

Spearman <strong>Disaggregate</strong> 0.7684 0.0140<br />

Std. disaggregate 0.7637 0.0140<br />

<strong>Aggregate</strong> 0.7580 0.0178<br />

Std. aggregate 0.7558 0.0168<br />

OLS 0.7457 0.0185<br />

RMSE <strong>Disaggregate</strong> 0.1643 0.0134<br />

Std. disaggregate 0.6610 0.0570<br />

<strong>Aggregate</strong> 0.1804 0.0227<br />

Std. aggregate 0.6784 0.0593<br />

OLS 0.2212 0.0227<br />

Top Ten <strong>Disaggregate</strong> 5.47 1.302<br />

Std. disaggregate 5.50 1.296<br />

<strong>Aggregate</strong> 5.29 1.315<br />

Std. aggregate 5.39 1.284<br />

OLS 5.23 1.333<br />

<strong>School</strong> Size Avg. Real Group 19.43 5.08<br />

In Top Ten Group <strong>Disaggregate</strong> 23.12 5.65<br />

Std. disaggregate 19.66 5.20<br />

<strong>Aggregate</strong> 23.02 5.60<br />

Std. aggregate 19.61 4.69<br />

OLS 16.44 4.64<br />



Variance Estimates Dis. With<strong>in</strong> Sch. 0.560 0.018<br />

Dis. Between Sch. 0.070 0.015<br />

Agg. Within Sch. 0.537 0.358

Agg. Between Sch. 0.066 0.026

Note: Results are for 100 simulations, each <strong>in</strong>clud<strong>in</strong>g 100 schools. The number of students per school is a<br />

lognormal random variable with mean 20 and variance 250. Mean is the average over all simulations,<br />

RMSE is root mean squared error, Top Ten is the average number of schools ranked in the top ten by the estimator that belong to the true top-ten set.



Table 4. Estimates of school quality us<strong>in</strong>g aggregate vs. disaggregate data for large schools.<br />

Measure Type of estimator Mean Std.Dev.<br />

Spearman <strong>Disaggregate</strong> 0.9560 0.0153<br />

Std. disaggregate 0.9557 0.0155<br />

<strong>Aggregate</strong> 0.9425 0.0198<br />

Std. aggregate 0.9438 0.0195<br />

OLS 0.9437 0.0197<br />

RMSE <strong>Disaggregate</strong> 0.0744 0.0123<br />

Std. disaggregate 0.2913 0.0480<br />

<strong>Aggregate</strong> 0.0904 0.0200<br />

Std. aggregate 0.3280 0.0529<br />

OLS 0.0846 0.0139<br />

Top Ten <strong>Disaggregate</strong> 8.08 1.020<br />

Std. disaggregate 8.08 1.020<br />

<strong>Aggregate</strong> 7.66 1.151<br />

Std. aggregate 7.77 1.069<br />

OLS 7.81 1.085<br />

<strong>School</strong> Size Avg. Real Group 301.07 96.76<br />

In Top Ten Group <strong>Disaggregate</strong> 313.28 101.81<br />

Std. disaggregate 302.47 100.95<br />

<strong>Aggregate</strong> 330.69 102.75<br />

Std. aggregate 313.33 96.67<br />

OLS 297.42 99.50<br />



Variance Estimates Dis. With<strong>in</strong> Sch. 0.560 0.005<br />

Dis. Between Sch. 0.698 0.011<br />

Agg. Within Sch. 0.972 1.433

Agg. Between Sch. 0.063 0.013

Note: Results are for 100 simulations, each <strong>in</strong>clud<strong>in</strong>g 100 schools. The number of students per school is a<br />

lognormal random variable with mean 300 and variance 100000. Mean is the average over all<br />

simulations, RMSE is root mean squared error, Top Ten is the average number of schools ranked in the top ten by the estimator that belong to the true top-ten set.



Appendix<br />

Derivation of the aggregate estimators of school effects<br />

Recall equation (3.a), which shows the aggregate model:

Y_a = X_a β + u + e_a.

However, the aggregate model has no way of differentiating among its random terms, thus we rewrite the model as:

Y_a = X_a β + w.

We are to obtain the conditional mean of u given the total residual w = u + e_a, based on the distributions of u and e.

Since u and e are independent normal random vectors, their joint distribution is given by:

(u, e)′ ~ N(0, V_{u,e}),  where V_{u,e} = [ σ_u² I_J   0 ; 0   σ_e² I_N ],

N being the total number of students. But (u, e_a)′ is a linear combination of (u, e)′:

(u, e_a)′ = A_1 (u, e)′,  where A_1 = [ I_J   0 ; 0   diag(1′_{n_1}/n_1, …, 1′_{n_J}/n_J) ]

and 1_{n_j} is an n_j-vector of 1’s. Thus, its distribution will be:

(u, e_a)′ ~ N(0, A_1 V_{u,e} A_1′).

From this random vector, we construct (u, w)′ by pre-multiplying (u, e_a)′ by A_2 = [ I_J   0 ; I_J   I_J ]. Then, its distribution will be:

(u, w)′ ~ N(0, A_2 A_1 V_{u,e} A_1′ A_2′).

Having the joint distribution of u and w = u + e_a, our estimator is easily derived (Moser, theorem 2.2.1) as:

E(u | w) = Cov(u, w) Cov(w)⁻¹ w = diag( σ_u²/(σ_u² + σ_e²/n_1), …, σ_u²/(σ_u² + σ_e²/n_J) ) (w_1, …, w_J)′.
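As a sanity check on this conditional-mean formula, one can simulate a single school of size n and verify that the best linear predictor slope of u on w equals σ_u²/(σ_u² + σ_e²/n). The variance values below are illustrative, not taken from the paper's code.

```python
import numpy as np

# Monte Carlo check that the best linear predictor of u from w = u + e_bar
# has slope sigma_u^2 / (sigma_u^2 + sigma_e^2 / n), for one school of size n.
rng = np.random.default_rng(1)
s2u, s2e, n = 0.07, 0.56, 25   # illustrative variance components and size
reps = 200_000

u = rng.normal(0.0, np.sqrt(s2u), reps)
e_bar = rng.normal(0.0, np.sqrt(s2e / n), reps)   # averaged student errors
w = u + e_bar

slope = np.cov(u, w)[0, 1] / np.var(w)            # Cov(u, w) / Var(w)
factor = s2u / (s2u + s2e / n)                    # the shrinkage factor
```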

Derivation of the conditional covariance matrix Cov(u | û)

Disaggregate data: Recall equation (1.a):

Y = Xβ + Zu + e,  where Z = diag(1_{n_1}, …, 1_{n_J})  and  Zu + e ~ N(0, V).

The shrinkage estimator of school effects (equation 2) in matrix notation is:

û = σ̂_u² Z′V⁻¹(Y − Xβ̂),  or  û = σ̂_u² Z′V⁻¹(I − X(X′V⁻¹X)⁻¹X′V⁻¹)Y.  (*)

This shows clearly that the shrinkage estimator is a linear combination of the dependent variable vector. Thus, we can derive the joint distribution of (u, û)′ by knowing the distribution of (u, Y)′. The distribution of (u, Y)′ is:

(u, Y)′ ~ N( (0, Xβ)′, [ σ_u² I   σ_u² Z′ ; σ_u² Z   V ] ).

In general, u and any linear combination of Y of the form û = AY will be jointly distributed as follows:

(u, û)′ ~ N( (0, AXβ)′, [ σ_u² I   σ_u² Z′A′ ; σ_u² AZ   AVA′ ] ).

Then, by Moser’s theorem 2.2.1, the conditional covariance is:

Cov(u | û) = σ_u² I − σ_u⁴ Z′A′(AVA′)⁻¹AZ.

Equation (6) is obtained by replacing A with σ̂_u² Z′V⁻¹(I − X(X′V⁻¹X)⁻¹X′V⁻¹), from (*), in the expression above.
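A small numeric check of this covariance formula can be run on a toy design (3 schools, intercept-only X; illustrative variance values, not the authors' code). One detail worth noting: Cov(û) = AVA′ is singular here because the BLUPs sum to zero when X contains an intercept, so a generalized (pseudo-)inverse replaces the ordinary inverse.

```python
import numpy as np

rng = np.random.default_rng(2)
s2u, s2e = 0.07, 0.56            # assumed variance components (illustrative)
sizes = [2, 3, 4]
J, N = len(sizes), sum(sizes)

# Z maps school effects to students: one column of ones per school.
Z = np.zeros((N, J))
r0 = 0
for j, n in enumerate(sizes):
    Z[r0:r0 + n, j] = 1.0
    r0 += n

X = np.ones((N, 1))                                  # intercept-only design
V = s2u * Z @ Z.T + s2e * np.eye(N)                  # Var(Zu + e)
Vi = np.linalg.inv(V)
P = np.eye(N) - X @ np.linalg.inv(X.T @ Vi @ X) @ X.T @ Vi
A = s2u * Z.T @ Vi @ P                               # u_hat = A Y, as in (*)

G = np.linalg.pinv(A @ V @ A.T)                      # Cov(u_hat) is singular
# s2u**2 is sigma_u^4 in the formula Cov(u|u_hat) = s2u I - s2u^2 Z'A' G A Z.
cov_formula = s2u * np.eye(J) - s2u**2 * Z.T @ A.T @ G @ A @ Z

# Monte Carlo: the covariance of u - E(u|u_hat) should match the formula.
reps = 200_000
u = rng.normal(0.0, np.sqrt(s2u), (reps, J))
e = rng.normal(0.0, np.sqrt(s2e), (reps, N))
Y = 1.0 + u @ Z.T + e                                # arbitrary beta = 1
u_hat = Y @ A.T
B = s2u * Z.T @ A.T @ G                              # E(u|u_hat) = B u_hat
emp = np.cov((u - u_hat @ B.T).T)
```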

Aggregate data: Again, we will use the same argument. First, re-express the aggregate estimator of school effects in matrix notation:

ũ = σ_u² V_a⁻¹(Y_a − X_a β̂),  or  ũ = σ_u² V_a⁻¹(I − X_a(X_a′V_a⁻¹X_a)⁻¹X_a′V_a⁻¹)Y_a.  (**)

The distribution of (u, Y_a)′ is:

(u, Y_a)′ ~ N( (0, X_a β)′, [ σ_u² I   σ_u² I ; σ_u² I   V_a ] ).

So, the distribution of u and ũ = AY_a, a linear combination of Y_a, is:

(u, ũ)′ ~ N( (0, AX_a β)′, [ σ_u² I   σ_u² A′ ; σ_u² A   AV_aA′ ] ),

and the conditional covariance matrix is:

Cov(u | ũ) = σ_u² I − σ_u⁴ A′(AV_aA′)⁻¹A.

When A = σ_u² V_a⁻¹(I − X_a(X_a′V_a⁻¹X_a)⁻¹X_a′V_a⁻¹), from (**), we obtain equation (7).
−1

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!