
Estimation and Inference of Discontinuity in Density


the empirical likelihood confidence sets are well defined even if the local linear binning estimator of McCrary (2008) yields negative estimates for the densities. Simulation results indicate that the empirical likelihood tests have accurate finite sample sizes, and are generally more powerful (especially those based on the local likelihood estimators) than the Wald test.

Angrist and Lavy (1999) exploited the so-called Maimonides' rule, which stipulates that a class with more than 40 pupils should be split into two, as an exogenous source of variation in class size to identify the effects of class size on the scholastic achievement of pupils in Israel. An important assumption of their study is no manipulation of class size by parents. Evidence of such manipulation casts doubt on the identification strategy of the regression discontinuity design. Angrist and Lavy (1999) provided intuitive arguments that manipulation is unlikely to happen in Israel. In this paper, we statistically re-examine the assumption of no manipulation of class sizes by testing continuity of the enrollment density [i.e., density continuity of the running variable (McCrary, 2008)]. Using the proposed local likelihood estimator and the associated empirical likelihood inference procedure, we find significant evidence of manipulation at the first multiple of 40 but not clearly at other multiples.
The progressively smaller estimates and weaker evidence of discontinuity that our empirical results uncover at the multiples of 40 coincide with the fact that parents are more likely to manipulate class size just above 40, where successful manipulation places their children in schools with smaller classes, than just above 80, 120, or 160. These findings are not shared by McCrary's binning estimator and Wald test.

This paper also contributes to the rapidly growing literature on empirical likelihood (see Owen, 2001, for a review). In particular, we extend the empirical likelihood approach to the density discontinuity inference problem by incorporating local polynomial fitting techniques such as Fan and Gijbels (1996) and Loader (1996). We show that the empirical likelihood ratios for density discontinuities have an asymptotically chi-squared distribution. Therefore, we can still observe the Wilks phenomenon (Fan, Zhang and Zhang, 2001) in this nonparametric density discontinuity inference problem.

This paper is organized as follows. Section 2.1 presents our basic setup and point estimation methods. In Section 2.2 we construct empirical likelihood functions of discontinuity in density. Section 2.3 presents the asymptotic properties of the empirical likelihood-based inference methods. The proposed methods are evaluated by Monte Carlo simulations in Section 3.
An empirical application to validation of the regression discontinuity design is provided in Section 4. Section 5 concludes. Appendix A contains mathematical proofs and some preliminary lemmas.

2 Main Result

2.1 Setup and Estimation

We first introduce our basic setup. Let {X_i}_{i=1}^n be an iid sample of X ∈ 𝒳 ⊂ ℝ with probability density function f(x). We do not impose any parametric functional form for f(x). Suppose that we are interested in (dis)continuity of the density f(x) at some given point c ∈ 𝒳. Let f_l = lim_{x↑c} f(x) and f_r = lim_{x↓c} f(x) be the left and right limits of f(x) at x = c, respectively. Our object of interest is the difference of the left and right limits:

    θ_0 = f_r − f_l.    (1)

If θ_0 = 0, then the density function f(x) is continuous at x = c. We wish to estimate, construct a confidence set, and conduct a hypothesis test for the parameter θ_0 without assuming any parametric functional form of f_l or f_r.

First, let us consider the point estimation problem of θ_0. If the density is discontinuous at c (i.e., θ_0 ≠ 0), we can regard the estimation problems for the limits f_l and f_r as those for nonparametric densities at the boundary point c using the sub-samples with X_i < c and X_i ≥ c. To reduce boundary biases in nonparametric estimators, it is reasonable to apply a local polynomial fitting technique, which has favorable properties on a boundary (see, e.g., Fan and Gijbels, 1996). In density estimation we do not have any regressands or regressors. However, there are at least two ways to adapt the local polynomial method to the density estimation problem: the binning and local likelihood methods.

The binning method (e.g., Cheng, 1994, 1997, and Cheng, Fan and Marron, 1997) creates regressands and regressors based on binned data and then implements local polynomial regression. Let {X_j^G}_{j=1}^J = {…, c − 2b, c − b, c, c + b, c + 2b, …}, which plays the role of a regressor, be an equi-spaced grid of width b, where the interval [X_1^G, X_J^G] covers the support 𝒳. Let I{·} be the indicator function and Z_j^G = (1/nb) Σ_{i=1}^n I{|X_i − X_j^G| < b/2}, which plays the role of a regressand, be the normalized frequency for the j-th bin. The bin-based local linear estimators f̂_l^G and f̂_r^G for f_l and f_r are defined as the solutions for a_l and a_r, respectively, to the weighted least squares problems

    min_{a_l, b_l} Σ_{j: X_j^G < c} K(X_{j,h}^G) {Z_j^G − a_l − b_l (X_j^G − c)}²,
    min_{a_r, b_r} Σ_{j: X_j^G ≥ c} K(X_{j,h}^G) {Z_j^G − a_r − b_r (X_j^G − c)}².

With X_{j,h}^G = (X_j^G − c)/h and I_j^G = I{X_j^G ≥ c}, the solutions for a_l and a_r can be written as

    f̂_l^G = Σ_{j=1}^J (1 − I_j^G) K_lj^G Z_j^G / Σ_{j=1}^J (1 − I_j^G) K_lj^G,  f̂_r^G = Σ_{j=1}^J I_j^G K_rj^G Z_j^G / Σ_{j=1}^J I_j^G K_rj^G,


where

    K_lj^G = K(X_{j,h}^G) { Σ_{k=1}^J (1 − I_k^G) K(X_{k,h}^G) (X_{k,h}^G)² − X_{j,h}^G Σ_{k=1}^J (1 − I_k^G) K(X_{k,h}^G) X_{k,h}^G },
    K_rj^G = K(X_{j,h}^G) { Σ_{k=1}^J I_k^G K(X_{k,h}^G) (X_{k,h}^G)² − X_{j,h}^G Σ_{k=1}^J I_k^G K(X_{k,h}^G) X_{k,h}^G }.

If we regard (6) as estimating equations or sample moment conditions for (E[f̂_l], E[f̂_r]), the bin-based empirical likelihood function for (E[f̂_l], E[f̂_r]) is constructed as

    L^G(a_l, a_r) = sup_{{p_j}_{j=1}^J} Π_{j=1}^J p_j,    (7)

    s.t.  0 ≤ p_j ≤ 1,  Σ_{j=1}^J p_j = 1,  Σ_{j=1}^J p_j (1 − I_j^G) K_lj^G (Z_j^G − a_l) = 0,  Σ_{j=1}^J p_j I_j^G K_rj^G (Z_j^G − a_r) = 0.

The weight p_j can be interpreted as the probability mass allocated to the observed value of Z_j^G. By applying the Lagrange multiplier method, under certain regularity conditions (see Theorem 2.2 of Newey and Smith, 2004), we can obtain the dual representation of the maximization problem in (7). That is,

    ℓ^G(a_l, a_r) = −2 log L^G(a_l, a_r) − 2J log J = 2 sup_{λ^G ∈ Λ_J^G(a_l, a_r)} Σ_{j=1}^J log(1 + λ^G′ g_j^G(a_l, a_r)),    (8)

where Λ_J^G(a_l, a_r) = {λ^G ∈ ℝ² : λ^G′ g_j^G(a_l, a_r) ∈ 𝒱 for j = 1, …, J}, 𝒱 is an open interval containing 0, and

    g_j^G(a_l, a_r) = ((1 − I_j^G) K_lj^G (Z_j^G − a_l),  I_j^G K_rj^G (Z_j^G − a_r))′.

Note that the J-variable maximization problem in (7) with respect to {p_j}_{j=1}^J reduces to the two-variable convex maximization problem in (8) with respect to λ^G, which is easily implemented by a Newton-type optimization algorithm. Therefore, in practice we use the dual formulation (8) to compute the (log) empirical likelihood function.
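The inner problem in (8) is a low-dimensional concave maximization over the multiplier, which, as noted above, a Newton-type algorithm handles easily. The following sketch (my own illustration, not the authors' code) computes 2 Σ_j log(1 + λ′g_j) for given moment vectors g_j by damped Newton steps, backtracking so that every 1 + λ′g_j stays positive:

```python
import numpy as np

def el_log_ratio(g, iters=50):
    """Empirical likelihood log-ratio 2 * sup_l sum_j log(1 + l'g_j),
    computed by a damped Newton iteration on the multiplier l.
    g: (J, d) array whose rows are the moment contributions g_j."""
    J, d = g.shape
    lam = np.zeros(d)
    for _ in range(iters):
        r = 1.0 + g @ lam                      # always positive by construction
        grad = (g / r[:, None]).sum(axis=0)    # gradient of sum_j log(1 + l'g_j)
        hess = -(g[:, :, None] * g[:, None, :] / (r ** 2)[:, None, None]).sum(axis=0)
        step = np.linalg.solve(hess, -grad)    # Newton direction (hess is negative definite)
        t = 1.0
        while np.any(1.0 + g @ (lam + t * step) <= 1e-10):
            t /= 2.0                           # backtrack to stay in the domain of the log
        lam = lam + t * step
    return 2.0 * np.log(1.0 + g @ lam).sum()

rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
stat_centered = el_log_ratio(z - z.mean(axis=0))        # moments hold exactly: statistic near 0
stat_shifted = el_log_ratio(z - z.mean(axis=0) + 0.2)   # misspecified moments: large statistic
```

When the moment conditions hold exactly in the sample the statistic is zero, and it grows as the hypothesized values move away from the data, which is what the chi-squared calibration in Section 2.3 exploits.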
Based on the empirical likelihood function ℓ^G(a_l, a_r), the concentrated likelihood function for the parameter of interest θ_0 = f_r − f_l is defined as

    ℓ^G(θ) = min_{{(a_l, a_r) ∈ A_l × A_r : θ = a_r − a_l}} ℓ^G(a_l, a_r),    (9)

for the parameter space A_l × A_r of (f_l, f_r).

We now define the empirical likelihood function based on the local likelihood approach. Let I_i = I{X_i ≥ c} and X_{i,h} = (X_i − c)/h. The first-order conditions for the local likelihood maximization problems


in (3) are written as

    0 = (1/n) Σ_{i=1}^n (1, X_{i,h})′ (1 − I_i) K(X_{i,h}) − ∫_{x<c} (1, x_h)′ K(x_h) exp(a_l + b_l (x − c)) dx,
    0 = (1/n) Σ_{i=1}^n (1, X_{i,h})′ I_i K(X_{i,h}) − ∫_{x≥c} (1, x_h)′ K(x_h) exp(a_r + b_r (x − c)) dx,

where x_h = (x − c)/h.


the binned data, the computational cost of ℓ^G(θ) is lower than that of ℓ(θ), because the dimensions of the auxiliary parameters λ and (a_l, b_l, b_r) in ℓ(θ) are higher than those of λ^G and a_l in ℓ^G(θ).

One useful feature of our empirical likelihood approach is that it can easily incorporate additional information. Suppose we have prior information about X specified in the form E[m(X)] = 0, such as the mean, variance, or quantiles. Using the weights {p_i}_{i=1}^n, this information can be incorporated into the likelihood maximization problem (10) by adding the restriction Σ_{i=1}^n p_i m(X_i) = 0. In this case, the dual form (11) is re-defined by adding m(X_i) to the moment function g_i(a_l, a_r, b_l, b_r). The resulting empirical likelihood inference is more efficient if the additional information is valid.

Although this paper focuses exclusively on discontinuities of density functions, our empirical likelihood approach can also deal with discontinuities in derivatives of density functions. For example, suppose we are interested in conducting inference on the first-order derivatives of the density using the local likelihood-based empirical likelihood.
Then the contrast θ′_0 = f′_r − f′_l can be estimated by θ̂′ = b̂_r exp(â_r) − b̂_l exp(â_l) using the solutions of (3), and the empirical likelihood function for θ′_0 can be defined as

    ℓ(θ′) = min_{{(a_l, a_r, b_l, b_r) ∈ A_l × A_r × B_l × B_r : θ′ = b_r exp(a_r) − b_l exp(a_l)}} ℓ(a_l, a_r, b_l, b_r).

In this case, one may add quadratic terms X²_{i,h} to the moment function g_i(a_l, a_r, b_l, b_r) to reduce the boundary bias for the derivative estimates.

2.3 Large Sample Properties

We now investigate the asymptotic properties of the proposed empirical likelihood statistics. We impose the following assumptions.

Assumption.

1. {X_i}_{i=1}^n is i.i.d.

2. There exists a neighborhood N around c such that f is continuously second-order differentiable on N \ {c}. For the matrices V and G defined in (13), V is positive definite and G has full column rank.

3. K is a symmetric and bounded density function with support [−k, k] for some k > 0.

4. As n → ∞, h → 0, nh → ∞, and nh⁵ → 0. Additionally, b/h → 0 for ℓ^G(θ) and nh³ → ∞ for ℓ(θ).

5. A_l and A_r are compact, f_l ∈ int(A_l), and f_r ∈ int(A_r). Additionally, for ℓ(θ), B_l and B_r are compact, f′_l/f_l ∈ int(B_l), and f′_r/f_r ∈ int(B_r).
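To make the rate requirements of Assumption 4 concrete, the following sketch (with illustrative rates of my own choosing, not ones proposed in the paper) tabulates each quantity for h = n^(-1/4) and b = n^(-1/2), which satisfy every condition, including the undersmoothing requirement nh⁵ → 0 that the MSE-optimal rate h ∝ n^(-1/5) would violate (there nh⁵ stays constant):

```python
def assumption4_rates(h_of_n, b_of_n, ns=(10**4, 10**6, 10**8)):
    """Tabulate the rate quantities appearing in Assumption 4 along a sequence of n."""
    return [{"n": n, "h": h_of_n(n), "nh": n * h_of_n(n),
             "nh^3": n * h_of_n(n) ** 3, "nh^5": n * h_of_n(n) ** 5,
             "b/h": b_of_n(n) / h_of_n(n)} for n in ns]

# h = n^(-1/4), b = n^(-1/2): h -> 0, nh -> inf, nh^3 -> inf, nh^5 -> 0, b/h -> 0
rows = assumption4_rates(lambda n: n ** -0.25, lambda n: n ** -0.5)
```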


Assumption 1 is on the data structure. Although it is beyond the scope of this paper, it would be interesting to extend the proposed method to weakly dependent data, where we are interested in the discontinuity of the stationary distribution. For this extension, we would need to introduce a blocking technique to handle the time series dependence in the moment functions (see Kitamura, 1997). Assumption 2 restricts the local shape of the density function around x = c. This assumption allows discontinuity of the density function at x = c. Assumption 3 is on the kernel function K and implies a second-order kernel. This assumption is satisfied by, e.g., the triangle kernel K(a) = max{0, 1 − |a|} and the Epanechnikov kernel K(a) = (3/4)(1 − a²)I{|a| ≤ 1}. Assumption 4 is on the bandwidth parameter h. The requirement nh⁵ → 0 corresponds to an undersmoothing condition to remove the bias component (f_r − f_l) − (E[f̂_r] − E[f̂_l]) in the construction of empirical likelihood. The requirement b/h → 0 is on the bin width b used for the binning estimator. The requirement nh³ → ∞ is needed to obtain the consistency of the minimizers that solve (12). If we do not have the local linear terms (X_i − c) and (u − c) in (3), this requirement is unnecessary.
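Both kernels named above can be checked numerically to be symmetric densities with support [−1, 1] (so k = 1 in Assumption 3); a minimal sketch of that check:

```python
import numpy as np

# Assumption 3 kernels: symmetric, bounded density functions supported on [-1, 1]
triangle = lambda a: np.maximum(0.0, 1.0 - np.abs(a))
epanechnikov = lambda a: 0.75 * (1.0 - a ** 2) * (np.abs(a) <= 1)

# numerical check: each integrates to one (a density) and has zero first moment (symmetry)
u = np.linspace(-1.0, 1.0, 200001)
du = u[1] - u[0]
mass_tri, mass_epa = (triangle(u) * du).sum(), (epanechnikov(u) * du).sum()
mean_tri, mean_epa = (u * triangle(u) * du).sum(), (u * epanechnikov(u) * du).sum()
```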
Assumption 5 is similar to an assumption that would be used in a parametric estimation problem.

Under these assumptions, the asymptotic distributions of the empirical likelihood functions ℓ^G(θ_0) and ℓ(θ_0) evaluated at the true parameter value θ_0 are obtained as follows.

Theorem. Under Assumption, as n → ∞,

    ℓ^G(θ_0) →_d χ²(1),  ℓ(θ_0) →_d χ²(1).

Note that even in this nonparametric hypothesis testing problem, we observe the convergence of the empirical likelihood statistic to the pivotal χ² distribution, i.e., the Wilks phenomenon emerges. The null hypothesis H_0: θ_0 = θ̄ against H_1: θ_0 ≠ θ̄ for some given θ̄ can be tested by the test statistic ℓ^G(θ̄) or ℓ(θ̄) with the χ²(1) critical value. For example, the null of density continuity is tested by ℓ^G(0) or ℓ(0). Also, by inverting these test statistics, the 100(1 − α)% empirical likelihood asymptotic confidence sets are obtained as

    CS^G = {θ : ℓ^G(θ) ≤ χ²_{1−α}(1)},  CS = {θ : ℓ(θ) ≤ χ²_{1−α}(1)},

where χ²_{1−α}(1) is the (1 − α)-th quantile of the χ²(1) distribution.

We now compare our empirical likelihood approach with the Wald approach suggested by McCrary (2008). First, in contrast to the t-test based on (5), the empirical likelihood test based on ℓ^G(0) or ℓ(0) does not require any asymptotic variance estimation, which is automatically incorporated in the construction of the empirical likelihood functions. Also, while the Wald test requires the derivation of the asymptotic variance σ²_K for each kernel function, the empirical likelihood tests do not require this


type of derivation. Second, the empirical likelihood confidence sets CS^G and CS do not require the local linear (or polynomial) estimators f̂_l and f̂_r. Thus, even if f̂_l or f̂_r yields negative estimates in finite samples, CS^G and CS are well defined. Third, the empirical likelihood test statistics are invariant to the formulation of nonlinear null hypotheses. For example, to test density continuity, we may specify the null hypothesis as H_0: log f_l = log f_r, H̃_0: f_l = f_r, H̄_0: f_l/f_r = 1, etc. For these hypotheses, the empirical likelihood test statistics are identical (i.e., ℓ^G(0) or ℓ(0)). On the other hand, the Wald test statistic is not invariant to the formulation of the null hypothesis and may yield opposite conclusions in finite samples (see Gregory and Veall, 1985, for examples).

3 Simulations

In this section we study the finite-sample behavior of the aforementioned methods using simulations. First we focus on the point estimators, i.e., the local linear binning estimator θ̂^G and the local (log linear) likelihood estimator θ̂ for θ_0. For comparison, we also consider the local constant binning estimator θ̃^G and the local (log constant) likelihood estimator θ̃. For the kernel function K, we use the triangle kernel K(a) = max{0, 1 − |a|}.
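The tests and confidence sets examined below follow the recipe of Section 2.3: compare the statistic with the χ²(1) critical value and invert it over θ. A sketch of that mechanic, where `ell` is a stand-in for any callable θ ↦ ℓ(θ) (the toy quadratic curve below is purely illustrative, not an output of the paper's estimators):

```python
import numpy as np

CHI2_95_DF1 = 3.8414588  # 0.95 quantile of the chi-squared(1) distribution

def el_confidence_set(ell, grid, crit=CHI2_95_DF1):
    """Invert the EL statistic: keep the theta values with ell(theta) <= crit."""
    keep = np.array([ell(t) <= crit for t in grid])
    return np.asarray(grid)[keep]

def el_continuity_test(ell, crit=CHI2_95_DF1):
    """Test H0: theta_0 = 0 (density continuity); True means reject at the 5% level."""
    return ell(0.0) > crit

# toy EL curve with minimum at theta = 0.12, standing in for ell^G or ell
ell = lambda t: 250.0 * (t - 0.12) ** 2
cs = el_confidence_set(ell, np.linspace(-0.1, 0.3, 401))
```

Because the same curve ℓ(θ) drives both the test and the set, the test rejects H_0: θ_0 = 0 exactly when 0 falls outside the confidence set.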
For the bandwidth h, we consider both fixed bandwidths h = 1, 2, 3, 4 and data-dependent bandwidths h = 1.5^k h_dd for k = −1, 0, 1, 2, where h_dd is the data-dependent bandwidth used by McCrary (2008). For the bin size b used to implement θ̂^G and θ̃^G, we employ a data-dependent method suggested by McCrary (2008). The data are generated from the normal distribution N(12, 3) (following McCrary, 2008) and the Student's t distribution 12 + (3/√5) t(5). Both distributions have the same mean and variance. The sample size is n = 1000 and the suspected discontinuity point is c = 13. Since the above densities are continuous, the true value is θ_0 = 0. The biases, variances and mean square errors (MSEs) of the above estimators are reported in Tables 1 and 2.

Among the four estimators, θ̂ performs best in terms of MSE. Its MSE is slightly smaller than that of its competitor, θ̂^G, when a small bandwidth is used, but is significantly smaller when the bandwidth is relatively large. The dominance of θ̂ mainly comes from its superior bias performance on boundaries, while its variance is comparable with that of θ̂^G. The local constant estimators θ̃^G and θ̃ generally have smaller variances than θ̂^G and θ̂, but have much larger biases and thus larger MSEs. All four estimators are generally biased downwards.
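The two data designs above can be reproduced as follows; the scaling 3/√5 is what equates the t design's variance to 3, since a t(5) variate has variance 5/3. The large draw here is only to verify the matched moments, not the paper's n = 1000:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # large draw just to check that the two designs share mean 12 and variance 3

# normal design N(12, 3): note the second argument of N(.,.) is the variance
x_normal = rng.normal(loc=12.0, scale=np.sqrt(3.0), size=n)

# Student's t design: t(5) has variance 5/3, so 12 + (3/sqrt(5)) t(5)
# has mean 12 and variance (9/5) * (5/3) = 3, matching the normal design
x_student = 12.0 + (3.0 / np.sqrt(5.0)) * rng.standard_t(df=5, size=n)
```

The t design keeps the first two moments fixed while fattening the tails, isolating the tail effect reported in Table 2.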
On the other hand, a preliminary simulation indicates that these estimators are generally biased upwards if the suspected discontinuity point is on the left side of the peak, e.g., c = 11. The typical bias-variance trade-off in bandwidth selection is also observed: the biases are larger and the variances are smaller when the bandwidth increases. Data generated from the heavier tailed distribution increase the biases of the four estimators significantly and affect the variances only slightly. Again, θ̂ appears to have smaller MSEs than the other estimators.

Next we look at the tests for (dis)continuity in the density function. We consider a general set-up of mixtures of normal distributions. Suppose that the random variable X is drawn from truncated


size at the values of enrollments that are just above multiples of 40, e.g., 41-45, 81-85, etc. Angrist and Lavy (1999) used this rule as an exogenous source of variation in class size to identify the effects of class size on the scholastic achievement of Israeli pupils.

An important identifying assumption of Angrist and Lavy (1999) is no manipulation of class size by parents, which is the testing focus of this section. Precisely, there could be two kinds of manipulation. The first is that parents may selectively exploit the Maimonides' rule by registering their children in schools with enrollments just above multiples of 40, so that their children are placed in classes of smaller size. Following McCrary's (2008) arguments, this would lead to an increase in the density of enrollment counts around the points just above multiples of 40. Angrist and Lavy (1999) argued that this kind of manipulation is unlikely to happen for two reasons. First, Israeli pupils are required to attend school in their local registration area. Also, principals are required to grant enrollment to students within their district and are not permitted to register students from outside their district. Second, even in exceptional cases where parents intentionally move to another school district hoping to get a better draw in the enrollment lottery (e.g., 41-45 instead of 38), "there is no way to know (exactly) whether a predicted enrollment of 41 will not decline to 38 by the time school starts, obviating the need for two small classes" (Angrist and Lavy, 1999).

The second kind of manipulation of class size is that parents may withdraw their children from the public school system when they find that the enrollments of the schools where their children are registered are just below multiples of 40. This would lead to a decrease of the enrollment density on the left side of the multiples of 40. However, as argued by Angrist and Lavy, unlike in the United States, private elementary schooling is rare in Israel.

To assess the validity of the assumption of no manipulation of class size, we test continuity of the density function of enrollment counts. We consider fifth graders. In the end, our data contained 2029 schools (Angrist and Lavy, 1999). The histogram is displayed in Figure 2. It shows a sharp increase of densities at the enrollment of 40, but such an increase is not clearly observed at other multiples of 40. This observation is reinforced by the graphical analysis displayed in Figures 3 and 4, which show the estimated enrollment density function using the data on either side of 40 and 120, respectively.

We perform the binning and local likelihood estimation (θ̂^G and θ̂) and the associated tests (W^G, ℓ^G, and ℓ) of the discontinuities in enrollment densities suspected at the multiples of 40, over a range of smoothing bandwidths. The results are summarized in Table 4.

The local likelihood method finds upward jumps of the enrollment density at c = 40, 80, and 120 and a downward jump at c = 160. The associated empirical likelihood tests show that the discontinuity at an enrollment of 40 is highly significant, with test statistics all larger than 20 across the different bandwidths. The evidence of discontinuity at 80 is relatively weak and its significance depends on the bandwidth used, while no evidence of discontinuity is found at 120 or 160. The progressively weaker evidence of discontinuity coincides with the extent of the decrease in class size at the different multiples of 40. For example, according to Maimonides' rule, the class size drops faster at the enrollment of 40 than


A Mathematical Appendix

Hereafter, "w.p.a.1" means "with probability approaching one". Define f = f(c),

    θ = (a_l, b_l, b_r)′ with true value (log f, lim_{x↑c} d log f(x)/dx, lim_{x↓c} d log f(x)/dx)′,

    θ̂ = arg min_{θ ∈ Θ} ℓ(a_l, log(θ_0 + e^{a_l}), b_l, b_r),  Θ = A_l × B_l × B_r,

    g_i(θ) = g_i(a_l, log(θ_0 + e^{a_l}), b_l, b_r),  A_i = ((1 − I_i), (1 − I_i)X_{i,h}, I_i, I_i X_{i,h})′,

    K_{l,j1 j2} = ∫_{−k≤u<0} u^{j1} K^{j2}(u) du,  K_{r,j1 j2} = ∫_{0≤u≤k} u^{j1} K^{j2}(u) du.


Also, by the change of variables,

    (1/h²) E[g_i(θ)] = ( ∫_{u<0} (1, u) K(u) {f(c + uh) − exp(a_l + b_l uh)} du , ∫_{u>0} (1, u) K(u) {f(c + uh) − exp(a_r + b_r uh)} du )′.


w.p.a.1. On the other hand, an expansion of (15) around (θ̂, λ̂(θ̂)) = (θ_0, 0) yields

    0 = (1/nh) Σ_{i=1}^n g_i(θ_0) + [ (1/nh) Σ_{i=1}^n (1/(1 + λ̃′g_i(θ̃))) ∂g_i(θ̃)/∂θ′ H⁻¹ ] H(θ̂ − θ_0) − V̂_3 λ̂(θ̂),    (18)

where H = diag(1, h, h), (θ̃, λ̃) is a point on the line joining (θ̂, λ̂(θ̂)) and (θ_0, 0), and V̂_3 = (1/nh) Σ_{i=1}^n g_i(θ̃)g_i(θ̃)′/(1 + λ̃′g_i(θ̃))² is implicitly defined. Combining (17) multiplied by H⁻¹ from the left and (18),

    0 = ( ((1/nh) Σ_{i=1}^n g_i(θ_0))′, 0′ )′ + M̂ ( (H(θ̂ − θ_0))′, λ̂(θ̂)′ )′,  where M̂ = [ 0, Ĝ_1′ ; Ĝ_2, −V̂_3 ],    (19)

and where

    Ĝ_1 = (1/nh) Σ_{i=1}^n (1/(1 + λ̂(θ̂)′g_i(θ̂))) ∂g_i(θ̂)/∂θ′ H⁻¹,  Ĝ_2 = (1/nh) Σ_{i=1}^n (1/(1 + λ̃′g_i(θ̃))) ∂g_i(θ̂)/∂θ′ H⁻¹,

which satisfy Ĝ_1, Ĝ_2 →_p G by an argument similar to Lemma 1 and the consistency of θ̂. Also note that V̂_3 →_p V. Since G has full column rank and V is positive definite (Assumption 2), M̂ is invertible w.p.a.1. By solving (19) for √(nh) H(θ̂ − θ_0), we have

    √(nh) H(θ̂ − θ_0) = −(G′V⁻¹G)⁻¹ G′V⁻¹ (1/√(nh)) Σ_{i=1}^n g_i(θ_0) + o_p(1).

From this and an expansion of (1/√(nh)) Σ_{i=1}^n g_i(θ̂) around θ̂ = θ_0,

    (1/√(nh)) Σ_{i=1}^n g_i(θ̂) = [I − G(G′V⁻¹G)⁻¹G′V⁻¹] (1/√(nh)) Σ_{i=1}^n g_i(θ_0) + o_p(1).    (20)

Combining (16), (20), (1/√(nh)) Σ_{i=1}^n g_i(θ_0) →_d N(0, V) (Lemma 1), and 2V̂_1⁻¹ − V̂_1⁻¹V̂_2V̂_1⁻¹ →_p V⁻¹ (by Lemmas 1 and 2 with the consistency of θ̂), we have

    ℓ(θ̂) →_d ξ′ V^{1/2}′ [I − G(G′V⁻¹G)⁻¹G′V⁻¹]′ V⁻¹ [I − G(G′V⁻¹G)⁻¹G′V⁻¹] V^{1/2} ξ = ξ′ [I − A(A′A)⁻¹A′] ξ ~ χ²(1),

where ξ ~ N(0, I) and A = V^{−1/2}G. Therefore, the conclusion is obtained.


A.2 Lemma

Lemma 1. Under Assumption,

1. (1/nh) Σ_{i=1}^n g_i(θ_0) g_i(θ_0)′ →_p V and (1/√(nh)) Σ_{i=1}^n g_i(θ_0) →_d N(0, V);

2. For each ε ∈ (0, 1) and Λ_n = {λ : |λ| ≤ n^{−ε}}, sup_{θ ∈ Θ, λ ∈ Λ_n, 1≤i≤n} |λ′g_i(θ)| →_p 0 and, for each θ ∈ Θ, Λ_n ⊆ Λ_n(θ) = Λ_n(a_l, log(θ_0 + e^{a_l}), b_l, b_r) w.p.a.1;

3. For any θ̄ satisfying θ̄ − θ_0 →_p 0 and (1/nh) Σ_{i=1}^n g_i(θ̄) = O_p((nh)^{−1/2}), there exists λ̂(θ̄) = arg max_{λ ∈ Λ_n(θ̄)} P̂(θ̄, λ) w.p.a.1, with λ̂(θ̄) = O_p((nh)^{−1/2}) and sup_{λ ∈ Λ_n(θ̄)} P̂(θ̄, λ) = O_p((nh)⁻¹);

4. (1/nh) Σ_{i=1}^n g_i(θ̂) = O_p((nh)^{−1/2}).

Proof of 1. Proof of the first statement. Let V̂ = [V̂_{ab}]_{a,b=1,…,4}. By the change of variables,


Let g_{i,1}(θ_0) be the first element of g_i(θ_0). By the change of variables,

    √(nh) E[g_{i,1}(θ_0)] = √(nh) ∫_{u<0} K(u) {f(c + uh) − exp(a_{l0} + b_{l0} uh)} du → 0.

Let λ̄ = arg max_{λ ∈ Λ_n} P̂(θ̄, λ). Then

    0 = P̂(θ̄, 0) ≤ P̂(θ̄, λ̄) = λ̄′ḡ − (1/2) λ̄′ [ (1/nh) Σ_{i=1}^n g_i(θ̄)g_i(θ̄)′/(1 + λ̇′g_i(θ̄))² ] λ̄ ≤ |λ̄||ḡ| − C|λ̄|²,    (22)

w.p.a.1., where the first inequality follows from the definition of λ̄, the second equality follows from a second-order expansion with λ̇ on the line joining λ̄ and 0, and the second inequality follows from (1/nh) Σ_{i=1}^n g_i(θ̄)g_i(θ̄)′ →_p V, positive definiteness of V, and Lemma 2. Since C|λ̄| ≤ |ḡ| and |ḡ| = O_p((nh)^{−1/2}) by the assumption, we have |λ̄| = O_p((nh)^{−1/2}). Since ε is chosen to satisfy (nh)^{−1/2} n^{ε} → 0, we have λ̄ ∈ int Λ_n, i.e., λ̄ is an interior solution. Thus, from the concavity of P̂(θ̄, λ) in λ, the convexity of Λ_n(θ̄), and Λ_n ⊆ Λ_n(θ̄) (by Lemma 2), λ̄ = λ̂(θ̄) = arg max_{λ ∈ Λ_n(θ̄)} P̂(θ̄, λ) w.p.a.1., i.e., the first statement is obtained. Since |λ̄| = O_p((nh)^{−1/2}), the second statement is also obtained. The third statement is obtained from (22) with λ̄ = λ̂(θ̄).

Proof of 4. The basic steps are similar to Newey and Smith (2004, Lemma A3). Pick any ε ∈ (0, 1) satisfying (nh)^{−1/2} n^{ε} → 0. Let ĝ = (1/nh) Σ_{i=1}^n g_i(θ̂) and λ̃ = n^{−ε} ĝ/|ĝ|. Observe that for some C > 0,

    P̂(θ̂, λ̃) = λ̃′ĝ − (1/2) λ̃′ [ (1/nh) Σ_{i=1}^n g_i(θ̂)g_i(θ̂)′/(1 + λ̇′g_i(θ̂))² ] λ̃ ≥ n^{−ε}|ĝ| − Cn^{−2ε},    (23)


w.p.a.1., where the equality follows from a second-order expansion with λ̇ on the line joining λ̃ and 0, and the inequality follows from (1/nh) Σ_{i=1}^n g_i(θ̂)g_i(θ̂)′ →_p V, the boundedness of V, and Lemma 2. Also note that

    sup_{λ ∈ Λ_n(θ̂)} P̂(θ̂, λ) ≤ sup_{λ ∈ Λ_n(θ_0)} P̂(θ_0, λ) = O_p((nh)⁻¹),    (24)

w.p.a.1., where the inequality follows from the definition of θ̂, and the equality follows from Lemma 3 with θ̄ = θ_0 and (1/nh) Σ_{i=1}^n g_i(θ_0) = O_p((nh)^{−1/2}) (by Lemma 1). Since λ̃ ∈ Λ_n, Lemma 2 guarantees λ̃ ∈ Λ_n(θ̂) w.p.a.1., which implies P̂(θ̂, λ̃) ≤ sup_{λ ∈ Λ_n(θ̂)} P̂(θ̂, λ). Thus, combining (23) and (24),

    n^{−ε}|ĝ| − Cn^{−2ε} ≤ O_p((nh)⁻¹),    (25)

w.p.a.1. Since we chose ε to satisfy (nh)^{−1/2} n^{ε} → 0, we have |ĝ| = O_p(n^{−ε}). Now, pick any ε_n → 0 and define λ̄ = ε_n ĝ. From |ĝ| = O_p(n^{−ε}), we have λ̄ = o_p(n^{−ε}) and λ̄ ∈ Λ_n ⊆ Λ_n(θ̂). Thus, we can apply the same argument as in (23)-(25) after replacing λ̃ with λ̄. Then we obtain

    ε_n|ĝ|² − Cε_n²|ĝ|² ≤ P̂(θ̂, λ̄) ≤ sup_{λ ∈ Λ_n(θ̂)} P̂(θ̂, λ) = O_p((nh)⁻¹),

which implies ε_n|ĝ|² = O_p((nh)⁻¹). Since this result holds for any ε_n → 0, we obtain the conclusion.


References

[1] Angrist, J. D. and V. Lavy (1999) Using Maimonides' rule to estimate the effect of class size on scholastic achievement, Quarterly Journal of Economics, 114, 533-575.
[2] Cheng, M. Y. (1994) On boundary effects of smooth curve estimators (dissertation), Unpublished manuscript series #2319, University of North Carolina.
[3] Cheng, M. Y. (1997) A bandwidth selector for local linear density estimators, Annals of Statistics, 25, 1001-1013.
[4] Cheng, M. Y., Fan, J. and J. S. Marron (1997) On automatic boundary corrections, Annals of Statistics, 25, 1691-1708.
[5] Cline, D. B. and J. D. Hart (1991) Kernel estimation of densities with discontinuities or discontinuous derivatives, Statistics, 22, 69-84.
[6] Copas, J. B. (1995) Local likelihood based on kernel censoring, Journal of the Royal Statistical Society, B, 57, 221-235.
[7] Davidson, R. and J. G. MacKinnon (1998) Graphical methods for investigating the size and power of test statistics, The Manchester School, 66, 1-26.
[8] Fan, J. and I. Gijbels (1996) Local Polynomial Modelling and Its Applications, Chapman & Hall.
[9] Fan, J., Zhang, C. and J. Zhang (2001) Generalized likelihood ratio statistics and Wilks phenomenon, Annals of Statistics, 29, 153-193.
[10] Gregory, A. W. and M. R. Veal (1985) Formulating Wald tests of nonlinear restrictions, Econometrica, 53, 1465-1468.
[11] Hahn, J., Todd, P. and W. van der Klaauw (2001) Identification and estimation of treatment effects with a regression discontinuity design, Econometrica, 69, 201-209.
[12] Hjort, N. L. and M. C. Jones (1996) Locally parametric nonparametric density estimation, Annals of Statistics, 24, 1619-1647.
[13] Imbens, G. W. and T. Lemieux (2008) Regression discontinuity designs: a guide to practice, Journal of Econometrics, 142, 615-635.
[14] Kitamura, Y. (1997) Empirical likelihood methods with weakly dependent data, Annals of Statistics, 25, 2084-2102.
[15] Loader, C. R. (1996) Local likelihood density estimation, Annals of Statistics, 24, 1602-1618.


[16] Marron, J. S. and D. Ruppert (1994) Transformations to reduce boundary bias in kernel density estimation, Journal of the Royal Statistical Society, B, 56, 653-671.
[17] McCrary, J. (2008) Manipulation of the running variable in the regression discontinuity design: a density test, Journal of Econometrics, 142, 698-714.
[18] Newey, W. K. and R. J. Smith (2004) Higher order properties of GMM and generalized empirical likelihood estimators, Econometrica, 72, 219-255.
[19] Owen, A. (2001) Empirical Likelihood, Chapman & Hall, New York.
[20] Porter, J. (2003) Estimation in the regression discontinuity model, Working paper, Department of Economics, University of Wisconsin.
[21] Saez, E. (2009) Do taxpayers bunch at kink points? Forthcoming in American Economic Journal: Economic Policy.
[22] Xu, K.-L. and P. C. B. Phillips (2007) Tilted nonparametric estimation of volatility functions with empirical applications, Cowles Foundation Discussion Paper 1612R, Yale University.
[23] Xu, K.-L. (2010) Re-weighted functional estimation of diffusion models, Econometric Theory, 26, 541-563.


Table 1: Finite-sample biases, standard deviations (Std.) and root mean squared errors (RMSE) of the binning and the local likelihood estimators of $\theta$. The data are generated from a N(12, 3) density and the sample size is n = 1000. The discontinuity point is c = 13. Columns: fixed bandwidths h = 1-4; data-dependent bandwidths c·ĥ with c = 1.5⁻¹, 1, 1.5, 1.5².

| | Est. | h = 1 | h = 2 | h = 3 | h = 4 | c = 1.5⁻¹ | c = 1 | c = 1.5 | c = 1.5² |
|---|---|---|---|---|---|---|---|---|---|
| Bias | θ̃_G | .0403 | .0710 | .0886 | .0949 | .0462 | .0644 | .0825 | .0936 |
| | θ̂_G | .0020 | .0148 | .0360 | .0624 | .0039 | .0108 | .0266 | .0577 |
| | θ̃ | .0408 | .0725 | .0907 | .0974 | .0478 | .0662 | .0841 | .0959 |
| | θ̂ | .0016 | .0032 | .0094 | .0221 | .0024 | .0036 | .0057 | .0194 |
| Std. | θ̃_G | .0236 | .0157 | .0127 | .0104 | .0227 | .0188 | .0149 | .0107 |
| | θ̂_G | .0468 | .0312 | .0254 | .0217 | .0409 | .0345 | .0296 | .0262 |
| | θ̃ | .0232 | .0155 | .0124 | .0103 | .0228 | .0185 | .0151 | .0105 |
| | θ̂ | .0452 | .0326 | .0292 | .0283 | .0425 | .0354 | .0316 | .0291 |
| RMSE | θ̃_G | .0467 | .0728 | .0895 | .0955 | .0514 | .0671 | .0838 | .0942 |
| | θ̂_G | .0468 | .0345 | .0441 | .0661 | .0411 | .0362 | .0398 | .0633 |
| | θ̃ | .0469 | .0742 | .0915 | .0979 | .0530 | .0688 | .0854 | .0965 |
| | θ̂ | .0453 | .0328 | .0307 | .0359 | .0426 | .0356 | .0321 | .0350 |

ĥ: mean 1.7257, std. 0.1975.
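The binning rows can be reproduced in outline. The following is a rough sketch, not the authors' implementation: it histograms the data with bins aligned so that none straddles the cutoff, then fits a triangular-kernel local linear regression to the normalized bin heights on each side of c, in the spirit of McCrary (2008); the function name, bin width, and kernel choice are illustrative assumptions. The inputs mirror the Table 1 design (N(12, 3) draws, c = 13).

```python
import numpy as np

def binning_density(x, c, h, b):
    """Sketch of a binning-type density estimator at a cutoff c:
    histogram with bin width b aligned so no bin straddles c, then a
    triangular-kernel local linear fit of the normalized bin heights
    on each side, evaluated at the cutoff."""
    n = len(x)
    idx = np.floor((x - c) / b)
    mids = (idx + 0.5) * b + c              # bin midpoints, none equal to c
    grid, counts = np.unique(mids, return_counts=True)
    y = counts / (n * b)                    # normalized histogram heights

    def local_linear(side):
        mask = grid < c if side == "left" else grid > c
        xg, yg = grid[mask], y[mask]
        u = (xg - c) / h
        w = np.clip(1.0 - np.abs(u), 0.0, None)   # triangular kernel weights
        X = np.column_stack([np.ones_like(xg), xg - c])
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * yg))
        return beta[0]                      # intercept = fitted density at c

    return local_linear("left"), local_linear("right")

rng = np.random.default_rng(1)
x = rng.normal(12.0, 3.0, size=1000)        # Table 1 design
fl, fr = binning_density(x, c=13.0, h=2.0, b=0.5)
```

Since the N(12, 3) density is continuous, the two one-sided estimates should roughly agree; a large gap between them is the informal signature of manipulation.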


Table 2: Finite-sample biases, standard deviations (Std.) and root mean squared errors (RMSE) of the binning and the local likelihood estimators of $\theta$. The data are generated from a shifted and scaled Student's t density [i.e., 12 + t(5)/√(5/9)] and the sample size is n = 1000. The discontinuity point is c = 13. Columns: fixed bandwidths h = 1-4; data-dependent bandwidths c·ĥ with c = 1.5⁻¹, 1, 1.5, 1.5².

| | Est. | h = 1 | h = 2 | h = 3 | h = 4 | c = 1.5⁻¹ | c = 1 | c = 1.5 | c = 1.5² |
|---|---|---|---|---|---|---|---|---|---|
| Bias | θ̃_G | .0759 | .1208 | .1355 | .1315 | .0887 | .1163 | .1319 | .1276 |
| | θ̂_G | .0083 | .0394 | .0876 | .1254 | .0120 | .0369 | .0790 | .1274 |
| | θ̃ | .0762 | .1223 | .1378 | .1343 | .0899 | .1180 | .1343 | .1305 |
| | θ̂ | .0036 | .0183 | .0492 | .0788 | .0060 | .0184 | .0442 | .0832 |
| Std. | θ̃_G | .0235 | .0166 | .0130 | .0100 | .0267 | .0209 | .0138 | .0120 |
| | θ̂_G | .0455 | .0333 | .0262 | .0219 | .0444 | .0399 | .0381 | .0305 |
| | θ̃ | .0232 | .0163 | .0130 | .0096 | .0254 | .0199 | .0136 | .0119 |
| | θ̂ | .0456 | .0358 | .0320 | .0296 | .0421 | .0374 | .0377 | .0380 |
| RMSE | θ̃_G | .0795 | .1220 | .1361 | .1319 | .0926 | .1182 | .1326 | .1282 |
| | θ̂_G | .0462 | .0516 | .0915 | .1273 | .0460 | .0543 | .0877 | .1310 |
| | θ̃ | .0796 | .1234 | .1385 | .1346 | .0934 | .1197 | .1350 | .1311 |
| | θ̂ | .0457 | .0402 | .0587 | .0842 | .0425 | .0417 | .0581 | .0914 |

ĥ: mean 1.9132, std. 0.3560.


Table 3: Finite-sample sizes, quantiles and powers of the W_G(·), ℓ_G(·) and ℓ(·) tests of density continuity, i.e., H₀: θ₀ = 0 (nominal sizes: 5% and 10%). The powers are calculated when the data are generated from a mixture of left and right truncated normal distributions at c with probability τ, where τ = f(c)d with d ∈ {0.02, 0.04, 0.06, 0.08, 0.10}.

| n | Test | Size | Finite-sample quantile | Asymptotic quantile | d = 0.02 | 0.04 | 0.06 | 0.08 | 0.10 |
|---|---|---|---|---|---|---|---|---|---|
| 1000 | W_G, 5% | .073 | 4.51 | 3.84 | .058 | .104 | .202 | .368 | .545 |
| | ℓ_G, 5% | .067 | 4.19 | | .059 | .139 | .244 | .367 | .491 |
| | ℓ, 5% | .067 | 4.29 | | .082 | .190 | .366 | .578 | .754 |
| | W_G, 10% | .138 | 3.27 | 2.71 | .107 | .176 | .303 | .489 | .651 |
| | ℓ_G, 10% | .137 | 3.19 | | .131 | .211 | .353 | .505 | .648 |
| | ℓ, 10% | .110 | 2.89 | | .152 | .273 | .474 | .681 | .841 |
| 2000 | W_G, 5% | .074 | 4.53 | 3.84 | .071 | .191 | .394 | .642 | .856 |
| | ℓ_G, 5% | .047 | 3.79 | | .072 | .191 | .418 | .636 | .800 |
| | ℓ, 5% | .050 | 3.82 | | .112 | .306 | .604 | .845 | .973 |
| | W_G, 10% | .134 | 3.18 | 2.71 | .126 | .261 | .530 | .754 | .924 |
| | ℓ_G, 10% | .125 | 2.98 | | .144 | .280 | .541 | .753 | .909 |
| | ℓ, 10% | .104 | 2.75 | | .184 | .418 | .715 | .898 | .983 |
| 5000 | W_G, 5% | .065 | 4.31 | 3.84 | .108 | .427 | .796 | .960 | .995 |
| | ℓ_G, 5% | .038 | 3.47 | | .112 | .423 | .769 | .932 | .987 |
| | ℓ, 5% | .062 | 4.54 | | .193 | .593 | .900 | .990 | 1.00 |
| | W_G, 10% | .127 | 3.07 | 2.71 | .185 | .548 | .864 | .983 | .998 |
| | ℓ_G, 10% | .099 | 2.64 | | .185 | .545 | .861 | .977 | .995 |
| | ℓ, 10% | .112 | 2.85 | | .282 | .690 | .952 | .996 | 1.00 |
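The alternative in Table 3 shifts mass from just below to just above the cutoff. The sketch below is a simplified stand-in for the truncated-normal mixture of the simulation design, not the exact DGP: with probability p, an observation below the cutoff is re-drawn from the same normal truncated to [c, ∞), which produces an upward density jump at c. The function name and mechanics are illustrative assumptions.

```python
import numpy as np

def manipulated_sample(n, c, p, rng, mu=12.0, sigma=3.0):
    """Illustrative manipulation DGP: draw X ~ N(mu, sigma^2); with
    probability p, a draw falling below the cutoff c is re-drawn from
    the same normal truncated to [c, inf), so density jumps up at c."""
    x = rng.normal(mu, sigma, size=n)
    move = (x < c) & (rng.random(n) < p)   # which left-of-c draws to move
    k = int(move.sum())
    out = []
    while len(out) < k:                     # rejection-sample the truncation
        z = rng.normal(mu, sigma, size=4 * k + 8)
        out.extend(z[z >= c].tolist())
    x[move] = np.asarray(out[:k])
    return x

rng = np.random.default_rng(2)
x = manipulated_sample(5000, c=13.0, p=0.10, rng=rng)
```

With p = 1 every sub-cutoff draw is moved, so the whole sample lies above c; with small p the jump at c is correspondingly small, which is how a power curve against d-type alternatives could be traced out.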


Figure 1: (a) P-value plots and (b) P-value discrepancy plots (Davidson and MacKinnon, 1998) for the W_G, ℓ_G and ℓ tests when n = 2000. [Plot omitted; axes: nominal size vs. actual size in (a) and size distortion in (b), with the 45° line for reference.]

Table 4: Estimation and testing of the discontinuity of the density of enrollments at multiples of 40 (according to Maimonides' rule; Angrist and Lavy, 1999) for fifth graders. The binning and local likelihood methods are used with various smoothing bandwidths.

| c | h | f̂_l (bin.) | f̂_r (bin.) | θ̂_G | Wald W_G | EL ℓ_G | f̂_l (loc. lik.) | f̂_r (loc. lik.) | θ̂ | EL ℓ |
|---|---|---|---|---|---|---|---|---|---|---|
| 40 | 15 | .0056 | .0080 | .0024 | 2.666 | 0.444 | .0039 | .0114 | .0075 | 20.67 |
| | 20 | .0052 | .0089 | .0037 | 7.888 | 1.331 | .0040 | .0114 | .0074 | 26.94 |
| | 25 | .0050 | .0094 | .0044 | 13.57 | 2.321 | .0040 | .0114 | .0074 | 33.28 |
| | 30 | .0054 | .0099 | .0045 | 16.23 | 2.856 | .0045 | .0116 | .0072 | 36.36 |
| 80 | 15 | .0115 | .0095 | .0019 | 1.141 | 1.014 | .0081 | .0140 | .0059 | 8.301 |
| | 20 | .0112 | .0089 | .0024 | 2.389 | 1.964 | .0085 | .0116 | .0030 | 3.366 |
| | 25 | .0108 | .0087 | .0021 | 2.395 | 1.534 | .0087 | .0107 | .0021 | 2.092 |
| | 30 | .0105 | .0089 | .0016 | 1.698 | 1.001 | .0088 | .0107 | .0020 | 2.350 |
| 120 | 15 | .0092 | .0050 | .0043 | 7.920 | 3.906 | .0064 | .0078 | .0014 | 0.577 |
| | 20 | .0088 | .0048 | .0039 | 9.355 | 3.977 | .0066 | .0070 | .0003 | 0.045 |
| | 25 | .0075 | .0047 | .0029 | 7.005 | 2.403 | .0060 | .0063 | .0003 | 0.069 |
| | 30 | .0066 | .0045 | .0021 | 4.988 | 1.263 | .0055 | .0060 | .0005 | 0.178 |
| 160 | 15 | .0027 | .0010 | .0017 | 4.670 | 3.253 | .0017 | .0013 | .0003 | 0.142 |
| | 20 | .0025 | .0010 | .0015 | 5.176 | 2.397 | .0018 | .0012 | .0006 | 0.730 |
| | 25 | .0023 | .0010 | .0012 | 4.550 | 1.710 | .0017 | .0013 | .0005 | 0.586 |
| | 30 | .0021 | .0011 | .0011 | 4.316 | 1.456 | .0017 | .0013 | .0004 | 0.473 |


Figure 2: Histogram of the enrollments of 2029 classes in Grade 5 (Data: Angrist and Lavy, 1999). [Plot omitted; axes: frequencies (0-0.016) vs. enrollments (0-200).]

Table 5: Empirical likelihood confidence sets (EL CSs) of the discontinuity of the density of enrollments at c = 40 for fifth graders. The binning and local likelihood methods are used with various smoothing bandwidths.

| h | θ̂_G | EL CS (binning) | Length | θ̂ | EL CS (local likelihood) | Length |
|---|---|---|---|---|---|---|
| 15 | .0024 | [-.0104, .0152] | .0256 | .0075 | [.0046, .0106] | .0060 |
| 20 | .0037 | [-.0036, .0120] | .0156 | .0074 | [.0049, .0101] | .0052 |
| 25 | .0044 | [-.0014, .0108] | .0122 | .0074 | [.0051, .0097] | .0046 |
| 30 | .0045 | [-.0007, .0106] | .0113 | .0072 | [.0051, .0096] | .0045 |
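The confidence sets in Table 5 are obtained by inverting a test: a candidate θ₀ enters the set when the likelihood-ratio-type statistic stays below the χ²(1) critical value (3.84 at 5%). The sketch below shows only the grid-inversion mechanics; as a hedged stand-in for the EL ratio it uses a squared t-statistic for a simulated mean, so the statistic, data, and names are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def invert_statistic(stat_fn, grid, crit):
    """Grid inversion: collect every candidate parameter value that is
    not rejected at the given critical value; return the interval hull."""
    keep = [t for t in grid if stat_fn(t) <= crit]
    return min(keep), max(keep)

rng = np.random.default_rng(4)
x = rng.normal(0.05, 1.0, size=400)
n, m, s2 = len(x), x.mean(), x.var(ddof=1)
# squared t-statistic as a stand-in for the EL ratio ell(theta0)
stat = lambda t0: n * (m - t0) ** 2 / s2
lo, hi = invert_statistic(stat, np.linspace(-0.5, 0.6, 221), crit=3.84)
```

One attraction noted in the text carries over directly: a set built this way is defined by the statistic alone, so it remains well defined even when a point estimate (such as the binning estimator) is negative.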


Figure 3: The estimated density function of school enrollments for the fifth graders using the data on left and right sides of c = 40. Binned data are also displayed. [Plot omitted; binning and local likelihood estimators shown over enrollments 0-200, densities 0-0.015.]


Figure 4: The estimated density function of school enrollments for the fifth graders using the data on left and right sides of c = 120. Binned data are also displayed. [Plot omitted; binning and local likelihood estimators shown over enrollments 0-200, densities 0-0.015.]
