Estimation and Inference of Discontinuity in Density
the empirical likelihood confidence sets are well defined even if the local linear binning estimator of McCrary (2008) yields negative estimates for the densities. Simulation results indicate that the empirical likelihood tests have accurate finite sample sizes, and are generally more powerful (especially those based on the local likelihood estimators) than the Wald test.

Angrist and Lavy (1999) exploited the so-called Maimonides' rule, which stipulates that a class with more than 40 pupils should be split into two, as an exogenous source of variation in class size to identify the effects of class size on the scholastic achievement of pupils in Israel. An important assumption of their study is no manipulation of class size by parents. Evidence of such manipulation casts doubt on the identification strategy of the regression discontinuity design. Angrist and Lavy (1999) provided intuitive arguments that manipulation is unlikely to happen in Israel. In this paper, we statistically re-examine the assumption of no manipulation of class sizes by testing continuity of the enrollment density [i.e., density continuity of the running variable (McCrary, 2008)]. Using the proposed local likelihood estimator and the associated empirical likelihood inference procedure, we find significant evidence of manipulation at the first multiple of 40 but not clearly at other multiples.
The progressively smaller estimates and weaker evidence of discontinuity that our empirical results uncover at successive multiples of 40 are consistent with the fact that parents have a stronger incentive to manipulate class size to just above 40, where successful manipulation places their children in schools with much smaller classes, than to just above 80, 120, or 160. These findings are not shared by McCrary's binning estimator and Wald test.

This paper also contributes to the rapidly growing literature on empirical likelihood (see Owen, 2001, for a review). In particular, we extend the empirical likelihood approach to the density discontinuity inference problem by incorporating local polynomial fitting techniques as in Fan and Gijbels (1996) and Loader (1996). We show that the empirical likelihood ratios for density discontinuities have an asymptotically chi-squared distribution. Therefore, we can still observe the Wilks phenomenon (Fan, Zhang and Zhang, 2001) in this nonparametric density discontinuity inference problem.

This paper is organized as follows. Section 2.1 presents our basic setup and point estimation methods. In Section 2.2 we construct empirical likelihood functions of discontinuity in density. Section 2.3 presents the asymptotic properties of the empirical likelihood-based inference methods. The proposed methods are evaluated by Monte Carlo simulations in Section 3.
An empirical application to validation of the regression discontinuity design is provided in Section 4. Section 5 concludes. Appendix A contains mathematical proofs and some preliminary lemmas.

2 Main Result

2.1 Setup and Estimation

We first introduce our basic setup. Let $\{X_i\}_{i=1}^n$ be an iid sample of $X \in \mathcal{X} \subseteq \mathbb{R}$ with probability density function $f(x)$. We do not impose any parametric functional form for $f(x)$. Suppose that we
are interested in (dis)continuity of the density $f(x)$ at some given point $c \in \mathcal{X}$. Let $f_l = \lim_{x \uparrow c} f(x)$ and $f_r = \lim_{x \downarrow c} f(x)$ be the left and right limits of $f(x)$ at $x = c$, respectively. Our object of interest is the difference of the left and right limits:

$$\theta_0 = f_r - f_l. \qquad (1)$$

If $\theta_0 = 0$, then the density function $f(x)$ is continuous at $x = c$. We wish to estimate, construct a confidence set, and conduct a hypothesis test for the parameter $\theta_0$ without assuming any parametric functional form of $f_l$ or $f_r$.

First, let us consider the point estimation problem for $\theta_0$. If the density is discontinuous at $c$ (i.e., $\theta_0 \neq 0$), we can regard the estimation problems for the limits $f_l$ and $f_r$ as nonparametric density estimation at the boundary point $c$ using the sub-samples with $X_i < c$ and $X_i \ge c$. To reduce boundary biases in nonparametric estimators, it is reasonable to apply a local polynomial fitting technique, which has favorable properties on a boundary (see, e.g., Fan and Gijbels, 1996). In density estimation we do not have any regressands or regressors. However, there are at least two ways to adapt the local polynomial method to the density estimation problem: the binning and local likelihood methods.
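To fix ideas before the formal definitions, the binning approach can be sketched in a few lines of numpy: bin the data into normalized frequencies on a grid through $c$ and run a kernel-weighted linear fit on each side, reading off the intercept as the boundary density. The function name, the grid construction, and the triangular-kernel choice below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def binned_boundary_density(x, c, b, h, side="left"):
    """One side of the binning method: local linear fit of normalized
    bin frequencies on an equi-spaced grid through c, evaluated at c.

    Illustrative sketch (triangular kernel, ad hoc grid); not the
    authors' implementation.
    """
    lo, hi = x.min() - b, x.max() + b
    left_grid = np.arange(c, lo, -b)[::-1]          # ..., c - 2b, c - b, c
    grid = np.concatenate([left_grid[:-1], np.arange(c, hi, b)])
    # normalized frequency Z_j = (1/(n*b)) * #{i : |X_i - X_j| < b/2}
    z = np.array([(np.abs(x - g) < b / 2).mean() for g in grid]) / b
    keep = grid < c if side == "left" else grid >= c
    u = (grid[keep] - c) / h
    w = np.maximum(0.0, 1.0 - np.abs(u))            # triangular kernel weights
    A = np.column_stack([np.ones(u.size), u])       # local linear design (1, u)
    WA = A * w[:, None]                             # weighted design
    beta = np.linalg.solve(A.T @ WA, WA.T @ z[keep])
    return beta[0]                                  # intercept = boundary density
```

An estimate of the jump is then the difference of the two one-sided calls, `binned_boundary_density(x, c, b, h, "right") - binned_boundary_density(x, c, b, h, "left")`.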
The binning method (e.g., Cheng, 1994, 1997, and Cheng, Fan and Marron, 1997) creates regressands and regressors based on binned data and then implements local polynomial regression. Let $\{X_j^G\}_{j=1}^J = \{\ldots, c - 2b, c - b, c, c + b, c + 2b, \ldots\}$, which plays the role of a regressor, be an equi-spaced grid of width $b$, where the interval $[X_1^G, X_J^G]$ covers the support $\mathcal{X}$. Let $I\{\cdot\}$ be the indicator function and $Z_j^G = \frac{1}{nb}\sum_{i=1}^n I\{|X_i - X_j^G| < \frac{b}{2}\}$, which plays the role of a regressand, be the normalized frequency for the $j$-th bin. The bin-based local linear estimators $\hat{f}_l^G$ and $\hat{f}_r^G$ for $f_l$ and $f_r$ are defined as the solutions to the following weighted least squares problems with respect to $a_l$ and $a_r$, respectively,

$$\min_{a_l, b_l} \sum_{j: X_j^G < c} \left\{Z_j^G - a_l - b_l(X_j^G - c)\right\}^2 K\!\left(\frac{X_j^G - c}{h}\right), \qquad \min_{a_r, b_r} \sum_{j: X_j^G \ge c} \left\{Z_j^G - a_r - b_r(X_j^G - c)\right\}^2 K\!\left(\frac{X_j^G - c}{h}\right),$$
where, with $I_j^G = I\{X_j^G \ge c\}$ and $X_{j,h}^G = (X_j^G - c)/h$,

$$K_{lj}^G = K(X_{j,h}^G)\left\{\sum_{k=1}^J (1 - I_k^G) K(X_{k,h}^G)(X_{k,h}^G)^2 - X_{j,h}^G \sum_{k=1}^J (1 - I_k^G) K(X_{k,h}^G) X_{k,h}^G\right\},$$

$$K_{rj}^G = K(X_{j,h}^G)\left\{\sum_{k=1}^J I_k^G K(X_{k,h}^G)(X_{k,h}^G)^2 - X_{j,h}^G \sum_{k=1}^J I_k^G K(X_{k,h}^G) X_{k,h}^G\right\}.$$

If we regard (6) as estimating equations or sample moment conditions for $(\mathrm{E}[\hat{f}_l], \mathrm{E}[\hat{f}_r])$, the bin-based empirical likelihood function for $(\mathrm{E}[\hat{f}_l], \mathrm{E}[\hat{f}_r])$ is constructed as

$$L^G(a_l, a_r) = \sup_{\{p_j\}_{j=1}^J} \prod_{j=1}^J p_j, \qquad (7)$$

$$\text{s.t. } 0 \le p_j \le 1, \quad \sum_{j=1}^J p_j = 1, \quad \sum_{j=1}^J p_j (1 - I_j^G)(K_{lj}^G Z_j^G - a_l) = 0, \quad \sum_{j=1}^J p_j I_j^G (K_{rj}^G Z_j^G - a_r) = 0.$$

The weight $p_j$ can be interpreted as probability mass allocated to the observed value of $Z_j^G$. By applying the Lagrange multiplier method, under certain regularity conditions (see Theorem 2.2 of Newey and Smith, 2004), we can obtain the dual representation of the maximization problem in (7). That is,

$$\ell^G(a_l, a_r) = -2\log L^G(a_l, a_r) - 2J\log J = 2\sup_{\lambda^G \in \Lambda_J^G(a_l, a_r)} \sum_{j=1}^J \log\left(1 + \lambda^{G\prime} g_j^G(a_l, a_r)\right), \qquad (8)$$

where $\Lambda_J^G(a_l, a_r) = \{\lambda^G \in \mathbb{R}^2 : \lambda^{G\prime} g_j^G(a_l, a_r) \in \mathcal{V} \text{ for } j = 1, \ldots, J\}$, $\mathcal{V}$ is an open interval containing $0$, and

$$g_j^G(a_l, a_r) = \left((1 - I_j^G)(K_{lj}^G Z_j^G - a_l),\ I_j^G (K_{rj}^G Z_j^G - a_r)\right)^{\prime}.$$

Note that the $J$-variable maximization problem in (7) with respect to $\{p_j\}_{j=1}^J$ reduces to the two-variable convex optimization problem in (8) with respect to $\lambda^G$, which is easily implemented by a Newton-type optimization algorithm. Therefore, in practice we use the dual formulation (8) to compute the (log) empirical likelihood function.
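The inner maximization in the dual (8) is a smooth concave problem in the multiplier, so a damped Newton iteration converges quickly. The following sketch (the function name is hypothetical; this is not the authors' implementation) computes $2\sup_{\lambda}\sum_j \log(1 + \lambda^{\prime} g_j)$ for a given matrix whose rows are the moment contributions $g_j$:

```python
import numpy as np

def el_loglik(g, tol=1e-9, max_iter=100):
    """2 * sup_lam sum_j log(1 + lam'g_j), as in the dual (8), computed
    by a damped Newton iteration on the concave inner problem.

    g: (J, d) array whose rows are the moment contributions g_j.
    Illustrative sketch, not the authors' implementation.
    """
    lam = np.zeros(g.shape[1])
    for _ in range(max_iter):
        denom = 1.0 + g @ lam
        grad = (g / denom[:, None]).sum(axis=0)     # score in lam
        if np.abs(grad).max() < tol:
            break
        hess = -(g.T * denom ** -2) @ g             # negative definite Hessian
        step = np.linalg.solve(hess, grad)          # Newton direction
        t = 1.0
        while np.any(1.0 + g @ (lam - t * step) <= 1e-10):
            t *= 0.5                                # backtrack: keep 1 + lam'g > 0
        lam = lam - t * step
    return 2.0 * np.sum(np.log(1.0 + g @ lam))
```

The statistic is zero when the sample moments already balance at the hypothesized values, and grows as the constraints become harder to satisfy; profiling for the concentrated statistic then amounts to minimizing this value over the free parameter with the constraint between $a_l$ and $a_r$ held fixed.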
Based on the empirical likelihood function $\ell^G(a_l, a_r)$, the concentrated likelihood function for the parameter of interest $\theta_0 = f_r - f_l$ is defined as

$$\ell^G(\theta) = \min_{\{(a_l, a_r) \in A_l \times A_r : \theta = a_r - a_l\}} \ell^G(a_l, a_r), \qquad (9)$$

for the parameter space $A_l \times A_r$ of $(f_l, f_r)$.

We now define the empirical likelihood function based on the local likelihood approach. Let $I_i = I\{X_i \ge c\}$ and $X_{i,h} = \frac{X_i - c}{h}$. The first-order conditions for the local likelihood maximization problems
in (3) are written as

$$0 = \frac{1}{n}\sum_{i=1}^n (1, X_{i,h})^{\prime}(1 - I_i)K(X_{i,h}) - \int_{x < c} \left(1, \tfrac{x-c}{h}\right)^{\prime} K\!\left(\tfrac{x-c}{h}\right) \exp\{a_l + b_l(x - c)\}\,dx,$$

$$0 = \frac{1}{n}\sum_{i=1}^n (1, X_{i,h})^{\prime} I_i K(X_{i,h}) - \int_{x \ge c} \left(1, \tfrac{x-c}{h}\right)^{\prime} K\!\left(\tfrac{x-c}{h}\right) \exp\{a_r + b_r(x - c)\}\,dx.$$
the binned data, the computational cost of $\ell^G(\theta)$ is lower than that of $\ell(\theta)$, because the dimensions of the auxiliary parameters $\lambda$ and $(a_l, b_l, b_r)$ in $\ell(\theta)$ are higher than those of $\lambda^G$ and $a_l$ in $\ell^G(\theta)$.

One useful feature of our empirical likelihood approach is that it can easily incorporate additional information. Suppose we have prior information about $X$ specified in the form of $\mathrm{E}[m(X)] = 0$, such as the mean, variance, or quantiles. Using the weights $\{p_i\}_{i=1}^n$, this information can be incorporated into the likelihood maximization problem (10) by adding the restriction $\sum_{i=1}^n p_i m(X_i) = 0$. In this case, the dual form (11) is re-defined by adding $m(X_i)$ to the moment function $g_i(a_l, a_r, b_l, b_r)$. The resulting empirical likelihood inference is more efficient if the additional information is valid.

Although this paper focuses exclusively on discontinuities of density functions, our empirical likelihood approach can also deal with discontinuities in derivatives of density functions. For example, suppose we are interested in conducting inference on the first-order derivatives of the density using the local likelihood-based empirical likelihood.
Then the contrast $\theta_0^{\prime} = f_r^{\prime} - f_l^{\prime}$ can be estimated by $\hat{\theta}^{\prime} = \hat{b}_r \exp(\hat{a}_r) - \hat{b}_l \exp(\hat{a}_l)$ using the solutions of (3), and the empirical likelihood function for $\theta_0^{\prime}$ can be defined as

$$\ell(\theta^{\prime}) = \min_{\{(a_l, a_r, b_l, b_r) \in A_l \times A_r \times B_l \times B_r : \theta^{\prime} = b_r \exp(a_r) - b_l \exp(a_l)\}} \ell(a_l, a_r, b_l, b_r).$$

In this case, one may add quadratic terms $X_{i,h}^2$ to the moment function $g_i(a_l, a_r, b_l, b_r)$ to reduce the boundary bias of the derivative estimates.

2.3 Large Sample Properties

We now investigate the asymptotic properties of the proposed empirical likelihood statistics. We impose the following assumptions.

Assumption.

1. $\{X_i\}_{i=1}^n$ is i.i.d.

2. There exists a neighborhood $N$ around $c$ such that $f$ is continuously second-order differentiable on $N \setminus \{c\}$. For the matrices $V$ and $G$ defined in (13), $V$ is positive definite and $G$ has full column rank.

3. $K$ is a symmetric and bounded density function with support $[-k, k]$ for some $k > 0$.

4. As $n \to \infty$, $h \to 0$, $nh \to \infty$, and $nh^5 \to 0$. Additionally, $b/h \to 0$ for $\ell^G(\theta)$ and $nh^3 \to \infty$ for $\ell(\theta)$.

5. $A_l$ and $A_r$ are compact, $f_l \in \mathrm{int}(A_l)$, and $f_r \in \mathrm{int}(A_r)$. Additionally, for $\ell(\theta)$, $B_l$ and $B_r$ are compact, $f_l^{\prime}/f_l \in \mathrm{int}(B_l)$, and $f_r^{\prime}/f_r \in \mathrm{int}(B_r)$.
Assumption 1 is on the data structure. Although it is beyond the scope of this paper, it would be interesting to extend the proposed method to weakly dependent data, where we are interested in the discontinuity of the stationary distribution. For this extension, we would need to introduce a blocking technique to handle the time series dependence in the moment functions (see Kitamura, 1997). Assumption 2 restricts the local shape of the density function around $x = c$. This assumption allows discontinuity of the density function at $x = c$. Assumption 3 is on the kernel function $K$ and implies that $K$ is a second-order kernel. This assumption is satisfied by, e.g., the triangle kernel $K(a) = \max\{0, 1 - |a|\}$ and the Epanechnikov kernel $K(a) = \frac{3}{4}(1 - a^2) I\{|a| \le 1\}$. Assumption 4 is on the bandwidth parameter $h$. The requirement $nh^5 \to 0$ corresponds to an undersmoothing condition to remove the bias component $(f_r - f_l) - (\mathrm{E}[\hat{f}_r] - \mathrm{E}[\hat{f}_l])$ in the construction of empirical likelihood. The requirement $b/h \to 0$ is on the bin width $b$ used for the binning estimator. The requirement $nh^3 \to \infty$ is needed to obtain the consistency of the minimizers that solve (12). If we do not have the local linear terms $(X_i - c)$ and $(u - c)$ in (3), this requirement is unnecessary.
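Assumption 3 can be checked numerically for a candidate kernel; the sketch below (an illustrative helper, not part of the paper) verifies unit mass and a zero first moment for the two kernels named above via a trapezoid rule on a symmetric grid:

```python
import numpy as np

def kernel_moments(K, k=1.0, m=20001):
    """Trapezoid-rule check of Assumption 3-style properties: the mass
    (should be 1) and the first moment (should be 0) of a kernel
    supported on [-k, k]. Illustrative numerical check only.
    """
    u = np.linspace(-k, k, m)          # symmetric grid including 0
    w = K(u)
    du = u[1] - u[0]
    mass = (w.sum() - 0.5 * (w[0] + w[-1])) * du
    first = ((u * w).sum() - 0.5 * (u[0] * w[0] + u[-1] * w[-1])) * du
    return mass, first

triangle = lambda a: np.maximum(0.0, 1.0 - np.abs(a))
epanechnikov = lambda a: 0.75 * (1.0 - a ** 2) * (np.abs(a) <= 1.0)
```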
Assumption 5 is similar to an assumption that would be used in a parametric estimation problem.

Under these assumptions, the asymptotic distributions of the empirical likelihood functions $\ell^G(\theta_0)$ and $\ell(\theta_0)$ evaluated at the true parameter value $\theta_0$ are obtained as follows.

Theorem. Under Assumption, as $n \to \infty$,

$$\ell^G(\theta_0) \xrightarrow{d} \chi^2(1), \qquad \ell(\theta_0) \xrightarrow{d} \chi^2(1).$$

Note that even in this nonparametric hypothesis testing problem, we observe the convergence of the empirical likelihood statistic to the pivotal $\chi^2$ distribution, i.e., the Wilks phenomenon emerges. The null hypothesis $H_0: \theta_0 = \bar\theta$ against $H_1: \theta_0 \neq \bar\theta$ for some given $\bar\theta$ can be tested by comparing the test statistic $\ell^G(\bar\theta)$ or $\ell(\bar\theta)$ with a $\chi^2(1)$ critical value. For example, the null of density continuity is tested by $\ell^G(0)$ or $\ell(0)$. Also, by inverting these test statistics, the $100(1 - \alpha)\%$ empirical likelihood asymptotic confidence sets are obtained as

$$CS^G = \left\{\theta : \ell^G(\theta) \le \chi^2_{1-\alpha}(1)\right\}, \qquad CS = \left\{\theta : \ell(\theta) \le \chi^2_{1-\alpha}(1)\right\},$$

where $\chi^2_{1-\alpha}(1)$ is the $(1 - \alpha)$-th quantile of the $\chi^2(1)$ distribution.

We now compare our empirical likelihood approach with the Wald approach suggested by McCrary (2008). First, in contrast to the t-test based on (5), the empirical likelihood test based on $\ell^G(0)$ or $\ell(0)$ does not require any asymptotic variance estimation, which is automatically incorporated in the construction of the empirical likelihood functions. Also, while the Wald test requires the derivation of the asymptotic variance $\sigma^2_K$ for each kernel function, the empirical likelihood tests do not require this
type of derivation. Second, the empirical likelihood confidence sets $CS^G$ and $CS$ do not require the local linear (or polynomial) estimators $\hat{f}_l$ and $\hat{f}_r$. Thus, even if $\hat{f}_l$ or $\hat{f}_r$ yields negative estimates in finite samples, $CS^G$ and $CS$ are well defined. Third, the empirical likelihood test statistics are invariant to the formulation of nonlinear null hypotheses. For example, to test density continuity, we may specify the null hypothesis as $H_0: \log f_l = \log f_r$, $\tilde{H}_0: f_l = f_r$, $\bar{H}_0: f_l/f_r = 1$, etc. For these hypotheses, the empirical likelihood test statistics are identical (i.e., $\ell^G(0)$ or $\ell(0)$). On the other hand, the Wald test statistic is not invariant to the formulation of the null hypothesis and may yield opposite conclusions in finite samples (see Gregory and Veal, 1985, for examples).

3 Simulations

In this section we study the finite-sample behavior of the aforementioned methods using simulations. First we focus on the point estimators, i.e., the local linear binning estimator $\hat\theta^G$ and the local (log-linear) likelihood estimator $\hat\theta$ for $\theta_0$. For comparison, we also consider the local constant binning estimator $\tilde\theta^G$ and the local (log-constant) likelihood estimator $\tilde\theta$. For the kernel function $K$, we use the triangle kernel $K(a) = \max\{0, 1 - |a|\}$.
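In the tests below, rejection decisions come from comparing the statistic with the $\chi^2(1)$ critical value, and confidence sets from inverting the statistic over a grid, as in Section 2.3. A short scipy-based sketch (the callable `ell` is a hypothetical stand-in for $\ell^G(\cdot)$ or $\ell(\cdot)$, not the authors' code):

```python
import numpy as np
from scipy.stats import chi2

def el_confidence_set(ell, theta_grid, alpha=0.05):
    """Invert the EL statistic: keep theta with ell(theta) <= chi2(1) quantile."""
    crit = chi2.ppf(1.0 - alpha, df=1)               # about 3.84 at alpha = 0.05
    keep = np.array([ell(t) <= crit for t in theta_grid])
    return theta_grid[keep]

def el_continuity_test(ell, alpha=0.05):
    """True if H0: theta_0 = 0 (density continuity) is rejected at level alpha."""
    return bool(ell(0.0) > chi2.ppf(1.0 - alpha, df=1))
```

Because only the statistic itself is compared with the quantile, no variance estimate and no point estimate of the jump are needed, which is exactly the practical advantage noted above.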
For the bandwidth $h$, we consider both fixed bandwidths $h = 1, 2, 3, 4$ and data-dependent bandwidths $h = \delta h_{dd}$, where $h_{dd}$ is the data-dependent bandwidth used by McCrary (2008) and $\delta = 1.5^k$ for $k = -1, 0, 1, 2$. For the bin size $b$ to implement $\hat\theta^G$ and $\tilde\theta^G$, we employ the data-dependent method suggested by McCrary (2008). The data are generated from the normal distribution $N(12, 3)$ (following McCrary, 2008) and the Student's t distribution $12 + \frac{3}{\sqrt{5}}t(5)$. Both distributions have the same mean and variance. The sample size is $n = 1000$ and the suspected discontinuity point is $c = 13$. Since the above densities are continuous, the true value is $\theta_0 = 0$. The biases, variances, and mean square errors (MSEs) of the above estimators are reported in Tables 1 and 2.

Among the four estimators, $\hat\theta$ performs best in terms of MSE. Its MSE is slightly smaller than that of its competitor, $\hat\theta^G$, when a small bandwidth is used, but is significantly smaller when the bandwidth is relatively large. The dominance of $\hat\theta$ mainly comes from its superior bias performance on boundaries, while its variance is comparable with that of $\hat\theta^G$. The local constant estimators $\tilde\theta^G$ and $\tilde\theta$ generally have smaller variances than $\hat\theta^G$ and $\hat\theta$, but have much larger biases and thus larger MSEs. All four estimators are generally biased downwards.
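The two continuous designs above can be drawn as follows. This sketch assumes $N(12, 3)$ denotes variance 3, which is how the "same mean and variance" statement is consistent ($\mathrm{Var}\,t(5) = 5/3$ and $(3/\sqrt{5})^2 \cdot 5/3 = 3$); it is an illustration, not the authors' code.

```python
import numpy as np

def simulate_designs(n=1000, seed=0):
    """Draw the two simulation designs: N(12, 3) and 12 + (3/sqrt(5)) t(5).

    Assumes N(12, 3) denotes variance 3, so both designs share mean 12
    and variance 3. Illustrative sketch, not the authors' code.
    """
    rng = np.random.default_rng(seed)
    x_normal = rng.normal(12.0, np.sqrt(3.0), size=n)
    x_t = 12.0 + (3.0 / np.sqrt(5.0)) * rng.standard_t(5, size=n)
    return x_normal, x_t
```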
On the other hand, a preliminary simulation indicates that these estimators are generally biased upwards if the suspected discontinuity point is on the left side of the peak, e.g., $c = 11$. The typical bias-variance trade-off in bandwidth selection is also observed: the biases are larger and the variances are smaller when the bandwidth increases. Data generated from the heavier-tailed distribution increase the biases of the four estimators significantly and affect the variances only slightly. Again, $\hat\theta$ appears to have smaller MSEs than the other estimators.

Next we look at the tests for (dis)continuity of the density function. We consider a general set-up of mixtures of normal distributions. Suppose that the random variable $X$ is drawn from a truncated
size at the values of enrollment that are just above multiples of 40, e.g., 41-45, 81-85, etc. Angrist and Lavy (1999) used this rule as an exogenous source of variation in class size to identify the effects of class size on the scholastic achievement of Israeli pupils.

An important identifying assumption of Angrist and Lavy (1999) is no manipulation of class size by parents, which is the testing focus of this section. Precisely, there could be two kinds of manipulation. The first is that parents may selectively exploit the Maimonides' rule by registering their children in schools with enrollments just above multiples of 40, so that their children are placed in classes with smaller sizes. Following McCrary's (2008) arguments, this would lead to an increase in the density of enrollment counts around the point just above a multiple of 40. Angrist and Lavy (1999) argued that this kind of manipulation is unlikely to happen for two reasons. First, Israeli pupils are required to attend school in their local registration area. Also, principals are required to grant enrollment to students within their district and are not permitted to register students from outside their district. Second, even in the exceptional cases where parents intentionally move to another school district hoping to get a better draw in the enrollment lottery (e.g.,
41-45 instead of 38), "there is no way to know (exactly) whether a predicted enrollment of 41 will not decline to 38 by the time school starts, obviating the need for two small classes" (Angrist and Lavy, 1999).

The second kind of manipulation of class size is that parents may withdraw their kids from the public school system when they find that the enrollments of the schools where their kids are registered are just below multiples of 40. This would lead to a decrease of the enrollment density on the left side of the multiples of 40. However, as argued by Angrist and Lavy, unlike in the United States, private elementary schooling is rare in Israel.

To assess the validity of the assumption of no manipulation of class size, we test continuity of the density function of enrollment counts. We consider fifth graders. In the end, our data contained 2029 schools (Angrist and Lavy, 1999). The histogram is displayed in Figure 2. It shows a sharp increase of density at the enrollment of 40, but such an increase is not clearly observed at other multiples of 40.
This observation is reinforced by the graphical analysis displayed in Figures 3 and 4, which show the estimated enrollment density function using the data on either side of 40 and 120, respectively.

We perform the binning and local likelihood estimation ($\hat\theta^G$ and $\hat\theta$) and the associated tests ($W^G$, $\ell^G$, and $\ell$) of the discontinuities in enrollment densities suspected at the multiples of 40, over a range of smoothing bandwidths. The results are summarized in Table 4.

The local likelihood method finds upward jumps of the enrollment density at $c = 40$, 80, and 120 and a downward jump at $c = 160$. The associated empirical likelihood tests show that the discontinuity at an enrollment of 40 is highly significant, with test statistics all valued above 20 for the different bandwidths. The evidence of discontinuity at 80 is relatively weak and its significance depends on the bandwidth used, while no evidence of discontinuity is found at 120 or 160. The progressively weaker evidence of discontinuity coincides with the extent of the decrease in class size at the different multiples of 40. For example, according to Maimonides' rule, the class size drops faster at the enrollment of 40 than
A Mathematical Appendix

Hereafter "w.p.a.1" means "with probability approaching one". Define $f = f(c)$,

$$\beta = (a_l, b_l, b_r)^{\prime}, \quad \beta_0 = \left(\log f,\ \lim_{x \uparrow c}\frac{d \log f(x)}{dx},\ \lim_{x \downarrow c}\frac{d \log f(x)}{dx}\right)^{\prime},$$

$$\hat\beta = \arg\min_{\beta \in B} \ell\left(a_l, \log(\theta_0 + e^{a_l}), b_l, b_r\right), \quad B = A_l \times B_l \times B_r,$$

$$g_i(\beta) = g_i\left(a_l, \log(\theta_0 + e^{a_l}), b_l, b_r\right), \quad A_i = \left((1 - I_i),\ (1 - I_i)X_{i,h},\ \ldots\right)^{\prime},$$

$$K_{l,j_1 j_2} = \int_{-k}^{0} u^{j_1} K^{j_2}(u)\,du, \qquad K_{r,j_1 j_2} = \int_{0}^{k} u^{j_1} K^{j_2}(u)\,du.$$
Also, by the change of variables,

$$\frac{1}{h}\mathrm{E}[g_i(\beta)] = \begin{pmatrix} \int_{u < 0} (1, u)^{\prime} K(u)\{f(c + uh) - \exp(a_l + b_l uh)\}\,du \\ \int_{u \ge 0} (1, u)^{\prime} K(u)\{f(c + uh) - \exp(a_r + b_r uh)\}\,du \end{pmatrix},$$

with $a_r = \log(\theta_0 + e^{a_l})$.
w.p.a.1. On the other hand, an expansion of (15) around $(\hat\beta, \hat\lambda(\hat\beta)) = (\beta_0, 0)$ yields

$$0 = \frac{1}{nh}\sum_{i=1}^n g_i(\beta_0) + \left(\frac{1}{nh}\sum_{i=1}^n \frac{1}{1 + \tilde\lambda^{\prime} g_i(\tilde\beta)}\frac{\partial g_i(\hat\beta)}{\partial \beta^{\prime}} H^{-1}\right) H(\hat\beta - \beta_0) - \hat{V}_3 \hat\lambda(\hat\beta), \qquad (18)$$

where $H = \mathrm{diag}(1, h, h)$, $(\tilde\beta, \tilde\lambda)$ is a point on the line joining $(\hat\beta, \hat\lambda(\hat\beta))$ and $(\beta_0, 0)$, and $\hat{V}_3 = \frac{1}{nh}\sum_{i=1}^n \frac{g_i(\tilde\beta) g_i(\tilde\beta)^{\prime}}{(1 + \tilde\lambda^{\prime} g_i(\tilde\beta))^2}$ is implicitly defined. Combining (17) multiplied by $H^{-1}$ from the left and (18),

$$0 = \begin{pmatrix} 0 \\ \frac{1}{nh}\sum_{i=1}^n g_i(\beta_0) \end{pmatrix} + \hat{M}\begin{pmatrix} H(\hat\beta - \beta_0) \\ \hat\lambda(\hat\beta) \end{pmatrix}, \quad \text{where } \hat{M} = \begin{pmatrix} 0 & \hat{G}_1^{\prime} \\ \hat{G}_2 & -\hat{V}_3 \end{pmatrix}, \qquad (19)$$

where

$$\hat{G}_1 = \frac{1}{nh}\sum_{i=1}^n \frac{1}{1 + \hat\lambda(\hat\beta)^{\prime} g_i(\hat\beta)}\frac{\partial g_i(\hat\beta)}{\partial \beta^{\prime}} H^{-1}, \qquad \hat{G}_2 = \frac{1}{nh}\sum_{i=1}^n \frac{1}{1 + \tilde\lambda^{\prime} g_i(\tilde\beta)}\frac{\partial g_i(\hat\beta)}{\partial \beta^{\prime}} H^{-1},$$

which satisfy $\hat{G}_1, \hat{G}_2 \xrightarrow{p} G$ by a similar argument to Lemma 1 and the consistency of $\hat\beta$. Also note that $\hat{V}_3 \xrightarrow{p} V$. Since $G$ is of full column rank and $V$ is positive definite (Assumption 2), $\hat{M}$ is invertible w.p.a.1. By solving (19) for $\sqrt{nh}\,H(\hat\beta - \beta_0)$, we have

$$\sqrt{nh}\,H(\hat\beta - \beta_0) = -\left(G^{\prime} V^{-1} G\right)^{-1} G^{\prime} V^{-1} \frac{1}{\sqrt{nh}}\sum_{i=1}^n g_i(\beta_0) + o_p(1).$$

From this and an expansion of $\frac{1}{\sqrt{nh}}\sum_{i=1}^n g_i(\hat\beta)$ around $\hat\beta = \beta_0$,

$$\frac{1}{\sqrt{nh}}\sum_{i=1}^n g_i(\hat\beta) = \left[I - G\left(G^{\prime} V^{-1} G\right)^{-1} G^{\prime} V^{-1}\right] \frac{1}{\sqrt{nh}}\sum_{i=1}^n g_i(\beta_0) + o_p(1). \qquad (20)$$

Combining (16), (20), $\frac{1}{\sqrt{nh}}\sum_{i=1}^n g_i(\beta_0) \xrightarrow{d} N(0, V)$ (Lemma 1), and $2\hat{V}_1^{-1} - \hat{V}_1^{-1}\hat{V}_2\hat{V}_1^{-1} \xrightarrow{p} V^{-1}$ (by Lemmas 1 and 2 with the consistency of $\hat\beta$), we have

$$\ell(\hat\beta) \xrightarrow{d} \xi^{\prime} V^{1/2\prime}\left[I - G(G^{\prime}V^{-1}G)^{-1}G^{\prime}V^{-1}\right]^{\prime} V^{-1} \left[I - G(G^{\prime}V^{-1}G)^{-1}G^{\prime}V^{-1}\right] V^{1/2}\xi = \xi^{\prime}\left[I - A(A^{\prime}A)^{-1}A^{\prime}\right]\xi \sim \chi^2(1),$$

where $\xi \sim N(0, I)$ and $A = V^{-1/2}G$. Therefore, the conclusion is obtained.
A.2 Lemmas

Lemma 1. Under Assumption,

1. $\frac{1}{nh}\sum_{i=1}^n g_i(\beta_0) g_i(\beta_0)^{\prime} \xrightarrow{p} V$ and $\frac{1}{\sqrt{nh}}\sum_{i=1}^n g_i(\beta_0) \xrightarrow{d} N(0, V)$;

2. For each $\epsilon \in (0, 1/2)$ and $\Lambda_n = \{\lambda : |\lambda| \le (nh)^{-\epsilon}\}$, $\sup_{\beta \in B,\, \lambda \in \Lambda_n,\, 1 \le i \le n} |\lambda^{\prime} g_i(\beta)| \xrightarrow{p} 0$, and for each $\beta \in B$, $\Lambda_n \subseteq \Lambda_n(\beta) = \Lambda_n(a_l, \log(\theta_0 + e^{a_l}), b_l, b_r)$ w.p.a.1;

3. For any $\bar\beta$ satisfying $\bar\beta \xrightarrow{p} \beta_0$ and $\frac{1}{nh}\sum_{i=1}^n g_i(\bar\beta) = O_p((nh)^{-1/2})$, there exists $\hat\lambda(\bar\beta) = \arg\max_{\lambda \in \Lambda_n(\bar\beta)} \hat{P}(\bar\beta, \lambda)$ w.p.a.1., $\hat\lambda(\bar\beta) = O_p((nh)^{-1/2})$, and $\sup_{\lambda \in \Lambda_n(\bar\beta)} \hat{P}(\bar\beta, \lambda) = O_p((nh)^{-1})$;

4. $\frac{1}{nh}\sum_{i=1}^n g_i(\hat\beta) = O_p((nh)^{-1/2})$.

Proof of 1. Proof of the first statement. Let $\hat{V} = [\hat{V}_{ab}]_{a,b=1,\ldots,4}$. By the change of variables,
Let $g_{i,1}(\beta_0)$ be the first element of $g_i(\beta_0)$. By the change of variables,

$$\sqrt{nh}\,\mathrm{E}[g_{i,1}(\beta_0)] = \sqrt{nh}\int_{u < 0} K(u)\{f(c + uh) - \exp(a_{l0} + b_{l0} uh)\}\,du.$$

Proof of 3. Let $\bar\lambda = \arg\max_{\lambda \in \Lambda_n} \hat{P}(\bar\beta, \lambda)$ and $\bar{g} = \frac{1}{nh}\sum_{i=1}^n g_i(\bar\beta)$. Then

$$0 = \hat{P}(\bar\beta, 0) \le \hat{P}(\bar\beta, \bar\lambda) = \bar\lambda^{\prime}\bar{g} - \frac{1}{2}\bar\lambda^{\prime}\left\{\frac{1}{nh}\sum_{i=1}^n \frac{g_i(\bar\beta) g_i(\bar\beta)^{\prime}}{(1 + \dot\lambda^{\prime} g_i(\bar\beta))^2}\right\}\bar\lambda \le |\bar\lambda||\bar{g}| - C|\bar\lambda|^2, \qquad (22)$$

w.p.a.1., where the first inequality follows from the definition of $\bar\lambda$, the second equality follows from a second-order expansion with $\dot\lambda$ on the line joining $\bar\lambda$ and $0$, and the second inequality follows from $\frac{1}{nh}\sum_{i=1}^n g_i(\bar\beta) g_i(\bar\beta)^{\prime} \xrightarrow{p} V$, positive definiteness of $V$, and Lemma 2. Since $C|\bar\lambda| \le |\bar{g}|$ and $|\bar{g}| = O_p((nh)^{-1/2})$ by the assumption, we have $|\bar\lambda| = O_p((nh)^{-1/2})$. Since $\epsilon$ is chosen to satisfy $(nh)^{-1/2}(nh)^{\epsilon} \to 0$, we have $\bar\lambda \in \mathrm{int}\,\Lambda_n$, i.e., $\bar\lambda$ is an interior solution. Thus, from concavity of $\hat{P}(\bar\beta, \lambda)$ in $\lambda$, convexity of $\Lambda_n(\bar\beta)$, and $\Lambda_n \subseteq \Lambda_n(\bar\beta)$ (by Lemma 2), $\bar\lambda = \hat\lambda(\bar\beta) = \arg\max_{\lambda \in \Lambda_n(\bar\beta)} \hat{P}(\bar\beta, \lambda)$ w.p.a.1., i.e., the first statement is obtained. Since $|\bar\lambda| = O_p((nh)^{-1/2})$, the second statement is also obtained. The third statement is obtained from (22) with $\bar\lambda = \hat\lambda(\bar\beta)$.

Proof of 4. The basic steps are similar to Newey and Smith (2004, Lemma A3). Pick any $\epsilon \in (0, 1/2)$ satisfying $(nh)^{-1/2}(nh)^{\epsilon} \to 0$. Let $\hat{g} = \frac{1}{nh}\sum_{i=1}^n g_i(\hat\beta)$ and $\tilde\lambda = (nh)^{-\epsilon}\hat{g}/|\hat{g}|$. Observe that for some $C > 0$,

$$\hat{P}(\hat\beta, \tilde\lambda) = \tilde\lambda^{\prime}\hat{g} - \frac{1}{2}\tilde\lambda^{\prime}\left\{\frac{1}{nh}\sum_{i=1}^n \frac{g_i(\hat\beta) g_i(\hat\beta)^{\prime}}{(1 + \dot\lambda^{\prime} g_i(\hat\beta))^2}\right\}\tilde\lambda \ge (nh)^{-\epsilon}|\hat{g}| - C(nh)^{-2\epsilon}, \qquad (23)$$
w.p.a.1., where the equality follows from a second-order expansion with $\dot\lambda$ on the line joining $\tilde\lambda$ and $0$, and the inequality follows from $\frac{1}{nh}\sum_{i=1}^n g_i(\hat\beta) g_i(\hat\beta)^{\prime} \xrightarrow{p} V$, boundedness of $V$, and Lemma 2. Also note that

$$\sup_{\lambda \in \Lambda_n(\hat\beta)} \hat{P}(\hat\beta, \lambda) \le \sup_{\lambda \in \Lambda_n(\beta_0)} \hat{P}(\beta_0, \lambda) = O_p((nh)^{-1}), \qquad (24)$$

w.p.a.1., where the inequality follows from the definition of $\hat\beta$, and the equality follows from Lemma 3 with $\bar\beta = \beta_0$ and $\frac{1}{nh}\sum_{i=1}^n g_i(\beta_0) = O_p((nh)^{-1/2})$ (by Lemma 1). Since $\tilde\lambda \in \Lambda_n$, Lemma 2 guarantees $\tilde\lambda \in \Lambda_n(\hat\beta)$ w.p.a.1., which implies $\hat{P}(\hat\beta, \tilde\lambda) \le \sup_{\lambda \in \Lambda_n(\hat\beta)} \hat{P}(\hat\beta, \lambda)$. Thus, combining (23) and (24),

$$(nh)^{-\epsilon}|\hat{g}| - C(nh)^{-2\epsilon} \le O_p((nh)^{-1}), \qquad (25)$$

w.p.a.1. Since we chose $\epsilon$ to satisfy $(nh)^{-1/2}(nh)^{\epsilon} \to 0$, we have $|\hat{g}| = O_p((nh)^{-\epsilon})$. Now, pick any $\delta_n \to 0$ and define $\ddot\lambda = \delta_n \hat{g}$. From $|\hat{g}| = O_p((nh)^{-\epsilon})$, we have $\ddot\lambda = o_p((nh)^{-\epsilon})$ and thus $\ddot\lambda \in \Lambda_n \subseteq \Lambda_n(\hat\beta)$ w.p.a.1. Thus, we apply the same argument as in (23)-(25) after replacing $\tilde\lambda$ with $\ddot\lambda$. Then we obtain

$$\delta_n|\hat{g}|^2 - C\delta_n^2|\hat{g}|^2 \le \hat{P}(\hat\beta, \ddot\lambda) \le \sup_{\lambda \in \Lambda_n(\hat\beta)} \hat{P}(\hat\beta, \lambda) = O_p((nh)^{-1}),$$

which implies $\delta_n|\hat{g}|^2 = O_p((nh)^{-1})$. Since this result holds for any $\delta_n \to 0$, we obtain the conclusion.
References

[1] Angrist, J. D. and V. Lavy (1999) Using Maimonides' rule to estimate the effect of class size on scholastic achievement, Quarterly Journal of Economics, 114, 533-575.
[2] Cheng, M. Y. (1994) On boundary effects of smooth curve estimators (dissertation), Unpublished manuscript series #2319, University of North Carolina.
[3] Cheng, M. Y. (1997) A bandwidth selector for local linear density estimators, Annals of Statistics, 25, 1001-1013.
[4] Cheng, M. Y., Fan, J. and J. S. Marron (1997) On automatic boundary corrections, Annals of Statistics, 25, 1691-1708.
[5] Cline, D. B. and J. D. Hart (1991) Kernel estimation of densities with discontinuities or discontinuous derivatives, Statistics, 22, 69-84.
[6] Copas, J. B. (1995) Local likelihood based on kernel censoring, Journal of the Royal Statistical Society, B, 57, 221-235.
[7] Davidson, R. and J. G. MacKinnon (1998) Graphical methods for investigating the size and power of test statistics, The Manchester School, 66, 1-26.
[8] Fan, J. and I. Gijbels (1996) Local Polynomial Modelling and Its Applications, Chapman & Hall.
[9] Fan, J., Zhang, C. and J. Zhang (2001) Generalized likelihood ratio statistics and Wilks phenomenon, Annals of Statistics, 29, 153-193.
[10] Gregory, A. W. and M. R. Veall (1985) Formulating Wald tests of nonlinear restrictions, Econometrica, 53, 1465-1468.
[11] Hahn, J., Todd, P. and W. van der Klaauw (2001) Identification and estimation of treatment effects with a regression discontinuity design, Econometrica, 69, 201-209.
[12] Hjort, N. L. and M. C. Jones (1996) Locally parametric nonparametric density estimation, Annals of Statistics, 24, 1619-1647.
[13] Imbens, G. W. and T. Lemieux (2008) Regression discontinuity designs: a guide to practice, Journal of Econometrics, 142, 615-635.
[14] Kitamura, Y. (1997) Empirical likelihood methods with weakly dependent data, Annals of Statistics, 25, 2084-2102.
[15] Loader, C. R. (1996) Local likelihood density estimation, Annals of Statistics, 24, 1602-1618.
[16] Marron, J. S. and D. Ruppert (1994) Transformations to reduce boundary bias in kernel density estimation, Journal of the Royal Statistical Society, B, 56, 653-671.
[17] McCrary, J. (2008) Manipulation of the running variable in the regression discontinuity design: a density test, Journal of Econometrics, 142, 698-714.
[18] Newey, W. K. and R. J. Smith (2004) Higher order properties of GMM and generalized empirical likelihood estimators, Econometrica, 72, 219-255.
[19] Owen, A. (2001) Empirical Likelihood, Chapman & Hall, New York.
[20] Porter, J. (2003) Estimation in the regression discontinuity model, Working paper, Department of Economics, University of Wisconsin.
[21] Saez, E. (2009) Do taxpayers bunch at kink points? Forthcoming in American Economic Journal: Economic Policy.
[22] Xu, K.-L. and P. C. B. Phillips (2007) Tilted nonparametric estimation of volatility functions with empirical applications, Cowles Foundation Discussion Paper 1612R, Yale University.
[23] Xu, K.-L. (2010) Re-weighted functional estimation of diffusion models, Econometric Theory, 26, 541-563.
Table 1: Finite-sample biases, standard deviations (Std.'s) and root mean square errors (RMSEs) of the binning and the local likelihood estimators of $\theta$. The data are generated from a N(12, 3) density and the sample size is n = 1000. The discontinuity point is c = 13.

                         Fixed bandwidth                 Data-dependent bandwidth
        Est.           h=1     h=2     h=3     h=4     $\hat h/1.5$  $\hat h$  $1.5\hat h$  $1.5^2\hat h$
Bias
  $\tilde\theta_G$   .0403   .0710   .0886   .0949      .0462      .0644      .0825      .0936
  $\hat\theta_G$     .0020   .0148   .0360   .0624      .0039      .0108      .0266      .0577
  $\tilde\theta$     .0408   .0725   .0907   .0974      .0478      .0662      .0841      .0959
  $\hat\theta$       .0016   .0032   .0094   .0221      .0024      .0036      .0057      .0194
Std.
  $\tilde\theta_G$   .0236   .0157   .0127   .0104      .0227      .0188      .0149      .0107
  $\hat\theta_G$     .0468   .0312   .0254   .0217      .0409      .0345      .0296      .0262
  $\tilde\theta$     .0232   .0155   .0124   .0103      .0228      .0185      .0151      .0105
  $\hat\theta$       .0452   .0326   .0292   .0283      .0425      .0354      .0316      .0291
RMSE
  $\tilde\theta_G$   .0467   .0728   .0895   .0955      .0514      .0671      .0838      .0942
  $\hat\theta_G$     .0468   .0345   .0441   .0661      .0411      .0362      .0398      .0633
  $\tilde\theta$     .0469   .0742   .0915   .0979      .0530      .0688      .0854      .0965
  $\hat\theta$       .0453   .0328   .0307   .0359      .0426      .0356      .0321      .0350

Data-dependent bandwidth $\hat h$: Mean 1.7257, Std. 0.1975.
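The fixed-bandwidth Monte Carlo design behind Table 1 is easy to mimic. The sketch below is our own illustration rather than the authors' code: it assumes N(12, 3) denotes variance 3, draws a sample, and estimates the boundary density on each side of c = 13 by a local linear (triangular kernel) fit to normalized histogram counts, in the spirit of the binning estimator of McCrary (2008); the function name and bin width are our choices.

```python
import numpy as np

def binned_density_at_cutoff(x, c, h, b=0.25, side="left"):
    """Local linear (triangular kernel) fit to normalized histogram counts,
    evaluated at the cutoff c using data from one side only.
    b is the bin width; h is the smoothing bandwidth."""
    n = len(x)
    if side == "left":
        data = x[x < c]
        edges = np.arange(c, data.min() - b, -b)[::-1]   # last edge is c
    else:
        data = x[x >= c]
        edges = np.arange(c, data.max() + b, b)          # first edge is c
    counts, _ = np.histogram(data, bins=edges)
    mids = 0.5 * (edges[:-1] + edges[1:]) - c            # distance from cutoff
    y = counts / (n * b)                                 # normalized counts
    w = np.maximum(0.0, 1.0 - np.abs(mids) / h)          # triangular kernel
    X = np.column_stack([np.ones_like(mids), mids])
    beta = np.linalg.solve((X * w[:, None]).T @ X, (X * w[:, None]).T @ y)
    return beta[0]                                       # intercept = f at c

rng = np.random.default_rng(1)
x = rng.normal(12.0, 3**0.5, 100_000)                    # N(12, 3): variance 3
f_left = binned_density_at_cutoff(x, c=13.0, h=2.0, side="left")
f_right = binned_density_at_cutoff(x, c=13.0, h=2.0, side="right")
```

Under a continuous density the two one-sided estimates agree up to sampling error; the discontinuity parameter $\theta$ in the tables contrasts such left and right boundary estimates.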
Table 2: Finite-sample biases, standard deviations (Std.'s) and root mean square errors (RMSEs) of the binning and the local likelihood estimators of $\theta$. The data are generated from a scaled Student's t density [i.e. $12 + t(5)/\sqrt{5/9}$] and the sample size is n = 1000. The discontinuity point is c = 13.

                         Fixed bandwidth                 Data-dependent bandwidth
        Est.           h=1     h=2     h=3     h=4     $\hat h/1.5$  $\hat h$  $1.5\hat h$  $1.5^2\hat h$
Bias
  $\tilde\theta_G$   .0759   .1208   .1355   .1315      .0887      .1163      .1319      .1276
  $\hat\theta_G$     .0083   .0394   .0876   .1254      .0120      .0369      .0790      .1274
  $\tilde\theta$     .0762   .1223   .1378   .1343      .0899      .1180      .1343      .1305
  $\hat\theta$       .0036   .0183   .0492   .0788      .0060      .0184      .0442      .0832
Std.
  $\tilde\theta_G$   .0235   .0166   .0130   .0100      .0267      .0209      .0138      .0120
  $\hat\theta_G$     .0455   .0333   .0262   .0219      .0444      .0399      .0381      .0305
  $\tilde\theta$     .0232   .0163   .0130   .0096      .0254      .0199      .0136      .0119
  $\hat\theta$       .0456   .0358   .0320   .0296      .0421      .0374      .0377      .0380
RMSE
  $\tilde\theta_G$   .0795   .1220   .1361   .1319      .0926      .1182      .1326      .1282
  $\hat\theta_G$     .0462   .0516   .0915   .1273      .0460      .0543      .0877      .1310
  $\tilde\theta$     .0796   .1234   .1385   .1346      .0934      .1197      .1350      .1311
  $\hat\theta$       .0457   .0402   .0587   .0842      .0425      .0417      .0581      .0914

Data-dependent bandwidth $\hat h$: Mean 1.9132, Std. 0.3560.
Table 3: Finite-sample sizes, quantiles and powers of the $W_G$, $\ell_G$ and $\ell$ tests of density continuity, i.e. $H_0: \theta_0 = 0$ (nominal sizes: 5% and 10%). The powers are calculated when the data are generated from a mixture of left- and right-truncated normal distributions at c with mixing probability $\pi = F(c) - d$, where $F$ is the null distribution function and $d \in \{0.02, 0.04, 0.06, 0.08, 0.10\}$.

                              Finite-sample  Asymptotic        Power (vs. value of d)
   n     Test          Size     quantile      quantile    0.02   0.04   0.06   0.08   0.10
  1000   $W_G$, 5%     .073       4.51          3.84      .058   .104   .202   .368   .545
         $\ell_G$, 5%  .067       4.19                    .059   .139   .244   .367   .491
         $\ell$, 5%    .067       4.29                    .082   .190   .366   .578   .754
         $W_G$, 10%    .138       3.27          2.71      .107   .176   .303   .489   .651
         $\ell_G$, 10% .137       3.19                    .131   .211   .353   .505   .648
         $\ell$, 10%   .110       2.89                    .152   .273   .474   .681   .841
  2000   $W_G$, 5%     .074       4.53          3.84      .071   .191   .394   .642   .856
         $\ell_G$, 5%  .047       3.79                    .072   .191   .418   .636   .800
         $\ell$, 5%    .050       3.82                    .112   .306   .604   .845   .973
         $W_G$, 10%    .134       3.18          2.71      .126   .261   .530   .754   .924
         $\ell_G$, 10% .125       2.98                    .144   .280   .541   .753   .909
         $\ell$, 10%   .104       2.75                    .184   .418   .715   .898   .983
  5000   $W_G$, 5%     .065       4.31          3.84      .108   .427   .796   .960   .995
         $\ell_G$, 5%  .038       3.47                    .112   .423   .769   .932   .987
         $\ell$, 5%    .062       4.54                    .193   .593   .900   .990   1.00
         $W_G$, 10%    .127       3.07          2.71      .185   .548   .864   .983   .998
         $\ell_G$, 10% .099       2.64                    .185   .545   .861   .977   .995
         $\ell$, 10%   .112       2.85                    .282   .690   .952   .996   1.00
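The alternatives behind the power entries in Table 3 mix a left-truncated and a right-truncated version of the null density, with a mixing probability that differs from $P(X < c)$ by $d$, which creates a density jump at c. A hedged sketch of such a data-generating process (parameter names and the rejection sampler are our choices, not the authors' code):

```python
import math
import numpy as np

def sample_with_jump(n, c=13.0, mu=12.0, sd=3**0.5, d=0.06, rng=None):
    """Draw from a mixture: with probability pi = F(c) - d use the normal
    truncated to (-inf, c); otherwise use it truncated to [c, inf).
    d != 0 induces a discontinuity of the density at c."""
    rng = np.random.default_rng() if rng is None else rng
    F_c = 0.5 * (1.0 + math.erf((c - mu) / (sd * math.sqrt(2.0))))
    go_left = rng.random(n) < (F_c - d)
    out = np.empty(n)
    for i in range(n):                       # simple rejection sampling
        while True:
            z = rng.normal(mu, sd)
            if (z < c) == go_left[i]:
                out[i] = z
                break
    return out
```

The jump size at c maps into the parameter $\theta_0$ tested by $H_0: \theta_0 = 0$; setting d = 0 recovers the continuous null density.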
[Figure 1 here: panel (a) plots actual size against nominal size, with the 45-degree line for reference; panel (b) plots size distortion against nominal size, for the $W_G$, $\ell_G$ and $\ell$ tests.]

Figure 1: (a) P-value plots and (b) P-value discrepancy plots (Davidson and MacKinnon, 1998) for the $W_G$, $\ell_G$ and $\ell$ tests when n = 2000.

Table 4: Estimation and testing of the discontinuity of the density of enrollments at multiples of 40 (according to Maimonides' rule; Angrist and Lavy, 1999) for fifth graders. The binning and local likelihood methods are used with various smoothing bandwidths.

                       Binning method                            Local likelihood method
   c    h    $\hat f_l$  $\hat f_r$  $\hat\theta_G$  Wald $W_G$  EL $\ell_G$    $\hat f_l$  $\hat f_r$  $\hat\theta$  EL $\ell$
   40   15     .0056      .0080       .0024            2.666       0.444          .0039      .0114       .0075        20.67
        20     .0052      .0089       .0037            7.888       1.331          .0040      .0114       .0074        26.94
        25     .0050      .0094       .0044            13.57       2.321          .0040      .0114       .0074        33.28
        30     .0054      .0099       .0045            16.23       2.856          .0045      .0116       .0072        36.36
   80   15     .0115      .0095       .0019            1.141       1.014          .0081      .0140       .0059        8.301
        20     .0112      .0089       .0024            2.389       1.964          .0085      .0116       .0030        3.366
        25     .0108      .0087       .0021            2.395       1.534          .0087      .0107       .0021        2.092
        30     .0105      .0089       .0016            1.698       1.001          .0088      .0107       .0020        2.350
  120   15     .0092      .0050       .0043            7.920       3.906          .0064      .0078       .0014        0.577
        20     .0088      .0048       .0039            9.355       3.977          .0066      .0070       .0003        0.045
        25     .0075      .0047       .0029            7.005       2.403          .0060      .0063       .0003        0.069
        30     .0066      .0045       .0021            4.988       1.263          .0055      .0060       .0005        0.178
  160   15     .0027      .0010       .0017            4.670       3.253          .0017      .0013       .0003        0.142
        20     .0025      .0010       .0015            5.176       2.397          .0018      .0012       .0006        0.730
        25     .0023      .0010       .0012            4.550       1.710          .0017      .0013       .0005        0.586
        30     .0021      .0011       .0011            4.316       1.456          .0017      .0013       .0004        0.473
[Figure 2 here: histogram frequencies (0 to 0.016) plotted against enrollments (0 to 200).]

Figure 2: Histogram of the enrollments of 2029 classes in Grade 5 (Data: Angrist and Lavy, 1999).

Table 5: Empirical likelihood confidence sets (EL CSs) of the discontinuity of the density of enrollments at c = 40 for fifth graders. The binning and local likelihood methods are used with various smoothing bandwidths.

              Binning method                        Local likelihood method
   h    $\hat\theta_G$   EL CS            Length    $\hat\theta$   EL CS            Length
   15      .0024      [-.0104, .0152]     .0256       .0075      [.0046, .0106]     .0060
   20      .0037      [-.0036, .0120]     .0156       .0074      [.0049, .0101]     .0052
   25      .0044      [-.0014, .0108]     .0122       .0074      [.0051, .0097]     .0046
   30      .0045      [-.0007, .0106]     .0113       .0072      [.0051, .0096]     .0045
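The confidence sets in Table 5 are obtained by inverting the empirical likelihood ratio test: the set collects all values of $\theta$ at which the statistic stays below the $\chi^2(1)$ critical value. A generic sketch of test inversion over a grid (the quadratic statistic in the illustration is a stand-in for the paper's EL ratio, not its implementation):

```python
import numpy as np

def invert_test(stat, grid, crit=3.84):
    """Confidence set by test inversion: keep the grid points where
    stat(theta) does not exceed the critical value, and report the
    enclosing interval (assuming the kept set is an interval)."""
    vals = np.array([stat(t) for t in grid])
    keep = grid[vals <= crit]
    if keep.size == 0:
        return None
    return float(keep.min()), float(keep.max())

# Illustration with a quadratic (Wald-like) statistic centered at 2.0
# with standard error 0.5; crit = 3.84 is the 95% chi-square(1) quantile.
lo, hi = invert_test(lambda t: ((t - 2.0) / 0.5) ** 2,
                     np.linspace(0.0, 4.0, 4001))
```

With these inputs the interval is roughly $2.0 \pm 0.5 \times 1.96$. Unlike a Wald interval, the EL version need not be symmetric around the point estimate, which is visible in the binning-method sets of Table 5.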
[Figure 3 here: binning and local likelihood density estimates plotted against enrollments (0 to 200).]

Figure 3: The estimated density function of school enrollments for the fifth graders using the data on left and right sides of c = 40. Binned data are also displayed.
[Figure 4 here: binning and local likelihood density estimates plotted against enrollments (0 to 200).]

Figure 4: The estimated density function of school enrollments for the fifth graders using the data on left and right sides of c = 120. Binned data are also displayed.