
On Spatial Processes and Asymptotic Inference under Near-Epoch Dependence

Nazgul Jenish¹ and Ingmar R. Prucha²

October 21, 2011

¹ Department of Economics, New York University, 19 West 4th Street, New York, NY 10012. Tel.: 212-998-3891, Email: nazgul.jenish@nyu.edu

² Department of Economics, University of Maryland, College Park, MD 20742. Tel.: 301-405-3499, Email: prucha@econ.umd.edu


Abstract

The development of a general inferential theory for nonlinear models with cross-sectionally or spatially dependent data has been hampered by a lack of appropriate limit theorems. To facilitate a general asymptotic inference theory relevant to economic applications, this paper first extends the notion of near-epoch dependent (NED) processes used in the time series literature to random fields. The class of processes that is NED on, say, an α-mixing process, is shown to be closed under infinite transformations, and thus accommodates models with spatial dynamics. This would generally not be the case for the smaller class of α-mixing processes. The paper then derives a central limit theorem and law of large numbers for NED random fields. These limit theorems allow for fairly general forms of heterogeneity including asymptotically unbounded moments, and accommodate arrays of random fields on unevenly spaced lattices. The limit theorems are employed to establish consistency and asymptotic normality of GMM estimators. These results provide a basis for inference in a wide range of models with cross-sectional or spatial dependence.

JEL Classification: C10, C21, C31

Key words: Random fields, near-epoch dependent processes, central limit theorem, law of large numbers, GMM estimator


1 Introduction¹

Models with cross-sectionally or spatially dependent data have recently attracted considerable attention in various fields of economics including labor and public economics, IO, political economy, macroeconomics, international economics, regional science and urban economics. In these models cross-sectional dependence typically stems from agents' strategic interaction, neighborhood effects, shared resources and common shocks. Furthermore, economic agents are typically heterogeneous, e.g., vary in size. Mathematically, agents' observations may thus be viewed as a realization of a dependent heterogeneous process indexed by a point in $\mathbb{R}^d$, $d > 1$, i.e., a random field.²

The aim of this paper is to define a class of random fields that is sufficiently general to accommodate many applications of interest, and to establish corresponding limit theorems that can be used for asymptotic inference. In particular, we apply these limit theorems to prove consistency and asymptotic normality of generalized method of moments (GMM) estimators for a general class of nonlinear spatial models.

There have been different approaches to modeling cross-sectional dependence in the econometrics literature. Two large strands of the literature have focused on (i) linear spatial autoregressive models, also known as Cliff-Ord (1981) type models³, and (ii) common factor models⁴. In both cases, progress was facilitated, loosely speaking, by imposing specific structural conditions on the data generating process, and/or by exploiting some underlying (conditional) independence assumptions. Another common approach to modeling dependence is through mixing conditions. Various mixing concepts developed for time series processes have been extended to random fields. However, the respective limit theorems for random fields have not been sufficiently general to accommodate many of the processes encountered in economics. This hampered the development of a general asymptotic inference theory for nonlinear models with cross-sectional dependence. Towards filling this gap, Jenish and Prucha (2009) have recently introduced a set of limit theorems (CLT, ULLN, LLN) for α-mixing random fields on unevenly spaced lattices that allow for nonstationary processes with trending moments.

Yet some dependent spatial processes do not satisfy mixing conditions. As with time series, spatial processes may be generated as functions of infinitely many elements of an input process, with a spatial moving average process being a leading example. As is well known, the mixing property is not necessarily preserved under transformations involving an infinite number of lags. For instance, Andrews (1984) shows that a simple time series AR(1) process fails to be α-mixing though its input process is independent. Thus, it is important to develop an asymptotic theory for a generalized class of random fields that is “closed with respect to infinite transformations.”

¹ We thank the participants of the Cowles Foundation Conference, Yale, June 2009, as well as the seminar participants at Columbia University for helpful discussions. This research benefitted from a University of Maryland Ann G. Wylie Dissertation Fellowship for the first author and from financial support from the National Institutes of Health through SBIR grant 1 R43 AG027622 for the second author.

² The space and metric are not restricted to physical space and distance.

³ For recent contributions see, e.g., Robinson (2009, 2007), Yu, de Jong and Lee (2008), Kelejian and Prucha (2007a,b, 2004), Lee (2007a,b, 2004), and Chen and Conley (2001).

⁴ For recent contributions see, e.g., Andrews (2005), Bai and Ng (2004), Bai (2003), Phillips and Sul (2003).

To tackle this problem, the paper first extends the concept of near-epoch dependent (NED) processes used in the time series literature to spatial processes. The notion dates back to Ibragimov (1962) and Billingsley (1968). The NED concept, or variants thereof, have been used extensively in the time series literature by McLeish (1975a,b), Bierens (1981), Wooldridge (1986), Gallant (1987), Gallant and White (1988), Andrews (1988), Pötscher and Prucha (1991, 1997), Davidson (1992, 1993, 1994) and de Jong (1997), among others. Doukhan and Louhichi (1999) and Ango Nze and Doukhan (2004) introduced an alternative class of dependent processes called “ -weakly dependent.” We provide various examples suggesting that the class of NED spatial processes, which subsumes mixing processes, is sufficiently broad to cover many applications of interest. For instance, we show that it covers Cliff-Ord type processes, autoregressive and moving average random fields, and, more generally, nonlinear spatial Bernoulli shifts.

The paper then derives a CLT and an LLN for spatial processes that are near-epoch dependent on an α-mixing input process. These limit theorems allow for fairly general forms of heterogeneity including asymptotically unbounded moments, and accommodate arrays of random fields on unevenly spaced lattices. The LLN can be combined with the generic ULLN in Jenish and Prucha (2009) to obtain a ULLN for NED spatial processes. In the time series literature, CLTs for NED processes were derived by Wooldridge (1986), Davidson (1992, 1993), and de Jong (1997). Interestingly, our CLT contains as a special case the CLT of Wooldridge (1986), Theorem 3.13 and Corollary 4.4.

In addition, we give conditions under which the NED property is preserved under transformations. These results play a key role in verifying the NED property in applications. Thus, the NED property is compatible with considerable heterogeneity and dependence, invariant under transformations, and leads to a CLT and LLN under fairly general conditions. All these features make it a convenient tool for modeling spatial dependence.

As an application, we establish consistency and asymptotic normality of spatial GMM estimators. These results provide a fundamental basis for constructing confidence intervals and testing hypotheses for GMM estimators in nonlinear spatial models. Our results also expand on Conley (1999), who established the asymptotic properties of GMM estimators assuming that the data generating process and the moment functions are stationary and α-mixing.⁵

The rest of the paper is organized as follows. Section 2 introduces the concept of NED spatial processes and gives some mild conditions that ensure preservation of the NED property under transformations. Section 3 provides examples of NED spatial processes. Section 4 contains the CLT and LLN for NED spatial processes. Section 5 establishes the asymptotic properties of spatial GMM estimators. All proofs are collected in the appendices.

⁵ This important early contribution employs Bolthausen's (1982) CLT for stationary α-mixing random fields on the regular lattice $\mathbb{Z}^2$. However, the mixing and stationarity assumptions may not hold in many applications. The present paper relaxes these critical assumptions.

2 Definition of NED Spatial Processes

Let $D \subset \mathbb{R}^d$, $d \geq 1$, be a (possibly) unevenly spaced lattice of locations in $\mathbb{R}^d$, and let $Z = \{Z_{i,n}, i \in D_n, n \geq 1\}$ and $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \geq 1\}$ be triangular arrays of random fields defined on a probability space $(\Omega, \mathcal{F}, P)$ with $D_n \subseteq T_n \subseteq D$. The space $\mathbb{R}^d$ is equipped with the metric $\rho(i,j) = \max_{1 \leq l \leq d} |j_l - i_l|$, where $i_l$ is the $l$-th component of $i$. The distance between any subsets $U, V \subseteq D$ is defined as $\rho(U,V) = \inf\{\rho(i,j) : i \in U \text{ and } j \in V\}$. Furthermore, let $|U|$ denote the cardinality of a finite subset $U \subset D$.

The random variables $Z_{i,n}$ and $\varepsilon_{i,n}$ are possibly vector-valued, taking their values in $\mathbb{R}^{p_z}$ and $\mathbb{R}^{p_\varepsilon}$. We assume that $\mathbb{R}^{p_z}$ and $\mathbb{R}^{p_\varepsilon}$ are normed metric spaces equipped with the Euclidean norm, which we denote (in an obvious misuse of notation) as $|\cdot|$. For any random vector $Y$, let $\|Y\|_p = [E|Y|^p]^{1/p}$, $p \geq 1$, denote its $L_p$-norm. Finally, let $\mathcal{F}_{i,n}(s) = \sigma(\varepsilon_{j,n};\ j \in T_n : \rho(i,j) \leq s)$ be the $\sigma$-field generated by the random vectors $\varepsilon_{j,n}$ located in the $s$-neighborhood of location $i$.

Throughout the paper, we maintain the above notational conventions as well as the following assumption concerning $D$.

Assumption 1 The lattice $D \subset \mathbb{R}^d$, $d \geq 1$, is infinite countable. All elements in $D$ are located at distances of at least $\rho_0 > 0$ from each other, i.e., for all $i, j \in D$: $\rho(i,j) \geq \rho_0$; w.l.o.g. we assume that $\rho_0 > 1$.

The assumption of a minimum distance has also been used by Conley (1999) and Jenish and Prucha (2009). It ensures the growth of the sample size as the sample regions $D_n$ and $T_n$ expand. The setup is thus geared towards what is referred to in the spatial literature as increasing domain asymptotics.
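As a small illustration of the distance conventions above, the following sketch computes the max-norm distance between two locations, the distance between two finite sets of locations, and the smallest spacing within a finite sample. The sample locations and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import combinations

def rho(i, j):
    """Max-norm distance between two locations i and j in R^d."""
    return max(abs(a - b) for a, b in zip(i, j))

def set_distance(U, V):
    """Distance between two finite sets of locations: smallest pairwise rho."""
    return min(rho(i, j) for i in U for j in V)

def min_spacing(D):
    """Smallest distance between distinct locations of a finite sample."""
    return min(rho(i, j) for i, j in combinations(D, 2))

# Hypothetical unevenly spaced sample in R^2 (illustration only).
D_n = [(0.0, 0.0), (1.5, 0.2), (3.1, 2.0), (0.4, 2.5), (5.0, 5.0)]
U, V = D_n[:2], D_n[3:]

print(rho(D_n[0], D_n[1]))    # distance between the first two locations
print(set_distance(U, V))     # distance between the sets U and V
print(min_spacing(D_n) > 1)   # minimum-distance requirement with rho_0 > 1
```

For a finite sample, checking `min_spacing(D_n) > 1` is one way to confirm that the locations are compatible with the minimum-distance requirement of Assumption 1 after the normalization $\rho_0 > 1$.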

We now introduce the notion of near-epoch dependent (NED) random fields.

Definition 1 Let $Z = \{Z_{i,n}, i \in D_n, n \geq 1\}$ be a random field with $\|Z_{i,n}\|_p < \infty$, $p \geq 1$, let $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \geq 1\}$ be a random field, where $|T_n| \to \infty$ as $n \to \infty$, and let $d = \{d_{i,n}, i \in D_n, n \geq 1\}$ be an array of finite positive constants. Then the random field $Z$ is said to be $L_p(d)$-near-epoch dependent on the random field $\varepsilon$ if
$$\|Z_{i,n} - E(Z_{i,n}|\mathcal{F}_{i,n}(s))\|_p \leq d_{i,n}\,\psi(s) \tag{1}$$
for some sequence $\psi(s) \geq 0$ with $\lim_{s \to \infty} \psi(s) = 0$. The $\psi(s)$, which are w.l.o.g. assumed to be non-increasing, are called the NED coefficients, and the $d_{i,n}$ are called NED scaling factors. $Z$ is said to be $L_p$-NED on $\varepsilon$ of size $-\lambda$ if $\psi(s) = O(s^{-\mu})$ for some $\mu > \lambda > 0$. Furthermore, if $\sup_n \sup_{i \in D_n} d_{i,n} < \infty$, then $Z$ is said to be uniformly $L_p$-NED on $\varepsilon$.

Recall that $D_n \subseteq T_n$. Typically, $T_n$ will be an infinite subset of $D$, and often $T_n = D$. However, to cover Cliff-Ord type processes, see Section 3.3, $T_n$ is allowed to depend on $n$ and to be finite provided that it increases in size with $n$.

The role of the scaling factors $\{d_{i,n}\}$ is to allow for the possibility of “unbounded moments”, i.e., $\sup_n \sup_{i \in D_n} d_{i,n} = \infty$. Unbounded moments may reflect trends in the moments in certain directions, in which case we may also use, as in the time series literature, the terminology of “trending moments”. The NED property is thus compatible with a considerable amount of heterogeneity. In establishing limit theorems for NED processes, we will have to impose restrictions on the scaling factors $d_{i,n}$. In this respect, observe that
$$\|Z_{i,n} - E(Z_{i,n}|\mathcal{F}_{i,n}(s))\|_p \leq \|Z_{i,n}\|_p + \|E(Z_{i,n}|\mathcal{F}_{i,n}(s))\|_p \leq 2\|Z_{i,n}\|_p$$
by the Minkowski and the conditional Jensen inequalities. Given this, we may choose $d_{i,n} \leq 2\|Z_{i,n}\|_p$, and consequently w.l.o.g. $0 \leq \psi(s) \leq 1$; see, e.g., Davidson (1994), p. 262, for a corresponding discussion within the context of time series processes. Note that by the Lyapunov inequality, if $Z_{i,n}$ is $L_p$-NED, then it is also $L_q$-NED with the same coefficients $\{d_{i,n}\}$ and $\{\psi(s)\}$ for any $q \leq p$.

Our definition of NED for spatial processes is adapted from the definition of NED for time series processes. In the time series literature, the NED concept first appeared in the works of Ibragimov (1962) and Billingsley (1968), although they did not use the present term. The concept of time series NED processes was later formalized by McLeish (1975a,b), Wooldridge (1986), Gallant (1987), and Gallant and White (1988). These authors considered only $L_2$-NED processes. Andrews (1988) generalized it to $L_p$-NED processes for $p \geq 1$. Davidson (1992, 1993, 1994) and de Jong (1997) further extended it to allow for trending time series processes.

An important motivation for considering NED processes is that mixing is not necessarily preserved under transformations involving infinitely many arguments; see, e.g., Andrews (1984) for simple examples in a time series context. However, as illustrated below, for a wide range of models the output process is obtained as a function of infinitely many input variables. In those situations mixing of the input process does not necessarily carry over to the output process, and thus limit theorems for averages of the output process cannot simply be established from limit theorems for mixing processes. Nevertheless, as with time series processes, we show below that limit theorems can be extended to spatial processes that are NED on a mixing input process, provided the approximation error declines “sufficiently fast” as the conditioning set of input variables expands.

A further attractive feature of NED processes is that the NED property is preserved under transformations. Econometric estimators are usually defined either explicitly as functions of some underlying data generating processes or implicitly as optimizers of a function of the data generating process. Thus, if the data generating process is NED on some input process, the question arises under what conditions functions of random fields are also NED on the same input process.

Various conditions that ensure preservation of the NED property under transformations have been established in the time series literature by Gallant and White (1988) and Davidson (1994). In fact, these results extend to random fields. In particular, the NED property is preserved under summation and multiplication, and carries over from a random vector to its components and vice versa. For future reference, we now state some results for generalized classes of nonlinear functions. Their proofs are analogous to those in the time series literature, and therefore omitted.

Consider transformations of $Z_{i,n}$ given by a family of functions
$$g_{i,n} : \mathbb{R}^{p_z} \to \mathbb{R}.$$
The functions $g_{i,n}$ are assumed Borel-measurable for all $n$ and $i \in D$. They are furthermore assumed to satisfy the following Lipschitz condition: For all $(z, z^{\bullet}) \in \mathbb{R}^{p_z} \times \mathbb{R}^{p_z}$ and all $i \in D_n$ and $n \geq 1$:
$$|g_{i,n}(z) - g_{i,n}(z^{\bullet})| \leq B_{i,n}(z, z^{\bullet})\,|z - z^{\bullet}|, \tag{2}$$
where
$$B_{i,n}(z, z^{\bullet}) : \mathbb{R}^{p_z} \times \mathbb{R}^{p_z} \to \mathbb{R}_+$$
is Borel-measurable. Of course, this condition would be devoid of meaning without further restrictions on $B_{i,n}(z, z^{\bullet})$, which are given in the next propositions.

Proposition 1 Suppose $g_{i,n}(\cdot)$ satisfies Lipschitz condition (2) with
$$|B_{i,n}(z, z^{\bullet})| \leq C < \infty$$
for all $(z, z^{\bullet}) \in \mathbb{R}^{p_z} \times \mathbb{R}^{p_z}$ and all $i$ and $n$. If for $p \geq 1$ the $\{Z_{i,n}\}$ are $L_p$-NED of size $-\lambda$ on $\{\varepsilon_{i,n}\}$ with scaling factors $\{d_{i,n}\}$, then $g_{i,n}(Z_{i,n})$ is also $L_p$-NED of size $-\lambda$ on $\{\varepsilon_{i,n}\}$ with scaling factors $\{2Cd_{i,n}\}$.
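The factor $2C$ in the scaling factors of Proposition 1 can be traced to a short chain of inequalities. The paper omits the proof, so the following is only a sketch of one standard argument, not necessarily the authors' own. It writes $\tilde{Z}^s_{i,n} = E[Z_{i,n}|\mathcal{F}_{i,n}(s)]$ and $\psi(s)$ for the NED coefficients of $Z$, and it uses that $g_{i,n}(\tilde{Z}^s_{i,n})$ is $\mathcal{F}_{i,n}(s)$-measurable, together with the Minkowski and conditional Jensen inequalities.

```latex
\begin{align*}
\bigl\| g_{i,n}(Z_{i,n}) - E[g_{i,n}(Z_{i,n}) \mid \mathcal{F}_{i,n}(s)] \bigr\|_p
 &\le \bigl\| g_{i,n}(Z_{i,n}) - g_{i,n}(\tilde{Z}^s_{i,n}) \bigr\|_p
    + \bigl\| E[\, g_{i,n}(\tilde{Z}^s_{i,n}) - g_{i,n}(Z_{i,n}) \mid \mathcal{F}_{i,n}(s)] \bigr\|_p \\
 &\le 2 \bigl\| g_{i,n}(Z_{i,n}) - g_{i,n}(\tilde{Z}^s_{i,n}) \bigr\|_p \\
 &\le 2C \bigl\| Z_{i,n} - \tilde{Z}^s_{i,n} \bigr\|_p \\
 &\le 2C\, d_{i,n}\, \psi(s).
\end{align*}
```

The NED coefficients are unchanged in this chain, which is why the size is preserved and only the scaling factors pick up the constant $2C$.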

Proposition 2 Suppose $g_{i,n}(\cdot)$ satisfies Lipschitz condition (2) with
$$\sup_s \bigl\|B^{(s)}_{i,n}\bigr\|_2 < \infty \quad \text{and} \quad \sup_s \bigl\|B^{(s)}_{i,n}\,\bigl|Z_{i,n} - \tilde{Z}^s_{i,n}\bigr|\bigr\|_r < \infty \tag{3}$$
for some $r > 2$, where $B^{(s)}_{i,n} = B_{i,n}(Z_{i,n}, \tilde{Z}^s_{i,n})$ and $\tilde{Z}^s_{i,n} = E[Z_{i,n}|\mathcal{F}_{i,n}(s)]$. If $\|g_{i,n}(Z_{i,n})\|_2 < \infty$ and $Z_{i,n}$ is $L_2$-NED of size $-\lambda$ on $\{\varepsilon_{i,n}\}$ with scaling factors $\{d_{i,n}\}$, then $g_{i,n}(Z_{i,n})$ is $L_2$-NED of size $-\lambda(r-2)/(2r-2)$ on $\{\varepsilon_{i,n}\}$ with scaling factors
$$d'_{i,n} = d_{i,n}^{(r-2)/(2r-2)} \sup_s \Bigl\{ \bigl\|B^{(s)}_{i,n}\bigr\|_2^{(r-2)/(2r-2)} \bigl\|B^{(s)}_{i,n}\,\bigl|Z_{i,n} - \tilde{Z}^s_{i,n}\bigr|\bigr\|_r^{r/(2r-2)} \Bigr\}.$$

Thus, the NED property is hereditary under reasonably weak conditions. These conditions facilitate verification of the NED property in practical applications. In particular, we will use them in the proof of asymptotic normality of spatial GMM estimators in Section 5.

3 Examples of NED Spatial Processes

In this section, we give examples of some important classes of spatial processes, which are widely used in applications, and show that they satisfy the NED property. Of course, to establish limit theorems the NED property would be accompanied by mixing assumptions on the input process.

3.1 Moving Average Random Fields

Consider the infinite moving average (or linear) random field on $\mathbb{Z}^d$, say, $Y = \{Y_i, i \in \mathbb{Z}^d\}$, defined as
$$Y_i = \sum_{j \in \mathbb{Z}^d} g_{ij}\,\varepsilon_j, \tag{4}$$
where $\varepsilon = \{\varepsilon_i, i \in \mathbb{Z}^d\}$ is a zero mean vector-valued random field on $\mathbb{Z}^d$ and the $g_{ij}$ are some real numbers. We assume furthermore that the coefficients $g_{ij}$ and the innovations $\varepsilon_i$ satisfy, respectively,
$$\lim_{s \to \infty} \sup_{i \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d:\ \rho(i,j) > s} |g_{ij}| = 0 \tag{5}$$
and
$$\sup_{i \in \mathbb{Z}^d} \|\varepsilon_i\|_p < \infty \quad \text{where } p \geq 1. \tag{6}$$

Linear stationary random fields on $\mathbb{Z}^2$ have been explored by Whittle (1954) in a paper that significantly influenced the subsequent modeling of spatial dependencies. More recently, linear random fields on $\mathbb{Z}^d$, $d \geq 1$, have been studied by Doukhan and Guyon (1991); see also Doukhan (1994). Analogous to the time series literature, $Y_i$ is defined as the limit in $L_p$ as $s \to \infty$ of the partial sums $Y^s_i = \sum_{j \in \mathbb{Z}^d:\ \rho(i,j) \leq s} g_{ij}\,\varepsilon_j$. The following existence result is proven by Doukhan (1994), pp. 75-81.⁶

⁶ In fact, Doukhan (1994) proves a more general result that also covers the case $p < 1$.


Proposition 3 (Doukhan, 1994) Under conditions (5)-(6) the distribution of the linear random field $Y$ in (4) is well defined. In particular, the finite dimensional distributions of $Y$ are limits in $L_p$ as $s \to \infty$ of those of the random fields $Y^s = \{Y^s_i, i \in \mathbb{Z}^d\}$.

We next give a result that shows that under the same conditions, the linear field $Y$ is $L_p$-NED on the field $\varepsilon$.

Proposition 4 Under conditions (5)-(6) the linear random field $Y$ in (4) is uniformly $L_p$-NED on $\varepsilon$.

The nice feature of this result is that it does not impose any additional conditions over and above those employed in Proposition 3 to establish the existence of the linear field.
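To make the NED property of a linear field concrete, the sketch below works through a deliberately simple special case that is not taken from the paper: $d = 1$, weights $g_{ij} = \theta^{|i-j|}$ with a hypothetical decay rate $\theta$, and i.i.d. inputs with mean zero and unit variance. Under these assumptions the conditional expectation given the inputs within distance $s$ is just the truncated sum, so the $L_2$ approximation error equals the square root of the sum of squared weights outside the $s$-neighborhood. The code compares this exact error with the absolute tail sum of the weights, which is the quantity controlled by condition (5).

```python
import math

def tail_l1(theta, s, cutoff=500):
    """Sum of |g_m| = theta**|m| over offsets |m| > s, truncated at |m| <= cutoff."""
    return 2.0 * sum(theta ** m for m in range(s + 1, cutoff + 1))

def ned_error_l2(theta, s, cutoff=500):
    """L2 norm of Y_i - E[Y_i | F_i(s)] for i.i.d. inputs with mean 0 and variance 1."""
    return math.sqrt(2.0 * sum(theta ** (2 * m) for m in range(s + 1, cutoff + 1)))

theta = 0.6  # hypothetical decay rate, chosen only for illustration
for s in (0, 1, 2, 5, 10):
    # columns: neighborhood radius, exact L2 error, absolute tail sum of weights
    print(s, round(ned_error_l2(theta, s), 6), round(tail_l1(theta, s), 6))
```

In this special case both columns decay geometrically in $s$, and the exact error never exceeds the tail sum, so the tail sums can serve as NED coefficients with bounded scaling factors.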

3.2 Autoregressive Random Fields

The linear random fields of the previous example may arise as solutions of autoregressive spatial models. For example, Whittle (1954) considered the following simple autoregressive model on $\mathbb{Z}^2$:
$$Y_{i_1,i_2} = \alpha Y_{i_1+1,i_2} + \beta Y_{i_1,i_2+1} + \varepsilon_{i_1,i_2}, \tag{7}$$
where $|\alpha| + |\beta| < 1$ and $\varepsilon = \{\varepsilon_{i_1,i_2}, (i_1,i_2) \in \mathbb{Z}^2\}$ are i.i.d. random variables with zero mean and finite variance. Whittle (1954), p. 447, gives the following stationary solution of (7):
$$Y_{i_1,i_2} = \sum_{j_1=0}^{\infty} \sum_{j_2=0}^{\infty} \binom{j_1+j_2}{j_1} \alpha^{j_1} \beta^{j_2}\, \varepsilon_{i_1+j_1,\,i_2+j_2} = \sum_{j_1=i_1}^{\infty} \sum_{j_2=i_2}^{\infty} \binom{j_1-i_1+j_2-i_2}{j_1-i_1} \alpha^{j_1-i_1} \beta^{j_2-i_2}\, \varepsilon_{j_1,j_2}. \tag{8}$$
Thus, (8) is a linear random field with
$$g_{ij} = \binom{j_1-i_1+j_2-i_2}{j_1-i_1} \alpha^{j_1-i_1} \beta^{j_2-i_2}$$
for $j = (j_1,j_2) \geq i = (i_1,i_2)$ and zero otherwise. Furthermore, note that the assumptions of Proposition 4 are satisfied for $p = 2$, since
$$\sum_{j \geq i} |g_{ij}| = \sum_{k=0}^{\infty} \sum_{l=0}^{k} \binom{k}{l} |\alpha|^{l} |\beta|^{k-l} = \sum_{k=0}^{\infty} (|\alpha| + |\beta|)^k < \infty,$$
and hence
$$0 \leq \lim_{s \to \infty} \sup_{i \in \mathbb{Z}^2} \sum_{j \geq i:\ \rho(i,j) > s} |g_{ij}| \leq \lim_{s \to \infty} \sum_{k=s}^{\infty} (|\alpha| + |\beta|)^k = 0,$$
observing that $|\alpha| + |\beta| < 1$.
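The two facts used in this argument, the binomial identity for the absolute coefficient sums and the geometric bound on the tails, are easy to check numerically. The sketch below does so for hypothetical parameter values (written `alpha` and `beta` for the two autoregressive coefficients of the model) with $|\alpha| + |\beta| = 0.7$. The function names and the truncation level `kmax` are illustrative choices, not part of the paper.

```python
from math import comb

def g(m1, m2, alpha, beta):
    """Coefficient on the innovation at offset (m1, m2) >= 0 in the stationary solution."""
    return comb(m1 + m2, m1) * alpha ** m1 * beta ** m2

def tail_sum(s, alpha, beta, kmax=200):
    """Sum of |g| over offsets with max(m1, m2) > s, truncated at m1 + m2 <= kmax."""
    total = 0.0
    for k in range(s + 1, kmax + 1):
        for m1 in range(k + 1):
            m2 = k - m1
            if max(m1, m2) > s:
                total += abs(g(m1, m2, alpha, beta))
    return total

alpha, beta = 0.3, -0.4          # hypothetical values with |alpha| + |beta| < 1
r = abs(alpha) + abs(beta)

# Binomial identity: absolute coefficients with m1 + m2 = k sum to r**k.
level_5 = sum(abs(g(l, 5 - l, alpha, beta)) for l in range(6))
print(level_5, r ** 5)

# Tail sums versus the geometric bound sum_{k >= s} r**k = r**s / (1 - r).
for s in (1, 3, 6, 12):
    print(s, tail_sum(s, alpha, beta), r ** s / (1 - r))
```

The tail sums shrink geometrically in $s$ and stay below the geometric bound, which is the content of condition (5) for this model.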


3.3 Cliff-Ord Type Spatial Processes

Consider the following Cliff-Ord type model:
$$Y_n = \lambda W_n Y_n + X_n \beta + u_n, \qquad u_n = \rho M_n u_n + v_n, \tag{9}$$
where $Y_n = (Y_{1,n}, \ldots, Y_{n,n})'$ is an $n \times 1$ vector of endogenous variables, $X_n = (X_{ij,n})$ is an $n \times k$ matrix of regressors, $W_n = (w_{ij,n})$ and $M_n = (m_{ij,n})$ are $n \times n$ nonstochastic weight matrices, $v_n = (v_{1,n}, \ldots, v_{n,n})'$ is an $n \times 1$ vector of disturbances, $\lambda$ and $\rho$ are scalar parameters typically referred to as spatial autoregressive parameters, and $\beta$ is a $k \times 1$ parameter vector. To simplify presentation, we assume in the following that $k = 1$ and use the notation $X_n = (X_{1,n}, \ldots, X_{n,n})'$. The above model is frequently referred to as a spatial ARAR(1,1) model, and $W_n Y_n$ and $M_n u_n$ are referred to as spatial lags. For recent contributions on estimation strategies for this model see, e.g., Robinson (2009, 2007), Kelejian and Prucha (2007a,b, 2004), and Lee (2007a,b, 2004).

The spatial weights $m_{ij,n}$ and $w_{ij,n}$ are typically thought of as depending on some measure of distance, say $d_{ij}$, between units $i$ and $j$, where the weights decline as the distance increases. Thus although in Cliff-Ord models observations are indexed by natural numbers, i.e., $i = 1, \ldots, n$, a leading case would be where $i$ corresponds to some location, say $s_i \in \mathbb{R}^d$, and the distance between $i$ and $j$ is given by $d_{ij} = \rho(s_i, s_j)$. In the following, we assume that $D_n = \{s_1, \ldots, s_n\} \subset D$, where the lattice $D$ satisfies Assumption 1. However, to simplify notation we will use $\rho(i,j)$ instead of $\rho(s_i, s_j)$. We assume furthermore that for all $n$
$$I_n - \lambda W_n \ \text{and} \ I_n - \rho M_n \ \text{are nonsingular.} \tag{10}$$
The reduced form of the model is then given by $Y_n = A_n X_n + B_n v_n$ with $A_n = (a_{ij,n}) = \beta (I_n - \lambda W_n)^{-1}$ and $B_n = (b_{ij,n}) = (I_n - \lambda W_n)^{-1} (I_n - \rho M_n)^{-1}$. In scalar notation we have for $i = 1, \ldots, n$:
$$Y_{i,n} = \sum_{j=1}^{n} a_{ij,n} X_{j,n} + \sum_{j=1}^{n} b_{ij,n} v_{j,n}.$$

Although for fixed $n$ the output process $Y_{i,n}$ depends on only a finite number of elements of the input process $\varepsilon_{i,n} = (X_{i,n}, v_{i,n})'$, the mixing property of $\varepsilon_{i,n}$ may not carry over to $Y_{i,n}$. The reason is that the number of elements composing the spatial lags grows unboundedly with the sample size so that the mixing property can break down in the limit. This is especially important when analyzing the asymptotic properties of Cliff-Ord type processes.

Towards establishing that $Y = \{Y_{i,n}, s_i \in D_n, n \geq 1\}$ is NED on $\varepsilon = \{\varepsilon_{i,n}, s_i \in D_n, n \geq 1\}$, we maintain the following assumptions:
$$\lim_{s \to \infty} \sup_n \sup_{1 \leq i \leq n} \sum_{1 \leq j \leq n:\ \rho(i,j) > s} \bigl(|a_{ij,n}| + |b_{ij,n}|\bigr) = 0 \tag{11}$$
and
$$\sup_n \sup_{1 \leq i \leq n} \|\varepsilon_{i,n}\|_p < \infty \quad \text{for some } p \geq 1. \tag{12}$$
We have the following result.

Proposition 5 Under conditions (10)-(12), the process $Y = \{Y_{i,n}, s_i \in D_n, n \geq 1\}$, where $Y_{i,n}$ is defined by (9), is uniformly $L_p$-NED on the process $\varepsilon = \{\varepsilon_{i,n}, s_i \in D_n, n \geq 1\}$, where $\varepsilon_{i,n} = (X_{i,n}, v_{i,n})'$.

It is readily seen that a sufficient condition for (11) is that for some $\gamma > 0$,
$$\sup_n \sup_{1 \leq i \leq n} \sum_{j=1}^{n} \bigl(|a_{ij,n}| + |b_{ij,n}|\bigr)\, \rho(i,j)^{\gamma} < \infty, \tag{13}$$
since
$$\sup_n \sup_{1 \leq i \leq n} \sum_{1 \leq j \leq n:\ \rho(i,j) > s} \bigl(|a_{ij,n}| + |b_{ij,n}|\bigr) \leq \sup_n \sup_{1 \leq i \leq n} \sum_{1 \leq j \leq n:\ \rho(i,j) > s} \bigl(|a_{ij,n}| + |b_{ij,n}|\bigr) \left(\frac{\rho(i,j)}{s}\right)^{\gamma} \leq \frac{1}{s^{\gamma}} \sup_n \sup_{1 \leq i \leq n} \sum_{j=1}^{n} \bigl(|a_{ij,n}| + |b_{ij,n}|\bigr)\, \rho(i,j)^{\gamma} \to 0 \quad \text{as } s \to \infty.$$
A condition analogous to (13) has been used recently by Kelejian and Prucha (2007a), and should be satisfied in a wide range of applications. It is slightly stronger than the typical assumption in the Cliff-Ord literature, which maintains assumptions that imply that the row and column sums of the absolute elements of the matrices $A_n$ and $B_n$ are uniformly bounded; compare, e.g., Kelejian and Prucha (2007b) and Lee (2007a,b, 2004).
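The tail condition on the reduced-form weights can be inspected numerically in a toy design. The sketch below is an illustration under assumptions that are not taken from the paper: $n$ units placed on a line so that the distance between units $i$ and $j$ is $|i - j|$, a row-normalized nearest-neighbor weight matrix for the spatial lag of the dependent variable, no spatial lag in the disturbances, and a unit regression coefficient, so that the two reduced-form matrices coincide with the inverse of $I_n - \lambda W_n$. The inverse is approximated by a truncated Neumann series, which is valid here because the row sums of $|\lambda W_n|$ are below one; this is a convenience for a dependency-free example, not the method of the paper.

```python
def line_weights(n):
    """Row-normalized nearest-neighbor weights for n units on a line."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for j in nbrs:
            W[i][j] = 1.0 / len(nbrs)
    return W

def matmul(A, B):
    """Plain matrix product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_inverse(W, lam, terms=60):
    """Approximate (I - lam*W)^{-1} by the truncated series sum_k lam^k W^k."""
    n = len(W)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    for _ in range(terms):
        power = [[lam * x for x in row] for row in matmul(power, W)]
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
    return total

def tail(A, s):
    """Largest row sum of |a_ij| over units j farther than s from unit i."""
    n = len(A)
    return max(sum(abs(A[i][j]) for j in range(n) if abs(i - j) > s)
               for i in range(n))

n, lam = 20, 0.4   # hypothetical sample size and spatial autoregressive parameter
A = neumann_inverse(line_weights(n), lam)
for s in (0, 1, 2, 4, 8):
    # columns: radius s, tail sum of reduced-form weights, geometric bound
    print(s, tail(A, s), lam ** (s + 1) / (1 - lam))
```

In this design the $k$-th power of the weight matrix only links units within distance $k$, so the weights beyond distance $s$ come from powers above $s$ and the tail sums are bounded by $\lambda^{s+1}/(1-\lambda)$, uniformly in $n$.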

3.4 Spatial Bernoulli Shifts

Consider a real-valued random field $Y = \{Y_i, i \in \mathbb{Z}^d\}$ defined as:
$$Y_i = f_i(\varepsilon_{i+j},\ j \in \mathbb{Z}^d), \tag{14}$$
where $\varepsilon = \{\varepsilon_i, i \in \mathbb{Z}^d\}$ is a real-valued random field and the $f_i : \mathbb{R}^{\mathbb{Z}^d} \to \mathbb{R}$ are measurable functions. The process (14) is a generalization of one-parameter Bernoulli shifts considered by Doukhan and Louhichi (1999). A simple example of a spatial Bernoulli shift is the infinite moving average random field of the first example. More generally, the $f_i$ may be nonlinear functions. We assume that the $f_i$ satisfy the following Lipschitz-type regularity condition:
$$\bigl|f_i(x_j,\ j \in \mathbb{Z}^d) - f_i(y_j,\ j \in \mathbb{Z}^d)\bigr| \leq \sum_{j \in \mathbb{Z}^d} w_{ij}\,|x_j - y_j| \tag{15}$$
for some nonnegative constants $w_{ij}$ such that
$$\lim_{s \to \infty} \sup_{i \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d:\ \rho(i,j) > s} w_{ij} = 0. \tag{16}$$
Finally, we assume that the input process $\varepsilon = \{\varepsilon_i, i \in \mathbb{Z}^d\}$ has uniformly bounded second moments, i.e.,
$$\sup_{i \in \mathbb{Z}^d} \|\varepsilon_i\|_2 < \infty. \tag{17}$$
Then, one can establish the following result.

Proposition 6 Under conditions (15)-(17), the random field $Y = \{Y_i, i \in \mathbb{Z}^d\}$ given by (14) is well-defined and uniformly $L_2$-NED on the random field $\varepsilon = \{\varepsilon_i, i \in \mathbb{Z}^d\}$.

Conditions (15)-(17) are analogous to those used by Doukhan and Louhichi (1999) for time-series Bernoulli shifts. Condition (15) is fulfilled if the functions $f_i$ have bounded partial derivatives, i.e., $|\partial f_i / \partial x_j| \leq w_{ij}$.

Thus, the class of NED spatial processes covers fairly large classes of linear and nonlinear transformations of random fields.

4 Limit Theorems

4.1 Central Limit Theorem

In the following, we introduce our CLT for real-valued random fields $Y = \{Y_{i,n}, i \in D_n, n \geq 1\}$, where $Y$ is assumed to be $L_2$-NED on some vector-valued α-mixing random field $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \geq 1\}$ with the NED coefficients $\{\psi(s)\}$, where $D_n \subseteq T_n \subseteq D$ and the lattice $D$ satisfies Assumption 1. In the sequel, we will use the following notation:
$$S_n = \sum_{i \in D_n} Y_{i,n}, \qquad \sigma_n^2 = \operatorname{var}(S_n).$$
For ease of reference, we state below the definition of the α-mixing coefficients employed in the paper.

Definition 2 Let $\mathcal{A}$ and $\mathcal{B}$ be two $\sigma$-algebras of $\mathcal{F}$, and let
$$\alpha(\mathcal{A}, \mathcal{B}) = \sup\bigl(|P(A \cap B) - P(A)P(B)|;\ A \in \mathcal{A},\ B \in \mathcal{B}\bigr).$$
For $U \subseteq D_n$ and $V \subseteq D_n$, let $\sigma_n(U) = \sigma(\varepsilon_{i,n};\ i \in U)$ and $\alpha_n(U,V) = \alpha(\sigma_n(U), \sigma_n(V))$. Then, the α-mixing coefficients for the random field $\varepsilon$ are defined as:
$$\bar{\alpha}(u,v,r) = \sup_n \sup_{U,V} \bigl(\alpha_n(U,V);\ |U| \leq u,\ |V| \leq v,\ \rho(U,V) \geq r\bigr).$$


Dobrushin (1968) showed that weak dependence conditions based on the above mixing coefficients are satisfied by broad classes of random fields including Markov fields. In contrast to standard mixing numbers for time-series processes, the mixing coefficients for random fields depend not only on the distance between two datasets but also on their sizes. To explicitly account for such dependence, it is furthermore assumed that<br />

$$\bar{\alpha}(u, v, r) \le \varphi(u, v)\, \hat{\alpha}(r) \tag{18}$$<br />

where the function $\varphi(u, v)$ is nondecreasing in each argument, and $\hat{\alpha}(r) \to 0$ as $r \to \infty$. The idea is to account separately for the two different aspects of dependence: (i) decay of dependence with the distance, and (ii) accumulation of dependence as the sample region expands. The two common choices of $\varphi(u, v)$ in the random fields literature are<br />

$$\varphi(u, v) = (u + v)^{\tau}, \quad \tau \ge 0, \tag{19}$$<br />

and<br />

$$\varphi(u, v) = \min\{u, v\}. \tag{20}$$<br />

The above mixing conditions have been used extensively in the random fields literature including Neaderhouser (1978), Takahata (1983), Nahapetian (1987), Bulinskii (1989), Bulinskii and Doukhan (1990), Tran (1990) and Bradley (1993). They are satisfied by fairly large classes of random fields. Bradley (1993) provides examples of random fields satisfying conditions (18)-(19) with $u = v$ and $\tau = 1$. Furthermore, Bulinskii (1989) constructs infinite moving average random fields satisfying the same conditions with $\tau = 1$ for any given decay rate of coefficients $\hat{\alpha}(r)$. Clearly, standard mixing coefficients in the time series literature are covered by conditions (18)-(19) when $\tau = 0$.<br />

Following the literature, we employ the above mixing conditions for the input random field, and impose the following restriction on the decay rates of the mixing coefficients.<br />

Assumption 2 The $\alpha$-mixing coefficients of $\varepsilon$ satisfy (18) and (19) for some $\tau \ge 0$ and $\hat{\alpha}(r)$, such that for some $\delta > 0$<br />

$$\sum_{r=1}^{\infty} r^{d(\tau_* + 1) - 1}\, \hat{\alpha}^{\frac{\delta}{2(2+\delta)}}(r) < \infty, \tag{21}$$<br />

where $\tau_* = \delta\tau/(2 + \delta)$.<br />
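To get a feel for how demanding the summability condition (21) is, the sketch below works out the case of polynomially decaying coefficients $\hat{\alpha}(r) = r^{-a}$. The polynomial form and the function names are assumptions made for illustration. The summand in (21) is then $r^{d(\tau_*+1)-1-a\delta/(2(2+\delta))}$, and by the usual $p$-series criterion the sum is finite exactly when $a > 2(2+\delta)\,d\,(\tau_*+1)/\delta$.

```python
def tau_star(tau, delta):
    """tau_* = delta * tau / (2 + delta), as defined below (21)."""
    return delta * tau / (2.0 + delta)


def min_decay_exponent(d, tau, delta):
    """Threshold a* such that alpha_hat(r) = r**(-a) satisfies (21) iff a > a*.

    The summand is r**(d*(tau_*+1) - 1 - a*delta/(2*(2+delta))), and a p-series
    converges iff its exponent is strictly below -1.
    """
    return 2.0 * (2.0 + delta) * d * (tau_star(tau, delta) + 1.0) / delta


def satisfies_condition_21(a, d, tau, delta):
    """True if the hypothetical polynomial rate r**(-a) meets condition (21)."""
    return a > min_decay_exponent(d, tau, delta)


if __name__ == "__main__":
    # d = 2 (planar lattice), tau = 1, delta = 2 gives the threshold 12.
    print(min_decay_exponent(d=2, tau=1.0, delta=2.0))
```

Under these assumptions a larger $\delta$ (more moments) lowers the required decay rate, which is the usual trade-off between moment and dependence conditions.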

Note that if $\tau = 1$ this assumption also covers the case where $\varphi(u, v)$ is given by (20). Furthermore, the CLT relies on the following conditions regarding moments, the NED coefficients and the NED scaling factors.<br />

Assumption 3 (a) (Uniform $L_{2+\delta}$ integrability)<br />

$$\lim_{k \to \infty} \sup_n \sup_{i \in D_n} E\left[ |Y_{i,n}|^{2+\delta}\, \mathbf{1}(|Y_{i,n}| > k) \right] = 0,$$<br />

where $\mathbf{1}(\cdot)$ is the indicator function and $\delta > 0$ is as in Assumption 2.<br />

(b) $\inf_n |D_n|^{-1} \sigma_n^2 > 0$, where $\{D_n\}$ is a sequence of finite subsets of $D$ such that $|D_n| \to \infty$ as $n \to \infty$.<br />

(c) NED coefficients satisfy $\sum_{r=1}^{\infty} r^{d-1} \psi(r) < \infty$.<br />

We can now state the following CLT for $L_2$-NED random fields.<br />

Theorem 1 Suppose $\{D_n\}$ is a sequence of finite subsets such that $|D_n| \to \infty$ as $n \to \infty$ and $\{T_n\}$ is a sequence of subsets such that $D_n \subseteq T_n \subseteq D$ of the lattice $D$ satisfying Assumption 1. Let $Y = \{Y_{i,n}, i \in D_n, n \ge 1\}$ be a real valued zero-mean random field that is $L_2$-NED on a vector-valued $\alpha$-mixing random field $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \ge 1\}$. Suppose Assumptions 2 and 3 hold, then<br />

$$\sigma_n^{-1} S_n \Longrightarrow N(0, 1).$$<br />

The above CLT contains the CLT for $\alpha$-mixing random fields given in Jenish and Prucha (2009) as a special case. As shown earlier, the class of NED random fields is an important extension of the class of $\alpha$-mixing random fields, in that the class includes important and widely considered fields, e.g., spatial autoregressive processes, that do not generally satisfy mixing conditions. It is for that reason that an estimation theory limited to the class of $\alpha$-mixing random fields is not fully satisfactory; see, e.g., Gallant and White (1988) and Pötscher and Prucha (1997) for an analogous discussion in the time series literature. We note that Theorem 1 also contains as a special case the CLT for time series NED processes of Wooldridge (1986), see Theorem 3.13 and Corollary 4.4.<br />

Assumption 2 restricts the dependence structure of the input process $\varepsilon$. Assumptions 3(a),(b) are standard in the limit theory of mixing processes, e.g., Wooldridge (1986), Davidson (1992), de Jong (1997), and Jenish and Prucha (2009).<br />

Assumption 3(a) is satisfied if the $Y_{i,n}$ are uniformly $L_p$-bounded for some $p > 2 + \delta$, i.e., $\sup_n \sup_{i \in D_n} \|Y_{i,n}\|_p < \infty$. Assumption 3(b) is an asymptotic negligibility condition that ensures that no single summand influences disproportionately the entire sum. Assumption 3(c) controls the size of the NED coefficients which measure the error in the approximation of $Y_{i,n}$ by $\varepsilon$. Intuitively, the approximation errors have to decline sufficiently fast with each successive approximation. Assumption 3(c) is satisfied if $\psi(r) = O(r^{-d-\gamma})$ for some $\gamma > 0$, i.e., $\psi(r)$ is of size $-d$.<br />
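The content of Theorem 1 can be illustrated with a small Monte Carlo sketch. The design below is an assumption made purely for illustration: a finite-range spatial moving average $Y_i = \varepsilon_i + w \sum_{k \sim i} \varepsilon_k$ over the four nearest neighbours on an $n \times n$ block of $\mathbb{Z}^2$, with i.i.d. Uniform$(-1,1)$ innovations. Such a field is trivially NED on the i.i.d. (hence $\alpha$-mixing) input field, and $\sigma_n^2 = \operatorname{var}(S_n)$ can be computed exactly from the coefficient that each innovation receives in $S_n$.

```python
import numpy as np


def simulate_standardized_sums(n=20, reps=2000, weight=0.25, seed=0):
    """Monte Carlo draws of S_n / sigma_n for a hypothetical spatial MA field.

    Innovations live on an (n+2) x (n+2) padded grid so that every site of the
    n x n sample region D_n has all four nearest neighbours available.
    """
    rng = np.random.default_rng(seed)
    eps = rng.uniform(-1.0, 1.0, size=(reps, n + 2, n + 2))

    # Y_i = eps_i + weight * (sum of the four nearest-neighbour innovations)
    neighbours = (eps[:, :-2, 1:-1] + eps[:, 2:, 1:-1]
                  + eps[:, 1:-1, :-2] + eps[:, 1:-1, 2:])
    y = eps[:, 1:-1, 1:-1] + weight * neighbours
    s_n = y.sum(axis=(1, 2))

    # Coefficient of each innovation in S_n; since the innovations are i.i.d.
    # with variance 1/3, var(S_n) = (1/3) * sum of squared coefficients.
    coef = np.zeros((n + 2, n + 2))
    coef[1:-1, 1:-1] += 1.0
    coef[:-2, 1:-1] += weight
    coef[2:, 1:-1] += weight
    coef[1:-1, :-2] += weight
    coef[1:-1, 2:] += weight
    sigma_n = np.sqrt(np.sum(coef ** 2) / 3.0)

    return s_n / sigma_n


if __name__ == "__main__":
    z = simulate_standardized_sums()
    print("mean:", z.mean(), "variance:", z.var())
    print("share within +/-1.96:", np.mean(np.abs(z) < 1.96))
```

In this design the standardized sums should have mean close to zero and variance close to one, with roughly 95 percent of the draws inside $\pm 1.96$; the exact figures depend on the seed and the number of replications.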

Theorem 1 can be easily extended to vector-valued fields using the standard Cramér-Wold device.<br />

Corollary 1 Suppose $\{D_n\}$ is a sequence of finite subsets such that $|D_n| \to \infty$ as $n \to \infty$ and $\{T_n\}$ is a sequence of subsets such that $D_n \subseteq T_n \subseteq D$ of the lattice $D$ satisfying Assumption 1. Let $Y = \{Y_{i,n}, i \in D_n, n \ge 1\}$ with $Y_{i,n} \in \mathbb{R}^k$ be a zero-mean random field that is $L_2$-NED on a vector-valued $\alpha$-mixing random field $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \ge 1\}$. Suppose Assumptions 2 and 3 hold with $|Y_{i,n}|$ denoting the Euclidean norm of $Y_{i,n}$ and $\sigma_n^2$ replaced by $\lambda_{\min}(\Sigma_n)$, where $\Sigma_n = Var(S_n)$ and $\lambda_{\min}(\cdot)$ is the smallest eigenvalue, then<br />

$$\Sigma_n^{-1/2} S_n \Longrightarrow N(0, I_k).$$<br />

Furthermore, $\sup_n |D_n|^{-1} \lambda_{\max}(\Sigma_n) < \infty$, where $\lambda_{\max}(\cdot)$ denotes the largest eigenvalue.<br />

4.2 Law of Large Numbers<br />

To prove consistency of spatial estimators, we also derive a weak LLN for random fields which are $L_1$-NED on some mixing field. This LLN can then be used to establish uniform convergence of random functions by combining it with the generic ULLN given in Jenish and Prucha (2009), which transforms pointwise LLNs (at a given parameter value) into ULLNs. The LLN maintains the following moment and mixing assumptions.<br />

Assumption 4 (a) $\sup_n \sup_{i \in D_n} E|Y_{i,n}|^p < \infty$ for some $p > 1$.<br />

(b) The $\alpha$-mixing coefficients of the input field $\varepsilon$ satisfy (18) for some function $\varphi(u, v)$ which is nondecreasing in each argument, and some $\hat{\alpha}(r)$ such that $\sum_{r=1}^{\infty} r^{d-1} \hat{\alpha}(r) < \infty$.<br />

We now have the following weak LLN.<br />

Theorem 2 Let $\{D_n\}$ be a sequence of arbitrary finite subsets of $D$ such that $|D_n| \to \infty$ as $n \to \infty$, where $D \subset \mathbb{R}^d$, $d \ge 1$, is as in Assumption 1, and let $T_n$ be a sequence of subsets of $D$ such that $D_n \subseteq T_n$. Suppose further that $Y = \{Y_{i,n}, i \in D_n, n \ge 1\}$ is $L_1$-NED on $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \ge 1\}$. If $Y$ and $\varepsilon$ satisfy Assumption 4, then<br />

$$\frac{1}{|D_n|} \sum_{i \in D_n} \left( Y_{i,n} - E Y_{i,n} \right) \xrightarrow{L_1} 0.$$<br />

We note that $Y$ and $\varepsilon$ can be vector-valued random fields. In this case, $|Y_{i,n}|$ should be understood as the Euclidean norm in the respective vector-space.<br />

Assumption 4(a) is a standard moment condition employed in weak LLNs for dependent processes. It requires existence of moments of order slightly greater than 1. Assumption 4(b) is weaker than the mixing conditions used for the CLT. Furthermore, the LLN does not require any restrictions on the NED coefficients.<br />

In the time series literature, weak LLNs for NED processes have been obtained by Andrews (1988) and Davidson (1993), among others. Andrews (1988) derives an $L_1$-law for triangular arrays of $L_1$-mixingales. He then shows that NED processes are $L_1$-mixingales, and hence, satisfy his LLN. Davidson (1993) extends the latter result to processes with trending moments.<br />


5 Large Sample Properties of Spatial GMM Estimators<br />

In this section, we apply the above developed limit theorems to establish the large sample properties of spatial GMM estimators under a reasonably general set of assumptions that should cover a wide range of empirical problems. More specifically, our consistency and asymptotic normality results (i) maintain only that the spatial data process is NED on an $\alpha$-mixing basis process to accommodate spatial lags in the data process as discussed above, (ii) allow for the data process to be located on an unevenly spaced grid, and (iii) allow for the data process to be non-stationary, which will frequently be the case in empirical applications. We also give our results under a set of primitive sufficient conditions for easier interpretation by the applied researcher. 7<br />

7 In an important contribution, Conley (1999) gives a first set of results regarding the asymptotic properties of GMM estimators under the assumption that the data process is stationary and $\alpha$-mixing. Conley also maintains some high level assumptions such as first moment continuity of the moment function, which in turn immediately implies uniform convergence - see, e.g., Pötscher and Prucha (1989) for a discussion. Our results extend Conley (1999) in several important directions, as indicated above. We establish uniform convergence from primitive sufficient conditions via the generic uniform law of large numbers given in Jenish and Prucha (2009) and the law of large numbers given as Theorem 2 above.<br />

We continue with the basic set-up of Section 2. Consider the moment function $q_{i,n} : \mathbb{R}^{p_z} \times \Theta \to \mathbb{R}^{p_q}$, where $\Theta$ denotes the parameter space, and let $\theta_n^0 \in \Theta$ denote the parameter vector of interest (which we allow to depend on $n$ for reasons of generality). Suppose the following moment conditions hold<br />

$$E q_{i,n}(Z_{i,n}, \theta_n^0) = 0, \tag{22}$$<br />

then the corresponding spatial GMM estimator is defined as<br />

$$\hat{\theta}_n = \arg\min_{\theta \in \Theta} Q_n(\omega, \theta), \tag{23}$$<br />

where $Q_n : \Omega \times \Theta \to \mathbb{R}$,<br />

$$Q_n(\omega, \theta) = R_n(\theta)' P_n R_n(\theta), \qquad R_n(\theta) = |D_n|^{-1} \sum_{i \in D_n} q_{i,n}(Z_{i,n}, \theta),$$<br />

and the $P_n$ are positive definite weighting matrices. To prove that $\hat{\theta}_n$ is a consistent estimator for $\theta_n^0$, consider the following non-stochastic analogue of $Q_n$, say<br />

$$\bar{Q}_n(\theta) = [E R_n(\theta)]' P\, [E R_n(\theta)], \tag{24}$$<br />

where $P$ denotes the probability limit of $P_n$. Then given the moment condition (22) clearly $E R_n(\theta_n^0) = 0$, and the functions $\bar{Q}_n$ are minimized at $\theta_n^0$ with $\bar{Q}_n(\theta_n^0) = 0$. In proving consistency, we follow the classical approach; see, e.g., Gallant and White (1988) or Pötscher and Prucha (1997) for more recent expositions. In particular, given identifiable uniqueness of $\theta_n^0$ we establish, loosely speaking, convergence of the minimizers $\hat{\theta}_n$ to the minimizers $\theta_n^0$ by establishing convergence of the objective function $Q_n(\omega, \theta)$ to its non-stochastic analogue $\bar{Q}_n(\theta)$ uniformly over the parameter space.<br />
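The definition of the GMM objective function in (23) can be made concrete with a minimal sketch. Everything specific below is an assumption chosen for illustration and is not the paper's setting: a scalar parameter, the linear moment function $q_i(z, \theta) = w_i (y_i - x_i\theta)$ with two instruments, errors generated as a moving average along a line (so that they are dependent across neighbouring locations), the identity weighting matrix, and minimization by grid search over a compact interval.

```python
import numpy as np


def sample_moments(theta, y, x, w):
    """R_n(theta): average of q_i(theta) = w_i * (y_i - x_i * theta)."""
    resid = y - x * theta
    return (w * resid[:, None]).mean(axis=0)


def gmm_objective(theta, y, x, w, P):
    """Q_n(theta) = R_n(theta)' P R_n(theta)."""
    r = sample_moments(theta, y, x, w)
    return float(r @ P @ r)


def gmm_grid_estimate(y, x, w, P, grid):
    """Minimize Q_n over a finite grid standing in for the compact set Theta."""
    values = [gmm_objective(t, y, x, w, P) for t in grid]
    return float(grid[int(np.argmin(values))])


def make_data(n=2000, theta0=2.0, seed=0):
    """Hypothetical data: y = theta0 * x + u with locally dependent errors u."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + 2)
    u = e[1:-1] + 0.5 * (e[:-2] + e[2:])   # moving average over neighbours
    x = rng.standard_normal(n)
    w = np.column_stack([x, x ** 3])       # two instruments, one parameter
    return theta0 * x + u, x, w


if __name__ == "__main__":
    y, x, w = make_data()
    grid = np.linspace(0.0, 4.0, 401)
    print(gmm_grid_estimate(y, x, w, np.eye(2), grid))
```

Because the moment function is linear in $\theta$ here, $Q_n$ is a quadratic and the grid minimizer lies within one grid step of the exact minimizer; for nonlinear moment functions a numerical optimizer would normally replace the grid search.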

Throughout the sequel, we maintain the following assumptions regarding the parameter space, the GMM objective function and the unknown parameters $\theta_n^0$.<br />

Assumption 5 (a) The parameter space $\Theta$ is a compact metric space with metric $\nu$.<br />

(b) The functions $q_{i,n} : \mathbb{R}^{p_z} \times \Theta \to \mathbb{R}^{p_q}$ are $\mathcal{B}^{p_z}/\mathcal{B}^{p_q}$-measurable for each $\theta \in \Theta$, and continuous on $\Theta$ for each $z \in \mathbb{R}^{p_z}$.<br />

(c) The elements of the $p_q \times p_q$ real matrices $P_n$ are $\mathcal{B}$-measurable, and $P_n$ is positive definite a.s. Furthermore $P = \operatorname{plim} P_n$ exists and $P$ is positive definite.<br />

(d) The minimizers $\theta_n^0$ are identifiably unique in the sense that for every $\varepsilon > 0$, $\liminf_{n \to \infty} \inf_{\theta \in \Theta:\, \nu(\theta, \theta_n^0) \ge \varepsilon} [E R_n(\theta)]' [E R_n(\theta)] > 0$.<br />

Compactness of the parameter space as maintained in Assumption 5(a) is typical for the GMM literature. Assumptions 5(b),(c) imply that $Q_n(\cdot, \theta)$ is measurable for all $\theta \in \Theta$, and $Q_n(\omega, \cdot)$ is continuous on $\Theta$. Given those assumptions the existence of measurable functions $\hat{\theta}_n$ that solve (23) follows, e.g., from Lemma A3 of Pötscher and Prucha (1997).<br />

Since $P$ is positive definite, it is readily seen that Assumption 5(d) implies that for every $\varepsilon > 0$:<br />

$$\liminf_{n \to \infty} \inf_{\theta \in \Theta:\, \nu(\theta, \theta_n^0) \ge \varepsilon} \left[ \bar{Q}_n(\theta) - \bar{Q}_n(\theta_n^0) \right] > 0,$$<br />

observing that $\bar{Q}_n(\theta_n^0) = 0$. Thus under Assumption 5(d) the minimizers $\theta_n^0$ are identifiably unique; compare, e.g., Gallant and White (1988), p.19. For interpretation consider the important special case where $\theta_n^0 = \theta^0$, $E R_n(\theta) = R(\theta)$ and $R(\theta)$ is continuous. In this case identifiable uniqueness of $\theta^0$ is equivalent to the assumption that $\theta^0$ is the unique solution of the moment conditions, i.e., $R(\theta) \ne 0$ for all $\theta \ne \theta^0$; compare, e.g., Pötscher and Prucha (1997), p. 16.<br />

5.1 Consistency<br />

Given the minimizers $\theta_n^0$ are identifiably unique, $\hat{\theta}_n$ is a consistent estimator for $\theta_n^0$ if $Q_n$ converges uniformly to $\bar{Q}_n$, i.e., if $\sup_{\theta \in \Theta} \left| Q_n(\omega, \theta) - \bar{Q}_n(\theta) \right| \xrightarrow{p} 0$ as $n \to \infty$; this follows immediately from, e.g., Pötscher and Prucha (1997), Lemma 3.1.<br />

We now proceed by giving a set of primitive domination and Lipschitz type conditions for the moment functions that ensure uniform convergence of $Q_n$ to $\bar{Q}_n$. The conditions are in line with those maintained in the general literature on M-estimation, e.g., Andrews (1987), Gallant and White (1988), and Pötscher and Prucha (1989, 1994).<br />

Definition 3 Let $f_{i,n} : \mathbb{R}^{p_z} \times \Theta \to \mathbb{R}^{p_q}$ be $\mathcal{B}^{p_z}/\mathcal{B}^{p_q}$-measurable functions for each $\theta \in \Theta$, then:<br />

(a) The random functions $f_{i,n}(Z_{i,n}; \theta)$ are said to be $p$-dominated on $\Theta$ for some $p > 1$ if $\sup_n \sup_{i \in D_n} E \sup_{\theta \in \Theta} |f_{i,n}(Z_{i,n}; \theta)|^p < \infty$.<br />

(b) The random functions $f_{i,n}(Z_{i,n}; \theta)$ are said to be Lipschitz in the parameter $\theta$ on $\Theta$ if<br />

$$\left| f_{i,n}(Z_{i,n}; \theta) - f_{i,n}(Z_{i,n}; \theta^{\bullet}) \right| \le L_{i,n}(Z_{i,n})\, h(\nu(\theta, \theta^{\bullet})) \quad \text{a.s.}, \tag{25}$$<br />

for all $\theta, \theta^{\bullet} \in \Theta$ and $i \in D_n$, $n \ge 1$, where $h$ is a nonrandom function with $h(x) \downarrow 0$ as $x \downarrow 0$, and $L_{i,n}$ are random variables with $\limsup_{n \to \infty} |D_n|^{-1} \sum_{i \in D_n} E L_{i,n}^{\eta} < \infty$ for some $\eta > 0$.<br />

Towards establishing consistency of $\hat{\theta}_n$ we furthermore maintain the following moment and mixing assumptions.<br />

Assumption 6 The moment functions $q_{i,n}(Z_{i,n}; \theta)$ have the following properties:<br />

(a) They are $p$-dominated on $\Theta$ for $p = 2$.<br />

(b) They are uniformly $L_1$-NED on $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \ge 1\}$, where $D_n \subseteq T_n \subseteq D$, and $\varepsilon$ is $\alpha$-mixing with $\alpha$-mixing coefficients satisfying the conditions stated in Assumption 4(b).<br />

(c) They are Lipschitz in the parameter $\theta$ on $\Theta$.<br />

Assumption 6(a) implies that $\sup_{n, i \in D_n} E|q_{i,n}(Z_{i,n}; \theta)|^p < \infty$ for each $\theta \in \Theta$. Assumptions 6(a),(b) then allow us to apply the LLN given as Theorem 2 above to the sample moments $R_n(\theta) = |D_n|^{-1} \sum_{i \in D_n} q_{i,n}(Z_{i,n}; \theta)$.<br />

To verify Assumption 6(b) one can use either Proposition 1 or Proposition 2 to imply this condition from the lower level assumption that the data $Z_{i,n}$ are $L_1$-NED. For example, the $q_{i,n}$ are $L_1$-NED if the $Z_{i,n}$ are $L_1$-NED and the $q_{i,n}$ satisfy the Lipschitz condition of Proposition 1. Note that no restrictions on the sizes of the NED coefficients are required.<br />

Assumption 6(c) ensures stochastic equicontinuity of $q_{i,n}$ w.r.t. $\theta$. Stochastic equicontinuity jointly with Assumption 6(a) and the pointwise LLN enable us to invoke the ULLN of Jenish and Prucha (2009) to prove uniform convergence of the sample moments, which in turn is used to establish that $Q_n$ converges uniformly to $\bar{Q}_n$. A sufficient condition for Assumption 6(c) is existence of integrable partial derivatives of $q_{i,n}$ w.r.t. $\theta$ if $\Theta \subseteq \mathbb{R}^k$.<br />

Our consistency result for the spatial GMM estimator given by (23) is summarized by the next theorem.<br />

Theorem 3 (Consistency) Suppose $\{D_n\}$ is a sequence of finite sets of $D$ such that $|D_n| \to \infty$ as $n \to \infty$, where $D \subset \mathbb{R}^d$, $d \ge 1$, is as in Assumption 1. Suppose further that Assumptions 5 and 6 hold. Then<br />

$$\nu(\hat{\theta}_n, \theta_n^0) \xrightarrow{p} 0 \quad \text{as } n \to \infty,$$<br />

and $\bar{Q}_n(\theta)$ is uniformly equicontinuous on $\Theta$.<br />

5.2 Asymptotic Normality<br />

We next establish that the spatial GMM estimator defined by (23) is asymptotically normally distributed. For that purpose, we need a stronger set of assumptions than for consistency, including differentiability of the moment functions in $\theta$. It proves helpful to adopt the notation $\nabla_\theta$ in place of $\partial / \partial\theta$. 8<br />

Assumption 7 (a) The minimizers $\theta_n^0$ lie uniformly in the interior of $\Theta$ with $\Theta \subseteq \mathbb{R}^k$. Furthermore $E[R_n(\theta_n^0)] = 0$.<br />

(b) The functions $q_{i,n} : \mathbb{R}^{p_z} \times \Theta \to \mathbb{R}^{p_q}$ are continuously differentiable w.r.t. $\theta$ in the interior of $\Theta$ for each $z \in \mathbb{R}^{p_z}$.<br />

(c) The functions $q_{i,n}(Z_{i,n}; \theta_n^0)$ are uniformly $L_2$-NED on $\varepsilon$ of size $-d$, and $\sup_{n, i \in D_n} E|q_{i,n}(Z_{i,n}; \theta_n^0)|^{2+\delta'} < \infty$ for some $\delta' > 0$. The functions $\nabla_\theta q_{i,n}(Z_{i,n}; \theta)$ are uniformly $L_1$-NED on $\varepsilon$.<br />

(d) The input process $\varepsilon = \{\varepsilon_{i,n}, i \in T_n, n \ge 1\}$, where $D_n \subseteq T_n \subseteq D$, is $\alpha$-mixing and the mixing coefficients satisfy Assumption 2 for some $\delta < \delta'$, where $\delta'$ is the same as in Assumption 7(c).<br />

(e) The functions $\nabla_\theta q_{i,n}$ are $p$-dominated on $\Theta$ for some $p > 1$.<br />

(f) The functions $\nabla_\theta q_{i,n}$ are Lipschitz in $\theta$ on $\Theta$.<br />

(g) $\inf_n \lambda_{\min}(|D_n|^{-1} \Sigma_n) > 0$ where $\Sigma_n = Var\left[ \sum_{i \in D_n} q_{i,n}(Z_{i,n}; \theta_n^0) \right]$.<br />

(h) $\inf_n \lambda_{\min}\left[ E\nabla_\theta R_n(\theta_n^0)'\, \nabla_\theta R_n(\theta_n^0) \right] > 0$.<br />

8 To ensure that the derivatives are defined on the border of $\Theta$, we assume in the following that the moment functions are defined on an open set containing $\Theta$, and that the $q_{i,n}$ and $\nabla_\theta q_{i,n}$ are restrictions to $\Theta$.<br />


The first part of Assumption 7(a) is needed to ensure that the estimator $\hat{\theta}_n$ lies in the interior of $\Theta$ with probability tending to one, and facilitates the application of the mean value theorem to $R_n(\hat{\theta}_n)$ around $\theta_n^0$. The second part states in essence that the moment conditions are correctly specified. Its violation will generally invalidate the limiting distribution result.<br />

Assumptions 7(c),(d),(g) enable us to apply the CLT for vector-valued NED processes given above as Corollary 1 to $R_n(\theta_n^0)$. Some low level sufficient conditions for Assumption 7(c) are given below. To establish asymptotic normality, we also need uniform convergence of $\nabla_\theta R_n$ on $\Theta$, which is implied via Assumptions 7(c),(d),(e),(f).<br />

Given the above assumptions, we have the following asymptotic normality result for the spatial GMM estimator defined by (23).<br />

Theorem 4 Suppose $\{D_n\}$ is a sequence of finite sets of $D$ such that $|D_n| \to \infty$ as $n \to \infty$, where $D \subset \mathbb{R}^d$, $d \ge 1$, is as in Assumption 1. Suppose further that Assumptions 5-7 hold. Then<br />

$$\left( A_n^{-1} B_n B_n' A_n^{-1\prime} \right)^{-1/2} |D_n|^{1/2} \left( \hat{\theta}_n - \theta_n^0 \right) \Longrightarrow N(0, I_k),$$<br />

where<br />

$$A_n = [E\nabla_\theta R_n(\theta_n^0)]' P\, [E\nabla_\theta R_n(\theta_n^0)] \quad \text{and} \quad B_n = [E\nabla_\theta R_n(\theta_n^0)]' P \left[ |D_n|^{-1} \Sigma_n \right]^{1/2}.$$<br />

Moreover, $|A_n| = O(1)$, $|A_n^{-1}| = O(1)$, $|B_n| = O(1)$, $|(B_n B_n')^{-1}| = O(1)$ and hence, $\hat{\theta}_n$ is $|D_n|^{1/2}$-consistent for $\theta_n^0$.<br />

As remarked above, relative to the existing literature Theorem 4 allows for nonstationary processes and only assumes that $q_{i,n}$ and $\nabla_\theta q_{i,n}$ are NED on an $\alpha$-mixing input process, rather than postulating that $q_{i,n}$ and $\nabla_\theta q_{i,n}$ are $\alpha$-mixing. The latter assumption is restrictive in that the $\alpha$-mixing property is not preserved under transformation. Since the NED property is preserved under a wide range of transformations it accommodates a much larger class of dependent spatial processes, including infinite moving average and spatial autoregressive models. As such, Theorem 4 should provide a basis for constructing confidence intervals and hypothesis testing in a wider range of spatial models.<br />
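The limiting covariance in Theorem 4 has the familiar sandwich form $A_n^{-1} B_n B_n' A_n^{-1\prime}$. The sketch below evaluates it for given inputs; the function names and the numerical values are illustrative assumptions, and in practice $E\nabla_\theta R_n(\theta_n^0)$ and $|D_n|^{-1}\Sigma_n$ would have to be replaced by estimates, a step that the code does not address.

```python
import numpy as np


def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.sqrt(vals)) @ vecs.T


def sandwich_covariance(G, P, S):
    """A^{-1} B B' A^{-1}' with A = G' P G and B = G' P S^{1/2}.

    G : (p_q, k) stand-in for E grad_theta R_n(theta_n^0)
    P : (p_q, p_q) positive definite weighting matrix
    S : (p_q, p_q) stand-in for |D_n|^{-1} Sigma_n
    """
    A = G.T @ P @ G
    B = G.T @ P @ sym_sqrt(S)
    A_inv = np.linalg.inv(A)
    return A_inv @ B @ B.T @ A_inv.T


if __name__ == "__main__":
    # Hypothetical inputs: three moment conditions, two parameters.
    G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    S = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]])
    print(sandwich_covariance(G, np.eye(3), S))
    print(sandwich_covariance(G, np.linalg.inv(S), S))
```

With the weighting matrix $P = (|D_n|^{-1}\Sigma_n)^{-1}$ the sandwich collapses to $(G' S^{-1} G)^{-1}$, the standard efficient-GMM form, which gives a convenient check on the implementation.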

Using Proposition 2, we now give some sufficient conditions for Assumption 7(c).<br />

Assumption 8 The process $\{Z_{i,n}, i \in D_n \subseteq T_n, n \ge 1\}$ is uniformly $L_2$-NED on $\{\varepsilon_{i,n}, i \in T_n, n \ge 1\}$ of size $-2d(r-1)/(r-2)$ for some $r > 2$.<br />

Assumption 9 For every sequence $\{\theta_n\}$ on $\Theta$, the functions $q_{i,n}(Z_{i,n}; \theta_n)$ and $\nabla_\theta q_{i,n}(Z_{i,n}; \theta_n)$ satisfy Lipschitz condition (2) in $z$, that is, for $g_{i,n} = q_{i,n}$ or $\nabla_\theta q_{i,n}$:<br />

$$\left| g_{i,n}(z; \theta_n) - g_{i,n}(z^{\bullet}; \theta_n) \right| \le B_{i,n}(z, z^{\bullet}) \left| z - z^{\bullet} \right|.$$<br />

Furthermore, for the $r > 2$ as specified in Assumption 8,<br />

$$\sup_{n, i \in D_n} \sup_s \left\| B_{i,n}^{(s)} \right\|_2 < \infty \quad \text{and} \quad \sup_{n, i \in D_n} \sup_s \left\| B_{i,n}^{(s)} \left| Z_{i,n} - \tilde{Z}_{i,n}^s \right| \right\|_r < \infty,$$<br />

where $B_{i,n}^{(s)} = B_{i,n}(Z_{i,n}, \tilde{Z}_{i,n}^s)$ with $\tilde{Z}_{i,n}^s = E[Z_{i,n} | F_{i,n}(s)]$.<br />

6 Conclusion<br />

The paper develops an asymptotic inference theory for a class of dependent nonstationary random fields that could be used in a wide range of econometric models with cross-sectional or spatial dependence, e.g., social interaction models. More specifically, the paper extends the notion of near-epoch dependent (NED) processes used in the time series literature to spatial processes. This allows us to accommodate larger classes of dependent processes than mixing random fields. The class of NED random fields is “closed with respect to infinite transformations” and thus should be sufficiently broad for many applications of interest. In particular, it covers autoregressive and infinite moving average random fields as well as nonlinear spatial Bernoulli shifts. The NED property is also compatible with considerable heterogeneity and preserved under transformations under fairly mild conditions. Furthermore, a CLT and an LLN are derived for spatial processes that are NED on an $\alpha$-mixing process. Apart from covering a larger class of dependent processes, these limit theorems also allow for arrays of nonstationary random fields on unevenly spaced lattices. Building on these limit results, the paper develops an asymptotic theory of spatial GMM estimators, which provides a basis for inference in a broad range of models with cross-sectional or spatial dependence.<br />

Much of the random fields literature assumes that the process resides on an equally spaced grid. In contrast, and as in Jenish and Prucha (2009), we allow for locations to be unequally spaced. The implicit assumption of fixed locations seems reasonable for a large class of applications, especially in the short run. Still, an important direction for future work would be to extend the asymptotic theory to spatial processes with endogenous locations, while maintaining a set of assumptions that are reasonably easy to interpret. 9 One possible approach may be to augment the contributions of the present paper with theory from point processes.<br />

9 Pinkse et al. (2007) made an interesting contribution in this direction. Their catalogue of assumptions is at the level of Bernstein blocks. Without further sufficient conditions, verification of those assumptions would typically be challenging in practical situations.<br />

A Appendix: Proofs for Section 3<br />

Proof of Proposition 4: Let $F_i(s) = \sigma(\varepsilon_j; j \in \mathbb{Z}^d : \rho(i,j) \le s)$. By the Minkowski inequality for infinite sums, the conditional Jensen inequality, and (6), we have<br />

$$\left\| Y_i - E[Y_i | F_i(s)] \right\|_p = \Big\| \sum_{j \in \mathbb{Z}^d:\, \rho(i,j) > s} g_{ij} \left\{ \varepsilon_j - E[\varepsilon_j | F_i(s)] \right\} \Big\|_p \le \sum_{j \in \mathbb{Z}^d:\, \rho(i,j) > s} |g_{ij}| \left\| \varepsilon_j - E[\varepsilon_j | F_i(s)] \right\|_p \le K \psi(s)$$<br />

with $K = 2 \sup_{j \in \mathbb{Z}^d} \|\varepsilon_j\|_p < \infty$ and $\psi(s) = \sup_{i \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d:\, \rho(i,j) > s} |g_{ij}|$. Since $\psi(s) \to 0$ as $s \to \infty$ by condition (5) we have $\|Y_i - E[Y_i | F_i(s)]\|_p \to 0$ as $s \to \infty$, which proves the claim. Q.E.D.<br />

Proof of Proposition 5: Let $F_{i,n}(s) = \sigma(\varepsilon_{j,n}; 1 \le j \le n : \rho(i,j) \le s)$. By the Minkowski inequality, the conditional Jensen inequality, and (12), we have<br />

$$\left\| Y_{i,n} - E[Y_{i,n} | F_{i,n}(s)] \right\|_p \le \sum_{1 \le j \le n:\, \rho(i,j) > s} |a_{ij,n}| \left\| X_{j,n} - E[X_{j,n} | F_{i,n}(s)] \right\|_p + \sum_{1 \le j \le n:\, \rho(i,j) > s} |b_{ij,n}| \left\| v_{j,n} - E[v_{j,n} | F_{i,n}(s)] \right\|_p \le K \psi(s)$$<br />

with $K = \max\{ 2\|X_{j,n}\|_p,\ 2\|v_{j,n}\|_p \} < \infty$ and $\psi(s) = \sup_{n, 1 \le i \le n} \sum_{1 \le j \le n:\, \rho(i,j) > s} (|a_{ij,n}| + |b_{ij,n}|)$. Since $\psi(s) \to 0$ as $s \to \infty$ by condition (11) we have $\|Y_{i,n} - E[Y_{i,n} | F_{i,n}(s)]\|_p \to 0$ as $s \to \infty$, which proves the claim. Q.E.D.<br />

Proof of Proposition 6: We first show that $Y_i$ is well defined as the limit in $L_2$ as $s \to \infty$ of the random variables<br />

$$Y_i^s = f_i(\varepsilon_{i+j}^{(s)}, j \in \mathbb{Z}^d),$$<br />

where<br />

$$\varepsilon_{i+j}^{(s)} = \begin{cases} \varepsilon_{i+j} & \text{for } \rho(i,j) \le s, \\ 0 & \text{for } \rho(i,j) > s. \end{cases}$$<br />

In light of condition (15) we have for any $s, m \in \mathbb{N}$:<br />

$$\left\| Y_i^{s+m} - Y_i^s \right\|_2 = \left\| f_i(\varepsilon_{i+j}^{(s+m)}) - f_i(\varepsilon_{i+j}^{(s)}) \right\|_2 \le \sum_{j \in \mathbb{Z}^d:\, s < \rho(i,j) \le s+m} w_{ij} \left\| \varepsilon_{i+j} \right\|_2 \le \sup_{i \in \mathbb{Z}^d} \|\varepsilon_i\|_2 \sum_{j \in \mathbb{Z}^d:\, s < \rho(i,j) \le s+m} w_{ij} \le K \sup_{i \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d:\, \rho(i,j) > s} w_{ij}$$<br />

with $K = \sup_{i \in \mathbb{Z}^d} \|\varepsilon_i\|_2$. In light of (16)-(17) the r.h.s. converges monotonically to zero as $s \to \infty$. Thus clearly $Y_i^s$ is a Cauchy sequence in $L_2$, and hence $Y_i$ is well defined since the $L_2$ space is a Banach space and as such complete.<br />

Now, let $F_i(s) = \sigma(\varepsilon_{i+j}; j \in \mathbb{Z}^d : \rho(i,j) \le s)$. By the minimum mean-squared error property of the conditional expectation, we have<br />

$$\left\| Y_i - E[Y_i | F_i(s)] \right\|_2 \le \left\| f_i(\varepsilon_{i+j}, j \in \mathbb{Z}^d) - f_i(\varepsilon_{i+j}^{(s)}, j \in \mathbb{Z}^d) \right\|_2 \le K \sum_{j \in \mathbb{Z}^d:\, \rho(i,j) > s} w_{ij} \le K \psi(s),$$<br />

with<br />

$$\psi(s) = \sup_{i \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d:\, \rho(i,j) > s} w_{ij} \to 0$$<br />

as $s \to \infty$ by condition (16). This completes the proof of the proposition. Q.E.D.<br />

B Appendix: Proofs for Section 4<br />

Throughout this appendix let $F_{i,n}(s) = \sigma(\varepsilon_{j,n}; j \in T_n : \rho(i,j) \le s)$ be the $\sigma$-field generated by the random vectors $\varepsilon_{j,n}$ located in the $s$-neighborhood of location $i$. Furthermore, $C$ denotes a generic constant that does not depend on $n$ and may be different from line to line.<br />

The proof of the CLT builds on Ibragimov and Linnik (1971), pp. 352-355, and makes use of the following lemmata:<br />

Lemma B.1 (Brockwell and Davis (1991), Proposition 6.3.9). Let $Y_n$, $n = 1, 2, \ldots$ and $V_{ns}$, $s = 1, 2, \ldots$; $n = 1, 2, \ldots$, be random vectors such that<br />

(i) $V_{ns} \Longrightarrow V_s$ as $n \to \infty$ for each $s = 1, 2, \ldots$<br />

(ii) $V_s \Longrightarrow V$ as $s \to \infty$, and<br />

(iii) $\lim_{s \to \infty} \limsup_{n \to \infty} P(|Y_n - V_{ns}| > \epsilon) = 0$ for every $\epsilon > 0$.<br />

Then $Y_n \Longrightarrow V$ as $n \to \infty$.<br />

Lemma B.2 (Ibragimov and Linnik (1971)) Let $L_p(F_1)$ and $L_p(F_2)$ denote, respectively, the class of $F_1$-measurable and $F_2$-measurable random variables $\xi$ satisfying $\|\xi\|_p < \infty$. Let $X \in L_p(F_1)$ and $Y \in L_q(F_2)$. Then, for any $1 \le p, q, r < \infty$ such that $p^{-1} + q^{-1} + r^{-1} = 1$,<br />

$$|Cov(X, Y)| < 4\, \alpha^{1/r}(F_1, F_2)\, \|X\|_p\, \|Y\|_q$$<br />

where $\alpha(F_1, F_2) = \sup_{A \in F_1, B \in F_2} (|P(AB) - P(A)P(B)|)$.<br />

To prove the CLT for NED random fields, we first establish some moment inequalities and a slightly modified version of the CLT for $\alpha$-mixing fields developed in Jenish and Prucha (2009). It is helpful to introduce the following notation. Let $X = \{X_{i,n}, i \in D_n, n \ge 1\}$ be a random field, then $\|X\|_q := \sup_{n, i \in D_n} \|X_{i,n}\|_q$ for $q \ge 1$.<br />


Lemma B.3 Let $\{X_{i,n}\}$ be uniformly $L_2$-NED on a random field $\{\varepsilon_{i,n}\}$ with $\alpha$-mixing coefficients $\bar{\alpha}(u, v, r) \le (u + v)^{\tau} \hat{\alpha}(r)$, $\tau \ge 0$. Let $S_n = \sum_{i \in D_n} X_{i,n}$ and suppose that the NED coefficients of $\{X_{i,n}\}$ satisfy $\sum_{r=1}^{\infty} r^{d-1} \psi(r) < \infty$ and $\|X\|_{2+\delta} < \infty$ for some $\delta > 0$. Then,<br />

(a)<br />

$$|Cov(X_{i,n}, X_{j,n})| \le \|X\|_{2+\delta} \left\{ C_1 \|X\|_{2+\delta}\, [h/3]^{d\tau_*}\, \hat{\alpha}^{\delta/(2+\delta)}([h/3]) + C_2\, \psi([h/3]) \right\}$$<br />

where $h = \rho(i, j)$ and $\tau_* = \delta\tau/(2+\delta)$. If in addition $\sum_{r=1}^{\infty} r^{d(\tau_*+1)-1} \hat{\alpha}^{\delta/(2+\delta)}(r) < \infty$, then for some constant $C < \infty$, not depending on $n$,<br />

$$Var(S_n) \le C |D_n|.$$<br />

(b)<br />

$$|Cov(X_{i,n}, X_{j,n})| \le \|X\|_2 \left\{ C_3 \|X\|_{2+\delta}\, [h/3]^{d\tau_*}\, \hat{\alpha}^{\delta/(4+2\delta)}([h/3]) + C_4\, \psi([h/3]) \right\}$$<br />

where $h = \rho(i, j)$ and $\tau_* = \delta\tau/(4+2\delta)$. If, in addition, $\sum_{r=1}^{\infty} r^{d(\tau_*+1)-1} \hat{\alpha}^{\delta/(4+2\delta)}(r) < \infty$ where $\tau_* = \delta\tau/(4+2\delta)$, then for some constant $C < \infty$, not depending on $n$,<br />

$$Var(S_n) \le C \|X\|_2 |D_n|.$$<br />

Proof of Lemma B.3: (a) For any $i \in D_n$ and $m > 0$, let<br />

$$\xi_{i,n}^m = E(X_{i,n} | F_{i,n}(m)), \qquad \eta_{i,n}^m = X_{i,n} - \xi_{i,n}^m.$$<br />

By the Jensen and Lyapunov inequalities, we have for all $i \in D_n$; $n, m \in \mathbb{N}$ and any $1 \le q \le 2 + \delta$:<br />

$$E\left| \xi_{i,n}^m \right|^q = E\{ |E(X_{i,n} | F_{i,n}(m))|^q \} \le E\{ E(|X_{i,n}|^q | F_{i,n}(m)) \} = E|X_{i,n}|^q$$<br />

and thus<br />

$$\left\| \xi_{i,n}^m \right\|_q \le \|X_{i,n}\|_q \le \|X\|_{2+\delta}, \qquad \left\| \eta_{i,n}^m \right\|_q \le 2\|X_{i,n}\|_q \le 2\|X\|_{2+\delta}.$$<br />

Thus, both $\xi_{i,n}^m$ and $\eta_{i,n}^m$ are uniformly $L_{2+\delta}$ bounded. Also, note that<br />

$$\sup_{n, i \in D_n} \left\| \eta_{i,n}^m \right\|_2 \le \psi(m),$$<br />

given that $\{X_{i,n}\}$ is uniformly $L_2$-NED on $\{\varepsilon_{i,n}\}$ and thus the NED-scaling factors can be chosen w.l.o.g. to be one. Furthermore, let $\sigma(\xi_{i,n}^m)$ denote the $\sigma$-field generated by $\xi_{i,n}^m$. Since $\sigma(\xi_{i,n}^m) \subseteq F_{i,n}(m)$, the mixing coefficients of $\xi_{i,n}^m$ satisfy<br />

$$\bar{\alpha}_{\xi}(1, 1, r) \le \begin{cases} 1, & r \le 2m, \\ \bar{\alpha}(Mm^d, Mm^d, r - 2m), & r > 2m, \end{cases}$$<br />

where $\bar{\alpha}(u, v, r)$ are the mixing coefficients of the input process $\varepsilon$, since the $m$-neighborhood of any point on $D$ contains at most $Mm^d$ points of $D$ for some $M$ that does not depend on $m$, see the proof of Lemma A.1 of Jenish and Prucha (2009).<br />

Now, decompose $X_{i,n}$ and $X_{j,n}$ as<br />

$$X_{i,n}=\xi^{[h/3]}_{i,n}+\eta^{[h/3]}_{i,n},\quad\text{and}\quad X_{j,n}=\xi^{[h/3]}_{j,n}+\eta^{[h/3]}_{j,n},$$<br />

where $h=\rho(i,j)$. Then,<br />

$$|Cov(X_{i,n},X_{j,n})|=\big|Cov\big(\xi^{[h/3]}_{i,n}+\eta^{[h/3]}_{i,n},\ \xi^{[h/3]}_{j,n}+\eta^{[h/3]}_{j,n}\big)\big|$$<br />
$$\le\big|Cov\big(\xi^{[h/3]}_{i,n},\xi^{[h/3]}_{j,n}\big)\big|+\big|Cov\big(\xi^{[h/3]}_{i,n},\eta^{[h/3]}_{j,n}\big)\big|+\big|Cov\big(\eta^{[h/3]}_{i,n},\xi^{[h/3]}_{j,n}\big)\big|+\big|Cov\big(\eta^{[h/3]}_{i,n},\eta^{[h/3]}_{j,n}\big)\big|.\qquad\text{(B.1)}$$<br />
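The four-term split in (B.1) is bilinearity of the covariance followed by the triangle inequality. A minimal numerical sketch, assuming NumPy and using arbitrary simulated arrays as hypothetical stand-ins for the conditional means and the approximation errors (the names `sample_cov` and `four_term_split` are illustrative, not from the paper):<br />

```python
import numpy as np

def sample_cov(a, b):
    """Sample covariance of two equally long 1-D arrays (divisor n)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - a.mean()) * (b - b.mean())))

def four_term_split(xi_i, eta_i, xi_j, eta_j):
    """Return Cov(X_i, X_j) and the four covariances on the r.h.s. of (B.1)."""
    total = sample_cov(xi_i + eta_i, xi_j + eta_j)
    terms = [
        sample_cov(xi_i, xi_j),
        sample_cov(xi_i, eta_j),
        sample_cov(eta_i, xi_j),
        sample_cov(eta_i, eta_j),
    ]
    return total, terms

# Hypothetical draws; any four arrays of equal length would do.
rng = np.random.default_rng(0)
xi_i, eta_i, xi_j, eta_j = rng.normal(size=(4, 1000))
total, terms = four_term_split(xi_i, eta_i, xi_j, eta_j)
```

Here `total` equals `sum(terms)` up to rounding, so its absolute value is at most the sum of the absolute values of the four terms, which is the form of the bound used in the proof.<br />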

We will now bound separately each term on the r.h.s. of the last inequality. First, using Lemma B.2 with $p=q=2+\delta$ and $r=(2+\delta)/\delta$ yields the following bound on the first term:<br />

$$\big|Cov\big(\xi^{[h/3]}_{i,n},\xi^{[h/3]}_{j,n}\big)\big|\le 4\,\big\|\xi^{[h/3]}_{i,n}\big\|_{2+\delta}\big\|\xi^{[h/3]}_{j,n}\big\|_{2+\delta}\,\bar\alpha_{\xi}^{\delta/(2+\delta)}(1,1,h)\qquad\text{(B.2)}$$<br />
$$\le 4\|X\|_{2+\delta}^{2}\,\bar\alpha^{\delta/(2+\delta)}\big(M[h/3]^{d},M[h/3]^{d},h-2[h/3]\big)$$<br />
$$\le C_1\|X\|_{2+\delta}^{2}\,[h/3]^{d\tau_*}\,\hat\alpha^{\delta/(2+\delta)}([h/3])$$<br />

where $\tau_*=\delta\tau/(2+\delta)$.<br />

Second, the Cauchy-Schwarz inequality gives the following bound on the second and third terms:<br />

$$\big|Cov\big(\xi^{[h/3]}_{i,n},\eta^{[h/3]}_{j,n}\big)\big|\le 4\,\big\|\xi^{[h/3]}_{i,n}\big\|_{2}\big\|\eta^{[h/3]}_{j,n}\big\|_{2}\le 4\|X\|_2\,\psi([h/3]).\qquad\text{(B.3)}$$<br />

Similarly, the fourth term can be bounded as:<br />

$$\big|Cov\big(\eta^{[h/3]}_{i,n},\eta^{[h/3]}_{j,n}\big)\big|\le 4\,\big\|\eta^{[h/3]}_{i,n}\big\|_{2}\big\|\eta^{[h/3]}_{j,n}\big\|_{2}\le 8\|X\|_2\,\psi([h/3]).\qquad\text{(B.4)}$$<br />

Collecting (B.2)-(B.4), we have<br />

$$|Cov(X_{i,n},X_{j,n})|\le\|X\|_{2+\delta}\Big\{C_1\|X\|_{2+\delta}\,[h/3]^{d\tau_*}\,\hat\alpha^{\delta/(2+\delta)}([h/3])+C_2\,\psi([h/3])\Big\},$$<br />

which proves the first inequality.<br />

Using this inequality as well as the bounds on the sizes of the sets given in Lemma A.1 of Jenish and Prucha (2009), we have<br />

$$Var(S_n)\le\sum_{i\in D_n}Var(X_{i,n})+\sum_{i,j\in D_n,\,i\ne j}|Cov(X_{i,n},X_{j,n})|$$<br />
$$\le 2|D_n|\,\|X\|_{2+\delta}^{2}+C_1\|X\|_{2+\delta}^{2}\sum_{i\in D_n}\sum_{r=1}^{\infty}\ \sum_{j\in D_n:\ \rho(i,j)\in[r,r+1)}[\rho(i,j)/3]^{d\tau_*}\,\hat\alpha^{\delta/(2+\delta)}([\rho(i,j)/3])$$<br />
$$\quad+C_2\|X\|_{2+\delta}\sum_{i\in D_n}\sum_{r=1}^{\infty}\ \sum_{j\in D_n:\ \rho(i,j)\in[r,r+1)}\psi([\rho(i,j)/3])$$<br />
$$\le 2|D_n|\,\|X\|_{2+\delta}^{2}+C\,|D_n|\,\|X\|_{2+\delta}\left[\sum_{r=1}^{\infty}r^{d(\tau_*+1)-1}\,\hat\alpha^{\delta/(2+\delta)}(r)+\sum_{r=1}^{\infty}r^{d-1}\psi(r)\right]$$<br />

for some constant $C<\infty$, not depending on $n$.<br />
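The passage from the sum over $j$ with $\rho(i,j)\in[r,r+1)$ to sums of the form $\sum_r r^{d-1}(\cdot)$ rests on the fact that such shells contain at most a constant times $r^{d-1}$ points. A small sketch of this count, under the simplifying assumptions (not made in the paper, whose lattice is more general) that $D$ is the integer lattice $\mathbb{Z}^d$ and $\rho$ is the maximum norm; the function names are illustrative:<br />

```python
from itertools import product

def shell_count_formula(r, d):
    """Number of points of Z^d at max-norm distance exactly r >= 1 from the origin."""
    return (2 * r + 1) ** d - (2 * r - 1) ** d

def shell_count_brute(r, d):
    """Same count obtained by enumerating the cube [-r, r]^d."""
    return sum(
        1
        for p in product(range(-r, r + 1), repeat=d)
        if max(abs(c) for c in p) == r
    )

def shell_ratio(r, d):
    """Shell size divided by r^(d-1); this stays bounded as r grows."""
    return shell_count_formula(r, d) / r ** (d - 1)
```

On the integer lattice with the maximum norm, the shell $[r,r+1)$ is exactly the set of points at distance $r$, and `shell_ratio` stays bounded in $r$ for each fixed $d$, which is the kind of bound invoked from Lemma A.1 of Jenish and Prucha (2009).<br />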

(b) To prove the second part of the lemma, apply Lemma B.2 with $p=2+\delta$, $q=2$, and $r=2(2+\delta)/\delta$ to obtain the following bound on the first term:<br />

$$\big|Cov\big(\xi^{[h/3]}_{i,n},\xi^{[h/3]}_{j,n}\big)\big|\le C_3\|X\|_{2+\delta}\|X\|_{2}\,[h/3]^{d\tau_{**}}\,\hat\alpha^{\delta/(4+2\delta)}([h/3]),\qquad\text{(B.5)}$$<br />

where $\tau_{**}=\delta\tau/(4+2\delta)$. The other terms on the r.h.s. of (B.1) are bounded as in part (a). Collecting (B.3)-(B.5) gives<br />

$$|Cov(X_{i,n},X_{j,n})|\le\|X\|_{2}\Big\{C_3\|X\|_{2+\delta}\,[h/3]^{d\tau_{**}}\,\hat\alpha^{\delta/(4+2\delta)}([h/3])+C_4\,\psi([h/3])\Big\}$$<br />

as required. Finally, using similar arguments as in the proof of part (a), we can bound $Var(S_n)$ as<br />

$$Var(S_n)\le C\,\|X\|_2\,|D_n|$$<br />

for some constant $C<\infty$, not depending on $n$. Q.E.D.<br />

Theorem B.1 Suppose $\{D_n\}$ is a sequence of finite subsets of $D$, satisfying Assumption 1, with $|D_n|\to\infty$ as $n\to\infty$. Suppose further that $\{\varepsilon_{i,n};\ i\in D_n,\ n\in\mathbb{N}\}$ is an array of zero-mean random variables with $\alpha$-coefficients $\bar\alpha(u,v,r)\le C(u+v)^{\tau}\hat\alpha(r)$ for some constants $C<\infty$ and $\tau\ge 0$. Suppose for some $\delta>0$ and $\gamma>0$<br />

$$\lim_{k\to\infty}\ \sup_{n,\,i\in D_n}E\big[|\varepsilon_{i,n}|^{2+\delta}\,\mathbf{1}(|\varepsilon_{i,n}|>k)\big]=0$$<br />

and<br />

$$\hat\alpha(r)=O\big(r^{-d(2\bar\tau+1)-\gamma}\big)$$<br />

with $\bar\tau=\max\{\tau,1/\delta\}$, and suppose $\liminf_{n\to\infty}|D_n|^{-1}\sigma_n^{2}>0$, then<br />

$$\sigma_n^{-1}\sum_{i\in D_n}\varepsilon_{i,n}\Longrightarrow N(0,1),$$<br />

where $\sigma_n^{2}=Var\big(\sum_{i\in D_n}\varepsilon_{i,n}\big)$.<br />


The above CLT is in essence a variant of the CLT for $\alpha$-mixing random fields given as Corollary 1 of Theorem 1 in Jenish and Prucha (2009), applied to mixing coefficients of the type $\bar\alpha(u,v,r)\le C(u+v)^{\tau}\hat\alpha(r)$, $\tau\ge 0$.<br />

Proof of Theorem B.1: The proof of the theorem is largely the same as the proof of Theorem 1 in Jenish and Prucha (2009). We will first show that all assumptions of that theorem except Assumption 3(c) are satisfied. We will then show that the entire proof goes through if Assumption 3(c) is replaced by the condition $\hat\alpha(r)=O(r^{-d(2\bar\tau+1)-\gamma})$ with $\bar\tau=\max\{\tau,1/\delta\}$.<br />

Clearly, Assumptions 1, 2 and 5 of that theorem with $c_{i,n}=1$ are satisfied. To verify Assumption 3(a) note that<br />

$$\sum_{r=1}^{\infty}\bar\alpha(1,1,r)\,r^{[d(2+\delta)/\delta]-1}\le C\sum_{r=1}^{\infty}r^{-2d(\bar\tau-1/\delta)-\gamma-1}\le C\sum_{r=1}^{\infty}r^{-\gamma-1}<\infty,$$<br />

since $\bar\tau=\max\{\tau,1/\delta\}$. As shown in the proof of Corollary 1 in Jenish and Prucha (2009), the latter condition implies Assumption 3(a) of Theorem 1 in Jenish and Prucha (2009). Furthermore, it is easy to see that Assumption 3(b) is also satisfied. Indeed, for any $u+v\le 4$, we have<br />

$$\sum_{r=1}^{\infty}\bar\alpha(u,v,r)\,r^{d-1}\le 4^{\tau}C\sum_{r=1}^{\infty}r^{d-1}\hat\alpha(r)\le C\sum_{r=1}^{\infty}r^{d-1}\,r^{-d(2\bar\tau+1)-\gamma}<\infty.$$<br />

Thus, all assumptions, except Assumption 3(c), of Theorem 1 in Jenish and Prucha (2009) hold, and hence all steps of its proof which do not rely on that assumption remain valid in our case. Assumption 3(c) is only used in step 5 of that proof. Specifically, all arguments in that step continue to hold provided we show that there exists a sequence $m_n$ such that<br />

$$m_n^{d}\,|D_n|^{-1/2}\to 0\ \text{ as } n\to\infty\qquad\text{(B.6)}$$<br />

and<br />

$$\bar\alpha(1,|D_n|,m_n)\,|D_n|^{1/2}\to 0\ \text{ as } n\to\infty.\qquad\text{(B.7)}$$<br />

(Though $\bar\alpha_{1,\infty}$ is used instead of $\bar\alpha_{1,|D_n|}$ in the proof of Theorem 1 of Jenish and Prucha (2009), in fact the proof only relies on the coefficient $\bar\alpha_{1,|D_n|}$; see Step 9 ($|EA_{3,n}|\to 0$) of the proof in the working version of the paper.)<br />

The desired sequence $m_n$ can be chosen as<br />

$$m_n=\big[\,|D_n|^{1/2}/\log|D_n|\,\big]^{1/d}.$$<br />

It is immediate that (B.6) holds,<br />

$$m_n^{d}\,|D_n|^{-1/2}=(\log|D_n|)^{-1}\to 0.$$<br />

To verify (B.7), observe<br />

$$\bar\alpha(1,|D_n|,m_n)\,|D_n|^{1/2}\le C|D_n|^{\tau+1/2}\,m_n^{-d(2\bar\tau+1)-\gamma}$$<br />
$$\le C|D_n|^{\tau+1/2}\,|D_n|^{-\bar\tau-1/2-\gamma/(2d)}\,[\log|D_n|]^{(2\bar\tau+1)+\gamma/d}$$<br />
$$\le C|D_n|^{-\gamma/(2d)}\,[\log|D_n|]^{(2\bar\tau+1)+\gamma/d}\to 0.$$<br />

The rest of the proof is the same, word-by-word, as the proof of Theorem 1 of Jenish and Prucha (2009). Q.E.D.<br />
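The two rates in (B.6) and (B.7) can be checked numerically for the block length chosen in the proof. The sketch below is only illustrative: the values of $d$, of the exponent $\max\{\tau,1/\delta\}$ (called `tau_bar` here) and of $\gamma$ are hypothetical, the constant $C$ is set to one, and the function names are not from the paper.<br />

```python
import math

def m_n(size, d):
    """Block length m_n = (|D_n|^(1/2) / log|D_n|)^(1/d) for |D_n| = size."""
    return (math.sqrt(size) / math.log(size)) ** (1.0 / d)

def b6_term(size, d):
    """Left-hand side of (B.6): m_n^d |D_n|^(-1/2)."""
    return m_n(size, d) ** d / math.sqrt(size)

def log_b7_bound(size, d, tau_bar, gamma):
    """Log of |D_n|^(-gamma/(2d)) [log|D_n|]^((2 tau_bar + 1) + gamma/d), taking C = 1."""
    log_size = math.log(size)
    power_part = (-gamma / (2 * d)) * log_size
    log_part = (2 * tau_bar + 1 + gamma / d) * math.log(log_size)
    return power_part + log_part

# Hypothetical configuration: d = 2, tau_bar = 1, gamma = 2.
sizes = [1e4, 1e8, 1e16, 1e32]
b6_values = [b6_term(s, 2) for s in sizes]
b7_logs = [log_b7_bound(s, 2, 1.0, 2.0) for s in sizes]
```

The (B.6) term equals $1/\log|D_n|$ exactly, while the bound in (B.7) decays only after the polynomial factor dominates the logarithmic one, so the convergence is slow for moderate $|D_n|$.<br />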

Proof of Theorem 1: Since the proof is lengthy it is broken into steps.<br />

1. Decomposition of $Y_{i,n}$<br />

For any fixed $s>0$, decompose $Y_{i,n}$ as<br />

$$Y_{i,n}=\xi^{s}_{i,n}+\eta^{s}_{i,n}$$<br />

where<br />

$$\xi^{s}_{i,n}=E(Y_{i,n}\mid\mathcal{F}_{i,n}(s)),\qquad \eta^{s}_{i,n}=Y_{i,n}-\xi^{s}_{i,n}.$$<br />

Let<br />

$$S_{n,s}=\sum_{i\in D_n}\xi^{s}_{i,n},\qquad \tilde S_{n,s}=\sum_{i\in D_n}\eta^{s}_{i,n},$$<br />
$$\sigma_{n,s}^{2}=Var[S_{n,s}],\qquad \tilde\sigma_{n,s}^{2}=Var[\tilde S_{n,s}].$$<br />

Repeated use of the Minkowski inequality yields:<br />

$$|\sigma_n-\sigma_{n,s}|\le\tilde\sigma_{n,s},\qquad |\sigma_n-\tilde\sigma_{n,s}|\le\sigma_{n,s}.\qquad\text{(B.8)}$$<br />

Observe that<br />

$$E\big[E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))\mid\mathcal{F}_{i,n}(m)\big]=\begin{cases}E(Y_{i,n}\mid\mathcal{F}_{i,n}(s)), & m\ge s,\\ E(Y_{i,n}\mid\mathcal{F}_{i,n}(m)), & m<s,\end{cases}$$<br />

and hence<br />

$$\big\|\eta^{s}_{i,n}-E(\eta^{s}_{i,n}\mid\mathcal{F}_{i,n}(m))\big\|_2$$<br />
$$=\big\|Y_{i,n}-E[Y_{i,n}\mid\mathcal{F}_{i,n}(s)]-E[Y_{i,n}\mid\mathcal{F}_{i,n}(m)]+E\big[E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))\mid\mathcal{F}_{i,n}(m)\big]\big\|_2$$<br />
$$=\begin{cases}\|Y_{i,n}-E(Y_{i,n}\mid\mathcal{F}_{i,n}(m))\|_2\le C\psi(m), & \text{if } m\ge s,\\ \|Y_{i,n}-E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))\|_2\le C\psi(s)\le C\psi(m), & \text{if } m<s,\end{cases}$$<br />

since by definition the sequence $\psi(m)$ is non-increasing. Thus, for any fixed $s>0$, $\eta^{s}_{i,n}$ is uniformly $L_2$-NED on $\varepsilon$ with the same NED coefficients $\psi(m)$ as the random field $\{Y_{i,n}\}$. Furthermore, as shown in the proof of Lemma B.3, $\eta^{s}_{i,n}$ is also uniformly $L_{2+\delta}$ bounded.<br />
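The inequalities in (B.8) are the triangle inequality for the $L_2$ norm applied to the split of the sum into its two parts. A minimal sketch, assuming NumPy and using simulated draws as hypothetical stand-ins for the two partial sums (the helper `l2_sd` and all variable names are illustrative):<br />

```python
import numpy as np

def l2_sd(x):
    """Root mean square of the centred sample, i.e. its L2 'standard deviation'."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean((x - x.mean()) ** 2)))

rng = np.random.default_rng(1)
s_xi = rng.normal(size=5000)                      # stand-in for the sum of conditional means
s_eta = 0.3 * s_xi + 0.2 * rng.normal(size=5000)  # stand-in for the remainder, correlated with s_xi

sigma_n = l2_sd(s_xi + s_eta)    # standard deviation of the full sum
sigma_ns = l2_sd(s_xi)           # standard deviation of the approximating sum
sigma_tilde_ns = l2_sd(s_eta)    # standard deviation of the remainder
```

Both `abs(sigma_n - sigma_ns) <= sigma_tilde_ns` and `abs(sigma_n - sigma_tilde_ns) <= sigma_ns` hold for any such draws, whatever the correlation between the two parts.<br />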

2. Bounds for the Variances of $\sum_{i\in D_n}Y_{i,n}$ and $\sum_{i\in D_n}\eta^{s}_{i,n}$<br />

First note that there exists $0<B<\infty$ such that for all $n$<br />

$$B\,|D_n|\le\sigma_n^{2}.\qquad\text{(B.9)}$$<br />

Furthermore, in light of Assumption 2, and observing that $\tau_{**}=\delta\tau/(4+2\delta)\le\tau_*=\delta\tau/(2+\delta)$ and $\hat\alpha^{\delta/(2+\delta)}(r)\le\hat\alpha^{\delta/(2(2+\delta))}(r)$, we have<br />

$$\sum_{r=1}^{\infty}r^{d(\tau_*+1)-1}\,\hat\alpha^{\delta/(2+\delta)}(r)\le\sum_{r=1}^{\infty}r^{d(\tau_*+1)-1}\,\hat\alpha^{\delta/(2(2+\delta))}(r)<\infty,$$<br />
$$\sum_{r=1}^{\infty}r^{d(\tau_{**}+1)-1}\,\hat\alpha^{\delta/(4+2\delta)}(r)\le\sum_{r=1}^{\infty}r^{d(\tau_*+1)-1}\,\hat\alpha^{\delta/(2(2+\delta))}(r)<\infty.$$<br />

Using part (a) of Lemma B.3 with $X_{i,n}=Y_{i,n}$, we have<br />

$$B\,|D_n|\le\sigma_n^{2}=Var(S_n)\le C\,|D_n|$$<br />

for some $B>0$. Using part (b) of Lemma B.3 with $X_{i,n}=\eta^{s}_{i,n}$ we have<br />

$$\tilde\sigma_{n,s}^{2}=Var(\tilde S_{n,s})\le C\,|D_n|\,\|\eta^{s}\|_2\le C\,|D_n|\,\psi(s).\qquad\text{(B.10)}$$<br />

Hence,<br />

$$\lim_{s\to\infty}\limsup_{n\to\infty}\frac{\tilde\sigma_{n,s}^{2}}{\sigma_n^{2}}\le C\lim_{s\to\infty}\psi(s)=0.\qquad\text{(B.11)}$$<br />

Furthermore, by (B.8) we have<br />

$$\lim_{s\to\infty}\limsup_{n\to\infty}\left|1-\frac{\sigma_{n,s}}{\sigma_n}\right|\le\lim_{s\to\infty}\limsup_{n\to\infty}\frac{\tilde\sigma_{n,s}}{\sigma_n}=0\qquad\text{(B.12)}$$<br />

and hence for all $s\ge 1$ and $n\ge 1$<br />

$$\frac{\sigma_{n,s}}{\sigma_n}\le C<\infty.\qquad\text{(B.13)}$$<br />

3. CLT for $\sum_{i\in D_n}\xi^{s}_{i,n}$<br />

We now show that for any fixed $s>0$, $\xi^{s}_{i,n}$ satisfies the CLT given in Theorem B.1.<br />

First, since $\sup_{n,\,i\in D_n}E\big[|\xi^{s}_{i,n}|^{2+\delta}\big]<\infty$, the process is uniformly $L_{2+\delta'}$-integrable for $\delta'=\delta/2$, i.e.,<br />

$$\lim_{k\to\infty}\ \sup_{n,\,i\in D_n}E\big[|\xi^{s}_{i,n}|^{2+\delta/2}\,\mathbf{1}(|\xi^{s}_{i,n}|>k)\big]=0.$$<br />

Second, note that since $\xi^{s}_{i,n}$ is a measurable function of $\{\varepsilon_{j,n};\ j\in T_n:\ \rho(i,j)\le s\}$, for any $u,v\in\mathbb{N}$ and $r>2s$<br />

$$\bar\alpha_{\xi}(u,v,r)\le\bar\alpha(uMs^{d},vMs^{d},r-2s)\le C(u+v)^{\tau}\,\hat\alpha(r-2s).$$<br />

We next show that $\hat\alpha(r)=O(r^{-d(2\bar\tau+1)-\gamma})$ for $\bar\tau=\max\{\tau,2/\delta\}$ and some $\gamma>0$. By assumption,<br />

$$\sum_{r=1}^{\infty}r^{d(\tau_*+1)-1}\,\hat\alpha^{\delta/(2(2+\delta))}(r)<\infty,$$<br />

where $\tau_*=\delta\tau/(2+\delta)$, which implies<br />

$$\hat\alpha(r)=o\big(r^{-2d(2+\delta)(\tau_*+1)/\delta}\big)=o\big(r^{-d[2(\tau+2/\delta)+1]-d}\big)=o\big(r^{-d[2\bar\tau+1]-d}\big)$$<br />

since $\tau+2/\delta\ge\bar\tau$ for $\bar\tau=\max\{\tau,2/\delta\}$. Thus, $\hat\alpha(r)=O(r^{-d(2\bar\tau+1)-\gamma})$ for $\gamma=d$.<br />



We next show that for sufficiently large $s$,<br />

$$0<\liminf_{n\to\infty}|D_n|^{-1}\sigma_{n,s}^{2}.$$<br />

By (B.9),<br />

$$B^{1/2}\le\inf_{n}|D_n|^{-1/2}\sigma_n.$$<br />

Since $\lim_{s\to\infty}\psi(s)=0$, there exists $s_*$ such that in light of (B.10) for all $s\ge s_*$,<br />

$$|D_n|^{-1/2}\,\tilde\sigma_{n,s}\le C^{1/2}\psi^{1/2}(s)\le B^{1/2}/2.\qquad\text{(B.14)}$$<br />

Hence by (B.8) for all $s\ge s_*$,<br />

$$|D_n|^{-1/2}(\sigma_n-\tilde\sigma_{n,s})\le|D_n|^{-1/2}\sigma_{n,s}$$<br />

and thus $\inf_n|D_n|^{-1/2}\sigma_{n,s}\ge\inf_n|D_n|^{-1/2}\sigma_n-\sup_n|D_n|^{-1/2}\tilde\sigma_{n,s}$. Using (??) and (B.14), we have<br />

$$\liminf_{n\to\infty}|D_n|^{-1/2}\sigma_{n,s}\ge B^{1/2}-\frac{B^{1/2}}{2}=\frac{B^{1/2}}{2}>0.$$<br />

Thus, for all $s\ge s_*$,<br />

$$\sigma_{n,s}^{-1}\sum_{i\in D_n}\xi^{s}_{i,n}\Longrightarrow N(0,1)\ \text{ as } n\to\infty.\qquad\text{(B.15)}$$<br />

Since the first $s_*$ terms do not affect the analysis below we take in the following $s_*=1$.<br />

4. CLT for $\sigma_n^{-1}\sum_{i\in D_n}Y_{i,n}$<br />

Finally, using Lemma B.1 we now show that, given the maintained NED assumption, the just established CLT in (B.15) for the approximators $\xi^{s}_{i,n}$ can be carried over to the $Y_{i,n}$. Define<br />

$$W_n=\sigma_n^{-1}\sum_{i\in D_n}Y_{i,n},\qquad V_{ns}=\sigma_n^{-1}\sum_{i\in D_n}\xi^{s}_{i,n},\qquad W_n-V_{ns}=\sigma_n^{-1}\sum_{i\in D_n}\eta^{s}_{i,n},$$<br />

so that we can exploit Lemma B.1 to prove that<br />

$$W_n=\sigma_n^{-1}\sum_{i\in D_n}Y_{i,n}\Longrightarrow V\sim N(0,1).$$<br />

We first verify condition (iii) of Lemma B.1. By Markov's inequality and (B.11), for every $\epsilon>0$ we have<br />

$$\lim_{s\to\infty}\limsup_{n\to\infty}P(|W_n-V_{ns}|>\epsilon)=\lim_{s\to\infty}\limsup_{n\to\infty}P\Big(\Big|\sigma_n^{-1}\sum_{i\in D_n}\eta^{s}_{i,n}\Big|>\epsilon\Big)\le\lim_{s\to\infty}\limsup_{n\to\infty}\frac{\tilde\sigma_{n,s}^{2}}{\epsilon^{2}\sigma_n^{2}}=0.$$<br />


Next observe that<br />

$$V_{ns}=\frac{\sigma_{n,s}}{\sigma_n}\left[\sigma_{n,s}^{-1}\sum_{i\in D_n}\xi^{s}_{i,n}\right].$$<br />

We proceed to show $W_n\Longrightarrow V$ by contradiction. For that purpose let $\mathcal{M}$ be the set of all probability measures on $(\mathbb{R},\mathcal{B})$, and observe that we can metrize $\mathcal{M}$ by, e.g., the Prokhorov distance $d(\cdot,\cdot)$. Let $\mu_n$ and $\mu$ be the probability measures corresponding to $W_n$ and $V$, respectively, then $W_n\Longrightarrow V$, or $\mu_n\Longrightarrow\mu$, iff $d(\mu_n,\mu)\to 0$ as $n\to\infty$. Now suppose $\mu_n$ does not converge to $\mu$. Then for some $\epsilon>0$ there exists a subsequence $\{n(m)\}$ such that $d(\mu_{n(m)},\mu)>\epsilon$ for all $n(m)$. By (B.13), we have $0\le\sigma_{n,s}/\sigma_n\le C<\infty$ for all $s,n\ge 1$. Hence, $0\le\sigma_{n(m),s}/\sigma_{n(m)}\le C<\infty$ for all $n(m)$. Consequently, for $s=1$ there exists a subsubsequence $\{n(m(l_1))\}$ such that $\sigma_{n(m(l_1)),1}/\sigma_{n(m(l_1))}\to p(1)$ as $l_1\to\infty$. For $s=2$, there exists a subsubsubsequence $\{n(m(l_1(l_2)))\}$ such that $\sigma_{n(m(l_1(l_2))),2}/\sigma_{n(m(l_1(l_2)))}\to p(2)$ as $l_2\to\infty$. The argument can be repeated for $s=3,4,\ldots$ Now construct a subsequence $\{n_l\}$ such that $n_1$ corresponds to the first element of $\{n(m(l_1))\}$, $n_2$ corresponds to the second element of $\{n(m(l_1(l_2)))\}$, and so on, then<br />

$$\lim_{l\to\infty}\frac{\sigma_{n_l,s}}{\sigma_{n_l}}=p(s)\qquad\text{(B.16)}$$<br />

for $s=1,2,\ldots$ Given (B.15), it follows that as $l\to\infty$<br />

$$V_{n_l s}\Longrightarrow V_s\sim N(0,p^{2}(s)).$$<br />

Then, it follows from (B.12) that<br />

$$\lim_{s\to\infty}|p(s)-1|\le\lim_{s\to\infty}\lim_{l\to\infty}\left|p(s)-\frac{\sigma_{n_l,s}}{\sigma_{n_l}}\right|+\lim_{s\to\infty}\sup_{n\ge 1}\left|\frac{\sigma_{n,s}}{\sigma_n}-1\right|=0.$$<br />

Thus $V_s\Longrightarrow V$ and thus by Lemma B.1 $W_{n_l}\Longrightarrow V\sim N(0,1)$ as $l\to\infty$. Since $\{n_l\}\subseteq\{n(m)\}$ this contradicts the assumption that $d(\mu_{n(m)},\mu)>\epsilon$ for all $n(m)$. This completes the proof of the CLT. Q.E.D.<br />

Proof of Corollary 1: To prove the corollary, we apply the Cramér-Wold device, and verify that for every $\lambda\in\mathbb{R}^{k}$ with $|\lambda|=1$, $\sigma_n^{-1}\sum_{i\in D_n}V_{i,n}\Longrightarrow N(0,1)$, where $V_{i,n}=\lambda'Y_{i,n}$. Observe that using the properties of norms, we have<br />

$$|V_{i,n}|=|\lambda'Y_{i,n}|\le|\lambda|\,|Y_{i,n}|=|Y_{i,n}|$$<br />

and<br />

$$\mathbf{1}(|V_{i,n}|>k)=\mathbf{1}(|\lambda'Y_{i,n}|>k)\le\mathbf{1}(|Y_{i,n}|>k),$$<br />

and thus $\lim_{k\to\infty}\sup_{n,\,i\in D_n}E[|V_{i,n}|^{2+\delta}\mathbf{1}(|V_{i,n}|>k)]=0$. Furthermore, observe that<br />

$$\|V_{i,n}-E(V_{i,n}\mid\mathcal{F}_{i,n}(s))\|_2\le|\lambda|\,\|Y_{i,n}-E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))\|_2\le d_{i,n}\psi(s)$$<br />

and that for $\sigma_n^{2}=Var\big(\sum_{i\in D_n}V_{i,n}\big)=\lambda'\Sigma_n\lambda$ we have<br />

$$\inf_n|D_n|^{-1}\sigma_n^{2}=\inf_n|D_n|^{-1}\lambda'\Sigma_n\lambda\ge\inf_n|D_n|^{-1}\lambda_{\min}(\Sigma_n)>0.$$<br />

From this we see that under the maintained assumptions $V_{i,n}$ satisfies all assumptions of the CLT for scalar-valued random fields (Theorem 1) and, therefore, $\sigma_n^{-1}\sum_{i\in D_n}V_{i,n}\Longrightarrow N(0,1)$ as claimed.<br />

Next define $X_{i,n}=V_{i,n}$, then by analogous arguments as above<br />

$$|X_{i,n}|\le|Y_{i,n}|.$$<br />

From the maintained uniform $L_{2+\delta}$ integrability of $|Y_{i,n}|$ it then follows that $\|X\|_{2+\delta}\le\|Y\|_{2+\delta}$, which shows that the $2+\delta$ moments of $X_{i,n}$ can be bounded by a constant that does not depend on $\lambda$. Consequently it follows from the last inequality in the proof of part (a) of Lemma B.3 that<br />

$$Var\Big(\sum_{i\in D_n}X_{i,n}\Big)=\lambda'\Sigma_n\lambda\le C\,|D_n|$$<br />

where $C<\infty$ does not depend on $n$ and $\lambda$. Hence<br />

$$\sup_n|D_n|^{-1}\lambda_{\max}(\Sigma_n)=\sup_n|D_n|^{-1}\sup_{|\lambda|=1}\lambda'\Sigma_n\lambda\le C<\infty.$$<br />

This proves the second claim of the corollary. Q.E.D.<br />
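Both eigenvalue steps in this proof use the fact that, for a unit vector, the quadratic form of a symmetric matrix lies between its smallest and largest eigenvalues. A minimal sketch, assuming NumPy and using a randomly generated positive semidefinite matrix as a hypothetical stand-in for the scaled variance matrix (the function name is illustrative):<br />

```python
import numpy as np

def quadratic_form_bounds(sigma, lam):
    """Return (lambda_min, lam' sigma lam, lambda_max) for symmetric sigma and a unit vector lam."""
    lam = np.asarray(lam, dtype=float)
    lam = lam / np.linalg.norm(lam)       # enforce |lam| = 1
    eigvals = np.linalg.eigvalsh(sigma)   # eigenvalues of a symmetric matrix, ascending
    return float(eigvals[0]), float(lam @ sigma @ lam), float(eigvals[-1])

rng = np.random.default_rng(2)
a = rng.normal(size=(4, 4))
sigma = a @ a.T                           # positive semidefinite stand-in for the variance matrix
low, mid, high = quadratic_form_bounds(sigma, rng.normal(size=4))
```

The middle value always lies between `low` and `high`, which is why a lower bound on the smallest eigenvalue gives the variance condition of Theorem 1 for every direction, and a uniform upper bound on the quadratic form gives the bound on the largest eigenvalue.<br />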

Proof of Theorem 2: To prove the theorem, it suffices to show that $|D_n|^{-1}\sum_{i\in D_n}(Y_{i,n}-EY_{i,n})\xrightarrow{L_1}0$. First we show that for each given $s>0$, the conditional mean $V^{s}_{i,n}=E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))$ satisfies the assumptions of the $L_1$-norm LLN of Jenish and Prucha (2009, Theorem 3). Using the Jensen and Lyapunov inequalities gives for all $s>0$, $i\in D_n$, $n\ge 1$:<br />

$$E|V^{s}_{i,n}|^{p}\le E\{E(|Y_{i,n}|^{p}\mid\mathcal{F}_{i,n}(s))\}\le\sup_{n,\,i\in D_n}E|Y_{i,n}|^{p}<\infty.$$<br />

So, $V^{s}_{i,n}$ is uniformly $L_p$-bounded for $p>1$ and hence uniformly integrable.<br />

For each fixed $s$, $V^{s}_{i,n}$ is a measurable function of $\{\varepsilon_{j,n};\ j\in T_n:\ \rho(i,j)\le s\}$. Observe that under Assumption 1 there exists a finite constant $C$ such that the cardinality of the set $\{j\in T_n:\ \rho(i,j)\le s\}$ is bounded by $Cs^{d}$; compare Lemma A.1 in Jenish and Prucha (2009). Hence,<br />

$$\bar\alpha_{V^{s}}(1,1,r)\le\begin{cases}1, & r\le 2s,\\ \bar\alpha(Cs^{d},Cs^{d},r-2s), & r>2s,\end{cases}$$<br />

and thus in light of Assumption 6(b)<br />

$$\sum_{r=1}^{\infty}r^{d-1}\,\bar\alpha_{V^{s}}(1,1,r)\le\sum_{r=1}^{2s}r^{d-1}+\varphi(Cs^{d},Cs^{d})\sum_{r=1}^{\infty}(r+2s)^{d-1}\hat\alpha(r)<\infty.$$<br />


The above shows that indeed, for each fixed $s$, $V^{s}_{i,n}$ satisfies the assumptions of the $L_1$-norm LLN of Jenish and Prucha (2009, Theorem 3). Therefore, for each $s$, we have<br />

$$\Big\||D_n|^{-1}\sum_{i\in D_n}\big[E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))-EY_{i,n}\big]\Big\|_1\to 0\ \text{ as } n\to\infty.\qquad\text{(B.17)}$$<br />

Furthermore observe that from (??) and the Minkowski inequality<br />

$$\Big\||D_n|^{-1}\sum_{i\in D_n}\big(Y_{i,n}-E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))\big)\Big\|_1\le\psi(s).\qquad\text{(B.18)}$$<br />

Given (B.17) and (B.18), and observing that $\lim_{s\to\infty}\psi(s)=0$, it now follows that<br />

$$\lim_{n\to\infty}\Big\||D_n|^{-1}\sum_{i\in D_n}(Y_{i,n}-EY_{i,n})\Big\|_1=\lim_{s\to\infty}\lim_{n\to\infty}\Big\||D_n|^{-1}\sum_{i\in D_n}(Y_{i,n}-EY_{i,n})\Big\|_1$$<br />
$$\le\lim_{s\to\infty}\limsup_{n\to\infty}\Big\||D_n|^{-1}\sum_{i\in D_n}\big(Y_{i,n}-E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))\big)\Big\|_1+\lim_{s\to\infty}\lim_{n\to\infty}\Big\||D_n|^{-1}\sum_{i\in D_n}\big(E(Y_{i,n}\mid\mathcal{F}_{i,n}(s))-EY_{i,n}\big)\Big\|_1=0.$$<br />

This completes the proof of the LLN. Q.E.D.<br />

C Appendix: Proofs for Section 5<br />

Proof of Theorem 3: We show that<br />

$$\sup_{\theta\in\Theta}\big|Q_n(\theta)-\bar Q_n(\theta)\big|\xrightarrow{p}0\qquad\text{(C.1)}$$<br />

as $n\to\infty$. As discussed in the text, given that the $\theta^{o}_n$ are identifiably unique it then follows immediately from, e.g., Pötscher and Prucha (1997), Lemma 3.1, that $\nu(\hat\theta_n,\theta^{o}_n)\xrightarrow{p}0$ as $n\to\infty$ as claimed.<br />

We start by proving that<br />

$$|D_n|^{-1}\sum_{i\in D_n}\big[q_{i,n}(Z_{i,n},\theta)-Eq_{i,n}(Z_{i,n},\theta)\big]\xrightarrow{p}0\qquad\text{(C.2)}$$<br />

for each $\theta\in\Theta$, by applying the LLN given as Theorem 2 in the text to $q_{i,n}(Z_{i,n},\theta)$. First note that by Assumption 6(a), we have $\sup_{n,\,i\in D_n}E|q_{i,n}(Z_{i,n},\theta)|^{p}<\infty$ for each $\theta\in\Theta$ and $p=2$, which verifies Assumption 4(a) for $q_{i,n}(Z_{i,n},\theta)$ with $c_{i,n}=1$. By Assumption 6(b), the $q_{i,n}(Z_{i,n},\theta)$ are uniformly $L_1$-NED on $\varepsilon$, and hence w.l.g. we can take $d_{i,n}=1$. Furthermore, by Assumption 6(b) the input process $\varepsilon$ is $\alpha$-mixing, and the $\alpha$-mixing coefficients satisfy Assumption 4(b). Consequently (C.2) follows directly from Theorem 2 applied to $q_{i,n}(Z_{i,n},\theta)$.<br />

Next observe that by Proposition 1 of Jenish and Prucha (2009), Assumption 6(c) implies that $q_{i,n}$ is $L_0$ stochastically equicontinuous on $\Theta$, i.e., for every $\epsilon>0$<br />

$$\limsup_{n\to\infty}\frac{1}{|D_n|}\sum_{i\in D_n}P\Big(\sup_{\nu(\theta,\theta')\le\delta}\big|q_{i,n}(Z_{i,n},\theta)-q_{i,n}(Z_{i,n},\theta')\big|>\epsilon\Big)\to 0\ \text{ as }\delta\to 0.$$<br />

Furthermore, in light of Assumption 6(a) the $q_{i,n}(Z_{i,n},\theta)$ clearly satisfy the domination condition postulated by the ULLN in Jenish and Prucha (2009), stated as Theorem 2 in that paper. Given that we have already verified the pointwise LLN in (C.2) it now follows directly from that theorem that<br />

$$\sup_{\theta\in\Theta}|R_n(\theta)-ER_n(\theta)|\xrightarrow{p}0\qquad\text{(C.3)}$$<br />

with $R_n(\theta)=|D_n|^{-1}\sum_{i\in D_n}q_{i,n}(Z_{i,n},\theta)$, and that the $ER_n(\theta)$ are uniformly equicontinuous on $\Theta$ in the sense that<br />

$$\limsup_{n\to\infty}\ \sup_{\theta'\in\Theta}\ \sup_{\nu(\theta,\theta')\le\delta}|ER_n(\theta)-ER_n(\theta')|\to 0\ \text{ as }\delta\to 0.$$<br />

To prove (C.1) observe that<br />

$$\sup_{\theta\in\Theta}\big|Q_n(\theta)-\bar Q_n(\theta)\big|\qquad\text{(C.4)}$$<br />
$$\le\sup_{\theta\in\Theta}\big|R_n(\theta)'PR_n(\theta)-ER_n(\theta)'PER_n(\theta)\big|+\sup_{\theta\in\Theta}\big|R_n(\theta)'(P_n-P)R_n(\theta)\big|$$<br />
$$\le\sup_{\theta\in\Theta}\big|R_n(\theta)'PR_n(\theta)-ER_n(\theta)'PER_n(\theta)\big|+2\sup_{\theta\in\Theta}|R_n(\theta)|^{2}\,|P_n-P|.$$<br />

Furthermore observe that by Assumption 6(a) we have $E\big[\sup_{\theta\in\Theta}|q_{i,n}(Z_{i,n},\theta)|\big]\le K$ and $E\big[\sup_{\theta\in\Theta}|q_{i,n}(Z_{i,n},\theta)|^{2}\big]\le K$ for some finite constant $K$. Thus<br />

$$\sup_{\theta\in\Theta}E|R_n(\theta)|\le E\sup_{\theta\in\Theta}|R_n(\theta)|\le|D_n|^{-1}\sum_{i\in D_n}E\sup_{\theta\in\Theta}|q_{i,n}(Z_{i,n},\theta)|\le K\qquad\text{(C.5)}$$<br />

and<br />

$$E\sup_{\theta\in\Theta}|R_n(\theta)|^{2}\le|D_n|^{-2}\sum_{i,j\in D_n}E\Big[\sup_{\theta\in\Theta}|q_{i,n}(Z_{i,n},\theta)|\ \sup_{\theta\in\Theta}|q_{j,n}(Z_{j,n},\theta)|\Big]\qquad\text{(C.6)}$$<br />
$$\le|D_n|^{-2}\sum_{i,j\in D_n}\Big[E\sup_{\theta\in\Theta}|q_{i,n}(Z_{i,n},\theta)|^{2}\Big]^{1/2}\Big[E\sup_{\theta\in\Theta}|q_{j,n}(Z_{j,n},\theta)|^{2}\Big]^{1/2}\le K.$$<br />

Now consider the first term on the r.h.s. of the last inequality of (C.4). From (C.5) we see that $E|R_n(\theta)|$ takes on its values in a compact set. Given (C.3) it now follows immediately from part (a.I) of Lemma 3.3 of Pötscher and Prucha (1997) that<br />

$$\sup_{\theta\in\Theta}\big|R_n(\theta)'PR_n(\theta)-ER_n(\theta)'PER_n(\theta)\big|\xrightarrow{p}0.\qquad\text{(C.7)}$$<br />

Next we show that the second term on the r.h.s. of the last inequality of (C.4) also converges in probability to zero. To see that this is indeed the case observe that $\sup_{\theta\in\Theta}|R_n(\theta)|^{2}=O_p(1)$ in light of (C.6) and $|P_n-P|\xrightarrow{p}0$ by assumption. This completes the proof of (C.1).<br />

Having established that the $ER_n(\theta)$ are uniformly equicontinuous on $\Theta$, the uniform equicontinuity of $\bar Q_n(\theta)$ on $\Theta$ follows immediately from Lemma 3.3(b) of Pötscher and Prucha (1997). Q.E.D.<br />

Proof of Theorem 4: Clearly by Theorem 3 we have $\hat\theta_n-\theta^{o}_n=o_p(1)$.<br />

Step 1. The estimators $\hat\theta_n$ corresponding to the objective function (23) satisfy the following first order conditions:<br />

$$\nabla_\theta R_n(\hat\theta_n)'P_n\big[\,|D_n|^{1/2}R_n(\hat\theta_n)\big]=o_p(1).\qquad\text{(C.8)}$$<br />

The $o_p(1)$ term on the r.h.s. reflects that the first order conditions may not hold if $\hat\theta_n$ falls onto the boundary of $\Theta$, and that the probability of that event goes to zero as $n\to\infty$, since the $\theta^{o}_n$ are uniformly in the interior of $\Theta$ by Assumption 7(a). If $\hat\theta_n$ is in the interior of $\Theta$, then the l.h.s. of (C.8) is zero. Taking the mean value expansion of $R_n(\hat\theta_n)$ about $\theta^{o}_n$ yields<br />

$$R_n(\hat\theta_n)=R_n(\theta^{o}_n)+\nabla_\theta R_n(\tilde\theta_n)(\hat\theta_n-\theta^{o}_n)\qquad\text{(C.9)}$$<br />

where $\tilde\theta_n\in\Theta$ is between $\hat\theta_n$ and $\theta^{o}_n$ (component-by-component). Let<br />

$$\hat A_n=\nabla_\theta R_n(\hat\theta_n)'P_n\nabla_\theta R_n(\tilde\theta_n)\quad\text{and}\quad\hat B_n=\nabla_\theta R_n(\hat\theta_n)'P_n\big[\,|D_n|^{-1}\Sigma_n\big]^{1/2},$$<br />

then combining (C.8) and (C.9) gives:<br />

$$|D_n|^{1/2}(\hat\theta_n-\theta^{o}_n)\qquad\text{(C.10)}$$<br />
$$=\big[I-\hat A_n^{+}\hat A_n\big]|D_n|^{1/2}(\hat\theta_n-\theta^{o}_n)-\hat A_n^{+}\nabla_\theta R_n(\hat\theta_n)'P_n\big[\,|D_n|^{1/2}R_n(\theta^{o}_n)\big]+\hat A_n^{+}o_p(1)$$<br />
$$=\big[I-\hat A_n^{+}\hat A_n\big]|D_n|^{1/2}(\hat\theta_n-\theta^{o}_n)-\hat A_n^{+}\hat B_n\big[\Sigma_n^{-1/2}|D_n|R_n(\theta^{o}_n)\big]+\hat A_n^{+}o_p(1),$$<br />

where $\hat A_n^{+}$ denotes the generalized inverse of $\hat A_n$.<br />

Step 2. By Assumption 7(c) the $q_{i,n}(Z_{i,n},\theta^{o}_n)$ are uniformly $L_2$-NED and uniformly $L_{2+\delta}$-integrable with $c_{i,n}=1$. Given Assumptions 7(d),(g) it is now readily seen that the process $\{q_{i,n}(Z_{i,n},\theta^{o}_n),\ i\in D_n\}$ satisfies all assumptions of the CLT for vector-valued NED processes, given as Corollary 1 in the text, with $c_{i,n}=1$. (Note that Assumption 3(d) is satisfied automatically since the $q_{i,n}(Z_{i,n},\theta^{o}_n)$ are uniformly $L_2$-NED.) Hence,<br />

$$\Sigma_n^{-1/2}|D_n|R_n(\theta^{o}_n)=\Sigma_n^{-1/2}\sum_{i\in D_n}q_{i,n}(Z_{i,n},\theta^{o}_n)\Longrightarrow N(0,I_{p_q}),\qquad\text{(C.11)}$$<br />

with $\Sigma_n=Var\big(\sum_{i\in D_n}q_{i,n}(Z_{i,n},\theta^{o}_n)\big)$ and $\sup_n\lambda_{\max}\big[\,|D_n|^{-1}\Sigma_n\big]<\infty$.<br />

Step 3. By Assumptions 7(c),(d),(e) the functions $\nabla_\theta q_{i,n}(Z_{i,n},\theta)$ satisfy for each $\theta\in\Theta$ the LLN given as Theorem 2 in the text with $c_{i,n}=1$, observing that Assumption 4(b) is implied by 2. By an argument analogous to that used in the proof of consistency we have<br />

$$|D_n|^{-1}\sum_{i\in D_n}\big(\nabla_\theta q_{i,n}(Z_{i,n},\theta)-E\nabla_\theta q_{i,n}(Z_{i,n},\theta)\big)\xrightarrow{p}0.$$<br />

By Proposition 1 of Jenish and Prucha (2009), Assumption 7(f) implies that the $\nabla_\theta q_{i,n}(Z_{i,n},\theta)$ are uniformly $L_0$-equicontinuous on $\Theta$. Given $L_0$-equicontinuity and Assumption 7(e), we have by the ULLN of Jenish and Prucha (2009), Theorem 2,<br />

$$\sup_{\theta\in\Theta}|\nabla_\theta R_n(\theta)-E\nabla_\theta R_n(\theta)|\xrightarrow{p}0\qquad\text{(C.12)}$$<br />

and furthermore, the $E\nabla_\theta R_n(\theta)$ are uniformly equicontinuous on $\Theta$ in the sense that<br />

$$\limsup_{n\to\infty}\ \sup_{\theta'\in\Theta}\ \sup_{|\theta-\theta'|<\delta}|E\nabla_\theta R_n(\theta)-E\nabla_\theta R_n(\theta')|\to 0\qquad\text{(C.13)}$$<br />

as $\delta\to 0$. In light of (C.12) and (C.13), and given that $\hat\theta_n-\theta^{o}_n=o_p(1)$ and hence $\tilde\theta_n-\theta^{o}_n=o_p(1)$, it follows further that<br />

$$\nabla_\theta R_n(\hat\theta_n)-E\nabla_\theta R_n(\theta^{o}_n)\xrightarrow{p}0,\quad\text{and}\quad\nabla_\theta R_n(\tilde\theta_n)-E\nabla_\theta R_n(\theta^{o}_n)\xrightarrow{p}0.$$<br />

Hence,<br />

$$\hat A_n-A_n\xrightarrow{p}0\quad\text{and}\quad\hat B_n-B_n\xrightarrow{p}0,\qquad\text{(C.14)}$$<br />

where $\hat A_n$ and $\hat B_n$ are as defined above, and<br />

$$A_n=[E\nabla_\theta R_n(\theta^{o}_n)]'P[E\nabla_\theta R_n(\theta^{o}_n)]\quad\text{and}\quad B_n=[E\nabla_\theta R_n(\theta^{o}_n)]'P\big[\,|D_n|^{-1}\Sigma_n\big]^{1/2}.$$<br />

Step 4. Given Assumptions 7(e),(f), and since $P$ is positive definite, we have $|A_n|=O(1)$ and $|A_n^{-1}|=O(1)$, respectively. Hence by, e.g., Lemma F1 in Pötscher and Prucha (1997) we have $\hat A_n=O_p(1)$, $\hat A_n^{+}=O_p(1)$, $\hat A_n$ is nonsingular with probability tending to one, and $\hat A_n^{+}-A_n^{-1}\xrightarrow{p}0$. In light of the above it follows from (C.10) that<br />

$$|D_n|^{1/2}(\hat\theta_n-\theta^{o}_n)=-\hat A_n^{+}\hat B_n\big[\Sigma_n^{-1/2}|D_n|R_n(\theta^{o}_n)\big]+o_p(1)$$<br />
$$=-A_n^{-1}B_n\big[\Sigma_n^{-1/2}|D_n|R_n(\theta^{o}_n)\big]+o_p(1).$$<br />

Recalling that $\sup_n\lambda_{\max}\big[\,|D_n|^{-1}\Sigma_n\big]<\infty$, Assumption 7(e) implies that $|B_n|=O_p(1)$. In light of Assumption 7(g),(h) $B_nB_n'$ is invertible and furthermore $|(B_nB_n')^{-1}|=O(1)$. Thus $\big|\big(A_n^{-1}B_nB_n'A_n^{-1\prime}\big)^{-1}\big|\le|A_n|^{2}\,|(B_nB_n')^{-1}|=O(1)$ and therefore<br />

$$\big(A_n^{-1}B_nB_n'A_n^{-1\prime}\big)^{-1/2}|D_n|^{1/2}(\hat\theta_n-\theta^{o}_n)$$<br />
$$=-\big(A_n^{-1}B_nB_n'A_n^{-1\prime}\big)^{-1/2}A_n^{-1}B_n\big[\Sigma_n^{-1/2}|D_n|R_n(\theta^{o}_n)\big]+o_p(1).$$<br />

The claim that $\big(A_n^{-1}B_nB_n'A_n^{-1\prime}\big)^{-1/2}|D_n|^{1/2}(\hat\theta_n-\theta^{o}_n)\Longrightarrow N(0,I_k)$ now follows, e.g., from Corollary F4(b) in Pötscher and Prucha (1997). Q.E.D.<br />

References<br />

[1] Andrews, D.W.K. (1984): "Non-Strong Mixing Autoregressive Processes," Journal of Applied Probability, 21, 930-934.<br />

[2] Andrews, D.W.K. (1987): "Consistency in Nonlinear Econometric Models: a Generic Uniform Law of Large Numbers," Econometrica, 55, 1465-1471.<br />

[3] Andrews, D.W.K. (1988): "Laws of Large Numbers for Dependent Nonidentically Distributed Variables," Econometric Theory, 4, 458-467.<br />

[4] Andrews, D.W.K. (2005): "Cross-section Regression with Common Shocks," Econometrica, 73, 1551-1585.<br />

[5] Ango Nze, P. and P. Doukhan (2004): "Weak Dependence: Models and Applications to Econometrics," Econometric Theory, 20, 995-1045.<br />

[6] Bai, J. (2003): "Inferential Theory for Factor Models of Large Dimensions," Econometrica, 71, 135-171.<br />

[7] Bai, J. and S. Ng (2004): "A PANIC Attack on Unit Roots and Cointegration," Econometrica, 72, 1127-1177.<br />

[8] Bera, A.K., and P. Simlai: "Testing Spatial Autoregressive Model and a Formulation of Spatial ARCH (SARCH) Model with Applications." Paper presented at the Econometric Society World Congress, London, August 2005.<br />

[9] Bierens, H.J. (1981): Robust methods and asymptotic theory. Berlin: Springer Verlag.<br />

[10] Billingsley, P. (1968): Convergence of Probability Measures. New York: John Wiley and Sons.<br />

[11] Bolthausen, E. (1982): "On the Central Limit Theorem for Stationary Mixing Random Fields," The Annals of Probability, 10, 1047-1050.<br />



[12] Bradley, R. (1993): "Some Examples of Mixing R<strong>and</strong>om Fields," Rocky<br />

Mountain Journal of Mathematics, 23, 2, 495-519.<br />

[13] Brockwell, P., <strong>and</strong> R. Davis (1991): Times Series: Theory <strong>and</strong> Methods,<br />

Springer Verlag.<br />

[14] Bulinskii, A.V. (1989): Limit Theorems <strong>under</strong> Weak Dependence Conditions,<br />

Moscow University Press.<br />

[15] Bulinskii, A.V., <strong>and</strong> P. Doukhan (1990): "Vitesse de convergence dans le<br />

theorem de limite centrale pour des champs melangeants des hypotheses de<br />

moment faibles," C.R. Academie Sci. Paris, Serie I, 801-105.<br />

[16] Chen, X., <strong>and</strong> T. Conley (2001): "A New Semiparametric <strong>Spatial</strong> Model<br />

for Panel Time Series," Journal of Econometrics,105, 59-83.<br />

[17] Cli¤, A., <strong>and</strong> J. Ord (1981): <strong>Spatial</strong> <strong>Processes</strong>, Models <strong>and</strong> Applications.<br />

London: Pion.<br />

[18] Conley, T. (1999): "GMM Estimation with Cross Sectional Dependence,"<br />

Journal of Econometrics, 92, 1-45.<br />

[19] Davidson, J. (1992): "A Central Limit Theorem for Globally Nonstationary<br />

Near-Epoch Dependent Functions of Mixing <strong>Processes</strong>," Econometric<br />

Theory, 8, 313-329.<br />

[20] Davidson, J. (1993): "An L1 Convergence Theorem for Heterogenous<br />

Mixingale Arrays with Trending Moments," Statistics <strong>and</strong> Probability Letters,<br />

8, 313-329.<br />

[21] Davidson, J. (1994): Stochastic Limit Theory. Oxford University Press.<br />

[22] De Jong, R.M. (1997): "Central Limit Theorems for Dependent Heterogeneous<br />

R<strong>and</strong>om Variables," Econometric Theory, 13, 353–367.<br />

[23] Dobrushin, R. (1968): "The Description of a R<strong>and</strong>om Field by its Conditional<br />

Distribution <strong>and</strong> its Regularity Condition," Theory of Probability<br />

<strong>and</strong> its Applications, 13, 197-227.<br />

[24] Doukhan, P. (1994): Mixing. Properties <strong>and</strong> Examples. Springer-Verlag.<br />

[25] Doukhan, P., and X. Guyon (1991): "Mixing for Linear Random Fields," C.R. Académie Sciences Paris, Série I, 313, 465-470.

[26] Doukhan, P., and S. Louhichi (1999): "A New Weak Dependence Condition and Applications to Moment Inequalities," Stochastic Processes and Their Applications, 84, 313-342.

[27] Gallant, A.R. (1987): Nonlinear Statistical Models. New York: Wiley.

[28] Gallant, A.R., and H. White (1988): A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models. New York: Basil Blackwell.

[29] Jenish, N., and I.R. Prucha (2009): "Central Limit Theorems and Uniform Laws of Large Numbers for Arrays of Random Fields," Journal of Econometrics, 150, 86-98.

[30] Ibragimov, I.A. (1962): "Some Limit Theorems for Stationary Processes," Theory of Probability and its Applications, 7, 349-382.

[31] Ibragimov, I.A., and Y.V. Linnik (1971): Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff, Groningen.

[32] Kelejian, H.H., and I.R. Prucha (2004): "Estimation of Simultaneous Systems of Spatially Interrelated Cross Sectional Equations," Journal of Econometrics, 118, 27-50.

[33] Kelejian, H.H., and I.R. Prucha (2007a): "HAC Estimation in a Spatial Framework," Journal of Econometrics, 140, 131-154.

[34] Kelejian, H.H., and I.R. Prucha (2007b): "Specification and Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances," forthcoming in Journal of Econometrics.

[35] Lee, L.-F. (2004): "Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models," Econometrica, 72, 1899-1925.

[36] Lee, L.-F. (2007a): "GMM and 2SLS for Mixed Regressive, Spatial Autoregressive Models," Journal of Econometrics, 137, 489-514.

[37] Lee, L.-F. (2007b): "Method of Elimination and Substitution in the GMM Estimation of Mixed Regressive Spatial Autoregressive Models," forthcoming in Journal of Econometrics.

[38] McLeish, D.L. (1975a): "A Maximal Inequality and Dependent Strong Laws," Annals of Probability, 3, 826-36.

[39] McLeish, D.L. (1975b): "Invariance Principles for Dependent Variables," Z. Wahrsch. verw. Gebiete, 32, 165-78.

[40] Nahapetian, B. (1987): "An Approach to Limit Theorems for Dependent Random Variables," Theory of Probability and its Applications, 32, 589-594.

[41] Neaderhouser, C. (1978): "Limit Theorems for Multiply Indexed Mixing Random Variables," Annals of Probability, 6.

[42] Pinkse, J., L. Shen and M.E. Slade (2007): "A Central Limit Theorem for Endogenous Locations and Complex Spatial Interactions," Journal of Econometrics, 140, 215-225.

[43] Phillips, P.C., and D. Sul (2003): "Dynamic Panel Estimation and Homogeneity Testing Under Cross Section Dependence," The Econometrics Journal, 6, 217-259.

[44] Pötscher, B.M., and I.R. Prucha (1989): "A Uniform Law of Large Numbers for Dependent and Heterogeneous Data Processes," Econometrica, 57, 675-683.

[45] Pötscher, B.M., and I.R. Prucha (1994): "Generic Uniform Convergence and Equicontinuity Concepts for Random Functions," Journal of Econometrics, 60, 23-63.

[46] Pötscher, B.M., and I.R. Prucha (1997): Dynamic Nonlinear Econometric Models. Springer-Verlag, New York.

[47] Robinson, P.M. (2009): "Large-Sample Inference on Spatial Dependence," The Econometrics Journal, Tenth Anniversary Special Issue, 12, S68-S82.

[48] Robinson, P.M. (2007): "Efficient Estimation of the Semiparametric Spatial Autoregressive Model," forthcoming in Journal of Econometrics.

[49] Takahata, H. (1983): "On the Rates in the Central Limit Theorem for Weakly Dependent Random Fields," Z. Wahrsch. verw. Gebiete, 64, 445-456.

[50] Tran, L. (1990): "Kernel Density Estimation on Random Fields," Journal of Multivariate Analysis, 34, 37-53.

[51] Whittle, P. (1954): "On Stationary Processes in the Plane," Biometrika, 41, 434-449.

[52] Wooldridge, J. (1986): "Asymptotic Properties of Econometric Estimators," University of California San Diego, Department of Economics, Ph.D. Dissertation.

[53] Yu, J., R. de Jong, and L.-F. Lee (2008): "Quasi-maximum Likelihood Estimators for Spatial Dynamic Panel Data with Fixed Effects when Both N and T are Large," Journal of Econometrics, 146, 118-134.
