On Spatial Processes and Asymptotic Inference under Near$Epoch ...

On Spatial Processes and Asymptotic Inference 

under Near-Epoch Dependence 

Nazgul Jenish 1 and Ingmar R. Prucha 2 

October 21, 2011 

1 

Department of Economics, New York University, 19 West 4th Street, New York, 

NY 10012. Tel.: 212-998-3891, Email: nazgul.jenish@nyu.edu 

2 

Department of Economics, University of Maryland, College Park, MD 20742. Tel.: 

301-405-3499, Email: prucha@econ.umd.edu

Abstract 

The development of a general inferential theory for nonlinear models with crosssectionally 

or spatially dependent data has been hampered by a lack of appropriate 

limit theorems. To facilitate a general asymptotic inference theory relevant 

to economic applications, this paper …rst extends the notion of near-epoch dependent 

(NED) processes used in the time series literature to random …elds. 

The class of processes that is NED on, say, an -mixing process, is shown to 

be closed under in…nite transformations, and thus accommodates models with 

spatial dynamics. This would generally not be the case for the smaller class of 

-mixing processes. The paper then derives a central limit theorem and law 

of large numbers for NED random …elds. These limit theorems allow for fairly 

general forms of heterogeneity including asymptotically unbounded moments, 

and accommodate arrays of random …elds on unevenly spaced lattices. The 

limit theorems are employed to establish consistency and asymptotic normality 

of GMM estimators. These results provide a basis for inference in a wide range 

of models with cross-sectional or spatial dependence. 

JEL Classi…cation: C10, C21, C31 

Key words: Random …elds, near-epoch dependent processes, central limit 

theorem, law of large numbers, GMM estimator

1 Introduction 1 

Models with cross-sectionally or spatially dependent data have recently attracted 

considerable attention in various …elds of economics including labor and 

public economics, IO, political economy, macroeconomics, international economics, 

regional science and urban economics. In these models cross sectional 

dependence typically stems from agents’strategic interaction, neighborhood effects, 

shared resources and common shocks. Furthermore, economic agents are 

typically heterogeneous, e.g., vary in size. Mathematically, agents’observations 

may thus be viewed as a realization of a dependent heterogenous process indexed 

by a point in R d , d > 1, i.e., a random …eld. 2 

The aim of this paper is to de…ne a class of random …elds that is su¢ - 

ciently general to accommodate many applications of interest, and to establish 

corresponding limit theorems that can be used for asymptotic inference. In 

particular, we apply these limit theorems to prove consistency and asymptotic 

normality of generalized method of moments (GMM) estimators for a general 

class of nonlinear spatial models. 

There have been di¤erent approaches to modeling cross-sectional dependence 

in the econometrics literature. Two large strands of the literature have focused 

on (i) linear spatial autoregressive models, also known as Cli¤-Ord (1981) type 

models 3 , and (ii) common factor models 4 . In both cases, progress was facilitated, 

loosely speaking, by imposing speci…c structural conditions on the data 

generating process, and/or by exploiting some underlying (conditional) independence 

assumptions. Another common approach to model dependence is through 

mixing conditions. Various mixing concepts developed for time series processes 

have been extended to random …elds. However, the respective limit theorems 

for random …elds have not been su¢ ciently general to accommodate many of 

the processes encountered in economics. This hampered the development of a 

general asymptotic inference theory for nonlinear models with cross-sectional 

dependence. Towards …lling this gap, Jenish and Prucha (2009) have recently 

introduced a set of limit theorems (CLT, ULLN, LLN) for -mixing random 

…elds on unevenly spaced lattices that allow for nonstationary processes with 

trending moments. 

Yet some dependent spatial processes do not satisfy mixing conditions. As 

with time series, spatial processes may be generated as functions of in…nitely 

many elements of an input process, with a spatial moving average process being 

1 We thank the participants of the Cowles Foundation Conference, Yale, June 2009, as well 

as the seminar participants at the Columbia University for helpful discussions. This research 

bene…tted from a University of Maryland Ann G. Wylie Dissertation Fellowship for the …rst 

author and from …nancial support from the National Institute of Health through SBIR grant 

1 R43 AG027622 for the second author. 

2 The space and metric are not restricted to physical space and distance. 

3 For recent contributions see, e.g., Robinson (2009, 2007), Yu, de Jong and Lee (2008), 

Kelejian and Prucha (2007a,b,2004), Lee (2007a,b, 2004), and Chen and Conley (2001). 

4 For recent contributions, see, e.g., Andrews (2005), Bai and Ng (2004), Bai (2003), Phillips 

and Sul (2003). 

1

a leading example. As is well-known, the mixing property is not necessarily 

preserved under transformations involving in…nite number of lags. For instance, 

Andrews (1984) shows that a simple time series AR(1) process fails to be - 

mixing though its input process is independent. Thus, it is important to develop 

an asymptotic theory for a generalized class of random …elds that is “closed with 

respect to in…nite transformations.” 

To tackle this problem, the paper …rst extends the concept of near-epoch dependent 

(NED) processes used in the time series literature to spatial processes. 

The notion dates back to Ibragimov (1962), and Billingsley (1968). The NED 

concept, or variants thereof, have been used extensively in the time series literature 

by McLeish (1975a,b), Bierens (1981), Wooldridge (1986), Gallant (1987), 

Gallant and White (1988), Andrews (1988), Pötscher and Prucha (1991,1997), 

Davidson (1992, 1993, 1994) and de Jong (1997), among others. Doukhan and 

Louhichi (1999), Ango Nze and Doukhan (2004) introduced an alternative class 

of dependent processes called “ -weakly dependent.”We provide various examples 

suggesting that the class of NED spatial processes, which subsumes mixing 

processes, is su¢ ciently broad to cover many applications of interest. For 

instance, we show that it covers Cli¤-Ord type processes, autoregressive and 

moving average random …elds, and, more generally, nonlinear spatial Bernoulli 

shifts. 

The paper then derives a CLT and an LLN for spatial processes that are near 

epoch dependent on an -mixing input process. These limit theorems allow 

for fairly general forms of heterogeneity including asymptotically unbounded 

moments, and accommodate arrays of random …elds on unevenly spaced lattices. 

The LLN can be combined with the generic ULLN in Jenish and Prucha (2009) 

to obtain an ULLN for NED spatial processes. In the time series literature, 

CLTs for NED processes were derived by Wooldridge (1986), Davidson (1992, 

1993), and de Jong (1997). Interestingly, our CLT contains as a special case the 

CLT of Wooldridge (1986), Theorem 3.13 and Corollary 4.4. 

In addition, we give conditions under which the NED property is preserved 

under transformations. These results play a key role in verifying the NED property 

in applications. Thus, the NED property is compatible with considerable 

heterogeneity and dependence, invariant under transformations, and leads to 

a CLT and LLN under fairly general conditions. All these features make it a 

convenient tool for modeling spatial dependence. 

As an application, we establish consistency and asymptotic normality of 

spatial GMM estimators. These results provide a fundamental basis for constructing 

con…dence intervals and testing hypothesis for GMM estimators in 

nonlinear spatial models. Our results also expand on Conley (1999), who established 

the asymptotic properties of GMM estimators assuming that the data 

generating process and the moment functions are stationary and -mixing. 5 

The rest of the paper is organized as follows. Section 2 introduces the concept 

of NED spatial processes and gives some mild conditions that ensure preserva- 

5 This important early contribution employs Bolthausen’s (1982) CLT for stationary - 

mixing random …elds on the regular lattice Z 2 . However, the mixing and stationarity assumptions 

may not hold in many applications. The present paper relaxes these critical assumptions. 

2

tion of the NED property under transformations. Section 3 provides examples 

of NED spatial processes. Section 4 contains the CLT and LLN for NED spatial 

processes. Section 5 establishes the asymptotic properties of spatial GMM 

estimators. All proofs are collected in the appendices. 

2 De…nition of NED Spatial Processes 

Let D R d , d 1, be a (possibly) unevenly spaced lattice of locations in R d , 

and let Z = fZi;n; i 2 Dn; n 1g and " = f"i;n; i 2 Tn; n 1g be triangular 

arrays of random …elds de…ned on a probability space ( ; F; P ) with Dn Tn 

D. The space R d is equipped with the metric (i; j) = max1 l d jjl ilj, where 

il is the l-th component of i. The distance between any subsets U; V D is 

de…ned as (U; V ) = inf f (i; j) : i 2 U and j 2 V g. Furthermore, let jUj denote 

the cardinality of a …nite subset U D. 

The random variables Zi;n and "i;n are possibly vector-valued taking their 

values in R pz and R p" . We assume that R pz and R p" are normed metric spaces 

equipped with the Euclidian norm, which we denote (in an obvious misuse of 

notation) as j:j. For any random vector Y , let kY k p = [E jY j p ] 1=p , p 1; 

denote its Lp-norm. Finally, let Fi;n(s) = ("j;n; j 2 Tn : (i; j) s) be the 

-…eld generated by the random vectors "j;n located in the s-neighborhood of 

location i. 

Throughout the paper, we maintain the above notational conventions as well 

as the following assumption concerning D. 

Assumption 1 The lattice D R d , d 1, is in…nite countable. All elements 

in D are located at distances of at least 0 > 0 from each other, i.e., for all i; 

j 2 D : (i; j) 0; w.l.o.g. we assume that 0 > 1. 

The assumption of a minimum distance has also been used by Conley (1999) 

and Jenish and Prucha (2009). It ensures the growth of the sample size as the 

sample regions Dn and Tn expand. The setup is thus geared towards what is 

referred to in the spatial literature as increasing domain asymptotics. 

We now introduce the notion of near-epoch dependent (NED) random …elds. 

De…nition 1 Let Z = fZi;n; i 2 Dn; n 1g be a random …eld with kZi;nk p < 

1, p 1, let " = f"i;n; i 2 Tn; n 1g be a random …eld, where jTnj ! 1 as 

n ! 1, and let d = fdi;n; i 2 Dn; n 1g be an array of …nite positive constants. 

Then the random …eld Z is said to be Lp(d)-near-epoch dependent on the random 

…eld " if 

kZi;n E(Zi;njFi;n(s))k p di;n (s) (1) 

3

for some sequence (s) 0 with lims!1 (s) = 0. The (s), which are w.l.o.g. 

assumed to be non-increasing, are called the NED coe¢ cients, and the di;n are 

called NED scaling factors. Z is said to be Lp-NED on " of size if (s) = 

O(s ) for some > > 0. Furthermore, if sup n sup i2Dn di;n < 1, then Z is 

said to be uniformly Lp-NED on ": 

Recall that Dn Tn. Typically, Tn will be an in…nite subset of D, and 

often Tn = D. However, to cover Cli¤-Ord type processes, see Section 3.3, Tn 

is allowed to depend on n and to be …nite provided that it increases in size with 

n. 

The role of the scaling factors fdi;ng is to allow for the the possibility of 

“unbounded moments”, i.e., sup n sup i2Dn di;n = 1. Unbounded moments may 

re‡ect trends in the moments in certain directions, in which case we may also 

use, as in the time series literature, the terminology of “trending moments”. The 

NED property is thus compatible with a considerable amount of heterogeneity. 

In establishing limit theorems for NED processes, we will have to impose restrictions 

on the scaling factors di;n. In this respect, observe that 

kZi;n E(Zi;njFi;n(s))k p kZi;nk p + kE(Zi;njFi;n(s))k p 2 kZi;nk p 

by the Minkowski and the conditional Jensen inequalities. Given this, we may 

choose di;n 2 kZi;nk p , and consequently w.l.o.g. 0 (s) 1; see, e.g., 

Davidson (1994), p. 262, for a corresponding discussion within the context of 

time series processes. Note that by the Lyapunov inequality, if Zi;n is Lp-NED, 

then it is also Lq-NED with the same coe¢ cients fdi;ng and f (s)g for any 

q p. 

Our de…nition of NED for spatial processes is adapted from the de…nition of 

NED for time series processes. In the time series literature, the NED concept 

…rst appeared in the works of Ibragimov (1962) and Billingsley (1968), although 

they did use the present term. The concept of time series NED processes was 

later formalized by McLeish (1975a,b), Wooldridge (1986), Gallant (1987), Gallant 

and White (1988). These authors considered only L2-NED processes. Andrews 

(1988) generalized it to Lp-NED processes for p 1: Davidson (1992, 

1993, 1994) and de Jong (1997) further extended it to allow for trending time 

series processes. 

An important motivation for considering NED processes is that mixing is 

not necessarily preserved under transformations involving in…nitely many arguments; 

see, e.g., Andrews (1984) for simple examples in a time series context. 

However, as illustrated below, for a wide range of models the output process 

is obtained as a function of in…nitely many input variables. In those situations 

mixing of the input process does not necessarily carry over to the output process, 

and thus limit theorems for averages of the output process cannot simply be established 

from limit theorems for mixing processes. Nevertheless, as with time 

series processes, we show below that limit theorems can be extended to spatial 

processes that are NED on a mixing input process, provided the approxima- 

4

tion error declines “su¢ ciently fast” as the conditioning set of input variables 

expands. 

A further attractive feature of NED processes is that the NED property is 

preserved under transformations. Econometric estimators are usually de…ned 

either explicitly as functions of some underlying data generating processes or 

implicitly as optimizers of a function of the data generating process. Thus, if 

the data generating process is NED on some input process, the question arises 

under what conditions functions of random …elds are also NED on the same 

input process. 

Various conditions that ensure preservation of the NED property under 

transformations have been established in the time series literature by Gallant 

and White (1988), and Davidson (1994). In fact, these results extend to random 

…elds. In particular, the NED property is preserved under summation and multiplication, 

and carries over from a random vector to its components and vice 

versa. For future reference, we now state some results for generalized classes 

of nonlinear functions. Their proofs are analogous to those in the time series 

literature, and therefore omitted. 

Consider transformations of Zi;n given by a family of functions 

gi;n : R pz ! R: 

The functions gi;n are assumed Borel-measurable for all n and i 2 D. They 

are furthermore assumed to satisfy the following Lipschitz condition: For all 

(z; z ) 2 R pz R pz and all i 2 Dn and n 1: 

where 

jgi;n(z) gi;n(z )j Bi;n(z; z ) jz z j (2) 

Bi;n(z; z ) : R pz R pz ! R+; 

is Borel-measurable. Of course, this condition would be devoid of meaning without 

further restrictions on Bi;n(z; z ), which are given in the next propositions. 

Proposition 1 Suppose gi;n( ) satis…es Lipschitz condition (2) with 

jBi;n(z; z )j C < 1 

for all (z; z ) 2 R pz R pz and all i and n. If for p 1 the fZi;ng are Lp-NED 

of size on f"i;ng with scaling factors fdi;ng, then gi;n(Zi;n) is also Lp-NED 

of size on f"i;ng with scaling factors f2Cdi;ng. 

Proposition 2 Suppose gi;n( ) satis…es Lipschitz condition (2) with 

sup 

s 

B (s) 

i;n 2 < 1 and sup 

s 

5 

B (s) 

i;n Zi;n 

eZ s i;n r < 1 (3)

for some r > 2; where B (s) 

i;n = Bi;n(Zi;n; e Z s i;n ) and e Z s i;n = E [Zi;njFi;n(s)]. If 

kgi;n(Zi;n)k 2 < 1 and Zi;n is L2-NED of size on f"i;ng with scaling factors 

fdi;ng ; then gi;n(Zi;n) is L2-NED of size (r 2)=(2r 2) on f"i;ng with scaling 

factors d0 2)=(2r 

i;n = d(r i;n 

2) 

sups B (s) 

i;n 

(r 

2 

2)=(2r 2) 

B (s) 

i;n Zi;n 

eZ s i;n 

r=(2r 

r 

2) 

Thus, the NED property is hereditary under reasonably weak conditions. 

These conditions facilitate veri…cation of the NED property in practical application. 

In particular, we will use them in the proof of asymptotic normality of 

spatial GMM estimators in Section 5. 

3 Examples of NED Spatial Processes 

In this section, we give examples of some important classes of spatial processes, 

which are widely used in applications, and show that they satisfy the NED 

property. Of course, to establish limit theorems the NED property would be 

accompanied by mixing assumptions on the input process. 

3.1 Moving Average Random Fields 

Consider the in…nite moving average (or linear) random …eld on Zd , say, Y = 

fYi; i 2 Zdg de…ned as 

Yi = X 

gij"j; (4) 

j2Z d 

where " = f"i; i 2 Zdg is a zero mean vector-valued random …eld on Zd and the 

gij are some real numbers. We assume furthermore that the coe¢ cients gij and 

the innovations "i satisfy, respectively, 

and 

lim 

s!1 sup 

X 

i2Zd j2Zd ; (i;j)>s 

jgijj = 0; (5) 

sup 

i2Zd k"ikp < 1 where p 1: (6) 

Linear stationary random …elds on Z 2 have been explored by Whittle (1954) 

in a paper that signi…cantly in‡uenced the subsequent modeling of spatial dependencies. 

More recently, linear random …elds on Z d , d 1, have been studied 

by Doukhan and Guyon (1991), see also Doukhan (1994). Analogous to the 

time series literature, Yi is de…ned as the limit in Lp as s ! 1 of the partial 

sums Y s 

i 

= P 

j2Z d ; (i;j) s gij"j. The following existence results is proven by 

Doukhan (1994), pp. 75-81. 6 

6 In fact, Doukhan (1994) proves a more general result that also covers the case p < 1. 

6 

:

Proposition 3 (Doukhan, 1994) Under conditions (5)-(6) the distribution of 

linear random …eld Y in (4) is well de…ned. In particular, the …nite dimensional 

distributions of Y are limits in Lp as s ! 1 of the those of the random …elds 

Y s = fY s 

i ; i 2 Zd g. 

We next give a result that shows that under the same conditions, the linear 

…eld Y is Lp-NED on the …eld ". 

Proposition 4 Under conditions (5)-(6) the linear random …eld Y in (4) is 

uniformly Lp-NED on ": 

The nice feature of this result is that it does not impose any additional 

conditions over and above those employed in Proposition 3 to establish the 

existence of the linear …eld. 

3.2 Autoregressive Random Fields 

The linear random …elds of the previous example may arise as solutions of 

autoregressive spatial models. For example, Whittle (1954) considered the following 

simple autoregressive model on Z 2 : 

Yi1;i2 = Yi1+1;i2 + Yi1;i2+1 + "i1;i2 (7) 

where j j + j j < 1 and " = f"i1;i2 ; (i1; i2) 2 Z 2 g are i.i.d. random variables 

with zero mean and …nite variance. Whittle (1954), p. 447, gives the following 

the stationary solution of (7): 

Yi1;i2 = 

= 

1X 

1X 

j1=0 j2=0 

1X 

1X 

j1=i1 j2=i2 

j1 + j2 

Thus, (8) is a linear random …eld with 

j1 

j1 i1 + j2 i2 

j1 i1 

gij = j1 i1 + j2 i2 

j1 i1 

j1 j2 "i1+j1;i2+j2 (8) 

j1 i1 j2 i2 "j1;j2 

j1 i1 j2 i2 : 

for j = (j1; j2) i = (i1; i2) and zero, otherwise. Furthermore, note that 

the assumptions of Proposition 4 are satis…ed for p = 2, since P 

P 1 

k=0 

P k 

l=0 

k 

l j j l j j k l = P1 k=0 (j j + j j)k < 1, and hence 

0 lim 

s!1 sup 

X 

i2Z2 j i; (i;j)>s 

observing that j j + j j < 1. 

jgijj lim 

7 

s!1 

k=s 

1X 

(j j + j j) k = 0; 

j i jgijj =

3.3 Cli¤-Ord Type Spatial Processes 

Consider the following Cli¤-Ord type model: 

Yn = WnYn + Xn + un (9) 

un = Mnun + vn 

where Yn = (Y1;n; :::; Yn;n) 0 is an n 1 vector of endogenous variables, Xn = 

(Xij;n) is n k matrix of regressors, Wn = (wij;n) and Mn = (mij;n) are n n 

nonstochastic weight matrices, vn = (v1;n; : : : ; vn;n) 0 is an n 1 vector of disturbances, 

and are scalar parameters typically referred to as spatial autoregressive 

parameters, and is a k 1 parameter vector. To simplify presentation, we 

assume in the following that k = 1 and use the notation Xn = (X1;n; :::; Xn;n) 0 . 

The above model is frequently referred to as a spatial ARAR(1,1) model, and 

WnYn and Mnun are referred to as spatial lags. For recent contributions on 

estimation strategies for this model see, e.g., Robinson (2009, 2007), Kelejian 

and Prucha (2007a,b, 2004), and Lee (2007a,b, 2004). 

The spatial weights mij;n and wij;n are typically thought of as to depend 

on some measure of distance, say dij, between units i and j, where the weights 

decline as the distance increases. Thus although in Cli¤-Ord models observations 

are indexed by natural numbers, i.e., i = 1; : : : ; n, a leading case would 

be where i correspond to some location, say si 2 R d , and the distance between 

i and j is given by dij = (si; sj). In the following, we assume that 

Dn = fs1; : : : ; sng D, where the lattice D satis…es Assumption 1. However, 

to simplify notation we will use (i; j) instead of (si; sj). We assume 

furthermore that for all n 

In Wn and In Mn are nonsingular. (10) 

The reduced form of the model is then given by Yn = AnXn + Bnvn:with 

An = (aij;n) = (In Wn) 1 and Bn = (bij;n) = (In Wn) 1 (In Mn) 1 . 

In scalar notation we have for i = 1; :::; n: 

Yi;n = 

nX 

j=1 

aij;nXj;n + 

nX 

j=1 

bij;nvj;n; 

Although for …xed n; the output process Yi;n depends on only a …nite number of 

elements of the input process "i;n = (Xi;n; vi;n) 0 , the mixing property of "i;n may 

not carry over to Yi;n: The reason is that the number of elements composing the 

spatial lags grows unboundedly with the sample size so that the mixing property 

can break down in the limit. This is especially important when analyzing the 

asymptotic properties of Cli¤-Ord type processes. 

Towards establishing that Y = fYi;n; si 2 Dn; n 1g is NED on " = 

f"i;n; si 2 Dn; n 1g, we maintain the following assumptions: 

lim 

s!1 sup sup 

n 

X 

jaij;nj + jbij;nj = 0 (11) 

1 i n 

1 j n: (i;j)>s 

8


sup 

n 

sup 

1 i n 

We have the following result. 

k"i;nk p < 1 for some p 1: (12) 

Proposition 5 Under conditions (10)-(12), the process Y = fYi;n; si 2 Dn; n 

1g, where Yi;n is de…ned by (9), is uniformly Lp-NED on the process " = 

f"i;n; si 2 Dn; n 1g, where "i;n = (Xi;n; vi;n) 0 : 

since 

It is readily seen that a su¢ cient condition for (11) is that for some > 0, 

sup 

n 

sup 

n 

sup 

n 

sup 

sup 

1 i n 

j=1 

nX 

(jaij;nj + jbij;nj) (i; j) < 1 (13) 

X 

jaij;nj + jbij;nj 

1 i n 

1 j n; (i;j)>s 

sup 

X 

(jaij;nj + jbij;nj) 

1 i n 

1 j n; (i;j)>s 

1 

s sup sup 

n 

1 i n 

j=1 

(i; j) 

s 

nX 

(jaij;nj + jbij;nj) (i; j) ! 0 as s ! 1: 

A condition analogous to (13) has been used recently by Kelejian and Prucha 

(2007a), and should be satis…ed in a wide range of applications. It is slightly 

stronger than the typical assumption in the Cli¤-Ord literature which maintains 

assumptions that imply that the row and column sums of the absolute elements 

of the matrices An and Bn are uniformly bounded; compare, e.g., Kelejian and 

Prucha (2007b) and Lee (2007a,b, 2004). 

3.4 Spatial Bernoulli Shifts 

Consider a real-valued random …eld Y = fYi; i 2 Z d g de…ned as: 

Yi = fi("i+j; j 2 Z d ) (14) 

where " = f"i; i 2 Z d g is a real-valued random …eld and the fi : R Zd 

! R 

are measurable functions. The process (14) is a generalization of one-parameter 

Bernoulli shifts considered by Doukhan and Louhichi (1999). A simple example 

of a spatial Bernoulli shift is the in…nite moving average random …eld of the 

…rst example. More generally, the fi may be nonlinear functions. We assume 

that the fi satisfy the following Lipschitz-type regularity condition: 

fi(xj; j 2 Z d ) fi(yj; j 2 Z d ) X 

wij jxj yjj (15) 

9 

j2Z d

for some nonnegative constants wij such that 

lim 

s!1 sup 

X 

wij = 0: (16) 

i2Zd j2Zd : (i;j)>s 

Finally, we assume that the input process " = f"i; i 2 Z d g has uniformly 

bounded second moments, i.e., 

Then, one can establish the following result. 

sup 

i2Zd k"ik2 < 1 (17) 

Proposition 6 Under conditions (15)-(17), the random …eld Y = fYi; i 2 Z d g 

given by (14) is well-de…ned and uniformly L2-NED on the random …eld " = 

f"i; i 2 Z d g. 

Conditions (15)-(17) are analogous to those used by Doukhan and Louhichi 

(1999) for time-series Bernoulli shifts. Condition (15) is ful…lled if the functions 

fi have bounded partial derivatives, i.e., j@fi=@xjj wij. 

Thus, the class of NED spatial processes covers fairly large classes of linear 

and nonlinear transformations of random …elds. 

4 Limit Theorems 

4.1 Central Limit Theorem 

In the following, we introduce our CLT for real valued random …elds Y = 

fYi;n; i 2 Dn; n 1g, where Z is assumed to be L2-NED on some vector-valued 

-mixing random …eld " = f"i;n; i 2 Tn; n 1g with the NED coe¢ cients 

f (s)g, where Dn Tn D and the lattice D satis…es Assumption 1. In the 

sequel, we will use the following notation: 

Sn = X 

i2Dn 

Yi;n; 

2 

n = var(Sn): 

For ease of reference, we state below the de…nition of the -mixing coe¢ cients 

employed in the paper. 

De…nition 2 Let A and B be two -algebras of F, and let 

(A; B) = sup(jP (AB) P (A)P (B)j; A 2 A; B 2 B); 

For U Dn and V Dn, let n(U) = ("i;n; i 2 U) and n(U; V ) = 

( n(U); n(V )). Then, the -mixing coe¢ cients for the random …eld " are 

de…ned as: 

(u; v; r) = sup 

n 

sup( 

n(U; V ); jUj u; jV j v; (U; V ) r): 

U;V 

10

Dobrushin (1968) showed that weak dependence conditions based on the 

above mixing coe¢ cients are satis…ed by broad classes of random …elds including 

Markov …elds. In contrast to standard mixing numbers for time-series processes, 

the mixing coe¢ cients for random …elds depend not only on the distance between 

two datasets but also their sizes. To explicitly account for such dependence, it 

is furthermore assumed that 

(u; v; r) '(u; v)b(r) (18) 

where the function '(u; v) is nondecreasing in each argument, and b(r) ! 0 

as r ! 1. The idea is to account separately for the two di¤erent aspects of 

dependence: (i) decay of dependence with the distance, and (ii) accumulation of 

dependence as the sample region expands. The two common choices of '(u; v) 

in the random …elds literature are 


'(u; v) = (u + v) ; 0; (19) 

'(u; v) = min fu; vg : (20) 

The above mixing conditions have been used extensively in the random …elds 

literature including Neaderhouser (1978), Takahata (1983), Nahapetian (1987), 

Bulinskii (1989), Bulinskii and Doukhan (1990), Tran (1990) and Bradley (1993). 

They are satis…ed by fairly large classes of random …elds. Bradley (1993) provides 

examples of random …elds satisfying conditions (18)-(19) with u = v and 

= 1. Furthermore, Bulinskii (1989) constructs in…nite moving average random 

…elds satisfying the same conditions with = 1 for any given decay rate of coe¢ 

cients b(r): Clearly, standard mixing coe¢ cients in the time series literature 

are covered by conditions (18)-(19) when = 0. 

Following the literature, we employ the above mixing conditions for the 

input random …eld, and impose the following restriction on the decay rates of 

the mixing coe¢ cients. 

Assumption 2 The -mixing coe¢ cients of " satisfy (18) and (19) for some 

0 and b(r), such that for some > 0 

where = =(2 + ). 

1X 

r=1 

r d( +1) 1 b 2(2+ ) (r) < 1; (21) 

Note that if = 1 this assumption also covers the case where '(u; v) is given 

by (20). Furthermore, the CLT relies on the following conditions regarding 

moments, the NED coe¢ cients and the NED scaling factors. 

Assumption 3 (a) (Uniform L2+ integrability) 

lim 

k!1 sup sup E[jYi;nj 2+ 1(jYi;nj > k)] = 0; 

n 

i2Dn 

where 1( ) is the indicator function and > 0 is as in Assumption 2. 

11

(b) infn jDnj 1 2 n > 0, where fDng is a sequence of …nite subsets of D such 

that jDnj ! 1 as n ! 1: 

(c) NED coe¢ cients satisfy P 1 

r=1 rd 1 (r) < 1: 

We can now state the following CLT for L2-NED random …elds. 

Theorem 1 Suppose fDng is a sequence of …nite subsets such that jDnj ! 1 

as n ! 1 and fTng is a sequence of subsets such that Dn Tn D of the 

lattice D satisfying Assumption 1: Let Y = fYi;n; i 2 Dn; n 1g be a real valued 

zero-mean random …eld that is L2-NED on a vector-valued -mixing random 

…eld " = f"i;n; i 2 Tn; n 1g. Suppose Assumptions 2 and 3 hold, then 

1 

n Sn =) N(0; 1): 

The above CLT contains the CLT for -mixing random …elds given in Jenish 

and Prucha (2009) as a special case. As shown earlier, the class of NED random 

…elds is an important extension of the class of -mixing random …elds, in that the 

class includes important and widely considered …elds, e.g., spatial autoregressive 

processes, that do not generally satisfy mixing conditions. It is for that reason 

that an estimation theory limited to the class of -mixing random …elds is not 

fully satisfactory; see, e.g., Gallant and White (1988) and Pötscher and Prucha 

(1997) for an analogous discussion in the time series literature. We note that 

Theorem 1 also contains as a special case the CLT for time series NED processes 

of Wooldridge (1986), see Theorem 3.13 and Corollary 4.4. 

Assumption 2 restricts the dependence structure of the input process ": 

Assumptions 3(a),(b) are standard in the limit theory of mixing processes, 

e.g., Wooldridge (1986), Davidson (1992), and de Jong (1997), and Jenish and 

Prucha (2009). 

Assumption 3(a) is satis…ed if the Yi;n are uniformly Lp-bounded for some 

p > 2 + , i.e., sup n sup i2Dn kYi;nk p < 1. Assumption 3(b) is an asymptotic 

negligibility condition that ensures that no single summand in‡uences disproportionately 

the entire sum. Assumption 3(c) controls the size of the NED 

coe¢ cients which measure the error in the approximation of Yi;n by ". Intuitively, 

the approximation errors have to decline su¢ ciently fast with each 

successive approximation. Assumption 3(c) is satis…ed if (r) = O(r d ) for 

some > 0; i.e., (r) is of size d. 

Theorem 1 can be easily extended to vector-valued …elds using the standard 

Cramér-Wold device. 

Corollary 1 Suppose fDng is a sequence of …nite subsets such that jDnj ! 1 

as n ! 1 and fTng is a sequence of subsets such that Dn Tn D of the 

lattice D satisfying Assumption 1: Let Y = fYi;n; i 2 Dn; n 1g with Yi;n 2 

12

R k be a zero-mean random …eld that is L2-NED on a vector-valued -mixing 

random …eld " = f"i;n; i 2 Tn; n 1g. Suppose Assumptions 2 and 3 hold with 

jYi;nj denoting the Euclidean norm of Yi;n and 2 n replaced by min( n), where 

n = V ar(Sn) and min(:) is the smallest eigenvalue, then 

1=2 

n Sn =) N(0; Ik): 

Furthermore, sup n jDnj 1 max ( n) < 1, where max(:) denotes the largest 

eigenvalue. 

4.2 Law of Large Numbers 

To prove consistency of spatial estimators, we also derive a weak LLN for random 

…elds which are for L1-NED on some mixing …eld. This LLN can then be 

used to establish uniform convergence of random functions by combining it 

with the generic ULLN given in Jenish and Prucha (2009), which transforms 

pointwise LLNs (at a given parameter value) into ULLNs. The LLN maintains 

the following moment and mixing assumptions. 

Assumption 4 (a) sup n sup i2Dn E jYi;nj p < 1:; p > 1: 

(b) The -mixing coe¢ cients of the input …eld " satisfy (18) for some function 

'(u; v) which is nondecreasing in each argument, and some b(r) such that 

P 1 

r=1 rd 1 b (r) < 1. 

We now have the following weak LLN. 

Theorem 2 Let fDng be a sequence of arbitrary …nite subsets of D such that 

jDnj ! 1 as n ! 1; where D Rd , d 1 is as in Assumption 1, and let 

Tn be a sequence of subsets of D such that Dn Tn. Suppose further that 

Y = fYi;n; i 2 Dn; n 1g is L1-NED on " = f"i;n; i 2 Tn; n 1g. If Y and " 

satisfy Assumption 4, then 

1 

jDnj 

X 

i2Dn 

(Yi;n EYi;n) L1 

! 0; 

We note that Y and " can be vector-valued random …elds. In this case, jYi;nj 

should be understood as the Euclidean norm in the respective vector-space. 

Assumption 4(a) is a standard moment condition employed in weak LLNs for 

dependent processes. It requires existence of moments of order slightly greater 

than 1. Assumption 4(b) is weaker than the mixing conditions used for the CLT. 

Furthermore, the LLN does not require any restrictions on the NED coe¢ cients. 

In the time series literature, weak LLNs for NED processes have been obtained 

by Andrews (1988) and Davidson (1993), among others. Andrews (1988) 

derives an L1-law for triangular arrays of L1-mixingales. He then shows that 

NED processes are L1-mixingales, and hence, satisfy his LLN. Davidson (1993) 

extends the latter result to processes with trending moments. 

13

5 Large Sample Properties of Spatial GMM Estimators 

In this section, we apply the above developed limit theorems to establish the 

large sample properties of spatial GMM estimators under a reasonably general 

set of assumptions that should cover a wide range of empirical problems. More 

speci…cally, our consistency and asymptotic normality results (i) maintain only 

that the spatial data process is NED on an -mixing basis process to accommodate 

spatial lags in the data process as discussed above, (ii) allow for the data 

process to be located on an unevenly spaced grid, and (iii) allow for the data 

process to be non-stationary, which will frequently be the case in empirical applications. 

We also give our results under a set of primitive su¢ cient conditions 

for easier interpretation by the applied researcher. 7 

We continue with the basic set-up of Section 2. Consider the moment function 

qi;n : R pz ! R pq , where denotes the parameter space, and let 0 n 2 

denote the parameter vector of interest (which we allow to depend on n for reasons 

of generality). Suppose the following moment conditions hold 

Eqi;n(Zi;n; 0 n) = 0; (22) 

then the corresponding spatial GMM estimator is de…ned as 

where Qn: ! R, 

b n = arg min 

2 Qn(!; ); (23) 

Qn(!; ) = Rn( ) 0 PnRn( ); 

Rn( ) = jDnj 

X 

1 

qi;n(Zi;n; ); 

i2Dn 

where the Pn are a positive de…nite weighting matrices. To prove that b n is 

0 

a consistent estimator for n consider the following non-stochastic analogue of 

Qn, say 

Qn( ) = [ERn( )] 0 P [ERn( )] ; (24) 

where P denotes the probability limit of Pn. Then given the moment condition 

(22) clearly Rn( o o 

n) = 0, and the functions Qn are minimized at n with 

Qn( o n) = 0. In proving consistency, we follow the classical approach; see, e.g., 

7 In an important contribution, Conley (1999) gives a …rst set of results regarding the 

asymptotic properties of GMM estimators under the assumption that the data process is stationary 

and -mixing. Conley also maintains some high level assumption such as …rst moment 

continuity of the moment function, which in turn immediately implies uniform convergence 

- see, e.g., Pötscher and Prucha (1989) for a discussion. Our results extend Conley (1999) 

in several important directions, as indicated above. We establish uniform convergence from 

primitive su¢ cient conditions via the generic uniform law of large numbers given in Jenish 

and Prucha (2009) and the law of large numbers given as Theorem 2 above. 

14

Gallant and White (1988) or Pötscher and Prucha (1997) for more recent expo- 

o 

sitions. In particular, given identi…able uniqueness of n we establish, loosely 

speaking, convergence of the minimizers b o 

n to the minimizers n by establishing 

convergence of the objective function Qn(!; ) to its non-stochastic analogue 

Qn( ) uniformly over the parameter space. 

Throughout the sequel, we maintain the following assumptions regarding the 

parameter space, the GMM objective function and the unknown parameters 0 n. 

Assumption 5 (a) The parameter space is a compact metric space with 

metric . 

(b) The functions qi;n : R pz ! R pq are B pz =B pq -measurable for each 

2 , and continuous on for each z 2 R pz : 

(c) The elements of the pq pq real matrices Pn are B-measurable, and Pn is 

positive de…nite a.s. Furthermore P = p lim Pn exists and P is positive 

de…nite. 

(d) The minimizers o n are identi…ably unique in the sense that for every " > 0, 

lim infn!1 inf 2 : ( ; o n ) " [ERn( )] 0 [ERn( )] > 0: 

Compactness of the parameter space as maintained in Assumption 5(a) is 

typical for the GMM literature. Assumptions 5(b),(c) imply that Qn( ; ) is 

measurable for all 2 , and Qn(!; ) is continuous on . Given those assumptions 

the existence of measurable functions b n that solves (23) follows, e.g., from 

Lemma A3 of Pötscher and Prucha (1997). 

Since P is positive de…nite, it is readily seen that Assumption 5(d) implies 

that for every " > 0 : 

lim inf 

n!1 

inf 

2 : ( ; o n ) " Qn( ) Qn( o n) > 0; 

observing that Qn( o n) = 0. Thus under Assumption 5(d) the minimizers 

are identi…ably unique; compare, e.g., Gallant and White (1988), p.19. For 

interpretation consider the important special case where o n = o , Rn( ) = R( ) 

o 

and R( ) is continuous. In this case identi…able uniqueness of is equivalent 

to the assumption that 

o 

is the unique solution of the moment conditions, i.e., 

Rn( ) 6= 0 for all 6= o ; compare, e.g., Pötscher and Prucha (1997), p. 16. 

5.1 Consistency 

o 

Given the minimizers n are identi…ably unique, b n is a consistent estimator 

o 

p 

for n if Qn converges uniformly to Qn, i.e., if sup 2 Qn(!; ) Qn( ) ! 0 

as n ! 1; this follows immediately from, e.g, Pötscher and Prucha (1997), 

Lemma 3.1. 

We now proceed by giving a set of primitive domination and Lipschitz type 

conditions for the moment functions that ensure uniform convergence of Qn to 

15 

o 

n

Q n. The conditions are in line with those maintained in the general literature 

on M-estimation, e.g., Andrews (1987), Gallant and White (1988), and Pötscher 

and Prucha (1989,1994). 

De…nition 3 Let fi;n : R pz ! R pq be B pz =B pq -measurable functions for 

each 2 , then: 

(a) The random functions fi;n(Zi;n; ) are said to be p-dominated on for 

some p > 1 if sup n sup i2Dn E sup 2 jfi;n(Zi;n; )j p < 1. 

(b) The random functions fi;n(Zi;n; ) are said to be Lipschitz in the parameter 

on if 

jfi;n(Zi;n; ) fi;n(Zi;n; )j Li;n(Zi;n)h( ( ; )) a.s., (25) 

for all ; 2 and i 2 Dn; n 1, where h is a nonrandom function 

with h(x) # 0 as x # 0, and Li;n are random variables with 

< 1 for some > 0: 

lim sup n!1 jDnj 1 P 

i2Dn EL i;n 

Towards establishing consistency of b n we furthermore maintain the following 

moment and mixing assumptions. 

Assumption 6 The moment functions qi;n(Zi;n; ) have the following properties: 

(a) They are p-dominated on for p = 2. 

(b) They are uniformly L1-NED on " = f"i;n; i 2 Tn; n 1g, where Dn Tn 

D, and " is -mixing with -mixing coe¢ cients the conditions stated in 

Assumption 4(b). 

(c) They are Lipschitz in the parameter on . 

Assumption 6(a) implies that sup n;i2Dn E jqi;n(Zi;n; )j p < 1 for each 2 

. Assumptions 6(b) then allow us to apply the LLN given as Theorem 2 above 

the sample moments Rn( ) = jDnj 1 P 

i2Dn qi;n(Zi;n; ). 

To verify Assumption 6(b) one can use either Proposition 1 or Proposition 

2 to imply this condition from the lower level assumption that the data Zi;n are 

L1-NED. For example, the qi;n are L1-NED, if the Zi;n are L1-NED and satisfy 

the Lipschitz condition of Proposition 1. Note that no restrictions on the sizes 

of the NED coe¢ cients are required. 

Assumption 6(c) ensures stochastic equicontinuity of qi;n w.r.t. . Stochastic 

equicontinuity jointly with Assumption 6(a) and the pointwise LLN enable us 

to invoke the ULLN of Jenish and Prucha (2009) to prove uniform convergence 

of the sample moments, which in turn is used to establish that Qn converges 

16

uniformly to Q n. A su¢ cient condition for Assumption 6(c) is existence of 

integrable partial derivatives of qi;n w.r.t. if 2 R k . 

Our consistency results for the spatial GMM estimator given by (23) is summarized 

by the next theorem. 

Theorem 3 (Consistency) Suppose fDng is a sequence of …nite sets of D such 

that jDnj ! 1 as n ! 1; where D R d ; d 1 is as in Assumption 1. Suppose 

further that Assumptions 5 and 6 hold. Then 

( b n; o n) p ! 0 as n ! 1; 

and Q n( ) is uniformly equicontinuous on . 

5.2 Asymptotic Normality 

We next establish that the spatial GMM estimators de…ned by (23) is asymptotically 

normally distributed. For that purpose, we need a stronger set of assumptions 

than for consistency, including di¤erentiability of the moment functions in 

. It proofs helpful to adopt the notation r in place of @=@ . 8 

o 

Assumption 7 (a) The minimizers n lie uniformly in the interior of with 

Rk . Furthermore E [Rn( o n)] = 0. 

(b) The functions qi;n : R pz ! R pq are continuously di¤erentiable w.r.t. 

in the interior of for each z 2 R pz : 

(c) The functions qi;n(Zi;n; o n) are uniformly L2-NED on " of size d, and 

supn;i2Dn E jqi;n(Zi;n; o 0 

2+ 

n)j < 1 for some 

r qi;n(Zi;n; ) are uniformly L1-NED on ". 

0 

> 0. The functions 

(d) The input process " = f"i;n; i 2 Tn; n 1g, where Dn Tn D, is - 

mixing and the mixing coe¢ cients satisfy Assumption 2 for some < 0 , 

where 0 is the same as in Assumption 7(c). 

(e) The functions r qi;n are p-dominated on for some p > 1: 

(f) The functions r qi;n are Lipschitz in on . 

(g) infn min(jDnj 1 n) > 0 where n = V ar P 

(h) infn min [Er Rn( o n) 0 r Rn( o n)] > 0. 

i2Dn qi;n(Zi;n; o n) . 

8 To ensure that the derivatives are de…ned on the border of , we assume in the following 

that the moment functions are de…ned on an open set containing , and that the qi;n and 

r qi;n are restrictions to . 

17

The …rst part of Assumption 7(a) is needed to ensure that the estimator 

b n lies in the interior of with probability tending to one, and facilitates the 

application of the mean value theorem to Rn( b n) around o n. The second part 

states in essence that the moment conditions are correctly speci…ed. Its violation 

will generally invalidate the limiting distribution result. 

Assumptions 7(c),(d),(g) enable us to apply the CLT for vector-valued NED 

processes given above as Corollary 1 to Rn( o n). Some low level su¢ cient conditions 

for Assumption 7(c) are given below. To establish asymptotic normality, 

we also need uniform convergence of r Rn on , which is implied via Assumptions 

7(c),(d),(e),(f). 

Given the above assumptions, we have the following asymptotic normality 

result for the spatial GMM estimator de…ned by(23). 

Theorem 4 Suppose fDng is a sequence of …nite sets of D such that jDnj ! 1 

as n ! 1; where D R d ; d 1 is as in Assumption 1. Suppose further that 

Assumptions 5- 7 hold. Then 

where 

A 1 

n BnB 0 nA 10 

n 

1=2 jDnj 1=2 b n 

o 

n =) N(0; Ik); 

An = [Er Rn( o n)] 0 P [Er Rn( o n)] and Bn = [Er Rn( o n)] 0 P 

h 

jDnj 1 i1=2 n : 

Moreover, jAnj = O(1); A 1 

n = O(1); jBnj = O(1); (BnB 0 n) 1 = O(1) and 

hence, b n is jDnj 1=2 -consistent for 

o 

n. 

As remarked above, relative to the existing literature Theorem 4 allows for 

nonstationary processes and only assumes that qi;n and r qi;n are NED on 

an -mixing input process, rather than postulating that qi;n and r qi;n are - 

mixing. The latter assumption is restrictive in that the -mixing property is not 

preserved under transformation. Since the NED property is preserved under a 

wide range of transformations it accommodates a much larger class of dependent 

spatial processes, including in…nite moving average and spatial autoregressive 

models. As such, Theorem 4 should provide a basis for constructing con…dence 

intervals and hypothesis testing in a wider range of spatial models. 

Using Proposition 2, we now give some su¢ cient conditions for Assumption 

7(c). 

Assumption 8 The process fZi;n; i 2 Dn Tn; n 1g is uniformly L2-NED 

on f"i;n; i 2 Tn; n 1g of size 2d(r 1)=(r 2) for some r > 2: 

Assumption 9 For every sequence f ng on , the functions qi;n(Zi;n; n) and 

r qi;n(Zi;n; n) satisfy Lipschitz condition (2) in z, that is, for gi;n = qi;n or 

r qi;n: 

jgi;n(z; n) gi;n(z ; n)j Bi;n(z; z ) jz z j : 

18

Furthermore, for the r > 2 as speci…ed in Assumption 8, 

sup sup 

n;i2Dn s 

B (s) 

i;n < 1 and sup sup 

2 

n;i2Dn s 

B (s) 

i;n Zi;n 

where B (s) 

i;n = Bi;n(Zi;n; e Zs i;n ) with e Zs i;n = E [Zi;njFi;n(s)]. 

6 Conclusion 

eZ s i;n r 

< 1 

The paper develops an asymptotic inference theory for a class of dependent 

nonstationary random …elds that could be used in a wide range of econometric 

models with cross-sectional or spatial dependence, e.g., social interaction models. 

More speci…cally, the paper extends the notion of near-epoch dependent 

(NED) processes used in the time series literature to spatial processes. This 

allows to accommodate larger classes of dependent processes than mixing random 

…elds. The class of NED random …elds is “closed with respect to in…nite 

transformations ” and thus should be su¢ ciently broad for many applications 

of interest. In particular, it covers autoregressive and in…nite moving average 

random …elds as well as nonlinear spatial Bernoulli shifts. The NED property 

is also compatible with considerable heterogeneity and preserved under transformations 

under fairly mild conditions. Furthermore, a CLT and an LLN are 

derived for spatial processes that are NED on an -mixing process. Apart from 

covering a larger class of dependent processes, these limit theorems also allow 

for arrays of nonstationary random …elds on unevenly spaced lattices. Building 

on these limit results, the paper develops an asymptotic theory of spatial GMM 

estimators, which provides a basis for inference in a broad range of models with 

cross-sectional or spatial dependence. 

Much of the random …elds literature assumes that the process resides on an 

equally spaced grid. In contrast, and as in Jenish and Prucha (2009), we allow 

for locations to be unequally spaced. The implicit assumption of …xed locations 

seems reasonable for a large class of applications, especially in the short run. 

Still, an important direction for future work would be to extend the asymptotic 

theory to spatial processes with endogenous locations, while maintaining a set 

of assumptions that are reasonably easy to interpret. 9 One possible approach 

may be to augment the contributions of the present paper with theory from 

point processes. 

A Appendix: Proofs for Section 3 

Proof of Proposition 4: Let Fi(s) = ("j; j 2 Z d : (i; j) s). By the 

Minkowski inequality for in…nite sums, the conditional Jensen inequality, and 

9 Pinkse et al. (2007) made an interesting contribution in this direction. Their catalogue 

of assumption is at the level of Bernstein blocks. Without further su¢ cient conditions, veri- 

…cation of those assumptions would typically be challenging in practical situations. 

19

(6), we have 

kYi E[YijFi(s)]k p = X 

X 

j2Z d : (i;j)>s 

j2Z d ; (i;j)>s 

jgijj k"j E ["jjFi(s)]k p K (s): 

gij f"j E ["jjFi(s)]g 

P 

with K = 2 supj2Zd k"jkp < 1 and (s) = supi2Zd j2Zd : (i;j)>s jgijj. Since 

(s) ! 0 as s ! 1 by condition (5) we have kYi E[YijFi(s)]kp ! 0 as s ! 1, 

which proves the claim. Q.E.D. 

Proof of Proposition 5: Let Fi;n(s) = ("j;n; 1 j n: (i; j) s). By the 

Minkowski inequality, the conditional Jensen inequality, and (12), we have 

X 

kYi;n E[Yi;njFi;n(s)]k p 

+ 

X 

1 j n; (i;j)>s 

1 j n; (i;j)>s 

jbij;nj kvj;n E[vj;njFi;n(s)]k p K (s) 

jaij;nj kXj;n E[Xj;njFi;n(s)]k p 

P 

with K = maxf2 kXj;nkp ; 2 kvj;nkpg < 1 and (s) = supn;1 i n 1 j n; (i;j)>s (jaij;nj + jbij;nj). 

Since (s) ! 0 as s ! 1 by condition (11) we have kYi E[YijFi(s)]kp ! 0 as 

s ! 1, which proves the claim. Q.E.D. 

Proof of Proposition 6: We …rst show that Yi is well de…ned as the limit in 

L2 as s ! 1 of the random variables 

where 

Y s 

i = fi(" (s) 

i+j ; j 2 Zd ); 

" (s) 

i+j = "i+j for (i; j) s 

0 for (i; j) > s : 

In light of condition (15) we have for any s, m 2 N: 

Y s+m 

i 

Y s 

i 2 

= fi(" (s+m) 

i+j ) fi(" (s) 

i+j ) 2 

sup 

i2Zd k"ik2 

X 

j2Z d :s< (i;j) s+m 

X 

j2Z d :s< (i;j) s+m 

wij K sup 

p 

wij k"i+jk 2 

X 

i2Zd j2Zd wij 

: (i;j)>s 

with K = supi2Zd k"ik2. In light of (16)-(17) the r.h.s. converges monotonically 

to zero as s ! 1. Thus clearly Y s 

i is a Cauchy series in L2, and hence Yi is 

well de…ned since the L2 space is a Banach space and as such complete. 

Now, let Fi(s) = ("i+j; j 2 Zd : (i; j) s). By the minimum mean-squared 

error property of the conditional expectation, we have 

20

with 

kYi E[YijFi(s)]k 2 fi("i+j; Z d ) fi(" (s) 

i+j ; j 2 Zd ) 2 

(s) = sup 

K 

X 

j2Z d : (i;j)>s 

X 

i2Zd j2Zd : (i;j)>s 

wij K (s); 

wij ! 0 

as s ! 1 by condition (16). This completes the proof of the lemma. Q.E.D. 

B Appendix: Proofs for Section 4 

Throughout this appendix let Fi;n(s) = ("j;n; j 2 Tn : (i; j) s) be the 

-…eld generated by the random vectors "j;n located in the s-neighborhood of 

location i. Furthermore, C denotes a generic constant that does not depend on 

n and may be di¤erent from line to line. 

The proof of the CLT builds on Ibragimov and Linnik (1971), pp. 352-355, 

and makes use of the following lemmata: 

Lemma B.1 (Brockwell and Davis (1991), Proposition 6.3.9). Let Yn, n = 

1; 2; ::: and Vns; s = 1; 2; :::; n = 1; 2; :::, be random vectors such that 

(i) Vns =) Vs as n ! 1 for each s = 1; 2; ::: 

(ii) Vs =) V as s ! 1, and 

(iii) lims!1 lim sup n!1 P (jYn Vnsj > ) = 0 for every > 0: 

Then Yn =) V as n ! 1: 

Lemma B.2 (Ibragimov and Linnik (1971)) Let Lp(F1) and Lp(F2) denote, 

respectively, the class of F1-measurable and F2-measurable random variables 

satisfying k k p < 1. Let X 2 Lp(F1) and Y 2 Lq(F2): Then, for any 1 

p; q; r < 1 such that p 1 + q 1 + r 1 = 1, 

jCov(X; Y )j < 4 1=r (F1; F2) kXk p kY k q 

where (F1; F2) = sup A2F1;B2F2 (jP (AB) P (A)P (B)j). 

To prove the CLT for NED random …elds, we …rst establish some moment 

inequalities and a slightly modi…ed version of the CLT for mixing …elds 

developed in Jenish and Prucha (2009). It is helpful to introduce the following 

notation. Let X = fXi;n; i 2 Dn; n 1g be a random …eld, then 

kXk q := sup n;i2Dn kXi;nk q for q 1. 

21

Lemma B.3 Let fXi;ng be uniformly L2-NED on a random …eld f"i;ng with 

-mixing coe¢ cients (u; v; r) (u + v) b(r), 0. Let Sn = P 

i2Dn Xi;n 

and suppose that the NED coe¢ cients of fXi;ng satisfy P1 r=1 rd 1 (r) < 1 

and kXk2+ < 1 for some > 0. Then, 

(a) 

(b) 

jCov (Xi;nXj;n)j kXk2+ 

n 

C1kXk2+ [h=3] d 

b =(2+ ) o 

([h=3]) + C2 ([h=3]) 

where h = (i; j) and = =(2+ ). If in addition P 1 

r=1 rd( +1) 1 b =(2+ ) (r) < 

1, then for some constant C < 1, not depending on n 

jCov (Xi;nXj;n)j kXk2 

V ar (Sn) C jDnj : 

n 

C3kXk2+ [h=3] d 

b =(4+2 ) o 

([h=3]) + C4 ([h=3]) 

where h = (i; j) and = =(4+2 ). If, in addition, P 1 

r=1 rd( +1) 1 b =(4+2 ) (r) < 

1 where = =(4 + 2 ), then for some constant C < 1, not depending 

on n 

V ar (Sn) CkXk2 jDnj : 

Proof of Lemma B.3: (a) For any i 2 Dn and m > 0, let 

m 

i;n = E(Xi;njFi;n(m)); 

m 

i;n = Xi;n 

By the Jensen and Lyapunov inequalities, we have for all i 2 Dn; n; m 2 N 

and any 1 q 2 + : 

E m i;n 

and thus 

m 

i;n q 

m 

i;n 

q = EfjE(Xi;njFi;n(m))j q g EfE(jXi;nj q jFi;n(m))g = E jXi;nj q 

kXi;nk q kXk 2+ ; 

m 

i;n q 

2 kXi;nk q 2 kXk 2+ : 

Thus, both m i;n and m i;n are uniformly L2+ bounded. Also, note that 

sup 

n;i2Dn 

m 

i;n 2 

(m); 

given that the fXi;ng is uniformly L2-NED on f"i;ng and thus the NED-scaling 

factors can be chosen w.l.g. to be one. Furthermore, let ( m i;n) denote the 

-…eld generated by m i;n: Since ( m i;n) Fi;n(m), the mixing coe¢ cients of m i;n 

satisfy 

(1; 1; r) 

1; r 2m 

(Mm d ; Mm d ; r 2m); r > 2m 

22

where (u; v; r) are the mixing coe¢ cients of the input process ", since the mneighborhood 

of any point on D contains at most Mm d points of D for some M 

that does not depend on m, see the proof of Lemma A.1 of Jenish and Prucha 

(2009). 

Now, decompose Xi;n and and Xj;n as 

Xi;n = [h=3] 

i;n 

where h = (i; j). Then, 

jCov (Xi;nXj;n)j = Cov 

[h=3] 

+ i;n ; and Xj;n = [h=3] [h=3] 

j;n + j;n ; 

Cov 

+ Cov 

[h=3] 

i;n 

[h=3] 

i;n ; 

[h=3] 

j;n ; 

+ [h=3] 

i;n ; 

[h=3] 

j;n 

[h=3] 

i;n 

[h=3] 

j;n 

+ Cov 

+ Cov 

+ [h=3] 

j;n 

[h=3] 

i;n ; 

[h=3] 

i;n ; 

[h=3] 

j;n 

[h=3] 

j;n 

: 

(B.1) 

We will now bound separately each term on the r.h.s. of the last inequality. 

First, using Lemma B.2 with p = q = 2 + ; and r = (2 + )= yields the 

following bound on the …rst term: 

Cov 

[h=3] 

i;n ; 

[h=3] 

j;n 

4 

[h=3] 

i;n 

4kXk 2 2+ 

2+ 

C1kXk 2 2+ [h=3] d 

[h=3] 

j;n 

2+ 

=(2+ ) (1; 1; [h=3]) (B.2) 

=(2+ ) M [h=3] d ; M [h=3] d ; h 2 [h=3] 

b =(2+ ) ([h=3]) 

where = =(2 + ). 

Second, the Cauchy-Schwartz inequality gives the following bound on the 

second and third terms: 

Cov 

[h=3] 

i;n ; 

[h=3] 

j;n 

4 

[h=3] 

i;n 

2 

[h=3] 

j;n 

Similarly, the fourth term can be bounded as: 

Cov 

[h=3] 

i;n ; 

[h=3] 

j;n 

4 

[h=3] 

i;n 

2 

[h=3] 

j;n 

Collecting (B.2)-(B.4), we have 

n 

jCov (Xi;nXj;n)j kXk2+ C1kXk2+ [h=3] d 

2 

2 

4kXk2 ([h=3]) (B.3) 

8kXk2 ([h=3]) (B.4) 

b =(2+ ) o 

([h=3]) + C2 ([h=3]) 

which proves the …rst inequality. 

Using this inequality as well as the bounds on the sizes of the sets given in 

23

Lemma A.1 of Jenish and Prucha (2009), we have 

V ar (Sn) 

X 

V ar (Xi;n) + 

X 

i2Dn 

i;j2Dn;i6=j 

2jDnj kXk 2 

2+ + C1kXk 2 2+ 

+C2kXk2+ 

X 

1X 

jCov (Xi;nXj;n)j 

X 

1X 

i2Dn r=1 j2Dn: (i;j)2[r;r+1) 

X 

i2Dn r=1 j2Dn: (i;j)2[r;r+1) 

X 

2jDnj kXk 2 

2+ + C jDnj kXk2+ 

for some constant C < 1, not depending on n. 

r=1 

([ (i; j)=3]) 

[ (i; j)=3] d 

b =(2+ ) ([ (i; j)=3]) 

" 

1X 

r d( +1) 1 b =(2+ ) 1X 

(r) + r d 1 # 

(r) 

(b) To prove the second part of the lemma, apply Lemma B.2 with p = 2+ ; 

q = 2, and r = 2(2 + )= to obtain the following bound on the …rst term: 

Cov 

[h=3] 

i;n ; 

[h=3] 

j;n 

C3kXk2+ kXk2 [h=3] d 

r=1 

b =(4+2 ) ([h=3]) ; (B.5) 

where = =(4 + 2 ). The other terms on the r.h.s. of (B.1) are bounded as 

in part (a). Collecting (B.3)-(B.5) gives 

jCov (Xi;nXj;n)j 

n 

kXk2 C3kXk2+ [h=3] d 

b =(4+2 ) o 

([h=3]) + C4 ([h=3]) 

as required. Finally, using similar arguments as in the proof of part (a), we can 

bound V ar (Sn) as 

V ar (Sn) CkXk2jDnj 

for some constant C < 1, not depending on n. Q.E.D. 

Theorem B.1 Suppose fDng is a sequence of …nite subsets of D, satisfying 

Assumption 1, with jDnj ! 1 as n ! 1: Suppose further that f"i;n; i 2 Dn; n 2 

Ng is an array of zero-mean random variables with -coe¢ cients (u; v; r) 

C(u + v) b(r) for some constants C < 1 and 0. Suppose for some > 0 

and > 0 


lim 

k!1 sup E[j"i;nj 

n;i2Dn 

2+ 1(j"i;nj > k)] = 0 

b(r) = O(r 

d(2 +1) 

with = max f ; 1= g, and suppose lim infn!1 jDnj 1 2 n > 0, then 

where 2 n = V ar P 

1 

n 

i2Dn "i;n . 

X 

"i;n =) N(0; 1): 

i2Dn 

24 

) 

CjDnj

The above CLT is in essence a variant of CLT for -mixing random …elds 

given as Corollary 1 of Theorem 1 in Jenish and Prucha (2009), applied to 

mixing coe¢ cients of the type (u; v; r) C(u + v) b(r); 0. 

Proof of Theorem B.1: The proof of the theorem is largely the same as the 

proof of Theorem 1 in Jenish and Prucha (2009). We will …rst show that all 

assumptions of that theorem except Assumption 3(c) are satis…ed. We will then 

show that the entire proof goes through if Assumption 3(c) is replaced by the 

condition b(r) = O(r d(2 +1) ) with = max f ; 1= g. 

Clearly, Assumptions 1, 2 and 5 of that theorem with ci;n = 1 are satis…ed. 

To verify Assumptions 3(a) note that 

1X 

(1; 1; r)r [d(2+ )= ] 1 1X 

1X 

2d( 1= ) 1 

C r C r 1 < 1; 

r=1 

r=1 

since = max f ; 1= g. As shown in the proof of Corollary 1 in Jenish and 

Prucha (2009), the latter condition implies Assumption 3(a) of Theorem 1 in 

Jenish and Prucha (2009). Furthermore, it is easy to see that Assumption 3 (b) 

is also satis…ed. Indeed, for any u + v 4, we have 

1X 

d 

(u; v; r)r 

1 

1X 

4 C r d 1 b(r) 

1X 

C r d 1 r d(2 +1) < 1: 

r=1 

r=1 

Thus, all assumptions, except Assumption 3(c), of Theorem 1 in Jenish and 

Prucha (2009) hold, and hence, all steps of its proof which do not rely on that 

assumptions remain valid in our case. Assumption 3(c) is only used in step 5 

of that proof. Speci…cally, all arguments in that step continue to hold given we 

show that there exists sequence mn such that 


r=1 

r=1 

m d njDnj 1=2 ! 0 as n ! 1 (B.6) 

(1; jDnj ; mn)jDnj 1=2 ! 0 as n ! 1 (B.7) 

(Though 1;1 is used instead of 1;jDnj in the proof of Theorem 1 of Jenish and 

Prucha (2009), in fact the proof only relies on the coe¢ cient 1;jDnj; see Step 9 

(jEA3;nj ! 0) of the proof in the working version of the paper). 

The desired sequence mn can be chosen as 

It is immediate that (B.6) holds, 

To verify (B.7), observe 

(1; jDnj ; mn)jDnj 1=2 

mn = jDnj 1=2 = log jDnj 1=d 

: 

m d njDnj 1=2 = (log jDnj) 1 ! 0: 

CjDnj +1=2 m 

CjDnj +1=2 jDnj 

CjDnj 

d(2 +1) 

n 

1=2 jDnj 

=(2d) [log jDnj] (2 +1)+ =d ! 0: 

25 

=(2d) (2 +1)+ =d 

[log jDnj]

The rest of the proof is the same, word-by-word, as the proof of Theorem 1 of 

Jenish and Prucha (2009). Q.E.D. 

Proof of Theorem 1: Since the proof is lengthy it is broken into steps. 

1. Decomposition of Yi;n 

For any …xed s > 0, decompose Xi;n as 

where 

Let 

Yi;n = s i;n + s i;n 

s 

i;n = E(Yi;njFi;n(s)), 

Sn;s = X 

i2Dn 

s 

i;n = Yi;n 

s 

i;n; e Sn;s = X 

i2Dn 

s 

i;n 

s 

i;n 

2 

n;s = V ar [Sn;s] ; e 2 

h i 

n;s = V ar eSn;s 

Repeated use of the Minkowski inequality yields: 

Observe that 

and hence 

j n n;sj en;s, j n en;sj n;s: (B.8) 

E [E(Yi;njFi;n(s))jFi;n(m))] = 

E(Yi;njFi;n(s)); m s; 

E(Yi;njFi;n(m)); m < s: 

s 

i;n E( s i;njFi;n(m)) 

2 

= kYi;n E[Yi;njFi;n(s)] E[Yi;njFi;n(m)] + E[(Yi;njFi;n(s))jFi;n(m)]k2 = 

kYi;n E(Yi;njFi;n(m))k2 C (m); if m s; 

kYi;n E(Yi;njFi;n(s))k2 C (s) C (m); if m < s: 

since by de…nition the sequence (m) is non-increasing. Thus, for any …xed 

s > 0, s i;n is uniformly L2-NED on " with the same NED coe¢ cients (m) 

as the random …eld fYi;ng. Furthermore, as shown in the proof of Lemma B.3, 

s 

i;n is also uniformly L2+ bounded. 

2. Bounds for the Variances of P Yi;n and P s i;n 

First note that exists 0 

BjDnj 

2 

n: (B.9) 

Furthermore, in light of Assumption 2, and observing that = =(4 + 2 ) 

= =(2 + ) and b =(2+ ) (r) b 2(2+ ) (r) we have 

1X 

r d( +1) 1 b 

r=1 

=(2+ ) 

1X 

r d( +1) 1 b =(4+2 ) (r) 

r=1 

26 

1X 

r=1 

1X 

r=1 

r d( +1) 1 b 2(2+ ) (r) < 1; 

r d( +1) 1 b 2(2+ ) (r) < 1:

Using part (a) of Lemma B.3 with Xi;n = Yi;n, we have 

BjDnj 

2 

n = V ar (Sn) C jDnj . 

for some B > 0. Using part (b) of Lemma B.3 with Xi;n = s i;n we have 

Hence, 

e 2 

n;s = V ar e Sn;s C jDnj k s i;nk2 = CjDnj (s) (B.10) 

lim lim sup 

s!1 n!1 

Furthermore, by (B.8) we have 

lim lim sup 

s!1 n!1 

and hence for all s 1 and n 1 

1 

e 2 

n;s 

2 

n 

n;s 

n 

n;s 

n 

C lim 

s!1 

(s) = 0: (B.11) 

en;s 

lim lim sup = 0 (B.12) 

s!1 n!1 n 

C < 1: (B.13) 

3. CLT for P s 

i2Dn i;n 

We now show that for any …xed s > 0, s i;n satis…es the CLT given in Theorem 

B.1. 

h 

si;n 2+ 

First, since supn;i2Dn E 

i 

< 1, the process is uniformly 

L2+ 

0-integrable for 

Second, note that since 

and r > 2s 

0 

= =2, i.e., 

lim 

k!1 sup E[ 

n;i2Dn 

s 2+ =2 

i;n 1( 

s 

i;n > k)] = 0: 

s 

i;n 

s 

i;n is a measurable function of "i;n for any u; v 2 N 

(u; v; r) (uMs d ; vMs d ; r 2s) C (u + v) b (r 2s) 

We next to show that b(r) = O(r d(2 +1) ) for = max f ; 2= g and some 

> 0. By assumption, 

1X 

r=1 

where = =(2 + ), which implies 

r d( +1) 1 b 2(2+ ) (r) < 1; 

b (r) = o(r 2d(2+ )( +1)= ) = o(r d[2( +2= )+1] d ) = o(r d[2 +1] d ) 

since +2= for = max f ; 2= g. Thus, b(r) = O(r d(2 +1) ) for = d. 

27

We next show that for su¢ ciently large s; 

By (B.9), 

0 < lim inf 

n!1 jDnj 1 2 n;s: 

B 1=2 

inf jDnj 1=2 n 

Since lims!1 (s) = 0; there exists s such that in light of (B.10) for all s s , 

Hence by (B.8) for all s s ; 

jDnj 1=2 en;s C 1=2 (s) B 1=2 =2: (B.14) 

jDnj 1=2 ( n en;s) jDnj 1=2 n;s 

and thus infn jDnj 1=2 n;s infn jDnj 1=2 n sup n jDnj 1=2 en;s. Using (??) 

and (B.14), we have 

Thus, for all s s , 

lim inf 

n!1 jDnj 1=2 1=2 B1=2 B1=2 

n;s B = > 0 

2 2 

1 

n;s 

X 

i2Dn 

s 

i;n =) N(0; 1) as n ! 1: (B.15) 

Since the …rst s terms do not a¤ect the analysis below we take in the following 

s = 1: 

P 1 

n i2Dn Yi;n 

4. CLT for 

Finally, using Lemma B.1 we now show that, given the maintained NED 

assumption, the just established CLT in (B.15) for the approximators 

be carried over to the the Yi;n. De…ne 

s 

i;n can 

X 

X 

X 

Wn = 

1 

n 

i2Dn 

Yi;n; Vns = 1 

n 

i2Dn 

s 

i;n, Wn Vns = 1 

n 

so that we can exploit Lemma B.1 to prove that 

X 

Wn = Yi;n =) V N(0; 1): 

1 

n 

i2Dn 

i2Dn 

We …rst verify condition (iii) of Lemma B.1. By Markov’s inequality and (B.11), 

for every > 0 we have 

lim lim sup P (jWn Vnsj > ) = lim lim sup P ( 

s!1 n!1 

s!1 n!1 

lim lim sup 

s!1 n!1 

28 

e 2 

n;s 

= 0: 

2 2 

n 

1 

n 

X 

i2Dn 

s 

i;n 

s 

i;n > )

Next observe that 

Vns = n;s 

n 

" 

1 

n;s 

X 

i2Dn 

We proceed to show Wn =) V by contradiction. For that purpose let M be the 

set of all probability measures on (R;B), and observe that we can metrize M by, 

e.g., the Prokhorov distance d(:; :). Let n and be the probability measures 

corresponding to Wn and V , respectively, then Wn =) V , or n =) , i¤ 

d( n; ) ! 0 as n ! 1. Now suppose n does not converge to . Then for 

some > 0 there exists a subsequence fn(m)g such that d( n(m); ) > for 

all n(m). By (B.13), we have 0 n;s= n C < 1 for all s; n 1. Hence, 

0 n(m);s= n(m) C < 1 for all n(m). Consequently, for s = 1 there 

exists a subsubsequence fn(m(l1))g such that n(m(l1));1= n(m(l1)) ! p(1) as 

l1 ! 1. For s = 2, there exists a subsubsubsequence fn(m(l1(l2)))g such that 

n(m(l1(l2)));2= n(m(l1(l2))) ! p(2) as l2 ! 1. The argument can be repeated 

for s = 3; 4:::. Now construct a subsequence fnlg such that n1 corresponds 

to the …rst element of fn(m(l1))g, n2 corresponds to the second element of 

fn(m(l1(l2)))g, and so on, then 

lim 

l!1 

nl;s 

nl 

s 

i;n 

for s = 1; 2; : : : Given (B.15), it follows that as l ! 1 

Then, it follows from (B.12) that 

lim jp(s) 1j lim 

s!1 s!1 lim 

l!1 p(s) 

# 

Vnls =) Vs N(0; p 2 (s)): 

: 

= p(s) (B.16) 

nl;s 

nl 

+ lim 

s!1 sup 

n 1 

n;s 

n 

1 = 0: 

Thus Vs =) V and thus by Lemma B.1 Wnl =) V N(0; 1) as l ! 1. 

Since fnlg fn(m)g this contradicts the assumption that d( n(m); ) > for 

all n(m). This completes the proof of the CLT. Q.E.D. 

Proof of Corollary 1: To prove the theorem, we apply the Cramer-Wold 

device, and verify that for every 2 Rk P 1 

with j j = 1, n Vi;n ) N(0; 1), 

where Vi;n = 0 Yi;n. Observe that using the properties of norms, we have 


jVi;nj = 0 Yi;n j j jYi;nj = jYi;nj 

1(jVi;nj > k) = 1( 

0 Yi;n > k) 1(jYi;nj > k); 

and thus limk!1 sup n;i2Dn E[jVi;nj 2+ 1(jVi;nj > k)] = 0. Furthermore, observe 

that 

kVi;n E(Vi;n j Fi;n(s))k 2 j j kYi;n EYi;n j Fi;n(s))k 2 di;n (s) 

29

and that for 2 n = V ar( P 

i2Dn Vi;n) = 0 n we have 

inf 

n jDnj 1 2 n = inf 

n jDnj 1 0 n inf 

n jDnj 1 min( n) > 0: 

From this we see that under the maintained assumptions Vi;n satis…es all assumptions 

P 

of the CLT for scalar-valued random …elds (Theorem 1) and, there- 

1 fore, n Vi;n ) N(0; 1) as claimed. 

Next de…ne Xi;n = Vi;n, then by analogous arguments as above 

jXi;nj jYi;nj : 

From the maintained uniform L2+ integrability of jYi;nj it then follows that 

kXk 2+ kY k 2+ , which shows that the 2+ moments of Xi;n can be bounded 

by a constant that does not depend on . Consequently if follows from the last 

inequality in the proof of part (a) of Lemma B.3 that 

V ar( X 

i2Dn 

Xi;n) = 0 n CjDnj 

where C < 1 does not depend on n and . Hence 

sup 

n 

jDnj 1 max( n) = sup jDnj 

n 

1 sup 

j j=1 

0 n C < 1. 

This proves the second claim of the lemma. Q.E.D. 

Proof of Theorem 2: To prove the theorem, it su¢ ces to show that jDnj 1 P 

0. First we show that for each given s > 0, the conditional mean V s 

i;n = 

E(Yi;njFi;n(s)) satis…es the assumptions of the L1-norm LLN of Jenish and 

Prucha (2009, Theorem 3). Using the Jensen and Lyapunov inequalities gives 

for all s > 0, i 2 Dn; n 1: 

E V s p 

i;n EfE(jYi;nj p jFi;n(s))g sup E jYi;nj 

n;i2Dn 

p < 1: 

So, V s 

i;n is uniformly Lp-bounded for p > 1 and hence uniformly integrable. 

For each …xed s; V s 

i;n is a measurable function of f"j;n; j 2 Tn : (i; j) sg. 

Observe that under Assumption 1 there exists a …nite constant C such that the 

cardinality of the set fj 2 Tn : (i; j) sg is bounded by Csd ; compare Lemma 

A.1 in Jenish and Prucha (2009). Hence, 

V s(1; 1; r) 

and thus in light of Assumption 6(b) 

1X 

r=1 

r d 1 V s(1; 1; r) 

X2s 

r=1 

1; r 2s 

Cs d ; Cs d ; r 2s ; r > 2s 

r d 1 + '(Cs d ; Cs d ) 

30 

1X 

(r + 2s) d 1 b(r) < 1: 

r=1 

! 

i2Dn (Yi;n EYi;n) L1

The above shows that indeed, for each …xed s, V s 

i;n satis…es the assumptions 

of the L1-norm LLN of Jenish and Prucha (2009, Theorem 3). Therefore, for 

each s; we have 

jDnj 

X 

1 

[E(Yi;njFi;n(s)) EYi;n] 

i2Dn 

1 

! 0 as n ! 1: (B.17) 

Furthermore observe that from (??) and the Minkowski inequality 

jDnj 

1 X 

i2Dn 

(Yi;n E(Yi;njFi;n(s))) 

1 

(s): (B.18) 

Given (B.17) and (B.18), and observing that lims!1 (s) = 0 it now follows 

that 

lim 

n!1 

jDnj 1 X 

lim lim sup 

s!1 n!1 

+ lim 

s!1 lim 

n!1 

i2Dn 

jDnj 

(Yi;n EYi;n) 

1 X 

i2Dn 

jDnj 1 X 

i2Dn 

1 

= lim 

s!1 lim 

n!1 

(Yi;n E(Yi;njFi;n(s))) 

(E(Yi;njFi;n(s)) EYi;n) 

jDnj 1 X 

1 

1 

= 0: 

i2Dn 

(Yi;n EYi;n) 

This completes the proof of the LLN. Q.E.D. 

C Appendix: Proofs for Section 5 

Proof of Theorem 3: We show that 

sup 

2 

Qn( ) Q n( ) 

p 

! 0 (C.1) 

as n ! 1. As discussed in the text, given that the o n are identi…ably unique it 

then follows immediately from, e.g., Pötscher and Prucha (1997), Lemma 3.1, 

that ( b n; o n) p ! 0 as n ! 1 as claimed. 

We start by proving that 

jDnj 

1 X 

i2Dn 

[qi;n(Zi;n; ) Eqi;n(Zi;n; )] p ! 0 (C.2) 

for each 2 , by applying the LLN given as Theorem 2 in the text to 

qi;n(Zi;n; ). First note that by Assumption 6(a), we have sup n;i2Dn E jqi;n(Zi;n; )j p < 

1 for each 2 and p = 2, which veri…es Assumption 4(a) for qi;n(Zi;n; ) with 

ci;n = 1. By Assumption 6(b), the qi;n(Zi;n; ) are uniformly L1-NED on ", and 

31 

1

hence w.l.g. we can take di;n = 1. Furthermore, by Assumption 6(b) the input 

process " is -mixing, and the -mixing coe¢ cients satisfy Assumption 4(b). 

Consequently (C.2) follows directly from Theorem 2 applied to qi;n(Zi;n; ). 

Next observe that by Proposition 1 of Jenish and Prucha (2009), Assumption 

6(c) implies that qi;n is L0 stochastically equicontinuous on , i.e., for every 

" > 0 

1 

lim sup 

n!1 jDnj 

X 

P 

i2Dn 

sup 

( ; ) 

jqi;n(Zi;n; ) qi;n(Zi;n; )j > " 

! 

! 0 as ! 0: 

Furthermore, in light of Assumption 6(a) the qi;n(Zi;n; ) clearly satisfy the 

domination condition postulated by the ULLN in Jenish and Prucha (2009), 

stated as Theorem 2 in that paper. Given that we have already veri…ed the 

pointwise LLN in (C.2) it now follows directly from that theorem that 

1 P 

sup jRn( ) ERn( )j 

2 

p ! 0 (C.3) 

with Rn( ) = jDnj i2Dn qi;n(Zi;n; ), and that the ERn( ) are uniformly 

equicontinuous on in the sense that 

lim sup sup 

n!1 2 

sup 

( ; ) 

To prove (C.1) observe that 

sup 

2 

sup 

2 

sup 

2 

jERn( ) ERn( )j ! 0 as ! 0: 

Qn( ) Q n( ) (C.4) 

jRn( ) 0 P Rn( ) ERn( )P ERn( )j + sup jRn( ) 

2 

0 (Pn P )Rn( )j 

jRn( ) 0 P Rn( ) ERn( )P ERn( )j + 2 sup jRn( )j 

2 

2 jPn P j : 

Furthermore observe that Assumption 6(a) we have E [sup 2 jqi;n(Zi;n; )j] 

K and E [sup 2 jqi;n(Zi;n; )j] 2 K for some …nite constant K. Thus 


sup 

2 

E jRn( )j E sup jRn( )j jDnj 

2 

E sup jRn( )j 

2 

2 

jDnj 

jDnj 

2 X 

i;j2Dn 

2 X 

i;j2Dn 

E sup 

2 

" 

1 X 

i2Dn 

E sup jqi;n(Zi;n; )j K (C.5) 

2 

jqi;n(Zi;n; )j sup jqj;n(Zj;n; )j (C.6) 

2 

E sup jqi;n(Zi;n; )j 

2 

2 # 1=2 " 

E sup jqj;n(Zj;n; )j 

2 

Now consider the …rst terms on the r.h.s. of the last inequality of (C.4). From 

(C.5) we see that E jRn( )j takes on its values in a compact set. Given (C.3) it 

32 

2 # 1=2 

K:

now follows immediately from part (a.I) of Lemma 3.3 of Pötscher and Prucha 

(1997) that 

sup jRn( ) 

2 

0 P Rn( ) ERn( )P ERn( )j p ! 0: (C.7) 

Next we show that also the second term on the r.h.s. of the last inequality 

of (C.4) converges in probability to zero. To see that this is indeed the case 

observe that sup 2 jRn( )j 2 = Op(1) in light of (C.6) and jPn P j p ! 0 by 

assumption. This completes the proof of (C.1). 

Having established that ERn( ) are uniformly equicontinuous on , the 

uniform equicontinuity of Q n( ) on follows immediately from Lemma 3.3(b) 

of Pötscher and Prucha (1997). Q.E.D. 

Proof of Theorem 4: Clearly by Theorem 3 we have b o 

n n = op(1). 

Step 1. The estimators b n corresponding to the objective function (23) 

satisfy the following …rst order conditions: 

h 

jDnj 1=2 Rn( b i 

n) = op(1): (C.8) 

r Rn( b n) 0 Pn 

The op(1) term on the r.h.s. re‡ects that the …rst order conditions may not 

hold if b n falls onto the boundary of , and that the probability of that event 

goes to zero as n ! 1, since the o n are uniformly in the in the interior of by 

Assumption 7(a). If b n is in the interior of , then the l.h.s. of (C.8) is zero. 

Taking the mean value expansion of Rn( b n) about 

Rn( b n) = Rn( o n) + r Rn( e n)( b n 

o 

n yields 

o 

n) (C.9) 

where e n 2 is between b n and o n (component-by-component). Let 

bAn = r Rn( b n) 0 Pnr Rn( e n) and b Bn = r Rn( b n) 0 Pn 

then combining (C.8) and (C.9) gives: 

= 

= 

jDnj 1=2 b n 

h 

I 

h 

I 

i 

A b+ b 

n An 

i 

A b+ b 

n An 

o 

n 

jDnj 1=2 b n 

jDnj 1=2 b n 

where b A + n denotes the generalized inverse of b An. 

o 

n 

o 

n 

h 

jDnj 1 i1=2 n ; 

(C.10) 

bA + n r Rn( b n) 0 h 

Pn jDnj 1=2 Rn( o i 

n) + b A + n op(1) 

bA + n b h 

1=2 

Bn n jDnj Rn( o i 

n) + b A + n op(1); 

Step 2. By Assumptions 7(c) the qi;n(Zi;n; o n) are uniformly L2-NED and 

uniformly L2+ -integrable with ci;n = 1. Given Assumptions 7(d),(g) it is now 

readily seen that the process fqi;n(Zi;n; o n); i 2 Dng satis…es all assumptions 

of the CLT for vector-valued NED processes, given as Corollary 1 in the text, 

33

with cin = 1. (Note that Assumption 3(d) is satis…ed automatically since the 

qi;n(Zi;n; o n) are uniformly L2-NED.) Hence, 

1=2 

n jDnj Rn( o X 

1=2 

n) = n qi;n(Zi;n; o n) =) N(0; Ipq ); (C.11) 

with n = V ar P 

i2Dn 

i2Dn qi;n(Zi;n; o n) and sup n max 

h 

jDnj 1 i 

n < 1. 

Step 3. By Assumptions 7(c),(d),(e) the functions r qi;n(Zi;n; ) satisfy for 

each 2 the LLN given as Theorem 2 in the text with ci;n = 1, observing 

that Assumption 4(b) is implied by 2. By argumentation analogous as used in 

the proof of consistency we have 

jDnj 

1 X 

i2Dn 

(r qi;n(Zi;n; ) Er qi;n(Zi;n; )) p ! 0: 

By Proposition 1 of Jenish and Prucha (2009), Assumption 7(f) implies that the 

r qi;n(Zi;n; ) are uniformly L0-equicontinuous on . Given L0-equicontinuity 

and Assumption 7(e), we have by the ULLN of Jenish and Prucha (2009), Theorem 

2, 

sup jr Rn( ) 

2 

Er Rn( )j p ! 0: (C.12) 

and furthermore, the Er Rn( ) are uniformly equicontinuous on 

that 

in the sense 

sup jEr Rn( ) Er Rn( )j ! 0 (C.13) 

lim sup sup 

n!1 02 j 

as ! 0. In light of (C.12) and (C.13), and given that b n 

hence e o 

n n = op(1), if follows further that 

0 j< 

o 

n = op(1) and 

r Rn( b n) Er Rn( o n) p ! 0; and r Rn( e n) Er Rn( o n) p ! 0: 

Hence, 

bAn An 

where b An and b Bn are as de…ned above, and 

p 

p 

! 0 and Bn 

b Bn ! 0; (C.14) 

An = [Er Rn( o n)] 0 P [Er Rn( o n)] and Bn = [Er Rn( o n)] 0 P 

h 

jDnj 1 i1=2 n : 

Step 4. Given Assumptions 7(e),(f), and since P is positive de…nite, we 

have jAnj = O(1) and A 1 

n = O(1), respectively. Hence by, e.g., Lemma 

F1 in Pötscher and Prucha (1997) we have b An = Op(1), b A + n = Op(1), b An is 

nonsingular with probability tending to one, and b A + n A 1 p 

n ! 0. In light of the 

above it follows from (C.10) that 

jDnj 1=2 b 

n 

o 

n = 

h 

A b+ b 1=2 

n Bn n jDnj Rn( o = 

i 

n) + op(1) 

A 1 

h 

1=2 

n Bn n jDnj Rn( o i 

n) + op(1) 

34

Recalling that supn h 

max jDnj 1 i 

n < 1, Assumptions 7(e) implies that 

jBnj = Op(1). In light of Assumption 7(g),(h) BnB0 n is invertible and fur- 

thermore (BnB0 n) 1 = O(1). Thus A 1 

n BnB0 nA 10 

n 

O(1) and therefore 

A 1 

n BnB 0 nA 10 

n 

= A 1 

n BnB 0 nA 10 

n 

The claim that A 1 

n BnB0 nA 10 

n 

1=2 jDnj 1=2 b n 

h 

1=2 1 

An Bn 

1=2 jDnj 1=2 b n 

o 

n 

1 

1=2 

n jDnj Rn( o n) 

o 

n 

jAnj 2 (BnB 0 n) 1 = 

i 

+ op(1): 

=) N(0; Ik) now fol- 

lows, e.g., from Corollary F4(b) in Pötscher and Prucha (1997): Q.E.D. 

References 

[1] Andrews, D.W.K. (1984): "Non-Strong Mixing Autoregressive processes," 

Journal of Applied Probability, 21, 930-934. 

[2] Andrews, D.W.K. (1987): "Consistency in Nonlinear Econometric Models: 

a Generic Uniform Law of Large Numbers," Econometrica, 55, 1465-1471. 

[3] Andrews, D.W.K. (1988): "Laws of Large Numbers for Dependent Nonidentically 

Distributed Variables," Econometric Theory, 4, 458-467. 

[4] Andrews, D.W.K. (2005): "Cross-section Regression with Common 

Shocks," Econometrica, 2005, 73, 1551–1585. 

[5] Ango Nze, P. and P. Doukhan (2004): "Weak Dependence: Models and 

Applications to Econometrics," Econometric Theory, 20, 995-1045. 

[6] Bai, J. (2003): "Inferential Theory for Factor Models of Large Dimensions," 

Econometrica 71, 135-171. 

[7] Bai, J. and S. Ng (2004): "A PANIC Attack on Unit Roots and Cointegration,", 

Econometrica, 72, 1127-1177. 

[8] Bera, A.K, and P. Simlai: "Testing Spatial Autoregressive Model and a 

Formulation of Spatial ARCH (SARCH) Model with Applications. Paper 

presented at the Econometric Society World Congress, London, August 

2005. 

[9] Bierens, H.J. (1981): Robust methods and asymptotic theory. Berlin: 

Springer Verlag. 

[10] Billingsley, P. (1968): Convergence of Probability Measures. New York: 

John Wiley and Sons. 

[11] Bolthausen, E. (1982): "On the Central Limit Theorem for Stationary 

Mixing Random Fields," The Annals of Probability, 10, 1047-1050. 

35

[12] Bradley, R. (1993): "Some Examples of Mixing Random Fields," Rocky 

Mountain Journal of Mathematics, 23, 2, 495-519. 

[13] Brockwell, P., and R. Davis (1991): Times Series: Theory and Methods, 

Springer Verlag. 

[14] Bulinskii, A.V. (1989): Limit Theorems under Weak Dependence Conditions, 

Moscow University Press. 

[15] Bulinskii, A.V., and P. Doukhan (1990): "Vitesse de convergence dans le 

theorem de limite centrale pour des champs melangeants des hypotheses de 

moment faibles," C.R. Academie Sci. Paris, Serie I, 801-105. 

[16] Chen, X., and T. Conley (2001): "A New Semiparametric Spatial Model 

for Panel Time Series," Journal of Econometrics,105, 59-83. 

[17] Cli¤, A., and J. Ord (1981): Spatial Processes, Models and Applications. 

London: Pion. 

[18] Conley, T. (1999): "GMM Estimation with Cross Sectional Dependence," 

Journal of Econometrics, 92, 1-45. 

[19] Davidson, J. (1992): "A Central Limit Theorem for Globally Nonstationary 

Near-Epoch Dependent Functions of Mixing Processes," Econometric 

Theory, 8, 313-329. 

[20] Davidson, J. (1993): "An L1 Convergence Theorem for Heterogenous 

Mixingale Arrays with Trending Moments," Statistics and Probability Letters, 

8, 313-329. 

[21] Davidson, J. (1994): Stochastic Limit Theory. Oxford University Press. 

[22] De Jong, R.M. (1997): "Central Limit Theorems for Dependent Heterogeneous 

Random Variables," Econometric Theory, 13, 353–367. 

[23] Dobrushin, R. (1968): "The Description of a Random Field by its Conditional 

Distribution and its Regularity Condition," Theory of Probability 

and its Applications, 13, 197-227. 

[24] Doukhan, P. (1994): Mixing. Properties and Examples. Springer-Verlag. 

[25] Doukhan, P., and X. Guyon (1991): "Mixing for Linear Random Fields," 

C.R. Academie Sciences Paris, Serie 1, 313, 465-470. 

[26] Doukhan, P., and S. Louhichi (1999): "A New Weak Dependence Condition 

and Applications to Moment Inequalities," Stochastic Processes and Their 

Applications, 84, 313-342. 

[27] Gallant, A.R.(1987): Nonlinear Statistical Models. New York: Wiley. 

[28] Gallant, A.R., and H. White (1988): A Uni…ed Theory of Estimation and 

Inference for Nonlinear Dynamic Models. New York: Basil Blackwell. 

36

[29] Jenish, N., and I.R. Prucha (2009): "Central Limit Theorems and Uniform 

Laws of Large Numbers for Arrays of Random Fields," Journal of 

Econometrics, 150, 86-98. 

[30] Ibragimov, I.A. (1962): "Some Limit Theorems for Stationary Processes," 

Theory of Probability and Applications, 7, 349-382. 

[31] Ibragimov, I.A., and Y. V. Linnik (1971): Independent and Stationary 

Sequences of Random Variables. Wolters-Noordho¤, Groningen. 

[32] Kelejian, H.H., and I.R. Prucha (2004): "Estimation of Simultaneous 

Systems of Spatially Interrelated Cross Sectional Equations," Journal of 


[33] Kelejian, H.H., and I.R. Prucha (2007a): "HAC Estimation in a Spatial 

Framework," Journal of Econometrics,140, 131-154. 

[34] Kelejian, H.H., and I.R. Prucha (2007b): "Speci…cation and Estimation 

of Spatial Autoregressive Models with Autoregressive and Heteroskedastic 

Disturbances," forthcoming in Journal of Econometrics. 

[35] Lee, L.-F. (2004): "Asymptotic Distributions of Quasi-Maximum Likelihood 

Estimators for Spatial Autoregressive Models," Econometrica, 72, 

1899-1925. 

[36] Lee, L.-F. (2007a): "GMM and 2SLS for Mixed Regressive, Spatial Autoregressive 

Models," Journal of Econometrics, 137, 489–514. 

[37] Lee, L.-F. (2007b): "Method of Elimination and Substitution in the GMM 

Estimation of Mixed Regressive Spatial Autoregressive Models," forthcoming 

in Journal of Econometrics. 

[38] McLeish, D. L. (1975a): "A Maximal Inequality and Dependent Strong 

Laws," Annals of Probability, 3, 826-36. 

[39] McLeish, D. L. (1975b): "Invariance Principles for Dependent Variables," 

Z. Wahrsch. verw. Gebiete, 32, 165-78. 

[40] Nahapetian, B. (1987): "An Approach to Limit Theorems for Dependent 

Random Variables," Theory of Probability and its Applications, 32, 589-594. 

[41] Neaderhouser, C. (1978): "Limit Theorems for Multiply Indexed Mixing 

Random Variables," Annals of Probability, 6. 

[42] Pinkse, J., L. Shen and M.E. Slade (2007): "A Central Limit Theorem 

for Endogenous Locations and Complex Spatial Interactions," Journal of 


[43] Phillips, P.C. and D. Sul (2003): “Dynamic Panel Estimation and Homogeneity 

Testing Under Cross Section Dependence,” The Econometrics 

Journal, 6, 217–259. 

37

[44] Pötscher, B. M., and I. R. Prucha (1989): "A Uniform Law of Large Numbers 

for Dependent and Heterogeneous Data Processes," Journal of Econometrics, 

140, 215-225. 

[45] Pötscher, B.M., and I. R. Prucha (1994): "Generic Uniform Convergence 

and Equicontinuity Concepts for Random Functions," Journal of Econometrics, 

60, 23-63. 

[46] Pötscher, B.M. and I. R. Prucha (1997): Dynamic Nonlinear Econometric 

Models. Springer-Verlag, New York. 

[47] Robinson, P.M. (2009): "Large-Sample Inference on Spatial Dependence," 

The Econometrics Journal, Tenth Anniversary Special Issue, 12, S68-S82. 

[48] Robinson, P.M. (2007): "E¢ cient Estimation of the Semiparametric Spatial 

Autoregressive Model," forthcoming in Journal of Econometrics. 

[49] Takahata, H. (1983): "On the Rates in the Central Limit Theorem for 

Weakly Dependent Random Fields," Z. Wahrsch. verw. Gebiete, 64, 445- 

456. 

[50] Tran, L. (1990): "Kernel Density Estimation on Random Fields," Journal 

of Multivariate Analysis, 34, 37-53. 

[51] Whittle, P. (1954): "On Stationary Processes in the Plane," Biometrica, 

41, 434-449. 

[52] Wooldridge, J. (1986): "Asymptotic Properties of Econometric Estimators," 

University of California San Diego, Department of Economics, Ph.D. 

Dissertation. 

[53] Yu, J., R. de Jong, and L.-F. Lee (2008): "Quasi-maximum Likelihood 

Estimators for Spatial Dynamic Panel Data with Fixed E¤ects when Both 

N and T are Large," Journal of Econometrics, 146, 118-134. 

38

On Spatial Processes and Asymptotic Inference under Near$Epoch ...

Create successful ePaper yourself

Delete template?

Save as template?