08.03.2014 Views

A note on the use of V and U statistics in nonparametric models of ...

A note on the use of V and U statistics in nonparametric models of ...

A note on the use of V and U statistics in nonparametric models of ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

AISM (2006) 58: 389–406<br />

DOI 10.1007/s10463-006-0034-z<br />

Carlos Mart<strong>in</strong>s-Filho · Feng Yao<br />

A <str<strong>on</strong>g>note</str<strong>on</strong>g> <strong>on</strong> <strong>the</strong> <strong>use</strong> <strong>of</strong> V <strong>and</strong> U <strong>statistics</strong><br />

<strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> <strong>of</strong> regressi<strong>on</strong><br />

Received: 8 September 2004 / Revised: 12 April 2005 / Published <strong>on</strong>l<strong>in</strong>e: 1 June 2006<br />

© The Institute <strong>of</strong> Statistical Ma<strong>the</strong>matics, Tokyo 2006<br />

Abstract We establish <strong>the</strong> √ n asymptotic equivalence <strong>of</strong> V <strong>and</strong> U <strong>statistics</strong> when<br />

<strong>the</strong> statistic’s kernel depends <strong>on</strong> n. Comb<strong>in</strong>ed with a lemma <strong>of</strong> B. Lee this result<br />

provides c<strong>on</strong>diti<strong>on</strong>s under which U <strong>statistics</strong> projecti<strong>on</strong>s <strong>and</strong> V <strong>statistics</strong> are √ n<br />

asymptotically equivalent. The <strong>use</strong> <strong>of</strong> this equivalence <strong>in</strong> n<strong>on</strong>parametric regressi<strong>on</strong><br />

<strong>models</strong> is illustrated with several examples; <strong>the</strong> estimati<strong>on</strong> <strong>of</strong> c<strong>on</strong>diti<strong>on</strong>al variances,<br />

skewness, kurtosis <strong>and</strong> <strong>the</strong> c<strong>on</strong>structi<strong>on</strong> <strong>of</strong> a n<strong>on</strong>parametric R-squared measure.<br />

Keywords U <strong>statistics</strong> · V <strong>statistics</strong> · local l<strong>in</strong>ear estimati<strong>on</strong><br />

1 Introducti<strong>on</strong><br />

The <strong>use</strong> <strong>of</strong> U <strong>statistics</strong> (see Hoeffd<strong>in</strong>g 1948), as a means <strong>of</strong> obta<strong>in</strong><strong>in</strong>g asymptotic<br />

properties <strong>of</strong> kernel based estimators <strong>in</strong> semiparametric <strong>and</strong> n<strong>on</strong>parametric<br />

regressi<strong>on</strong> <strong>models</strong> is now comm<strong>on</strong> practice <strong>in</strong> ec<strong>on</strong>ometrics. For example, Powell,<br />

Stock <strong>and</strong> Stoker (1989) <strong>use</strong> U <strong>statistics</strong> to obta<strong>in</strong> <strong>the</strong> asymptotic properties <strong>of</strong> <strong>the</strong>ir<br />

semiparametric <strong>in</strong>dex model estimator. Ahn <strong>and</strong> Powell (1993) <strong>use</strong> U <strong>statistics</strong> to<br />

<strong>in</strong>vestigate <strong>the</strong> properties <strong>of</strong> a two stage semiparametric estimati<strong>on</strong> <strong>of</strong> censored<br />

selecti<strong>on</strong> <strong>models</strong> where <strong>the</strong> first stage estimator <strong>in</strong>volves a n<strong>on</strong>parametric regressi<strong>on</strong><br />

estimator for <strong>the</strong> selecti<strong>on</strong> variable. Fan <strong>and</strong> Li (1996) <strong>use</strong> U <strong>statistics</strong> to study<br />

C. Mart<strong>in</strong>s-Filho (B)<br />

Department <strong>of</strong> Ec<strong>on</strong>omics, Oreg<strong>on</strong> State University,<br />

Ballard Hall 303, Corvallis, OR 97331-3612 USA<br />

E-mail: carlos.mart<strong>in</strong>s@orst.edu<br />

Tel.:+1-541-737-1476<br />

F. Yao<br />

Department <strong>of</strong> Ec<strong>on</strong>omics, University <strong>of</strong> North Dakota,<br />

P.O. Box 8369, Gr<strong>and</strong> Forks, ND 58203 USA<br />

E-mail: feng.yao@mail.bus<strong>in</strong>ess.und.edu<br />

Tel.: +1-701-7773360


390 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

a set <strong>of</strong> specificati<strong>on</strong> tests for n<strong>on</strong>parametric regressi<strong>on</strong> <strong>models</strong> <strong>and</strong> Zheng (1998)<br />

<strong>use</strong>s U-<strong>statistics</strong> to provide a n<strong>on</strong>parametric kernel based test for parametric quantile<br />

regressi<strong>on</strong> <strong>models</strong>. See also Kemp (2000) <strong>and</strong> D’Amico (2003) for more recent<br />

<strong>use</strong>s.<br />

The appeal for <strong>the</strong> <strong>use</strong> <strong>of</strong> U <strong>statistics</strong> derives from <strong>the</strong> fact that <strong>of</strong>ten estimators<br />

or <strong>statistics</strong> <strong>of</strong> <strong>in</strong>terest – T n – are expressed as l<strong>in</strong>ear comb<strong>in</strong>ati<strong>on</strong>s <strong>of</strong> n<strong>on</strong>parametric<br />

regressi<strong>on</strong> estimators. C<strong>on</strong>sider for example a n<strong>on</strong>parametric regressi<strong>on</strong> model<br />

E(Y|X = x) = m(x) with observati<strong>on</strong>s {(y i ,x i )} n i=1 , T n = ∑ n<br />

i=1 c i ˆm(x i ) where<br />

c i ∈Rare n<strong>on</strong>stochastic <strong>and</strong> ˆm(x) is an arbitrary n<strong>on</strong>parametric estimator for<br />

m(x). S<strong>in</strong>ce ˆm(x) can usually be written as ˆm(x) = ∑ n<br />

j=1 w jn(x)y j , we can write<br />

T n =<br />

n∑<br />

i=1<br />

( )<br />

n<br />

c i w <strong>in</strong> (x i )y i + u<br />

2 n = T 1n + T 2n ,<br />

( ) −1<br />

n ∑<br />

where u n =<br />

2 1≤i


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 391<br />

where ∑ (n,k) de<str<strong>on</strong>g>note</str<strong>on</strong>g>s a sum over all subsets 1 ≤ i 1 0 <strong>and</strong> c<strong>on</strong>sequently to reach <strong>the</strong> desired c<strong>on</strong>clusi<strong>on</strong> it suffices to show<br />

that nE((u n − v n ) 2 ) = o(1).By<strong>the</strong>c r -<strong>in</strong>equality we can write<br />

(<br />

nE((u n − v n ) 2 ) ≤ kn 1 − P k<br />

n ) 2 k−1<br />

E(u 2<br />

n k n ) + kn ∑<br />

κ=1<br />

(<br />

n<br />

κ<br />

) 2<br />

n −2k E((u (κ)<br />

n )2 ).<br />

(4)


392 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

We will deal with <strong>the</strong> two terms <strong>on</strong> <strong>the</strong> righth<strong>and</strong> side <strong>of</strong> <strong>in</strong>equality (4) separately.<br />

We first <str<strong>on</strong>g>note</str<strong>on</strong>g> that by <strong>the</strong> representati<strong>on</strong> <strong>the</strong>orem for U <strong>statistics</strong> (Serfl<strong>in</strong>g, 1980; p.<br />

180) u (κ)<br />

n = 1 n!<br />

∑p W n<br />

(κ)(Z<br />

i 1<br />

,... ,Z <strong>in</strong> ) where<br />

W (κ)<br />

n (Z i 1<br />

,... ,Z <strong>in</strong> ) = 1 γ κ<br />

(φ κ,n (Z i1 ,... ,Z iκ ) + φ κ,n (Z iκ+1 ,... ,Z i2κ ) +···<br />

+φ κ,n (Z iγκ κ−κ+1<br />

,...,Z iγκ κ<br />

)),<br />

γ κ = [n/κ] is <strong>the</strong> greatest <strong>in</strong>teger ≤ n/κ <strong>and</strong> ∑ p<br />

de<str<strong>on</strong>g>note</str<strong>on</strong>g>s <strong>the</strong> sum over all permutati<strong>on</strong>s<br />

(i 1 ,...,i n ) <strong>of</strong> {1, 2,...,n} <strong>and</strong> similarly u n = 1 ∑<br />

n! p W n(Z i1 ,... ,Z <strong>in</strong> )<br />

where W n (Z i1 ,... ,Z <strong>in</strong> ) = 1 γ (ψ n(Z i1 ,... ,Z ik ) + ψ n (Z ik+1 ,... ,Z i2k ) + ··· +<br />

ψ n (Z iγk−k+1 ,...,Z iγk )) <strong>and</strong> γ = [n/k] is <strong>the</strong> greatest <strong>in</strong>teger ≤ n/k. Now, for <strong>the</strong><br />

first term <strong>on</strong> <strong>the</strong> righth<strong>and</strong> side <strong>of</strong> <strong>in</strong>equality (4), s<strong>in</strong>ce n k −Pk<br />

n = O(nk−1 ) we have<br />

( )<br />

n 1 − P k<br />

n 2 ( )<br />

n = O(n −1 ) <strong>and</strong> n 1 − P n 2 ((<br />

k<br />

k<br />

n E(u<br />

2 k n )=O(n −1 1<br />

∑<br />

)E<br />

n! p W n(Z i1 ,...,<br />

Z <strong>in</strong> ) ) ) 2<br />

. By M<strong>in</strong>kowski’s <strong>in</strong>equality<br />

⎛(<br />

E ⎝<br />

1<br />

n!<br />

) ⎞ 2<br />

∑<br />

W n (Z i1 ,... ,Z <strong>in</strong> )<br />

p<br />

⎠ ≤ 1<br />

(n!) 2 ( ∑<br />

p<br />

= E(|W n (Z i1 ,... ,Z <strong>in</strong> )| 2 ),<br />

(<br />

E(|Wn (Z i1 ,... ,Z <strong>in</strong> )| 2 ) ) ) 2<br />

1/2<br />

where <strong>the</strong> last equality follows s<strong>in</strong>ce {Z i } i=1,2,... is an i.i.d. sequence. By def<strong>in</strong>iti<strong>on</strong><br />

<strong>of</strong> W n <strong>and</strong> ano<strong>the</strong>r <strong>use</strong> <strong>of</strong> M<strong>in</strong>kowski’s <strong>in</strong>equality we have<br />

⎛<br />

⎞<br />

γ∑<br />

E(|W n (Z i1 ,... ,Z <strong>in</strong> )| 2 ) ≤ γ −2 (<br />

⎝ E(ψ<br />

2<br />

n (Z ijk−k+1 ,... ,Z ijk )) ) 1/2<br />

⎠<br />

j=1<br />

= E(ψ n (Z i1 ,... ,Z ik )) = o(n),<br />

( )<br />

where <strong>the</strong> last equality follows by assumpti<strong>on</strong>.We <strong>the</strong>refore c<strong>on</strong>clude that n 1− P k<br />

n 2<br />

n k<br />

E(u 2 n ) = o(1).<br />

(<br />

n<br />

For <strong>the</strong> sec<strong>on</strong>d term <strong>on</strong> <strong>in</strong>equality (4), <str<strong>on</strong>g>note</str<strong>on</strong>g> first that n<br />

κ)<br />

−k = O(n κ−k ) <strong>and</strong><br />

(( ) 2<br />

n<br />

s<strong>in</strong>ce κ ∈{1,... ,k− 1}, max κ n κ−k = n −1 <strong>and</strong> <strong>the</strong>refore n<br />

κ)<br />

−k =O(n −2 ).<br />

Once aga<strong>in</strong>, by M<strong>in</strong>kowski’s <strong>in</strong>equality we have E((u (κ)<br />

n )2 ) ≤ E(|W n (κ)(Z<br />

i 1<br />

,<br />

... ,Z <strong>in</strong> )| 2 ). Now, given <strong>the</strong> def<strong>in</strong>iti<strong>on</strong> <strong>of</strong> W n<br />

(κ) <strong>and</strong> <strong>the</strong> fact that {φ κ,n (Z ijκ−κ+1<br />

,... ,Z ijκ )} j=1,2,... is an i.i.d. sequence <strong>of</strong> r<strong>and</strong>om variables we have by ano<strong>the</strong>r<br />

applicati<strong>on</strong> <strong>of</strong> M<strong>in</strong>kowski’s <strong>in</strong>equality that E(|W n<br />

(κ)(Z<br />

i 1<br />

,... ,Z <strong>in</strong> )| 2 ) ≤ E((φ κ,n (Z i1<br />

,... ,Z iκ )) 2 ). Let l κ de<str<strong>on</strong>g>note</str<strong>on</strong>g> <strong>the</strong> number <strong>of</strong> terms <strong>in</strong> ∑ {ν 1 ,... ,ν κ ,k} , <strong>the</strong>n by <strong>the</strong> c r<br />

<strong>in</strong>equality<br />

2


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 393<br />

E((φ κ,n (Z i1 ,... ,Z iκ )) 2 )<br />

⎛⎛<br />

⎞2⎞<br />

= E ⎝⎝<br />

∑ k!<br />

∏ κ<br />

{ν 1 ,... ,ν κ ,k} j=1 ν j ! ψ ⎟ ⎟<br />

n ×(Z i1 ,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) ⎠ ⎠<br />

} {{ } } {{ }<br />

ν 1 − times<br />

ν κ − times<br />

⎛<br />

⎞<br />

∑ (k!) 2<br />

≤ l κ<br />

( ∏ κ<br />

{ν 1 ,... ,ν κ ,k} j=1 ν j !) E ⎜<br />

⎝ψ 2 2 n (Z ⎟<br />

i 1<br />

,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) ⎠ .<br />

} {{ } } {{ }<br />

ν 1 − times<br />

ν κ − times<br />

Now, s<strong>in</strong>ce by assumpti<strong>on</strong> E ( ψn 2(Z i 1<br />

,... ,Z ik ) ) = o(n) for all 1 ≤ i 1 ,...,i k<br />

≤ n, k ≤ n,, <strong>the</strong>n it is true that for all κ ∈{1,... ,k − 1} <strong>and</strong> all ν j ≥ 1 such<br />

that k = ∑ κ<br />

j=1 ν j we have E ( ψn 2(Z i 1<br />

,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) ) = o(n). Then<br />

E(|W (κ)<br />

n<br />

∑k−1<br />

(<br />

n<br />

n<br />

κ<br />

κ=1<br />

(Z i 1<br />

,... ,Z <strong>in</strong> )| 2 ) = o(n) <strong>and</strong> c<strong>on</strong>sequently E((u (κ))2<br />

) = o(n).As a result,<br />

) 2<br />

n −2k E((u (κ)<br />

n<br />

∑k−1<br />

)2 )≤nO(n −2 )o(n)<br />

κ=1<br />

l κ<br />

∑<br />

n<br />

{ν 1 ,... ,ν κ ,k}<br />

(k!) 2<br />

( ∏κ<br />

j=1 ν j !) 2<br />

=o(1),<br />

which completes <strong>the</strong> pro<strong>of</strong>.<br />

⊓⊔<br />

Theorem 1 can be proved us<strong>in</strong>g <strong>the</strong> follow<strong>in</strong>g two c<strong>on</strong>diti<strong>on</strong>s which are implied<br />

by E ( ψn 2(Z i 1<br />

,... ,Z ik ) ) = o(n) for all 1 ≤ i 1 ,...,i k ≤ n, k ≤ n: (a)<br />

E ( ψn 2(Z i 1<br />

,... ,Z ik ) ) = o(n) for all 1 ≤ i 1 < i 2 < ··· < i k ≤ n, k ≤ n;<br />

(b) for all κ ∈ {1,... ,k − 1} <strong>and</strong> all ν j ≥ 1 such that k = ∑ κ<br />

j=1 ν j that<br />

def<strong>in</strong>e Z i1 ,... ,Z i1 ,... ,Z iκ ,... ,Z iκ E ( ψ<br />

} {{ } } {{ }<br />

n 2(Z i 1<br />

,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) )<br />

ν 1 − times<br />

ν κ − times<br />

= o(n 2(k−κ)−1 ). The next corollary establishes <strong>the</strong> √ n asymptotic equivalence<br />

between V <strong>statistics</strong> <strong>and</strong> U <strong>statistics</strong> projecti<strong>on</strong>s.<br />

Corollary 1 Let {Z i } n i=1 be a sequence <strong>of</strong> i.i.d. r<strong>and</strong>om variables <strong>and</strong> u n <strong>and</strong><br />

v n be U <strong>and</strong> V <strong>statistics</strong> with kernel functi<strong>on</strong> ψ n (Z 1 ,... ,Z k ). In additi<strong>on</strong>, let<br />

û n = k n<br />

∑ n<br />

i=1 (ψ 1n(Z i ) − θ n ) + θ n , where ψ 1n (Z i ) = E (ψ n (Z 1 ,... ,Z k )|Z i ) <strong>and</strong><br />

θ n = E (ψ n (Z 1 ,... ,Z k )). IfE ( ψ 2 n (Z 1,... ,Z k ) ) = o(n) <strong>the</strong>n √ n(v n −û n ) =<br />

o p (1).<br />

Pro<strong>of</strong> From Theorem 1 we have √ n(v n − u n ) = o p (1), <strong>and</strong> from Lemma 2.1 <strong>in</strong><br />

Lee (1988) we have that √ n(u n −û n ) = o p (1). Hence, √ n(v n −û n ) = o p (1). ⊓⊔<br />

3 Some applicati<strong>on</strong>s <strong>in</strong> regressi<strong>on</strong> <strong>models</strong><br />

We provide applicati<strong>on</strong>s <strong>of</strong> Theorem 1 <strong>and</strong> its corollary around <strong>the</strong> follow<strong>in</strong>g regressi<strong>on</strong><br />

model,<br />

y t = m(x t ) + ε t , (5)


394 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

where, {(y t ,x t )} n t=1 is a sequence <strong>of</strong> i.i.d. r<strong>and</strong>om variables, y t,x t ∈Rwith jo<strong>in</strong>t<br />

density q(y,x), E(ε t |x t ) = 0 <strong>and</strong> V(ε t |x t ) = σ 2 (x t ). This is similar to <strong>the</strong> regressi<strong>on</strong><br />

model c<strong>on</strong>sidered by Fan <strong>and</strong> Yao (1998), with <strong>the</strong> excepti<strong>on</strong> that here <strong>the</strong><br />

observati<strong>on</strong>s are i.i.d. ra<strong>the</strong>r than strictly stati<strong>on</strong>ary. Our applicati<strong>on</strong>s all <strong>in</strong>volve<br />

establish<strong>in</strong>g <strong>the</strong> asymptotic distributi<strong>on</strong> <strong>of</strong> <strong>statistics</strong> that are c<strong>on</strong>structed as l<strong>in</strong>ear<br />

comb<strong>in</strong>ati<strong>on</strong>s <strong>of</strong> a first stage n<strong>on</strong>parametric estimator <strong>of</strong> m(x). As <strong>the</strong> first stage<br />

estimator for m(x) we c<strong>on</strong>sider <strong>the</strong> local l<strong>in</strong>ear estimator <strong>of</strong> Fan (1992), i.e., for<br />

x ∈R, we obta<strong>in</strong> ˆm(x) =ˆα where<br />

n∑<br />

( ˆα, ˆβ) = argm<strong>in</strong> α,β (y t − α − β(x t − x)) 2 K<br />

t=1<br />

( )<br />

xt − x<br />

,<br />

h n<br />

where K(·) is a density functi<strong>on</strong> <strong>and</strong> 0


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 395<br />

(a) Given E(y 2 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />

I n = 1 n<br />

n∑<br />

∫<br />

ε t f(x t , y)g(y|x t )dy + 1 2 h2 n σ K 2 E(m(2) (x t )f (x t ,y t ))<br />

t=1<br />

where g(y|x) de<str<strong>on</strong>g>note</str<strong>on</strong>g>s <strong>the</strong> c<strong>on</strong>diti<strong>on</strong>al density <strong>of</strong> y given x.<br />

(b) Given E(y 4 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />

+o p (n − 1 2 ) + op (h 2 n ), (6)<br />

J n = σ 2 K h2 n<br />

n<br />

n∑<br />

ε t m (2) (x t )E(f (x t ,y t )|x 1 ,... ,x n )<br />

t=1<br />

+ 1 4 h4 n σ 4 K E((m(2) (x t )) 2 f(x t ,y t ))<br />

+o p (n − 1 2 ) + op (h 3 n ), (7)<br />

(c) Given E(y 6 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />

L n = o p (h 3 n ) (8)<br />

(d) Given E(y 8 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />

M n = o p (h 4 n ). (9)<br />

Pro<strong>of</strong> Here we prove parts (a)–(c). The pro<strong>of</strong> <strong>of</strong> (d) is similar to that <strong>of</strong> (c) <strong>and</strong> is<br />

omitted. Let ∑ (p) h i 1 ...i p<br />

de<str<strong>on</strong>g>note</str<strong>on</strong>g> <strong>the</strong> sum <strong>of</strong> h i1 ...i p<br />

over <strong>the</strong> permutati<strong>on</strong>s <strong>of</strong> i 1 ...i p ,<br />

i.e., ∑ (p) h i 1 ...i p<br />

is <strong>the</strong> sum <strong>of</strong> p! terms given by h i1 i 2 ...i p<br />

+ h i2 i 1 ...i p<br />

+···+h ip i p−1 ...i 1<br />

.<br />

(a) Note that,<br />

ˆm(x t ) − m(x t ) =<br />

1<br />

nh n g(x t )<br />

n∑<br />

K<br />

k=1<br />

1<br />

+<br />

2nh n g(x t )<br />

( )<br />

xk − x t<br />

ε k<br />

h n<br />

n∑<br />

K<br />

k=1<br />

( )<br />

xk − x t<br />

m (2) (x kt )(x k − x t ) 2 + w(x t ),<br />

h n<br />

where w(x t ) = ˆm(x t )−m(x t )− 1 ∑ n<br />

nh n g(x t ) k=1 K(x k−x t<br />

h n<br />

)yk ∗ <strong>and</strong> y∗ k = ε k+ 1 2 m(2) (x kt )<br />

(x k −x t ) 2 with x kt = λx t +(1−λ)x k for some λ ∈ (0, 1). Hence, I n = I 1n +I 2n +I 3n ,<br />

where<br />

I 1n = 1<br />

n 2 h n<br />

n∑<br />

n∑<br />

K<br />

t=1 k=1<br />

( )<br />

xk − x t<br />

h n<br />

1<br />

ε k f(x t ,y t )<br />

g(x t )


396 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

I 2n = h n<br />

2n 2<br />

I 3n = 1 n<br />

n∑<br />

t=1 k=1<br />

n∑<br />

( )( )<br />

xk − x t xk − x 2<br />

t<br />

K<br />

m (2) 1<br />

(x kt )f (x t ,y t )<br />

h n g(x t )<br />

h n<br />

n∑<br />

w(x t )f (x t ,y t ).<br />

t=1<br />

We treat each term separately. Observe that I 1n can be written as<br />

I 1n = 1 n∑ n∑<br />

( ( )<br />

1 xk − x t<br />

1<br />

K ε<br />

2n 2<br />

k f(x t ,y t )<br />

h<br />

t=1 k=1 n h n g(x t )<br />

+ 1 ( )<br />

)<br />

xt − x k<br />

1<br />

K ε t f(x k ,y k )<br />

h n h n g(x k )<br />

⎛ ⎞<br />

= 1 n∑ n∑<br />

⎝ ∑ h<br />

2n 2<br />

tk<br />

⎠ = 1 n∑ n∑<br />

ψ<br />

2n 2<br />

n (Z t ,Z k ) = 1 2 v n,<br />

t=1 k=1 (2)<br />

t=1 k=1<br />

where v n is a V statistic with Z t = (x t ,y t ) <strong>and</strong> ψ n (Z t ,Z k ) a symmetric kernel.<br />

Hence, by Corollary 1, if E ( ψn 2(Z t,Z k ) ) = o(n) for all t,k <strong>the</strong>n √ n(v n −û n ) =<br />

o p (1). First, <str<strong>on</strong>g>note</str<strong>on</strong>g> that for t ≠ k 1 n E ( ψn 2(Z t,Z k ) ) = 2 n E(h2 tk ) + 2 n E(h tkh kt ) <strong>and</strong><br />

s<strong>in</strong>ce E(εk 2|x 1,... ,x n ) = σ 2 (x k ), we have from <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao<br />

that<br />

1<br />

n E(h2 tk ) = 1 E<br />

nh 2 n<br />

( ( )<br />

K 2 xk −x t<br />

h n<br />

)<br />

σ 2 (x k )E(f 2 1<br />

(x t ,y t )|x 1 ,... ,x n ) → 0,<br />

g 2 (x t )<br />

provided that nh n → ∞. Similarly it can be shown that E( 1 n h tkh kt ) = o(1)<br />

<strong>and</strong> <strong>the</strong>refore 1 n E(ψ2 n (Z t,Z k )) = o(1). Fort = k it suffices to verify that 4K2 (0)<br />

(<br />

)<br />

nh 2 n<br />

E σ 2 (x t ) f 2 (x t ,y t )<br />

g 2 (x t<br />

= o(1), but this follows directly given our assumpti<strong>on</strong>s. Now,<br />

)<br />

s<strong>in</strong>ce E(ε t |x 1 ,... ,x n ) = 0, θ n = E (ψ n (Z t ,Z k )) = 0 <strong>and</strong> û n = 2 ∑ n<br />

n t=1 ψ 1n(Z t ).<br />

Therefore,<br />

ψ 1n (Z t ) = 1 ∫ ∫ ( )<br />

xk − x t<br />

ε t K f(x k ,y k )g(y k |x k )dx k dy k<br />

h n h<br />

∫<br />

n<br />

= ε t f(x t , y)g(y|x t )dy + o(1),<br />

where g(y|x) = q(y,x)<br />

g(x) . C<strong>on</strong>sequently,<br />

I 1n = 1 n∑<br />

∫<br />

ε t f(x t , y)g(y|x t )dy + o p (n − 1 2 ) (10)<br />

n<br />

t=1<br />

given that 1 n<br />

∑ n<br />

t=1 ε t = O p (n − 1 2 ). Now, <str<strong>on</strong>g>note</str<strong>on</strong>g> that<br />

I 2n = 1<br />

4n 2<br />

n∑<br />

n∑ ∑<br />

t=1 k=1 (2)<br />

h tk = 1<br />

4n 2<br />

n∑<br />

t=1 k=1<br />

n∑<br />

ψ n (Z t ,Z k ) = 1 4 v n,


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 397<br />

where h tk = h n K( x k−x t<br />

h n<br />

)( x k−x t<br />

h n<br />

) 2 m (2) (x kt )f (x t ,y t ) 1<br />

g(x t ) . By Corollary 1 if E(ψ2 n (Z t,<br />

Z k )) = o(n) for all t,k, <strong>the</strong>n √ n(v n −û n ) = o p (1). Hence we focus <strong>on</strong> û n . Note<br />

that, û n = 2 ∑ n<br />

n t=1 ψ 1n(Z t ) − θ n , where ψ 1n (Z t ) = E(h tk |Z t ) + E(h kt |Z t ) <strong>and</strong><br />

E(û n ) = θ n = 2E(h tk ). Hence, by <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao,<br />

) (ûn<br />

E = 2 ∫ ∫ ∫ ( )( )<br />

xk − x t xk − x 2<br />

t<br />

K<br />

m (2) (x<br />

h 2 kt )<br />

n h n h n h n<br />

×f(x t ,y t ) g(x k)<br />

g(x t ) q(x t,y t )dx k dx t dy t<br />

−→ 2σ 2 K E(m(2) (x t )f (x t ,y t )).<br />

In additi<strong>on</strong>, <str<strong>on</strong>g>note</str<strong>on</strong>g> that<br />

( ) 1<br />

V û<br />

h 2 n = 4 nV (E(h<br />

n n 2 h 4 tk |Z t ) + E(h kt |Z t ))<br />

n<br />

= 4 (<br />

E(E(htk |Z<br />

nh 4 t )) 2 +E(E(h kt |Z t )) 2<br />

n<br />

+2E(E(h tk |Z t )E(h kt |Z k ))−θn) 2 .<br />

Under our assumpti<strong>on</strong>s, it is straightforward to show that<br />

1<br />

E(E(h tk |Z t )) 2<br />

nh 4 n<br />

= 1 E<br />

nh 2 n<br />

(<br />

as n →∞. Similarly,<br />

(<br />

as n →∞. Hence, V<br />

f(x t ,y t ) 2 1<br />

g 2 (x t ) E2 (<br />

K<br />

1<br />

nh 4 n<br />

1<br />

h 2 n<br />

(<br />

xk − x t<br />

h n<br />

)(<br />

xk − x t<br />

h n<br />

) 2<br />

m (2) (x kt )|Z t<br />

))<br />

→0<br />

E(E(h kt |Z t )) 2 → 0 <strong>and</strong> 1<br />

nh 4 n<br />

E(E(h tk |Z t )E(h kt |Z t )) → 0<br />

û n<br />

)<br />

→ 0. By Markov’s <strong>in</strong>equality we c<strong>on</strong>clude that<br />

û n = 2h 2 n σ K 2 E(m(2) (x t )f (x t ,y t )) + o p (h 2 n ) <strong>and</strong> by Corollary 1,<br />

I 2n = 1 2 h2 n σ K 2 E(m(2) (x t )f (x t ,y t )) + o p (h 2 n ). (11)<br />

F<strong>in</strong>ally, verificati<strong>on</strong> that E(ψn 2(Z t,Z k )) = 2E(h 2 tk ) + 2E(h tkh kt ) = o(n) for all<br />

t ≠ k follows directly from our assumpti<strong>on</strong>s <strong>and</strong> <strong>use</strong> <strong>of</strong> <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao,<br />

<strong>and</strong> for <strong>the</strong> case where t = k we have that E(ψn 2(Z t,Z k )) = 0. From<br />

Mart<strong>in</strong>s-Filho <strong>and</strong> Yao’s (2003) Lemma 2 <strong>and</strong> Theorem 1<br />

|w(x t )|=<br />

∣ ˆm(x t) − m(x t ) −<br />

1<br />

nh n g(x t )<br />

n∑<br />

K<br />

k=1<br />

( )<br />

xk − x t<br />

h n<br />

y ∗ k<br />

∣ = O p(R n,2 (x t ))<br />

<strong>and</strong> sup xt ∈G |R n,2(x t )|=o p (h 2 n ), hence sup x t ∈G |w(x t)| =o p (h 2 n ). As a result we<br />

have that |I 3n |≤o p (h 2 n ) 1 ∑ n<br />

n t=1 |f(x t,y t )|=o p (h 2 n ), s<strong>in</strong>ce E(f 2 (x t ,y t )) < ∞.<br />

Therefore, I 3n = 1 ∑ n<br />

n t=1 w(x t)f (x t ,y t ) = o p (h 2 n ). Comb<strong>in</strong><strong>in</strong>g this result with


398 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

(10) <strong>and</strong> (11) proves part (a).<br />

For part (b) <str<strong>on</strong>g>note</str<strong>on</strong>g> that<br />

J n = 1 n∑ n∑ n∑<br />

( )<br />

1<br />

f(x<br />

n 3 h 2 t ,y t )<br />

n<br />

g 2 (x<br />

t=1 k=1 l=1<br />

t ) K xk − x t<br />

K<br />

h n<br />

n∑ n∑ n∑<br />

(<br />

+ h2 n<br />

1<br />

4n 3<br />

g 2 (x t ) f(x xk − x t<br />

t,y t )K<br />

t=1 k=1 l=1<br />

(<br />

xl − x t<br />

×K<br />

h n<br />

h n<br />

)(<br />

xl − x t<br />

h n<br />

) 2<br />

m (2) (x kt )m (2) (x lt )<br />

+ 1 n∑<br />

w 2 (x t )f (x t ,y t )<br />

n<br />

t=1<br />

+ 1 n∑ n∑ n∑<br />

( )<br />

1<br />

n 3 g 2 (x<br />

t=1 k=1 l=1 t ) f(x xk − x t<br />

t,y t )K<br />

h n<br />

( )( )<br />

xl − x t xl − x 2<br />

t<br />

×K<br />

m (2) (x lt )ε k<br />

+ 2<br />

n 2 h n<br />

n∑<br />

h n<br />

t=1 k=1<br />

h n<br />

n∑ 1<br />

w(x t )<br />

g(x t ) f(x t,y t )K<br />

+ h n∑ n∑<br />

n<br />

1<br />

w(x<br />

n 2<br />

t )<br />

g(x<br />

t=1 k=1<br />

t ) f(x t,y t )K<br />

= J 1n + J 2n + J 3n + J 4n + J 5n + J 6n .<br />

We exam<strong>in</strong>e each term separately.<br />

J 1n = 1 1<br />

n∑ n∑ n∑ ∑<br />

h<br />

6 n 3<br />

tkl = 1 1<br />

6 n 3<br />

t=1 k=1 l=1 (3)<br />

n∑<br />

( )<br />

xk − x t<br />

ε k<br />

h n<br />

(<br />

xk − x t<br />

n∑<br />

h n<br />

t=1 k=1 l=1<br />

1<br />

( )<br />

xl − x t<br />

ε k ε l<br />

h n<br />

)(<br />

xk − x t<br />

h n<br />

) 2<br />

)(<br />

xk − x t<br />

h n<br />

) 2<br />

m (2) (x kt )<br />

n∑<br />

ψ n (Z t ,Z k ,Z l ) = 1 6 v n,<br />

where h tkl = 1 K( x k−x t<br />

h 2 n h n<br />

)K( x l−x t<br />

h n<br />

)ε k ε l f(x t ,y t )<br />

g 2 (x t ) . By Corollary 1, if E(ψ2 n (Z t,<br />

Z k ,Z l )) = o(n) for all t,k,l <strong>the</strong>n √ n(v n −û n ) = o p (1). S<strong>in</strong>ce ψ 1n (Z t ) = 0<br />

<strong>and</strong> θ n = 0, we have that û n = 0 <strong>and</strong> <strong>the</strong>refore √ nI 1n = o p (1). To verify that<br />

1<br />

n E(ψ2 n (Z t,Z k ,Z l )) = o(1) for t ≠ k ≠ l we <str<strong>on</strong>g>note</str<strong>on</strong>g> that 1 n E(ψ2 n (Z t,Z k ,Z l )) =<br />

4<br />

n (3E(h2 tkl ) + 2E(h tklh ktl ) + 2E(h tkl h ltk ) + 2E(h ktl h ltk )). Under A1–A4, repeated<br />

applicati<strong>on</strong> <strong>of</strong> <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao to each <strong>of</strong> <strong>the</strong>se expectati<strong>on</strong>s shows<br />

that each approaches zero as n →∞. Also, if t = k but t ≠ l we have that<br />

E(ψn 2(Z t,Z t ,Z l )) = o(n) <strong>and</strong> when t = k = l, E(ψn 2(Z t,Z t ,Z t )) = o(n 2 )<br />

directly from our assumpti<strong>on</strong>s provided nh 3 n → 0.<br />

J 2n = 1<br />

24n 3<br />

n∑<br />

n∑<br />

n∑ ∑<br />

t=1 k=1 l=1 (3)<br />

)( x k−x t<br />

h n<br />

h tkl = 1<br />

24n 3<br />

n∑<br />

) 2 K( x l−x t<br />

h n<br />

)( x l−x t<br />

n∑<br />

n∑<br />

t=1 k=1 l=1<br />

ψ n (Z t ,Z k ,Z l ) = 1<br />

24 v n,<br />

where h tkl = h 2 n K(x k−x t<br />

h n<br />

By Corollary 1 √ n(v n −û n ) = o p (1) provided that E(ψ n (Z t ,Z k ,Z l )) = o(n)<br />

h n<br />

) 2 m (2) (x kt )m (2) (x lt )<br />

g 2 (x t ) f(x t,y t ).<br />

1


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 399<br />

for all t,k,l. As above, verificati<strong>on</strong> <strong>of</strong> this c<strong>on</strong>diti<strong>on</strong> is straightforward given our<br />

assumpti<strong>on</strong>s. Note that û n = 3 ∑ n<br />

n t=1 ψ 1n(Z t ) − 2θ n with E(û n ) = θ n . Hence,<br />

(<br />

1<br />

E(û<br />

h 4 n ) = 6 ( )( )<br />

xk −x t xk − x 2 ( )( )<br />

t xl −x t xl −x 2<br />

t<br />

E K<br />

K<br />

m (2) (x<br />

n h 2 kt )<br />

n h n h n h n h n<br />

)<br />

× m (2) 1<br />

(x lt )<br />

g 2 (x t ) f(x t,y t )<br />

<strong>and</strong> by <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao we have, 1 E(û<br />

h 4 n )→6σK 4 E((m(2) (x t )) 2 f(x t ,<br />

n<br />

y t )) <strong>and</strong> V( 1 û<br />

h 4 n ) → 0. By Markov’s <strong>in</strong>equality we c<strong>on</strong>clude that J 2n = 1<br />

n 4 h4 n σ K<br />

4<br />

E((m (2) (x t )) 2 f(x t ,y t )) + o p (h 4 n ) + o p(n −1/2 ). J 3n = 1 ∑ n<br />

n t=1 w2 (x t )f (x t ,y t ) =<br />

o p (h 4 n ) follows directly from <strong>the</strong> analysis <strong>of</strong> <strong>the</strong> I 3n <strong>in</strong> part (a). Now,<br />

J 4n = 1<br />

6n 3<br />

n∑<br />

n∑<br />

where h tkl = K( x k−x t<br />

h n<br />

n∑ ∑<br />

t=1 k=1 l=1 (3)<br />

h tkl = 1<br />

6n 3<br />

n∑<br />

n∑<br />

t=1 k=1 l=1<br />

n∑<br />

ψ n (Z t ,Z k ,Z l ) = 1 6 v n,<br />

)K( x l−x t<br />

h n<br />

)( x l−x t<br />

h n<br />

) 2 m (2) 1<br />

(x lt )ε k g 2 (x t ) f(x t,y t ). Once aga<strong>in</strong> given<br />

our assumpti<strong>on</strong>s it can be verified that E(ψn 2(Z t,Z k ,Z l )) = o(n) for all t,k,l,<br />

<strong>the</strong>refore we have √ n(v n −û n ) = o p (1), where û n = 3 ∑ n<br />

n t=1 ψ 1n(Z t ) − 2θ n .<br />

Given that <strong>in</strong> this case θ n = 0, we have<br />

û n = 6 n<br />

(<br />

n∑<br />

ε t E K<br />

t=1<br />

( )<br />

xt −x k<br />

K<br />

h n<br />

( )( ) )<br />

xl −x k xl −x 2<br />

k<br />

m (2) 1<br />

(x lk )<br />

h n g 2 (x k ) f(x k,y k )|Z t ,<br />

h n<br />

Under A1–A4, <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa Rao gives<br />

( ( ) ( )( ) )<br />

1 xt − x k xl − x k xl − x 2<br />

k<br />

E K K<br />

m (2) 1<br />

(x<br />

h 2 lk )<br />

n h n h n h n g 2 (x k ) f(x k,y k )|Z t →<br />

σK 2 m(2) (x t )E(f (x t ,y)|x 1 ,... ,x n ).<br />

∑<br />

Hence, û n = 6h2 n n<br />

n t=1 ε tm (2) (x t )σK 2 E(f(x t,y)|x 1 ,... ,x n )+o p (n − 1 2 ), given that<br />

6<br />

∑ n<br />

n t=1 ε t = O p (n − 1 2 ). C<strong>on</strong>sequently, J 4n = σ K 2 ∑ h2 n n<br />

n t=1 ε tm (2) (x t )E(f (x t ,y)<br />

|x 1 ,... ,x n ) + o p (n − 1 2 ).<br />

|J 5n | = ∣ 2 ∑ (<br />

n<br />

n t=1 w(x 1<br />

∑ ) n<br />

t)<br />

nh n g(x t ) k=1 K(x k−x t<br />

h n<br />

)ε k f(x t ,y t ) ∣ <strong>and</strong> from<br />

Mart<strong>in</strong>s-Filho <strong>and</strong> Yao (2003) Theorem 1 we have sup xt ∈G |w(x t)|<br />

= o p (h 2 n ) <strong>and</strong> sup 1<br />

∑<br />

∣<br />

n<br />

x t ∈G ∣<br />

nh n g(x t ) k=1 K(x k−x t ∣∣<br />

h n<br />

)ε k = op (h n ). Hence, |J 5n |<br />

≤ 2 ∑ n<br />

n t=1 |f(x t,y t )|o p (h 3 n ) = O p(1)o p (h 3 n ) = o p(h 3 n ) s<strong>in</strong>ce 2 ∑ n<br />

n t=1 |f(x t,y t )|<br />

= O p (1), which gives J 5n = o p (h 3 n ). F<strong>in</strong>ally, from Mart<strong>in</strong>s-Filho <strong>and</strong> Yao (2003)<br />

Theorem 1 we have<br />

h n<br />

n∑<br />

( )( ) xk − x t xk − x 2 t<br />

sup<br />

K<br />

m (2) (x kt )<br />

∣ng(x t )<br />

h n ∣ = O p(h 2 n ) (12)<br />

x t ∈G<br />

k=1<br />

h n


400 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

<strong>and</strong> |J 6n |≤ 1 ∑ n<br />

n t=1 |f(x t,y t )|o p (h 2 n )O p(h 2 n ) = O p(1)o p (h 4 n ) = o p(h 4 n ). Comb<strong>in</strong><strong>in</strong>g<br />

<strong>the</strong> orders <strong>of</strong> all six terms gives <strong>the</strong> desired result.<br />

∑ n<br />

k=1 K (<br />

xk −x t<br />

1<br />

2nh n g(x t )<br />

∑ n<br />

k=1 K (<br />

xk −x t<br />

h n<br />

)<br />

m (2) (x kt )<br />

)<br />

(c) Let a tn = 1<br />

nh n g(x t )<br />

h n<br />

ε k , b tn =<br />

(x k − x t ) 2 , c tn = w(x t ). Then ( ˆm(x t ) − m(x t )) 3 = atn 3 + b3 tn + c3 tn + 3a tnbtn 2 +<br />

3a tn ctn 2 +3a2 tn b tn+3atn 2 c tn+3btn 2 c tn+3b tn ctn 2 +6a tnb tn c tn <strong>and</strong> n −1 ∑ n<br />

t=1 ( ˆm(x t)−<br />

m(x t )) 3 f(x t ,y t ) = ∑ 10<br />

i=1 L <strong>in</strong>, where L 1n = n −1 ∑ n<br />

t=1 a3 tn f(x t,y t ),... ,L 10n =<br />

n −1 ∑ n<br />

t=1 6a tnb tn c tn f(x t ,y t ). S<strong>in</strong>ce<br />

∣ sup 1<br />

n∑<br />

( ) ∣<br />

xk − x ∣∣∣∣<br />

t<br />

x t<br />

K ε k = o p (h n ),<br />

nh n g(x t ) h n<br />

k=1<br />

<strong>the</strong>n |L 1n |≤o p (h 3 n ) × 1 ∑ n<br />

n t=1 |f(x t,y t )|=o p (h 3 n ). By (12), |L 2n| ≤O p (h<br />

∑ 6 n )n−1<br />

n<br />

t=1 |f(x t,y t )|=O p (h 6 n ) s<strong>in</strong>ce E(f 2 (x t ,y t )) < ∞. |L 3n |≤o p (h 6 n ) follows<br />

directly from sup xt ∈G|w(x t )|=o p (h 2 n ).<br />

|L 4n |≤ 3 n∑<br />

1<br />

n∑<br />

( ) ∣<br />

xk − x ∣∣∣∣<br />

t<br />

K ε k<br />

n ∣nh n g(x t ) h n<br />

t=1<br />

(<br />

×<br />

∣<br />

1<br />

2nh n g(x t )<br />

k=1<br />

n∑<br />

K<br />

k=1<br />

×|f(x t ,y t )|≤o p (h n )O p (h 4 n ) 3 n<br />

( )<br />

xk − x t<br />

(x k − x t ) 2 m (2) (x kt )<br />

h n<br />

) 2<br />

∣ ∣∣∣∣∣<br />

n∑<br />

|f(x t ,y t )|=o p (h 5 n )<br />

from (12) <strong>and</strong> <strong>the</strong> analysis <strong>of</strong> J 5n <strong>in</strong> part (b). Similarly, we obta<strong>in</strong> |L 5n |, |L 10n |≤<br />

o p (h 5 n ), |L 6n|, |L 7n |≤o p (h 4 n ) <strong>and</strong> |L 8n|, |L 9n |≤o p (h 6 n ). Hence, we have L n =<br />

o p (h 3 n ).<br />

⊓⊔<br />

We now <strong>use</strong> <strong>the</strong> results <strong>in</strong> <strong>the</strong> Lemma to establish <strong>the</strong> asymptotic distributi<strong>on</strong> <strong>of</strong><br />

several <strong>statistics</strong> <strong>of</strong> <strong>in</strong>terest.<br />

t=1<br />

3.1 Estimat<strong>in</strong>g c<strong>on</strong>diti<strong>on</strong>al variance<br />

C<strong>on</strong>sider first a special case <strong>of</strong> <strong>the</strong> model (5), where ε t |x t ∼ N(0,σ 2 ). Then a<br />

natural estimator <strong>of</strong> σ 2 is<br />

ˆσ 2 = 1 n∑<br />

(y t −ˆm(x t )) 2 = 1 n∑ (<br />

ε<br />

2<br />

n<br />

n t +2(m(x t )−ˆm(x t ))ε t +(m(x t )−ˆm(x t )) 2) .<br />

t=1<br />

t=1<br />

The difficulty <strong>in</strong> deal<strong>in</strong>g with such expressi<strong>on</strong>s lies <strong>in</strong> <strong>the</strong> average terms <strong>in</strong>volv<strong>in</strong>g<br />

(m(x t ) −ˆm(x t )) <strong>and</strong> (m(x t ) −ˆm(x t )) 2 , s<strong>in</strong>ce <strong>the</strong> first term is an average <strong>of</strong> an<br />

i.i.d. sequence. By us<strong>in</strong>g Corollary 1 <strong>and</strong> Lemma 1 we have a c<strong>on</strong>venient way to<br />

establish <strong>the</strong>ir asymptotic properties. This is shown <strong>in</strong> <strong>the</strong> next <strong>the</strong>orem.<br />

Theorem 2 Assume that <strong>in</strong> model (5), ε t |x t ∼ N(0,σ 2 ) <strong>and</strong> E(yt 4 )


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 401<br />

Pro<strong>of</strong> Note that ˆσ 2 = 1 ∑ n<br />

(<br />

n t=1 ε<br />

2<br />

t + 2(m(x t ) −ˆm(x t ))ε t + (m(x t ) −ˆm(x t )) 2) .<br />

Now, lett<strong>in</strong>g f(x t ,y t )=−2ε t <strong>in</strong> part (a) <strong>of</strong> Lemma 1 we have 1 ∑ n<br />

n t=1 ε ∫<br />

t f(xt ,y t )<br />

g(y t |x t )dy t = 0, s<strong>in</strong>ce E(ε t |x 1 ,... ,x n ) = 0 <strong>and</strong> 1 2 h2 n σ K 2 E ( m (2) (x t )f (x t ,y t ) ) = 0.<br />

Hence, 1 ∑ n<br />

n t=1 2(m(x t) −ˆm(x t ))ε t = o p (n − 1 2 ) + o p (h 2 n ). By part (b) <strong>of</strong> Lemma 1<br />

with f(x t ,y t ) = 1wehave<br />

1<br />

4 h4 n σ K 4 E((m(2) (x t )) 2 f(x t ,y t )) = O p (h 4 n )<br />

<strong>and</strong><br />

σK 2 n∑ h2 n<br />

ε t m (2) (x t )E(f (x t ,y)|x 1 ,... ,x n )<br />

n<br />

t=1<br />

= σ K 2 n∑ h2 n<br />

ε t m (2) (x t ) = O p (n − 1 2 h<br />

2<br />

n<br />

n ) = o p (n − 1 2 )<br />

t=1<br />

by <strong>the</strong> central limit <strong>the</strong>orem for i.i.d. sequences. Hence, 1 ∑ n<br />

n t=1 (m(x t)− ˆm(x t )) 2 =<br />

o p (n − 1 2 ) + o p (h 3 n ) provided E(y4 t )


402 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

(b) If E(y 8 t )


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 403<br />

S<strong>in</strong>ce E(yt 6)


404 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

Hence,<br />

b 3n = n −1<br />

n∑<br />

t=1<br />

ε 4 t − µ 4 − 4µ 3<br />

n<br />

n∑<br />

ε t − 2σK 2 h2 n µ 3E ( m (2) (x t ) )<br />

+o p (n −1/2 ) + o p (h 2 n ). (23)<br />

µ<br />

Comb<strong>in</strong><strong>in</strong>g (20) <strong>and</strong> (23), we write b 3n − b 4<br />

4n (µ 2<br />

= n −1 ∑ n<br />

) 2 t=1 W t + A 2n where<br />

W t = εt 4 + µ 4 − 4ε t µ 3 − 2 µ 4<br />

µ 2<br />

εt 2 <strong>and</strong> A 2n =−2h 2 n σ K 2 µ 3E(m (2) (x t )) + o p (n −1/2 ) +<br />

o p (h 2 n ). Note that E(W t) = 0 <strong>and</strong> by <strong>the</strong> c r -<strong>in</strong>equality, V(W t ) = E(Wt 2)


V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 405<br />

√ n( ˆη 2 −η 2 −b 2n ) → d N(0,V(ξ t )) where ξ t = 1 (<br />

ε<br />

2<br />

V(y t ) t −(1−η 2 )(y t −E(y t )) 2)<br />

<strong>and</strong> b 2n = o p (h 2 n ).<br />

(<br />

)<br />

Pro<strong>of</strong> ˆη 2 − η 2 1<br />

E(y<br />

=− 1<br />

∑ n β<br />

n t=1 (y t −ȳ) 2 1 − β t −m(x t )) 2<br />

2 V(y t<br />

where β<br />

)<br />

1 = 1 ∑ n<br />

n t=1 (y t −<br />

ˆm(x t )) 2 − E(y t − m(x t )) 2 , β 2 = 1 ∑ n<br />

n t=1 (y t − ȳ) 2 − V(y t ). First, <str<strong>on</strong>g>note</str<strong>on</strong>g> that by <strong>the</strong><br />

law <strong>of</strong> large numbers 1 ∑ n<br />

n t=1 y2 p<br />

t → E(yt 2 ) <strong>and</strong> ȳ2 → p E(y t ) 2 , hence<br />

(<br />

) −1<br />

1<br />

n∑<br />

(y t − ȳ) 2 p<br />

→ V(yt ) −1 . (26)<br />

n<br />

t=1<br />

1<br />

For β 2 , ȳ 2 − (E(y t )) 2 = 1 ∑ n ∑ n<br />

2 n 2 t=1 k=1 (y ty k + y k y t ) − E(m(x t )) 2 = 1 2 v n −<br />

(E(m(x t ))) 2 . By Corollary 1, s<strong>in</strong>ce 1 n E(ψ2 n (y t,y k )) = 4 n E((y2 t ))2 = o(1) for all<br />

t,k, √ n(v n −û n ) = o p (1) where û n = 2 ∑ n<br />

n t=1 (ψ 1n(Z t ))−θ n = 4 ∑ n<br />

n t=1 y tE(m(x t ))−<br />

2(E(m(x t ))) 2 . Hence, ȳ 2 − (E(y t )) 2 = 2 ∑ n<br />

n t=1 y tE(m(x t )) − 2(E(m(x t ))) 2 +<br />

o p (n − 1 2 ), <strong>and</strong><br />

E(y t − m(x t )) 2<br />

β 2 = 1 n∑<br />

{ E(σ 2 (x t ))<br />

(yt 2 − E(yt 2 V(y t ) n V(y<br />

t=1<br />

t )<br />

))<br />

}<br />

−2 E(σ2 (x t ))E(m(x t ))<br />

(y t − E(m(x t ))) + o p (n − 1 2 ).<br />

V(y t )<br />

For β 1 , 1 ∑ n<br />

n t=1 (y t−ˆm(x t )) 2 = 1 ∑ n<br />

n t=1(<br />

ε<br />

2<br />

t + 2(m(x t )−ˆm(x t ))ε t +(m(x t )−ˆm(x t )) 2) .<br />

Similarly to <strong>the</strong> pro<strong>of</strong> <strong>of</strong> Theorem 2, we <strong>use</strong> part (a) <strong>of</strong> Lemma 1 with f(x t ,y t )<br />

=−2ε t to obta<strong>in</strong> 1 ∑ n<br />

n t=1 2(m(x t)− ˆm(x t ))ε t = o p (n − 1 2 )+o p (h 2 n ). By part (b) with<br />

f(x t ,y t ) = 1 we obta<strong>in</strong> 1 ∑ n<br />

n t=1 (m(x t)− ˆm(x t )) 2 = o p (n − 1 2 )+o p (h 3 n ). Therefore,<br />

1<br />

∑ n<br />

n t=1 (y t −ˆm(x t )) 2 = 1 ∑ n<br />

n t=1 ε2 t + o p (n − 1 2 ) + o p (h 2 n ). Hence,<br />

β 1 − β 2<br />

E(y t − m(x t )) 2<br />

V(y t )<br />

= 1 n<br />

n∑<br />

t=1<br />

(ε 2 t − E(σ 2 (x t )) − E(σ2 (x t ))<br />

V(y t )<br />

+ 2E(σ2 (x t ))E(m(x t ))<br />

(y t − E(m(x t )))<br />

V(y t )<br />

+o p (n − 1 2 ) + op (h 2 n )<br />

= 1 n∑<br />

ζ t + o p (n − 1 2 ) + op (h 2 n<br />

n<br />

).<br />

t=1<br />

(y 2 t − E(y 2 t ))<br />

S<strong>in</strong>ce E(ζ t ) = 0,V(ζ t ) = V(εt 2 − (1 − η 2 )(y t − E(y t )) 2 )


406 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />

4 C<strong>on</strong>clusi<strong>on</strong><br />

We have established <strong>the</strong> √ n asymptotic equivalence between V <strong>and</strong> U <strong>statistics</strong><br />

when <strong>the</strong>ir kernels depend <strong>on</strong> n, <strong>the</strong> sample size. We comb<strong>in</strong>e our result with a<br />

result <strong>of</strong> Lee (1988) to obta<strong>in</strong> <strong>the</strong> √ n asymptotic equivalence between V <strong>statistics</strong><br />

<strong>and</strong> U <strong>statistics</strong> projecti<strong>on</strong>s. We provide a number <strong>of</strong> examples <strong>and</strong> illustrati<strong>on</strong>s<br />

<strong>on</strong> how our results can be <strong>use</strong>d <strong>in</strong> n<strong>on</strong>parametric kernel estimati<strong>on</strong>. The list <strong>of</strong><br />

examples is obviously not exhaustive as our results can be <strong>use</strong>d <strong>in</strong> much broader<br />

c<strong>on</strong>texts.<br />

Acknowledgements<br />

We would like to thank Pr<strong>of</strong>essor Katuomi Hirano <strong>and</strong> two an<strong>on</strong>ymous referees<br />

for comments <strong>and</strong> suggesti<strong>on</strong>s.<br />

References<br />

Ahn, H., Powell, J. L. (1993). Semiparametric estimati<strong>on</strong> <strong>of</strong> censored selecti<strong>on</strong> <strong>models</strong> with a<br />

n<strong>on</strong>parametric selecti<strong>on</strong> mechanism. Journal <strong>of</strong> Ec<strong>on</strong>ometrics, 58, 3–29.<br />

Bönner, N., Kirschner, H.-P. (1977). Note <strong>on</strong> c<strong>on</strong>diti<strong>on</strong>s for weak c<strong>on</strong>vergence <strong>of</strong> V<strong>on</strong> Mises’<br />

differentiable statistical functi<strong>on</strong>s. The Annals <strong>of</strong> Statistics, 5, 405–407.<br />

D’Amico, S. (2003). Density estimati<strong>on</strong> <strong>and</strong> comb<strong>in</strong>ati<strong>on</strong> under model ambiguity via a pilot n<strong>on</strong>parametric<br />

estimate: an applicati<strong>on</strong> to stock returns.Work<strong>in</strong>g paper, Department <strong>of</strong> Ec<strong>on</strong>omics,<br />

Columbia University (http://www.columbia.edu/∼sd445/).<br />

Doksum, K., Samarov, A. (1995). N<strong>on</strong>parametric estimati<strong>on</strong> <strong>of</strong> global functi<strong>on</strong>als <strong>and</strong> a measure<br />

<strong>of</strong> <strong>the</strong> explanatory power <strong>of</strong> covariates <strong>in</strong> regressi<strong>on</strong>. The Annals <strong>of</strong> Statistics, 23(5),<br />

1443–1473.<br />

Fan, J. (1992). Design adaptive n<strong>on</strong>parametric regressi<strong>on</strong>. Journal <strong>of</strong> <strong>the</strong> American Statistical<br />

Associati<strong>on</strong>, 87, 998–1004.<br />

Fan, J.,Yao, Q. (1998). Efficient estimati<strong>on</strong> <strong>of</strong> c<strong>on</strong>diti<strong>on</strong>al variance functi<strong>on</strong>s <strong>in</strong> stochastic regressi<strong>on</strong>.<br />

Biometrika, 85, 645–660.<br />

Fan, Y., Li, Q. (1996). C<strong>on</strong>sistent model specificati<strong>on</strong> test: omitted variables <strong>and</strong> semiparametric<br />

functi<strong>on</strong>al forms. Ec<strong>on</strong>ometrica, 64(4), 865–890.<br />

Hoeffd<strong>in</strong>g, W. (1948). A class <strong>of</strong> <strong>statistics</strong> with asymptotically normal distributi<strong>on</strong>. Annals <strong>of</strong><br />

Ma<strong>the</strong>matical Statistics, 19, 293–325.<br />

Hoeffd<strong>in</strong>g, W. (1961). The str<strong>on</strong>g law <strong>of</strong> large numbers for U <strong>statistics</strong>, Institute <strong>of</strong> Statistics,<br />

Mimeo Series 302, University <strong>of</strong> North Carol<strong>in</strong>a, Chapel Hill, NC.<br />

Kemp, G. (2000). Semiparametric estimati<strong>on</strong> <strong>of</strong> a logit model. Work<strong>in</strong>g paper, Department <strong>of</strong><br />

Ec<strong>on</strong>omics, University <strong>of</strong> Essex (http://www.essex.ac.uk /∼ kempgcr/ papers/logitmx.pdf).<br />

Lee, A. J. (1990). U-Statistics. New York: Marcel Dekker.<br />

Lee, B. (1988). N<strong>on</strong>parametric tests us<strong>in</strong>g a kernel estimati<strong>on</strong> method, Doctoral dissertati<strong>on</strong>,<br />

Department <strong>of</strong> Ec<strong>on</strong>omics, University <strong>of</strong> Wisc<strong>on</strong>s<strong>in</strong>, Madis<strong>on</strong>, WI.<br />

Mart<strong>in</strong>s-Filho, C., Yao, F. (2003). A n<strong>on</strong>parametric model <strong>of</strong> fr<strong>on</strong>tiers, work<strong>in</strong>g paper, Department<br />

<strong>of</strong> Ec<strong>on</strong>omics, Oreg<strong>on</strong> State University (http://oreg<strong>on</strong>state.edu/∼mart<strong>in</strong>sc/mart<strong>in</strong>s-filho-yao(03).pdf).<br />

Powell, J., Stock, J., Stoker, T. (1989). Semiparametric estimati<strong>on</strong> <strong>of</strong> <strong>in</strong>dex coefficients. Ec<strong>on</strong>ometrica,<br />

57, 1403–1430.<br />

Prakasa-Rao, B. L. S. (1983). N<strong>on</strong>parametric functi<strong>on</strong>al estimati<strong>on</strong>. Orl<strong>and</strong>o, FL: Academic<br />

Serfl<strong>in</strong>g, R. J. (1980). Approximati<strong>on</strong> <strong>the</strong>orems <strong>of</strong> ma<strong>the</strong>matical <strong>statistics</strong>. New York: Wiley.<br />

Yao, Q., T<strong>on</strong>g, H. (2000). N<strong>on</strong>parametric estimati<strong>on</strong> <strong>of</strong> ratios <strong>of</strong> noise to signal <strong>in</strong> stochastic<br />

regressi<strong>on</strong>. Statistica S<strong>in</strong>ica, 10, 751–770.<br />

Zheng, J. X. (1998). A c<strong>on</strong>sistent n<strong>on</strong>parametric test <strong>of</strong> parametric regressi<strong>on</strong> <strong>models</strong> under<br />

c<strong>on</strong>diti<strong>on</strong>al quantile restricti<strong>on</strong>s. Ec<strong>on</strong>ometric Theory, 14, 123–138.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!