A note on the use of V and U statistics in nonparametric models of ...
A note on the use of V and U statistics in nonparametric models of ...
A note on the use of V and U statistics in nonparametric models of ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
AISM (2006) 58: 389–406<br />
DOI 10.1007/s10463-006-0034-z<br />
Carlos Mart<strong>in</strong>s-Filho · Feng Yao<br />
A <str<strong>on</strong>g>note</str<strong>on</strong>g> <strong>on</strong> <strong>the</strong> <strong>use</strong> <strong>of</strong> V <strong>and</strong> U <strong>statistics</strong><br />
<strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> <strong>of</strong> regressi<strong>on</strong><br />
Received: 8 September 2004 / Revised: 12 April 2005 / Published <strong>on</strong>l<strong>in</strong>e: 1 June 2006<br />
© The Institute <strong>of</strong> Statistical Ma<strong>the</strong>matics, Tokyo 2006<br />
Abstract We establish <strong>the</strong> √ n asymptotic equivalence <strong>of</strong> V <strong>and</strong> U <strong>statistics</strong> when<br />
<strong>the</strong> statistic’s kernel depends <strong>on</strong> n. Comb<strong>in</strong>ed with a lemma <strong>of</strong> B. Lee this result<br />
provides c<strong>on</strong>diti<strong>on</strong>s under which U <strong>statistics</strong> projecti<strong>on</strong>s <strong>and</strong> V <strong>statistics</strong> are √ n<br />
asymptotically equivalent. The <strong>use</strong> <strong>of</strong> this equivalence <strong>in</strong> n<strong>on</strong>parametric regressi<strong>on</strong><br />
<strong>models</strong> is illustrated with several examples; <strong>the</strong> estimati<strong>on</strong> <strong>of</strong> c<strong>on</strong>diti<strong>on</strong>al variances,<br />
skewness, kurtosis <strong>and</strong> <strong>the</strong> c<strong>on</strong>structi<strong>on</strong> <strong>of</strong> a n<strong>on</strong>parametric R-squared measure.<br />
Keywords U <strong>statistics</strong> · V <strong>statistics</strong> · local l<strong>in</strong>ear estimati<strong>on</strong><br />
1 Introducti<strong>on</strong><br />
The <strong>use</strong> <strong>of</strong> U <strong>statistics</strong> (see Hoeffd<strong>in</strong>g 1948), as a means <strong>of</strong> obta<strong>in</strong><strong>in</strong>g asymptotic<br />
properties <strong>of</strong> kernel based estimators <strong>in</strong> semiparametric <strong>and</strong> n<strong>on</strong>parametric<br />
regressi<strong>on</strong> <strong>models</strong> is now comm<strong>on</strong> practice <strong>in</strong> ec<strong>on</strong>ometrics. For example, Powell,<br />
Stock <strong>and</strong> Stoker (1989) <strong>use</strong> U <strong>statistics</strong> to obta<strong>in</strong> <strong>the</strong> asymptotic properties <strong>of</strong> <strong>the</strong>ir<br />
semiparametric <strong>in</strong>dex model estimator. Ahn <strong>and</strong> Powell (1993) <strong>use</strong> U <strong>statistics</strong> to<br />
<strong>in</strong>vestigate <strong>the</strong> properties <strong>of</strong> a two stage semiparametric estimati<strong>on</strong> <strong>of</strong> censored<br />
selecti<strong>on</strong> <strong>models</strong> where <strong>the</strong> first stage estimator <strong>in</strong>volves a n<strong>on</strong>parametric regressi<strong>on</strong><br />
estimator for <strong>the</strong> selecti<strong>on</strong> variable. Fan <strong>and</strong> Li (1996) <strong>use</strong> U <strong>statistics</strong> to study<br />
C. Mart<strong>in</strong>s-Filho (B)<br />
Department <strong>of</strong> Ec<strong>on</strong>omics, Oreg<strong>on</strong> State University,<br />
Ballard Hall 303, Corvallis, OR 97331-3612 USA<br />
E-mail: carlos.mart<strong>in</strong>s@orst.edu<br />
Tel.:+1-541-737-1476<br />
F. Yao<br />
Department <strong>of</strong> Ec<strong>on</strong>omics, University <strong>of</strong> North Dakota,<br />
P.O. Box 8369, Gr<strong>and</strong> Forks, ND 58203 USA<br />
E-mail: feng.yao@mail.bus<strong>in</strong>ess.und.edu<br />
Tel.: +1-701-7773360
390 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
a set <strong>of</strong> specificati<strong>on</strong> tests for n<strong>on</strong>parametric regressi<strong>on</strong> <strong>models</strong> <strong>and</strong> Zheng (1998)<br />
<strong>use</strong>s U-<strong>statistics</strong> to provide a n<strong>on</strong>parametric kernel based test for parametric quantile<br />
regressi<strong>on</strong> <strong>models</strong>. See also Kemp (2000) <strong>and</strong> D’Amico (2003) for more recent<br />
<strong>use</strong>s.<br />
The appeal for <strong>the</strong> <strong>use</strong> <strong>of</strong> U <strong>statistics</strong> derives from <strong>the</strong> fact that <strong>of</strong>ten estimators<br />
or <strong>statistics</strong> <strong>of</strong> <strong>in</strong>terest – T n – are expressed as l<strong>in</strong>ear comb<strong>in</strong>ati<strong>on</strong>s <strong>of</strong> n<strong>on</strong>parametric<br />
regressi<strong>on</strong> estimators. C<strong>on</strong>sider for example a n<strong>on</strong>parametric regressi<strong>on</strong> model<br />
E(Y|X = x) = m(x) with observati<strong>on</strong>s {(y i ,x i )} n i=1 , T n = ∑ n<br />
i=1 c i ˆm(x i ) where<br />
c i ∈Rare n<strong>on</strong>stochastic <strong>and</strong> ˆm(x) is an arbitrary n<strong>on</strong>parametric estimator for<br />
m(x). S<strong>in</strong>ce ˆm(x) can usually be written as ˆm(x) = ∑ n<br />
j=1 w jn(x)y j , we can write<br />
T n =<br />
n∑<br />
i=1<br />
( )<br />
n<br />
c i w <strong>in</strong> (x i )y i + u<br />
2 n = T 1n + T 2n ,<br />
( ) −1<br />
n ∑<br />
where u n =<br />
2 1≤i
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 391<br />
where ∑ (n,k) de<str<strong>on</strong>g>note</str<strong>on</strong>g>s a sum over all subsets 1 ≤ i 1 0 <strong>and</strong> c<strong>on</strong>sequently to reach <strong>the</strong> desired c<strong>on</strong>clusi<strong>on</strong> it suffices to show<br />
that nE((u n − v n ) 2 ) = o(1).By<strong>the</strong>c r -<strong>in</strong>equality we can write<br />
(<br />
nE((u n − v n ) 2 ) ≤ kn 1 − P k<br />
n ) 2 k−1<br />
E(u 2<br />
n k n ) + kn ∑<br />
κ=1<br />
(<br />
n<br />
κ<br />
) 2<br />
n −2k E((u (κ)<br />
n )2 ).<br />
(4)
392 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
We will deal with <strong>the</strong> two terms <strong>on</strong> <strong>the</strong> righth<strong>and</strong> side <strong>of</strong> <strong>in</strong>equality (4) separately.<br />
We first <str<strong>on</strong>g>note</str<strong>on</strong>g> that by <strong>the</strong> representati<strong>on</strong> <strong>the</strong>orem for U <strong>statistics</strong> (Serfl<strong>in</strong>g, 1980; p.<br />
180) u (κ)<br />
n = 1 n!<br />
∑p W n<br />
(κ)(Z<br />
i 1<br />
,... ,Z <strong>in</strong> ) where<br />
W (κ)<br />
n (Z i 1<br />
,... ,Z <strong>in</strong> ) = 1 γ κ<br />
(φ κ,n (Z i1 ,... ,Z iκ ) + φ κ,n (Z iκ+1 ,... ,Z i2κ ) +···<br />
+φ κ,n (Z iγκ κ−κ+1<br />
,...,Z iγκ κ<br />
)),<br />
γ κ = [n/κ] is <strong>the</strong> greatest <strong>in</strong>teger ≤ n/κ <strong>and</strong> ∑ p<br />
de<str<strong>on</strong>g>note</str<strong>on</strong>g>s <strong>the</strong> sum over all permutati<strong>on</strong>s<br />
(i 1 ,...,i n ) <strong>of</strong> {1, 2,...,n} <strong>and</strong> similarly u n = 1 ∑<br />
n! p W n(Z i1 ,... ,Z <strong>in</strong> )<br />
where W n (Z i1 ,... ,Z <strong>in</strong> ) = 1 γ (ψ n(Z i1 ,... ,Z ik ) + ψ n (Z ik+1 ,... ,Z i2k ) + ··· +<br />
ψ n (Z iγk−k+1 ,...,Z iγk )) <strong>and</strong> γ = [n/k] is <strong>the</strong> greatest <strong>in</strong>teger ≤ n/k. Now, for <strong>the</strong><br />
first term <strong>on</strong> <strong>the</strong> righth<strong>and</strong> side <strong>of</strong> <strong>in</strong>equality (4), s<strong>in</strong>ce n k −Pk<br />
n = O(nk−1 ) we have<br />
( )<br />
n 1 − P k<br />
n 2 ( )<br />
n = O(n −1 ) <strong>and</strong> n 1 − P n 2 ((<br />
k<br />
k<br />
n E(u<br />
2 k n )=O(n −1 1<br />
∑<br />
)E<br />
n! p W n(Z i1 ,...,<br />
Z <strong>in</strong> ) ) ) 2<br />
. By M<strong>in</strong>kowski’s <strong>in</strong>equality<br />
⎛(<br />
E ⎝<br />
1<br />
n!<br />
) ⎞ 2<br />
∑<br />
W n (Z i1 ,... ,Z <strong>in</strong> )<br />
p<br />
⎠ ≤ 1<br />
(n!) 2 ( ∑<br />
p<br />
= E(|W n (Z i1 ,... ,Z <strong>in</strong> )| 2 ),<br />
(<br />
E(|Wn (Z i1 ,... ,Z <strong>in</strong> )| 2 ) ) ) 2<br />
1/2<br />
where <strong>the</strong> last equality follows s<strong>in</strong>ce {Z i } i=1,2,... is an i.i.d. sequence. By def<strong>in</strong>iti<strong>on</strong><br />
<strong>of</strong> W n <strong>and</strong> ano<strong>the</strong>r <strong>use</strong> <strong>of</strong> M<strong>in</strong>kowski’s <strong>in</strong>equality we have<br />
⎛<br />
⎞<br />
γ∑<br />
E(|W n (Z i1 ,... ,Z <strong>in</strong> )| 2 ) ≤ γ −2 (<br />
⎝ E(ψ<br />
2<br />
n (Z ijk−k+1 ,... ,Z ijk )) ) 1/2<br />
⎠<br />
j=1<br />
= E(ψ n (Z i1 ,... ,Z ik )) = o(n),<br />
( )<br />
where <strong>the</strong> last equality follows by assumpti<strong>on</strong>.We <strong>the</strong>refore c<strong>on</strong>clude that n 1− P k<br />
n 2<br />
n k<br />
E(u 2 n ) = o(1).<br />
(<br />
n<br />
For <strong>the</strong> sec<strong>on</strong>d term <strong>on</strong> <strong>in</strong>equality (4), <str<strong>on</strong>g>note</str<strong>on</strong>g> first that n<br />
κ)<br />
−k = O(n κ−k ) <strong>and</strong><br />
(( ) 2<br />
n<br />
s<strong>in</strong>ce κ ∈{1,... ,k− 1}, max κ n κ−k = n −1 <strong>and</strong> <strong>the</strong>refore n<br />
κ)<br />
−k =O(n −2 ).<br />
Once aga<strong>in</strong>, by M<strong>in</strong>kowski’s <strong>in</strong>equality we have E((u (κ)<br />
n )2 ) ≤ E(|W n (κ)(Z<br />
i 1<br />
,<br />
... ,Z <strong>in</strong> )| 2 ). Now, given <strong>the</strong> def<strong>in</strong>iti<strong>on</strong> <strong>of</strong> W n<br />
(κ) <strong>and</strong> <strong>the</strong> fact that {φ κ,n (Z ijκ−κ+1<br />
,... ,Z ijκ )} j=1,2,... is an i.i.d. sequence <strong>of</strong> r<strong>and</strong>om variables we have by ano<strong>the</strong>r<br />
applicati<strong>on</strong> <strong>of</strong> M<strong>in</strong>kowski’s <strong>in</strong>equality that E(|W n<br />
(κ)(Z<br />
i 1<br />
,... ,Z <strong>in</strong> )| 2 ) ≤ E((φ κ,n (Z i1<br />
,... ,Z iκ )) 2 ). Let l κ de<str<strong>on</strong>g>note</str<strong>on</strong>g> <strong>the</strong> number <strong>of</strong> terms <strong>in</strong> ∑ {ν 1 ,... ,ν κ ,k} , <strong>the</strong>n by <strong>the</strong> c r<br />
<strong>in</strong>equality<br />
2
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 393<br />
E((φ κ,n (Z i1 ,... ,Z iκ )) 2 )<br />
⎛⎛<br />
⎞2⎞<br />
= E ⎝⎝<br />
∑ k!<br />
∏ κ<br />
{ν 1 ,... ,ν κ ,k} j=1 ν j ! ψ ⎟ ⎟<br />
n ×(Z i1 ,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) ⎠ ⎠<br />
} {{ } } {{ }<br />
ν 1 − times<br />
ν κ − times<br />
⎛<br />
⎞<br />
∑ (k!) 2<br />
≤ l κ<br />
( ∏ κ<br />
{ν 1 ,... ,ν κ ,k} j=1 ν j !) E ⎜<br />
⎝ψ 2 2 n (Z ⎟<br />
i 1<br />
,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) ⎠ .<br />
} {{ } } {{ }<br />
ν 1 − times<br />
ν κ − times<br />
Now, s<strong>in</strong>ce by assumpti<strong>on</strong> E ( ψn 2(Z i 1<br />
,... ,Z ik ) ) = o(n) for all 1 ≤ i 1 ,...,i k<br />
≤ n, k ≤ n,, <strong>the</strong>n it is true that for all κ ∈{1,... ,k − 1} <strong>and</strong> all ν j ≥ 1 such<br />
that k = ∑ κ<br />
j=1 ν j we have E ( ψn 2(Z i 1<br />
,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) ) = o(n). Then<br />
E(|W (κ)<br />
n<br />
∑k−1<br />
(<br />
n<br />
n<br />
κ<br />
κ=1<br />
(Z i 1<br />
,... ,Z <strong>in</strong> )| 2 ) = o(n) <strong>and</strong> c<strong>on</strong>sequently E((u (κ))2<br />
) = o(n).As a result,<br />
) 2<br />
n −2k E((u (κ)<br />
n<br />
∑k−1<br />
)2 )≤nO(n −2 )o(n)<br />
κ=1<br />
l κ<br />
∑<br />
n<br />
{ν 1 ,... ,ν κ ,k}<br />
(k!) 2<br />
( ∏κ<br />
j=1 ν j !) 2<br />
=o(1),<br />
which completes <strong>the</strong> pro<strong>of</strong>.<br />
⊓⊔<br />
Theorem 1 can be proved us<strong>in</strong>g <strong>the</strong> follow<strong>in</strong>g two c<strong>on</strong>diti<strong>on</strong>s which are implied<br />
by E ( ψn 2(Z i 1<br />
,... ,Z ik ) ) = o(n) for all 1 ≤ i 1 ,...,i k ≤ n, k ≤ n: (a)<br />
E ( ψn 2(Z i 1<br />
,... ,Z ik ) ) = o(n) for all 1 ≤ i 1 < i 2 < ··· < i k ≤ n, k ≤ n;<br />
(b) for all κ ∈ {1,... ,k − 1} <strong>and</strong> all ν j ≥ 1 such that k = ∑ κ<br />
j=1 ν j that<br />
def<strong>in</strong>e Z i1 ,... ,Z i1 ,... ,Z iκ ,... ,Z iκ E ( ψ<br />
} {{ } } {{ }<br />
n 2(Z i 1<br />
,... ,Z i1 ,... ,Z iκ ,... ,Z iκ ) )<br />
ν 1 − times<br />
ν κ − times<br />
= o(n 2(k−κ)−1 ). The next corollary establishes <strong>the</strong> √ n asymptotic equivalence<br />
between V <strong>statistics</strong> <strong>and</strong> U <strong>statistics</strong> projecti<strong>on</strong>s.<br />
Corollary 1 Let {Z i } n i=1 be a sequence <strong>of</strong> i.i.d. r<strong>and</strong>om variables <strong>and</strong> u n <strong>and</strong><br />
v n be U <strong>and</strong> V <strong>statistics</strong> with kernel functi<strong>on</strong> ψ n (Z 1 ,... ,Z k ). In additi<strong>on</strong>, let<br />
û n = k n<br />
∑ n<br />
i=1 (ψ 1n(Z i ) − θ n ) + θ n , where ψ 1n (Z i ) = E (ψ n (Z 1 ,... ,Z k )|Z i ) <strong>and</strong><br />
θ n = E (ψ n (Z 1 ,... ,Z k )). IfE ( ψ 2 n (Z 1,... ,Z k ) ) = o(n) <strong>the</strong>n √ n(v n −û n ) =<br />
o p (1).<br />
Pro<strong>of</strong> From Theorem 1 we have √ n(v n − u n ) = o p (1), <strong>and</strong> from Lemma 2.1 <strong>in</strong><br />
Lee (1988) we have that √ n(u n −û n ) = o p (1). Hence, √ n(v n −û n ) = o p (1). ⊓⊔<br />
3 Some applicati<strong>on</strong>s <strong>in</strong> regressi<strong>on</strong> <strong>models</strong><br />
We provide applicati<strong>on</strong>s <strong>of</strong> Theorem 1 <strong>and</strong> its corollary around <strong>the</strong> follow<strong>in</strong>g regressi<strong>on</strong><br />
model,<br />
y t = m(x t ) + ε t , (5)
394 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
where, {(y t ,x t )} n t=1 is a sequence <strong>of</strong> i.i.d. r<strong>and</strong>om variables, y t,x t ∈Rwith jo<strong>in</strong>t<br />
density q(y,x), E(ε t |x t ) = 0 <strong>and</strong> V(ε t |x t ) = σ 2 (x t ). This is similar to <strong>the</strong> regressi<strong>on</strong><br />
model c<strong>on</strong>sidered by Fan <strong>and</strong> Yao (1998), with <strong>the</strong> excepti<strong>on</strong> that here <strong>the</strong><br />
observati<strong>on</strong>s are i.i.d. ra<strong>the</strong>r than strictly stati<strong>on</strong>ary. Our applicati<strong>on</strong>s all <strong>in</strong>volve<br />
establish<strong>in</strong>g <strong>the</strong> asymptotic distributi<strong>on</strong> <strong>of</strong> <strong>statistics</strong> that are c<strong>on</strong>structed as l<strong>in</strong>ear<br />
comb<strong>in</strong>ati<strong>on</strong>s <strong>of</strong> a first stage n<strong>on</strong>parametric estimator <strong>of</strong> m(x). As <strong>the</strong> first stage<br />
estimator for m(x) we c<strong>on</strong>sider <strong>the</strong> local l<strong>in</strong>ear estimator <strong>of</strong> Fan (1992), i.e., for<br />
x ∈R, we obta<strong>in</strong> ˆm(x) =ˆα where<br />
n∑<br />
( ˆα, ˆβ) = argm<strong>in</strong> α,β (y t − α − β(x t − x)) 2 K<br />
t=1<br />
( )<br />
xt − x<br />
,<br />
h n<br />
where K(·) is a density functi<strong>on</strong> <strong>and</strong> 0
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 395<br />
(a) Given E(y 2 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />
I n = 1 n<br />
n∑<br />
∫<br />
ε t f(x t , y)g(y|x t )dy + 1 2 h2 n σ K 2 E(m(2) (x t )f (x t ,y t ))<br />
t=1<br />
where g(y|x) de<str<strong>on</strong>g>note</str<strong>on</strong>g>s <strong>the</strong> c<strong>on</strong>diti<strong>on</strong>al density <strong>of</strong> y given x.<br />
(b) Given E(y 4 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />
+o p (n − 1 2 ) + op (h 2 n ), (6)<br />
J n = σ 2 K h2 n<br />
n<br />
n∑<br />
ε t m (2) (x t )E(f (x t ,y t )|x 1 ,... ,x n )<br />
t=1<br />
+ 1 4 h4 n σ 4 K E((m(2) (x t )) 2 f(x t ,y t ))<br />
+o p (n − 1 2 ) + op (h 3 n ), (7)<br />
(c) Given E(y 6 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />
L n = o p (h 3 n ) (8)<br />
(d) Given E(y 8 t f 2 (x t ,y t )) < ∞ <strong>the</strong>n<br />
M n = o p (h 4 n ). (9)<br />
Pro<strong>of</strong> Here we prove parts (a)–(c). The pro<strong>of</strong> <strong>of</strong> (d) is similar to that <strong>of</strong> (c) <strong>and</strong> is<br />
omitted. Let ∑ (p) h i 1 ...i p<br />
de<str<strong>on</strong>g>note</str<strong>on</strong>g> <strong>the</strong> sum <strong>of</strong> h i1 ...i p<br />
over <strong>the</strong> permutati<strong>on</strong>s <strong>of</strong> i 1 ...i p ,<br />
i.e., ∑ (p) h i 1 ...i p<br />
is <strong>the</strong> sum <strong>of</strong> p! terms given by h i1 i 2 ...i p<br />
+ h i2 i 1 ...i p<br />
+···+h ip i p−1 ...i 1<br />
.<br />
(a) Note that,<br />
ˆm(x t ) − m(x t ) =<br />
1<br />
nh n g(x t )<br />
n∑<br />
K<br />
k=1<br />
1<br />
+<br />
2nh n g(x t )<br />
( )<br />
xk − x t<br />
ε k<br />
h n<br />
n∑<br />
K<br />
k=1<br />
( )<br />
xk − x t<br />
m (2) (x kt )(x k − x t ) 2 + w(x t ),<br />
h n<br />
where w(x t ) = ˆm(x t )−m(x t )− 1 ∑ n<br />
nh n g(x t ) k=1 K(x k−x t<br />
h n<br />
)yk ∗ <strong>and</strong> y∗ k = ε k+ 1 2 m(2) (x kt )<br />
(x k −x t ) 2 with x kt = λx t +(1−λ)x k for some λ ∈ (0, 1). Hence, I n = I 1n +I 2n +I 3n ,<br />
where<br />
I 1n = 1<br />
n 2 h n<br />
n∑<br />
n∑<br />
K<br />
t=1 k=1<br />
( )<br />
xk − x t<br />
h n<br />
1<br />
ε k f(x t ,y t )<br />
g(x t )
396 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
I 2n = h n<br />
2n 2<br />
I 3n = 1 n<br />
n∑<br />
t=1 k=1<br />
n∑<br />
( )( )<br />
xk − x t xk − x 2<br />
t<br />
K<br />
m (2) 1<br />
(x kt )f (x t ,y t )<br />
h n g(x t )<br />
h n<br />
n∑<br />
w(x t )f (x t ,y t ).<br />
t=1<br />
We treat each term separately. Observe that I 1n can be written as<br />
I 1n = 1 n∑ n∑<br />
( ( )<br />
1 xk − x t<br />
1<br />
K ε<br />
2n 2<br />
k f(x t ,y t )<br />
h<br />
t=1 k=1 n h n g(x t )<br />
+ 1 ( )<br />
)<br />
xt − x k<br />
1<br />
K ε t f(x k ,y k )<br />
h n h n g(x k )<br />
⎛ ⎞<br />
= 1 n∑ n∑<br />
⎝ ∑ h<br />
2n 2<br />
tk<br />
⎠ = 1 n∑ n∑<br />
ψ<br />
2n 2<br />
n (Z t ,Z k ) = 1 2 v n,<br />
t=1 k=1 (2)<br />
t=1 k=1<br />
where v n is a V statistic with Z t = (x t ,y t ) <strong>and</strong> ψ n (Z t ,Z k ) a symmetric kernel.<br />
Hence, by Corollary 1, if E ( ψn 2(Z t,Z k ) ) = o(n) for all t,k <strong>the</strong>n √ n(v n −û n ) =<br />
o p (1). First, <str<strong>on</strong>g>note</str<strong>on</strong>g> that for t ≠ k 1 n E ( ψn 2(Z t,Z k ) ) = 2 n E(h2 tk ) + 2 n E(h tkh kt ) <strong>and</strong><br />
s<strong>in</strong>ce E(εk 2|x 1,... ,x n ) = σ 2 (x k ), we have from <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao<br />
that<br />
1<br />
n E(h2 tk ) = 1 E<br />
nh 2 n<br />
( ( )<br />
K 2 xk −x t<br />
h n<br />
)<br />
σ 2 (x k )E(f 2 1<br />
(x t ,y t )|x 1 ,... ,x n ) → 0,<br />
g 2 (x t )<br />
provided that nh n → ∞. Similarly it can be shown that E( 1 n h tkh kt ) = o(1)<br />
<strong>and</strong> <strong>the</strong>refore 1 n E(ψ2 n (Z t,Z k )) = o(1). Fort = k it suffices to verify that 4K2 (0)<br />
(<br />
)<br />
nh 2 n<br />
E σ 2 (x t ) f 2 (x t ,y t )<br />
g 2 (x t<br />
= o(1), but this follows directly given our assumpti<strong>on</strong>s. Now,<br />
)<br />
s<strong>in</strong>ce E(ε t |x 1 ,... ,x n ) = 0, θ n = E (ψ n (Z t ,Z k )) = 0 <strong>and</strong> û n = 2 ∑ n<br />
n t=1 ψ 1n(Z t ).<br />
Therefore,<br />
ψ 1n (Z t ) = 1 ∫ ∫ ( )<br />
xk − x t<br />
ε t K f(x k ,y k )g(y k |x k )dx k dy k<br />
h n h<br />
∫<br />
n<br />
= ε t f(x t , y)g(y|x t )dy + o(1),<br />
where g(y|x) = q(y,x)<br />
g(x) . C<strong>on</strong>sequently,<br />
I 1n = 1 n∑<br />
∫<br />
ε t f(x t , y)g(y|x t )dy + o p (n − 1 2 ) (10)<br />
n<br />
t=1<br />
given that 1 n<br />
∑ n<br />
t=1 ε t = O p (n − 1 2 ). Now, <str<strong>on</strong>g>note</str<strong>on</strong>g> that<br />
I 2n = 1<br />
4n 2<br />
n∑<br />
n∑ ∑<br />
t=1 k=1 (2)<br />
h tk = 1<br />
4n 2<br />
n∑<br />
t=1 k=1<br />
n∑<br />
ψ n (Z t ,Z k ) = 1 4 v n,
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 397<br />
where h tk = h n K( x k−x t<br />
h n<br />
)( x k−x t<br />
h n<br />
) 2 m (2) (x kt )f (x t ,y t ) 1<br />
g(x t ) . By Corollary 1 if E(ψ2 n (Z t,<br />
Z k )) = o(n) for all t,k, <strong>the</strong>n √ n(v n −û n ) = o p (1). Hence we focus <strong>on</strong> û n . Note<br />
that, û n = 2 ∑ n<br />
n t=1 ψ 1n(Z t ) − θ n , where ψ 1n (Z t ) = E(h tk |Z t ) + E(h kt |Z t ) <strong>and</strong><br />
E(û n ) = θ n = 2E(h tk ). Hence, by <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao,<br />
) (ûn<br />
E = 2 ∫ ∫ ∫ ( )( )<br />
xk − x t xk − x 2<br />
t<br />
K<br />
m (2) (x<br />
h 2 kt )<br />
n h n h n h n<br />
×f(x t ,y t ) g(x k)<br />
g(x t ) q(x t,y t )dx k dx t dy t<br />
−→ 2σ 2 K E(m(2) (x t )f (x t ,y t )).<br />
In additi<strong>on</strong>, <str<strong>on</strong>g>note</str<strong>on</strong>g> that<br />
( ) 1<br />
V û<br />
h 2 n = 4 nV (E(h<br />
n n 2 h 4 tk |Z t ) + E(h kt |Z t ))<br />
n<br />
= 4 (<br />
E(E(htk |Z<br />
nh 4 t )) 2 +E(E(h kt |Z t )) 2<br />
n<br />
+2E(E(h tk |Z t )E(h kt |Z k ))−θn) 2 .<br />
Under our assumpti<strong>on</strong>s, it is straightforward to show that<br />
1<br />
E(E(h tk |Z t )) 2<br />
nh 4 n<br />
= 1 E<br />
nh 2 n<br />
(<br />
as n →∞. Similarly,<br />
(<br />
as n →∞. Hence, V<br />
f(x t ,y t ) 2 1<br />
g 2 (x t ) E2 (<br />
K<br />
1<br />
nh 4 n<br />
1<br />
h 2 n<br />
(<br />
xk − x t<br />
h n<br />
)(<br />
xk − x t<br />
h n<br />
) 2<br />
m (2) (x kt )|Z t<br />
))<br />
→0<br />
E(E(h kt |Z t )) 2 → 0 <strong>and</strong> 1<br />
nh 4 n<br />
E(E(h tk |Z t )E(h kt |Z t )) → 0<br />
û n<br />
)<br />
→ 0. By Markov’s <strong>in</strong>equality we c<strong>on</strong>clude that<br />
û n = 2h 2 n σ K 2 E(m(2) (x t )f (x t ,y t )) + o p (h 2 n ) <strong>and</strong> by Corollary 1,<br />
I 2n = 1 2 h2 n σ K 2 E(m(2) (x t )f (x t ,y t )) + o p (h 2 n ). (11)<br />
F<strong>in</strong>ally, verificati<strong>on</strong> that E(ψn 2(Z t,Z k )) = 2E(h 2 tk ) + 2E(h tkh kt ) = o(n) for all<br />
t ≠ k follows directly from our assumpti<strong>on</strong>s <strong>and</strong> <strong>use</strong> <strong>of</strong> <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao,<br />
<strong>and</strong> for <strong>the</strong> case where t = k we have that E(ψn 2(Z t,Z k )) = 0. From<br />
Mart<strong>in</strong>s-Filho <strong>and</strong> Yao’s (2003) Lemma 2 <strong>and</strong> Theorem 1<br />
|w(x t )|=<br />
∣ ˆm(x t) − m(x t ) −<br />
1<br />
nh n g(x t )<br />
n∑<br />
K<br />
k=1<br />
( )<br />
xk − x t<br />
h n<br />
y ∗ k<br />
∣ = O p(R n,2 (x t ))<br />
<strong>and</strong> sup xt ∈G |R n,2(x t )|=o p (h 2 n ), hence sup x t ∈G |w(x t)| =o p (h 2 n ). As a result we<br />
have that |I 3n |≤o p (h 2 n ) 1 ∑ n<br />
n t=1 |f(x t,y t )|=o p (h 2 n ), s<strong>in</strong>ce E(f 2 (x t ,y t )) < ∞.<br />
Therefore, I 3n = 1 ∑ n<br />
n t=1 w(x t)f (x t ,y t ) = o p (h 2 n ). Comb<strong>in</strong><strong>in</strong>g this result with
398 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
(10) <strong>and</strong> (11) proves part (a).<br />
For part (b) <str<strong>on</strong>g>note</str<strong>on</strong>g> that<br />
J n = 1 n∑ n∑ n∑<br />
( )<br />
1<br />
f(x<br />
n 3 h 2 t ,y t )<br />
n<br />
g 2 (x<br />
t=1 k=1 l=1<br />
t ) K xk − x t<br />
K<br />
h n<br />
n∑ n∑ n∑<br />
(<br />
+ h2 n<br />
1<br />
4n 3<br />
g 2 (x t ) f(x xk − x t<br />
t,y t )K<br />
t=1 k=1 l=1<br />
(<br />
xl − x t<br />
×K<br />
h n<br />
h n<br />
)(<br />
xl − x t<br />
h n<br />
) 2<br />
m (2) (x kt )m (2) (x lt )<br />
+ 1 n∑<br />
w 2 (x t )f (x t ,y t )<br />
n<br />
t=1<br />
+ 1 n∑ n∑ n∑<br />
( )<br />
1<br />
n 3 g 2 (x<br />
t=1 k=1 l=1 t ) f(x xk − x t<br />
t,y t )K<br />
h n<br />
( )( )<br />
xl − x t xl − x 2<br />
t<br />
×K<br />
m (2) (x lt )ε k<br />
+ 2<br />
n 2 h n<br />
n∑<br />
h n<br />
t=1 k=1<br />
h n<br />
n∑ 1<br />
w(x t )<br />
g(x t ) f(x t,y t )K<br />
+ h n∑ n∑<br />
n<br />
1<br />
w(x<br />
n 2<br />
t )<br />
g(x<br />
t=1 k=1<br />
t ) f(x t,y t )K<br />
= J 1n + J 2n + J 3n + J 4n + J 5n + J 6n .<br />
We exam<strong>in</strong>e each term separately.<br />
J 1n = 1 1<br />
n∑ n∑ n∑ ∑<br />
h<br />
6 n 3<br />
tkl = 1 1<br />
6 n 3<br />
t=1 k=1 l=1 (3)<br />
n∑<br />
( )<br />
xk − x t<br />
ε k<br />
h n<br />
(<br />
xk − x t<br />
n∑<br />
h n<br />
t=1 k=1 l=1<br />
1<br />
( )<br />
xl − x t<br />
ε k ε l<br />
h n<br />
)(<br />
xk − x t<br />
h n<br />
) 2<br />
)(<br />
xk − x t<br />
h n<br />
) 2<br />
m (2) (x kt )<br />
n∑<br />
ψ n (Z t ,Z k ,Z l ) = 1 6 v n,<br />
where h tkl = 1 K( x k−x t<br />
h 2 n h n<br />
)K( x l−x t<br />
h n<br />
)ε k ε l f(x t ,y t )<br />
g 2 (x t ) . By Corollary 1, if E(ψ2 n (Z t,<br />
Z k ,Z l )) = o(n) for all t,k,l <strong>the</strong>n √ n(v n −û n ) = o p (1). S<strong>in</strong>ce ψ 1n (Z t ) = 0<br />
<strong>and</strong> θ n = 0, we have that û n = 0 <strong>and</strong> <strong>the</strong>refore √ nI 1n = o p (1). To verify that<br />
1<br />
n E(ψ2 n (Z t,Z k ,Z l )) = o(1) for t ≠ k ≠ l we <str<strong>on</strong>g>note</str<strong>on</strong>g> that 1 n E(ψ2 n (Z t,Z k ,Z l )) =<br />
4<br />
n (3E(h2 tkl ) + 2E(h tklh ktl ) + 2E(h tkl h ltk ) + 2E(h ktl h ltk )). Under A1–A4, repeated<br />
applicati<strong>on</strong> <strong>of</strong> <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao to each <strong>of</strong> <strong>the</strong>se expectati<strong>on</strong>s shows<br />
that each approaches zero as n →∞. Also, if t = k but t ≠ l we have that<br />
E(ψn 2(Z t,Z t ,Z l )) = o(n) <strong>and</strong> when t = k = l, E(ψn 2(Z t,Z t ,Z t )) = o(n 2 )<br />
directly from our assumpti<strong>on</strong>s provided nh 3 n → 0.<br />
J 2n = 1<br />
24n 3<br />
n∑<br />
n∑<br />
n∑ ∑<br />
t=1 k=1 l=1 (3)<br />
)( x k−x t<br />
h n<br />
h tkl = 1<br />
24n 3<br />
n∑<br />
) 2 K( x l−x t<br />
h n<br />
)( x l−x t<br />
n∑<br />
n∑<br />
t=1 k=1 l=1<br />
ψ n (Z t ,Z k ,Z l ) = 1<br />
24 v n,<br />
where h tkl = h 2 n K(x k−x t<br />
h n<br />
By Corollary 1 √ n(v n −û n ) = o p (1) provided that E(ψ n (Z t ,Z k ,Z l )) = o(n)<br />
h n<br />
) 2 m (2) (x kt )m (2) (x lt )<br />
g 2 (x t ) f(x t,y t ).<br />
1
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 399<br />
for all t,k,l. As above, verificati<strong>on</strong> <strong>of</strong> this c<strong>on</strong>diti<strong>on</strong> is straightforward given our<br />
assumpti<strong>on</strong>s. Note that û n = 3 ∑ n<br />
n t=1 ψ 1n(Z t ) − 2θ n with E(û n ) = θ n . Hence,<br />
(<br />
1<br />
E(û<br />
h 4 n ) = 6 ( )( )<br />
xk −x t xk − x 2 ( )( )<br />
t xl −x t xl −x 2<br />
t<br />
E K<br />
K<br />
m (2) (x<br />
n h 2 kt )<br />
n h n h n h n h n<br />
)<br />
× m (2) 1<br />
(x lt )<br />
g 2 (x t ) f(x t,y t )<br />
<strong>and</strong> by <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa-Rao we have, 1 E(û<br />
h 4 n )→6σK 4 E((m(2) (x t )) 2 f(x t ,<br />
n<br />
y t )) <strong>and</strong> V( 1 û<br />
h 4 n ) → 0. By Markov’s <strong>in</strong>equality we c<strong>on</strong>clude that J 2n = 1<br />
n 4 h4 n σ K<br />
4<br />
E((m (2) (x t )) 2 f(x t ,y t )) + o p (h 4 n ) + o p(n −1/2 ). J 3n = 1 ∑ n<br />
n t=1 w2 (x t )f (x t ,y t ) =<br />
o p (h 4 n ) follows directly from <strong>the</strong> analysis <strong>of</strong> <strong>the</strong> I 3n <strong>in</strong> part (a). Now,<br />
J 4n = 1<br />
6n 3<br />
n∑<br />
n∑<br />
where h tkl = K( x k−x t<br />
h n<br />
n∑ ∑<br />
t=1 k=1 l=1 (3)<br />
h tkl = 1<br />
6n 3<br />
n∑<br />
n∑<br />
t=1 k=1 l=1<br />
n∑<br />
ψ n (Z t ,Z k ,Z l ) = 1 6 v n,<br />
)K( x l−x t<br />
h n<br />
)( x l−x t<br />
h n<br />
) 2 m (2) 1<br />
(x lt )ε k g 2 (x t ) f(x t,y t ). Once aga<strong>in</strong> given<br />
our assumpti<strong>on</strong>s it can be verified that E(ψn 2(Z t,Z k ,Z l )) = o(n) for all t,k,l,<br />
<strong>the</strong>refore we have √ n(v n −û n ) = o p (1), where û n = 3 ∑ n<br />
n t=1 ψ 1n(Z t ) − 2θ n .<br />
Given that <strong>in</strong> this case θ n = 0, we have<br />
û n = 6 n<br />
(<br />
n∑<br />
ε t E K<br />
t=1<br />
( )<br />
xt −x k<br />
K<br />
h n<br />
( )( ) )<br />
xl −x k xl −x 2<br />
k<br />
m (2) 1<br />
(x lk )<br />
h n g 2 (x k ) f(x k,y k )|Z t ,<br />
h n<br />
Under A1–A4, <strong>the</strong> propositi<strong>on</strong> <strong>of</strong> Prakasa Rao gives<br />
( ( ) ( )( ) )<br />
1 xt − x k xl − x k xl − x 2<br />
k<br />
E K K<br />
m (2) 1<br />
(x<br />
h 2 lk )<br />
n h n h n h n g 2 (x k ) f(x k,y k )|Z t →<br />
σK 2 m(2) (x t )E(f (x t ,y)|x 1 ,... ,x n ).<br />
∑<br />
Hence, û n = 6h2 n n<br />
n t=1 ε tm (2) (x t )σK 2 E(f(x t,y)|x 1 ,... ,x n )+o p (n − 1 2 ), given that<br />
6<br />
∑ n<br />
n t=1 ε t = O p (n − 1 2 ). C<strong>on</strong>sequently, J 4n = σ K 2 ∑ h2 n n<br />
n t=1 ε tm (2) (x t )E(f (x t ,y)<br />
|x 1 ,... ,x n ) + o p (n − 1 2 ).<br />
|J 5n | = ∣ 2 ∑ (<br />
n<br />
n t=1 w(x 1<br />
∑ ) n<br />
t)<br />
nh n g(x t ) k=1 K(x k−x t<br />
h n<br />
)ε k f(x t ,y t ) ∣ <strong>and</strong> from<br />
Mart<strong>in</strong>s-Filho <strong>and</strong> Yao (2003) Theorem 1 we have sup xt ∈G |w(x t)|<br />
= o p (h 2 n ) <strong>and</strong> sup 1<br />
∑<br />
∣<br />
n<br />
x t ∈G ∣<br />
nh n g(x t ) k=1 K(x k−x t ∣∣<br />
h n<br />
)ε k = op (h n ). Hence, |J 5n |<br />
≤ 2 ∑ n<br />
n t=1 |f(x t,y t )|o p (h 3 n ) = O p(1)o p (h 3 n ) = o p(h 3 n ) s<strong>in</strong>ce 2 ∑ n<br />
n t=1 |f(x t,y t )|<br />
= O p (1), which gives J 5n = o p (h 3 n ). F<strong>in</strong>ally, from Mart<strong>in</strong>s-Filho <strong>and</strong> Yao (2003)<br />
Theorem 1 we have<br />
h n<br />
n∑<br />
( )( ) xk − x t xk − x 2 t<br />
sup<br />
K<br />
m (2) (x kt )<br />
∣ng(x t )<br />
h n ∣ = O p(h 2 n ) (12)<br />
x t ∈G<br />
k=1<br />
h n
400 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
<strong>and</strong> |J 6n |≤ 1 ∑ n<br />
n t=1 |f(x t,y t )|o p (h 2 n )O p(h 2 n ) = O p(1)o p (h 4 n ) = o p(h 4 n ). Comb<strong>in</strong><strong>in</strong>g<br />
<strong>the</strong> orders <strong>of</strong> all six terms gives <strong>the</strong> desired result.<br />
∑ n<br />
k=1 K (<br />
xk −x t<br />
1<br />
2nh n g(x t )<br />
∑ n<br />
k=1 K (<br />
xk −x t<br />
h n<br />
)<br />
m (2) (x kt )<br />
)<br />
(c) Let a tn = 1<br />
nh n g(x t )<br />
h n<br />
ε k , b tn =<br />
(x k − x t ) 2 , c tn = w(x t ). Then ( ˆm(x t ) − m(x t )) 3 = atn 3 + b3 tn + c3 tn + 3a tnbtn 2 +<br />
3a tn ctn 2 +3a2 tn b tn+3atn 2 c tn+3btn 2 c tn+3b tn ctn 2 +6a tnb tn c tn <strong>and</strong> n −1 ∑ n<br />
t=1 ( ˆm(x t)−<br />
m(x t )) 3 f(x t ,y t ) = ∑ 10<br />
i=1 L <strong>in</strong>, where L 1n = n −1 ∑ n<br />
t=1 a3 tn f(x t,y t ),... ,L 10n =<br />
n −1 ∑ n<br />
t=1 6a tnb tn c tn f(x t ,y t ). S<strong>in</strong>ce<br />
∣ sup 1<br />
n∑<br />
( ) ∣<br />
xk − x ∣∣∣∣<br />
t<br />
x t<br />
K ε k = o p (h n ),<br />
nh n g(x t ) h n<br />
k=1<br />
<strong>the</strong>n |L 1n |≤o p (h 3 n ) × 1 ∑ n<br />
n t=1 |f(x t,y t )|=o p (h 3 n ). By (12), |L 2n| ≤O p (h<br />
∑ 6 n )n−1<br />
n<br />
t=1 |f(x t,y t )|=O p (h 6 n ) s<strong>in</strong>ce E(f 2 (x t ,y t )) < ∞. |L 3n |≤o p (h 6 n ) follows<br />
directly from sup xt ∈G|w(x t )|=o p (h 2 n ).<br />
|L 4n |≤ 3 n∑<br />
1<br />
n∑<br />
( ) ∣<br />
xk − x ∣∣∣∣<br />
t<br />
K ε k<br />
n ∣nh n g(x t ) h n<br />
t=1<br />
(<br />
×<br />
∣<br />
1<br />
2nh n g(x t )<br />
k=1<br />
n∑<br />
K<br />
k=1<br />
×|f(x t ,y t )|≤o p (h n )O p (h 4 n ) 3 n<br />
( )<br />
xk − x t<br />
(x k − x t ) 2 m (2) (x kt )<br />
h n<br />
) 2<br />
∣ ∣∣∣∣∣<br />
n∑<br />
|f(x t ,y t )|=o p (h 5 n )<br />
from (12) <strong>and</strong> <strong>the</strong> analysis <strong>of</strong> J 5n <strong>in</strong> part (b). Similarly, we obta<strong>in</strong> |L 5n |, |L 10n |≤<br />
o p (h 5 n ), |L 6n|, |L 7n |≤o p (h 4 n ) <strong>and</strong> |L 8n|, |L 9n |≤o p (h 6 n ). Hence, we have L n =<br />
o p (h 3 n ).<br />
⊓⊔<br />
We now <strong>use</strong> <strong>the</strong> results <strong>in</strong> <strong>the</strong> Lemma to establish <strong>the</strong> asymptotic distributi<strong>on</strong> <strong>of</strong><br />
several <strong>statistics</strong> <strong>of</strong> <strong>in</strong>terest.<br />
t=1<br />
3.1 Estimat<strong>in</strong>g c<strong>on</strong>diti<strong>on</strong>al variance<br />
C<strong>on</strong>sider first a special case <strong>of</strong> <strong>the</strong> model (5), where ε t |x t ∼ N(0,σ 2 ). Then a<br />
natural estimator <strong>of</strong> σ 2 is<br />
ˆσ 2 = 1 n∑<br />
(y t −ˆm(x t )) 2 = 1 n∑ (<br />
ε<br />
2<br />
n<br />
n t +2(m(x t )−ˆm(x t ))ε t +(m(x t )−ˆm(x t )) 2) .<br />
t=1<br />
t=1<br />
The difficulty <strong>in</strong> deal<strong>in</strong>g with such expressi<strong>on</strong>s lies <strong>in</strong> <strong>the</strong> average terms <strong>in</strong>volv<strong>in</strong>g<br />
(m(x t ) −ˆm(x t )) <strong>and</strong> (m(x t ) −ˆm(x t )) 2 , s<strong>in</strong>ce <strong>the</strong> first term is an average <strong>of</strong> an<br />
i.i.d. sequence. By us<strong>in</strong>g Corollary 1 <strong>and</strong> Lemma 1 we have a c<strong>on</strong>venient way to<br />
establish <strong>the</strong>ir asymptotic properties. This is shown <strong>in</strong> <strong>the</strong> next <strong>the</strong>orem.<br />
Theorem 2 Assume that <strong>in</strong> model (5), ε t |x t ∼ N(0,σ 2 ) <strong>and</strong> E(yt 4 )
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 401<br />
Pro<strong>of</strong> Note that ˆσ 2 = 1 ∑ n<br />
(<br />
n t=1 ε<br />
2<br />
t + 2(m(x t ) −ˆm(x t ))ε t + (m(x t ) −ˆm(x t )) 2) .<br />
Now, lett<strong>in</strong>g f(x t ,y t )=−2ε t <strong>in</strong> part (a) <strong>of</strong> Lemma 1 we have 1 ∑ n<br />
n t=1 ε ∫<br />
t f(xt ,y t )<br />
g(y t |x t )dy t = 0, s<strong>in</strong>ce E(ε t |x 1 ,... ,x n ) = 0 <strong>and</strong> 1 2 h2 n σ K 2 E ( m (2) (x t )f (x t ,y t ) ) = 0.<br />
Hence, 1 ∑ n<br />
n t=1 2(m(x t) −ˆm(x t ))ε t = o p (n − 1 2 ) + o p (h 2 n ). By part (b) <strong>of</strong> Lemma 1<br />
with f(x t ,y t ) = 1wehave<br />
1<br />
4 h4 n σ K 4 E((m(2) (x t )) 2 f(x t ,y t )) = O p (h 4 n )<br />
<strong>and</strong><br />
σK 2 n∑ h2 n<br />
ε t m (2) (x t )E(f (x t ,y)|x 1 ,... ,x n )<br />
n<br />
t=1<br />
= σ K 2 n∑ h2 n<br />
ε t m (2) (x t ) = O p (n − 1 2 h<br />
2<br />
n<br />
n ) = o p (n − 1 2 )<br />
t=1<br />
by <strong>the</strong> central limit <strong>the</strong>orem for i.i.d. sequences. Hence, 1 ∑ n<br />
n t=1 (m(x t)− ˆm(x t )) 2 =<br />
o p (n − 1 2 ) + o p (h 3 n ) provided E(y4 t )
402 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
(b) If E(y 8 t )
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 403<br />
S<strong>in</strong>ce E(yt 6)
404 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
Hence,<br />
b 3n = n −1<br />
n∑<br />
t=1<br />
ε 4 t − µ 4 − 4µ 3<br />
n<br />
n∑<br />
ε t − 2σK 2 h2 n µ 3E ( m (2) (x t ) )<br />
+o p (n −1/2 ) + o p (h 2 n ). (23)<br />
µ<br />
Comb<strong>in</strong><strong>in</strong>g (20) <strong>and</strong> (23), we write b 3n − b 4<br />
4n (µ 2<br />
= n −1 ∑ n<br />
) 2 t=1 W t + A 2n where<br />
W t = εt 4 + µ 4 − 4ε t µ 3 − 2 µ 4<br />
µ 2<br />
εt 2 <strong>and</strong> A 2n =−2h 2 n σ K 2 µ 3E(m (2) (x t )) + o p (n −1/2 ) +<br />
o p (h 2 n ). Note that E(W t) = 0 <strong>and</strong> by <strong>the</strong> c r -<strong>in</strong>equality, V(W t ) = E(Wt 2)
V <strong>and</strong> U <strong>statistics</strong> <strong>in</strong> n<strong>on</strong>parametric <strong>models</strong> 405<br />
√ n( ˆη 2 −η 2 −b 2n ) → d N(0,V(ξ t )) where ξ t = 1 (<br />
ε<br />
2<br />
V(y t ) t −(1−η 2 )(y t −E(y t )) 2)<br />
<strong>and</strong> b 2n = o p (h 2 n ).<br />
(<br />
)<br />
Pro<strong>of</strong> ˆη 2 − η 2 1<br />
E(y<br />
=− 1<br />
∑ n β<br />
n t=1 (y t −ȳ) 2 1 − β t −m(x t )) 2<br />
2 V(y t<br />
where β<br />
)<br />
1 = 1 ∑ n<br />
n t=1 (y t −<br />
ˆm(x t )) 2 − E(y t − m(x t )) 2 , β 2 = 1 ∑ n<br />
n t=1 (y t − ȳ) 2 − V(y t ). First, <str<strong>on</strong>g>note</str<strong>on</strong>g> that by <strong>the</strong><br />
law <strong>of</strong> large numbers 1 ∑ n<br />
n t=1 y2 p<br />
t → E(yt 2 ) <strong>and</strong> ȳ2 → p E(y t ) 2 , hence<br />
(<br />
) −1<br />
1<br />
n∑<br />
(y t − ȳ) 2 p<br />
→ V(yt ) −1 . (26)<br />
n<br />
t=1<br />
1<br />
For β 2 , ȳ 2 − (E(y t )) 2 = 1 ∑ n ∑ n<br />
2 n 2 t=1 k=1 (y ty k + y k y t ) − E(m(x t )) 2 = 1 2 v n −<br />
(E(m(x t ))) 2 . By Corollary 1, s<strong>in</strong>ce 1 n E(ψ2 n (y t,y k )) = 4 n E((y2 t ))2 = o(1) for all<br />
t,k, √ n(v n −û n ) = o p (1) where û n = 2 ∑ n<br />
n t=1 (ψ 1n(Z t ))−θ n = 4 ∑ n<br />
n t=1 y tE(m(x t ))−<br />
2(E(m(x t ))) 2 . Hence, ȳ 2 − (E(y t )) 2 = 2 ∑ n<br />
n t=1 y tE(m(x t )) − 2(E(m(x t ))) 2 +<br />
o p (n − 1 2 ), <strong>and</strong><br />
E(y t − m(x t )) 2<br />
β 2 = 1 n∑<br />
{ E(σ 2 (x t ))<br />
(yt 2 − E(yt 2 V(y t ) n V(y<br />
t=1<br />
t )<br />
))<br />
}<br />
−2 E(σ2 (x t ))E(m(x t ))<br />
(y t − E(m(x t ))) + o p (n − 1 2 ).<br />
V(y t )<br />
For β 1 , 1 ∑ n<br />
n t=1 (y t−ˆm(x t )) 2 = 1 ∑ n<br />
n t=1(<br />
ε<br />
2<br />
t + 2(m(x t )−ˆm(x t ))ε t +(m(x t )−ˆm(x t )) 2) .<br />
Similarly to <strong>the</strong> pro<strong>of</strong> <strong>of</strong> Theorem 2, we <strong>use</strong> part (a) <strong>of</strong> Lemma 1 with f(x t ,y t )<br />
=−2ε t to obta<strong>in</strong> 1 ∑ n<br />
n t=1 2(m(x t)− ˆm(x t ))ε t = o p (n − 1 2 )+o p (h 2 n ). By part (b) with<br />
f(x t ,y t ) = 1 we obta<strong>in</strong> 1 ∑ n<br />
n t=1 (m(x t)− ˆm(x t )) 2 = o p (n − 1 2 )+o p (h 3 n ). Therefore,<br />
1<br />
∑ n<br />
n t=1 (y t −ˆm(x t )) 2 = 1 ∑ n<br />
n t=1 ε2 t + o p (n − 1 2 ) + o p (h 2 n ). Hence,<br />
β 1 − β 2<br />
E(y t − m(x t )) 2<br />
V(y t )<br />
= 1 n<br />
n∑<br />
t=1<br />
(ε 2 t − E(σ 2 (x t )) − E(σ2 (x t ))<br />
V(y t )<br />
+ 2E(σ2 (x t ))E(m(x t ))<br />
(y t − E(m(x t )))<br />
V(y t )<br />
+o p (n − 1 2 ) + op (h 2 n )<br />
= 1 n∑<br />
ζ t + o p (n − 1 2 ) + op (h 2 n<br />
n<br />
).<br />
t=1<br />
(y 2 t − E(y 2 t ))<br />
S<strong>in</strong>ce E(ζ t ) = 0,V(ζ t ) = V(εt 2 − (1 − η 2 )(y t − E(y t )) 2 )
406 C. Mart<strong>in</strong>s-Filho <strong>and</strong> F. Yao<br />
4 C<strong>on</strong>clusi<strong>on</strong><br />
We have established <strong>the</strong> √ n asymptotic equivalence between V <strong>and</strong> U <strong>statistics</strong><br />
when <strong>the</strong>ir kernels depend <strong>on</strong> n, <strong>the</strong> sample size. We comb<strong>in</strong>e our result with a<br />
result <strong>of</strong> Lee (1988) to obta<strong>in</strong> <strong>the</strong> √ n asymptotic equivalence between V <strong>statistics</strong><br />
<strong>and</strong> U <strong>statistics</strong> projecti<strong>on</strong>s. We provide a number <strong>of</strong> examples <strong>and</strong> illustrati<strong>on</strong>s<br />
<strong>on</strong> how our results can be <strong>use</strong>d <strong>in</strong> n<strong>on</strong>parametric kernel estimati<strong>on</strong>. The list <strong>of</strong><br />
examples is obviously not exhaustive as our results can be <strong>use</strong>d <strong>in</strong> much broader<br />
c<strong>on</strong>texts.<br />
Acknowledgements<br />
We would like to thank Pr<strong>of</strong>essor Katuomi Hirano <strong>and</strong> two an<strong>on</strong>ymous referees<br />
for comments <strong>and</strong> suggesti<strong>on</strong>s.<br />
References<br />
Ahn, H., Powell, J. L. (1993). Semiparametric estimati<strong>on</strong> <strong>of</strong> censored selecti<strong>on</strong> <strong>models</strong> with a<br />
n<strong>on</strong>parametric selecti<strong>on</strong> mechanism. Journal <strong>of</strong> Ec<strong>on</strong>ometrics, 58, 3–29.<br />
Bönner, N., Kirschner, H.-P. (1977). Note <strong>on</strong> c<strong>on</strong>diti<strong>on</strong>s for weak c<strong>on</strong>vergence <strong>of</strong> V<strong>on</strong> Mises’<br />
differentiable statistical functi<strong>on</strong>s. The Annals <strong>of</strong> Statistics, 5, 405–407.<br />
D’Amico, S. (2003). Density estimati<strong>on</strong> <strong>and</strong> comb<strong>in</strong>ati<strong>on</strong> under model ambiguity via a pilot n<strong>on</strong>parametric<br />
estimate: an applicati<strong>on</strong> to stock returns.Work<strong>in</strong>g paper, Department <strong>of</strong> Ec<strong>on</strong>omics,<br />
Columbia University (http://www.columbia.edu/∼sd445/).<br />
Doksum, K., Samarov, A. (1995). N<strong>on</strong>parametric estimati<strong>on</strong> <strong>of</strong> global functi<strong>on</strong>als <strong>and</strong> a measure<br />
<strong>of</strong> <strong>the</strong> explanatory power <strong>of</strong> covariates <strong>in</strong> regressi<strong>on</strong>. The Annals <strong>of</strong> Statistics, 23(5),<br />
1443–1473.<br />
Fan, J. (1992). Design adaptive n<strong>on</strong>parametric regressi<strong>on</strong>. Journal <strong>of</strong> <strong>the</strong> American Statistical<br />
Associati<strong>on</strong>, 87, 998–1004.<br />
Fan, J.,Yao, Q. (1998). Efficient estimati<strong>on</strong> <strong>of</strong> c<strong>on</strong>diti<strong>on</strong>al variance functi<strong>on</strong>s <strong>in</strong> stochastic regressi<strong>on</strong>.<br />
Biometrika, 85, 645–660.<br />
Fan, Y., Li, Q. (1996). C<strong>on</strong>sistent model specificati<strong>on</strong> test: omitted variables <strong>and</strong> semiparametric<br />
functi<strong>on</strong>al forms. Ec<strong>on</strong>ometrica, 64(4), 865–890.<br />
Hoeffd<strong>in</strong>g, W. (1948). A class <strong>of</strong> <strong>statistics</strong> with asymptotically normal distributi<strong>on</strong>. Annals <strong>of</strong><br />
Ma<strong>the</strong>matical Statistics, 19, 293–325.<br />
Hoeffd<strong>in</strong>g, W. (1961). The str<strong>on</strong>g law <strong>of</strong> large numbers for U <strong>statistics</strong>, Institute <strong>of</strong> Statistics,<br />
Mimeo Series 302, University <strong>of</strong> North Carol<strong>in</strong>a, Chapel Hill, NC.<br />
Kemp, G. (2000). Semiparametric estimati<strong>on</strong> <strong>of</strong> a logit model. Work<strong>in</strong>g paper, Department <strong>of</strong><br />
Ec<strong>on</strong>omics, University <strong>of</strong> Essex (http://www.essex.ac.uk /∼ kempgcr/ papers/logitmx.pdf).<br />
Lee, A. J. (1990). U-Statistics. New York: Marcel Dekker.<br />
Lee, B. (1988). N<strong>on</strong>parametric tests us<strong>in</strong>g a kernel estimati<strong>on</strong> method, Doctoral dissertati<strong>on</strong>,<br />
Department <strong>of</strong> Ec<strong>on</strong>omics, University <strong>of</strong> Wisc<strong>on</strong>s<strong>in</strong>, Madis<strong>on</strong>, WI.<br />
Mart<strong>in</strong>s-Filho, C., Yao, F. (2003). A n<strong>on</strong>parametric model <strong>of</strong> fr<strong>on</strong>tiers, work<strong>in</strong>g paper, Department<br />
<strong>of</strong> Ec<strong>on</strong>omics, Oreg<strong>on</strong> State University (http://oreg<strong>on</strong>state.edu/∼mart<strong>in</strong>sc/mart<strong>in</strong>s-filho-yao(03).pdf).<br />
Powell, J., Stock, J., Stoker, T. (1989). Semiparametric estimati<strong>on</strong> <strong>of</strong> <strong>in</strong>dex coefficients. Ec<strong>on</strong>ometrica,<br />
57, 1403–1430.<br />
Prakasa-Rao, B. L. S. (1983). N<strong>on</strong>parametric functi<strong>on</strong>al estimati<strong>on</strong>. Orl<strong>and</strong>o, FL: Academic<br />
Serfl<strong>in</strong>g, R. J. (1980). Approximati<strong>on</strong> <strong>the</strong>orems <strong>of</strong> ma<strong>the</strong>matical <strong>statistics</strong>. New York: Wiley.<br />
Yao, Q., T<strong>on</strong>g, H. (2000). N<strong>on</strong>parametric estimati<strong>on</strong> <strong>of</strong> ratios <strong>of</strong> noise to signal <strong>in</strong> stochastic<br />
regressi<strong>on</strong>. Statistica S<strong>in</strong>ica, 10, 751–770.<br />
Zheng, J. X. (1998). A c<strong>on</strong>sistent n<strong>on</strong>parametric test <strong>of</strong> parametric regressi<strong>on</strong> <strong>models</strong> under<br />
c<strong>on</strong>diti<strong>on</strong>al quantile restricti<strong>on</strong>s. Ec<strong>on</strong>ometric Theory, 14, 123–138.