Hypothesis testing in mixture regression models - Columbia University
Hypothesis testing in mixture regression models - Columbia University
Hypothesis testing in mixture regression models - Columbia University
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
8 H.-T. Zhu and H. Zhang<br />
MLR n when q 2 = 1. This co<strong>in</strong>cides with theorem 1 of Chen et al. (2001) when there are no<br />
covariates, i.e. q 1 = 0. Theorem 2, part (c), shows that the n −1=4 consistent rate for estimat<strong>in</strong>g<br />
the mix<strong>in</strong>g distribution G.µ/ is reachable by us<strong>in</strong>g ˆω P .<br />
Until now, we have systematically <strong>in</strong>vestigated the order of convergence rate for both K. ˆω P /<br />
and K. ˆω M /, and the asymptotic distributions of both LR n and MLR n . The next two issues are<br />
also very important. The first is how to compute the critical values for these potentially complicated<br />
distributions of test statistics. The second is to compare the power of LR n and MLR n .<br />
In what follows, we study the empirical and asymptotic behaviour of these two statistics under<br />
departures from hypothesis H 0 .<br />
2.1. A resampl<strong>in</strong>g method<br />
Although we have obta<strong>in</strong>ed the asymptotic distributions of the likelihood-based statistics, the<br />
limit<strong>in</strong>g distributions usually have complicated analytic forms. To alleviate this difficulty, we<br />
use a resampl<strong>in</strong>g technique to calculate the critical values of the <strong>test<strong>in</strong>g</strong> statistics. Although the<br />
bootstrapp<strong>in</strong>g method is an obvious approach, it requires repeated maximizations of the likelihood<br />
and modified likelihood functions. The maximizations are computationally <strong>in</strong>tensive for<br />
the f<strong>in</strong>ite <strong>mixture</strong> <strong>models</strong>. Thus, we prefer a computationally more efficient method, as proposed<br />
and used by Hansen (1996), Kosorok (2003) and others.<br />
On the basis of equations (6) and (7), we only need to focus on W n .µ 2 / and J n .µ 2 /. If hypothesis<br />
H 0 is true, . ˆβ 0 ,ˆµ 0 / = arg max ω∈Ω0 {L n .ω/} provides consistent estimators of βÅ and µÅ.By<br />
substitut<strong>in</strong>g .βÅ, µÅ/ with . ˆβ 0 ,ˆµ 0 / <strong>in</strong> the def<strong>in</strong>itions of W n .µ 2 / and J n .µ 2 /, we obta<strong>in</strong> ŵ i .µ 2 /,<br />
Ŵ n .µ 2 / and Ĵ n .µ 2 / accord<strong>in</strong>gly, i.e.<br />
ŵ i .µ 2 / = .F i,1 . ˆβ 0 ,ˆµ 0 / T , F i,2 . ˆβ 0 ,ˆµ 0 /, dvecs.F i,6 . ˆβ 0 , µ 2 // T / T ,<br />
Ŵ n .µ 2 / = √ 1 n∑<br />
ŵ i .µ 2 /,<br />
n<br />
Jˆ<br />
n .µ 2 / = 1 n∑<br />
ŵ i .µ 2 / ŵ i .µ 2 / T :<br />
n i=1<br />
To assess the significance level and the power of the two test statistics, we need to obta<strong>in</strong><br />
empirical distributions for these statistics <strong>in</strong> lieu of their theoretical distributions. What follow<br />
are the four key steps <strong>in</strong> generat<strong>in</strong>g the stochastic processes that have the same asymptotic<br />
distributions as the test statistics.<br />
Step 1: we generate <strong>in</strong>dependent and identically distributed random samples, {v i, m : i =<br />
1,...,n}, from the standard normal distribution N.0, 1/: Here, m represents a replication<br />
number.<br />
Step 2: we calculate<br />
and<br />
Q m n .λ, µ 2/ = .λ − ˆ<br />
i=1<br />
Ŵn m .µ ∑<br />
2/ = n ŵ i .µ 2 /v i, m = √ n<br />
J −1<br />
n<br />
i=1<br />
.µ 2/Wn m .µ 2// T Jˆ<br />
n .µ 2 /.λ − ˆ<br />
n .µ 2/Wn m .µ 2//:<br />
It is important to note that Ŵ m n .µ 2/ converges weakly to W.µ 2 / as n →∞. This claim can be<br />
proved by us<strong>in</strong>g the conditional central limit theorem; see theorem (10.2) of Pollard (1990)<br />
and theorem 2 of Hansen (1996).<br />
J −1