Hypothesis testing in mixture regression models

H.-T. Zhu and H. Zhang

in model (1) are very important in genetic studies for assessing potential gene–environment interactions.

Beyond this example, finite mixtures of Bernoulli distributions such as model (1) have received much attention in the last five decades. See Teicher (1963) for an early example. More recently, Wang and Puterman (1998) among others generalized binomial finite mixtures to mixture logistic regression models, and Zhang et al. (2003) applied mixture cumulative logistic models to analyse correlated ordinal responses.

1.2. Example 2: mixture of non-linear hierarchical models

Longitudinal and genetic studies commonly involve a continuous response {Y_ij}, also referred to as a quantitative trait. See, for example, Diggle et al. (2002), Haseman and Elston (1972) and Risch and Zhang (1995). Pauler and Laird (2000) used general finite mixture non-linear hierarchical models to analyse longitudinal data from heterogeneous subpopulations. Specifically, when there are only two subgroups, the model is of the form

    Y_ij = g{x_ij, β, U_i z_ij μ_1 + (1 − U_i) z_ij μ_2} + ε_ij,

where the ε_ij are independent and identically distributed according to N(0, σ²) and g(·) is a prespecified function. Here, the known covariates x_ij may contain observed time points to reflect a time course in longitudinal data.
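To make this concrete, here is a minimal simulation sketch of the two-subgroup model. It is not part of the paper: the particular non-linear function g, the parameter values, the time grid and taking z_ij = 1 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x_ij, beta, group_term):
    # Illustrative choice of the prespecified non-linear function g:
    # an exponential decay over time, shifted by the group contribution.
    return np.exp(-beta * x_ij) + group_term

n, n_i = 100, 5                  # subjects and observations per subject
alpha, beta, sigma = 0.3, 0.5, 0.2
mu1, mu2 = 1.0, 2.5              # subgroup-specific effects (z_ij taken as 1)

t = np.linspace(0.0, 2.0, n_i)              # observed time points x_ij
U = rng.binomial(1, 1 - alpha, size=n)      # latent labels, P(U_i = 1) = 1 - alpha
group_term = np.where(U[:, None] == 1, mu1, mu2)

# Y_ij = g{x_ij, beta, U_i z_ij mu_1 + (1 - U_i) z_ij mu_2} + eps_ij
Y = g(t[None, :], beta, group_term) + rng.normal(0.0, sigma, size=(n, n_i))
```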

1.3. Example 3: a finite mixture of Poisson regression models

The Poisson distribution and Poisson regression have been widely used to analyse count data (McCullagh and Nelder, 1989), but observed count data often exhibit overdispersion relative to the Poisson distribution. Finite mixture Poisson regression models (Wang et al., 1996) provide a plausible explanation for overdispersion. Specifically, conditionally on all U_i's, the Y_ij's are independent and follow the Poisson regression model

    p(Y_ij = y_ij | x_ij, U_i) = (1/y_ij!) λ_ij^{y_ij} exp(−λ_ij),   (2)

where λ_ij = exp{x_ij β + U_i z_ij μ_1 + (1 − U_i) z_ij μ_2}.
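As an aside (not in the paper), a short simulation of model (2), with scalar covariates, z_ij = 1 and arbitrary parameter values, illustrates the overdispersion that the latent U_i induces in the marginal counts.

```python
import numpy as np

rng = np.random.default_rng(1)

n, n_i = 500, 4
alpha = 0.4                        # P(U_i = 0) = alpha
beta, mu1, mu2 = 0.3, 0.5, 2.0     # illustrative scalar parameters

x = rng.uniform(0.0, 1.0, size=(n, n_i))     # covariates x_ij (scalars here)
z = np.ones((n, n_i))                        # z_ij taken as 1 for simplicity
U = rng.binomial(1, 1 - alpha, size=n)[:, None]

# lambda_ij = exp{x_ij beta + U_i z_ij mu_1 + (1 - U_i) z_ij mu_2}
lam = np.exp(x * beta + U * z * mu1 + (1 - U) * z * mu2)
Y = rng.poisson(lam)

# Marginally the variance exceeds the mean: the usual symptom of overdispersion
print(Y.mean(), Y.var())
```

A single Poisson regression fitted to such data would have to absorb this extra variation, which is the sense in which the mixture "explains" the overdispersion.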

To summarize the models presented above, we consider a random sample of n independent observations {y_i, X_i}_{i=1}^n with the density function

    p_i(y_i, x_i; ω) = {(1 − α) f_i(y_i, x_i; β, μ_1) + α f_i(y_i, x_i; β, μ_2)} g_i(x_i),   (3)

where g_i(x_i) is the distribution function of X_i. Further, ω = (α, β, μ_1, μ_2) is the unknown parameter vector, in which β (q_1 × 1) measures the strength of association that is contributed by the covariate terms and the two q_2 × 1 vectors, μ_1 and μ_2, represent the different contributions from two different groups.
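For concreteness, the sketch below (our illustration, not part of the paper) writes down the marginal log-likelihood implied by (3) when f_i is taken to be the Poisson density of Example 3; the multiplicative factor g_i(x_i) does not involve ω and is dropped.

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import logsumexp

def log_likelihood(omega, y, x, z):
    """Sum over i of log p_i(y_i, x_i; omega) under model (3), with f_i the
    Poisson regression density of Example 3 and g_i(x_i) omitted."""
    alpha, beta, mu1, mu2 = omega
    # log f_i(y_i, x_i; beta, mu_k): the product over j becomes a sum of log-pmfs
    logf1 = poisson.logpmf(y, np.exp(x * beta + z * mu1)).sum(axis=1)
    logf2 = poisson.logpmf(y, np.exp(x * beta + z * mu2)).sum(axis=1)
    # log{(1 - alpha) f_i(.; mu_1) + alpha f_i(.; mu_2)}, computed stably
    terms = np.stack([np.log1p(-alpha) + logf1, np.log(alpha) + logf2])
    return logsumexp(terms, axis=0).sum()
```

With the arrays Y, x and z from the previous sketch, log_likelihood((0.4, 0.3, 0.5, 2.0), Y, x, z) evaluates this criterion at the generating values; maximizing it over ω gives the maximum likelihood estimate.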

Equivalently, if we consider P(U_i = 0) = 1 − P(U_i = 1) = α, and assume that the conditional density of y_i given U_i is p_i(y_i | U_i) = f_i{y_i, x_i; β, μ_2(1 − U_i) + μ_1 U_i}, then model (3) is the marginal density of y_i. In fact, McCullagh and Nelder (1989) considered a special case in which f_i is from an exponential family distribution, i.e.

    f_i{y_i, x_i; β, μ_2(1 − U_i) + μ_1 U_i} = ∏_{j=1}^{n_i} exp[φ{y_ij θ_ij − a(θ_ij)} + c(y_ij, φ)],   (4)

where θ_ij = h{x_ij, β, U_i μ_1 + (1 − U_i) μ_2}, h(·) is a link function and φ is a dispersion parameter. This family of mixture regression models is very useful in practice.
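As a check on the notation (this connection is standard and not spelled out in the text), the Poisson regression model (2) is the member of family (4) obtained with

    θ_ij = log λ_ij = x_ij β + U_i z_ij μ_1 + (1 − U_i) z_ij μ_2,   a(θ_ij) = exp(θ_ij) = λ_ij,   φ = 1,   c(y_ij, φ) = −log(y_ij!),

so that exp[φ{y_ij θ_ij − a(θ_ij)} + c(y_ij, φ)] = (1/y_ij!) λ_ij^{y_ij} exp(−λ_ij), in agreement with (2).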
