
Chapter 4

The adaptive Lasso in a high dimensional setting

In a high dimensional setting, where the use of the Lasso is most justified due to its sparsity inducing property, there are to this date no asymptotic results similar to those of Knight and Fu (2000). Hence we turn to a variant, the adaptive Lasso, and present the results of Huang et al. (2008), who studied it under sparsity assumptions and a further partial orthogonality assumption between the relevant and the noise covariates. Their approach offers the advantage of providing an asymptotic normality result for the estimates corresponding to nonzero coefficients.

One typically resorts to a triangular array to model high dimensionality in linear models, that is, one assumes that
\[
Y_i = x_i'\beta_{n0} + \varepsilon_i, \qquad i = 1, \ldots, n, \tag{4.0.2.1}
\]
where the parameter $\beta_{n0}$ has a dimension $p_n$ allowed to grow faster than $n$. For observations $(Y_i, x_i)$, $i = 1, \ldots, n$, drawn from (4.0.2.1), the adaptive Lasso is defined as

\[
\arg\min_{\phi \in \mathbb{R}^{p_n}} L_n(\phi) = \arg\min_{\phi \in \mathbb{R}^{p_n}} \sum_{i=1}^{n} (Y_i - x_i'\phi)^2 + \lambda_n \sum_{j=1}^{p_n} w_{nj} |\phi_j|, \tag{4.0.2.2}
\]
where $\lambda_n$ is the usual penalty parameter and the $\{w_{nj}\}_j$ are in general strictly positive weights. Here, however, we make the assumption that they are computed from an initial estimator $\tilde{\beta}_{nj}$ of $\beta_{nj}$, that is,
\[
w_{nj} = |\tilde{\beta}_{nj}|^{-1}, \tag{4.0.2.3}
\]

for $j = 1, \ldots, p_n$. For notational convenience, in the remainder we drop the subscript $n$ for $\beta_{n0}$, yet dependence on $n$ is implicitly assumed.
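As a concrete illustration (a sketch, not part of the development in Huang et al. (2008)), the weighted criterion (4.0.2.2) can be minimized with ordinary Lasso software through a simple rescaling: divide column $j$ of the design by $w_{nj}$, solve a plain Lasso in the rescaled design, and divide the resulting coefficients by $w_{nj}$ again. The Python sketch below assumes scikit-learn is available; the marginal regression initial estimator, the function name adaptive_lasso and all numerical values are illustrative choices rather than prescriptions from the text.

\begin{verbatim}
import numpy as np
from sklearn.linear_model import Lasso

def adaptive_lasso(X, y, lam, beta_init=None):
    """Minimize criterion (4.0.2.2) by rescaling the design, calling a
    plain Lasso solver, and undoing the rescaling afterwards."""
    n, p = X.shape
    if beta_init is None:
        # Illustrative initial estimator: marginal (univariate) regression
        # coefficients; any reasonable initial estimator of beta_{nj}
        # could be plugged in here instead.
        beta_init = X.T @ y / (X ** 2).sum(axis=0)
    w = 1.0 / np.maximum(np.abs(beta_init), 1e-10)  # w_nj = |beta_tilde_nj|^{-1}
    X_w = X / w                                     # column j divided by w_nj
    # scikit-learn's Lasso minimizes (1/(2n))||y - Xb||^2 + alpha ||b||_1,
    # so alpha = lam / (2n) matches the unnormalized criterion (4.0.2.2).
    fit = Lasso(alpha=lam / (2 * n), fit_intercept=False).fit(X_w, y)
    return fit.coef_ / w                            # map back to the phi scale

# Toy example with p_n > n and a sparse beta_0
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta0 = np.zeros(p)
beta0[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]
y = X @ beta0 + rng.standard_normal(n)
print(np.flatnonzero(adaptive_lasso(X, y, lam=50.0)))  # indices of nonzero estimates
\end{verbatim}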

Next, we introduce further notation.

First, assume without loss of generality that $\beta_0$ takes the form
\[
\beta_0 = (\beta_1', \beta_2')'
\]

