Subsampling estimates of the Lasso distribution.
Chapter 4
The adaptive Lasso in a high dimensional setting
In a high dimensional setting, where the use of the Lasso is most justified owing to its
sparsity-inducing property, there are to date no asymptotic results similar to those of
Knight and Fu (2000). Hence we turn to a variant, the adaptive Lasso, and present the
results of Huang et al. (2008), who studied it under sparsity assumptions and a further
partial orthogonality assumption between relevant and noise covariates. Their approach
has the advantage of providing an asymptotic normality result for the estimates corresponding
to nonzero coefficients.
One typically resorts to a triangular array to model high dimensionality in linear models,<br />
that is, one assumes that<br />
\[ Y_i = x_i' \beta_{n0} + \varepsilon_i, \qquad i = 1, \dots, n, \tag{4.0.2.1} \]
where the parameter $\beta_{n0}$ has a dimension $p_n$ allowed to grow faster than $n$. For observations
$(Y_i, x_i)$, $i = 1, \dots, n$, drawn from (4.0.2.1), the adaptive Lasso is defined as
\[ \arg\min_{\phi \in \mathbb{R}^{p_n}} L_n(\phi) = \arg\min_{\phi \in \mathbb{R}^{p_n}} \Big\{ \sum_{i=1}^{n} (Y_i - x_i' \phi)^2 + \lambda_n \sum_{j=1}^{p_n} w_{nj} |\phi_j| \Big\}, \tag{4.0.2.2} \]
where $\lambda_n$ is the usual penalty parameter and the $\{w_{nj}\}_j$ are in general strictly positive
weights. Here, however, we assume that they are computed from an initial
estimator $\tilde\beta_{nj}$ of $\beta_{nj}$, that is,
\[ w_{nj} = |\tilde\beta_{nj}|^{-1}, \tag{4.0.2.3} \]
for $j = 1, \dots, p_n$. For notational convenience, we drop the subscript $n$
from $\beta_{n0}$ in the remainder, although the dependence on $n$ is implicitly assumed.
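To make the definition concrete, the following is a minimal sketch of how the criterion (4.0.2.2) with the weights (4.0.2.3) could be minimised numerically. It rests on assumptions that are not part of the text: the solver is plain cyclic coordinate descent with soft-thresholding, the function names `soft_threshold`, `adaptive_lasso` and `marginal_regression` are hypothetical, and the marginal (one covariate at a time) least squares fit is only one possible choice of initial estimator. An initial estimate equal to zero is treated as an infinite weight, which keeps the corresponding coefficient at zero.

```python
import math


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def marginal_regression(X, y):
    """Assumed initial estimator: least squares slope of y on each column alone."""
    n, p = len(X), len(X[0])
    init = []
    for j in range(p):
        ss = sum(X[i][j] ** 2 for i in range(n))
        xy = sum(X[i][j] * y[i] for i in range(n))
        init.append(xy / ss if ss > 0 else 0.0)
    return init


def adaptive_lasso(X, y, beta_init, lam, n_iter=200, tol=1e-10):
    """Minimise sum_i (y_i - x_i' phi)^2 + lam * sum_j w_j |phi_j|,
    with w_j = 1 / |beta_init_j|, by cyclic coordinate descent.

    X is a list of n rows (each a list of p floats); y is a list of n floats.
    """
    n, p = len(X), len(X[0])
    # Weights as in (4.0.2.3); a zero initial estimate gives an infinite weight.
    w = [1.0 / abs(b) if b != 0 else math.inf for b in beta_init]
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    phi = [0.0] * p
    resid = list(y)  # residuals y - X phi, with phi = 0 at the start
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_ss[j] == 0.0 or math.isinf(w[j]):
                continue  # coefficient stays at zero
            # Inner product of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_ss[j] * phi[j]
            # The squared loss carries no factor 1/2, hence the threshold lam * w_j / 2.
            new = soft_threshold(rho, lam * w[j] / 2.0) / col_ss[j]
            delta = new - phi[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                phi[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return phi
```

A call such as `adaptive_lasso(X, y, marginal_regression(X, y), lam)` also runs when the number of columns exceeds the number of rows, since each update only involves one column at a time. The value of `lam` is left to the user in this sketch.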
Next, we introduce further notation.
First, assume without loss of generality that $\beta_0$ takes the form
\[ \beta_0 = (\beta_1', \beta_2')' \]