Subsampling estimates of the Lasso distribution.
26 Application to the Lasso estimator
Conversely, suppose that 3.2.1.2 holds; we show that the minimizer necessarily satisfies
$\hat u_2 = 0$. Indeed, if $\hat u_2 \neq 0$, then equality is attained in 3.2.1.2 in every row corresponding
to a nonzero component of $\hat u_2$, that is,
$$
\Big( -W_2 + C_{21} C_{11}^{-1} \big( W_1 - \lambda_0 \operatorname{sgn}(\beta) \big) + \big( C_{22} - C_{21} C_{11}^{-1} C_{12} \big) \hat u_2 \Big)_j =
\begin{cases}
-\dfrac{\lambda_0}{2}, & \text{if } \hat u_2^j > 0, \\[1ex]
\dfrac{\lambda_0}{2}, & \text{if } \hat u_2^j < 0.
\end{cases}
$$
In particular, it follows from 3.2.1.2 that
$$
\Big( \big( C_{22} - C_{21} C_{11}^{-1} C_{12} \big) \hat u_2 \Big)_j \in
\begin{cases}
[-\lambda_0, 0], & \text{if } \hat u_2^j > 0, \\
[0, \lambda_0], & \text{if } \hat u_2^j < 0.
\end{cases}
\tag{3.2.1.3}
$$
Note that $C_{22} - C_{21} C_{11}^{-1} C_{12}$, the Schur complement of $C_{11}$ in $C$, is SPD since $C$ is. Let
$D$ be the matrix obtained from $C_{22} - C_{21} C_{11}^{-1} C_{12}$ by removing the rows and
columns corresponding to the zero components of $\hat u_2$; as a principal submatrix of an SPD matrix, $D$ is also SPD. Let $\hat u_2^{\neq 0}$ be the vector
obtained from $\hat u_2$ by removing its zero components. By 3.2.1.3, each component of $D \hat u_2^{\neq 0}$ is either zero or of sign opposite to the corresponding component of $\hat u_2^{\neq 0}$, so that
$(\hat u_2^{\neq 0})^T D \hat u_2^{\neq 0} \le 0$. Since $\hat u_2^{\neq 0} \neq 0$, this contradicts the fact that $D$ is SPD, which completes the argument.
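The linear-algebra facts used in this argument can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original proof: the matrix `C`, the dimensions `p` and `r`, and the index set `keep` (standing in for the nonzero components of $\hat u_2$) are hypothetical choices made purely for illustration.

```python
import numpy as np

def schur_complement(C, r):
    """Schur complement of the leading r x r block C11 in C."""
    C11, C12 = C[:r, :r], C[:r, r:]
    C21, C22 = C[r:, :r], C[r:, r:]
    return C22 - C21 @ np.linalg.solve(C11, C12)

def is_spd(M, tol=1e-10):
    """Symmetric with strictly positive eigenvalues (up to a tolerance)."""
    return bool(np.allclose(M, M.T) and np.linalg.eigvalsh(M).min() > tol)

rng = np.random.default_rng(0)
p, r = 6, 2                               # hypothetical dimensions
A = rng.standard_normal((p, p))
C = A @ A.T + p * np.eye(p)               # SPD by construction

S = schur_complement(C, r)                # C22 - C21 C11^{-1} C12
keep = np.array([0, 2, 3])                # hypothetical nonzero components of u2
D = S[np.ix_(keep, keep)]                 # principal submatrix of S

v = rng.standard_normal(keep.size)        # any nonzero vector
quad = float(v @ D @ v)                   # strictly positive when D is SPD
```

Because `quad` is strictly positive for every nonzero `v`, no nonzero vector can have all entries of `D @ v` zero or opposite in sign to the entries of `v`, which is the contradiction the argument relies on.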
Finally, we show that every component of the limit distribution has a discontinuity only at
zero and is otherwise Gaussian. Consider, after a possible permutation of the
last $p - r$ covariates, a further partitioning of $W_2$, $u_2$ and $C$ of the form
$$
W_2 = \begin{pmatrix} W_{2,1} \\ W_{2,0} \end{pmatrix},
\qquad
u_2 = \begin{pmatrix} u_{2,1} \\ u_{2,0} \end{pmatrix},
$$
and
$$
C = \begin{pmatrix}
C_{11} & C_{12} & C_{13} \\
C_{21} & C_{22} & C_{23} \\
C_{31} & C_{32} & C_{33}
\end{pmatrix}
$$
respectively, where $W_{2,1}$ and $u_{2,1}$ are $r'$-vectors for some $0 < r' < p - r$. For notational
convenience, denote
$$
\tilde C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}.
$$
Next, consider an arbitrary sign pattern $s' \in \{-1, +1\}^{r'}$. Then, given the event
$$
\{ \operatorname{sgn}(u_{2,1}) = s' ;\ u_{2,0} = 0 \},
$$
$(u_1, u_{2,1})'$ has the following distribution:
$$
\begin{pmatrix} u_1 \\ u_{2,1} \end{pmatrix}
\sim N\left( \sigma^2 \tilde C,\ -\frac{\lambda_0}{2} \tilde C^{-1} \begin{pmatrix} \operatorname{sgn}(\beta) \\ s' \end{pmatrix} \right)
$$
This completes the proof.
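To make the last display more concrete, the vector $-\frac{\lambda_0}{2} \tilde C^{-1} (\operatorname{sgn}(\beta), s')'$ can be evaluated on a toy example. The sketch below, in Python with NumPy, is only an illustration under assumed inputs: the helper `shift_term`, the matrix `C`, the coefficients `beta_active`, the sign pattern `s_prime` and the value `lam0` are all hypothetical and do not come from the text.

```python
import numpy as np

def shift_term(C, beta_active, s_prime, lam0):
    """Evaluate -(lam0 / 2) * C_tilde^{-1} (sgn(beta), s')'.

    C_tilde is the leading (r + r') x (r + r') block of C, where r is the
    number of nonzero coefficients and r' the length of the sign pattern.
    """
    signs = np.concatenate([np.sign(beta_active), s_prime]).astype(float)
    k = signs.size                          # k = r + r'
    C_tilde = C[:k, :k]                     # blocks C11, C12, C21, C22
    shift = -(lam0 / 2.0) * np.linalg.solve(C_tilde, signs)
    return shift, C_tilde, signs

rng = np.random.default_rng(1)
p = 5                                       # hypothetical number of covariates
A = rng.standard_normal((p, p))
C = A @ A.T + p * np.eye(p)                 # SPD by construction

beta_active = np.array([1.5, -0.7])         # r = 2 nonzero coefficients (assumed)
s_prime = np.array([-1.0])                  # r' = 1, so 0 < r' < p - r holds
lam0 = 2.0                                  # assumed value of lambda_0

shift, C_tilde, signs = shift_term(C, beta_active, s_prime, lam0)
```

Using `np.linalg.solve` instead of forming $\tilde C^{-1}$ explicitly is a numerical design choice: the result satisfies $\tilde C \cdot \text{shift} = -\frac{\lambda_0}{2} (\operatorname{sgn}(\beta), s')'$ up to rounding.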