Subsampling estimates of the Lasso distribution.
Let $\varepsilon > 0$. For arbitrary $M > 0$ we have
\begin{align*}
P\bigl(\Delta_n(\delta) > \varepsilon\bigr)
  &= P\bigl(\Delta_n(\delta) > \varepsilon;\ \|\alpha\| \le M\bigr)
   + P\bigl(\Delta_n(\delta) > \varepsilon;\ \|\alpha\| > M\bigr) \\
  &\le P\Bigl(\sup_{\|x\| \le \delta + M} |f_n(x) - f(x)| > \varepsilon\Bigr)
   + P\bigl(\|\alpha\| > M\bigr) \\
  &= o(1) + P\bigl(\|\alpha\| > M\bigr)
\end{align*}
by Theorem 2.1.0.2. Letting $M$ tend to infinity completes the argument.
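The inequality in the display rests on an inclusion of events. A minimal sketch of that step, under the assumption (made here only for exposition, consistent with the notation fixed earlier in the chapter) that $\Delta_n(\delta) = \sup_{\|x-\alpha\|\le\delta} |f_n(x) - f(x)|$: on the event $\{\|\alpha\| \le M\}$, every $x$ with $\|x - \alpha\| \le \delta$ satisfies $\|x\| \le \delta + M$ by the triangle inequality, so
\[
\bigl\{\Delta_n(\delta) > \varepsilon,\ \|\alpha\| \le M\bigr\}
  \subseteq \Bigl\{\sup_{\|x\| \le \delta + M} |f_n(x) - f(x)| > \varepsilon\Bigr\},
\]
while the second event of the first line is simply enlarged to $\{\|\alpha\| > M\}$ by dropping the condition on $\Delta_n(\delta)$.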
Now we can prove that
\[
\alpha_n \xrightarrow{P} \alpha.
\]
Let $\delta > 0$. For fixed arbitrary $M > 0$ we have
\begin{align*}
P\bigl(\|\alpha_n - \alpha\| > \delta\bigr)
  &\le P\Bigl(\Delta_n(\delta) \ge \tfrac{1}{2}\, h(\delta)\Bigr) \\
  &= P\Bigl(\Delta_n(\delta) \ge \tfrac{1}{2}\, h(\delta);\ \tfrac{1}{h(\delta)} \le M\Bigr)
   + P\Bigl(\Delta_n(\delta) \ge \tfrac{1}{2}\, h(\delta);\ \tfrac{1}{h(\delta)} > M\Bigr) \\
  &\le P\Bigl(\Delta_n(\delta) \ge \tfrac{1}{2M}\Bigr)
   + P\Bigl(\tfrac{1}{h(\delta)} > M\Bigr).
\end{align*}
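A brief expansion of the last two bounds, using nothing beyond the displayed events: on $\{1/h(\delta) \le M\}$ one has $h(\delta) \ge 1/M$, hence
\[
\Bigl\{\Delta_n(\delta) \ge \tfrac{1}{2}\, h(\delta),\ \tfrac{1}{h(\delta)} \le M\Bigr\}
  \subseteq \Bigl\{\Delta_n(\delta) \ge \tfrac{1}{2M}\Bigr\},
\]
while the second event is enlarged by dropping the condition on $\Delta_n(\delta)$.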
For fixed $M$, the first term tends to zero as $n$ tends to infinity by 2.1.0.3. The second term then tends to zero as $M$ tends to infinity; indeed, $h(\delta)^{-1}$ is almost surely finite by uniqueness of $\alpha$. This completes the proof.
2.2 Convergence in distribution

The goal of this section is to derive conditions under which a sequence of minimizers of convex objective functions converges in distribution, and to provide means of identifying the limit. First, the concept of weak convergence must be revisited so that it encompasses not necessarily measurable maps defined on probability spaces; non-measurability is a feature typically exhibited by argmin functionals. This wider concept of weak convergence was originally introduced by Hofmann-Jørgensen but first exposited in Dudley (1985) and Pollard (1990). In the present section, however, we follow the more mature exposition of Van der Vaart and Wellner (1996). Indeed, they showed that even in this general setting most of the important results of weak convergence theory, from the portmanteau theorem through Prohorov's theorem to the almost sure representation theorem, remain valid, provided one makes the necessary, but essentially minor, modifications. The section ends with the argmin continuous mapping theorem, which will be applied to the Lasso estimator in the next chapter.
Definition 2.2.0.4. Let $(\Omega, \mathcal{A}, P)$ be a probability space and $T : \Omega \to \mathbb{R}$ an arbitrary map.

(i) The outer expectation and the inner expectation of $T$ with respect to $P$ are defined as
\[
E^*(T) = \inf\bigl\{E(U) \;\big|\; U \ge T,\ U : \Omega \to \mathbb{R} \text{ measurable and } E(U) \text{ exists}\bigr\} \tag{2.2.0.4}
\]