Subsampling estimates of the Lasso distribution.
Numerical results
6.1.1 Confidence intervals
We recall the definition of a confidence interval.
Definition 6.1.1.1. Let $\alpha \in (0, 1)$. An interval-valued function $I^{(j)}(Z^{(n)})$ is called a confidence interval with confidence level $1 - \alpha$ for the parameter $\beta_j$ if it satisfies
\[
P\left( \beta_j \in I^{(j)}(Z^{(n)}) \right) \ge 1 - \alpha.
\]
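The coverage requirement in the definition can be checked empirically by simulation. The following minimal sketch does this for a deliberately simple textbook case that is not taken from the text: the mean of Gaussian observations with known unit variance and the interval $\bar X \pm 1.96/\sqrt{n}$ (so $\alpha = 0.05$). The function name `empirical_coverage` and all parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def empirical_coverage(mu=0.0, n=50, reps=2000, z=1.96, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu.

    Assumes i.i.d. N(mu, 1) observations (known unit variance), so the
    interval is sample_mean +/- z / sqrt(n).
    """
    rng = random.Random(seed)
    half_width = z / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / reps


coverage = empirical_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

The printed fraction should be close to the nominal level $1 - \alpha = 0.95$; the same kind of check applies to any interval construction once the true parameter is known from the simulation design.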
Estimation procedure
For a sample $(Y_i, x_i)_{i=1}^n$ of simulated observations, we proceed as follows to construct confidence intervals for the coefficients:
1. Compute the Lasso solution path for the scaled and centered observations, that is, for
\[
\tilde Y_i = Y_i - n^{-1} \sum_{l=1}^{n} Y_l, \quad i = 1, \dots, n,
\]
\[
\tilde x_{ij} = \Big( x_{ij} - n^{-1} \sum_{l=1}^{n} x_{lj} \Big) \Big( n^{-1} \sum_{l=1}^{n} \big( x_{lj} - n^{-1} \sum_{k=1}^{n} x_{kj} \big)^2 \Big)^{-1}, \quad j = 1, \dots, p, \ i = 1, \dots, n.
\]
2. Choose the penalization parameter by K-fold cross-validation (we choose K = 10) on the whole data set $(\tilde Y_i, \tilde x_i)_{i=1}^n$. Denote it by $\lambda_{n,CV}$.
3. Set $\hat\beta_n$ as the Lasso solution to the data set $(\tilde Y_i, \tilde x_i)_{i=1}^n$ corresponding to the parameter $\lambda_{n,CV}$.
4. Repeat the following steps for $m = 1, \dots, B$:
   (a) Generate a random subsample $I_m \subset \{1, \dots, n\}$ of size $b$ by drawing without replacement.
   (b) Compute the Lasso solution path for the scaled and centered data set with indices $i \in I_m$, that is, for
\[
\tilde Y_i^{(m)} = Y_i - b^{-1} \sum_{l \in I_m} Y_l, \quad i \in I_m,
\]
\[
\tilde x_{ij}^{(m)} = \Big( x_{ij} - b^{-1} \sum_{l \in I_m} x_{lj} \Big) \Big( b^{-1} \sum_{l \in I_m} \big( x_{lj} - b^{-1} \sum_{k \in I_m} x_{kj} \big)^2 \Big)^{-1}, \quad j = 1, \dots, p, \ i \in I_m.
\]
   (c) Set $\hat\beta_b^{(m)}$ as the Lasso solution to the data set $(\tilde Y^{(m)}, \tilde X^{(m)})$ corresponding to the rescaled penalization parameter $\lambda_{b,CV} = \lambda_{n,CV} \sqrt{b/n}$. Set $L_{n,b,m}$ and $L_{n,r,m}$ to
\[
L_{n,b,m} = \sqrt{b} \, \big( \hat\beta_b^{(m)} - \hat\beta_n \big)
\]
and
\[
L_{n,r,m} = \sqrt{r} \, \big( \hat\beta_b^{(m)} - \hat\beta_n \big),
\]
respectively. Here, $r = b/(1 - b/n)$ is the finite-sample corrected subsample size, cf. Politis et al. (1999, Section 10.3.1).
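Steps 1, 3, and 4(a) to 4(c) can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used for the simulations. The Lasso objective is assumed to be $\tfrac{1}{2}\|y - Xb\|_2^2 + \lambda \|b\|_1$ and is solved by a plain coordinate-descent routine rather than a full solution path. The cross-validation of step 2 is skipped, and the penalty `lam_n` is taken as given. The function names `standardize`, `lasso_cd`, and `subsample_roots` are hypothetical.

```python
import numpy as np


def standardize(Y, X):
    """Center Y; center each column of X and divide it by its empirical
    variance, following the displayed scaling formula (exponent -1)."""
    Yc = Y - Y.mean()
    Xc = X - X.mean(axis=0)
    scale = (Xc ** 2).mean(axis=0)
    return Yc, Xc / scale


def soft_threshold(z, t):
    """Soft-thresholding operator sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for the ASSUMED objective
    0.5 * ||y - X b||^2 + lam * ||b||_1 (fixed number of sweeps)."""
    p = X.shape[1]
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = np.asarray(y, dtype=float).copy()  # residual for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # drop coordinate j
            rho = X[:, j] @ resid
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add updated coordinate back
    return beta


def subsample_roots(Y, X, lam_n, b, B, seed=0):
    """Return beta_n and the B x p arrays of L_{n,b,m} and L_{n,r,m}."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Yt, Xt = standardize(Y, X)
    beta_n = lasso_cd(Xt, Yt, lam_n)            # step 3, lam_n given
    lam_b = lam_n * np.sqrt(b / n)              # rescaled penalty
    r = b / (1.0 - b / n)                       # corrected subsample size
    L_b = np.empty((B, p))
    L_r = np.empty((B, p))
    for m in range(B):
        idx = rng.choice(n, size=b, replace=False)   # step 4(a)
        Ym, Xm = standardize(Y[idx], X[idx])         # step 4(b)
        beta_m = lasso_cd(Xm, Ym, lam_b)             # step 4(c)
        L_b[m] = np.sqrt(b) * (beta_m - beta_n)
        L_r[m] = np.sqrt(r) * (beta_m - beta_n)
    return beta_n, L_b, L_r
```

A possible call on simulated data, with arbitrary illustrative values for the design, the penalty, and the sizes $b$ and $B$:

```python
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
Y = X @ np.array([2.0, 0.0, -1.0]) + rng.normal(size=100)
beta_n, L_b, L_r = subsample_roots(Y, X, lam_n=1.0, b=50, B=20)
```

Since both quantities are built from the same difference $\hat\beta_b^{(m)} - \hat\beta_n$, the rows of `L_r` equal those of `L_b` multiplied by $\sqrt{r/b} = (1 - b/n)^{-1/2}$.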