21.06.2014 Views

Subsampling estimates of the Lasso distribution.


Numerical results

6.1.1 Confidence intervals

We recall the definition of a confidence interval.

Definition 6.1.1.1. Let $\alpha \in (0, 1)$. An interval-valued function $I^{(j)}(Z^{(n)})$ is called a confidence interval for the parameter $\beta_j$ if it satisfies

$$P\left( \beta_j \in I^{(j)}(Z^{(n)}) \right) \ge 1 - \alpha.$$
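To make the coverage requirement concrete, here is a minimal sketch that estimates the probability on the left-hand side by Monte Carlo. It does not use the Lasso: the toy model (a normal mean with known variance), the textbook z-interval, and the function name `empirical_coverage` are all assumptions chosen for illustration only.

```python
import math
import random
from statistics import NormalDist


def empirical_coverage(alpha=0.05, n=30, mu=1.0, sigma=1.0, reps=5000, seed=0):
    """Estimate P(mu in I) for the known-variance z-interval (toy assumption)."""
    rng = random.Random(seed)
    # Half-width of the interval: z_{1 - alpha/2} * sigma / sqrt(n)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Count the replication if the true parameter lies in the interval
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / reps


cov = empirical_coverage()
print(f"empirical coverage: {cov:.3f} (nominal level: 0.95)")
```

An interval satisfies the definition when this frequency, in the limit of many replications, is at least $1 - \alpha$.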

Estimation procedure

For a sample $(Y_i, x_i)_{i=1}^n$ of simulated observations, we proceed as follows to construct confidence intervals for the coefficients:

1. Compute the Lasso solution path for the scaled and centered observations, that is, for

$$\tilde{Y}_i = Y_i - n^{-1} \sum_{l=1}^{n} Y_l, \quad i = 1, \dots, n,$$

$$\tilde{x}_{ij} = \left( x_{ij} - n^{-1} \sum_{l=1}^{n} x_{lj} \right) \left( n^{-1} \sum_{l=1}^{n} \Big( x_{lj} - n^{-1} \sum_{k=1}^{n} x_{kj} \Big)^2 \right)^{-1}, \quad j = 1, \dots, p, \; i = 1, \dots, n.$$

2. Choose the penalization parameter by K-fold cross-validation (we choose $K = 10$) on the whole data set $(\tilde{Y}_i, \tilde{x}_i)_{i=1}^n$. Denote it by $\lambda_{n,CV}$.

3. Set $\hat{\beta}_n$ as the Lasso solution to the data set $(\tilde{Y}_i, \tilde{x}_i)_{i=1}^n$ corresponding to the parameter $\lambda_{n,CV}$.

4. Repeat the following steps for $m = 1, \dots, B$:

(a) Generate a random subsample $I_m \subset \{1, \dots, n\}$ of size $b$ by drawing without replacement.

(b) Compute the Lasso solution path for the scaled and centered data set with indices $i \in I_m$, that is, for

$$\tilde{Y}_i^{(m)} = Y_i - b^{-1} \sum_{l \in I_m} Y_l, \quad i \in I_m,$$

$$\tilde{x}_{ij}^{(m)} = \left( x_{ij} - b^{-1} \sum_{l \in I_m} x_{lj} \right) \left( b^{-1} \sum_{l \in I_m} \Big( x_{lj} - b^{-1} \sum_{k \in I_m} x_{kj} \Big)^2 \right)^{-1}, \quad j = 1, \dots, p, \; i \in I_m.$$

(c) Set $\hat{\beta}_b^{(m)}$ as the Lasso solution to the data set $(\tilde{Y}^{(m)}, \tilde{X}^{(m)})$ corresponding to the rescaled penalization parameter $\lambda_{b,CV} = \lambda_{n,CV} \sqrt{b/n}$. Set $L_{n,b,m}$ and $L_{n,r,m}$ to

$$L_{n,b,m} = \sqrt{b} \left( \hat{\beta}_b^{(m)} - \hat{\beta}_n \right) \quad \text{and} \quad L_{n,r,m} = \sqrt{r} \left( \hat{\beta}_b^{(m)} - \hat{\beta}_n \right),$$

respectively. Here, $r = b/(1 - b/n)$ is the finite-sample-corrected subsample size, cf. (Politis et al., 1999, Section 10.3.1).
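The subsampling loop of steps 1 to 4 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation used for the reported results: the 10-fold cross-validation of step 2 is replaced by a fixed, user-supplied `lam_n`; the Lasso objective is assumed to be $\tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$; the solver is a basic coordinate descent at a single penalty value rather than a full solution path; and all function names are hypothetical.

```python
import math
import random


def standardize(y, X):
    """Center y; center each column of X and divide it by its empirical
    variance (the exponent -1 in steps 1 and 4(b)). Assumes no constant column."""
    n, p = len(y), len(X[0])
    y_mean = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    variances = [sum((row[j] - means[j]) ** 2 for row in X) / n for j in range(p)]
    y_t = [yi - y_mean for yi in y]
    X_t = [[(row[j] - means[j]) / variances[j] for j in range(p)] for row in X]
    return y_t, X_t


def soft_threshold(z, t):
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(y, X, lam, n_iter=100):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1 (assumed objective)."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of column j with the partial residual (column j removed)
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def subsample_roots(y, X, lam_n, b, B, seed=0):
    """Return beta_n and the lists of L_{n,b,m} and L_{n,r,m}, m = 1..B."""
    rng = random.Random(seed)
    n = len(y)
    beta_n = lasso_cd(*standardize(y, X), lam_n)        # steps 1 and 3
    lam_b = lam_n * math.sqrt(b / n)                    # rescaled penalty, step 4(c)
    r = b / (1.0 - b / n)                               # corrected subsample size
    roots_b, roots_r = [], []
    for _ in range(B):
        idx = rng.sample(range(n), b)                   # step 4(a): without replacement
        y_m, X_m = standardize([y[i] for i in idx], [X[i] for i in idx])  # step 4(b)
        beta_m = lasso_cd(y_m, X_m, lam_b)
        diff = [bm - bn for bm, bn in zip(beta_m, beta_n)]
        roots_b.append([math.sqrt(b) * d for d in diff])
        roots_r.append([math.sqrt(r) * d for d in diff])
    return beta_n, roots_b, roots_r


# Toy data (assumption): only the first of three coefficients is nonzero.
data_rng = random.Random(1)
n, p = 60, 3
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] + data_rng.gauss(0, 1) for row in X]
beta_n, L_b, L_r = subsample_roots(y, X, lam_n=5.0, b=20, B=50)
```

With $n = 60$ and $b = 20$ as in this toy call, the rescaled penalty is $\lambda_{n,CV}\sqrt{1/3} \approx 0.577\,\lambda_{n,CV}$ and the corrected subsample size is $r = 20/(1 - 1/3) = 30$, so every $L_{n,r,m}$ equals $\sqrt{1.5}$ times the corresponding $L_{n,b,m}$.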
