Subsampling estimates of the Lasso distribution.
Abstract
We investigate the possibilities offered by subsampling to estimate the distribution of the Lasso
estimator and to construct confidence intervals and hypothesis tests. Despite being inferior to
the bootstrap in terms of higher-order accuracy in situations where the latter is consistent,
subsampling has the advantage of working under very weak assumptions. Thus, building
upon Knight and Fu (2000), we first study the asymptotics of the Lasso estimator in a
low dimensional setting and prove that, under an orthogonal design assumption, the finite
sample component distributions converge to a limit in a mode allowing for consistency of
subsampling confidence intervals. We give hints that this result holds in greater generality.
In a high dimensional setting, we study the adaptive Lasso under the assumption of partial
orthogonality introduced by Huang, Ma, and Zhang (2008) and use the partial oracle result
in distribution to argue that subsampling should provide valid confidence intervals for
nonzero parameters. Simulation studies confirm the validity of subsampling for constructing
confidence intervals, testing null hypotheses, and controlling the FWER through subsampled
p-values in a low dimensional setting. In the high dimensional setting, confidence intervals
for nonzero coefficients are slightly anticonservative and false positive rates are shown to
be conservative.
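
To make the idea of a subsampling confidence interval for one Lasso coefficient concrete, here is a minimal Python sketch. It is not the procedure used in this work, whose details the abstract does not give. It follows the generic subsampling recipe: refit the estimator on many subsamples of size b drawn without replacement, then use the quantiles of the rescaled differences as critical values. Several things are assumptions made only for illustration: the function names `lasso_cd` and `subsampling_ci`, the objective scaling 0.5*||y - Xb||^2 + lam*||b||_1, the penalty lam = lam0*sqrt(sample size), the root-n rate, and the default subsample size b = n^0.7.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    p = X.shape[1]
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)      # squared column norms
    resid = y - X @ beta               # current residual
    for _ in range(n_iter):
        for j in range(p):
            # inner product of column j with the partial residual
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)   # keep residual in sync
            beta[j] = new
    return beta


def subsampling_ci(X, y, j, lam0=1.0, b=None, n_sub=500, alpha=0.05, seed=0):
    """Subsampling interval for coefficient j (illustrative assumptions only).

    Assumed here: root-n rate, penalty lam0*sqrt(m) for sample size m,
    and subsample size b = n**0.7 when b is not supplied.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if b is None:
        b = int(n ** 0.7)
    beta_n = lasso_cd(X, y, lam0 * np.sqrt(n))[j]    # full-sample estimate
    roots = np.empty(n_sub)
    for i in range(n_sub):
        idx = rng.choice(n, size=b, replace=False)   # without replacement
        beta_b = lasso_cd(X[idx], y[idx], lam0 * np.sqrt(b))[j]
        roots[i] = np.sqrt(b) * (beta_b - beta_n)    # rescaled difference
    lo_q, hi_q = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    # invert the approximated distribution of sqrt(n)*(estimate - truth)
    return beta_n, beta_n - hi_q / np.sqrt(n), beta_n - lo_q / np.sqrt(n)
```

Calling `subsampling_ci(X, y, j=0)` returns the full-sample estimate together with the lower and upper interval endpoints. In practice the subsample size and the penalty level would need to be chosen with care. The values above are placeholders rather than recommendations.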