
1 Projects undertaken - Department of Statistics

The goal is to find β from knowledge of the received string Y, the dictionary X and the set B. The main difference between this and the traditional high-dimensional regression model, analyzed extensively in the statistical literature, is the added knowledge of the set B of choices of β. In particular, we arrange each β in B to be a sparse vector consisting of L non-zeroes with ‖β‖²₂ = P, where P represents the signal strength. Here ‖·‖₂ is the ℓ₂-norm. Details of our construction of the set B of choices of β are described in Figure 1(a).

Figure 1: (a) Schematic rendering of the dictionary matrix X (of size n × N) and the coefficient vector β. We assume N = LM and partition the X matrix into L sections, each of size M. The set B of choices for β is then assumed to consist of all vectors with one non-zero, of known value √P_l, in section l, for l = 1, ..., L. The quantities P_l are free to be chosen by the user, subject to the constraint P_1 + ... + P_L = P, where P represents the signal strength. The vertical bars in the X matrix indicate the selected columns from a section. (b) Plot demonstrating the progression of the algorithm (the curve g(x), shown here with P/σ² = 7, R = 0.495C and 16 steps). The dots indicate the proportion of correct detections after a particular number of steps. For example, the y-axis coordinate of the first dot (from the left) represents the proportion of correct detections after the first step.

The above can also be viewed as a problem of multiple hypothesis testing.
Indeed, given Y, one has to select the correct β (namely β*) among the elements in B. There are information-theoretic limits, which give lower bounds on the sample size n, as a function of the sparsity L, the dimension N and the signal-to-noise ratio P/σ², for which it is possible to reliably distinguish between these various hypotheses. In particular, assume

    n = L log(N/L) / R,    (1)

for some R > 0, where R is also known as the communication rate. Defining C = (1/2) log(1 + P/σ²), the goal is to have a scheme for the recovery of β* for all R < C. The demands of communication also require that the probability of failure be exponentially small in n. We remark that the possibility of recovery for R > C is ruled out by a converse theorem of Shannon (see Cover and Thomas [2008]). Analogous converses for signal recovery in the regression setup, for example by Wainwright [2009a] and Akçakaya and Tarokh [2010], are also relevant.
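The sectioned construction of Figure 1(a) can be sketched in a few lines. This is an illustrative toy only: the sizes, the equal power split P_l = P/L and the random seed are assumptions, not values from the projects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not values from the projects).
L, M = 4, 8            # L sections, each with M columns
N = L * M              # dictionary width, N = LM
n = 16                 # sample size
P = 1.0                # total signal strength, split evenly: P_l = P / L

# Dictionary with i.i.d. standard Gaussian entries.
X = rng.standard_normal((n, N))

# One member of the set B: a single non-zero of value sqrt(P_l) per section.
beta = np.zeros(N)
for l in range(L):
    beta[l * M + rng.integers(M)] = np.sqrt(P / L)

# Received string: Y = X beta + Gaussian noise.
sigma = 1.0
Y = X @ beta + sigma * rng.standard_normal(n)
```

By construction ‖β‖²₂ = P, and each section contributes exactly one column of X to the signal.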

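For concreteness, the capacity C and the sample size given by equation (1) are easy to evaluate numerically. The values of L and N below are assumptions chosen for illustration; P/σ² = 7 and R = 0.495C match the annotations of Figure 1(b).

```python
import math

snr = 7.0                                  # P / sigma^2, as in Figure 1(b)
C = 0.5 * math.log(1.0 + snr)              # capacity C = (1/2) log(1 + P/sigma^2)

L, N = 16, 16 * 512                        # sparsity and dimension (illustrative)
R = 0.495 * C                              # a rate strictly below capacity
n = L * math.log(N / L) / R                # sample size from equation (1)

print(C, R < C, n)
```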
In Barron and Joseph [2010b], we analyze the performance of the optimum maximum likelihood (minimum distance) estimator, which searches over all β ∈ B, choosing the β which minimizes ‖Y − Xβ‖₂. We demonstrate that for n as in (1), one can recover the support of β* with high probability, for any R < C.

When N is large, which is typically the case, the maximum likelihood decoder is computationally infeasible. Accordingly, in Barron and Joseph [2010a], we propose an iterative thresholding algorithm for the estimation of the support of the coefficient vector. The performance can be summarized as follows:

Theorem 1 (Barron and Joseph [2010a]). For any R < C and

    n = L log(N/L) / R,

it is possible to recover the support of β*, with error probability that is exponentially small in L and computational complexity of order nN.

Taking X to have i.i.d. standard Gaussian entries, a key feature of our analysis is that we are able to characterize the distribution of the statistics involved in successive iterates. As a result, we show that there is a function g : [0, 1] → [0, 1] which characterizes, with high probability, the proportion of correct detections after any step. This is shown in Figure 1(b).

Performance of the Orthogonal Matching Pursuit (OMP) for variable selection with random designs. The Orthogonal Matching Pursuit (Pati et al. [1993], Mallat and Zhang [1993]) is an iterative algorithm for signal recovery. Unlike penalized procedures, the regularization in this algorithm appears through the stopping criterion. In Joseph [2011], I analyze the orthogonal matching pursuit for the general problem of sparse recovery in high dimensions with random designs.
For random designs, since the performance is measured after averaging over the distribution of the design matrix, one can ensure support recovery with far less stringent constraints on sparsity compared to the case with deterministic X, as analyzed for the OMP by Cai and Wang [2010] and Zhang [2009a]. The stopping criterion I used, which was motivated by my work on the communication problem, is different from what is traditionally used in the literature. For correlated Gaussian designs and exactly sparse vectors, I show that the support recovery is similar to known results for the Lasso algorithm by Wainwright [2009b]. Moreover, variable selection under a more relaxed assumption on sparsity, whereby one has control only on the ℓ₁ norm of the smaller coefficients, is also addressed. In particular, I was able to demonstrate recovery of coefficients with minimum magnitude ≈ √(log p/n), where p represents the dimension and n the sample size, even under this more general notion of sparsity. As a consequence of these results, if β̂ is the estimate obtained after running the algorithm, it is shown that

    ‖β̂ − β‖²₂ ≤ C ∑_{j=1}^{p} min(β_j², σ² (2 log p)/n),

with high probability. Such oracle inequalities have been shown for the Lasso by Zhang [2009b] and for the Dantzig selector by Candes and Tao [2007].
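A minimal sketch of orthogonal matching pursuit with a correlation-based stopping rule follows. The threshold used below is a generic, deliberately conservative Gaussian-noise level chosen for illustration; it is an assumption, not the specific criterion developed in Joseph [2011].

```python
import numpy as np

def omp(X, Y, tau):
    """Orthogonal Matching Pursuit: greedily add the column most correlated
    with the residual, refit by least squares, and stop once no absolute
    correlation exceeds the threshold tau."""
    n, p = X.shape
    support = []
    beta_hat = np.zeros(p)
    resid = Y.copy()
    while len(support) < min(n, p):
        corr = X.T @ resid
        j = int(np.argmax(np.abs(corr)))
        if abs(corr[j]) <= tau:            # stopping criterion
            break
        support.append(j)
        # Orthogonal step: refit least squares on the selected columns.
        coef, *_ = np.linalg.lstsq(X[:, support], Y, rcond=None)
        beta_hat = np.zeros(p)
        beta_hat[support] = coef
        resid = Y - X @ beta_hat
    return beta_hat, support

# Toy demonstration on an easy (high signal-to-noise) random design.
rng = np.random.default_rng(2)
n, p, sigma = 100, 30, 0.05
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[3, 11, 27]] = 1.0
Y = X @ beta + sigma * rng.standard_normal(n)
tau = sigma * np.sqrt(6.0 * n * np.log(p))  # assumed conservative noise-level threshold
beta_hat, support = omp(X, Y, tau)
print(sorted(support))
```

Because the residual is orthogonal to the already selected columns after each refit, no column is ever selected twice.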

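Returning to the sectioned model of Figure 1, the stepwise progression plotted in panel (b) comes from an iterative thresholding decoder. The sketch below captures only the general shape of such a scheme; the statistic, the fixed threshold and the equal power allocation are simplified stand-ins assumed for illustration, not the exact quantities analyzed in Barron and Joseph [2010a].

```python
import numpy as np

def iterative_decode(X, Y, L, M, coef, steps=4, tau=2.0):
    """Per step: correlate every column with the current residual and, within
    each still-undecoded section, accept the best column if its normalized
    correlation clears the threshold tau. Equal power allocation is assumed,
    so the known non-zero magnitude is a single value coef = sqrt(P/L)."""
    n, N = X.shape
    fit = np.zeros(N)
    decoded = {}                                     # section -> accepted column
    for _ in range(steps):
        resid = Y - X @ fit
        stats = X.T @ resid / np.linalg.norm(resid)  # normalized inner products
        for sec in range(L):
            if sec in decoded:
                continue
            block = stats[sec * M:(sec + 1) * M]
            j = sec * M + int(np.argmax(block))
            if stats[j] > tau:                       # thresholding step
                decoded[sec] = j
                fit[j] = coef                        # known magnitude sqrt(P_l)
    return decoded

# Toy run on an easy (high signal-to-noise) instance.
rng = np.random.default_rng(1)
L, M, n = 4, 8, 400
coef, sigma = 1.0, 0.1
X = rng.standard_normal((n, L * M))
support = [l * M + int(rng.integers(M)) for l in range(L)]
beta = np.zeros(L * M)
beta[support] = coef
Y = X @ beta + sigma * rng.standard_normal(n)
decoded = iterative_decode(X, Y, L, M, coef)
print(sorted(decoded.values()) == sorted(support))
```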