
finds the vector $x \in \mathbb{R}^N$ such that the error $y - Ax$ has minimum $\ell_1$ norm (i.e., we are asking that the difference between $Ax$ and $y$ be sparse). This program arises in the context of channel coding [8].

Suppose we have a channel code that produces a codeword $c = Ax$ for a message $x$. The message travels over the channel, and has an unknown number of its entries corrupted. The decoder observes $y = c + e$, where $e$ is the corruption. If $e$ is sparse enough, then the decoder can use $(P_A)$ to recover $x$ exactly. Again, $(P_A)$ can be recast as an LP.
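As a rough illustration of this decoding step (not part of the original text), the sketch below poses $(P_A)$ with a generic convex solver. It is written in Python using the cvxpy package; the matrix sizes, corruption level, and random data are invented for the example, and the l1-magic routines themselves are not used here.

\begin{verbatim}
# Hypothetical sketch of min-l1 decoding (P_A): recover x from y = Ax + e
# when e corrupts only a few entries. Uses cvxpy as a generic convex solver;
# all dimensions and the sparsity level are illustrative.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
N, M = 128, 256                      # message length, codeword length (M > N)
A = rng.standard_normal((M, N))      # random coding matrix
x_true = rng.standard_normal(N)      # message
e = np.zeros(M)
e[rng.choice(M, size=10, replace=False)] = rng.standard_normal(10)  # sparse corruption
y = A @ x_true + e                   # received, corrupted codeword

x = cp.Variable(N)
prob = cp.Problem(cp.Minimize(cp.norm(y - A @ x, 1)))   # (P_A)
prob.solve()
print(np.max(np.abs(x.value - x_true)))  # should be tiny when e is sparse enough
\end{verbatim}

Because only 10 of the 256 received entries are corrupted in this toy setup, the minimum-$\ell_1$ residual fit should return the message essentially exactly.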

• Min-$\ell_1$ with quadratic constraints. This program finds the vector with minimum $\ell_1$ norm that comes close to explaining the observations:
\[
(P_2) \quad \min \|x\|_1 \quad \text{subject to} \quad \|Ax - b\|_2 \leq \epsilon,
\]
where $\epsilon$ is a user-specified parameter. As shown in [5], if a sufficiently sparse $x_0$ exists such that $b = Ax_0 + e$ for some small error term with $\|e\|_2 \leq \epsilon$, then the solution $x^\star_2$ to $(P_2)$ will be close to $x_0$. That is, $\|x^\star_2 - x_0\|_2 \leq C \cdot \epsilon$, where $C$ is a small constant. $(P_2)$ can be recast as a SOCP.
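The following sketch (again illustrative only, with invented dimensions, sparsity, and noise level) states $(P_2)$ directly in Python with cvxpy and leaves the SOCP recasting to the solver.

\begin{verbatim}
# Hypothetical sketch of (P_2): min ||x||_1 subject to ||Ax - b||_2 <= eps.
# A, b, and eps are made-up illustrative quantities.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
M, N = 80, 256
A = rng.standard_normal((M, N)) / np.sqrt(M)              # measurement matrix
x0 = np.zeros(N)
x0[rng.choice(N, size=8, replace=False)] = 1.0            # sparse signal
sigma = 0.01
b = A @ x0 + sigma * rng.standard_normal(M)               # noisy observations
eps = sigma * np.sqrt(M)                                  # crude bound on ||e||_2

x = cp.Variable(N)
prob = cp.Problem(cp.Minimize(cp.norm(x, 1)),
                  [cp.norm(A @ x - b, 2) <= eps])         # (P_2)
prob.solve()
print(np.linalg.norm(x.value - x0))   # expected to be on the order of eps
\end{verbatim}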

• Min-$\ell_1$ with bounded residual correlation. Also referred to as the Dantzig Selector, the program
\[
(P_D) \quad \min \|x\|_1 \quad \text{subject to} \quad \|A^*(Ax - b)\|_\infty \leq \gamma,
\]
where $\gamma$ is a user-specified parameter, relaxes the equality constraints of $(P_1)$ in a different way. $(P_D)$ requires that the residual $Ax - b$ of a candidate vector $x$ not be too correlated with any of the columns of $A$ (the product $A^*(Ax - b)$ measures each of these correlations). If $b = Ax_0 + e$, where $e_i \sim \mathcal{N}(0, \sigma^2)$, then the solution $x^\star_D$ to $(P_D)$ has near-optimal minimax risk:
\[
E\,\|x^\star_D - x_0\|_2^2 \leq C (\log N) \cdot \sum_i \min\bigl(x_0(i)^2, \sigma^2\bigr)
\]
(see [7] for details). For real-valued $x$, $A$, $b$, $(P_D)$ can again be recast as an LP; in the complex case, there is an equivalent SOCP.
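A similar sketch for the real-valued Dantzig selector is given below; the particular choice of $\gamma$ is a hypothetical rule of thumb tied to the noise level, not a prescription from the text.

\begin{verbatim}
# Hypothetical sketch of the Dantzig selector (P_D), real-valued case:
# min ||x||_1 subject to ||A^T (Ax - b)||_inf <= gamma.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
M, N = 80, 256
A = rng.standard_normal((M, N)) / np.sqrt(M)
x0 = np.zeros(N)
x0[rng.choice(N, size=8, replace=False)] = 1.0            # sparse signal
sigma = 0.01
b = A @ x0 + sigma * rng.standard_normal(M)               # b = A x0 + e, e_i ~ N(0, sigma^2)
gamma = 3 * sigma * np.sqrt(np.log(N))                    # illustrative threshold choice

x = cp.Variable(N)
prob = cp.Problem(cp.Minimize(cp.norm(x, 1)),
                  [cp.norm(A.T @ (A @ x - b), 'inf') <= gamma])   # (P_D)
prob.solve()
print(np.linalg.norm(x.value - x0))
\end{verbatim}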

It is also true that when $x$, $A$, $b$ are complex, the programs $(P_1)$, $(P_A)$, $(P_D)$ can be written as SOCPs, but we will not pursue this here.

If the underlying signal is a 2D image, an alternate recovery model is that the gradient is sparse [18]. Let $x_{ij}$ denote the pixel in the $i$th row and $j$th column of an $n \times n$ image $x$, and define the operators

\[
D_{h;ij}x = \begin{cases} x_{i+1,j} - x_{ij} & i < n \\ 0 & i = n \end{cases}
\qquad
D_{v;ij}x = \begin{cases} x_{i,j+1} - x_{ij} & j < n \\ 0 & j = n, \end{cases}
\]
and
\[
D_{ij}x = \begin{pmatrix} D_{h;ij}x \\ D_{v;ij}x \end{pmatrix}. \tag{1}
\]
The 2-vector $D_{ij}x$ can be interpreted as a kind of discrete gradient of the digital image $x$. The total variation of $x$ is simply the sum of the magnitudes of this discrete gradient at every point:
\[
\mathrm{TV}(x) := \sum_{ij} \sqrt{(D_{h;ij}x)^2 + (D_{v;ij}x)^2} \;=\; \sum_{ij} \|D_{ij}x\|_2.
\]
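As a small, self-contained illustration (not from the original text), the discrete gradient in (1) and $\mathrm{TV}(x)$ can be computed with NumPy as follows; the helper names are hypothetical.

\begin{verbatim}
# Hypothetical sketch: discrete gradient D_ij x and total variation TV(x)
# for an n x n image, following the definitions in (1).
import numpy as np

def discrete_gradient(x):
    """Return (Dh, Dv): forward differences, zero on the last row / column."""
    Dh = np.zeros_like(x)
    Dv = np.zeros_like(x)
    Dh[:-1, :] = x[1:, :] - x[:-1, :]   # x_{i+1,j} - x_{ij}; 0 when i = n
    Dv[:, :-1] = x[:, 1:] - x[:, :-1]   # x_{i,j+1} - x_{ij}; 0 when j = n
    return Dh, Dv

def total_variation(x):
    """TV(x) = sum_ij sqrt((Dh;ij x)^2 + (Dv;ij x)^2)."""
    Dh, Dv = discrete_gradient(x)
    return np.sum(np.sqrt(Dh**2 + Dv**2))

# Example: a piecewise-constant image has TV proportional to its edge length.
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
print(total_variation(img))
\end{verbatim}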

With these definitions, we have three programs for image recovery, each of which can be recast as a SOCP:

