Foundations of Data Science

Figure 10.4: Illustration of a subgradient for ‖x‖₁ at x = 0

gives −1 ≥ v₁, and choosing y₁ = −ε gives −1 ≤ v₁. So v₁ = −1. Similar reasoning gives the second condition. For the third condition, choose i in I₃, set yᵢ = ±ε, and argue similarly.
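Taken together, these conditions are the standard characterization of subgradients of the 1-norm: ∇ᵢ = sign(xᵢ) on the nonzero components of x and |∇ᵢ| ≤ 1 on the zero components. The short sketch below (an illustrative helper of my own, not from the text; the name is_subgradient_l1 and the tolerance handling are assumptions) checks these conditions numerically.

```python
import numpy as np

def is_subgradient_l1(x, g, tol=1e-9):
    """Check the subgradient conditions for the 1-norm at x:
    g_i = sign(x_i) on nonzero components, |g_i| <= 1 on zero components."""
    x, g = np.asarray(x, dtype=float), np.asarray(g, dtype=float)
    nonzero = np.abs(x) > tol
    on_support = np.allclose(g[nonzero], np.sign(x[nonzero]), atol=tol)
    off_support = np.all(np.abs(g[~nonzero]) <= 1 + tol)
    return on_support and off_support

# Example: at x = (2, -1, 0), any g = (1, -1, t) with |t| <= 1 is a subgradient.
print(is_subgradient_l1([2.0, -1.0, 0.0], [1.0, -1.0, 0.3]))   # True
print(is_subgradient_l1([2.0, -1.0, 0.0], [1.0, -1.0, 1.5]))   # False
```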

To characterize the value of x that minimizes ‖x‖₁ subject to Ax = b, note that at the minimum x₀ there can be no downhill direction consistent with the constraint Ax = b. Thus, if the direction ∆x at x₀ is consistent with the constraint Ax = b, that is, A∆x = 0 so that A(x₀ + ∆x) = b, any subgradient ∇ for ‖x‖₁ at x₀ must satisfy ∇ᵀ∆x = 0. A sufficient but not necessary condition for x₀ to be a minimum is that there exists some w such that the subgradient at x₀ is given by ∇ = Aᵀw. Then for any ∆x such that A∆x = 0, we have ∇ᵀ∆x = wᵀA∆x = wᵀ·0 = 0. That is, in any direction consistent with the constraint Ax = b the subgradient has zero component, and hence x₀ is a minimum.
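On a concrete instance, one natural way to look for such a w (a heuristic sketch of my own, not a construction from the text) is to solve for w so that (Aᵀw)ᵢ = sign(x₀ᵢ) on the support of x₀, using least squares on the support columns, and then check whether Aᵀw is a valid subgradient at x₀. Failure of this particular w does not rule out that some other certificate exists.

```python
import numpy as np

def find_certificate(A, x0, tol=1e-8):
    """Try to find w with A^T w equal to a subgradient of the 1-norm at x0:
    (A^T w)_i = sign(x0_i) on the support of x0 and |(A^T w)_i| <= 1 elsewhere.
    Uses least squares on the support columns; a heuristic, not the only way
    such a w could arise."""
    A, x0 = np.asarray(A, dtype=float), np.asarray(x0, dtype=float)
    support = np.abs(x0) > tol
    # Solve (A_S)^T w = sign(x0_S) for w, where A_S are the support columns.
    w, *_ = np.linalg.lstsq(A[:, support].T, np.sign(x0[support]), rcond=None)
    g = A.T @ w
    ok = (np.allclose(g[support], np.sign(x0[support]), atol=1e-6)
          and np.all(np.abs(g[~support]) <= 1 + 1e-6))
    return (w, g) if ok else None
```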

10.3.2 The Exact Reconstruction Property<br />

Theorem 10.3 below gives a condition that guarantees that a solution x₀ to Ax = b is the unique minimum 1-norm solution. This is a sufficient condition, but not a necessary one.

Theorem 10.3 Suppose x₀ satisfies Ax₀ = b. If there is a subgradient ∇ to the 1-norm function at x₀ for which there exists a w where ∇ = Aᵀw, and the columns of A corresponding to nonzero components of x₀ are linearly independent, then x₀ minimizes ‖x‖₁ subject to Ax = b. Furthermore, these conditions imply that x₀ is the unique minimum.

Proof: We first show that x₀ minimizes ‖x‖₁. Suppose y is another solution to Ax = b. We need to show that ‖y‖₁ ≥ ‖x₀‖₁. Let z = y − x₀. Then Az = Ay − Ax₀ = 0. Hence, ∇ᵀz = (Aᵀw)ᵀz = wᵀAz = 0. Now, since ∇ is a subgradient of the 1-norm function at x₀,

‖y‖₁ = ‖x₀ + z‖₁ ≥ ‖x₀‖₁ + ∇ᵀz = ‖x₀‖₁,

and so x₀ minimizes ‖x‖₁ over all solutions to Ax = b.
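On small instances the conclusion of Theorem 10.3 can also be checked directly by solving min ‖x‖₁ subject to Ax = b as a linear program and comparing the result with x₀. The sketch below is my own illustration (it assumes NumPy and SciPy are available) and uses the standard reformulation with auxiliary variables t: minimize Σᵢ tᵢ subject to Ax = b and −t ≤ x ≤ t.

```python
import numpy as np
from scipy.optimize import linprog

def min_l1_solution(A, b):
    """Solve min ||x||_1 subject to Ax = b as a linear program:
    variables (x, t), minimize sum(t) with Ax = b, x <= t, -x <= t."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])
    A_eq = np.hstack([A, np.zeros((m, n))])
    I = np.eye(n)
    A_ub = np.vstack([np.hstack([I, -I]),     #  x - t <= 0
                      np.hstack([-I, -I])])   # -x - t <= 0
    b_ub = np.zeros(2 * n)
    bounds = [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b, bounds=bounds)
    return res.x[:n]

# A small example: a 2-sparse x0 measured by a random 6 x 12 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 12))
x0 = np.zeros(12)
x0[[2, 7]] = [1.5, -2.0]
x_hat = min_l1_solution(A, A @ x0)
# Prints True exactly when 1-norm minimization recovers x0 on this instance.
print(np.allclose(x_hat, x0, atol=1e-6))
```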
