Boyd Convex Optimization book - SFU Wiki
6 Approximation and fitting

point in the affine set with minimum distance to 0, i.e., it determines the projection of the point 0 on the affine set {x | Ax = b}.
Least-squares solution of linear equations
The most common least-norm problem involves the Euclidean or ℓ2-norm. By squaring the objective we obtain the equivalent problem

    minimize    ‖x‖₂²
    subject to  Ax = b,
the unique solution of which is called the least-squares solution of the equations Ax = b. Like the least-squares approximation problem, this problem can be solved analytically. Introducing the dual variable ν ∈ Rᵐ, the optimality conditions are

    2x⋆ + Aᵀν⋆ = 0,    Ax⋆ = b,

which is a pair of linear equations, and readily solved. From the first equation we obtain x⋆ = −(1/2)Aᵀν⋆; substituting this into the second equation we obtain −(1/2)AAᵀν⋆ = b, and conclude

    ν⋆ = −2(AAᵀ)⁻¹b,    x⋆ = Aᵀ(AAᵀ)⁻¹b.

(Since rank A = m < n, the matrix AAᵀ is invertible.)
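As a quick numerical check of the closed-form solution x⋆ = Aᵀ(AAᵀ)⁻¹b, the sketch below (using NumPy; the matrix A and vector b are arbitrary illustrative data, not from the text) compares it against the Moore–Penrose pseudoinverse, which also yields the minimum-norm solution of an underdetermined system:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6                      # fat matrix: fewer equations than unknowns
A = rng.standard_normal((m, n))  # full row rank m with probability one
b = rng.standard_normal(m)

# Closed-form least-norm solution: x* = A^T (A A^T)^{-1} b
x_star = A.T @ np.linalg.solve(A @ A.T, b)

# Cross-check with the pseudoinverse, which gives the same
# minimum-norm solution of Ax = b.
x_pinv = np.linalg.pinv(A) @ b

print(np.allclose(A @ x_star, b))   # feasibility of x*
print(np.allclose(x_star, x_pinv))  # both methods agree
```

Solving with `AAᵀ` directly (an m × m system) is cheaper than forming the full n × m pseudoinverse when m ≪ n.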
Least-penalty problems
A useful variation on the least-norm problem (6.5) is the least-penalty problem

    minimize    φ(x₁) + · · · + φ(xₙ)
    subject to  Ax = b,                                  (6.6)

where φ : R → R is convex, nonnegative, and satisfies φ(0) = 0. The penalty function value φ(u) quantifies our dislike of a component of x having value u; the least-penalty problem then finds x that has least total penalty, subject to the constraint Ax = b.
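As an illustration, a least-penalty problem can be handed to a generic constrained solver. The sketch below (assuming SciPy is available; the penalty choice, A, and b are made-up examples, not from the text) uses a Huber-style penalty φ(u) = u² for |u| ≤ 1 and 2|u| − 1 otherwise, which is convex, nonnegative, and satisfies φ(0) = 0:

```python
import numpy as np
from scipy.optimize import minimize

def phi(u):
    # Huber-style penalty: quadratic near zero, linear in the tails
    return np.where(np.abs(u) <= 1, u**2, 2 * np.abs(u) - 1)

rng = np.random.default_rng(1)
m, n = 3, 6
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# minimize sum_i phi(x_i)  subject to  Ax = b
res = minimize(
    lambda x: phi(x).sum(),
    x0=np.zeros(n),
    constraints=[{"type": "eq", "fun": lambda x: A @ x - b}],
    method="SLSQP",
)
x = res.x
print(np.allclose(A @ x, b, atol=1e-4))  # feasibility at the solution
```

A special-purpose convex solver would be preferable at scale, but SLSQP suffices here because the Huber-style penalty is differentiable.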
All of the discussion and interpretation of penalty functions in penalty function approximation can be transposed to the least-penalty problem, by substituting the amplitude distribution of x (in the least-penalty problem) for the amplitude distribution of the residual r (in the penalty approximation problem).
Sparse solutions via least ℓ1-norm
Recall from the discussion on page 300 that ℓ1-norm approximation gives relatively large weight to small residuals, and therefore results in many optimal residuals being small, or even zero. A similar effect occurs in the least-norm context. The least ℓ1-norm problem,

    minimize    ‖x‖₁
    subject to  Ax = b,

tends to produce a solution x with a large number of components equal to zero. In other words, the least ℓ1-norm problem tends to produce sparse solutions of Ax = b, often with m nonzero components.
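The least ℓ1-norm problem can be cast as a linear program by splitting x into nonnegative parts, x = x⁺ − x⁻ with x⁺, x⁻ ⪰ 0, and minimizing 1ᵀx⁺ + 1ᵀx⁻ subject to A(x⁺ − x⁻) = b. A sketch using SciPy's `linprog` (A and b are arbitrary illustrative data):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 4, 12
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# minimize 1^T x+ + 1^T x-  subject to  A x+ - A x- = b,  x+, x- >= 0
res = linprog(
    c=np.ones(2 * n),
    A_eq=np.hstack([A, -A]),
    b_eq=b,
    bounds=[(0, None)] * (2 * n),
)
x = res.x[:n] - res.x[n:]

print(np.allclose(A @ x, b))             # feasibility
print(int(np.sum(np.abs(x) > 1e-8)))     # nonzero count, at most m here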