
Sparsity and convexity

FIGURE 42.1: Estimation picture for the lasso (left) and ridge regression (right). Shown are contours of the error and constraint functions. The solid areas are the constraint regions |β_1| + |β_2| ≤ t and β_1^2 + β_2^2 ≤ t^2, respectively, while the ellipses are the contours of the least squares error function. The sharp corners of the constraint region for the lasso yield sparse solutions. In high dimensions, sparsity arises from corners and edges of the constraint region.

42.2 Sparsity, convexity and ℓ1 penalties

One of the earliest proposals for using ℓ1 (absolute-value) penalties was the lasso method for penalized regression. Given a linear regression with predictors x_{ij} and response values y_i for i ∈ {1, ..., N} and j ∈ {1, ..., p}, the lasso solves the ℓ1-penalized regression

\[
\operatorname*{minimize}_{\beta}\;\left\{\,\frac{1}{2}\sum_{i=1}^{N}\Bigl(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2} + \lambda\sum_{j=1}^{p}\lvert\beta_j\rvert\,\right\}.
\]

This is equivalent to minimizing the sum of squares subject to the constraint |β_1| + ··· + |β_p| ≤ s. It is similar to ridge regression, which has the constraint β_1^2 + ··· + β_p^2 ≤ s. Because of the form of the ℓ1 penalty, the lasso does both variable selection and shrinkage, whereas ridge regression only shrinks. If we consider a more general penalty of the form (|β_1|^q + ··· + |β_p|^q)^{1/q}, then the lasso uses q = 1 and ridge regression uses q = 2. Subset selection emerges as q → 0, and the lasso corresponds to the smallest value of q (i.e., closest to subset selection) that yields a convex problem. Figure 42.1 gives a geometric view of the lasso and ridge regression.

The lasso and ℓ1 penalization have been the focus of a great deal of recent work. Table 42.1, adapted from Tibshirani (2011), gives a sample of this work.
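The ℓ1-penalized objective above can be minimized by proximal gradient descent (ISTA), in which each gradient step is followed by soft-thresholding — the operation that produces the exact zeros responsible for the lasso's sparsity. This is a minimal sketch under assumed conventions (the function names, step size 1/L, and iteration count are illustrative), not the algorithm the chapter itself develops:

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1: shrinks each entry toward zero
    # and sets entries with |z_j| <= tau exactly to zero.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lasso_ista(X, y, lam, n_iter=5000):
    """Minimize (1/2) * ||y - X beta||^2 + lam * ||beta||_1 by ISTA."""
    n, p = X.shape
    # Lipschitz constant of the gradient: squared spectral norm of X.
    L = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)          # gradient of the squared-error term
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta

# Illustrative data: only the first coefficient is truly nonzero.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([2.0, 0.0, 0.0, 0.0, 0.0])
beta_hat = lasso_ista(X, y, lam=5.0)
```

On data like this, the lasso estimate keeps a large first coefficient and drives the irrelevant ones to (or very near) zero, whereas the ridge solution, which replaces soft-thresholding with the closed form (XᵀX + λI)⁻¹Xᵀy, shrinks every coefficient but zeros out none.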
