02.02.2015 Views

Subgradient slides

Subgradient slides

Subgradient slides

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Review: subgradient<br />

descent<br />

• Initialize x1<br />

• for t = 1 to “I’m tired”:<br />

‣ ft = estimate of f<br />

Problem:<br />

minx f(x)<br />

(optionally,<br />

s.t. x ∈ F)<br />

‣ from limited # of terms (or just use f)<br />

‣ gt = any element of ∂ft(xt)<br />

‣ xt+0.5 = xt – ηtgt<br />

‣ xt+1 = arg minx∈F ||x–xt+0.5||<br />

‣ which is just xt+1 = xt+0.5 if F = R n


<strong>Subgradient</strong> in action<br />

!<br />

"<br />

#<br />

$<br />

!#<br />

!"<br />

!!<br />

!! !" !# $ # " !


Convergence summary<br />

• For strictly convex f(x) (i.e., λ>0):<br />

‣ set ƞt = 1/λt<br />

~<br />

‣ f(xt) – f(x*) = O(1/t)<br />

• For non-strictly convex f(x) (i.e., λ=0):<br />

‣ set ƞt = 1/√t<br />

‣ f(xt) – f(x*) = O(1/√t)<br />

• To get accuracy ϵ:<br />

~<br />

‣ λ>0: T = O(1/ϵ)<br />

‣ λ=0: T = O(1/ϵ 2 )<br />

Interior point: T =<br />

O(ln(1/ϵ)), but each<br />

iteration much slower


Convergence intuition<br />

• Q(x) = ||x – x * || 2 /2<br />

• Proof works by guaranteeing that Q(x) decreases<br />

‣ subtlety: only if f(xt) ≫ f(x * )<br />

‣ has to be like this: e.g., multiple minimizers of f<br />

• We showed (for λ=0):<br />

‣ f(x * ) ≥ f(xt) + Q(xt+1)/ƞt – Q(xt)/ƞt + ƞt||gt|| 2 /2<br />

• Suppose f(xt) ≥ f(x * ) + ϵ:


Typical SVM bound<br />

• Given n training examples, for any δ


Example: SVM<br />

• Suppose no b<br />

‣ L = ||w|| 2 /2 + (C/m) ∑ h(yixi T w))<br />

‣ ∂L = w + (C/m) ∑ yi xi ∂h(yixi T w))<br />

• If ||xi|| ≤ X:


What if we want b<br />

• Problem: λ=0<br />

‣<br />

• Solutions:<br />

‣ ignore the problem:<br />

‣ penalize b too:<br />

‣ change the algorithm slightly:

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!