10.07.2015 Views

The error rate of learning halfspaces using kernel-SVM



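Now, denote $\delta = \int_{\{x : \langle x,e\rangle = -\gamma\}} g$. It holds that

\[
\begin{aligned}
E_{\mu_e^1}\, l(y g(x)) &= \theta \int_{\{x : \langle x,e\rangle = \gamma\}} l(g(x)) + (1-\theta) \int_{\{x : \langle x,e\rangle = -\gamma\}} l(-g(x)) \\
&\ge \theta \cdot l\left( \int_{\{x : \langle x,e\rangle = \gamma\}} g \right) + (1-\theta) \cdot l\left( -\int_{\{x : \langle x,e\rangle = -\gamma\}} g \right) \\
&\ge \theta \cdot l(\delta) + (1-\theta) \cdot l(-\delta) - L\epsilon
\end{aligned}
\]

Thus,

\[
\mathrm{Err}_{D,l}(g) \ge (1-\lambda_2-\lambda_3)\left(\theta \cdot l(\delta) + (1-\theta) \cdot l(-\delta)\right) - L\epsilon + \lambda_2\, E_{\mu_e^2}\, l(y g(x))
\]

However, by considering the constant solution $\delta$, it follows that

\[
\begin{aligned}
\mathrm{Err}_{D,l}(g) &\le (1-\lambda_2-\lambda_3)\left(\theta \cdot l(\delta) + (1-\theta) \cdot l(-\delta)\right) + \lambda_2 \cdot l(\delta) + \lambda_3 \cdot \frac{1}{2}\left(l(\delta) + l(-\delta)\right) + \sqrt{\gamma} \\
&\le (1-\lambda_2-\lambda_3)\left(\theta \cdot l(\delta) + (1-\theta) \cdot l(-\delta)\right) + \lambda_2 \cdot l(\delta) + \lambda_3 \cdot l(-|\delta|) + \sqrt{\gamma}
\end{aligned}
\]

Thus,

\[
\mathrm{Err}_{\mu_e^2,l}(g) \le \frac{L\epsilon}{\lambda_2} + l(\delta) + \frac{\lambda_3}{\lambda_2}\, l(-|\delta|) + \frac{\sqrt{\gamma}}{\lambda_2} \tag{15}
\]

\[
= L \cdot \frac{l(0)\, 128\, \gamma K^{3.5}}{|\partial_+ l(0)|\, \lambda_2 \lambda_3} + \frac{10 \cdot c \cdot L \cdot K^{3.5}}{\lambda_2} \cdot E \cdot m^3 \cdot (r^K + s^d) + l(\delta) + \frac{\lambda_3}{\lambda_2}\, l(-|\delta|) + \frac{\sqrt{\gamma}}{\lambda_2} \tag{16}
\]

Now, relying on the assumption that $\gamma \cdot \log^8(m) = o(1)$, it is possible to choose $\lambda_2 = \Theta(\sqrt{\gamma} K^4) = \Theta(\sqrt{\gamma} \log^4(m))$, $\lambda_3 = \sqrt{\gamma}$, $K = \Theta(\log(m/\gamma))$, and $d = \Theta(\log(m/\gamma))$ such that the bound in Equation (13), $\frac{L \cdot l(0)\, 128\, \gamma K^{3.5}}{|\partial_+ l(0)|\, \lambda_2 \lambda_3} + \frac{10 \cdot c \cdot K^{3.5}}{\lambda_2} \cdot E \cdot m^3 \cdot (r^K + s^d)$, $\lambda_2$, $\lambda_3$ and $\frac{\lambda_3}{\lambda_2}$ are all $o(1)$.

Since the bound in Equation (13) is $o(1)$, it follows, as in the proof of Theorem 2.6, that

\[
l(\delta) \le l\left(\frac{\alpha}{2}\right)
\]

and consequently, w.p. $> \frac{l(0) - l(\alpha/2)}{l(0)} - o(1)$, $l(g(x)) < l(0) \Rightarrow g(x) > 0$. Since the marginal distributions of $\mu_e^1$ and $\mu_e^2$ are the same, it follows that, if $(x, y)$ are chosen according to $D$, then w.p. $> \left(\frac{l(0) - l(\alpha/2)}{l(0)} - o(1)\right) \cdot (1-\lambda_2-\lambda_3) \cdot (1-\theta) = \Omega(1)$, $y g(x) < 0$. Thus, $\mathrm{Err}_{D,0-1}(g) = \Omega(1)$ while $\mathrm{Err}_\gamma(D) \le \lambda_2 + \lambda_3 = O(\sqrt{\gamma}\, \mathrm{poly}(\log(m)))$. $\Box$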
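The step from the lower and upper bounds on the error of g under D to Equation (15) is stated without intermediate algebra. A minimal sketch of that step follows, assuming the two bounds read as reconstructed from the page. The shorthand $A$ is introduced only for this illustration and does not appear in the original; the last two lines only use the stated choices of $\lambda_2$ and $\lambda_3$ and the assumption $\gamma \cdot \log^8(m) = o(1)$.

```latex
% A is a hypothetical shorthand for the term common to both bounds.
\[
A := (1-\lambda_2-\lambda_3)\bigl(\theta \cdot l(\delta) + (1-\theta)\cdot l(-\delta)\bigr)
\]
% Chain the lower bound and the upper bound on Err_{D,l}(g):
\[
A - L\epsilon + \lambda_2\, \mathrm{Err}_{\mu_e^2,l}(g)
\;\le\; \mathrm{Err}_{D,l}(g)
\;\le\; A + \lambda_2 \cdot l(\delta) + \lambda_3 \cdot l(-|\delta|) + \sqrt{\gamma}
\]
% Cancel A, move L*epsilon to the right, and divide by lambda_2 > 0:
\[
\mathrm{Err}_{\mu_e^2,l}(g)
\;\le\; \frac{L\epsilon}{\lambda_2} + l(\delta)
      + \frac{\lambda_3}{\lambda_2}\, l(-|\delta|) + \frac{\sqrt{\gamma}}{\lambda_2}
\]
% With lambda_2 = Theta(sqrt(gamma) K^4) and lambda_3 = sqrt(gamma):
\[
\frac{\lambda_3}{\lambda_2} = \frac{\sqrt{\gamma}}{\lambda_2} = \Theta\bigl(K^{-4}\bigr),
\qquad
\lambda_2 = \Theta\Bigl(\sqrt{\gamma \log^8(m)}\Bigr) = o(1)
\]
```

Since $K = \Theta(\log(m/\gamma))$ grows without bound, both ratios $\lambda_3/\lambda_2$ and $\sqrt{\gamma}/\lambda_2$ vanish, which is why only the $L\epsilon/\lambda_2$ term needs the separate bound from Equation (13).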
