10.07.2015

The error rate of learning halfspaces using kernel-SVM


equivalent formulation

\[
\min \; \mathrm{Err}_{D,l}(f) \quad \text{s.t.} \quad f \in \bar{W} \tag{18}
\]

The following lemma is very similar to Lemma 5.12, but with better dependency on $m$ ($m^{1.5}$ instead of $m^2$).

Lemma 5.25 Let $l$ be a convex surrogate and let $V \subset C(S^{d-1})$ be an $m$-dimensional vector space. There exists a continuous kernel $k : S^{d-1} \times S^{d-1} \to \mathbb{R}$ with $\sup_{x \in S^{d-1}} k(x, x) \le 1$ such that $H_k = V$ as a vector space, and there exists a probability measure $\mu_N$ such that

\[
\forall f \in V, \quad \|f\|_{H_k} \le \frac{2 m^{1.5}}{|\partial_+ l(0)|} \, \mathrm{Err}_{\mu_N, l}(f)
\]

Proof Let $\psi : S^{d-1} \to V^*$ be the evaluation operator. It maps each $x \in S^{d-1}$ to the linear functional $f \in V \mapsto f(x)$. We claim that

1. $\psi$ is continuous,
2. $\mathrm{aff}(\psi(S^{d-1}) \cup -\psi(S^{d-1})) = V^*$,
3. $V = \{v^{**} \circ \psi : v^{**} \in V^{**}\}$.

Proof of 1: We need to show that $\psi(x_n) \to \psi(x)$ if $x_n \to x$. Since $V^*$ is finite dimensional, it suffices to show that $\psi(x_n)(f) \to \psi(x)(f)$ for every $f \in V$, which follows from the continuity of $f$.

Proof of 2: Note that $0 \in U = \mathrm{aff}(\psi(S^{d-1}) \cup -\psi(S^{d-1}))$, so $U$ is a linear space. Now, define $T : U^* \to V$ via $T(u^*) = u^* \circ \psi$. We claim that $T$ is onto, whence $\dim(U) = \dim(U^*) = \dim(V) = \dim(V^*)$, so that $U = V^*$. Indeed, for $f \in V$, let $u^*_f \in U^*$ be the functional $u^*_f(u) = u(f)$. Now, $T(u^*_f)(x) = u^*_f(\psi(x)) = \psi(x)(f) = f(x)$, thus $T(u^*_f) = f$.

Proof of 3: From $U = V^*$ it follows that $U^* = V^{**}$, so that the mapping $T : V^{**} \to V$ is onto, showing that $V = \{v^{**} \circ \psi : v^{**} \in V^{**}\}$.

Let us apply John's Lemma to $K = \mathrm{conv}(\psi(S^{d-1}) \cup -\psi(S^{d-1}))$. It yields an inner product on $V^*$ with $K$ contained in the unit ball and containing the ball around $0$ with radius $\frac{1}{\sqrt{m}}$. Let $k$ be the kernel $k(x, y) = \langle \psi(x), \psi(y) \rangle$. Since $\psi$ is continuous, $k$ is continuous as well. By Theorem 5.1.1 and since $T$ is onto, it follows that, as a vector space, $V = H_k$. Since $K$ is contained in the unit ball, it follows that $\sup_{x \in S^{d-1}} k(x, x) \le 1$. It remains to prove the existence of the measure $\mu_N$.

Let $e_1, \ldots, e_m \in V^*$ be an orthonormal basis. For every $i \in [m]$, choose $(x_i^1, y_i), \ldots, (x_i^{m+1}, y_i) \in S^{d-1} \times \{\pm 1\}$ and $\lambda_i^1, \ldots, \lambda_i^{m+1} \ge 0$ such that $\sum_{j=1}^{m+1} \lambda_i^j = 1$ and

\[
\frac{1}{\sqrt{m}} e_i = \sum_{j=1}^{m+1} \lambda_i^j \, y_i \, \psi(x_i^j).
\]

Define $\mu_N(x_i^j, 1) = \mu_N(x_i^j, -1) = \frac{\lambda_i^j}{2m}$.

Let $f \in V$. By Theorem 5.1.1 there exists $v \in V^*$ such that $f = \Lambda_{v,0} \circ \psi$ and $\|f\|_{H_k} = \|v\|_{V^*}$. It follows that, for $a = \partial_+ l(0)$,
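The kernel and the measure $\mu_N$ built so far can be sanity-checked numerically on a toy instance. The sketch below is an illustration under assumptions that are not in the text. It takes $V$ to be the linear functions on the circle $S^1$ (so $m = 2$ and $\psi(x)$ is $x$ in coordinates) and uses the hinge loss, for which $|\partial_+ l(0)| = 1$. Here $K$ is the unit disc, so the standard inner product already satisfies the conclusion of John's Lemma and $\|f_w\|_{H_k} = \|w\|$. All function and variable names are hypothetical.

```python
import math
import random

M = 2  # assumed toy case: V = linear functions on S^1, so m = 2

def psi(x):
    # Evaluation operator in coordinates: for linear f_w(x) = <w, x>, psi(x) is x.
    return x

def kernel(x, y):
    # k(x, y) = <psi(x), psi(y)> with the standard inner product (assumed John inner product).
    return sum(a * b for a, b in zip(psi(x), psi(y)))

def hinge(z):
    # Assumed convex surrogate; its right derivative at 0 is -1.
    return max(0.0, 1.0 - z)

A = math.sqrt(0.5)
# e_i / sqrt(m) written as a convex combination of points psi(x_i^j), all with y_i = +1.
# Two points suffice here; the remaining weight allowed by the text (m + 1 points) is zero.
DECOMPOSITION = {
    0: [((A, A), 0.5), ((A, -A), 0.5)],   # sums to e_1 / sqrt(2)
    1: [((A, A), 0.5), ((-A, A), 0.5)],   # sums to e_2 / sqrt(2)
}

def build_measure():
    # mu_N(x_i^j, +1) = mu_N(x_i^j, -1) = lambda_i^j / (2m); masses add up on repeated points.
    mu = {}
    for points in DECOMPOSITION.values():
        for x, lam in points:
            for label in (1, -1):
                mu[(x, label)] = mu.get((x, label), 0.0) + lam / (2 * M)
    return mu

def surrogate_error(w, mu):
    # Err_{mu_N, l}(f_w) = sum over atoms of mass * l(y * f_w(x)).
    return sum(mass * hinge(label * (w[0] * x[0] + w[1] * x[1]))
               for (x, label), mass in mu.items())

def check_bound(w, mu):
    # Lemma 5.25 in this toy case: ||w|| <= 2 * m^1.5 * Err_{mu_N, l}(f_w).
    norm = math.hypot(w[0], w[1])
    return norm <= 2 * M ** 1.5 * surrogate_error(w, mu) + 1e-12

if __name__ == "__main__":
    mu = build_measure()
    random.seed(0)
    ws = [(random.uniform(-50, 50), random.uniform(-50, 50)) for _ in range(1000)]
    print("total mass:", sum(mu.values()))
    print("bound holds on all samples:", all(check_bound(w, mu) for w in ws))
```

The check only exercises one small instance and one loss, so it is a plausibility test of the stated inequality rather than a substitute for the proof.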
