08.10.2016 Views

Foundations of Data Science

2dLYwbK

2dLYwbK

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Define a half space to be the set <strong>of</strong> all points on one side <strong>of</strong> a hyper plane, i.e., a set<br />

<strong>of</strong> the form {x|w T x ≥ w 0 }. The VC-dimension <strong>of</strong> half spaces in d-dimensions is d + 1.<br />

There exists a set <strong>of</strong> size d + 1 that can be shattered by half spaces. Select the d<br />

unit-coordinate vectors plus the origin to be the d + 1 points. Suppose A is any subset<br />

<strong>of</strong> these d + 1 points. Without loss <strong>of</strong> generality assume that the origin is in A. Take<br />

a 0-1 vector w which has 1’s precisely in the coordinates corresponding to vectors not<br />

in A. Clearly A lies in the half-space w T x ≤ 0 and the complement <strong>of</strong> A lies in the<br />

complementary half-space.<br />

We now show that no set <strong>of</strong> d + 2 points in d-dimensions can be shattered by linear<br />

separators. This is done by proving that any set <strong>of</strong> d+2 points can be partitioned into two<br />

disjoint subsets A and B <strong>of</strong> points whose convex hulls intersect. This establishes the claim<br />

since any linear separator with A on one side must have its entire convex hull on that<br />

side, 23 so it is not possible to have a linear separator with A on one side and B on the other.<br />

Let convex(S) denote the convex hull <strong>of</strong> point set S.<br />

Theorem 6.17 (Radon): Any set S ⊆ R d with |S| ≥ d + 2, can be partitioned into two<br />

disjoint subsets A and B such that convex(A) ∩ convex(B) ≠ φ.<br />

Pro<strong>of</strong>: Without loss <strong>of</strong> generality, assume |S| = d+2. Form a d×(d+2) matrix with one<br />

column for each point <strong>of</strong> S. Call the matrix A. Add an extra row <strong>of</strong> all 1’s to construct a<br />

(d+1)×(d+2) matrix B. Clearly the rank <strong>of</strong> this matrix is at most d+1 and the columns<br />

are linearly dependent. Say x = (x 1 , x 2 , . . . , x d+2 ) is a nonzero vector with Bx = 0.<br />

Reorder the columns so that x 1 , x 2 , . . . , x s ≥ 0 and x s+1 , x s+2 , . . . , x d+2 < 0. Normalize<br />

x so<br />

s ∑<br />

i=1<br />

|x i | = 1. Let b i (respectively a i ) be the i th column <strong>of</strong> B (respectively A). Then,<br />

s∑<br />

|x i |b i = d+2 ∑<br />

i=1<br />

d+2<br />

∑<br />

i=s+1<br />

i=s+1<br />

|x i |. Since s ∑<br />

i=1<br />

|x i |b i from which it follows that<br />

|x i | = 1 and d+2 ∑<br />

i=s+1<br />

s∑<br />

i=1<br />

|x i | = 1 each side <strong>of</strong><br />

|x i |a i = d+2 ∑<br />

i=s+1<br />

s∑<br />

|x i |a i = d+2 ∑<br />

i=1<br />

∑<br />

|x i |a i and s |x i | =<br />

i=s+1<br />

i=1<br />

|x i |a i is a convex<br />

combination <strong>of</strong> columns <strong>of</strong> A which proves the theorem. Thus, S can be partitioned into<br />

two sets, the first consisting <strong>of</strong> the first s points after the rearrangement and the second<br />

consisting <strong>of</strong> points s + 1 through d + 2 . Their convex hulls intersect as required.<br />

Radon’s theorem immediately implies that half-spaces in d-dimensions do not shatter<br />

any set <strong>of</strong> d + 2 points.<br />

Spheres in d-dimensions<br />

23 If any two points x 1 and x 2 lie on the same side <strong>of</strong> a separator, so must any convex combination: if<br />

w · x 1 ≥ b and w · x 2 ≥ b then w · (ax 1 + (1 − a)x 2 ) ≥ b.<br />

210

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!