08.10.2016 Views

Foundations of Data Science


A sphere in d dimensions is a set of points of the form {x | |x − x₀| ≤ r}. The VC-dimension of spheres is d + 1, the same as that of half-spaces. First, we prove that no set of d + 2 points can be shattered by spheres. Suppose some set S with d + 2 points can be shattered. Then for any partition of S into A₁ and A₂, there are spheres B₁ and B₂ such that B₁ ∩ S = A₁ and B₂ ∩ S = A₂. Now B₁ and B₂ may intersect, but there is no point of S in their intersection. It is easy to see that there is a hyperplane perpendicular to the line joining the centers of the two spheres with all of A₁ on one side and all of A₂ on the other. This implies that half-spaces shatter S, contradicting the fact that half-spaces have VC-dimension d + 1. Therefore no d + 2 points can be shattered by spheres.

It is also not difficult to see that the set of d + 1 points consisting of the unit-coordinate vectors and the origin can be shattered by spheres. Suppose A is a subset of the d + 1 points. Let a be the number of unit vectors in A. The center a₀ of our sphere will be the sum of the vectors in A. For every unit vector in A, its distance to this center will be √(a − 1), and for every unit vector outside A, its distance to this center will be √(a + 1). The distance of the origin to the center is √a. Thus, we can choose the radius so that precisely the points in A are in the sphere.
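The construction above can be checked by brute force for small d. The following is a minimal sketch, not part of the original text: the function names (`shattering_ball`, `picked_out`, `spheres_shatter`) and the point indexing are hypothetical. It takes the radius halfway between the two relevant distances to avoid floating-point ties, and it handles the empty subset with a ball placed far from all points, which is an assumption added here because the text does not treat that case separately.

```python
import itertools
import math


def point(i, d):
    """Point number i in R^d: 0 is the origin, 1..d are the unit vectors."""
    p = [0.0] * d
    if i > 0:
        p[i - 1] = 1.0
    return p


def shattering_ball(subset, d):
    """Return (center, radius) of a ball meant to pick out exactly `subset`."""
    if not subset:
        # Assumption (not in the text): a ball far from every point
        # realizes the empty subset.
        return [-2.0] * d, 1.0
    units = [i for i in subset if i != 0]
    a = len(units)
    # Center = sum of the vectors in the subset (the origin contributes nothing).
    center = [0.0] * d
    for i in units:
        center[i - 1] = 1.0
    if 0 in subset:
        # Keep the origin (distance sqrt(a)), drop outside units (sqrt(a + 1)).
        radius = (math.sqrt(a) + math.sqrt(a + 1)) / 2
    else:
        # Keep inside units (distance sqrt(a - 1)), drop the origin (sqrt(a)).
        radius = (math.sqrt(a - 1) + math.sqrt(a)) / 2
    return center, radius


def picked_out(center, radius, d):
    """Indices of the d + 1 points that lie inside the closed ball."""
    return frozenset(
        i for i in range(d + 1) if math.dist(point(i, d), center) <= radius
    )


def spheres_shatter(d):
    """Check every subset of the d + 1 points against the construction."""
    for k in range(d + 2):
        for subset in itertools.combinations(range(d + 1), k):
            center, radius = shattering_ball(set(subset), d)
            if picked_out(center, radius, d) != frozenset(subset):
                return False
    return True
```

Calling `spheres_shatter(d)` for a few small values of d enumerates all 2^(d+1) subsets, so it is only practical for modest d. It checks the construction numerically and does not replace the argument.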

Finite sets

The system of finite sets of real numbers can shatter any finite set of real numbers, and thus the VC-dimension of finite sets is infinite.

6.9.3 Proof of Main Theorems

We begin with a technical lemma. Consider drawing a set S of n examples from D and let A denote the event that there exists h ∈ H with zero training error on S but true error greater than or equal to ε. Now draw a second set S′ of n examples from D and let B denote the event that there exists h ∈ H with zero error on S but error greater than or equal to ε/2 on S′.

Lemma 6.18 Let H be a concept class over some domain X and let S and S′ be sets of n elements drawn from some distribution D on X, where n ≥ 8/ε. Let A be the event that there exists h ∈ H with zero error on S but true error greater than or equal to ε. Let B be the event that there exists h ∈ H with zero error on S but error greater than or equal to ε/2 on S′. Then Prob(B) ≥ Prob(A)/2.

Proof: Clearly, Prob(B) ≥ Prob(A, B) = Prob(A) Prob(B|A). Consider drawing set S and suppose event A occurs. Let h be in H with err_D(h) ≥ ε but err_S(h) = 0. Now draw set S′. The expected error of h on S′ is E(err_S′(h)) = err_D(h) ≥ ε. So, by Chernoff bounds, since n ≥ 8/ε, Prob(err_S′(h) ≥ ε/2) ≥ 1/2. Thus, Prob(B|A) ≥ 1/2 and Prob(B) ≥ Prob(A)/2 as desired.
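The Chernoff step can be spelled out as follows. This is a sketch that assumes the standard multiplicative lower-tail form, Prob(X ≤ (1 − δ)μ) ≤ e^{−δ²μ/2} for 0 < δ < 1, which may differ in constants from the version of the bound the text relies on. Here X (a symbol introduced for this illustration) is the number of points of S′ that h misclassifies, so X = n · err_S′(h) and μ = E(X) = n · err_D(h) ≥ nε ≥ 8.

```latex
% Assumption: multiplicative Chernoff lower tail with \delta = 1/2.
% X = n \cdot \mathrm{err}_{S'}(h), \quad \mu = E[X] = n \cdot \mathrm{err}_D(h) \ge n\epsilon \ge 8.
\begin{align*}
\Pr\!\left(\mathrm{err}_{S'}(h) < \tfrac{\epsilon}{2}\right)
  &= \Pr\!\left(X < \tfrac{n\epsilon}{2}\right)
   \le \Pr\!\left(X \le \tfrac{\mu}{2}\right) \\
  &\le e^{-(1/2)^{2}\mu/2} = e^{-\mu/8}
   \le e^{-1} < \tfrac{1}{2}.
\end{align*}
```

Taking complements gives Prob(err_S′(h) ≥ ε/2) > 1/2. Under this form of the bound, the condition n ≥ 8/ε is what makes the exponent μ/8 at least 1.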

We now prove Theorem 6.13, restated here for convenience.
