08.03.2014 Views

Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL

Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL

Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Machine</strong> <strong>Learning</strong> / <strong>3.</strong> Parzen Windows<br />

Space-averaged Estimates<br />

The probability that an instance x is within a given region<br />

R ⊆ X :<br />

∫<br />

p(x ∈ R) = p(x)dx<br />

R<br />

For a sample<br />

it is<br />

x 1 , x 2 , . . . , x n ∼ p<br />

(x i ∈ P ) ∼ binom(p(x ∈ R))<br />

Let k be the number of x i that are in region R:<br />

k := |{x i | x i ∈ R, i = 1, . . . , n}|<br />

then we can estimate<br />

ˆp(x ∈ R) := k n<br />

Lars Schmidt-Thieme, Information Systems <strong>and</strong> <strong>Machine</strong> <strong>Learning</strong> Lab (<strong>ISMLL</strong>), Institute BW/WI & Institute for Computer Science, University of Hildesheim<br />

Course on <strong>Machine</strong> <strong>Learning</strong>, winter term 2007 44/48<br />

<strong>Machine</strong> <strong>Learning</strong> / <strong>3.</strong> Parzen Windows<br />

Space-averaged Estimates<br />

If p is continous <strong>and</strong> R is very small, p(x) is almost<br />

constant in R:<br />

∫<br />

p(x ∈ R) = p(x)dx ≈ p(x) vol(R), for any x ∈ R<br />

R<br />

where vol(R) denotes the volume of region R.<br />

p(x) ≈<br />

k/n<br />

vol(R)<br />

Lars Schmidt-Thieme, Information Systems <strong>and</strong> <strong>Machine</strong> <strong>Learning</strong> Lab (<strong>ISMLL</strong>), Institute BW/WI & Institute for Computer Science, University of Hildesheim<br />

Course on <strong>Machine</strong> <strong>Learning</strong>, winter term 2007 45/48

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!