Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL
Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL
Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Machine</strong> <strong>Learning</strong> / <strong>3.</strong> Parzen Windows<br />
Space-averaged Estimates<br />
The probability that an instance x is within a given region<br />
R ⊆ X :<br />
∫<br />
p(x ∈ R) = p(x)dx<br />
R<br />
For a sample<br />
it is<br />
x 1 , x 2 , . . . , x n ∼ p<br />
(x i ∈ P ) ∼ binom(p(x ∈ R))<br />
Let k be the number of x i that are in region R:<br />
k := |{x i | x i ∈ R, i = 1, . . . , n}|<br />
then we can estimate<br />
ˆp(x ∈ R) := k n<br />
Lars Schmidt-Thieme, Information Systems <strong>and</strong> <strong>Machine</strong> <strong>Learning</strong> Lab (<strong>ISMLL</strong>), Institute BW/WI & Institute for Computer Science, University of Hildesheim<br />
Course on <strong>Machine</strong> <strong>Learning</strong>, winter term 2007 44/48<br />
<strong>Machine</strong> <strong>Learning</strong> / <strong>3.</strong> Parzen Windows<br />
Space-averaged Estimates<br />
If p is continous <strong>and</strong> R is very small, p(x) is almost<br />
constant in R:<br />
∫<br />
p(x ∈ R) = p(x)dx ≈ p(x) vol(R), for any x ∈ R<br />
R<br />
where vol(R) denotes the volume of region R.<br />
p(x) ≈<br />
k/n<br />
vol(R)<br />
Lars Schmidt-Thieme, Information Systems <strong>and</strong> <strong>Machine</strong> <strong>Learning</strong> Lab (<strong>ISMLL</strong>), Institute BW/WI & Institute for Computer Science, University of Hildesheim<br />
Course on <strong>Machine</strong> <strong>Learning</strong>, winter term 2007 45/48