Machine Learning 3. Nearest Neighbor and Kernel Methods - ISMLL

Machine Learning / 3. Parzen Windows

k-Nearest Neighbor is locally constant

In k-nearest neighbor the size of the window varies from point to point: it depends on the density of the data:

in dense parts the effective window size is small,
in sparse parts the effective window size is large.

Alternatively, it is also possible to set the size of the window to a constant λ, e.g.,

K_λ(x, x_0) := 1, if |x − x_0| ≤ λ; 0, otherwise
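This fixed-width window kernel can be sketched in a few lines of Python (the function name is illustrative, not from the slides):

```python
def uniform_kernel(x, x0, lam):
    """Fixed-width Parzen window: weight 1 inside the window of radius lam around x0, else 0."""
    return 1.0 if abs(x - x0) <= lam else 0.0

# Every point within distance lam of x0 gets the same weight.
print(uniform_kernel(1.4, 1.0, 0.5))  # inside the window: 1.0
print(uniform_kernel(2.0, 1.0, 0.5))  # outside the window: 0.0
```

In contrast to k-nearest neighbor, the number of training points receiving nonzero weight now varies with the local data density, while the window width λ stays fixed.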

Lars Schmidt-Thieme, Information Systems and Machine Learning Lab (ISMLL), Institute BW/WI & Institute for Computer Science, University of Hildesheim

Course on Machine Learning, winter term 2007, 36/48

Machine Learning / 3. Parzen Windows

Kernel Regression

Instead of discrete windows, one typically uses continuous windows, i.e., continuous weights K(x, x_0) that reflect the distance of a training point x to a prediction point x_0, called a kernel or Parzen window, e.g.,

K(x, x_0) := 1 − |x − x_0| / λ, if |x − x_0| ≤ λ; 0, otherwise
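A minimal Python sketch of this triangular kernel (illustrative names, not from the slides): the weight decays linearly from 1 at x_0 down to 0 at distance λ.

```python
def triangular_kernel(x, x0, lam):
    """Continuous Parzen window: weight falls linearly from 1 at x0 to 0 at distance lam."""
    d = abs(x - x0)
    return 1.0 - d / lam if d <= lam else 0.0

print(triangular_kernel(0.0, 0.0, 1.0))  # at x0: full weight 1.0
print(triangular_kernel(0.5, 0.0, 1.0))  # halfway out: weight 0.5
print(triangular_kernel(2.0, 0.0, 1.0))  # outside the window: 0.0
```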

Instead of a binary neighbor/not-neighbor decision, a continuous kernel captures a “degree of neighborship”.

Kernels can be used for prediction via kernel regression, especially the Nadaraya-Watson kernel-weighted average:

ŷ(x_0) := Σ_{(x,y)∈X} K(x, x_0) y / Σ_{(x,y)∈X} K(x, x_0)
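The Nadaraya-Watson estimator above can be sketched directly in Python; the function and variable names are illustrative, and the triangular kernel from the previous slide is repeated here so the sketch is self-contained:

```python
def triangular_kernel(x, x0, lam):
    """Continuous Parzen window: weight falls linearly from 1 at x0 to 0 at distance lam."""
    d = abs(x - x0)
    return 1.0 - d / lam if d <= lam else 0.0

def nadaraya_watson(x0, data, kernel, lam):
    """Kernel-weighted average of targets: sum K(x, x0) * y / sum K(x, x0) over (x, y) in data."""
    weights = [kernel(x, x0, lam) for x, _ in data]
    total = sum(weights)
    if total == 0.0:  # x0 is farther than lam from every training point
        return None
    return sum(w * y for w, (_, y) in zip(weights, data)) / total

data = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]
print(nadaraya_watson(1.0, data, triangular_kernel, lam=1.5))  # ≈ 1.4
```

Note that the estimate is a convex combination of the training targets: nearby points dominate, and points outside every window contribute nothing, which is exactly the “degree of neighborship” idea from the slide.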

