18.07.2014 Views

PHD Thesis - Institute for Computer Graphics and Vision - Graz ...


3.2. Scale invariant detectors

where k is a constant multiplicative factor. This means that D(x, y, σ) is simply the difference of two neighboring discrete scale-space representations of the image I. The scale-space for DOG detection is defined in the following manner. It consists of a pre-defined number of partitions, called octaves. Each new octave starts with a σ twice as large as that of the previous octave. Each octave is partitioned into s discrete scale-space representations, where s is an integer. With this condition the parameter k is defined as k = 2^(1/s), the s-th root of 2, so that σ doubles after s steps. For each octave the image I is re-sampled down to half the size of the previous image. Re-sampling is done by simply selecting every other pixel of the image, which is computationally efficient. Re-sampling each time σ doubles is consistent with scale-space theory. The difference of Gaussian function D(x, y, σ) is now produced by subtracting neighboring scale-space slices within each octave. The next step after computing D(x, y, σ) is the detection of local extrema therein, that is, the local minima and maxima of D(x, y, σ). Every pixel of the scale-space representation is checked for being an extremum of D(x, y, σ); if it is one, it is selected as a DOG-keypoint. If the extremum is located in one of the re-sampled octaves, its x and y coordinates in the original image scale have to be computed. The characteristic scale of the DOG-point is the σ of the scale-space slice on which the extremum was found. For extremum detection all 26 neighboring pixels in scale-space are investigated. The pixel is a local maximum if its value is higher than the values of all of its neighbors, and a local minimum if it is smaller than all of them. The 26 neighbors are defined by an 8-connected neighborhood extended across scale: the 8 neighbors in the same slice, 9 neighbors on the upper scale level and 9 neighbors on the lower scale level. Point detection in this way yields only pixel accuracy. In a subsequent step, sub-pixel keypoint localization is performed. This step ensures that keypoints are located exactly on corners or edges. To gain sub-pixel accuracy a 3D quadratic function is fitted to the local scale-space region. The keypoint is finally localized at the interpolated maximum or minimum of the quadratic function (for more details see [13]).
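The extremum search described above can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work: the function names (`slice_sigmas`, `is_extremum`, `find_keypoints`), the nested-list representation of one octave's DOG stack, and the σ schedule k = 2^(1/s) (which gives k = √2 for s = 2) are assumptions made for illustration only. Sub-pixel refinement and the filtering steps are deliberately left out.

```python
from itertools import product


def slice_sigmas(sigma0, s, octave):
    """Absolute sigma of each of the s slices in one octave.

    Assumed schedule: sigma doubles per octave and k = 2**(1/s) per slice.
    """
    k = 2.0 ** (1.0 / s)
    return [sigma0 * (2.0 ** octave) * (k ** i) for i in range(s)]


def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26 neighbors."""
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds, dy, dx in product((-1, 0, 1), repeat=3)
        if (ds, dy, dx) != (0, 0, 0)  # skip the center pixel itself
    ]
    return center > max(neighbors) or center < min(neighbors)


def find_keypoints(dog, sigmas, octave):
    """Scan the interior of one octave's DOG stack for local extrema.

    dog    : list of 2D lists, one per DOG slice, all of the same size
    sigmas : sigma associated with each slice (supplied by the caller)
    octave : 0 for full resolution; each further octave halves the image
    Returns (x, y, sigma) tuples with x, y in original-image coordinates.
    """
    scale = 2 ** octave  # every other pixel was kept once per octave
    height, width = len(dog[0]), len(dog[0][0])
    keypoints = []
    # Border slices and border pixels lack a full 26-neighborhood and are skipped.
    for s in range(1, len(dog) - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                if is_extremum(dog, s, y, x):
                    keypoints.append((x * scale, y * scale, sigmas[s]))
    return keypoints
```

Multiplying the coordinates by 2^octave reflects the assumption that re-sampling keeps the pixels at even indices, so pixel x in a re-sampled octave corresponds to pixel x · 2^octave in the original image.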

However, not all detected extrema are suited to act as keypoints. Extrema with low contrast are not well suited. Scale-space extrema also tend to be located on edges, where they are not well localized along the edge itself. A final filtering step eliminates such ambiguous detections. Edge responses are eliminated by eigenvalue analysis of the Hessian matrix H at the keypoint location. The process is very similar to corner detection using the Hessian matrix. The ratio of the two principal curvatures (the eigenvalues of H) is considered, and the keypoint is eliminated if one is significantly stronger than the other. The ratio need not be computed explicitly; it is assessed through the ratio of the squared trace to the determinant. If

$$\frac{\operatorname{trace}(H)^2}{\det(H)} < \frac{(r + 1)^2}{r} \qquad (3.15)$$

the location is accepted as a DOG-keypoint, where r = 10 is a reasonable value in many situations.
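The test in Equation (3.15) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, the Hessian entries are estimated by central finite differences on a single DOG slice, and rejecting locations with a non-positive determinant is an additional convention that the text above does not state.

```python
def hessian_2x2(img, y, x):
    """Finite-difference estimate of the 2x2 Hessian entries at pixel (y, x).

    img is a 2D list (one DOG slice); (y, x) must not lie on the border.
    Returns (dxx, dyy, dxy).
    """
    dxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    dyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    dxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4
    return dxx, dyy, dxy


def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """True if trace(H)^2 / det(H) < (r + 1)^2 / r, as in Equation (3.15)."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Assumption: curvatures of opposite sign (or a zero curvature)
        # are treated as a rejected location.
        return False
    return trace * trace / det < (r + 1) ** 2 / r
```

For r = 10 the threshold (r + 1)^2 / r equals 12.1. A blob-like location with equal curvatures gives trace(H)^2 / det(H) = 4 and is accepted, whereas curvatures of, say, -20 and -1 give 441 / 20 = 22.05 and the location is rejected as an edge response.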

It is possible to implement the necessary steps of the DOG-detector very efficiently. The DOG-detector is therefore a candidate of choice if one wants to build a real-time system. Figure 3.6(a) shows examples of DOG-keypoints. Each keypoint is represented by its center point (yellow cross) and its characteristic scale, drawn as a circle around the center point.
