15.01.2013 Views

U. Glaeser


where the superscripts n and n + 1 indicate the iteration number, λ is a parameter allowing a tradeoff between smoothness and errors in the flow constraint equation, and $\bar{r}_x(k, l)$ and $\bar{r}_y(k, l)$ are local averages of $r_x$ and $r_y$. The updated estimates are thus the average of the surrounding values, minus an adjustment (which in velocity space is in the direction of the intensity gradient).
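As a rough illustration of this update, the sketch below performs one iteration in Python with NumPy. It assumes the common Horn and Schunck style weighting; the exact way λ enters the update equation is not visible in this excerpt, so the denominator used here is an assumption, as are the four-neighbour average and the names `local_average` and `flow_iteration`.

```python
import numpy as np

def local_average(field):
    """Mean of the four nearest neighbours, with replicated borders (assumed stencil)."""
    p = np.pad(field, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def flow_iteration(rx, ry, Ix, Iy, It, lam):
    """One update: local average minus a step along the intensity gradient.

    Ix, Iy, It are the spatial and temporal intensity derivatives.
    The weighting lam / (1 + lam * |grad I|^2) is an assumed form.
    """
    rx_bar = local_average(rx)
    ry_bar = local_average(ry)
    # Residual of the flow constraint equation at the averaged estimate.
    residual = Ix * rx_bar + Iy * ry_bar + It
    step = lam * residual / (1.0 + lam * (Ix ** 2 + Iy ** 2))
    # The adjustment (step * Ix, step * Iy) points along the intensity gradient.
    return rx_bar - step * Ix, ry_bar - step * Iy
```

Starting from zero flow and calling `flow_iteration` repeatedly moves the estimate toward a field that satisfies the flow constraint while staying close to its own local average; a larger `lam` puts more weight on the constraint.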

The previous discussion relied heavily on smoothness of the flow field. However, there are places in image sequences where discontinuities should occur. In particular, the boundaries of moving objects should exhibit discontinuities in optical flow. One approach taking advantage of smoothness but allowing discontinuities is to apply segmentation to the flow field. In this way, the boundaries between regions with smooth optical flow can be found, and the algorithm can be prevented from smoothing over these boundaries. Because of the “chicken-and-egg” nature of this method (a good segmentation depends on a good optical flow estimate, which depends on a good segmentation …), it is best applied iteratively.

Spatiotemporal-Frequency-Based Methods

It was shown in Section 28.2 that motion can be considered in the frequency domain, as well as in the spatial domain. A number of motion estimation methods have been developed with this in mind. If the sequence to be analyzed is very simple (has only a single motion component, for example) or if motion detection alone is required, the Fourier transform can be used as the basis for motion analysis, as examined in [23–25]; however, due to the global nature of the Fourier transform, it cannot be used to determine the location of the object in motion. It is also poorly suited for cases in which multiple motions exist (i.e., when the scene of interest consists of more than one object moving independently), since the signatures of the different motions are difficult (impossible, in general) to separate in the Fourier domain. As a result, although Fourier analysis can be used to illustrate some interesting phenomena, it cannot be used as the basis of motion analysis methods for the majority of sequences of practical interest.
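Both points can be illustrated with a small experiment. The sketch below is a simplified one-dimensional stand-in (one spatial axis plus time), not a method taken from [23–25]: it builds a periodic sequence containing a single translating bump, reads the velocity off the strongest line in the 2-D power spectrum, and shows that the power spectrum does not change when the bump starts somewhere else. The function names and parameter values are hypothetical.

```python
import numpy as np

def translating_sequence(n=64, frames=64, velocity=2, start=20, width=3.0):
    """Gaussian bump moving `velocity` pixels per frame on a periodic line."""
    x = np.arange(n)
    base = np.exp(-0.5 * ((x - n // 2) / width) ** 2)
    bump = np.roll(base, start - n // 2)          # place the bump at `start`
    return np.stack([np.roll(bump, velocity * t) for t in range(frames)])

def power_spectrum(seq):
    """Squared magnitude of the 2-D DFT over (time, space)."""
    return np.abs(np.fft.fft2(seq)) ** 2

def global_velocity(seq):
    """Velocity from the strongest spectral bin, using r * xi_x + xi_t = 0."""
    frames, n = seq.shape
    power = power_spectrum(seq)
    power[:, 0] = 0.0                             # ignore the spatial DC column
    kt, kx = np.unravel_index(np.argmax(power), power.shape)
    xi_t = np.fft.fftfreq(frames)[kt]
    xi_x = np.fft.fftfreq(n)[kx]
    return -xi_t / xi_x
```

Calling `global_velocity(translating_sequence(velocity=2))` recovers the motion, while `power_spectrum` returns the same array for `start=20` and `start=40` up to rounding: the position of the object is carried only by the phase, which is why a global transform alone does not localize the motion.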

To identify the locations and motions of objects, frequency analysis localized to the neighborhoods of the objects is required. Windowed Fourier analysis has been proposed for such cases [26], but the accuracy of a motion analysis method of this type is highly dependent on the resolution of the underlying transform, in both the spatiotemporal and spatiotemporal-frequency domains. It is known that the windowed Fourier transform does not perform particularly well in this regard. Filterbank-based approaches to this problem have also been proposed, as in [27]. The methods examined below each exploit the frequency domain characteristics of motion, and provide spatiotemporally localized motion estimates.

Optical Flow via the 3-D Wigner Distribution

Jacobson and Wechsler [28] proposed an approach to spatiotemporal-frequency-based derivation of optical flow using the 3-D Wigner distribution (WD). Extending the 2-D definition given earlier, the 3-D WD can be written as

$$
W_f(x, y, t, \xi_x, \xi_y, \xi_t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f\left(x + \frac{\alpha}{2},\, y + \frac{\beta}{2},\, t + \frac{\tau}{2}\right) \cdot f^{*}\left(x - \frac{\alpha}{2},\, y - \frac{\beta}{2},\, t - \frac{\tau}{2}\right) \times e^{-j2\pi(\alpha\xi_x + \beta\xi_y + \tau\xi_t)}\, d\alpha\, d\beta\, d\tau \tag{28.51}
$$

© 2002 by CRC Press LLC

It can be shown that the WD of a linearly translating image with velocity $r = (r_x, r_y)$ is

$$
W_f(x, y, t, \xi_x, \xi_y, \xi_t) = \delta(r_x\xi_x + r_y\xi_y + \xi_t) \cdot W_f(x - r_x t,\, y - r_y t,\, \xi_x, \xi_y) \tag{28.52}
$$

which is nonzero only when $r_x\xi_x + r_y\xi_y + \xi_t = 0$.
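The step behind (28.52) can be sketched as follows. Write the translating sequence as $f(x, y, t) = g(x - r_x t,\, y - r_y t)$, where $g$ is the image at $t = 0$; the symbol $g$ and the shifted variables $\alpha' = \alpha - r_x\tau$, $\beta' = \beta - r_y\tau$ are introduced here only for illustration. Substituting into the 3-D WD definition and changing variables gives

$$
\begin{aligned}
W_f(x, y, t, \xi_x, \xi_y, \xi_t)
&= \iiint g\left(x - r_x t + \tfrac{\alpha'}{2},\, y - r_y t + \tfrac{\beta'}{2}\right) g^{*}\left(x - r_x t - \tfrac{\alpha'}{2},\, y - r_y t - \tfrac{\beta'}{2}\right) \\
&\qquad \times e^{-j2\pi(\alpha'\xi_x + \beta'\xi_y)}\, e^{-j2\pi\tau(r_x\xi_x + r_y\xi_y + \xi_t)}\, d\alpha'\, d\beta'\, d\tau \\
&= W_g(x - r_x t,\, y - r_y t,\, \xi_x, \xi_y) \int_{-\infty}^{\infty} e^{-j2\pi\tau(r_x\xi_x + r_y\xi_y + \xi_t)}\, d\tau \\
&= \delta(r_x\xi_x + r_y\xi_y + \xi_t) \cdot W_g(x - r_x t,\, y - r_y t,\, \xi_x, \xi_y)
\end{aligned}
$$

Here $W_g$ is the 2-D WD of the image, which (28.52) writes as $W_f$ with four arguments. The integral over $\tau$ is what produces the delta function, and hence the plane in the frequency domain.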

For a linearly translating image, then, the local spectra $W_{f_{x,y,t}}(\xi_x, \xi_y, \xi_t)$ contain energy only in a plane (as in the Fourier case), the slope of which is determined by the velocity. Jacobson and Wechsler proposed to find this plane by integrating over the possible planar regions in these local spectra (via a so-called “velocity polling function”), using the plane of maximum energy to determine the velocity.
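The plane search can be sketched in a few lines of Python. The exact polling function of Jacobson and Wechsler is not given in this excerpt, so the Gaussian weighting around each candidate plane, the `thickness` parameter, and the names `velocity_poll` and `translating_blob` are all assumptions made for illustration. For brevity the demonstration uses the power spectrum of a small periodic sequence as a stand-in for a local WD spectrum, and it assumes the content is smooth enough that temporal aliasing is negligible.

```python
import numpy as np

def velocity_poll(power, candidates, thickness=None):
    """Score each candidate (rx, ry) by the spectral energy near its plane.

    power has shape (nt, ny, nx) and holds a spectrum over (xi_t, xi_y, xi_x).
    """
    nt, ny, nx = power.shape
    xi_t, xi_y, xi_x = np.meshgrid(
        np.fft.fftfreq(nt), np.fft.fftfreq(ny), np.fft.fftfreq(nx),
        indexing="ij")
    if thickness is None:
        thickness = 0.5 / nt          # about half a temporal frequency bin
    scores = []
    for rx, ry in candidates:
        # Offset (along xi_t) from the plane rx*xi_x + ry*xi_y + xi_t = 0.
        d = rx * xi_x + ry * xi_y + xi_t
        weight = np.exp(-0.5 * (d / thickness) ** 2)
        scores.append(np.sum(power * weight))
    best = int(np.argmax(scores))
    return candidates[best], np.asarray(scores)

def translating_blob(n=16, frames=16, velocity=(1, -1), sigma=2.0):
    """Periodic test sequence: a Gaussian blob moving (rx, ry) pixels per frame."""
    rx, ry = velocity
    y, x = np.mgrid[0:n, 0:n]
    blob = np.exp(-((x - n // 2) ** 2 + (y - n // 2) ** 2) / (2.0 * sigma ** 2))
    return np.stack([np.roll(blob, (ry * t, rx * t), axis=(0, 1))
                     for t in range(frames)])
```

A typical call builds `power = np.abs(np.fft.fftn(translating_blob())) ** 2`, lists integer candidates such as `[(rx, ry) for rx in range(-2, 3) for ry in range(-2, 3)]`, and takes the first return value of `velocity_poll` as the velocity estimate. Applying the same scoring to a spectrum computed around each point would give the spatiotemporally localized estimate described in the text.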
