
School of Computer Engineering
Nanyang Technological University

3D Stereo Vision

SC437 Computer Vision and Image Processing
TJ Cham
2002-03 / S2


Contents of 3D Stereo Vision

Representation of 3D data
Parallax
2D triangulation
Appearance-based matching
Feature-based matching
Epipolar geometry
3D reconstruction



Range Sensing

Radars and Sonars
• Time of flight for EM / acoustic pulses

Focusing / Defocusing
• Mapping from focal length to focused distance

Projector-Camera Triangulation
• Projected light pattern or laser on the surface, measured by a camera

Stereo Cameras
• Passively measure light intensities from the scene



Why Use Stereo Vision?

Inexpensive
• Cameras are inexpensive and getting cheaper

Low-powered, non-intrusive
• Does not project light or radio waves

Able to handle moving objects
• Does not require that objects are static

Directly measures object radiance
• Useful for graphics and visualization



Representation of Raw Range Data

Cloud of 3D Points
• Explicit (X, Y, Z) data per point
• Useful for 3D visualization

Images © Point Grey Research



Representation of Raw Range Data

Depth Maps
• Each pixel has an associated depth, measured along the ray from the projection center
• Easier to associate with the original image intensity

Images © Point Grey Research



Fitting 3D Models to Range Data

We can fit 3D models to raw range data for
• Reducing the amount of data storage
• Removing noise

Models can be
• Simple – e.g. planar patches, B-spline meshes
• Complex – e.g. deformable object models



3D Model Example

3D reconstruction of a temple (Van Gool et al.)
• 3D point cloud converted to triangular meshes and texture-mapped



Parallax

Parallax refers to the observed change in the relative angular displacement of image points across different camera views.

Parallax results from the depth differences of the corresponding 3D points.



Parallax from X-Y Translation

Parallax results when the camera is translated parallel to the image plane.

[Figure: the relative angle between two image points is reversed between camera positions O1 and O2]



Parallax from Z Translation

Parallax results when the camera is moved along the optical axis.

[Figure: the relative angle between two image points increases as the camera moves from O1 to O2]


No Parallax from Zooming

No parallax results from increasing the focal length (zooming).

[Figure: no change in the relative angle between image points; the projection center O1 stays fixed]


No Parallax from Pure Rotation

No parallax results when rotating about the projection center.

[Figure: no change in the relative angle between image points; the projection center O1 stays fixed]


Depth from Parallax

The 3D depth of points in the scene can be inferred from parallax.

We can recover depth when the projection centers are displaced:
• Translating the camera in any direction
• Having multiple cameras in different locations

We cannot recover depth when the projection centers are at the same location:
• Changing the focal length (zooming)
• Rotating the camera about the projection center


Simple 2D Triangulation

Assume a 1D camera translated by T along the x-direction in the camera frame.

[Figure: projection centers O_l and O_r separated by the baseline T; focal length f; a 3D point at depth Z with lateral offsets X_l and -X_r in the two camera frames; the image coordinates give the disparity x_l - x_r]


Simple 2D Triangulation

Perspective camera equations:

$x_l = f \frac{X_l}{Z}, \qquad x_r = f \frac{X_r}{Z}$

Subtraction ⇒

$x_l - x_r = \frac{f}{Z}(X_l - X_r)$

But based on the geometry of the diagram, we have

$X_l - X_r = T$


Simple 2D Triangulation

Depth from simple triangulation:

$Z = f \frac{T}{x_l - x_r}$

T is known as the baseline distance.

Since the camera translation is only in the x-direction, the image points in both cameras have the same pixel row:

$y_l = y_r = f \frac{Y}{Z}$
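
As a quick sketch (not part of the original slides), the triangulation formula maps directly to code; the function name and the example numbers below are illustrative assumptions:

```python
import numpy as np

def depth_from_disparity(x_l, x_r, f, T):
    """Depth Z = f * T / (x_l - x_r) for corresponding image points.

    x_l, x_r : corresponding x-coordinates (scalars or arrays, in pixels)
    f        : focal length in pixels
    T        : baseline distance (same units as the returned depth)
    """
    d = np.asarray(x_l, dtype=float) - np.asarray(x_r, dtype=float)
    with np.errstate(divide="ignore"):  # zero disparity -> point at infinity
        return f * T / d

# Example: f = 500 px, T = 0.1 m, disparity of 5 px  =>  Z = 10 m
print(depth_from_disparity(105.0, 100.0, f=500.0, T=0.1))
```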


Image Disparity from 2 Cameras

Depth Z is inversely proportional to the image disparity $d = x_l - x_r$.

For each pixel in the left camera, we can measure the disparity after finding the corresponding image point in the right camera.

If the focal length and baseline distance are unknown, we can represent the range data using a disparity map instead of a depth map.


Relation of Baseline to Accuracy

Let the disparity $d = x_l - x_r$, then

$Z = f \frac{T}{d} \;\Rightarrow\; d = \frac{fT}{Z}$

If the estimation of d has an error δd, the corresponding error in Z is

$\delta Z = \frac{fT}{d^2}\,\delta d = \frac{fT}{(fT/Z)^2}\,\delta d = \frac{Z^2}{fT}\,\delta d$
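
A numerical check of the error formula (a sketch; the focal length, depth and half-pixel disparity error below are illustrative assumptions):

```python
def depth_error(Z, f, T, delta_d):
    """Propagated depth error: delta_Z = Z**2 / (f * T) * delta_d."""
    return Z**2 / (f * T) * delta_d

# f = 500 px, depth 10 m, half-pixel disparity error, three baselines:
for T in (0.05, 0.1, 0.5):
    print(f"T = {T:4.2f} m  ->  dZ = {depth_error(10.0, 500.0, T, 0.5):.2f} m")
# Doubling the baseline halves the depth error at a given depth.
```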


Relation of Baseline to Accuracy

As the cameras become closer together, the baseline distance T is reduced.

As $T \to 0$,

$\delta Z = \lim_{T \to 0} \frac{Z^2}{fT}\,\delta d \;\to\; \infty$

Estimation of Z is unstable as the cameras get infinitesimally close together.

A larger baseline is better for estimation accuracy.


Recap on Parallax and 1D Triangulation

Parallax / no parallax
• Parallax occurs when the projection centers are displaced

Depth from 2D triangulation
• $Z = fT / (x_l - x_r)$
• Disparity map ↔ depth map

Depth estimation accuracy
• Relation of Z estimation to the baseline distance T
• Larger T gives more accurate Z estimation


The Correspondence Problem

The triangulation covered earlier assumes that the image points corresponding to the same 3D point are known.

The correspondence problem: for each important image point in the first image, we need to find the corresponding image point in the second image.


Matching Image Points

Appearance-based Matching
• Match points with similar appearances across the two images

Feature-based Matching
• Match similar features across the two images
• Features = edges, corners, etc.

Geometric Constraints
• Reduce ambiguity
• Limit the amount of search


Occlusion and De-occlusion

Occlusion
• Some 3D points visible in the first image are hidden in the second image
• Implication: we cannot find correspondences for all points in the first image

De-occlusion
• Some 3D points hidden in the first image are visible in the second image
• Implication: there will be unmatched points in the second image


Appearance-based Matching

Assumptions
• Corresponding points in the 2 images have image patches that look identical
• Minimal geometric distortion
• Lambertian reflectance
• No occlusion

Reasonable for stereo cameras with a small baseline distance


Minimize Sum-of-Squares Difference (SSD)

For a small image patch in the first image, find the corresponding image patch with the smallest intensity SSD in the second image.

For image I and image patch g of size N by N, find the image location (x, y) given by

$\arg\min_{(x,y)} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \big( I(x+u,\, y+v) - g(u,v) \big)^2$
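
A brute-force sketch of this criterion (the helper name is an assumption; real stereo systems restrict the search region, e.g. to the epipolar line covered later):

```python
import numpy as np

def ssd_match(I, g):
    """Exhaustively find the (x, y) placement of patch g in image I with minimum SSD."""
    N = g.shape[0]
    H, W = I.shape
    best_ssd, best_xy = np.inf, None
    for y in range(H - N + 1):
        for x in range(W - N + 1):
            ssd = np.sum((I[y:y+N, x:x+N].astype(float) - g) ** 2)
            if ssd < best_ssd:
                best_ssd, best_xy = ssd, (x, y)
    return best_xy, best_ssd
```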


Minimize Sum-of-Squares Difference (SSD)

Expanding and eliminating constant terms:

$\arg\max_{(x,y)} \left\{ 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x+u,\, y+v)\, g(u,v) \;-\; \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x+u,\, y+v)^2 \cdot 1 \right\}$

This expression, in the form of 2 correlations, can be computed efficiently via FFT.
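
One way to realise the two correlations via FFT, assuming SciPy is available (`fftconvolve` with a flipped kernel computes correlation):

```python
import numpy as np
from scipy.signal import fftconvolve

def ssd_match_fft(I, g):
    """SSD minimisation rewritten as two correlations, evaluated via FFT."""
    I = I.astype(float)
    flipped = g[::-1, ::-1]  # correlation = convolution with a flipped kernel
    term1 = 2 * fftconvolve(I, flipped, mode="valid")              # 2 * sum I.g
    term2 = fftconvolve(I**2, np.ones(g.shape), mode="valid")      # sum I^2 per window
    score = term1 - term2                                          # maximise this
    y, x = np.unravel_index(np.argmax(score), score.shape)
    return (x, y)
```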


Appearance-based Matching in Stereo

Partition the left image into small image patches.

For each image patch, find the corresponding image patch in the right image.


Appearance Matching Example

Point Grey Research DigiClops

[Figure: stereo images and the resulting disparity map]


Matching Localization

Localization of SSD-based matching depends on the 'structure' in the image patch:

Smooth surface
• Localization is poor in all directions

Image gradients in one direction (e.g. edges)
• Localization is poor perpendicular to the gradient direction

Image gradients along multiple directions (e.g. corners)
• Localization is good


Matching Accuracy Examples

[Figure: SSD images for example patches]


Quantifying Matching Localization

For a particular image patch, we want to find out how accurately it can be localized:

1. Consider the image patch at the solution point (i.e. the perfect matching location where SSD = 0)
2. Estimate how much the SSD increases when the image patch is displaced by small amounts in different directions

We can use partial derivatives to estimate the increase in SSD.


Quantifying Matching Localization

Express the SSD as a function S(x, y). 2nd order Taylor series expansion about the solution point $(x_0, y_0)$:

$S(x_0 + \delta x,\, y_0 + \delta y) \;\approx\; \underbrace{S_0}_{0^{\text{th}}\text{ order}} + \underbrace{\begin{bmatrix} \frac{\partial S_0}{\partial x} & \frac{\partial S_0}{\partial y} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{1^{\text{st}}\text{ order (gradient)}} + \underbrace{\frac{1}{2} \begin{bmatrix} \delta x & \delta y \end{bmatrix} \begin{bmatrix} \frac{\partial^2 S_0}{\partial x^2} & \frac{\partial^2 S_0}{\partial x \partial y} \\ \frac{\partial^2 S_0}{\partial y \partial x} & \frac{\partial^2 S_0}{\partial y^2} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{2^{\text{nd}}\text{ order (Hessian)}}$

At the solution point,

$S_0 = 0, \qquad \frac{\partial S_0}{\partial x} = \frac{\partial S_0}{\partial y} = 0$


Quantifying Matching Localization

For an image patch g(u, v), the SSD Hessian matrix at the solution point can be expressed as

$H_0 = \begin{bmatrix} 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \left( \frac{\partial g}{\partial x} \right)^2 & 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} \\ 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} & 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \left( \frac{\partial g}{\partial y} \right)^2 \end{bmatrix}$

∂g/∂x and ∂g/∂y are simply the image gradients along the x and y directions, at each pixel (u, v) in the image patch.


Cornerness

Cornerness is defined as

$C = \lambda_{\min} = \text{minimum eigenvalue of } H_0 = \text{minimum root } \lambda \text{ of } (\lambda - h_{11})(\lambda - h_{22}) - h_{12}^2 = 0$

The corresponding eigenvector is the direction in which the image patch can 'slide' with the minimum increase in SSD.
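
A minimal sketch of the cornerness measure (the choice of `np.gradient` for the image gradients is an assumption; the slides do not fix a gradient operator):

```python
import numpy as np

def cornerness(g):
    """Minimum eigenvalue of the SSD Hessian H0 for an image patch g."""
    gy, gx = np.gradient(g.astype(float))   # dg/dy, dg/dx at each pixel
    h11 = 2 * np.sum(gx * gx)
    h22 = 2 * np.sum(gy * gy)
    h12 = 2 * np.sum(gx * gy)
    # Minimum root of (lambda - h11)(lambda - h22) - h12^2 = 0:
    half_tr = 0.5 * (h11 + h22)
    det = h11 * h22 - h12 ** 2
    return half_tr - np.sqrt(max(half_tr ** 2 - det, 0.0))
```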


Cornerness Examples

[Figure: three image patches with plots of $S \approx \begin{bmatrix} \delta x & \delta y \end{bmatrix} H_0 \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}$, with cornerness C ≈ 10^5, C ≈ 10^3 and C ≈ 10^3]


Recap on Appearance-based Matching

The correspondence problem
• Occlusion / de-occlusion

Appearance-based matching in stereo images
• Minimize the sum-of-squares difference (SSD) of image patches
• Expressed as correlations for efficient computation

Matching localization
• Localization depends on the structure in the image patch
• SSD Hessian matrix $H_0$ at the solution point
• Cornerness C = minimum eigenvalue of $H_0$


3D Estimation Accuracy versus Matching Accuracy

Correspondence by appearance matching is most accurate when
• Cameras are close together – small baseline

3D estimation is most accurate when
• Cameras are far apart – large baseline

In real 3D stereo applications, we need to find a reasonable tradeoff.


Feature-based Matching

We want to match image points across large differences in viewpoint:
• Extract image points with local properties that are approximately invariant to larger changes of viewpoint
• Need to select feature points = 'special' points
• Note: the 3D reconstruction is more sparse

Example features and properties:
• Corners – angle of the corner
• Curves – maximum radius of curvature
• The properties used depend on the types of viewpoint changes


Feature-based Matching

Match feature points across images directly by finding similar properties.

Example 1:
• Match corner points in the two images with the same angle
• Note: the corner points can have very different orientations


Feature-based Matching Example

Example 2: finding symmetrical contours
• Cannot use SSD directly
• Need to use feature points, e.g. points of maximum curvature
• Match using curvatures

Image © TJ Cham


Feature Matching Heuristics

Even with feature properties, there may be multiple candidates.

Heuristics may be used to try to select the correct solution.

Note: heuristics are empirical rules-of-thumb and may not always give the right solution.


Feature Matching Heuristics Examples

Proximity – match a feature point to the candidate with the nearest (x, y) position in the second image.

Ordering – feature points on a contour in the first image must match candidates on a contour in the second image in the same order.


Recap on Feature-based Matching

Advantage
• Allows a greater baseline / larger variation in viewpoints

Method
• Extract feature points with properties which are invariant to viewpoint changes
• Match feature points with the same properties across images

Matching may require heuristics to help find the correct solution
• E.g. proximity, ordering


Geometric Search Constraint

A very powerful search constraint is available if we know how the cameras are related to each other.

Given an image point in the first image, consider the ray R from the projection center passing through that image point:
• The ray R must intersect the 3D point
• The corresponding image point in the second image must lie on the projection of the ray R in the second view


Geometric Search Constraint

[Figure: ray R from O_l projecting to a line in the image of the second camera at O_r]

We do not have to conduct a 2D search over the entire image. Instead, we conduct a 1D search only along the projection of R in the second image.


Stereo Rig for Epipolar Geometry

[Figure: stereo rig with projection centers O_l and O_r, the epipoles, and the epipolar lines in the two images]


Epipoles and Epipolar Lines

The epipole in view 2 is the image point of the projection center of camera 1.

An epipolar line in view 2 is the image line of a ray intersecting the projection center of camera 1.

Each pixel in view 1 has a corresponding epipolar line in view 2, and vice versa.

Epipolar lines always intersect the epipole.

There is only 1 epipole per image, but an infinite family of epipolar lines.


Epipoles and Epipolar Lines Examples

Epipolar lines in the left image, based on points in the right image
• The epipolar lines intersect at the epipole, which is outside the image in this case

Images © T. de Margerie


Epipolar Geometry Derivation

[Figure: 3D point P viewed from the two projection centers O_l and O_r, which are separated by the translation T; the vectors P_l and P_l − T]

$P_l$ is the 3D point w.r.t. the $O_l$ reference frame.

$P_r = R^{-1}(P_l - T)$ is the same point w.r.t. the $O_r$ reference frame.


Epipolar Geometry Derivation

$P_l$, $T$ and $(P_l - T)$ lie in the same plane. Express this as

$(P_l \times T) \cdot (P_l - T) = 0$

$(P_l \times T)^T (P_l - T) = 0$

In the right frame $P_r = R^{-1}(P_l - T)$ ⇒

$(P_l \times T)^T R\, P_r = 0$


Epipolar Geometry Derivation

We can also express the cross-product as a matrix multiplication! Set

$\begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix} \times \begin{bmatrix} T_X \\ T_Y \\ T_Z \end{bmatrix} = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix} \begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix}, \qquad H = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix}$
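
A one-line sanity check of this identity (a sketch; the helper name is an assumption):

```python
import numpy as np

def cross_matrix(T):
    """H such that P x T == H @ P, following the slide's sign convention."""
    TX, TY, TZ = T
    return np.array([[0.0,  TZ, -TY],
                     [-TZ, 0.0,  TX],
                     [ TY, -TX, 0.0]])

# Verify against numpy's cross product for arbitrary vectors:
P, T = np.array([1.0, 2.0, 3.0]), np.array([0.3, -0.1, 0.5])
assert np.allclose(np.cross(P, T), cross_matrix(T) @ P)
```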


Epipolar Geometry Derivation

Substituting the cross-product ⇒

$(H P_l)^T R\, P_r = 0 \quad\Leftrightarrow\quad P_l^T H^T R\, P_r = 0$

2D image points are related to 3D points via

$p_l = \frac{f_l}{Z_l} P_l, \qquad p_r = \frac{f_r}{Z_r} P_r$


Essential Matrix Equation

Combining,

$p_l^T \left( \frac{Z_l}{f_l} \right) H^T R \left( \frac{Z_r}{f_r} \right) p_r = 0 \;\Rightarrow\; p_l^T E\, p_r = 0$

where $E$ is the 3×3 Essential Matrix.

$E$ is scale-factor independent, and is usually multiplied such that E(3,3) = 1.


Fundamental Matrix Equation

Affine transformations of the image points:

$p'_l = M_l p_l, \qquad p'_r = M_r p_r$

$p_l'^T \left( M_l^{-1} \right)^T E\, M_r^{-1} p'_r = 0 \;\Rightarrow\; p_l'^T F\, p'_r = 0$

where $F$ is the 3×3 Fundamental Matrix.

$F$ is also scale-factor independent, and is usually multiplied such that F(3,3) = 1.


Using Epipolar Geometry

Writing the equation in full, we have

$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$

We can express this as

$(f_1 x_l + f_4 y_l + f_7)\, x_r + (f_2 x_l + f_5 y_l + f_8)\, y_r = -(f_3 x_l + f_6 y_l + 1)$

If we have specific values for $(x_l, y_l)$, this equation is a straight line in the right image!
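
Reading the line coefficients off the equation is a one-liner (a sketch; the F in the example is arbitrary and illustrative, normalized so F(3,3) = 1):

```python
import numpy as np

def epipolar_line(F, xl, yl):
    """Coefficients (a, b, c) of the right-image line a*xr + b*yr + c = 0
    induced by the left-image point (xl, yl) under p_l^T F p_r = 0."""
    return np.array([xl, yl, 1.0]) @ F    # row vector times F

F = np.array([[0.0,   -1e-4,  0.01],
              [1e-4,   0.0,  -0.02],
              [-0.01,  0.03,  1.0]])      # illustrative values only
a, b, c = epipolar_line(F, 120.0, 80.0)
print(a, b, c)
```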


Using Epipolar Geometry

If we know the Fundamental Matrix:
1. For any point (x_l, y_l) in one image, we know the epipolar line in the other image
2. We only need to search for the corresponding point along the epipolar line

Example:

Images © Robotvis INRIA


Estimating Fundamental Matrix

How do we compute the Fundamental Matrix?

Obtain it directly from the projection matrices
• If the cameras are pre-calibrated
• Difficult to derive and not covered in lectures

or …

Obtain it from 8 known image correspondences
• 1 correspondence means a specific (x_l, y_l), (x_r, y_r) pair
• Can be used with uncalibrated cameras
• Knowing 1 correspondence provides 1 constraint on the equation $p_l^T F p_r = 0$
• Knowing 8 correspondences ⇒ able to solve for the 8 variables in F


Estimating Fundamental Matrix

The Fundamental Matrix equation in full:

$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$

We can rewrite this as

$\underbrace{\begin{bmatrix} x_l x_r & x_l y_r & x_l & y_l x_r & y_l y_r & y_l & x_r & y_r \end{bmatrix}}_{\text{known values from 1 correspondence}} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_8 \end{bmatrix}}_{\text{unknown values to be computed}} = -1$


Estimating Fundamental Matrix

Using 8 known correspondences, we get

$\underbrace{\begin{bmatrix} x_{l1} x_{r1} & x_{l1} y_{r1} & x_{l1} & y_{l1} x_{r1} & y_{l1} y_{r1} & y_{l1} & x_{r1} & y_{r1} \\ x_{l2} x_{r2} & x_{l2} y_{r2} & x_{l2} & y_{l2} x_{r2} & y_{l2} y_{r2} & y_{l2} & x_{r2} & y_{r2} \\ \vdots & & & & & & & \vdots \\ x_{l8} x_{r8} & x_{l8} y_{r8} & x_{l8} & y_{l8} x_{r8} & y_{l8} y_{r8} & y_{l8} & x_{r8} & y_{r8} \end{bmatrix}}_{B \;(8 \times 8)} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_8 \end{bmatrix}}_{f \;(8 \times 1)} = \begin{bmatrix} -1 \\ -1 \\ \vdots \\ -1 \end{bmatrix}$
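
A linear sketch of this 8-point solve; the least-squares call also covers the n > 8 case on the next slide (in practice the coordinates would be normalized first, which the slides do not cover):

```python
import numpy as np

def estimate_F(pts_l, pts_r):
    """Solve B f = -1 for the 8 unknowns of F (with F[2,2] fixed to 1).

    pts_l, pts_r : (n, 2) arrays of corresponding (x, y) points, n >= 8.
    """
    xl, yl = pts_l[:, 0], pts_l[:, 1]
    xr, yr = pts_r[:, 0], pts_r[:, 1]
    B = np.column_stack([xl*xr, xl*yr, xl, yl*xr, yl*yr, yl, xr, yr])
    rhs = -np.ones(len(xl))
    f, *_ = np.linalg.lstsq(B, rhs, rcond=None)   # pseudo-inverse solution
    return np.append(f, 1.0).reshape(3, 3)
```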


Estimating Fundamental Matrix

We can then solve for the elements of the Fundamental Matrix:

$f = B^{-1} \begin{bmatrix} -1 \\ \vdots \\ -1 \end{bmatrix}$

With n > 8 point correspondences, B is n×8 and we have to use the pseudo-inverse.


Recap on Epipolar Geometry

Epipolar geometry
• Epipoles – the image point of the other camera's projection center
• Epipolar lines – projections of 3D rays from the projection center

Epipolar equations
• Essential matrix – image points in the camera reference frame
• Fundamental matrix – $p_l^T F p_r = 0$, covers an additional affine transformation of the image points

Using epipolar geometry
• Search only along the epipolar line to find the corresponding point

Estimating the fundamental matrix
• Use at least 8 known image correspondences


3D Reconstruction from Perspective Stereo

3D reconstruction by triangulation with calibrated cameras:
• For each image point in both cameras, we know the associated 3D ray
• Find the intersection of the rays from corresponding image points

[Figure: rays from O_l and O_r intersecting at the 3D point]


3D Reconstruction from Perspective Stereo

Do the triangulation in matrix form. Assume we have the projection matrices from camera calibration:

L camera: $\begin{bmatrix} k x_l \\ k y_l \\ k \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ a_9 & a_{10} & a_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

R camera: $\begin{bmatrix} m x_r \\ m y_r \\ m \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ b_9 & b_{10} & b_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

Except now the unknowns are only X, Y, Z!


3D Reconstruction from Perspective Stereo

Rewrite for the left camera:

$x_l = \frac{a_1 X + a_2 Y + a_3 Z + a_4}{a_9 X + a_{10} Y + a_{11} Z + 1}, \qquad y_l = \frac{a_5 X + a_6 Y + a_7 Z + a_8}{a_9 X + a_{10} Y + a_{11} Z + 1}$

⇒

$(a_9 x_l - a_1) X + (a_{10} x_l - a_2) Y + (a_{11} x_l - a_3) Z = a_4 - x_l$

$(a_9 y_l - a_5) X + (a_{10} y_l - a_6) Y + (a_{11} y_l - a_7) Z = a_8 - y_l$

Similarly for the right camera.


3D Reconstruction from Perspective Stereo

Hence with a pair of image correspondences $(x_l, y_l)$ and $(x_r, y_r)$, we can get

$\underbrace{\begin{bmatrix} a_9 x_l - a_1 & a_{10} x_l - a_2 & a_{11} x_l - a_3 \\ a_9 y_l - a_5 & a_{10} y_l - a_6 & a_{11} y_l - a_7 \\ b_9 x_r - b_1 & b_{10} x_r - b_2 & b_{11} x_r - b_3 \\ b_9 y_r - b_5 & b_{10} y_r - b_6 & b_{11} y_r - b_7 \end{bmatrix}}_{W} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \underbrace{\begin{bmatrix} a_4 - x_l \\ a_8 - y_l \\ b_4 - x_r \\ b_8 - y_r \end{bmatrix}}_{q}$

Compute the 3D world coordinates X, Y and Z via the pseudo-inverse:

$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = W^{+} q = (W^T W)^{-1} W^T q$

Computationally expensive – we compute a new pseudo-inverse for each pair of image correspondences.
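
A compact sketch of the per-correspondence solve, with `np.linalg.pinv` standing in for $(W^T W)^{-1} W^T$ (the helper names are assumptions):

```python
import numpy as np

def triangulate(A, B, pl, pr):
    """Solve W [X Y Z]^T = q for one correspondence, as on the slide.

    A, B   : 3x4 left/right projection matrices, scaled so A[2,3] = B[2,3] = 1.
    pl, pr : (x, y) image points in the left/right camera.
    """
    def rows(P, x, y):
        W = np.array([P[2, :3] * x - P[0, :3],
                      P[2, :3] * y - P[1, :3]])
        q = np.array([P[0, 3] - x, P[1, 3] - y])
        return W, q

    Wl, ql = rows(A, *pl)
    Wr, qr = rows(B, *pr)
    W = np.vstack([Wl, Wr])
    q = np.concatenate([ql, qr])
    return np.linalg.pinv(W) @ q   # least-squares intersection of the two rays
```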


3D Reconstruction from Weak-Perspective Stereo

Assume weak-perspective projection matrices:

L camera: $\begin{bmatrix} x_l \\ y_l \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

R camera: $\begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

Note: the last rows of the matrix equations are redundant!


3D Reconstruction from Weak-Perspective Stereo

Eliminate the last rows and combine the equations:

$\begin{bmatrix} a_1 & a_2 & a_3 \\ a_5 & a_6 & a_7 \\ b_1 & b_2 & b_3 \\ b_5 & b_6 & b_7 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} x_l - a_4 \\ y_l - a_8 \\ x_r - b_4 \\ y_r - b_8 \end{bmatrix}$

Again, solve by pseudo-inverse.

However, the pseudo-inverse depends only on the projection matrices and not on the individual image points
⇒ it need only be computed once for the 3D reconstruction of all image correspondences.
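
A sketch showing why this is cheaper: the pseudo-inverse is computed once and reused for every correspondence (the helper names are assumptions):

```python
import numpy as np

def weak_perspective_reconstructor(A, B):
    """Precompute the pseudo-inverse once; reuse it for every correspondence.

    A, B : 3x4 weak-perspective projection matrices (last row [0, 0, 0, 1]).
    Returns a function mapping (xl, yl, xr, yr) -> (X, Y, Z).
    """
    W = np.vstack([A[:2, :3], B[:2, :3]])       # rows [a1..a3; a5..a7; b1..b3; b5..b7]
    W_pinv = np.linalg.pinv(W)                  # computed once, up front
    offs = np.array([A[0, 3], A[1, 3], B[0, 3], B[1, 3]])  # (a4, a8, b4, b8)

    def reconstruct(xl, yl, xr, yr):
        q = np.array([xl, yl, xr, yr]) - offs
        return W_pinv @ q
    return reconstruct
```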


Recap on 3D Reconstruction

3D reconstruction involves the triangulation of 3D rays.

Full perspective 3D reconstruction

Weak perspective 3D reconstruction
• Computationally cheaper – compute the pseudo-inverse only once


Example Application of 3D Reconstruction

Comparing the Matrix movie's super slow-motion technology and Kanade's Virtualized Reality:

Matrix movie's technology
• Does not use stereo vision
• Needs a lot of cameras
• Trajectory of the slow-mo sequence is fixed

Virtualized Reality
• Based on 3D stereo vision
• Needs fewer cameras
• No constraint on the trajectory


Example Application of 3D Reconstruction

Virtualized Reality (Kanade et al.)



Practical Overview of a Simplistic 3D Reconstruction Process

1. Fix the pose of your cameras, e.g. by placing them on tripods
2. Using a calibration chart:
• Calibrate your cameras using known correspondences of 3D points to 2D image points
• Establish the epipolar geometry by computing the Fundamental Matrix using known correspondences between 2D image points
3. For an arbitrary scene in front of your fixed cameras, your system should automatically:
a. Find corresponding image points between the camera images
b. Compute 3D coordinates for each correspondence


Summary of 3D Stereo Vision

Representation of 3D data
Parallax
2D triangulation
Appearance-based matching
Feature-based matching
Epipolar geometry
3D reconstruction
