StereoVision_1 - Nanyang Technological University

School of Computer Engineering

**Nanyang** **Technological** **University**

3D Stereo Vision

SC437

Computer Vision and Image Processing

TJ Cham

2002-03 / S2

Contents of 3D Stereo Vision

Representation of 3D data

Parallax

2D triangulation

Appearance-based matching

Feature-based matching

Epipolar geometry

3D reconstruction


Range Sensing

Radars and Sonars

Time of flight for EM / acoustic pulses

Focusing / Defocusing

Mapping from focal length to focused distance

Projector-Camera Triangulation

Projected light pattern or laser on surface,

measured by camera

Stereo Cameras

Measures light intensities from scene passively


Why Use Stereo Vision?

Inexpensive


Cameras are inexpensive and getting cheaper

Low-powered, non-intrusive

Does not project light or radio waves

Able to handle moving objects

Does not require that objects are static

Directly measure object radiance

Useful for graphics and visualization


Representation of Raw Range Data

Cloud of 3D Points


Explicit (X, Y, Z) data per point

Useful for 3D visualization

Images © Point Grey Research


Representation of Raw Range Data

Depth Maps


Each pixel has associated depth

• measured along ray from projection center

Easier to associate with original image intensity

Images © Point Grey Research

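The two raw representations are interchangeable once the camera model is known. The sketch below converts a depth map into a point cloud, assuming an ideal pinhole camera with focal length `f` in pixels and principal point `(cx, cy)`; these parameter names and the use of `None` for missing measurements are assumptions of this example, not part of the lecture.

```python
import math

def depth_map_to_points(depth, f, cx, cy):
    """Convert a depth map (depth measured along the ray from the
    projection center) into a list of (X, Y, Z) camera-frame points."""
    points = []
    for row, line in enumerate(depth):
        for col, d in enumerate(line):
            if d is None:              # no range measurement at this pixel
                continue
            # Direction of the ray through this pixel (pinhole model).
            x, y = col - cx, row - cy
            norm = math.sqrt(x * x + y * y + f * f)
            points.append((d * x / norm, d * y / norm, d * f / norm))
    return points

depth = [[5.0, None],
         [5.0, 5.0]]
cloud = depth_map_to_points(depth, f=1.0, cx=0.0, cy=0.0)
```

Every returned point lies at distance `d` from the projection center, which matches the "measured along ray" definition of depth used here.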

Fitting 3D Models to Range Data

We can fit 3D models to raw range data for


Reducing the amount of data storage

Removing noise

Models can be

Simple – e.g. planar patches, B-spline meshes

Complex – e.g. deformable object models


3D Model Example

3D reconstruction of temple (Van Gool et al.)

3D point cloud converted to triangular meshes and

texture-mapped


Parallax

Parallax refers to the observed change in

relative angular displacement of image

points across different camera views

Parallax results from the depth differences

in corresponding 3D points


Parallax from X-Y Translation

Parallax results when camera is translated

parallel to image plane

[Figure: projection centers O1 and O2; relative angle reversed]

Parallax from Z Translation

Parallax results when camera is moved

along optical axis

[Figure: projection centers O1 and O2; relative angle increased]

No Parallax from Zooming

No parallax from increasing focal length

(zooming)

[Figure: projection center O1; no change in relative angle]

No Parallax from Pure Rotation

No parallax when rotating about projection

center

[Figure: projection center O1; no change in relative angle]

Depth from Parallax

3D depth of points in the scene can be inferred

from parallax

We can recover depth when projection centers are

displaced:


Translating the camera in any direction

Using multiple cameras in different locations

We cannot recover depth when projection centers

are at the same location:

Changing focal length (zooming)

Rotating camera about projection center


Simple 2D Triangulation

Assume a 1D camera translated by T along the x-direction in the camera frame

[Figure: left and right projection centers $O_l$ and $O_r$ separated by baseline $T$; focal length $f$; scene point at depth $Z$ with lateral offsets $X_l$ and $-X_r$; image disparity $x_l - x_r$]

Simple 2D Triangulation

Perspective camera equations

$$x_l = f\,\frac{X_l}{Z}, \qquad x_r = f\,\frac{X_r}{Z}$$

Subtraction

$$x_l - x_r = f\,\frac{X_l - X_r}{Z}$$

But based on geometry of diagram, we have

$$X_l - X_r = T$$

Simple 2D Triangulation

Depth from simple triangulation

$$Z = f\,\frac{T}{x_l - x_r}$$

T is known as the baseline distance

Since camera translation is only in x direction, image points in both cameras have same pixel row:

$$y_l = y_r = f\,\frac{Y}{Z}$$
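As a small worked example of depth from simple triangulation, the sketch below evaluates Z = f T / (x_l − x_r) for one correspondence. The function name and the numeric values are illustrative assumptions; the image coordinates were generated by projecting a point at depth 2 with f = 1 and T = 0.2.

```python
def depth_from_disparity(f, T, x_l, x_r):
    """Depth Z = f * T / (x_l - x_r) for a camera translated by T along x."""
    d = x_l - x_r
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return f * T / d

# A point with X_l = 0.5 at depth Z = 2 projects to x_l = 0.25;
# in the right camera X_r = X_l - T = 0.3, so x_r = 0.15.
z = depth_from_disparity(f=1.0, T=0.2, x_l=0.25, x_r=0.15)
```

Up to floating-point rounding, `z` recovers the depth of 2 that was used to generate the two image coordinates.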

Image Disparity from 2 Cameras

Depth Z is inversely proportional to the image disparity $d = x_l - x_r$

For each pixel in the left camera, we can measure the disparity after finding the corresponding image point in the right camera

If the focal length and baseline distance are unknown, we can represent range data using a disparity map instead of a depth map

Relation of Baseline to Accuracy

Let disparity $d = x_l - x_r$, then

$$Z = f\,\frac{T}{d} \;\Rightarrow\; d = \frac{fT}{Z}$$

If estimation of d has an error δd, the corresponding error in Z (in magnitude) is

$$\delta Z = f\,\frac{T}{d^2}\,\delta d = f\,\frac{T}{(fT/Z)^2}\,\delta d = \frac{Z^2}{fT}\,\delta d$$
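To make the error formula concrete, the sketch below compares the first-order prediction Z²/(fT)·δd with the depth error actually obtained when the disparity is perturbed by δd. The function names and the chosen values of f, Z and δd are assumptions for illustration.

```python
def depth(f, T, d):
    """Depth from disparity, Z = f * T / d."""
    return f * T / d

def depth_error(f, T, Z, delta_d):
    """First-order magnitude of the depth error: Z^2 / (f * T) * |delta_d|."""
    return Z * Z / (f * T) * abs(delta_d)

f, Z, delta_d = 1.0, 2.0, 0.001
for T in (0.05, 0.1, 0.2, 0.4):
    d = f * T / Z                                   # true disparity at this baseline
    predicted = depth_error(f, T, Z, delta_d)
    actual = abs(depth(f, T, d - delta_d) - Z)      # error from a perturbed disparity
    print(f"T={T:.2f}  predicted={predicted:.4f}  actual={actual:.4f}")
```

For a fixed depth and disparity error, halving the baseline doubles the predicted depth error, which is the 1/T dependence in the formula.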

Relation of Baseline to Accuracy

As cameras become closer together, baseline distance T is reduced

$$\text{As } T \to 0, \qquad \delta Z = \lim_{T \to 0} \frac{Z^2}{fT}\,\delta d \to \infty$$

Estimation for Z is unstable as cameras get infinitesimally close together

Larger baseline is better for estimation accuracy

Recap on Parallax and 1D Triangulation

Parallax / no parallax


Parallax when projection centers are displaced

Depth from 2D triangulation

$Z = fT / (x_l - x_r)$

Disparity map versus depth map

Depth estimation accuracy

Relation of Z estimation to baseline distance T

Larger T gives more accurate Z estimation


The Correspondence Problem

The triangulation covered earlier assumes

that image points corresponding to the same

3D point are known

The correspondence problem:


For each important image point in the first

image, we need to find the corresponding image

point in the second image


Matching Image Points

Appearance-based Matching

Match points with similar appearances across

two images

Feature-based Matching

Match similar features across two images

Features = edges, corners, etc.

Geometric Constraints

Reduce ambiguity

Limit amount of search


Occlusion and De-Occlusion

Occlusion

Some 3D points visible in first image are

hidden in second image

Implication: Cannot find correspondences for

all points in first image

De-Occlusion

Some 3D points hidden in first image are

visible in second image

Implication: There will be unmatched points in

the second image


Appearance-based Matching

Assumptions


Corresponding points in 2 images have image

patches that look identical

Minimal geometric distortion

Lambertian reflectance

No occlusion

Reasonable for stereo cameras with small

baseline distance


Minimize Sum-of-Squares Difference (SSD)

For a small image patch in the first image,

find corresponding image patch with

smallest intensity SSD in second image

For image I and image patch g of size N by N, find the image location (x,y) given by

$$\arg\min_{(x,y)} \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \bigl( I(x+u,\, y+v) - g(u,v) \bigr)^2$$

Minimize Sum-of-Squares Difference (SSD)

Expanding and eliminating constant terms

$$\arg\max_{(x,y)} \left\{ 2\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} I(x+u,\, y+v)\, g(u,v) \;-\; \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} I^2(x+u,\, y+v)\cdot 1 \right\}$$

This expression in the form of 2 correlations can be computed efficiently via FFT
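The equivalence between minimizing the SSD and maximizing the expanded two-correlation form can be checked on a tiny example. The sketch below uses a brute-force search over all patch positions rather than an FFT, and the image, patch and function names are made up for illustration; `image[y][x]` plays the role of I(x, y) and `patch[v][u]` the role of g(u, v).

```python
def ssd(image, patch, x, y):
    """Sum of squared differences between the patch and the image at (x, y)."""
    n = len(patch)
    return sum((image[y + v][x + u] - patch[v][u]) ** 2
               for v in range(n) for u in range(n))

def score(image, patch, x, y):
    """Expanded form 2*sum(I*g) - sum(I^2), with the constant sum(g^2) dropped."""
    n = len(patch)
    corr = sum(image[y + v][x + u] * patch[v][u]
               for v in range(n) for u in range(n))
    energy = sum(image[y + v][x + u] ** 2
                 for v in range(n) for u in range(n))
    return 2 * corr - energy

def best_match(image, patch, key, pick):
    """Apply pick (min or max) to key over every valid patch position."""
    n = len(patch)
    h, w = len(image), len(image[0])
    candidates = [(x, y) for y in range(h - n + 1) for x in range(w - n + 1)]
    return pick(candidates, key=lambda p: key(image, patch, p[0], p[1]))

image = [
    [1, 2, 3, 4, 5],
    [2, 9, 7, 1, 0],
    [3, 8, 6, 2, 1],
    [4, 0, 1, 3, 2],
]
patch = [[9, 7],
         [8, 6]]

by_ssd = best_match(image, patch, ssd, min)
by_score = best_match(image, patch, score, max)
```

Both searches select the same location because the two objectives differ only by the constant term and a sign.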

Appearance-based Matching in Stereo

Partition left image into small image

patches

For each image patch, find corresponding

image patch in right image


Appearance Matching Example

Point Grey Research DigiClops

Stereo Images

Disparity Map


Matching Localization

Localization of SSD-based matching depends on 'structure' in image patch

Smooth surface

• Localization is poor in all directions

Image gradients in one direction (e.g. edges)

• Localization is poor perpendicular to gradient

direction

Image gradients along multiple directions (e.g.

corners)

• Localization is good


Matching Accuracy Examples

SSD Images


Quantifying Matching Localization

For a particular image patch, we want to

find out how accurately it can be localized

1. Consider image patch at the solution point

(i.e. perfect matching location where SSD=0)

2. Estimate how much SSD increases when

image patch is displaced by small amounts in

different directions

We can use partial derivatives to estimate

the increase in SSD


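Quantifying Matching Localization

Express SSD as a function S(x,y)

2nd order Taylor Series expansion:

$$S(x_0 + \delta x,\; y_0 + \delta y) \approx \underbrace{S_0}_{\text{0th order}} + \underbrace{\begin{bmatrix} \dfrac{\partial S_0}{\partial x} & \dfrac{\partial S_0}{\partial y} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{\text{1st order (gradient)}} + \underbrace{\frac{1}{2} \begin{bmatrix} \delta x & \delta y \end{bmatrix} \begin{bmatrix} \dfrac{\partial^2 S_0}{\partial x^2} & \dfrac{\partial^2 S_0}{\partial x \partial y} \\ \dfrac{\partial^2 S_0}{\partial y \partial x} & \dfrac{\partial^2 S_0}{\partial y^2} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{\text{2nd order (Hessian)}}$$

At solution point,

$$S_0 = 0, \qquad \frac{\partial S_0}{\partial x} = \frac{\partial S_0}{\partial y} = 0$$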

Quantifying Matching Localization

For an image patch g(u,v), the SSD Hessian matrix at the solution point can be expressed as

$$H_0 = \begin{bmatrix} 2\displaystyle\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \left(\frac{\partial g}{\partial x}\right)^2 & 2\displaystyle\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \frac{\partial g}{\partial x}\frac{\partial g}{\partial y} \\[2ex] 2\displaystyle\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \frac{\partial g}{\partial x}\frac{\partial g}{\partial y} & 2\displaystyle\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \left(\frac{\partial g}{\partial y}\right)^2 \end{bmatrix}$$

∂g/∂x and ∂g/∂y are simply the image gradients along the x and y directions, at each pixel (u,v) in the image patch

Cornerness

Cornerness is defined as

$$C = \text{minimum eigenvalue of } H_0 = \min \text{ root } \lambda \text{ of } \left\{ (\lambda - h_{11})(\lambda - h_{22}) - h_{12}^2 = 0 \right\}$$

where $h_{11}$, $h_{12}$ and $h_{22}$ are the entries of $H_0$

The corresponding eigenvector is the direction where the image patch can 'slide' with minimum increase in SSD
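The cornerness measure can be sketched for small patches as follows. Image gradients are approximated here by forward differences, which is an assumption of this example (the lecture does not specify a gradient operator), and the three test patches are made up.

```python
import math

def ssd_hessian(patch):
    """Entries (h11, h12, h22) of H0 = 2 * sum [[gx*gx, gx*gy], [gx*gy, gy*gy]],
    with gx, gy estimated by forward differences."""
    h11 = h12 = h22 = 0.0
    for v in range(len(patch) - 1):
        for u in range(len(patch[0]) - 1):
            gx = patch[v][u + 1] - patch[v][u]
            gy = patch[v + 1][u] - patch[v][u]
            h11 += 2 * gx * gx
            h12 += 2 * gx * gy
            h22 += 2 * gy * gy
    return h11, h12, h22

def cornerness(patch):
    """Smaller root of (lambda - h11)(lambda - h22) - h12^2 = 0."""
    h11, h12, h22 = ssd_hessian(patch)
    mean = (h11 + h22) / 2
    radius = math.sqrt(((h11 - h22) / 2) ** 2 + h12 ** 2)
    return mean - radius

flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]      # smooth surface
edge = [[0, 0, 9], [0, 0, 9], [0, 0, 9]]      # gradient along x only
corner = [[0, 0, 0], [0, 9, 9], [0, 9, 9]]    # gradients along x and y

scores = {name: cornerness(p)
          for name, p in (("flat", flat), ("edge", edge), ("corner", corner))}
```

The flat and edge patches give a cornerness of zero (the patch can slide in at least one direction without changing the SSD), while the corner patch gives a clearly positive value.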

Cornerness Examples

Image Patches:

Plots of $S \approx \tfrac{1}{2} \begin{bmatrix} \delta x & \delta y \end{bmatrix} H_0 \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}$:

$C \approx 10^5$, $C \approx 10^3$, $C \approx 10^3$

Recap on Appearance-Based Matching

The correspondence problem

Occlusion / de-occlusion

Appearance-based matching in stereo images


Minimize sum-of-squares difference (SSD) of image

patches

Expressed as correlations for efficient computation

Matching Localization

Localization depends on structure in image patch

SSD Hessian matrix $H_0$ at solution point

Cornerness C = min eigenvalue of $H_0$

3D Estimation Accuracy versus

Matching Accuracy

Correspondence by appearance matching is

most accurate when


Cameras are close together – small baseline

3D estimation is most accurate when

Cameras are far apart – large baseline

In real 3D stereo applications, need to find

reasonable tradeoff


Feature-based Matching

Want to match image points across large

differences in viewpoints

Extract image points with local properties that are

approximately invariant to larger changes of

viewpoint

Need to select feature points = 'special' points

Note: 3D reconstruction is more sparse

Example features and properties:

Corners – angle of corner

Curves – maximum radius of curvature

Properties depend on types of viewpoint changes


Feature-based Matching

Match feature points across images directly by

finding similar properties

Example 1:


Match corner points in two images with the same angle

Note: corner points can have very different orientation


Feature-based Matching Example

Example 2:

Finding symmetrical

contours

Cannot use SSD directly

Need to use feature points


• E.g. points of maximum

curvature

• Match using curvatures


Image ©TJ Cham


Feature Matching Heuristics

Even with feature properties, there may be

multiple candidates

Heuristics may be used to try and select the

correct solution

Note: heuristics are empirical rules of thumb and may not always give the right solution

Feature Matching Heuristics Examples

Proximity – match feature point to candidate with

nearest (x,y) position in second image

Ordering – feature points on a contour in first

image must match candidates on a contour in the

second image in the same order

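The proximity heuristic can be sketched in a few lines. The function name and the coordinates below are hypothetical; the candidates stand for feature points in the second image whose properties already match the feature in the first image.

```python
def match_by_proximity(point, candidates):
    """Pick the candidate in the second image nearest to the feature's (x, y)."""
    px, py = point
    return min(candidates, key=lambda c: (c[0] - px) ** 2 + (c[1] - py) ** 2)

feature = (40, 25)                          # feature point in the first image
candidates = [(10, 80), (43, 27), (70, 24)] # candidates in the second image
chosen = match_by_proximity(feature, candidates)
```

As with any heuristic, the nearest candidate is not guaranteed to be the true correspondence; it is only the most plausible one when the viewpoint change is small.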

Recap on Feature-based Matching

Advantage


allows greater baseline / larger variation in viewpoints

Method

Extract feature points with properties which are

invariant to viewpoint changes

Match feature points with same properties across

images

Matching may require heuristics to help find

correct solution

E.g. proximity, ordering


Geometric Search Constraint

A very powerful search constraint is

available if we know how cameras are

related to each other

Given an image point in the first image,

consider the ray R from the projection

center passing through that image point

The ray R must intersect the 3D point

The corresponding image point in the second

image must lie on the projection of the ray R in

the second view


Geometric Search Constraint

[Figure: ray R from projection center $O_l$ through the image point, projected into the view of $O_r$]

Do not have to conduct 2D search over entire image

Instead, conduct 1D search only along projection of R in second image

Stereo Rig for Epipolar Geometry

[Figure: stereo rig with projection centers $O_l$ and $O_r$, showing the epipolar lines and the epipoles]

Epipoles and Epipolar Lines

The epipole in view 2 is the image point of the

projection center of camera 1

An epipolar line in view 2 is the image line of a

ray intersecting the projection center in camera 1

Each pixel in view 1 has a corresponding epipolar

line in view 2, and vice versa

Epipolar lines always intersect the epipole

There is only 1 epipole per image, but an infinite family of epipolar lines

Epipoles and Epipolar Lines Examples

Epipolar lines in left image, based on points in the

right image

Epipolar lines intersect at the epipole which is outside

the image in this case


Images © T. de Margerie


Epipolar Geometry Derivation

[Figure: projection centers $O_l$ and $O_r$ separated by translation $T$; the 3D point is $P_l$ as seen from $O_l$, and $P_l - T$ is the vector from $O_r$ to the point]

$P_l$ is expressed w.r.t. the $O_l$ reference frame

$$P_r = R^{-1}(P_l - T)$$

$P_r$ is expressed w.r.t. the $O_r$ reference frame

Epipolar Geometry Derivation

$P_l$, $T$ and $(P_l - T)$ lie in the same plane

Express this as

$$(P_l \times T) \cdot (P_l - T) = 0$$

$$(P_l \times T)^T (P_l - T) = 0$$

In right frame $P_r = R^{-1}(P_l - T)$

$$(P_l \times T)^T R P_r = 0$$

Epipolar Geometry Derivation

We can also express cross-product as matrix multiplication!

$$\begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix} \times \begin{bmatrix} T_X \\ T_Y \\ T_Z \end{bmatrix} = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix} \begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix}$$

Set

$$H = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix}$$
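The identity between the cross product and the matrix form can be checked numerically. The sketch below uses arbitrary example vectors and hypothetical helper names.

```python
def cross(a, b):
    """Ordinary 3D cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def cross_matrix(t):
    """Matrix H such that H P equals P x T for every P."""
    tx, ty, tz = t
    return [[0.0, tz, -ty],
            [-tz, 0.0, tx],
            [ty, -tx, 0.0]]

def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

P_l = (1.0, 2.0, 3.0)
T = (0.5, -1.0, 2.0)
lhs = cross(P_l, T)                   # P_l x T
rhs = mat_vec(cross_matrix(T), P_l)   # H P_l
```

Both `lhs` and `rhs` evaluate to the same vector, confirming that multiplying by H reproduces the cross product with T.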

Epipolar Geometry Derivation

Substitute cross-product

$$(H P_l)^T R P_r = 0$$

$$P_l^T H^T R P_r = 0$$

2D image points are related to 3D points via

$$p_l = \frac{f_l}{Z_l} P_l, \qquad p_r = \frac{f_r}{Z_r} P_r$$

Essential Matrix Equation

Combining,

$$p_l^T \left( \frac{Z_l}{f_l}\, H^T R\, \frac{Z_r}{f_r} \right) p_r = 0$$

$$p_l^T E\, p_r = 0$$

where E is the 3x3 Essential Matrix

E is defined only up to a scale factor, and is usually scaled such that E(3,3)=1
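The essential matrix equation can be verified on a synthetic stereo rig. In the sketch below the rotation, translation, focal lengths and 3D point are all made-up values, and E is taken as HᵀR without the optional rescaling to E(3,3)=1.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical rig: right camera rotated 10 degrees about the y-axis
# and displaced by T, both expressed in the left camera frame.
angle = math.radians(10.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
T = [0.3, 0.05, 0.02]
H = [[0.0, T[2], -T[1]],
     [-T[2], 0.0, T[0]],
     [T[1], -T[0], 0.0]]
E = mat_mul(transpose(H), R)          # essential matrix, up to scale

f_l = f_r = 1.0
P_l = [0.4, -0.2, 5.0]                                       # point in the left frame
P_r = mat_vec(transpose(R), [p - t for p, t in zip(P_l, T)]) # R^-1 = R^T for a rotation
p_l = [f_l * c / P_l[2] for c in P_l]                        # image point, left
p_r = [f_r * c / P_r[2] for c in P_r]                        # image point, right

residual = dot(p_l, mat_vec(E, p_r))  # p_l^T E p_r
```

The residual vanishes up to floating-point rounding, as the derivation predicts for any 3D point visible in both views.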

Fundamental Matrix Equation

Affine transformations of image points

$$p'_l = M_l\, p_l, \qquad p'_r = M_r\, p_r$$

$$p'^{T}_l \left( M_l^{-1} \right)^T E\, M_r^{-1}\, p'_r = 0$$

$$p'^{T}_l F\, p'_r = 0$$

where F is the 3x3 Fundamental Matrix

F is also defined only up to a scale factor, and is usually scaled such that F(3,3)=1

Using Epipolar Geometry

Writing the equation in full, we have

$$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$$

We can express as

$$\begin{bmatrix} f_1 x_l + f_4 y_l + f_7 & f_2 x_l + f_5 y_l + f_8 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \end{bmatrix} = -(f_3 x_l + f_6 y_l + 1)$$

If we have specific values for $(x_l, y_l)$, this equation is a straight line in the right image!
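Computing the epipolar line for a given left-image point amounts to one vector-matrix product. The fundamental matrix below is a made-up example scaled so that its last entry is 1, and the function name is hypothetical.

```python
def epipolar_line(F, x_l, y_l):
    """Coefficients (a, b, c) of the line a*x_r + b*y_r + c = 0 in the
    right image, obtained as the row vector [x_l, y_l, 1] F."""
    p_l = (x_l, y_l, 1.0)
    return tuple(sum(p_l[i] * F[i][j] for i in range(3)) for j in range(3))

# Hypothetical fundamental matrix with F[2][2] == 1.
F = [[0.0, -0.5, 0.2],
     [0.5, 0.0, -1.0],
     [-0.1, 1.5, 1.0]]

a, b, c = epipolar_line(F, x_l=2.0, y_l=1.0)

# Any right-image point on the line satisfies the epipolar constraint.
x_r = 1.0
y_r = -(a * x_r + c) / b
constraint = sum(p * sum(F[i][j] * q for j, q in enumerate((x_r, y_r, 1.0)))
                 for i, p in enumerate((2.0, 1.0, 1.0)))
```

Here `a` and `b` are the two bracketed terms on the left of the line equation and `c` is the negated right-hand side, so `constraint` evaluates to zero up to rounding.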

Using Epipolar Geometry

If we know the Fundamental matrix

1. For any point $(x_l, y_l)$ in one image, we know the epipolar line in the other image

2. We only need to search for the corresponding point along the epipolar line

Example:


Images ©Robotvis INRIA


Estimating Fundamental Matrix

How do we compute the Fundamental Matrix?

Obtain directly from projection matrices


• if cameras are pre-calibrated

• Difficult to derive and not covered in lectures

or …

Obtain from 8 known image correspondences

• 1 correspondence means a specific $(x_l, y_l)$, $(x_r, y_r)$ pair

• Can be used with uncalibrated cameras

• Knowing 1 correspondence provides 1 constraint to the equation $p_l^T F p_r = 0$

• Knowing 8 correspondences: able to solve for 8 variables in F

Estimating Fundamental Matrix

The Fundamental Matrix equation in full

$$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$$

We can rewrite as

$$\underbrace{\begin{bmatrix} x_l x_r & x_l y_r & x_l & y_l x_r & y_l y_r & y_l & x_r & y_r \end{bmatrix}}_{\text{known values from 1 correspondence}} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \\ f_8 \end{bmatrix}}_{\text{unknown values to be computed}} = -1$$

Estimating Fundamental Matrix

Using 8 known correspondences, we get

$$\underbrace{\begin{bmatrix} x_{l1}x_{r1} & x_{l1}y_{r1} & x_{l1} & y_{l1}x_{r1} & y_{l1}y_{r1} & y_{l1} & x_{r1} & y_{r1} \\ x_{l2}x_{r2} & x_{l2}y_{r2} & x_{l2} & y_{l2}x_{r2} & y_{l2}y_{r2} & y_{l2} & x_{r2} & y_{r2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_{l8}x_{r8} & x_{l8}y_{r8} & x_{l8} & y_{l8}x_{r8} & y_{l8}y_{r8} & y_{l8} & x_{r8} & y_{r8} \end{bmatrix}}_{B\ (8 \times 8)} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \\ f_8 \end{bmatrix}}_{f\ (8 \times 1)} = \begin{bmatrix} -1 \\ -1 \\ -1 \\ -1 \\ -1 \\ -1 \\ -1 \\ -1 \end{bmatrix}$$

Estimating Fundamental Matrix

We can then solve for the elements of the Fundamental Matrix

$$f = B^{-1} \begin{bmatrix} -1 \\ \vdots \\ -1 \end{bmatrix}$$

With n>8 point correspondences, B is n×8 and we have to use the pseudo-inverse
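The linear solve for F from exactly eight correspondences can be sketched as below. This is a minimal illustration under stated assumptions: the "true" matrix and the correspondences are synthetic, the Gaussian elimination helper is written out to keep the example self-contained, and no coordinate normalization or handling of noisy data is attempted.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def estimate_fundamental(pairs):
    """pairs: eight ((x_l, y_l), (x_r, y_r)) correspondences.
    Returns a 3x3 matrix F with F[2][2] fixed to 1."""
    B = [[xl * xr, xl * yr, xl, yl * xr, yl * yr, yl, xr, yr]
         for (xl, yl), (xr, yr) in pairs]
    f = solve_linear(B, [-1.0] * len(pairs))
    return [f[0:3], f[3:6], f[6:8] + [1.0]]

# Synthetic correspondences generated from a hypothetical F_true.
F_true = [[0.2, 0.1, -0.3],
          [-0.4, 0.2, 0.5],
          [0.3, 1.0, 1.0]]
rng = random.Random(0)
pairs = []
for _ in range(8):
    xl, yl, xr = rng.uniform(0, 1), rng.uniform(0, 1), rng.uniform(0, 1)
    a = F_true[0][0] * xl + F_true[1][0] * yl + F_true[2][0]
    b = F_true[0][1] * xl + F_true[1][1] * yl + F_true[2][1]
    c = F_true[0][2] * xl + F_true[1][2] * yl + F_true[2][2]
    pairs.append(((xl, yl), (xr, -(a * xr + c) / b)))  # y_r lies on the epipolar line

F_est = estimate_fundamental(pairs)
```

With more than eight correspondences the same matrix B becomes n×8 and the square solve would be replaced by a least-squares (pseudo-inverse) solve.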

Recap on Epipolar Geometry

Epipolar geometry

Epipoles – image point of projection center

Epipolar lines – projection of 3D rays from projection

center

Epipolar equations

Essential matrix – image points in camera reference frame

Fundamental matrix – $p_l^T F p_r = 0$ – covers additional affine transformation of image points

Using epipolar geometry

Search only along epipolar line to find corresponding point

Estimating fundamental matrix

Use n ≥ 8 known image correspondences

3D Reconstruction from Perspective Stereo

3D Reconstruction by triangulation with calibrated

cameras

for each image point in both cameras, we know the

associated 3D ray

Find intersection of rays from corresponding image

points

[Figure: rays from projection centers $O_l$ and $O_r$ through corresponding image points, intersecting at the 3D point]

3D Reconstruction from Perspective Stereo

Do the triangulation in matrix form

Assume we have projection matrices from camera calibration

L camera:

$$\begin{bmatrix} k x_l \\ k y_l \\ k \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ a_9 & a_{10} & a_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

R camera:

$$\begin{bmatrix} m x_r \\ m y_r \\ m \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ b_9 & b_{10} & b_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

Except now the unknowns are only X, Y, Z!

3D Reconstruction from Perspective Stereo

Rewrite for left camera

$$x_l = \frac{a_1 X + a_2 Y + a_3 Z + a_4}{a_9 X + a_{10} Y + a_{11} Z + 1}, \qquad y_l = \frac{a_5 X + a_6 Y + a_7 Z + a_8}{a_9 X + a_{10} Y + a_{11} Z + 1}$$

$$(a_9 x_l - a_1) X + (a_{10} x_l - a_2) Y + (a_{11} x_l - a_3) Z = a_4 - x_l$$

$$(a_9 y_l - a_5) X + (a_{10} y_l - a_6) Y + (a_{11} y_l - a_7) Z = a_8 - y_l$$

Similarly for right camera

3D Reconstruction from Perspective Stereo

Hence with a pair of image correspondences $(x_l, y_l)$ and $(x_r, y_r)$, we can get

$$\underbrace{\begin{bmatrix} a_9 x_l - a_1 & a_{10} x_l - a_2 & a_{11} x_l - a_3 \\ a_9 y_l - a_5 & a_{10} y_l - a_6 & a_{11} y_l - a_7 \\ b_9 x_r - b_1 & b_{10} x_r - b_2 & b_{11} x_r - b_3 \\ b_9 y_r - b_5 & b_{10} y_r - b_6 & b_{11} y_r - b_7 \end{bmatrix}}_{W} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \underbrace{\begin{bmatrix} a_4 - x_l \\ a_8 - y_l \\ b_4 - x_r \\ b_8 - y_r \end{bmatrix}}_{q}$$

Compute 3D world coordinates X, Y and Z via pseudo-inverse

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = W^{+} q = (W^T W)^{-1} W^T q$$

Computationally expensive – compute new pseudo-inverse for each pair of image correspondences
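The perspective triangulation can be sketched by building W and q for one correspondence and solving the normal equations (WᵀW)[X, Y, Z]ᵀ = Wᵀq. The two projection matrices below are hypothetical, normalized so that their last entry is 1, and the 3×3 system is solved with Cramer's rule to keep the example self-contained.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(A)
    result = []
    for k in range(3):
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        result.append(det(Ak) / d)
    return result

def project(P, X, Y, Z):
    """Perspective projection with a 3x4 matrix P whose entry P[2][3] is 1."""
    k = [sum(row[j] * c for j, c in enumerate((X, Y, Z, 1.0))) for row in P]
    return k[0] / k[2], k[1] / k[2]

def triangulate(A, B, left, right):
    """Recover (X, Y, Z) from one correspondence via (W^T W)^-1 W^T q."""
    W, q = [], []
    for P, (x, y) in ((A, left), (B, right)):
        W.append([P[2][j] * x - P[0][j] for j in range(3)])
        q.append(P[0][3] - x)
        W.append([P[2][j] * y - P[1][j] for j in range(3)])
        q.append(P[1][3] - y)
    WtW = [[sum(W[r][i] * W[r][j] for r in range(4)) for j in range(3)]
           for i in range(3)]
    Wtq = [sum(W[r][i] * q[r] for r in range(4)) for i in range(3)]
    return solve3(WtW, Wtq)

# Hypothetical calibrated projection matrices for the left and right cameras.
A = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.2, 1.0]]
B = [[1.0, 0.0, 0.1, -0.5], [0.0, 1.0, 0.0, 0.0], [0.05, 0.0, 0.2, 1.0]]

point = (1.0, 2.0, 5.0)
recovered = triangulate(A, B, project(A, *point), project(B, *point))
```

Because W depends on the image coordinates, it has to be rebuilt and re-solved for every correspondence, which is the cost noted above.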

3D Reconstruction from Weak-Perspective Stereo

Assume weak-perspective projection matrices

L camera:

$$\begin{bmatrix} x_l \\ y_l \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

R camera:

$$\begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

Note: last rows of matrix equations are redundant!

3D Reconstruction from Weak-Perspective Stereo

Eliminate last rows and combine equations

$$\begin{bmatrix} a_1 & a_2 & a_3 \\ a_5 & a_6 & a_7 \\ b_1 & b_2 & b_3 \\ b_5 & b_6 & b_7 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} x_l - a_4 \\ y_l - a_8 \\ x_r - b_4 \\ y_r - b_8 \end{bmatrix}$$

Again, solve by pseudo-inverse

However, pseudo-inverse is dependent only on the projection matrices and not on individual image points

need only be computed once for 3D reconstruction of all image correspondences
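The computational advantage of the weak-perspective case can be sketched by computing the pseudo-inverse once and reusing it for several correspondences. The projection matrices and test points below are hypothetical, and the pseudo-inverse is formed as (MᵀM)⁻¹Mᵀ, which assumes the 4×3 matrix M has full column rank.

```python
def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def pseudo_inverse(M):
    """(M^T M)^-1 M^T for a 4x3 matrix M of full column rank."""
    MtM = [[sum(M[r][i] * M[r][j] for r in range(4)) for j in range(3)]
           for i in range(3)]
    inv = inverse3(MtM)
    return [[sum(inv[i][k] * M[r][k] for k in range(3)) for r in range(4)]
            for i in range(3)]

def project(P, X, Y, Z):
    """Weak-perspective projection using the first two rows of the matrix."""
    return tuple(row[0] * X + row[1] * Y + row[2] * Z + row[3] for row in P)

# Hypothetical weak-perspective matrices (redundant last rows omitted).
A = [[1.0, 0.0, 0.0, 0.1], [0.0, 1.0, 0.0, 0.2]]
B = [[0.8, 0.0, 0.6, -0.3], [0.0, 1.0, 0.0, 0.2]]

M = [row[:3] for row in A + B]
M_pinv = pseudo_inverse(M)            # computed once for the whole rig

def reconstruct(left, right):
    """Recover (X, Y, Z) for one correspondence using the stored pseudo-inverse."""
    rhs = [left[0] - A[0][3], left[1] - A[1][3],
           right[0] - B[0][3], right[1] - B[1][3]]
    return [sum(M_pinv[i][r] * rhs[r] for r in range(4)) for i in range(3)]

points = [(1.0, 2.0, 3.0), (-0.5, 0.4, 2.0)]
recovered = [reconstruct(project(A, *p), project(B, *p)) for p in points]
```

Each additional correspondence costs only one 3×4 matrix-vector product, in contrast to the full-perspective case where the system matrix changes with every image point.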

Recap on 3D Reconstruction

3D reconstruction involves triangulation of

3D rays

Full perspective 3D reconstruction

Weak perspective 3D reconstruction


Computationally cheaper – compute pseudo-inverse only once

Example Application of 3D Reconstruction

Comparing Matrix movie’s super slow-motion

technology and Kanade’s Virtualized Reality

Matrix movie’s technology

• Does not use stereo vision

• Needs a lot of cameras

• Trajectory of slow-mo sequence fixed

Virtualized Reality

• Based on 3D stereo vision

• Needs fewer cameras

• No constraint on trajectory

Example Application of 3D Reconstruction

Virtualized Reality (Kanade et al.)


Practical Overview of Simplistic 3D

Reconstruction Process

1. Fix the pose of your cameras, e.g. by placing them

on tripods

2. Using a calibration chart,

Calibrate your cameras using known correspondences of

3D points to 2D image points

Establish epipolar geometry by computing Fundamental

Matrix using known correspondences between 2D image

points

3. For an arbitrary scene in front of your fixed

cameras, your system should automatically:

a. Find corresponding image points between camera images

b. Compute 3D coordinates for each correspondence


Summary of 3D Stereo Vision

Representation of 3D data

Parallax

2D triangulation

Appearance-based matching

Feature-based matching

Epipolar geometry

3D reconstruction
