
School of Computer Engineering
Nanyang Technological University

3D Stereo Vision

SC437 Computer Vision and Image Processing
TJ Cham
2002-03 / S2


Contents of 3D Stereo Vision

Representation of 3D data
Parallax
2D triangulation
Appearance-based matching
Feature-based matching
Epipolar geometry
3D reconstruction



Range Sensing

Radars and Sonars
• Time of flight for EM / acoustic pulses

Focusing / Defocusing
• Mapping from focal length to focused distance

Projector-Camera Triangulation
• Projected light pattern or laser on the surface, measured by a camera

Stereo Cameras
• Passively measure light intensities from the scene



Why Use Stereo Vision?

Inexpensive
• Cameras are inexpensive and getting cheaper

Low-powered, non-intrusive
• Does not project light or radio waves

Able to handle moving objects
• Does not require that objects are static

Directly measures object radiance
• Useful for graphics and visualization



Representation of Raw Range Data

Cloud of 3D Points
• Explicit (X, Y, Z) data per point
• Useful for 3D visualization

Images © Point Grey Research



Representation of Raw Range Data

Depth Maps
• Each pixel has an associated depth, measured along the ray from the projection center
• Easier to associate with the original image intensity

Images © Point Grey Research



Fitting 3D Models to Range Data

We can fit 3D models to raw range data for
• Reducing the amount of data storage
• Removing noise

Models can be
• Simple – e.g. planar patches, B-spline meshes
• Complex – e.g. deformable object models



3D Model Example

3D reconstruction of a temple (Van Gool et al.)
• 3D point cloud converted to triangular meshes and texture-mapped



Parallax

Parallax refers to the observed change in the relative angular displacement of image points across different camera views.

Parallax results from the depth differences of the corresponding 3D points.



Parallax from X-Y Translation

Parallax results when the camera is translated parallel to the image plane.

[Figure: the relative angle between two image points is reversed between camera positions O1 and O2]



Parallax from Z Translation

Parallax results when the camera is moved along the optical axis.

[Figure: the relative angle between two image points increases as the camera moves from O1 to O2]


No Parallax from Zooming

No parallax results from increasing the focal length (zooming).

[Figure: no change in the relative angle between image points; the projection center O1 stays fixed]


No Parallax from Pure Rotation

No parallax results when rotating about the projection center.

[Figure: no change in the relative angle between image points; the projection center O1 stays fixed]


Depth from Parallax

The 3D depth of points in the scene can be inferred from parallax.

We can recover depth when the projection centers are displaced:
• Translating the camera in any direction
• Having multiple cameras in different locations

We cannot recover depth when the projection centers are at the same location:
• Changing the focal length (zooming)
• Rotating the camera about the projection center


Simple 2D Triangulation

Assume a 1D camera translated by T along the x-direction in the camera frame.

[Figure: projection centers O_l and O_r separated by the baseline T; focal length f; a 3D point at depth Z with lateral offsets X_l and -X_r in the two camera frames; the image coordinates give the disparity x_l - x_r]


Simple 2D Triangulation

Perspective camera equations:

$x_l = f \frac{X_l}{Z}, \qquad x_r = f \frac{X_r}{Z}$

Subtraction ⇒

$x_l - x_r = \frac{f}{Z}(X_l - X_r)$

But based on the geometry of the diagram, we have

$X_l - X_r = T$


Simple 2D Triangulation

Depth from simple triangulation:

$Z = f \frac{T}{x_l - x_r}$

T is known as the baseline distance.

Since the camera translation is only in the x-direction, the image points in both cameras have the same pixel row:

$y_l = y_r = f \frac{Y}{Z}$
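
As a quick sketch (not part of the original slides), the triangulation formula maps directly to code; the function name and the example numbers below are illustrative assumptions:

```python
import numpy as np

def depth_from_disparity(x_l, x_r, f, T):
    """Depth Z = f * T / (x_l - x_r) for corresponding image points.

    x_l, x_r : corresponding x-coordinates (scalars or arrays, in pixels)
    f        : focal length in pixels
    T        : baseline distance (same units as the returned depth)
    """
    d = np.asarray(x_l, dtype=float) - np.asarray(x_r, dtype=float)
    with np.errstate(divide="ignore"):  # zero disparity -> point at infinity
        return f * T / d

# Example: f = 500 px, T = 0.1 m, disparity of 5 px  =>  Z = 10 m
print(depth_from_disparity(105.0, 100.0, f=500.0, T=0.1))
```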


Image Disparity from 2 Cameras

Depth Z is inversely proportional to the image disparity $d = x_l - x_r$.

For each pixel in the left camera, we can measure the disparity after finding the corresponding image point in the right camera.

If the focal length and baseline distance are unknown, we can represent the range data using a disparity map instead of a depth map.


Relation of Baseline to Accuracy

Let the disparity $d = x_l - x_r$, then

$Z = f \frac{T}{d} \;\Rightarrow\; d = \frac{fT}{Z}$

If the estimation of d has an error δd, the corresponding error in Z is

$\delta Z = \frac{fT}{d^2}\,\delta d = \frac{fT}{(fT/Z)^2}\,\delta d = \frac{Z^2}{fT}\,\delta d$
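
A numerical check of the error formula (a sketch; the focal length, depth and half-pixel disparity error below are illustrative assumptions):

```python
def depth_error(Z, f, T, delta_d):
    """Propagated depth error: delta_Z = Z**2 / (f * T) * delta_d."""
    return Z**2 / (f * T) * delta_d

# f = 500 px, depth 10 m, half-pixel disparity error, three baselines:
for T in (0.05, 0.1, 0.5):
    print(f"T = {T:4.2f} m  ->  dZ = {depth_error(10.0, 500.0, T, 0.5):.2f} m")
# Doubling the baseline halves the depth error at a given depth.
```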


Relation of Baseline to Accuracy

As the cameras become closer together, the baseline distance T is reduced.

As $T \to 0$,

$\delta Z = \lim_{T \to 0} \frac{Z^2}{fT}\,\delta d \;\to\; \infty$

Estimation of Z is unstable as the cameras get infinitesimally close together.

A larger baseline is better for estimation accuracy.


Recap on Parallax and 1D Triangulation

Parallax / no parallax
• Parallax occurs when the projection centers are displaced

Depth from 2D triangulation
• $Z = fT / (x_l - x_r)$
• Disparity map ↔ depth map

Depth estimation accuracy
• Relation of Z estimation to the baseline distance T
• Larger T gives more accurate Z estimation


The Correspondence Problem

The triangulation covered earlier assumes that the image points corresponding to the same 3D point are known.

The correspondence problem: for each important image point in the first image, we need to find the corresponding image point in the second image.


Matching Image Points

Appearance-based Matching
• Match points with similar appearances across the two images

Feature-based Matching
• Match similar features across the two images
• Features = edges, corners, etc.

Geometric Constraints
• Reduce ambiguity
• Limit the amount of search


Occlusion and De-occlusion

Occlusion
• Some 3D points visible in the first image are hidden in the second image
• Implication: we cannot find correspondences for all points in the first image

De-occlusion
• Some 3D points hidden in the first image are visible in the second image
• Implication: there will be unmatched points in the second image


Appearance-based Matching

Assumptions
• Corresponding points in the 2 images have image patches that look identical
• Minimal geometric distortion
• Lambertian reflectance
• No occlusion

Reasonable for stereo cameras with a small baseline distance


Minimize Sum-of-Squares Difference (SSD)

For a small image patch in the first image, find the corresponding image patch with the smallest intensity SSD in the second image.

For image I and image patch g of size N by N, find the image location (x, y) given by

$\arg\min_{(x,y)} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \big( I(x+u,\, y+v) - g(u,v) \big)^2$
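
A brute-force sketch of this criterion (the helper name is an assumption; real stereo systems restrict the search region, e.g. to the epipolar line covered later):

```python
import numpy as np

def ssd_match(I, g):
    """Exhaustively find the (x, y) placement of patch g in image I with minimum SSD."""
    N = g.shape[0]
    H, W = I.shape
    best_ssd, best_xy = np.inf, None
    for y in range(H - N + 1):
        for x in range(W - N + 1):
            ssd = np.sum((I[y:y+N, x:x+N].astype(float) - g) ** 2)
            if ssd < best_ssd:
                best_ssd, best_xy = ssd, (x, y)
    return best_xy, best_ssd
```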


Minimize Sum-of-Squares Difference (SSD)

Expanding and eliminating constant terms:

$\arg\max_{(x,y)} \left\{ 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x+u,\, y+v)\, g(u,v) \;-\; \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x+u,\, y+v)^2 \cdot 1 \right\}$

This expression, in the form of 2 correlations, can be computed efficiently via FFT.
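
One way to realise the two correlations via FFT, assuming SciPy is available (`fftconvolve` with a flipped kernel computes correlation):

```python
import numpy as np
from scipy.signal import fftconvolve

def ssd_match_fft(I, g):
    """SSD minimisation rewritten as two correlations, evaluated via FFT."""
    I = I.astype(float)
    flipped = g[::-1, ::-1]  # correlation = convolution with a flipped kernel
    term1 = 2 * fftconvolve(I, flipped, mode="valid")              # 2 * sum I.g
    term2 = fftconvolve(I**2, np.ones(g.shape), mode="valid")      # sum I^2 per window
    score = term1 - term2                                          # maximise this
    y, x = np.unravel_index(np.argmax(score), score.shape)
    return (x, y)
```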


Appearance-based Matching in Stereo

Partition the left image into small image patches.

For each image patch, find the corresponding image patch in the right image.


Appearance Matching Example

Point Grey Research DigiClops

[Figure: stereo images and the resulting disparity map]


Matching Localization

Localization of SSD-based matching depends on the 'structure' in the image patch:

Smooth surface
• Localization is poor in all directions

Image gradients in one direction (e.g. edges)
• Localization is poor perpendicular to the gradient direction

Image gradients along multiple directions (e.g. corners)
• Localization is good


Matching Accuracy Examples

[Figure: SSD images for example patches]


Quantifying Matching Localization

For a particular image patch, we want to find out how accurately it can be localized:

1. Consider the image patch at the solution point (i.e. the perfect matching location where SSD = 0)
2. Estimate how much the SSD increases when the image patch is displaced by small amounts in different directions

We can use partial derivatives to estimate the increase in SSD.


Quantifying Matching Localization

Express the SSD as a function S(x, y). 2nd order Taylor series expansion about the solution point $(x_0, y_0)$:

$S(x_0 + \delta x,\, y_0 + \delta y) \;\approx\; \underbrace{S_0}_{0^{\text{th}}\text{ order}} + \underbrace{\begin{bmatrix} \frac{\partial S_0}{\partial x} & \frac{\partial S_0}{\partial y} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{1^{\text{st}}\text{ order (gradient)}} + \underbrace{\frac{1}{2} \begin{bmatrix} \delta x & \delta y \end{bmatrix} \begin{bmatrix} \frac{\partial^2 S_0}{\partial x^2} & \frac{\partial^2 S_0}{\partial x \partial y} \\ \frac{\partial^2 S_0}{\partial y \partial x} & \frac{\partial^2 S_0}{\partial y^2} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{2^{\text{nd}}\text{ order (Hessian)}}$

At the solution point,

$S_0 = 0, \qquad \frac{\partial S_0}{\partial x} = \frac{\partial S_0}{\partial y} = 0$


Quantifying Matching Localization

For an image patch g(u, v), the SSD Hessian matrix at the solution point can be expressed as

$H_0 = \begin{bmatrix} 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \left( \frac{\partial g}{\partial x} \right)^2 & 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} \\ 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} & 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \left( \frac{\partial g}{\partial y} \right)^2 \end{bmatrix}$

∂g/∂x and ∂g/∂y are simply the image gradients along the x and y directions, at each pixel (u, v) in the image patch.


Cornerness

Cornerness is defined as

$C = \lambda_{\min} = \text{minimum eigenvalue of } H_0 = \text{minimum root } \lambda \text{ of } (\lambda - h_{11})(\lambda - h_{22}) - h_{12}^2 = 0$

The corresponding eigenvector is the direction in which the image patch can 'slide' with the minimum increase in SSD.
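
A minimal sketch of the cornerness measure (the choice of `np.gradient` for the image gradients is an assumption; the slides do not fix a gradient operator):

```python
import numpy as np

def cornerness(g):
    """Minimum eigenvalue of the SSD Hessian H0 for an image patch g."""
    gy, gx = np.gradient(g.astype(float))   # dg/dy, dg/dx at each pixel
    h11 = 2 * np.sum(gx * gx)
    h22 = 2 * np.sum(gy * gy)
    h12 = 2 * np.sum(gx * gy)
    # Minimum root of (lambda - h11)(lambda - h22) - h12^2 = 0:
    half_tr = 0.5 * (h11 + h22)
    det = h11 * h22 - h12 ** 2
    return half_tr - np.sqrt(max(half_tr ** 2 - det, 0.0))
```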


Cornerness Examples

[Figure: three image patches with plots of $S \approx \begin{bmatrix} \delta x & \delta y \end{bmatrix} H_0 \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}$, with cornerness C ≈ 10^5, C ≈ 10^3 and C ≈ 10^3]


Recap on Appearance-based Matching

The correspondence problem
• Occlusion / de-occlusion

Appearance-based matching in stereo images
• Minimize the sum-of-squares difference (SSD) of image patches
• Expressed as correlations for efficient computation

Matching localization
• Localization depends on the structure in the image patch
• SSD Hessian matrix $H_0$ at the solution point
• Cornerness C = minimum eigenvalue of $H_0$


3D Estimation Accuracy versus Matching Accuracy

Correspondence by appearance matching is most accurate when
• Cameras are close together – small baseline

3D estimation is most accurate when
• Cameras are far apart – large baseline

In real 3D stereo applications, we need to find a reasonable tradeoff.


Feature-based Matching

We want to match image points across large differences in viewpoint:
• Extract image points with local properties that are approximately invariant to larger changes of viewpoint
• Need to select feature points = 'special' points
• Note: the 3D reconstruction is more sparse

Example features and properties:
• Corners – angle of the corner
• Curves – maximum radius of curvature
• The properties used depend on the types of viewpoint changes


Feature-based Matching

Match feature points across images directly by finding similar properties.

Example 1:
• Match corner points in the two images with the same angle
• Note: the corner points can have very different orientations


Feature-based Matching Example

Example 2: finding symmetrical contours
• Cannot use SSD directly
• Need to use feature points, e.g. points of maximum curvature
• Match using curvatures

Image © TJ Cham


Feature Matching Heuristics

Even with feature properties, there may be multiple candidates.

Heuristics may be used to try to select the correct solution.

Note: heuristics are empirical rules-of-thumb and may not always give the right solution.


Feature Matching Heuristics Examples

Proximity – match a feature point to the candidate with the nearest (x, y) position in the second image.

Ordering – feature points on a contour in the first image must match candidates on a contour in the second image in the same order.


Recap on Feature-based Matching

Advantage
• Allows a greater baseline / larger variation in viewpoints

Method
• Extract feature points with properties which are invariant to viewpoint changes
• Match feature points with the same properties across images

Matching may require heuristics to help find the correct solution
• E.g. proximity, ordering


Geometric Search Constraint

A very powerful search constraint is available if we know how the cameras are related to each other.

Given an image point in the first image, consider the ray R from the projection center passing through that image point:
• The ray R must intersect the 3D point
• The corresponding image point in the second image must lie on the projection of the ray R in the second view


Geometric Search Constraint

[Figure: ray R from O_l projecting to a line in the image of the second camera at O_r]

We do not have to conduct a 2D search over the entire image. Instead, we conduct a 1D search only along the projection of R in the second image.


Stereo Rig for Epipolar Geometry

[Figure: stereo rig with projection centers O_l and O_r, the epipoles, and the epipolar lines in the two images]


Epipoles and Epipolar Lines

The epipole in view 2 is the image point of the projection center of camera 1.

An epipolar line in view 2 is the image line of a ray intersecting the projection center of camera 1.

Each pixel in view 1 has a corresponding epipolar line in view 2, and vice versa.

Epipolar lines always intersect the epipole.

There is only 1 epipole per image, but an infinite family of epipolar lines.


Epipoles and Epipolar Lines Examples

Epipolar lines in the left image, based on points in the right image
• The epipolar lines intersect at the epipole, which is outside the image in this case

Images © T. de Margerie


Epipolar Geometry Derivation

[Figure: 3D point P viewed from the two projection centers O_l and O_r, which are separated by the translation T; the vectors P_l and P_l − T]

$P_l$ is the 3D point w.r.t. the $O_l$ reference frame.

$P_r = R^{-1}(P_l - T)$ is the same point w.r.t. the $O_r$ reference frame.


Epipolar Geometry Derivation

$P_l$, $T$ and $(P_l - T)$ lie in the same plane. Express this as

$(P_l \times T) \cdot (P_l - T) = 0$

$(P_l \times T)^T (P_l - T) = 0$

In the right frame $P_r = R^{-1}(P_l - T)$ ⇒

$(P_l \times T)^T R\, P_r = 0$


Epipolar Geometry Derivation

We can also express the cross-product as a matrix multiplication! Set

$\begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix} \times \begin{bmatrix} T_X \\ T_Y \\ T_Z \end{bmatrix} = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix} \begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix}, \qquad H = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix}$
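
A one-line sanity check of this identity (a sketch; the helper name is an assumption):

```python
import numpy as np

def cross_matrix(T):
    """H such that P x T == H @ P, following the slide's sign convention."""
    TX, TY, TZ = T
    return np.array([[0.0,  TZ, -TY],
                     [-TZ, 0.0,  TX],
                     [ TY, -TX, 0.0]])

# Verify against numpy's cross product for arbitrary vectors:
P, T = np.array([1.0, 2.0, 3.0]), np.array([0.3, -0.1, 0.5])
assert np.allclose(np.cross(P, T), cross_matrix(T) @ P)
```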


Epipolar Geometry Derivation

Substituting the cross-product ⇒

$(H P_l)^T R\, P_r = 0 \quad\Leftrightarrow\quad P_l^T H^T R\, P_r = 0$

2D image points are related to 3D points via

$p_l = \frac{f_l}{Z_l} P_l, \qquad p_r = \frac{f_r}{Z_r} P_r$


Essential Matrix Equation

Combining,

$p_l^T \left( \frac{Z_l}{f_l} \right) H^T R \left( \frac{Z_r}{f_r} \right) p_r = 0 \;\Rightarrow\; p_l^T E\, p_r = 0$

where $E$ is the 3×3 Essential Matrix.

$E$ is scale-factor independent, and is usually multiplied such that E(3,3) = 1.


Fundamental Matrix Equation

Affine transformations of the image points:

$p'_l = M_l p_l, \qquad p'_r = M_r p_r$

$p_l'^T \left( M_l^{-1} \right)^T E\, M_r^{-1} p'_r = 0 \;\Rightarrow\; p_l'^T F\, p'_r = 0$

where $F$ is the 3×3 Fundamental Matrix.

$F$ is also scale-factor independent, and is usually multiplied such that F(3,3) = 1.


Using Epipolar Geometry

Writing the equation in full, we have

$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$

We can express this as

$(f_1 x_l + f_4 y_l + f_7)\, x_r + (f_2 x_l + f_5 y_l + f_8)\, y_r = -(f_3 x_l + f_6 y_l + 1)$

If we have specific values for $(x_l, y_l)$, this equation is a straight line in the right image!
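
Reading the line coefficients off the equation is a one-liner (a sketch; the F in the example is arbitrary and illustrative, normalized so F(3,3) = 1):

```python
import numpy as np

def epipolar_line(F, xl, yl):
    """Coefficients (a, b, c) of the right-image line a*xr + b*yr + c = 0
    induced by the left-image point (xl, yl) under p_l^T F p_r = 0."""
    return np.array([xl, yl, 1.0]) @ F    # row vector times F

F = np.array([[0.0,   -1e-4,  0.01],
              [1e-4,   0.0,  -0.02],
              [-0.01,  0.03,  1.0]])      # illustrative values only
a, b, c = epipolar_line(F, 120.0, 80.0)
print(a, b, c)
```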


Using Epipolar Geometry

If we know the Fundamental Matrix:
1. For any point (x_l, y_l) in one image, we know the epipolar line in the other image
2. We only need to search for the corresponding point along the epipolar line

Example:

Images © Robotvis INRIA


Estimating Fundamental Matrix

How do we compute the Fundamental Matrix?

Obtain it directly from the projection matrices
• If the cameras are pre-calibrated
• Difficult to derive and not covered in lectures

or …

Obtain it from 8 known image correspondences
• 1 correspondence means a specific (x_l, y_l), (x_r, y_r) pair
• Can be used with uncalibrated cameras
• Knowing 1 correspondence provides 1 constraint on the equation $p_l^T F p_r = 0$
• Knowing 8 correspondences ⇒ able to solve for the 8 variables in F


Estimating Fundamental Matrix

The Fundamental Matrix equation in full:

$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$

We can rewrite this as

$\underbrace{\begin{bmatrix} x_l x_r & x_l y_r & x_l & y_l x_r & y_l y_r & y_l & x_r & y_r \end{bmatrix}}_{\text{known values from 1 correspondence}} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_8 \end{bmatrix}}_{\text{unknown values to be computed}} = -1$


Estimating Fundamental Matrix

Using 8 known correspondences, we get

$\underbrace{\begin{bmatrix} x_{l1} x_{r1} & x_{l1} y_{r1} & x_{l1} & y_{l1} x_{r1} & y_{l1} y_{r1} & y_{l1} & x_{r1} & y_{r1} \\ x_{l2} x_{r2} & x_{l2} y_{r2} & x_{l2} & y_{l2} x_{r2} & y_{l2} y_{r2} & y_{l2} & x_{r2} & y_{r2} \\ \vdots & & & & & & & \vdots \\ x_{l8} x_{r8} & x_{l8} y_{r8} & x_{l8} & y_{l8} x_{r8} & y_{l8} y_{r8} & y_{l8} & x_{r8} & y_{r8} \end{bmatrix}}_{B \;(8 \times 8)} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_8 \end{bmatrix}}_{f \;(8 \times 1)} = \begin{bmatrix} -1 \\ -1 \\ \vdots \\ -1 \end{bmatrix}$
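
A linear sketch of this 8-point solve; the least-squares call also covers the n > 8 case on the next slide (in practice the coordinates would be normalized first, which the slides do not cover):

```python
import numpy as np

def estimate_F(pts_l, pts_r):
    """Solve B f = -1 for the 8 unknowns of F (with F[2,2] fixed to 1).

    pts_l, pts_r : (n, 2) arrays of corresponding (x, y) points, n >= 8.
    """
    xl, yl = pts_l[:, 0], pts_l[:, 1]
    xr, yr = pts_r[:, 0], pts_r[:, 1]
    B = np.column_stack([xl*xr, xl*yr, xl, yl*xr, yl*yr, yl, xr, yr])
    rhs = -np.ones(len(xl))
    f, *_ = np.linalg.lstsq(B, rhs, rcond=None)   # pseudo-inverse solution
    return np.append(f, 1.0).reshape(3, 3)
```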


Estimating Fundamental Matrix

We can then solve for the elements of the Fundamental Matrix:

$f = B^{-1} \begin{bmatrix} -1 \\ \vdots \\ -1 \end{bmatrix}$

With n > 8 point correspondences, B is n×8 and we have to use the pseudo-inverse.


Recap on Epipolar Geometry

Epipolar geometry
• Epipoles – the image point of the other camera's projection center
• Epipolar lines – projections of 3D rays from the projection center

Epipolar equations
• Essential matrix – image points in the camera reference frame
• Fundamental matrix – $p_l^T F p_r = 0$, covers an additional affine transformation of the image points

Using epipolar geometry
• Search only along the epipolar line to find the corresponding point

Estimating the fundamental matrix
• Use at least 8 known image correspondences


3D Reconstruction from Perspective Stereo

3D reconstruction by triangulation with calibrated cameras:
• For each image point in both cameras, we know the associated 3D ray
• Find the intersection of the rays from corresponding image points

[Figure: rays from O_l and O_r intersecting at the 3D point]


3D Reconstruction from Perspective Stereo

Do the triangulation in matrix form. Assume we have the projection matrices from camera calibration:

L camera: $\begin{bmatrix} k x_l \\ k y_l \\ k \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ a_9 & a_{10} & a_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

R camera: $\begin{bmatrix} m x_r \\ m y_r \\ m \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ b_9 & b_{10} & b_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

Except now the unknowns are only X, Y, Z!


3D Reconstruction from Perspective Stereo

Rewrite for the left camera:

$x_l = \frac{a_1 X + a_2 Y + a_3 Z + a_4}{a_9 X + a_{10} Y + a_{11} Z + 1}, \qquad y_l = \frac{a_5 X + a_6 Y + a_7 Z + a_8}{a_9 X + a_{10} Y + a_{11} Z + 1}$

⇒

$(a_9 x_l - a_1) X + (a_{10} x_l - a_2) Y + (a_{11} x_l - a_3) Z = a_4 - x_l$

$(a_9 y_l - a_5) X + (a_{10} y_l - a_6) Y + (a_{11} y_l - a_7) Z = a_8 - y_l$

Similarly for the right camera.


3D Reconstruction from Perspective Stereo

Hence with a pair of image correspondences $(x_l, y_l)$ and $(x_r, y_r)$, we can get

$\underbrace{\begin{bmatrix} a_9 x_l - a_1 & a_{10} x_l - a_2 & a_{11} x_l - a_3 \\ a_9 y_l - a_5 & a_{10} y_l - a_6 & a_{11} y_l - a_7 \\ b_9 x_r - b_1 & b_{10} x_r - b_2 & b_{11} x_r - b_3 \\ b_9 y_r - b_5 & b_{10} y_r - b_6 & b_{11} y_r - b_7 \end{bmatrix}}_{W} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \underbrace{\begin{bmatrix} a_4 - x_l \\ a_8 - y_l \\ b_4 - x_r \\ b_8 - y_r \end{bmatrix}}_{q}$

Compute the 3D world coordinates X, Y and Z via the pseudo-inverse:

$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = W^{+} q = (W^T W)^{-1} W^T q$

Computationally expensive – we compute a new pseudo-inverse for each pair of image correspondences.
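
A compact sketch of the per-correspondence solve, with `np.linalg.pinv` standing in for $(W^T W)^{-1} W^T$ (the helper names are assumptions):

```python
import numpy as np

def triangulate(A, B, pl, pr):
    """Solve W [X Y Z]^T = q for one correspondence, as on the slide.

    A, B   : 3x4 left/right projection matrices, scaled so A[2,3] = B[2,3] = 1.
    pl, pr : (x, y) image points in the left/right camera.
    """
    def rows(P, x, y):
        W = np.array([P[2, :3] * x - P[0, :3],
                      P[2, :3] * y - P[1, :3]])
        q = np.array([P[0, 3] - x, P[1, 3] - y])
        return W, q

    Wl, ql = rows(A, *pl)
    Wr, qr = rows(B, *pr)
    W = np.vstack([Wl, Wr])
    q = np.concatenate([ql, qr])
    return np.linalg.pinv(W) @ q   # least-squares intersection of the two rays
```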


3D Reconstruction from Weak-Perspective Stereo

Assume weak-perspective projection matrices:

L camera: $\begin{bmatrix} x_l \\ y_l \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

R camera: $\begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$

Note: the last rows of the matrix equations are redundant!


3D Reconstruction from Weak-Perspective Stereo

Eliminate the last rows and combine the equations:

$\begin{bmatrix} a_1 & a_2 & a_3 \\ a_5 & a_6 & a_7 \\ b_1 & b_2 & b_3 \\ b_5 & b_6 & b_7 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} x_l - a_4 \\ y_l - a_8 \\ x_r - b_4 \\ y_r - b_8 \end{bmatrix}$

Again, solve by pseudo-inverse.

However, the pseudo-inverse depends only on the projection matrices and not on the individual image points
⇒ it need only be computed once for the 3D reconstruction of all image correspondences.
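
A sketch showing why this is cheaper: the pseudo-inverse is computed once and reused for every correspondence (the helper names are assumptions):

```python
import numpy as np

def weak_perspective_reconstructor(A, B):
    """Precompute the pseudo-inverse once; reuse it for every correspondence.

    A, B : 3x4 weak-perspective projection matrices (last row [0, 0, 0, 1]).
    Returns a function mapping (xl, yl, xr, yr) -> (X, Y, Z).
    """
    W = np.vstack([A[:2, :3], B[:2, :3]])       # rows [a1..a3; a5..a7; b1..b3; b5..b7]
    W_pinv = np.linalg.pinv(W)                  # computed once, up front
    offs = np.array([A[0, 3], A[1, 3], B[0, 3], B[1, 3]])  # (a4, a8, b4, b8)

    def reconstruct(xl, yl, xr, yr):
        q = np.array([xl, yl, xr, yr]) - offs
        return W_pinv @ q
    return reconstruct
```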


Recap on 3D Reconstruction

3D reconstruction involves the triangulation of 3D rays.

Full perspective 3D reconstruction

Weak perspective 3D reconstruction
• Computationally cheaper – compute the pseudo-inverse only once


Example Application of 3D Reconstruction

Comparing the Matrix movie's super slow-motion technology and Kanade's Virtualized Reality:

Matrix movie's technology
• Does not use stereo vision
• Needs a lot of cameras
• Trajectory of the slow-mo sequence is fixed

Virtualized Reality
• Based on 3D stereo vision
• Needs fewer cameras
• No constraint on the trajectory


Example Application of 3D Reconstruction

Virtualized Reality (Kanade et al.)



Practical Overview of a Simplistic 3D Reconstruction Process

1. Fix the pose of your cameras, e.g. by placing them on tripods
2. Using a calibration chart:
• Calibrate your cameras using known correspondences of 3D points to 2D image points
• Establish the epipolar geometry by computing the Fundamental Matrix using known correspondences between 2D image points
3. For an arbitrary scene in front of your fixed cameras, your system should automatically:
a. Find corresponding image points between the camera images
b. Compute 3D coordinates for each correspondence


Summary of 3D Stereo Vision

Representation of 3D data
Parallax
2D triangulation
Appearance-based matching
Feature-based matching
Epipolar geometry
3D reconstruction
