StereoVision_1 - Nanyang Technological University
School of Computer Engineering<br />
<strong>Nanyang</strong> <strong>Technological</strong> <strong>University</strong><br />
3D Stereo Vision<br />
SC437<br />
Computer Vision and Image Processing<br />
TJ Cham<br />
2002-03 / S2
Contents of 3D Stereo Vision<br />
Representation of 3D data<br />
Parallax<br />
2D triangulation<br />
Appearance-based matching<br />
Feature-based matching<br />
Epipolar geometry<br />
3D reconstruction<br />
Range Sensing<br />
Radars and Sonars<br />
Time of flight for EM / acoustic pulses<br />
Focusing / Defocusing<br />
Mapping from focal length to focused distance<br />
Projector-Camera Triangulation<br />
Projected light pattern or laser on surface,<br />
measured by camera<br />
Stereo Cameras<br />
Measures light intensities from scene passively<br />
Why Use Stereo Vision?<br />
Inexpensive<br />
Cameras are inexpensive and getting cheaper<br />
Low-powered, non-intrusive<br />
Does not project light or radio waves<br />
Able to handle moving objects<br />
Does not require that objects are static<br />
Directly measure object radiance<br />
Useful for graphics and visualization<br />
Representation of Raw Range Data<br />
Cloud of 3D Points<br />
Explicit (X, Y, Z) data per point<br />
Useful for 3D visualization<br />
Images © Point Grey Research<br />
Representation of Raw Range Data<br />
Depth Maps<br />
Each pixel has associated depth<br />
• measured along ray from projection center<br />
Easier to associate with original image intensity<br />
Images © Point Grey Research<br />
Fitting 3D Models to Range Data<br />
We can fit 3D models to raw range data for<br />
Reducing the amount of data storage<br />
Removing noise<br />
Models can be<br />
Simple – e.g. planar patches, B-spline meshes<br />
Complex – e.g. deformable object models<br />
3D Model Example<br />
3D reconstruction of temple (Van Gool et al.)<br />
3D point cloud converted to triangular meshes and<br />
texture-mapped<br />
Parallax<br />
Parallax refers to the observed change in<br />
relative angular displacement of image<br />
points across different camera views<br />
Parallax results from the depth differences<br />
in corresponding 3D points<br />
Parallax from X-Y Translation<br />
Parallax results when camera is translated<br />
parallel to image plane<br />
[Figure: camera translated from O1 to O2 parallel to the image plane; the relative angle between the two image points is reversed]<br />
Parallax from Z Translation<br />
Parallax results when camera is moved<br />
along optical axis<br />
[Figure: camera moved from O1 to O2 along the optical axis; the relative angle between the two image points is increased]<br />
No Parallax from Zooming<br />
No parallax from increasing focal length<br />
(zooming)<br />
[Figure: zooming with the projection center fixed at O1; no change in relative angle]<br />
No Parallax from Pure Rotation<br />
No parallax when rotating about projection<br />
center<br />
[Figure: camera rotated about its projection center O1; no change in relative angle]<br />
Depth from Parallax<br />
3D depth of points in the scene can be inferred<br />
from parallax<br />
We can recover depth when projection centers are<br />
displaced:<br />
Translating the camera in any direction<br />
Using multiple cameras at different locations<br />
We cannot recover depth when projection centers<br />
are at the same location:<br />
Changing focal length (zooming)<br />
Rotating camera about projection center<br />
Simple 2D Triangulation<br />
Assume a 1D camera translated by T along the x-direction in the camera frame<br />
[Figure: projection centers O_l and O_r separated by T; focal length f; scene point at depth Z with lateral offsets X_l and -X_r from the two optical axes; image coordinates x_l and x_r]<br />
Simple 2D Triangulation<br />
Perspective camera equations<br />
$$x_l = f\,\frac{X_l}{Z}, \qquad x_r = f\,\frac{X_r}{Z}$$<br />
Subtraction<br />
$$x_l - x_r = f\,\frac{X_l - X_r}{Z}$$<br />
But based on geometry of diagram, we have<br />
$$X_l - X_r = T$$<br />
Simple 2D Triangulation<br />
Depth from simple triangulation<br />
$$Z = f\,\frac{T}{x_l - x_r}$$<br />
T is known as the baseline distance<br />
Since camera translation is only in x direction, image points in both cameras have same pixel row:<br />
$$y_l = y_r = f\,\frac{Y}{Z}$$<br />
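The triangulation formula can be checked numerically. The following is a minimal sketch, assuming a focal length in pixels and a baseline in metres; the function names `project` and `depth_from_disparity` and all sample values are illustrative and do not come from the slides.

```python
def project(f, X, Z):
    """Perspective projection of a point with lateral offset X at depth Z."""
    return f * X / Z


def depth_from_disparity(f, T, x_l, x_r):
    """Depth from simple triangulation: Z = f * T / (x_l - x_r)."""
    d = x_l - x_r
    if d == 0:
        raise ValueError("zero disparity: depth is unbounded")
    return f * T / d


# Hypothetical rig: f = 500 px, baseline T = 0.1 m, point at Z = 2 m.
f, T, Z_true = 500.0, 0.1, 2.0
X_l = 0.3                 # lateral offset in the left camera frame
X_r = X_l - T             # from the geometry of the diagram: X_l - X_r = T
x_l, x_r = project(f, X_l, Z_true), project(f, X_r, Z_true)
Z_est = depth_from_disparity(f, T, x_l, x_r)
print(Z_est)              # close to Z_true, up to floating-point rounding
```

The zero-disparity guard reflects that a point with identical image positions in both views lies at infinite depth under this model.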
Image Disparity from 2 Cameras<br />
Depth Z is inversely proportional to the image disparity d = x_l - x_r<br />
For each pixel in left camera, we can measure the disparity after finding the corresponding image point in right camera<br />
If focal length and baseline distance are unknown, we can represent range data using a disparity map instead of a depth map<br />
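When the focal length and baseline are known, a disparity map converts to a depth map pixel by pixel. The sketch below assumes the map is stored as a list of rows and that a non-positive disparity marks a pixel without a valid match; both conventions are assumptions made for illustration.

```python
def disparity_to_depth(disparity_map, f, T):
    """Convert a disparity map (list of rows) to a depth map via Z = f * T / d.

    Pixels with non-positive disparity (no valid match) are mapped to None.
    """
    return [[f * T / d if d > 0 else None for d in row]
            for row in disparity_map]


# Hypothetical 2x2 disparity map, f = 100 px, T = 0.5 m.
disparities = [[10.0, 0.0],
               [20.0, 5.0]]
depths = disparity_to_depth(disparities, f=100.0, T=0.5)
print(depths)
```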
Relation of Baseline to Accuracy<br />
Let disparity d = x_l - x_r, then<br />
$$Z = f\,\frac{T}{d} \;\Rightarrow\; d = \frac{fT}{Z}$$<br />
If estimation of d has an error δd, the corresponding error in Z (in magnitude) is<br />
$$\delta Z = \frac{fT}{d^2}\,\delta d = \frac{fT}{(fT/Z)^2}\,\delta d = \frac{Z^2}{fT}\,\delta d$$<br />
Relation of Baseline to Accuracy<br />
As cameras become closer together, baseline distance T is reduced<br />
$$\text{As } T \to 0, \qquad \delta Z = \lim_{T \to 0} \frac{Z^2}{fT}\,\delta d \to \infty$$<br />
Estimation for Z is unstable as cameras get infinitesimally close together<br />
Larger baseline is better for estimation accuracy<br />
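The effect of the baseline on depth accuracy can be tabulated directly from the error formula. This is a small sketch with assumed values (a point at 4 m, a focal length of 800 pixels, and a half-pixel disparity error); none of these numbers appear in the slides.

```python
def depth_error(Z, f, T, delta_d):
    """Magnitude of the depth error: delta_Z = Z**2 / (f * T) * delta_d."""
    return Z * Z / (f * T) * delta_d


# Hypothetical values: Z = 4 m, f = 800 px, disparity error of 0.5 px.
for T in (0.4, 0.2, 0.1, 0.05):
    err = depth_error(4.0, 800.0, T, 0.5)
    print(f"T = {T:.2f} m -> delta_Z = {err:.4f} m")
```

Each halving of T doubles the depth error, which is the 1/T behaviour behind the limit above.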
Recap on Parallax and 1D Triangulation<br />
Parallax / no parallax<br />
Parallax when projection centers are displaced<br />
Depth from 2D triangulation<br />
Z = f T / (x_l - x_r)<br />
Disparity map / depth map<br />
Depth estimation accuracy<br />
Relation of Z estimation to baseline distance T<br />
Larger T gives more accurate Z estimation<br />
The Correspondence Problem<br />
The triangulation covered earlier assumes<br />
that image points corresponding to the same<br />
3D point are known<br />
The correspondence problem:<br />
For each important image point in the first<br />
image, we need to find the corresponding image<br />
point in the second image<br />
Matching Image Points<br />
Appearance-based Matching<br />
Match points with similar appearances across<br />
two images<br />
Feature-based Matching<br />
Match similar features across two images<br />
Features = edges, corners, etc.<br />
Geometric Constraints<br />
Reduce ambiguity<br />
Limit amount of search<br />
Occlusion and Deocclusion<br />
Occlusion<br />
Some 3D points visible in first image are<br />
hidden in second image<br />
Implication: Cannot find correspondences for<br />
all points in first image<br />
De-Occlusion<br />
Some 3D points hidden in first image are<br />
visible in second image<br />
Implication: There will be unmatched points in<br />
the second image<br />
Appearance-based Matching<br />
Assumptions<br />
Corresponding points in 2 images have image<br />
patches that look identical<br />
Minimal geometric distortion<br />
Lambertian reflectance<br />
No occlusion<br />
Reasonable for stereo cameras with small<br />
baseline distance<br />
Minimize Sum-of-Squares Difference (SSD)<br />
For a small image patch in the first image,<br />
find corresponding image patch with<br />
smallest intensity SSD in second image<br />
For image I and image patch g of size N by N, find image location (x,y) where<br />
$$\arg\min_{(x,y)} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \bigl( I(x+u,\, y+v) - g(u,v) \bigr)^2$$<br />
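The SSD search can be written as an exhaustive scan over all patch positions. The sketch below assumes images stored as lists of rows, so that I(x+u, y+v) is `image[y + v][x + u]`; the helper names and the toy image are illustrative only.

```python
def ssd(image, patch, x, y):
    """Sum of squared differences between patch g and the window of I at (x, y)."""
    n = len(patch)
    return sum((image[y + v][x + u] - patch[v][u]) ** 2
               for v in range(n) for u in range(n))


def best_match(image, patch):
    """Return the (x, y) location that minimizes the SSD (brute-force search)."""
    n = len(patch)
    height, width = len(image), len(image[0])
    candidates = [(x, y)
                  for y in range(height - n + 1)
                  for x in range(width - n + 1)]
    return min(candidates, key=lambda p: ssd(image, patch, p[0], p[1]))


# Toy 4x4 image; the patch is cut out at (x, y) = (1, 1).
image = [[1, 2, 1, 0],
         [0, 9, 3, 1],
         [2, 4, 8, 0],
         [1, 0, 2, 7]]
patch = [[9, 3],
         [4, 8]]
print(best_match(image, patch))
```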
Minimize Sum-of-Squares Difference (SSD)<br />
Expanding and eliminating constant terms<br />
$$\arg\max_{(x,y)} \left\{ 2 \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x+u,\, y+v)\, g(u,v) \;-\; \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} I(x+u,\, y+v)^2 \cdot 1 \right\}$$<br />
This expression in the form of 2 correlations can be computed efficiently via FFT<br />
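The equivalence between minimizing the SSD and maximizing the two-correlation expression follows from SSD = Σg² - (2ΣIg - ΣI²), where Σg² does not depend on (x,y). The sketch below checks this identity by direct summation on a toy image; it does not use an FFT, and the function names are illustrative.

```python
def window_sums(image, patch, x, y):
    """Return (sum of I*g, sum of I^2, SSD) for the window of I at (x, y)."""
    n = len(patch)
    cross = energy = ssd = 0
    for v in range(n):
        for u in range(n):
            i, g = image[y + v][x + u], patch[v][u]
            cross += i * g           # correlation of I with g
            energy += i * i          # correlation of I^2 with an all-ones patch
            ssd += (i - g) ** 2
    return cross, energy, ssd


def correlation_score(image, patch, x, y):
    """The quantity maximized in the expanded form: 2*sum(I*g) - sum(I^2)."""
    cross, energy, _ = window_sums(image, patch, x, y)
    return 2 * cross - energy


image = [[1, 2, 1, 0],
         [0, 9, 3, 1],
         [2, 4, 8, 0],
         [1, 0, 2, 7]]
patch = [[9, 3],
         [4, 8]]
patch_energy = sum(g * g for row in patch for g in row)   # constant term
locations = [(x, y) for y in range(3) for x in range(3)]
best = max(locations, key=lambda p: correlation_score(image, patch, *p))
print(best)
```

Because the two criteria differ only by the constant `patch_energy`, the location with the largest score is also the location with the smallest SSD.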
Appearance-based Matching in Stereo<br />
Partition left image into small image<br />
patches<br />
For each image patch, find corresponding<br />
image patch in right image<br />
Appearance Matching Example<br />
Point Grey Research DigiClops<br />
Stereo Images<br />
Disparity Map<br />
Matching Localization<br />
Localization of SSD-based matching depends on 'structure' in image patch<br />
Smooth surface<br />
• Localization is poor in all directions<br />
Image gradients in one direction (e.g. edges)<br />
• Localization is poor perpendicular to gradient<br />
direction<br />
Image gradients along multiple directions (e.g.<br />
corners)<br />
• Localization is good<br />
Matching Accuracy Examples<br />
SSD Images<br />
Quantifying Matching Localization<br />
For a particular image patch, we want to<br />
find out how accurately it can be localized<br />
1. Consider image patch at the solution point<br />
(i.e. perfect matching location where SSD=0)<br />
2. Estimate how much SSD increases when<br />
image patch is displaced by small amounts in<br />
different directions<br />
We can use partial derivatives to estimate<br />
the increase in SSD<br />
Quantifying Matching Localization<br />
Express SSD as a function S(x,y)<br />
2nd order Taylor series expansion:<br />
$$S(x_0+\delta x,\, y_0+\delta y) \approx \underbrace{S_0}_{\text{0th order}} + \underbrace{\begin{bmatrix} \dfrac{\partial S_0}{\partial x} & \dfrac{\partial S_0}{\partial y} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{\text{1st order (gradient)}} + \underbrace{\frac{1}{2} \begin{bmatrix} \delta x & \delta y \end{bmatrix} \begin{bmatrix} \dfrac{\partial^2 S_0}{\partial x^2} & \dfrac{\partial^2 S_0}{\partial x\,\partial y} \\ \dfrac{\partial^2 S_0}{\partial y\,\partial x} & \dfrac{\partial^2 S_0}{\partial y^2} \end{bmatrix} \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}}_{\text{2nd order (Hessian)}}$$<br />
At solution point,<br />
$$S_0 = 0, \qquad \frac{\partial S_0}{\partial x} = \frac{\partial S_0}{\partial y} = 0$$<br />
Quantifying Matching Localization<br />
For an image patch g(u,v), the SSD Hessian matrix at the solution point can be expressed as<br />
$$H_0 = \begin{bmatrix} 2 \displaystyle\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \left( \frac{\partial g}{\partial x} \right)^2 & 2 \displaystyle\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} \\ 2 \displaystyle\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} & 2 \displaystyle\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \left( \frac{\partial g}{\partial y} \right)^2 \end{bmatrix}$$<br />
∂g/∂x and ∂g/∂y are simply the image gradients along the x and y directions, at each pixel (u,v) in the image patch<br />
Cornerness<br />
Cornerness is defined as<br />
$$C = \text{minimum eigenvalue of } H_0 = \text{minimum root } \lambda \text{ of } \left\{ (\lambda - h_{11})(\lambda - h_{22}) - h_{12}^2 = 0 \right\}$$<br />
where h_11, h_12 and h_22 are the entries of H_0<br />
The corresponding eigenvector is the direction where the image patch can 'slide' with minimum increase in SSD<br />
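Cornerness can be computed in closed form, since the characteristic equation of the symmetric 2x2 matrix H_0 is a quadratic in λ. The sketch below is one possible implementation under stated assumptions: image gradients are approximated by central differences, and only interior pixels of the patch are summed (the slides sum over the whole N by N patch). The function name and the test patches are illustrative.

```python
import math


def cornerness(patch):
    """Minimum eigenvalue of the SSD Hessian H_0 of a square patch.

    Gradients use central differences on interior pixels (an assumption).
    """
    n = len(patch)
    h11 = h12 = h22 = 0.0
    for v in range(1, n - 1):
        for u in range(1, n - 1):
            gx = (patch[v][u + 1] - patch[v][u - 1]) / 2.0
            gy = (patch[v + 1][u] - patch[v - 1][u]) / 2.0
            h11 += 2.0 * gx * gx
            h12 += 2.0 * gx * gy
            h22 += 2.0 * gy * gy
    # Roots of (lam - h11)(lam - h22) - h12^2 = 0 are mean +/- radius.
    mean = (h11 + h22) / 2.0
    radius = math.sqrt(((h11 - h22) / 2.0) ** 2 + h12 * h12)
    return mean - radius


flat = [[1] * 5 for _ in range(5)]                      # smooth surface
edge = [[0, 0, 1, 1, 1] for _ in range(5)]              # vertical edge
corner = [[1 if (u >= 2 and v >= 2) else 0 for u in range(5)]
          for v in range(5)]                            # corner
for name, p in (("flat", flat), ("edge", edge), ("corner", corner)):
    print(name, cornerness(p))
```

The flat and edge patches give a cornerness of zero because at least one direction exists along which the patch can slide without changing the SSD, while the corner patch gives a strictly positive value.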
Cornerness Examples<br />
Image Patches:<br />
Plots of $S \approx \tfrac{1}{2} \begin{bmatrix} \delta x & \delta y \end{bmatrix} H_0 \begin{bmatrix} \delta x \\ \delta y \end{bmatrix}$:<br />
C ≈ 10^5, C ≈ 10^3, C ≈ 10^3<br />
Recap on Appearance-Based Matching<br />
The correspondence problem<br />
Occlusion / de-occlusion<br />
Appearance-based matching in stereo images<br />
Minimize sum-of-squares difference (SSD) of image<br />
patches<br />
Expressed as correlations for efficient computation<br />
Matching Localization<br />
Localization depends on structure in image patch<br />
SSD Hessian matrix H_0 at solution point<br />
Cornerness C = min eigenvalue of H_0<br />
3D Estimation Accuracy versus<br />
Matching Accuracy<br />
Correspondence by appearance matching is<br />
most accurate when<br />
Cameras are close together – small baseline<br />
3D estimation is most accurate when<br />
Cameras are far apart – large baseline<br />
In real 3D stereo applications, need to find<br />
reasonable tradeoff<br />
Feature-based Matching<br />
Want to match image points across large<br />
differences in viewpoints<br />
Extract image points with local properties that are<br />
approximately invariant to larger changes of<br />
viewpoint<br />
Need to select feature points = 'special' points<br />
Note: 3D reconstruction is more sparse<br />
Example features and properties:<br />
Corners – angle of corner<br />
Curves – maximum radius of curvature<br />
Properties depend on types of viewpoint changes<br />
Feature-based Matching<br />
Match feature points across images directly by<br />
finding similar properties<br />
Example 1:<br />
Match corner points in two images with the same angle<br />
Note: corner points can have very different orientation<br />
Feature-based Matching Example<br />
Example 2:<br />
Finding symmetrical<br />
contours<br />
Cannot use SSD directly<br />
Need to use feature points<br />
• E.g. points of maximum<br />
curvature<br />
• Match using curvatures<br />
Image ©TJ Cham<br />
Feature Matching Heuristics<br />
Even with feature properties, there may be<br />
multiple candidates<br />
Heuristics may be used to try and select the<br />
correct solution<br />
Note: heuristics are empirical rules of thumb and may not always give the right solution<br />
Feature Matching Heuristics Examples<br />
Proximity – match feature point to candidate with<br />
nearest (x,y) position in second image<br />
Ordering – feature points on a contour in first<br />
image must match candidates on a contour in the<br />
second image in the same order<br />
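The two heuristics can be sketched in a few lines. This is an illustrative sketch only: it assumes feature points are given as (x, y) tuples, that both lists are already ordered along their contours, and that a match is a pair of list indices. The function names are hypothetical.

```python
def match_by_proximity(points_a, points_b):
    """Proximity: pair each point in image 1 with the nearest point in image 2."""
    matches = []
    for i, (xa, ya) in enumerate(points_a):
        j = min(range(len(points_b)),
                key=lambda k: (points_b[k][0] - xa) ** 2
                              + (points_b[k][1] - ya) ** 2)
        matches.append((i, j))
    return matches


def respects_ordering(matches):
    """Ordering: matched indices along the second contour must keep increasing."""
    js = [j for _, j in sorted(matches)]
    return all(a < b for a, b in zip(js, js[1:]))


contour_1 = [(0, 0), (10, 0), (20, 0)]
contour_2 = [(1, 1), (11, 0), (19, 2)]
matches = match_by_proximity(contour_1, contour_2)
print(matches, respects_ordering(matches))
```

Proximity alone can assign two points to the same candidate; the ordering check is one way to reject such assignments, though neither rule is guaranteed to find the correct solution.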
Recap on Feature-based Matching<br />
Advantage<br />
allows greater baseline / larger variation in viewpoints<br />
Method<br />
Extract feature points with properties which are<br />
invariant to viewpoint changes<br />
Match feature points with same properties across<br />
images<br />
Matching may require heuristics to help find<br />
correct solution<br />
E.g. proximity, ordering<br />
Geometric Search Constraint<br />
A very powerful search constraint is<br />
available if we know how cameras are<br />
related to each other<br />
Given an image point in the first image,<br />
consider the ray R from the projection<br />
center passing through that image point<br />
The ray R must intersect the 3D point<br />
The corresponding image point in the second<br />
image must lie on the projection of the ray R in<br />
the second view<br />
Geometric Search Constraint<br />
[Figure: ray R from projection center O_l through the image point, and its projection into the image of the second camera with projection center O_r]<br />
Do not have to conduct 2D search over entire image<br />
Instead, conduct 1D search only along projection of R in second image<br />
Stereo Rig for Epipolar Geometry<br />
[Figure: stereo rig with projection centers O_l and O_r, showing the epipoles and epipolar lines in the two images]<br />
Epipoles and Epipolar Lines<br />
The epipole in view 2 is the image point of the<br />
projection center of camera 1<br />
An epipolar line in view 2 is the image line of a<br />
ray intersecting the projection center in camera 1<br />
Each pixel in view 1 has a corresponding epipolar<br />
line in view 2, and vice versa<br />
Epipolar lines always intersect the epipole<br />
There is only 1 epipole per image, but an infinite family of epipolar lines<br />
Epipoles and Epipolar Lines Examples<br />
Epipolar lines in left image, based on points in the<br />
right image<br />
Epipolar lines intersect at the epipole which is outside<br />
the image in this case<br />
Images © T. de Margerie<br />
Epipolar Geometry Derivation<br />
[Figure: projection centers O_l and O_r separated by translation T; a 3D point at P_l, with P_l - T the vector from O_r to the point]<br />
P_l and P_l - T are expressed w.r.t. the O_l reference frame<br />
$$P_r = R^{-1} (P_l - T)$$ w.r.t. the O_r reference frame<br />
Epipolar Geometry Derivation<br />
<br />
P_l, T and (P_l - T) lie in the same plane<br />
Express this as<br />
$$(P_l \times T) \cdot (P_l - T) = 0 \quad\Longleftrightarrow\quad (P_l \times T)^T (P_l - T) = 0$$<br />
In right frame, P_r = R^{-1}(P_l - T), so<br />
$$(P_l \times T)^T R\, P_r = 0$$<br />
Epipolar Geometry Derivation<br />
We can also express cross-product as matrix<br />
multiplication!<br />
$$\begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix} \times \begin{bmatrix} T_X \\ T_Y \\ T_Z \end{bmatrix} = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix} \begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix}$$<br />
Set<br />
$$H = \begin{bmatrix} 0 & T_Z & -T_Y \\ -T_Z & 0 & T_X \\ T_Y & -T_X & 0 \end{bmatrix}$$<br />
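The matrix form of the cross-product is easy to verify numerically. The sketch below builds H from T as defined above and compares H P_l with a directly computed P_l × T; the function names and the sample vectors are illustrative.

```python
def cross(a, b):
    """Cross-product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def cross_matrix(T):
    """Matrix H such that H @ P equals P x T."""
    tx, ty, tz = T
    return [[0, tz, -ty],
            [-tz, 0, tx],
            [ty, -tx, 0]]


def matvec(M, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]


P_l = [1, 2, 3]
T = [4, 5, 6]
print(cross(P_l, T), matvec(cross_matrix(T), P_l))
```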
Epipolar Geometry Derivation<br />
<br />
Substitute cross-product<br />
$$(H P_l)^T R\, P_r = 0 \;\Rightarrow\; P_l^T H^T R\, P_r = 0$$<br />
2D image points are related to 3D points via<br />
$$p_l = \frac{f_l}{Z_l}\, P_l, \qquad p_r = \frac{f_r}{Z_r}\, P_r$$<br />
Essential Matrix Equation<br />
<br />
Combining,<br />
$$p_l^T \left( \frac{Z_l}{f_l}\, H^T R\, \frac{Z_r}{f_r} \right) p_r = 0$$<br />
$$p_l^T E\, p_r = 0$$<br />
where E is the 3x3 Essential Matrix<br />
E is scale factor independent, and is usually multiplied such that E(3,3)=1<br />
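The essential matrix equation can be checked on a synthetic stereo rig. The sketch below takes E = H^T R (dropping the scalar factors, since E is only defined up to scale), uses R^{-1} = R^T for a rotation matrix, and evaluates p_l^T E p_r for one 3D point. The rotation, translation and point are made-up values, and all function names are illustrative.

```python
import math


def cross_matrix(T):
    """Matrix H such that H @ P equals P x T."""
    tx, ty, tz = T
    return [[0.0, tz, -ty], [-tz, 0.0, tx], [ty, -tx, 0.0]]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def essential_matrix(R, T):
    """E = H^T R, up to an arbitrary scale factor."""
    return matmul(transpose(cross_matrix(T)), R)


def epipolar_residual(R, T, P_l, f_l=1.0, f_r=1.0):
    """Evaluate p_l^T E p_r for the 3D point P_l given in the left frame."""
    P_r = matvec(transpose(R), [a - b for a, b in zip(P_l, T)])  # R^-1 (P_l - T)
    p_l = [f_l * c / P_l[2] for c in P_l]
    p_r = [f_r * c / P_r[2] for c in P_r]
    return sum(a * b for a, b in zip(p_l, matvec(essential_matrix(R, T), p_r)))


theta = 0.2  # hypothetical rotation about the y axis, in radians
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
T = [0.5, 0.0, 0.1]
residual = epipolar_residual(R, T, [0.3, -0.2, 4.0])
print(residual)  # zero up to floating-point rounding
```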
Fundamental Matrix Equation<br />
<br />
<br />
Affine transformations of image points<br />
$$p'_l = M_l\, p_l, \qquad p'_r = M_r\, p_r$$<br />
$${p'_l}^T \left( M_l^{-1} \right)^T E\, M_r^{-1}\, p'_r = 0$$<br />
$${p'_l}^T F\, p'_r = 0$$<br />
where F is the 3x3 Fundamental Matrix<br />
F is also scale factor independent, and is usually multiplied such that F(3,3)=1<br />
Using Epipolar Geometry<br />
Writing the equation in full, we have<br />
$$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$$<br />
We can express as<br />
$$\begin{bmatrix} f_1 x_l + f_4 y_l + f_7 & f_2 x_l + f_5 y_l + f_8 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \end{bmatrix} = -(f_3 x_l + f_6 y_l + 1)$$<br />
If we have specific values for (x_l, y_l), this equation is a straight line in the right image!<br />
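The line coefficients can be read off directly from the expression above. The sketch below returns (a, b, c) such that the epipolar line in the right image is a·x_r + b·y_r + c = 0; the function name and the example matrix are illustrative, and the matrix is not a real fundamental matrix, only a set of numbers chosen for easy arithmetic.

```python
def epipolar_line_in_right(F, x_l, y_l):
    """Coefficients (a, b, c) of the line a*x_r + b*y_r + c = 0 in the right image.

    F is a 3x3 nested list [[f1, f2, f3], [f4, f5, f6], [f7, f8, 1]].
    """
    p_l = (x_l, y_l, 1.0)
    a = sum(p_l[i] * F[i][0] for i in range(3))   # f1*x_l + f4*y_l + f7
    b = sum(p_l[i] * F[i][1] for i in range(3))   # f2*x_l + f5*y_l + f8
    c = sum(p_l[i] * F[i][2] for i in range(3))   # f3*x_l + f6*y_l + 1
    return a, b, c


# Made-up matrix with F(3,3) = 1, for illustration only.
F = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 1]]
line = epipolar_line_in_right(F, 1, 1)
print(line)
```

A 1D search for the corresponding point then only needs to visit the pixels (x_r, y_r) that satisfy this line equation.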
Using Epipolar Geometry<br />
If we know the Fundamental matrix<br />
1. For any point (x_l, y_l) in one image, we know the epipolar line in the other image<br />
2. We only need to search for the corresponding point along the epipolar line<br />
Example:<br />
Images ©Robotvis INRIA<br />
Estimating Fundamental Matrix<br />
How do we compute the Fundamental Matrix?<br />
Obtain directly from projection matrices<br />
• if cameras are pre-calibrated<br />
• Difficult to derive and not covered in lectures<br />
or …<br />
Obtain from 8 known image correspondences<br />
• 1 correspondence means a specific (xl,yl),(xr,yr) pair<br />
• Can be used with uncalibrated cameras<br />
• Knowing 1 correspondence provides 1 constraint to the equation p_l^T F p_r = 0<br />
• Knowing 8 correspondences: able to solve for 8 variables in F<br />
Estimating Fundamental Matrix<br />
The Fundamental Matrix equation in full<br />
$$\begin{bmatrix} x_l & y_l & 1 \end{bmatrix} \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & 1 \end{bmatrix} \begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = 0$$<br />
We can rewrite as<br />
$$\underbrace{\begin{bmatrix} x_l x_r & x_l y_r & x_l & y_l x_r & y_l y_r & y_l & x_r & y_r \end{bmatrix}}_{\text{known values from 1 correspondence}} \; \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \\ f_8 \end{bmatrix}}_{\text{unknown values to be computed}} = -1$$<br />
Estimating Fundamental Matrix<br />
Using 8 known correspondences, we get<br />
$$\underbrace{\begin{bmatrix} x_{l1} x_{r1} & x_{l1} y_{r1} & x_{l1} & y_{l1} x_{r1} & y_{l1} y_{r1} & y_{l1} & x_{r1} & y_{r1} \\ x_{l2} x_{r2} & x_{l2} y_{r2} & x_{l2} & y_{l2} x_{r2} & y_{l2} y_{r2} & y_{l2} & x_{r2} & y_{r2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_{l8} x_{r8} & x_{l8} y_{r8} & x_{l8} & y_{l8} x_{r8} & y_{l8} y_{r8} & y_{l8} & x_{r8} & y_{r8} \end{bmatrix}}_{B \; (8 \times 8)} \underbrace{\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \\ f_8 \end{bmatrix}}_{f \; (8 \times 1)} = \begin{bmatrix} -1 \\ -1 \\ -1 \\ -1 \\ -1 \\ -1 \\ -1 \\ -1 \end{bmatrix}$$<br />
Estimating Fundamental Matrix<br />
We can then solve for the elements of the Fundamental Matrix:<br />
$$f = B^{-1} \begin{bmatrix} -1 \\ \vdots \\ -1 \end{bmatrix}$$<br />
With n > 8 point correspondences, B is n × 8 and we have to use the pseudo-inverse instead<br />
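The linear estimate can be sketched as follows, assuming the slides' parameterization with the last element of F fixed to 1. The function name `estimate_fundamental` is hypothetical, and `numpy.linalg.lstsq` stands in for the explicit inverse or pseudo-inverse (it gives the same least-squares solution when B has full column rank).<br />

```python
import numpy as np


def estimate_fundamental(left_pts, right_pts):
    """Estimate F (with f9 = 1) from n >= 8 correspondences.

    left_pts, right_pts: arrays of shape (n, 2) holding (x_l, y_l) and (x_r, y_r).
    """
    left = np.asarray(left_pts, dtype=float)
    right = np.asarray(right_pts, dtype=float)
    if left.shape != right.shape or left.ndim != 2 or left.shape[1] != 2:
        raise ValueError("expected two arrays of shape (n, 2)")
    if len(left) < 8:
        raise ValueError("need at least 8 correspondences")

    xl, yl = left[:, 0], left[:, 1]
    xr, yr = right[:, 0], right[:, 1]

    # One row of B per correspondence, in the same column order as the slides.
    B = np.column_stack([xl * xr, xl * yr, xl, yl * xr, yl * yr, yl, xr, yr])
    rhs = -np.ones(len(left))

    # Least-squares solution of B f = -1 (exact when n = 8 and B is invertible).
    f, *_ = np.linalg.lstsq(B, rhs, rcond=None)

    # Append the fixed element f9 = 1 and reshape row by row into 3 x 3.
    return np.append(f, 1.0).reshape(3, 3)
```

This sketch does not normalize the image coordinates or enforce rank 2 on the result, both of which are common refinements in practice. It also fails for camera configurations where the true f9 is zero, since the parameterization then has no solution.<br />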
Recap on Epipolar Geometry<br />
Epipolar geometry<br />
Epipoles – image of the other camera's projection center<br />
Epipolar lines – projections of 3D rays from the other camera's projection center<br />
Epipolar equations<br />
Essential matrix – image points in camera reference frame<br />
Fundamental matrix – $p_l^T F p_r = 0$,<br />
covers additional affine transformation of image points<br />
Using epipolar geometry<br />
Search only along epipolar line to find corresponding point<br />
Estimating fundamental matrix<br />
Use at least 8 known image correspondences (n ≥ 8)<br />
3D Reconstruction from Perspective Stereo<br />
3D Reconstruction by triangulation with calibrated cameras<br />
For each image point in both cameras, we know the associated 3D ray<br />
Find the intersection of rays from corresponding image points<br />
[Figure: rays through the projection centers O_l and O_r]<br />
3D Reconstruction from Perspective Stereo<br />
Do the triangulation in matrix form<br />
Assume we have projection matrices from camera calibration<br />
L camera:<br />
$$\begin{bmatrix} k x_l \\ k y_l \\ k \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ a_9 & a_{10} & a_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$<br />
R camera:<br />
$$\begin{bmatrix} m x_r \\ m y_r \\ m \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ b_9 & b_{10} & b_{11} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$<br />
Except now the unknowns are only X, Y, Z!<br />
3D Reconstruction from Perspective Stereo<br />
Rewrite for left camera<br />
$$x_l = \frac{a_1 X + a_2 Y + a_3 Z + a_4}{a_9 X + a_{10} Y + a_{11} Z + 1}, \qquad y_l = \frac{a_5 X + a_6 Y + a_7 Z + a_8}{a_9 X + a_{10} Y + a_{11} Z + 1}$$<br />
$$(a_9 x_l - a_1) X + (a_{10} x_l - a_2) Y + (a_{11} x_l - a_3) Z = (a_4 - x_l)$$<br />
$$(a_9 y_l - a_5) X + (a_{10} y_l - a_6) Y + (a_{11} y_l - a_7) Z = (a_8 - y_l)$$<br />
Similarly for right camera<br />
3D Reconstruction from Perspective Stereo<br />
Hence with a pair of corresponding image points (x_l, y_l) and (x_r, y_r), we can get<br />
$$\underbrace{\begin{bmatrix} a_9 x_l - a_1 & a_{10} x_l - a_2 & a_{11} x_l - a_3 \\ a_9 y_l - a_5 & a_{10} y_l - a_6 & a_{11} y_l - a_7 \\ b_9 x_r - b_1 & b_{10} x_r - b_2 & b_{11} x_r - b_3 \\ b_9 y_r - b_5 & b_{10} y_r - b_6 & b_{11} y_r - b_7 \end{bmatrix}}_{W} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \underbrace{\begin{bmatrix} a_4 - x_l \\ a_8 - y_l \\ b_4 - x_r \\ b_8 - y_r \end{bmatrix}}_{q}$$<br />
Compute 3D world coordinates X, Y and Z via pseudo-inverse<br />
$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = W^{+} q = (W^T W)^{-1} W^T q$$<br />
Computationally expensive – compute a new pseudo-inverse for each correspondence<br />
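A minimal sketch of this triangulation, assuming 3 × 4 projection matrices laid out as in the slides. The names `triangulate_perspective`, `project`, `A_demo` and `B_demo` are hypothetical, and the matrix values are made up for illustration.<br />

```python
import numpy as np


def triangulate_perspective(A, B, left_pt, right_pt):
    """Least-squares 3D point from one correspondence and two 3x4 projection matrices."""
    rows, rhs = [], []
    for P, (x, y) in ((np.asarray(A, float), left_pt),
                      (np.asarray(B, float), right_pt)):
        # From x = (p1 . X + p14) / (p3 . X + p34):
        #   (x * p3 - p1) . X = p14 - x * p34   (p34 = 1 in the slides)
        rows.append(x * P[2, :3] - P[0, :3])
        rhs.append(P[0, 3] - x * P[2, 3])
        rows.append(y * P[2, :3] - P[1, :3])
        rhs.append(P[1, 3] - y * P[2, 3])
    W = np.array(rows)  # 4 x 3, depends on the image points
    q = np.array(rhs)   # 4-vector
    # pinv(W) equals (W^T W)^{-1} W^T when W has rank 3.
    return np.linalg.pinv(W) @ q


def project(P, X):
    """Perspective projection of a 3D point X with a 3x4 matrix P."""
    h = np.asarray(P, float) @ np.append(X, 1.0)
    return h[:2] / h[2]


# Made-up projection matrices (last entry 1, as in the slides).
A_demo = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])
B_demo = np.array([[1.0, 0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])
X_true = np.array([0.5, 0.2, 3.0])
X_est = triangulate_perspective(A_demo, B_demo,
                                project(A_demo, X_true),
                                project(B_demo, X_true))
```

Because W contains x_l, y_l, x_r and y_r, `np.linalg.pinv` has to be called again for every correspondence, which is the cost noted on the slide.<br />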
3D Reconstruction from Weak-Perspective Stereo<br />
Assume weak-perspective projection matrices<br />
L camera:<br />
$$\begin{bmatrix} x_l \\ y_l \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_5 & a_6 & a_7 & a_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$<br />
R camera:<br />
$$\begin{bmatrix} x_r \\ y_r \\ 1 \end{bmatrix} = \begin{bmatrix} b_1 & b_2 & b_3 & b_4 \\ b_5 & b_6 & b_7 & b_8 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$<br />
Note: last rows of matrix equations are redundant!<br />
3D Reconstruction from Weak-Perspective Stereo<br />
Eliminate last rows and combine equations<br />
$$\begin{bmatrix} a_1 & a_2 & a_3 \\ a_5 & a_6 & a_7 \\ b_1 & b_2 & b_3 \\ b_5 & b_6 & b_7 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} x_l - a_4 \\ y_l - a_8 \\ x_r - b_4 \\ y_r - b_8 \end{bmatrix}$$<br />
Again, solve by pseudo-inverse<br />
However, the pseudo-inverse depends only on the projection matrices and not on individual image points<br />
It need only be computed once for 3D reconstruction of all image correspondences<br />
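A minimal sketch of the weak-perspective case, showing the pseudo-inverse computed once and reused for every correspondence. The names `make_weak_perspective_reconstructor`, `A_wp` and `B_wp` are hypothetical, and the matrix values are made up for illustration.<br />

```python
import numpy as np


def make_weak_perspective_reconstructor(A, B):
    """Precompute the pseudo-inverse for two weak-perspective 3x4 matrices."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    M = np.vstack([A[:2, :3], B[:2, :3]])          # 4 x 3 coefficient matrix
    offset = np.concatenate([A[:2, 3], B[:2, 3]])  # [a4, a8, b4, b8]
    M_pinv = np.linalg.pinv(M)                     # 3 x 4, computed only once

    def reconstruct(left_pts, right_pts):
        """Map (n, 2) left and right image points to (n, 3) world points."""
        obs = np.hstack([np.atleast_2d(left_pts), np.atleast_2d(right_pts)])
        # Each row is [x_l - a4, y_l - a8, x_r - b4, y_r - b8] times M_pinv^T.
        return (obs - offset) @ M_pinv.T

    return reconstruct


# Made-up weak-perspective projection matrices (last row 0 0 0 1).
A_wp = np.array([[1.0, 0.0, 0.0, 0.5],
                 [0.0, 1.0, 0.0, -0.2],
                 [0.0, 0.0, 0.0, 1.0]])
B_wp = np.array([[0.8, 0.0, 0.6, 0.1],
                 [0.0, 1.0, 0.0, 0.3],
                 [0.0, 0.0, 0.0, 1.0]])
world = np.array([[1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]])
left_img = world @ A_wp[:2, :3].T + A_wp[:2, 3]
right_img = world @ B_wp[:2, :3].T + B_wp[:2, 3]

reconstruct = make_weak_perspective_reconstructor(A_wp, B_wp)
recovered = reconstruct(left_img, right_img)
```

Returning a closure keeps the one-time `pinv` call separate from the per-point work, which is then a single matrix product for any number of correspondences.<br />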
Recap on 3D Reconstruction<br />
3D reconstruction involves triangulation of<br />
3D rays<br />
Full perspective 3D reconstruction<br />
Weak perspective 3D reconstruction<br />
Computationally cheaper – compute pseudoinverse<br />
only once<br />
Example Application of 3D Reconstruction<br />
Comparing Matrix movie’s super slow-motion<br />
technology and Kanade’s Virtualized Reality<br />
Matrix movie’s technology<br />
• Does not use stereo vision<br />
• Needs a lot of cameras<br />
• Trajectory of slow-mo sequence fixed<br />
Virtualized Reality<br />
• Based on 3D stereo vision<br />
• Needs fewer cameras<br />
• No constraint on trajectory<br />
Example Application of 3D Reconstruction<br />
Virtualized Reality (Kanade et al.)<br />
Practical Overview of Simplistic 3D<br />
Reconstruction Process<br />
1. Fix the pose of your cameras, e.g. by placing them<br />
on tripods<br />
2. Using a calibration chart,<br />
Calibrate your cameras using known correspondences of<br />
3D points to 2D image points<br />
Establish epipolar geometry by computing Fundamental<br />
Matrix using known correspondences between 2D image<br />
points<br />
3. For an arbitrary scene in front of your fixed<br />
cameras, your system should automatically:<br />
a. Find corresponding image points between camera images<br />
b. Compute 3D coordinates for each correspondence<br />
Summary of 3D Stereo Vision<br />
Representation of 3D data<br />
Parallax<br />
2D triangulation<br />
Appearance-based matching<br />
Feature-based matching<br />
Epipolar geometry<br />
3D reconstruction<br />