A Probabilistic Approach to Geometric Hashing using Line Features
A Probabilistic Approach to
Geometric Hashing using Line Features

by

Frank Chee-Da Tsai

a dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy

Department of Computer Science
New York University
September 1993

Approved:
Professor Jacob T. Schwartz
Research Advisor
© Frank Chee-Da Tsai
All Rights Reserved, 1993.
Dedication

This dissertation is dedicated to Professor Jacob T. Schwartz for his invaluable advice, to my parents for their endless love, and to Buddha, whose wisdom cheers me through the dark patches of my life.
Acknowledgements

My most sincere gratitude goes to my research advisor, Professor Jacob T. Schwartz, for his generous guidance, without which this dissertation would not have been possible. To him I also owe an understanding of how to identify research directions of scientific interest.

I would also like to express my gratitude to Professor Wen-Hsiang Tsai of National Chiao-Tung University, Hsin-Chu, Taiwan, Republic of China. I took his course "Image Processing" when I was an undergraduate senior; it was my first taste of applying computer technology to the processing of images. After two years of R.O.T.C. military service following graduation, I came to the Courant Institute in 1987 to further my studies. Professor Stéphane Mallat's "Computer Vision" course again stirred my interest in the field of image analysis.

Thanks are also due to Professor Jaiwei Hong, Professor Robert Hummel, Professor Richard Wallace, Professor Haim Wolfson, Dr. Isidore Rigoutsos, Mr. Ronie Hecker and Mr. Jyh-Jong Liu for help and discussions of many kinds.

I also thank the staff of the Courant Robotics Laboratory for the days we worked together, especially Dr. Xiaonan Tan for her encouragement.

Finally, I thank my parents, truly and sincerely, for their patience, support and constant encouragement throughout this work and my whole life.
Abstract

One of the most important goals of computer vision research is object recognition. Most current object recognition algorithms assume reliable image segmentation, which in practice is often not available. This research combines the Hough method with the geometric hashing technique for model-based object recognition in seriously degraded intensity images.

We describe the analysis, design and implementation of a recognition system that can recognize, in a seriously degraded intensity image, multiple objects modeled by collections of lines.

We first examine the factors affecting line extraction by the Hough transform and propose various techniques to cope with them. Line features are then used as the primitive features from which we compute the geometric invariants used by the geometric hashing technique. Various geometric transformations, including rigid, similarity, affine and projective transformations, are examined.

We then derive the "spread" of a computed invariant over the hash space caused by perturbation of the lines giving rise to that invariant. This is the first noise analysis of line features for geometric hashing. The result of the noise analysis is then used in a weighted voting scheme for the geometric hashing technique.

We have implemented the system described and carried out a series of experiments on polygonal objects modeled by lines, assuming affine approximations to perspective viewing transformations. Our experimental results show that the technique described is noise resistant and suitable in an environment containing many occlusions.
Contents

List of Figures  ix

1 Introduction  1
  1.1 Model-Based Object Recognition  2
    1.1.1 Problem Definition  2
    1.1.2 Object Recognition Method  3
  1.2 Difficulties to Be Faced  5
  1.3 The Hashing Approach  5
  1.4 Overview of This Dissertation  6
    1.4.1 Line Extraction  7
    1.4.2 Line Invariants  7
    1.4.3 Effect of Noise on Line Invariants  7
    1.4.4 Invariant Matching with Weighted Voting  7

2 Prior and Related Work  9
  2.1 Basic Paradigms  9
    2.1.1 Template Matching  9
    2.1.2 Hypothesis-Prediction-Verification  10
    2.1.3 Transformation Accumulation  11
    2.1.4 Consistency Checking and Constraint Propagation  13
    2.1.5 Sub-Graph Matching  14
    2.1.6 Evidential Reasoning  15
    2.1.7 Miscellaneous  15
  2.2 An Overview of Geometric Hashing  17
    2.2.1 A Brief Description  17
    2.2.2 Strengths and Weaknesses  19
    2.2.3 Geometric Hashing Systems  20
  2.3 The Bayesian Model of Rigoutsos  20

3 Noise in the Hough Transform  23
  3.1 The Hough Transform  23
  3.2 Various Hough Transform Improvements  25
  3.3 Implementation and Measured Performance  27
  3.4 Some Additional Observations Concerning the Accuracy of Hough Data  36

4 Invariant Matching Using Line Features  39
  4.1 Change of Coordinates in Various Spaces  39
    4.1.1 Change of Coordinates in Image Space  40
    4.1.2 Change of Coordinates in Line Parameter Space  40
    4.1.3 Change of Coordinates in (θ, r) Space  41
  4.2 Encoding Lines by a Basis of Lines  42
  4.3 Line Invariants under Various Transformation Groups  43
    4.3.1 Rigid Line Invariants  43
    4.3.2 Similarity Line Invariants  46
    4.3.3 Affine Line Invariants  51
    4.3.4 Projective Line Invariants  55

5 The Effect of Noise on the Invariants  60
  5.1 A Noise Model for Line Parameters  60
  5.2 The Spread Function of the Invariants  61
  5.3 Spread Functions under Various Transformation Groups  64
    5.3.1 Rigid-Invariant Spread Function  64
    5.3.2 Similarity-Invariant Spread Function  65
    5.3.3 Affine-Invariant Spread Function  67
    5.3.4 Projective-Invariant Spread Function  69

6 Bayesian Line Invariant Matching  74
  6.1 A Measure of Matching between Scene and Model Invariants  74
  6.2 Evidence Synthesis by Bayesian Reasoning  76
  6.3 The Algorithm  77
  6.4 An Analysis of the Number of Probings Needed  78

7 The Experiments  80
  7.1 Implementation of Affine Invariant Matching  80
  7.2 Best Least-Squares Match  82
  7.3 Experimental Results and Observations  85

8 Conclusions  95
  8.1 Discussion and Summary  95
  8.2 Future Directions  96

A  98

B  100

Bibliography  102
List of Figures

3.1  A comparison of the results of the standard Hough technique and our improved Hough technique, when applied to an image without noise.  28
3.2  A comparison of the results of the standard Hough technique and our improved Hough technique, when applied to a noisy image.  29
3.3  The twenty models used in our experiments. From left to right, top to bottom are model-0 to model-19.  30
3.4  Reading from top to bottom, left to right, (0) to (8) are image-0 with different noise levels corresponding to cases 0 to 8.  32
3.5  Image-1 to Image-9 before noise is imposed. The same noise levels as for Image-0 are applied during our experiments.  33
3.6  The thicker line has parameters (θ, r). Projections of the pixels of this line onto the line (θ + 90°, r) disperse around the position at distance r from the origin.  38
7.1  Experimental Example 1  87
7.2  Experimental Example 2  88
7.3  Experimental Example 3  89
7.4  Experimental Example 4  90
7.5  Experimental Example 5  92
7.6  Experimental Example 6  93
Chapter 1

Introduction
One of the most important goals of computer vision research is object recognition. This humanlike visual capability would enable machines to sense and analyze their environment and to take appropriate action as desired.

We consider intensity-image techniques. Range data is usually harder to obtain, and there is also reason to believe that human vision emphasizes intensity images and that most practical applications of computer vision can be tackled without range information [42].

Specifically, we consider the problem of 2-D (or flat 3-D) object recognition under various viewing transformations. We are interested in (1) large model bases (more than ten objects); (2) cluttered scenes (low signal-to-noise ratio); (3) high occlusion levels; (4) segmentation difficulties. Current approaches to image segmentation generally produce incomplete boundaries and extraneous edge indications, so any approach to object recognition has to cope with segmentation defects. To work in environments containing many occlusions, we impose minimal segmentation requirements: only the positional information of edgels is assumed to be available.

We will describe the analysis, design and implementation of a recognition system that can recognize, in a seriously degraded intensity image, multiple objects which can be modeled as collections of lines.
1
CHAPTER 1. INTRODUCTION 2<br />
1.1 Model-Based Object Recognition

Recognition can be achieved by establishing correspondences between many kinds of predicted and measured object properties, including shape, color, texture, connectivity, context, motion, and shading. Here we will focus on the problem of achieving spatial correspondence, which is often a prerequisite to examining correspondences along the other dimensions.

Prior research [7,13] has indicated that a model-based approach to object recognition can be very effective in overcoming occlusion, complication and inadequate or erroneous low-level processing. Most commercial vision systems are model-based: recognition involves matching an input image to a set of predefined object models. A key goal in this approach is to precompile a description of a known set of objects, then to use these object models to recognize each instance of an object in an image and to specify its position and orientation relative to the viewer.
1.1.1 Problem Definition

Object recognition can be conceptualized in various ways, as a brief survey of the literature on this subject demonstrates [7,8,13]. Here we adopt the definition given by Besl and Jain [7].

Their definition is motivated by observation of human visual capabilities and involves two main steps. The first step is learning: when a new object is presented, the human visual system gathers information about it from many different viewpoints, a process usually referred to as model formation. The second step is identification.

The following definition of the single-arbitrary-view model-based object recognition problem is motivated by the above:

1. Given any collection of labeled solid objects, (a) each object can be examined as long as the object is not deformed; (b) labeled models can be created using information from this examination.

2. Given digitized sensor data corresponding to one particular but arbitrary field of view of the real world, given any data stored previously during the model formation process, and given the list of distinguishable objects, the following questions must be answered:
   • Does the object appear in the digitized sensor data?

   • If so, how many times does it occur?

   • For each occurrence, (1) determine the location in the sensor data (image), (2) determine the location (or translation parameters) of that object with respect to a known coordinate system (if possible with the given sensor), and (3) determine the orientation (or rotation parameters) with respect to a known coordinate system.

This problem statement admits solution by computer; in any case, the problem is clearly solvable by human beings.
1.1.2 Object Recognition Method

In most object-recognition systems, one can distinguish five sub-processes: image formation, feature extraction, model representation, scene analysis and hypothesis verification.

Image Formation

This process creates images via sensors, which can be sensitive to a variety of signals, such as visible-spectrum light, X-rays, etc. As noted above, we will concentrate on intensity images.

Feature Extraction

This process acts on the sensor data and extracts relevant features. The geometric features we are interested in might in principle be lines, segments, curvature extrema, curvature discontinuities, conics and so on; we will use lines exclusively.

Model Representation

This process converts models into the quantities that will be used to recognize them. It is closely related to the feature-extraction process, since the same features must be extracted from a scene for matching.

Models based on geometric properties of an object's visible surfaces or silhouette are commonly used because they describe objects in terms of their constituent shape features.
Although many other models of regions and images emphasizing gray-level properties (e.g. texture and color) have been proposed, solving the problem of spatial correspondence is often a prerequisite to their use. Thus, recognition of objects in a scene is always apt to involve constructing a shape description of the objects from sensed data and then matching that description to stored object models.
A shape representation scheme usually involves at least two components:

1. the primitive units, e.g. vertices (0-dimensional), curves and edges (1-dimensional), surfaces (2-dimensional, e.g. planar or quadratic surfaces) and volumes (3-dimensional, e.g. generalized cylinders);

2. the way those primitives are combined to form the object models.

Such a representation scheme must satisfy the following two criteria to be considered of good quality:

1. Near-Uniqueness: Ideally each object in the world should have a limited number of representations; otherwise, when image features are derived, there will also be many possible configurations for each representation, which increases the computational complexity of matching configurations of image features to model representations.

2. Continuity: Similar objects should have similar representations, and very different objects should have very different representations.
Scene Analysis

This process involves an algorithm for matching models against scene descriptions. It is the most crucial process in an object recognition system and is the focus of this thesis. It recovers the transformations that the model objects undergo during the image-formation process. Sometimes a quality measure can be associated with each model instance detected; this measure can serve as a measure of belief for further decision making in an autonomous system. We classify and review the basic matching techniques in chapter 2. The techniques we are interested in are those allowing partial occlusion, viewing distortion and data perturbation.

In this dissertation, we will focus on using the geometric hashing technique for matching. Geometric hashing serves very well as a filter capable of eliminating many candidate hypotheses as to the identity of the objects [38].
Hypothesis Verification

This process evaluates the quality of surviving hypotheses and either accepts or rejects them. Usually an object recognition system projects the hypothesized models onto the scene and computes the fraction of the model accounted for by the available scene signals. If this fraction is below a pre-set (usually empirically obtained) threshold, the hypothesis fails verification.
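The project-and-count scheme just described can be sketched as follows. This is only an illustrative sketch: the point-based model, the affine transform convention, and the tolerance and threshold values are assumptions made for the example, not parameters of the system described in this dissertation.

```python
import numpy as np

def verify_hypothesis(model_points, transform, scene_edgels, tol=2.0, threshold=0.7):
    """Accept a hypothesis if a large enough fraction of the projected
    model points is supported by nearby scene edgels.  `transform` is a
    3x3 affine matrix; `tol` and `threshold` are illustrative values."""
    # project model points into the scene with the hypothesized affine map
    projected = model_points @ transform[:2, :2].T + transform[:2, 2]
    supported = 0
    for p in projected:
        # an edgel within `tol` pixels counts as support for this model point
        if np.min(np.linalg.norm(scene_edgels - p, axis=1)) <= tol:
            supported += 1
    fraction = supported / len(projected)
    return fraction >= threshold, fraction

# toy check: a hypothesis that projects exactly onto scene edgels is accepted
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
accepted, fraction = verify_hypothesis(corners, np.eye(3), corners.copy())
```

A real system would verify against edge contours rather than isolated points, but the accept/reject logic is the same.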
1.2 Difficulties to Be Faced

What makes the visual recognition task difficult? Basically, there are four factors:

1. noise, e.g. sensor noise and bad lighting conditions;

2. clutter, e.g. industrial parts occluded by flakes generated in the milling process;

3. occlusion, e.g. industrial parts overlapping each other;

4. spurious data, e.g. the presence of unfamiliar objects in the scene.

The effort required to recognize and locate objects increases as these four factors become more serious. In dealing with them, we have to consider three problems [13]:

1. What kind of "features" should, and can reliably, be extracted from an image in order to adequately describe the spatial relations in a scene?

2. What constitutes an effective representation of these features and their relationships for an object model?

3. How should correspondences between scene features and model features be established in order to recover model instances in a complex scene?
1.3 The Hashing Approach

In order to recognize 3-D objects in 2-D images, we must perform the comparison either in 2-D or in 3-D. Since it is easier to project 3-D models onto a 2-D image than to reconstruct 3-D shapes from 2-D scenes, it is logical, in model-based vision, to perform the comparisons in image space. If we simply project the 3-D models onto image space during the recognition process, we must guess at the projection parameters and incur the computational expense of multiple projections at run time. If we precompute many projections, then we potentially substitute many 2-D models for relatively fewer 3-D models.

A more effective method of performing object recognition was proposed by Schwartz and Sharir [33]. The technique exploits the idea of hashing: by appropriately selecting a hash function, dictionary operations¹ can be performed in O(1) time on average. By hashing transformation invariants, instead of raw data subject to a viewing transformation, the hash function "hashes" to a fixed "bucket" of the hash table before and after whatever viewing transformations are allowed. Thus each object model can be represented by its constituent geometric features, appropriate subsets of which are hashed and pre-stored redundantly in the hash table. During recognition, the same hash operations are applied to a subset of scene features, and candidate matching model features are retrieved from the hash table to hypothesize their correspondence.

This hashing technique trades space for time. We may even pre-compute many projections, substituting many 2-D models for relatively fewer 3-D models.
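As a concrete illustration of the idea, here is the classic geometric-hashing scheme in miniature, using point features and similarity-invariant coordinates. (The dissertation itself uses line features and a weighted voting scheme; the point-based version and the quantization step `q` below are assumptions made only for this sketch.)

```python
import itertools
from collections import defaultdict

def basis_coords(p, b0, b1):
    """Coordinates of point p in the frame defined by the ordered basis
    pair (b0, b1); invariant under translation, rotation and scale."""
    ex = (b1[0] - b0[0], b1[1] - b0[1])          # basis x-axis
    ey = (-ex[1], ex[0])                          # perpendicular y-axis
    d = (p[0] - b0[0], p[1] - b0[1])
    s = ex[0] ** 2 + ex[1] ** 2                   # squared basis length
    return ((d[0] * ex[0] + d[1] * ex[1]) / s,
            (d[0] * ey[0] + d[1] * ey[1]) / s)

def build_table(models, q=0.25):
    """Preprocessing: hash every model point against every ordered basis
    pair, storing (model, basis) under the quantized invariant."""
    table = defaultdict(list)
    for name, pts in models.items():
        for b0, b1 in itertools.permutations(pts, 2):
            for p in pts:
                if p in (b0, b1):
                    continue
                u, v = basis_coords(p, b0, b1)
                table[(round(u / q), round(v / q))].append((name, (b0, b1)))
    return table

def probe(table, scene_pts, b0, b1, q=0.25):
    """Recognition: pick a scene basis pair, hash the remaining points,
    and tally votes for (model, basis) correspondences."""
    votes = defaultdict(int)
    for p in scene_pts:
        if p in (b0, b1):
            continue
        u, v = basis_coords(p, b0, b1)
        for entry in table.get((round(u / q), round(v / q)), []):
            votes[entry] += 1
    return max(votes.items(), key=lambda kv: kv[1]) if votes else None

# a single four-point model, and a scene that is a translated copy of it
models = {"m0": [(0, 0), (2, 0), (1, 2), (0, 1)]}
table = build_table(models)
scene = [(5, 5), (7, 5), (6, 7), (5, 6)]
best = probe(table, scene, (5, 5), (7, 5))   # probe with a correct basis pair
```

Because the stored coordinates are invariant, the probe recovers the model and the basis correspondence regardless of how the scene instance was translated, rotated or scaled; in practice many probes with different scene bases are tried, since any particular basis may fall on clutter or occluded features.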
1.4 Overview of This Dissertation

In this dissertation, we consider highly degraded intensity images containing multiple objects. We emphasize four techniques:

• use of an improved Hough transform to detect line features in a noisy image;

• use of geometric invariants derived from line features under various viewing transformations;

• use of line-feature statistics in a perturbation analysis of the computed invariants;

• use of Bayesian reasoning as the basis for line-feature matching.

¹ Consisting of "insert", "delete" and "member" operations [1].
1.4.1 Line Extraction

In noisy scenes, the locations of point features can be hard to detect. Line features are more robust and can be extracted by the Hough transform method with greater accuracy. We therefore choose lines as our primitive features.

In chapter 3, we first briefly review the Hough transform technique for detecting line features. We then point out several factors that adversely affect the performance of the method and propose several heuristics to cope with them. A series of experiments relating to this point is presented.
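The basic Hough line transform can be sketched as follows. Each edgel votes, for every quantized angle θ, for the line r = x cos θ + y sin θ through it; accumulator peaks correspond to detected lines. The bin counts and coordinate range here are arbitrary choices for illustration, and this plain accumulator omits the improvements developed in chapter 3.

```python
import numpy as np

def hough_lines(edgels, n_theta=180, r_max=100.0, n_r=200):
    """Minimal (theta, r) Hough accumulator; returns the strongest line."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, n_r), dtype=int)
    for x, y in edgels:
        r = x * np.cos(thetas) + y * np.sin(thetas)      # one r per theta
        r_bin = np.round((r + r_max) / (2 * r_max) * (n_r - 1)).astype(int)
        ok = (r_bin >= 0) & (r_bin < n_r)
        acc[np.arange(n_theta)[ok], r_bin[ok]] += 1      # cast votes
    ti, ri = np.unravel_index(np.argmax(acc), acc.shape)
    theta = thetas[ti]
    r = ri / (n_r - 1) * 2 * r_max - r_max               # bin center back to r
    return theta, r, acc

# edgels on the vertical line x = 10, whose normal form is theta = 0, r = 10
pts = [(10.0, float(y)) for y in range(50)]
theta, r, _ = hough_lines(pts)
```

The recovered r is accurate only up to the bin width, one of the quantization effects that motivates the refinements of chapter 3.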
1.4.2 Line Invariants

In chapter 4, we first examine the way in which coordinate changes act on the various spaces of potential interest for recognition. This allows us to define a method of encoding a line, using a combination of other lines, in a way that is invariant under suitable geometric transformations. The transformations considered include rigid, similarity, affine and projective transformations.
1.4.3 Effect of Noise on Line Invariants

In chapter 5, we model the statistical behavior of line parameters detected by the Hough transform in a noisy image using a Gaussian random process with mild assumptions. We analyze the statistics of the computed invariants and show that, to a first-order approximation, they have a Gaussian distribution.

Analytical formulae are given for the various transformation groups: rigid, similarity, affine and projective.
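The first-order Gaussian behavior can be illustrated numerically. The toy sketch below perturbs the θ parameters of two hypothetical lines and observes the spread of their angle difference, which is unchanged by rigid motions of the whole scene; the line parameters and the noise level σ are arbitrary assumptions, and the analytic spread functions of chapter 5 replace this kind of simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# two hypothetical lines in (theta, r) form; the angle difference
# theta2 - theta1 is invariant under rigid motions of the scene
theta1, theta2 = 0.3, 1.1
sigma = 0.01          # assumed std-dev of Hough angle estimates (radians)
n = 100_000

# perturb each line's angle independently and recompute the invariant
inv = (theta2 + rng.normal(0.0, sigma, n)) - (theta1 + rng.normal(0.0, sigma, n))

# the spread is Gaussian with mean 0.8 and std sqrt(2) * sigma
mean, std = inv.mean(), inv.std()
```

A difference of independent Gaussians is exactly Gaussian; for the nonlinear invariants treated in chapter 5, the Gaussian form holds only to first order in the perturbation.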
1.4.4 Invariant Matching with Weighted Voting

In chapter 6, we use the results of chapter 5 to formulate Bayesian maximum-likelihood pattern classification as the basis of a weighted voting scheme for matching line features by geometric hashing.

We have implemented a system that uses these ideas to perform object recognition, assuming affine approximations to more general perspective viewing transformations.
Both synthesized and real images were used in the experiments; the results are given in chapter 7. They show that the technique is noise resistant and usable in environments containing many occlusions.
Chapter 2

Prior and Related Work

Numerous techniques have been proposed for object recognition. A brief survey and classification of these techniques is given below. The techniques are not independent of one another; most vision systems combine several of them.

We divide the analysis into three sections. The first section examines object recognition using various classical schemes, and the remaining two sections discuss the background and existing work in the field of geometric hashing. In this thesis, we study the Hough transform methods discussed in section 2.1.3 and the geometric hashing described in section 2.2. Our work builds directly on the Bayesian weighted voting scheme of Rigoutsos, which we describe in section 2.3.
2.1 Basic Paradigms

2.1.1 Template Matching

Template matching involves matching an image against a stored representation and evaluating some fit function.

According to their flexibility, templates can be classified into four categories [14]:

• Total templates require an exact match between a scene and a template. Any displacement or orientation error of the pattern in the scene will result in rejection.

• Partial templates move a template across the scene, computing cross-correlation factors. Points of maximum cross-correlation are considered to be locations where the desired pattern appears; multiple matches against the scene are thus possible.

• Piece templates represent a pattern by its components. Usually the component templates are weighted by size and scored against a prototype list of expected features. This method is less sensitive to distortions of the pattern in the scene than the more limited techniques listed above. The subgraph matching technique described later can be viewed as an extension of this technique.

• Flexible templates are designed to handle deviations of the scene from prototypes. Starting with a good prototype of a known object, templates can be parametrically modified to obtain a better fit until no more improvement is obtained. (This technique has been successfully applied to the sorting of chromosome images.) A difficulty is that flexible approaches tend to be more time-consuming than rigid approaches.

Most of these template matching techniques apply only in two-dimensional cases and have little place in three-dimensional analysis, where perspective distortion comes into play. One can store a dense set of tessellations of possible views, but this results in enormous computational time for matching. For basic techniques of template matching, see [15].
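The partial-template strategy above can be sketched as a normalized cross-correlation scan; the tiny 8x8 scene and 2x2 template below are fabricated purely for illustration.

```python
import numpy as np

def normalized_cross_correlation(scene, template):
    """Slide `template` over `scene`, scoring each placement by normalized
    cross-correlation; the score maxima are candidate pattern locations."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.linalg.norm(t)
    H = scene.shape[0] - th + 1
    W = scene.shape[1] - tw + 1
    scores = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            w = scene[i:i + th, j:j + tw]
            wz = w - w.mean()                      # zero-mean window
            denom = np.linalg.norm(wz) * tn
            scores[i, j] = (wz * t).sum() / denom if denom > 0 else 0.0
    return scores

scene = np.zeros((8, 8))
template = np.array([[0.0, 1.0], [1.0, 0.0]])
scene[3:5, 4:6] = template                         # embed the pattern at (3, 4)
scores = normalized_cross_correlation(scene, template)
peak = np.unravel_index(np.argmax(scores), scores.shape)
```

An exact occurrence of the pattern scores 1.0; partial overlaps score lower, which is what makes multiple and approximate matches detectable by thresholding the score map.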
2.1.2 Hypothesis-Prediction-Verification

The hypothesis-prediction-verification approach is among the most commonly used in vision systems. It combines both bottom-up and top-down paradigms.

Hypotheses are generated bottom-up among structures in a geometric hierarchy: from structures of edges to curves, from structures of curves to surfaces, and from structures of surfaces to objects. Predictions work top-down from object models to images. Once a hypothesis has been made, predictions and measurements provide new information for identification.

Many such schemes use geometric features such as arcs, lines and corners to construct model representations. These features are usually portions of the object's boundary.

In the HYPER system [4], both model and scene are represented in the same way, by approximating the boundary with polygons. The ten longest segments are chosen as so-called privileged segments, the focus features used to detect a prospective
CHAPTER 2. PRIOR AND RELATED WORK
correspondence (hypothesis) between object models and the scene. Using this correspondence, a transformation can be computed and nearby features are sought (prediction). The privileged segments, together with their nearby features, are combined to compute a new transformation. Then the verification step follows.

Verification is often accomplished using an alignment method. Usually there is a trade-off between the hypothesis generation stage and the verification stage. If hypotheses are generated in a quick-and-dirty manner, then the verification stage requires more effort. If we want the verification stage to be less painstaking, then more reliable features have to be detected and more accurate hypotheses produced.
In the alignment method of Huttenlocher and Ullman [30,31], affine approximations to more general perspective transformations are considered, using alignments of triplets of points. Models are processed sequentially during recognition. For each model, an exhaustive enumeration of all the possible pairings of three non-collinear points of the model and the scene is exercised. Thus the alignment method relies heavily on verification. Once a transformation is determined by the correspondence of model and scene features, the model is transformed and aligned with the image. Verification of the entire edge contour, rather than just a few local feature points, is then performed to reduce the false alarm rate. To cope with the high computational cost of the verification stage, a hierarchical scheme is used: starting with a relatively simple and rapid check, they eliminate many false matches, and then conclude with a more accurate and slower check.
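The transformation-from-triplet step at the heart of such alignment methods can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation, and the function name is ours: a 2-D affine transform has six parameters, so three point correspondences determine it exactly.

```python
import numpy as np

def affine_from_triplet(model_pts, scene_pts):
    """Solve for the 2-D affine transform (A, t) mapping three model
    points onto three scene points, i.e. scene = A @ model + t."""
    # Build the 6x6 linear system from the three correspondences.
    M, b = [], []
    for (mx, my), (sx, sy) in zip(model_pts, scene_pts):
        M.append([mx, my, 0, 0, 1, 0])
        M.append([0, 0, mx, my, 0, 1])
        b.extend([sx, sy])
    p = np.linalg.solve(np.array(M, float), np.array(b, float))
    A = np.array([[p[0], p[1]], [p[2], p[3]]])
    t = np.array([p[4], p[5]])
    return A, t
```

Once (A, t) is known, the whole model contour can be transformed and superimposed on the image for the verification step described above.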
To sum up, the hypothesis-prediction-verification cycle relates scene images to object models and object models to scene images step by step, with refinement in each step. Note also that any scheme using the "hypothesis-prediction-verification" paradigm can be tailored to a parallel implementation, using massive computing power to generate a great number of hypotheses concurrently and verify them concurrently.
2.1.3 Transformation Accumulation
Transformation accumulation is also called pose clustering [50] or the generalized Hough transform [5]. It is characterized by a "parallel" accumulation of low-level pose evidence, followed by a clustering step which selects the pose hypotheses with strong support from the set of evidence.
This method can be viewed as the inverse of template matching, which moves the model template around all the positions of the scene and directly computes a value (usually cross-correlation) as a measure of matching. Transformation accumulation, instead of trying all possible positions of the model in the scene, computes which positions are consistent with model features and scene features. Moreover, its use of all of the evidence without regard to its order of arrival can be advantageous when there are occlusions.
The features used to generate pose hypotheses can be of high or low level. If the level of the features used is high enough, much less computational effort is required for the accumulation stage. However, the trade-off is that higher-level features are usually more difficult to extract.
The major problem of this technique usually lies in ensuring that all visible hypotheses are considered within acceptable limits of storage and computational resources. However, as computer power continues to grow while its cost continues to drop, this problem has been growing steadily less significant.
In the generalized Hough transform [5], the transformation between a model and the scene is usually described by a set of transformation parameters. For example, four parameters are needed for the case of two-dimensional similarity transformations: two parameters for translation, one parameter for rotation, and one parameter for scaling. Each transformation parameter is quantized in the so-called Hough space. The scene features vote for those parameters which are consistent with the pairings of these scene features and hypothesized model features. One problem of the generalized Hough transform is the size of the Hough space. In two-dimensional cases, we face a three-dimensional Hough space for rigid transformations, a four-dimensional Hough space for similarity transformations, and a six-dimensional Hough space for affine transformations.
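The voting idea can be sketched for the simplest case, a pure 2-D translation, where each pairing of a scene feature with a model feature is consistent with exactly one translation. This is a minimal illustration, not the formulation of [5]; the function name, extent and bin counts are ours.

```python
import numpy as np

def pose_vote_translation(model_pts, scene_pts, bins=32, extent=100.0):
    """Accumulate votes for a pure 2-D translation: every pairing of a
    scene point with a model point is consistent with exactly one
    translation, which votes in a quantized 2-D Hough space."""
    acc = np.zeros((bins, bins), int)
    cell = 2 * extent / bins
    for sx, sy in scene_pts:
        for mx, my in model_pts:
            tx, ty = sx - mx, sy - my  # translation consistent with this pairing
            i = int((tx + extent) // cell)
            j = int((ty + extent) // cell)
            if 0 <= i < bins and 0 <= j < bins:
                acc[i, j] += 1
    # The strongest cluster is the pose hypothesis with most support.
    return np.unravel_index(acc.argmax(), acc.shape), acc.max()
```

Correct pairings all vote for the same cell, so the true translation stands out as a peak even when spurious pairings and noise points scatter single votes elsewhere.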
Tucker et al. [53] use local boundary features to constrain an object's position and orientation, which is then used as the basis for hypothesis generation. Their system takes advantage of the highly parallel processing power of the Connection Machine [26] to generate numerous transformation hypotheses concurrently and to verify them concurrently. The number of processors required for each model is equal to the number of features of the model. Each scene feature participates independently, in parallel, in each processor to match model features for generating transformation hypotheses, which, once verified, are used to accumulate evidence voting for the pose transformation of the object.
Thompson and Mundy [52] describe a system for locating objects in a relatively unconstrained environment. The availability of a three-dimensional surface model of a polyhedral object is assumed. The primitive feature used is the so-called vertex-pair, which consists of two vertices: one is characterized by its position coordinates; the other, in addition to its position coordinates, includes the two edges that define the vertex. This feature serves as the basis for computing the affine viewing transformation from the model to the scene. Through voting in the transformation space, candidate transformations are selected.
A common critique of this paradigm is that, when the scene is noisy, the accumulation of "evidence" contributed by random noise can result in false alarms. Grimson and Huttenlocher [22] give a formal analysis of the likelihood of false positive responses of the generalized Hough transform for object recognition. However, we can use this paradigm as an early stage of processing (i.e., as a filter), followed by a careful verification stage.
2.1.4 Consistency Checking and Constraint Propagation
Many model-based vision schemes are based on searching the set of possible interpretations, which is usually combinatorially large. As in other areas of artificial intelligence, making use of large amounts of world knowledge can often lead not only to increased robustness but also to a reduction in the search space that must be explored during the process of interpretation. It is usually possible to analyze a number of constraints or consistency conditions that must be satisfied to make a correct interpretation. Effective application of consistency checking or propagation of constraints during searching can often prune the search space greatly.
Lowe [41] emphasizes the importance of the viewpoint consistency constraint, which requires that the locations of all object features in an image be consistent with projection from a single viewpoint. The application of this constraint allows the spatial information in an image to be compared with prior knowledge of an object's shape to the full degree of available image resolution. Lowe also argues that viewpoint consistency plays a central role in most instances of human visual recognition.
Grimson [20] extended his previous work RAF [23] to handle some classes of parameterized objects. He approaches the recognition problem as a search problem using the so-called interpretation tree. Since this search is inherently an exponential process, he analyzed a set of geometric constraints, based on the local shape of parts of objects, to prune large subtrees from consideration without having to explore them.
The relaxation labeling technique (e.g. [51,54]) is yet another example of this type. Many visual recognition problems can be viewed as constraint-satisfaction problems. For example, when labeling a block-world picture, a relaxation algorithm iteratively assigns values to mutually constrained objects, using local information alone, in such a way as to ensure a consistent set of values for which no constraints are violated. The values assigned to the objects by relaxation algorithms can be discrete or probabilistic. In the former case, discrete labels are assigned to objects, while in the latter case a level of certainty (or probability) is attached to each label. Relaxation algorithms can proceed in a parallel-iterative manner, propagating matching constraints in parallel.
2.1.5 Sub-Graph Matching
Both object models and scene images can be expressed by graphs of nodes and arcs. Nodes represent the features (usually geometric structures) detected, and arcs represent the geometric relations between these features, e.g. [11]. The task then reduces to finding an embedding or fitting of a model in a description of the scene.

Since a model contains both local and relational features in the form of a graph, matching depends not only on the presence of particular boundary features but also on their interrelations (e.g. distance). The requirement for successful recognition is that a sufficient set of key local features be visible and in correct relative positions, which may allow for a specified amount of distortion. For example, Bolles and Cain [10] proposed the use of a hierarchy of local focus features for model representation. Once the local focus features have been detected, nearby features are predicted and searched for in the scene. If the best focus features are occluded, the second-best focus features are used.
An advantage of this technique is its robustness to relative occlusion and distortion. Computational inefficiency can be a drawback, since graph-matching or subgraph-isomorphism procedures have to be executed. Effective strategies to reduce the time complexity are therefore required. In [10], once an occurrence of the best focus feature is located, the system simply runs down the appropriate list of nearby features and a transformation is computed, followed by a verification step. Thus exploration of the whole relational graph is avoided. Barrow and Tenenbaum [6] propose the use of a hierarchical graph-searching technique which decomposes the model into independent components.
2.1.6 Evidential Reasoning
Evidential reasoning is a method for combining information from different sources of evidence to update probabilistic expectations. The attractive feature of using evidential reasoning for computer vision is that it allows us to combine information of varying reliability from many sources, even though no particular item of evidence is by itself strong enough for recognizing a particular object.
For example, SCERPO by Lowe [40] is a search-based matching system. The search space of SCERPO involves two major components: (1) the space of possible viewpoints; (2) the space of selecting one model from the model database. Lowe tackles the first component by perceptual organization and suggests that the second component be treated by a technique of evidential reasoning. Each piece of evidence contributes to the presence of a certain object or objects, which can be described quantitatively by a probabilistic expectation. The combining of evidence can thus be described quantitatively by combining probabilistic expectations. The probability ranking is constantly updated to reflect new evidence found. For example, as soon as one object is recognized, it provides contextual information which updates the rankings and aids the search process. Ranking and updating may take time; however, as the list of possible objects increases, the cost of the evidential reasoning may be amortized and its use becomes more important.
To sum up, evidential reasoning may deserve serious consideration for incorporation into a system under development. It shows promise for carrying out the objective of combining many sources of information, including color, texture, shape and prior knowledge, in a flexible way to achieve recognition.
2.1.7 Miscellaneous

Dual Models
If the partial-visibility problem is to be attacked, a recognition system using global features alone cannot be feasible. Only local features can be used in the matching procedure, in order to cope with overlapping parts and occlusions. One is free to choose different local features such as points, lines, curves, etc., or their combinations, as the matching features, according to the nature of the objects in the model base. However, two factors of representation, completeness and efficiency, have to be taken into consideration. Completeness of representation calls for features rich enough to enable discrimination between similar objects, while efficiency suggests using some minimal characteristic information to match models against the scene (so that the inherent complexity of the matching algorithm will be small).
A main drawback of representation by local features can be its incompleteness, since objects usually cannot be fully described by a set of local features alone. One way to cope with this problem is to use both global and local feature representations: local features are used in the matching procedure, while the additional complete representation is used, after the matching process, for verification and further refinement of the object localization [4,10,19,31,36].
Dual Scenes
Object recognition algorithms usually fall into two categories: those using intensity images and those using range data. However, we may use both of them when it helps. For example, Kishon [34] combines the use of both range and intensity data to extract 3-D curves from a scene. These curves are of much higher quality than if they were extracted from the range data alone. The combined information is also used to classify the 3-D curves as shadow, occluding, fold or painted curves.
Abstracted Features
An abstracted feature is a feature which does not physically exist, but is inferred. For example, in preparing a model representation for polygonal objects, we may extend non-neighboring edges to obtain their intersection and use the contained angle as a feature. Some more advanced applications of abstracted features include the footprint of concavity [37], the Fourier descriptor used for silhouette description of aircraft in [2], and the representation of turning angle as a function of arc length used in [3,19,49].
Normalization<br />
There are various reasons for the use of normalization.

In some cases, it is used to preserve certain properties. For example, in the probabilistic relaxation labeling scheme, the accumulated contributions from neighboring points have to be normalized so that the updating rule results in a probability (i.e., a value in [0..1]).
In other cases, it can be viewed as a technique for removing some uncertain factors. For example, suppose we use a Fourier descriptor to describe the boundary of an object. In order for such a representation to be insensitive to variations such as changes in size, rotation, translation and so on, we have to perform normalization operations such that the contour has a "standard" size, orientation and starting point. By normalizing the Fourier component F(1) to have unit magnitude, we remove the factor of scaling variation. Similarly, the (s-θ) graph (the representation of turning angle as a function of arc length) has to be shifted vertically, by adding an offset to θ, so that the reference point on the contour takes a standard value, such as zero. Since the rotation of a contour in Cartesian space corresponds to a simple shift in the ordinate (θ) of the s-θ graph, such a normalization process removes the factor of rotation [19].
Other good reasons to use normalization involve making some operations easier to perform. For example, in performing point-set matching, Hong and Tan [27] first normalize both point sets to their canonical forms; then the canonical forms are matched. After normalization, the matching between two canonical forms reduces to a simple rotation of one to match the other.
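The scale and translation normalizations described above for Fourier descriptors can be sketched as follows. This is a minimal illustration in our own notation; rotation and starting-point normalization are omitted.

```python
import numpy as np

def normalized_fourier_descriptor(contour, n_coeffs=8):
    """Fourier descriptor of a closed contour, normalized so that
    F(0) is dropped (removes translation) and |F(1)| = 1 (removes scale)."""
    z = np.asarray([complex(x, y) for x, y in contour])
    F = np.fft.fft(z) / len(z)
    F[0] = 0.0              # discard the DC term: translation invariance
    F = F / abs(F[1])       # unit |F(1)|: scale invariance
    return F[1:n_coeffs + 1]
```

After this normalization, two contours differing only in size and position yield the same descriptor vector.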
2.2 An Overview of Geometric Hashing

2.2.1 A Brief Description
The geometric hashing idea has its origins in work of Schwartz and Sharir [49]. Application of the geometric hashing idea to model-based visual recognition was described by Lamdan, Schwartz and Wolfson. This section outlines the method; a more complete description can be found in Lamdan's dissertation [36]; a parallel implementation can be found in [46].
The geometric hashing method proceeds in two stages: a preprocessing stage and a recognition stage. In the preprocessing stage, we construct a model representation by computing and storing redundant, transformation-invariant model information in a hash table. During the subsequent recognition stage, the same invariants are computed from features in a scene and used as indexing keys to retrieve from the hash table the possible matches with the model features. If a model's features score sufficiently many hits, we hypothesize the existence of an instance of that model in the scene.
The Pre-processing Stage

Models are processed one by one. New models added to the model base can be processed and encoded into the hash table independently. For each model M and for every feasible basis b consisting of k points (k depends on the transformations the model objects undergo during formation of the class of images to be analyzed), we

(i) compute the invariants of all the remaining points in terms of the basis b;

(ii) use the computed invariants to index the hash table entries, in each of which we record a node (M, b).

Note that all feasible bases have to be used. In particular, all the permutations (up to k!) of the k inputs used to calculate the invariants have to be considered. The complexity of this stage is O(m^(k+1)) per model, where m is the number of points extracted from a model. However, since this stage is executed off-line, its complexity is of little significance.
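For the affine case (k = 3) with point features, the preprocessing stage can be sketched as below. This is an illustrative reconstruction, not the thesis implementation; the function names, the rounding used as quantization, and the collinearity test are ours.

```python
import numpy as np
from itertools import permutations
from collections import defaultdict

def affine_invariants(basis, pts):
    """Coordinates (a, b) of each point in the frame of an ordered basis of
    three non-collinear points: p - p0 = a*(p1 - p0) + b*(p2 - p0).
    These coordinates are invariant under any 2-D affine transformation."""
    p0, p1, p2 = (np.asarray(p, float) for p in basis)
    B = np.column_stack([p1 - p0, p2 - p0])
    return [tuple(np.round(np.linalg.solve(B, np.asarray(p, float) - p0), 1))
            for p in pts]

def build_hash_table(models):
    """Preprocessing: for every model and every ordered triple of its
    points, store the node (model, basis) under the invariants of the rest."""
    table = defaultdict(list)
    for name, pts in models.items():
        for basis in permutations(range(len(pts)), 3):
            b = [pts[i] for i in basis]
            if abs(np.linalg.det(np.column_stack(
                    [np.subtract(b[1], b[0]), np.subtract(b[2], b[0])]))) < 1e-9:
                continue  # skip collinear (infeasible) bases
            rest = [pts[i] for i in range(len(pts)) if i not in basis]
            for key in affine_invariants(b, rest):
                table[key].append((name, basis))
    return table
```

The loop over all ordered triples makes the O(m^4) cost for k = 3 explicit, and the redundancy it stores is what later allows recognition from a partially occluded scene.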
The Recognition Stage<br />
Given a scene with n feature points extracted, we

(i) choose a feasible set of k points as a basis b;

(ii) compute the invariants of all the remaining points in terms of this basis b;

(iii) use each computed invariant to index the hash table and hit all (M_i, b_j)'s that are stored in the entry retrieved;

(iv) histogram all (M_i, b_j)'s with the number of hits received;

(v) establish a hypothesis of the existence of an instance of model M_i in the scene if (M_i, b_j), for some j, peaks in the histogram with sufficiently many hits;

and repeat from step (i) if all hypotheses established in step (v) fail verification.

The complexity of this stage is O(n) + O(t) per probe, where n is the number of points extracted from the scene and t is the complexity of verifying an object instance.
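Steps (iii)-(v) amount to a voting loop over the prebuilt hash table. A minimal sketch, in our own names, assuming the table maps quantized invariant keys to lists of (model, basis) nodes:

```python
from collections import Counter

def vote(table, scene_keys, threshold):
    """Each scene invariant indexes the hash table and casts one vote for
    every (model, basis) node stored there; nodes whose histogram count
    reaches the threshold become recognition hypotheses."""
    hist = Counter()
    for key in scene_keys:
        for node in table.get(key, []):
            hist[node] += 1
    return [(node, hits) for node, hits in hist.most_common() if hits >= threshold]
```

Any surviving hypotheses are then passed to verification; if none survive, a new basis is chosen and the probe repeats.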
2.2.2 Strengths and Weaknesses
At a glance, geometric hashing seems similar to the transformation accumulation techniques discussed in section 2.1.3. However, the similarity lies only in the use of "accumulating evidence" by voting. The techniques discussed in section 2.1.3 accumulate "pose" evidence, while geometric hashing accumulates "feature correspondence" evidence. The former always votes for parameters of transformations, while the latter votes for (model identifier, basis set) pairs, from which transformation parameters can be computed once the correspondence of a scene feature set and a model basis set is hypothesized.
Strengths<br />
Most of the methods discussed in section 2.1 are search-based: model features are searched to match scene features, and this search process goes through each model in the model base sequentially. For example, the interpretation tree technique by Grimson [23] has inherent exponential complexity, pairing each scene feature to each model feature combinatorially. Geometric hashing greatly accelerates search of the model base by using a hashing technique to select candidate model features to match scene features. This process is at worst sublinear in the size of the model base.

Like the transformation accumulation techniques, geometric hashing's voting scheme copes well with occlusion and with the fragility of existing image segmentation techniques. The accumulation of evidence without regard to order makes parallel implementation easy.
Weaknesses<br />
Errors in feature extraction commonly lead to perturbation of invariants and degradation of recognition performance, since perturbed invariant values are used to index the hash table to retrieve candidate model features. Performance is similarly degraded by quantization noise introduced in constructing the hash table. Thus the use of point features does not seem viable in a seriously degraded image; see [21].

The inherently non-uniform distribution of computed invariants in the hash table results in non-uniform hash bin occupancy for a uniformly quantized hash table [47]. This degrades the index selectivity of the hashing scheme used.
2.2.3 Geometric Hashing Systems
Since the introduction of the geometric hashing method by Wolfson and Lamdan, a number of subsequent applications and improvements have been developed at the Robotics Research Laboratory of New York University and at other research laboratories.

Gavrila and Groen [18] apply the hashing technique to the recognition of 3-D objects from single 2-D images. They determine limits of discriminability by experiments to generate viewpoint-centered models.

Gueziec and Ayache [24] address the problem of fast rigid matching of 3-D curves in medical images. They incorporate six differential invariants associated with the surface and introduce an original table, in which they hash values for the six transformation parameters.

Hummel and Wolfson [29] discuss both object matching and curve matching by geometric hashing.

Flynn and Jain [17] use invariant features and the hashing technique to generate hypotheses without resorting to a voting procedure. They conclude that this method is more efficient than a constrained search technique such as the interpretation tree technique [23].

Recently, the work of Rigoutsos has developed geometric hashing by incorporating a Bayesian model for weighted voting, which we describe in the next section.
2.3 The Bayesian Model of Rigoutsos
The thesis work of Rigoutsos [43] from 1992 considerably extends the capability of the geometric hashing scheme.

To cope with the problem caused by the positional uncertainty of point features, a Gaussian distribution is used to model the perturbation of the coordinates of a point. That is, the measured values are assumed to be distributed according to a Gaussian distribution centered at the true values and having standard deviation σ. More precisely, let (x_i, y_i) be the "true" location of the i-th feature point in the scene. Let also (X_i, Y_i) be the continuous random variables denoting the coordinates of the i-th feature. The joint probability density function of X_i and Y_i is then given by:

f(X_i, Y_i) = (1 / (2πσ²)) exp(−[(X_i − x_i)² + (Y_i − y_i)²] / (2σ²)).
Using this error model, he formulates a weighted voting scheme for the evidence accumulation in geometric hashing. He uses a weighted contribution to the model/basis hypothesis of an entry μ in the hash space, based on a scene point's hash h in the same space, using a formula of the form

log[1 − c + A exp(−(1/2)(h − μ) Σ⁻¹ (h − μ)ᵗ)],

where c and A are constants that depend on the scene density, and Σ is the covariance matrix, in hash-space coordinates, of the expected distribution of h based on the error model for (x, y), the feature point in the scene.
He shows that when weighted voting is used according to the above formula, the evidence accumulation can be interpreted as a Bayesian a posteriori classification scheme. He shows that the accumulations are related to

log[Prob(H_k | E_1, ..., E_n)],

where the hypothesis H_k represents the proposition that a certain model/basis occurs in the scene and matches the chosen basis, and the E_i's are pieces of evidence given by scene invariants.
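Such a weighted vote can be sketched as follows (our own notation: mu is the hash-table entry location and h the scene point's hash; the values of c and A are illustrative placeholders, not those of [43]):

```python
import numpy as np

def weighted_vote(h, mu, Sigma, c=0.1, A=5.0):
    """Weighted contribution of a scene hash h to the hypothesis stored at
    hash-space location mu, following the log[1 - c + A exp(...)] form.
    c and A are constants depending on scene density (illustrative here)."""
    d = np.asarray(h, float) - np.asarray(mu, float)
    maha = d @ np.linalg.inv(np.asarray(Sigma, float)) @ d  # Mahalanobis distance
    return np.log(1.0 - c + A * np.exp(-0.5 * maha))
```

A hash landing exactly on the entry contributes log(1 − c + A), the largest possible weight, while a distant hash contributes roughly log(1 − c), a small negative penalty rather than a hard miss.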
The above formulation was implemented on an 8K-processor Connection Machine (CM-2) and can recognize objects that have undergone a similarity transformation, from a library of 32 models. The models used for experiments are military aircraft and production automobiles. Features are extracted by using the Boie-Cox edge detector [9] and locating the points of high curvature. The features are the coordinate pairs of these points [44].
In this thesis, we also make use of a Bayesian a posteriori maximum likelihood recognition scheme based on geometric hashing. We use somewhat simplified versions of the formulae given by Rigoutsos. We make use of line features as opposed to point features. However, we also assume a Gaussian perturbation model of the feature values, which in our case implies an independent perturbation of the (θ, r) variables. Since no reliable segmentation is assumed to be available, we extract our line features by means of an improved Hough transform technique that operates in seriously degraded environments. Objects modeled by lines that have undergone affine transformations are used for experiments. A viewpoint consistency constraint is also applied to further filter out false alarms.
Chapter 3<br />
Noise in the Hough Transform<br />
This chapter overviews the effect of noise on the Hough transform, which is used to detect line features from an edge map.

We use a series of simulations to estimate the reliability and accuracy of the Hough technique. By reliability we mean its ability to detect lines while coping with occlusions and additive noise; by accuracy we mean the deviation of detected line parameters from their true values. Although much has been published on this question and theoretical analyses have been given (see [22]), important questions on these two issues remain open and simulation remains a revealing technique.
In section 3.1, we briefly review the history and technique of the Hough transform. Section 3.2 describes the factors that affect the performance of the Hough transform and the ways it can be improved. A series of simulations on images, covering a range of line lengths, line orientations and line positions under increasing noise levels (both positive and negative noise), has been performed. Section 3.3 shows our experimental results, which are summarized in section 3.4.
3.1 The Hough Transform

The well-known Hough technique was introduced by Paul Hough in a US patent filed in 1962. Hough's initial application was the analysis of bubble chamber photographs of particle tracks; such images contain a large amount of noise. Hough proposed to use amplifiers, delays, signal generators, and so on to perform what we now call the Hough transform in an analog form.
A common digital implementation of the Hough transform applies the normal parameterization suggested by Duda and Hart [15] in the form
x cos θ + y sin θ = r,
where r is the perpendicular distance of the line to the origin and θ is the angle between a normal to the line and the positive x-axis.
This normal parameterization has several advantages: r and θ vary uniformly as the line orientation and position change, and neither goes to infinity as the line becomes horizontal or vertical. It is easy to see that the points on a particular line all map to sinusoids that intersect in a common point in Hough space, and the coordinates (θ, r) of that intersection point give the parameters of the line.
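The claim that collinear points map to sinusoids meeting at a common (θ, r) can be checked numerically; the following is our own sketch, not code from the dissertation:

```python
import math

# Three collinear points on the line x cos(30°) + y sin(30°) = 50.
theta_true = math.radians(30.0)
r_true = 50.0
points = []
for t in (-40.0, 0.0, 35.0):  # displacement along the line direction
    # Foot of the normal plus a displacement along the line.
    x = r_true * math.cos(theta_true) - t * math.sin(theta_true)
    y = r_true * math.sin(theta_true) + t * math.cos(theta_true)
    points.append((x, y))

# Each point (x, y) traces the sinusoid r(θ) = x cos θ + y sin θ.
# At θ = theta_true all three sinusoids yield the same r.
rs = [x * math.cos(theta_true) + y * math.sin(theta_true) for x, y in points]
assert all(abs(r - r_true) < 1e-9 for r in rs)
```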
In standard implementations of the Hough method, Hough space is quantized. Each (x_i, y_i) is mapped to a sampled, quantized sinusoid. The detailed algorithm is as follows:
1. Quantize the 2-D Hough space (i.e., parameter space) between 0° and 180° for θ and between −N/√2 and +N/√2 for r (N × N is the image size).
2. Form an accumulator array A[θ][r], whose buckets are initialized to 0.
3. For each edgel (x, y) in the image, increment all buckets in the accumulator array along the appropriate sinusoid, i.e.,
A[θ][r] := A[θ][r] + 1
for θ and r satisfying r = x cos θ + y sin θ within the limits of the quantization.
4. Locate, in the accumulator array, the local maxima, which correspond to collinear edgels in the image (the values of the accumulator array provide a measure of the number of edgels on the line).
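The steps above can be sketched as a toy implementation (ours, not the dissertation's code; the origin is assumed at the image center so that r ranges over ±N/√2):

```python
import math
import numpy as np

def hough_lines(edgels, n, d_theta=1.0, d_r=1.0):
    """Standard Hough accumulator for an n-by-n edge map.

    edgels: iterable of (x, y) edge pixels in centered coordinates.
    theta is sampled in [0, 180) degrees; r in [-n/sqrt(2), +n/sqrt(2)] pixels.
    """
    n_theta = int(round(180.0 / d_theta))
    r_max = n / math.sqrt(2.0)
    n_r = int(round(2.0 * r_max / d_r)) + 1
    acc = np.zeros((n_theta, n_r), dtype=int)
    thetas = np.radians(np.arange(n_theta) * d_theta)
    for x, y in edgels:
        # One vote per sampled theta along the sinusoid r = x cos θ + y sin θ.
        rs = x * np.cos(thetas) + y * np.sin(thetas)
        r_idx = np.round((rs + r_max) / d_r).astype(int)
        ok = (r_idx >= 0) & (r_idx < n_r)
        acc[np.arange(n_theta)[ok], r_idx[ok]] += 1
    return acc, thetas, r_max
```

For a horizontal line y = 10, the peak bucket falls at θ = 90° with one vote per edgel, illustrating step 4.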
We note that, due to the quantization of Hough space and noise in the image, sinusoids generated by points of the same line do not in general intersect precisely at a common point in Hough space. Much of the literature addresses this problem by suggesting additional processing (see [32] for a survey).
3.2 Various Hough Transform Improvements
Since the straightforward implementation of the Hough transform described above needs improvement in highly degraded environments [22], we have investigated the factors affecting the performance of the technique, and have implemented and experimented with a series of variations of the Hough transform.
Factors that adversely affect the performance (reliability and accuracy) of the Hough transform include:
• Short lines inherently result in low peaks in Hough space H(θ, r), which hinders the detection of short lines relative to long lines.
• When r = x cos θ + y sin θ is rounded to pick a particular bucket in H(θ, r) for a specific θ, the fractional r information is lost and all subsequent computations are done on the rounded r values.
• θ is also sampled. If a line has an orientation θ0 not exactly equal to any sampled θ, the points on the line, when mapped to Hough space, will not have identical r values; instead, their r values will spread out near the true r value.
• Using the coordinates (θ, r) of the peak in Hough space as the line parameters limits the precision of the parameters detected, since Hough space is quantized.
To strengthen the low peaks in Hough space which result from the short lines present in a source image (compared to the inherently high peaks which result from long lines), we apply background subtraction: once a long line has been detected, the contribution its edgels made to Hough space is removed. This method is due to Risse [48]. To implement the idea, a global maximum in Hough space, say bucket (θ, r), is located; then its corresponding line in the image is identified, and all accumulator buckets which were incremented while processing edgels lying along this line are decremented. This algorithm differs from the algorithm described in the previous section in that, instead of detecting local maxima in the accumulator array as candidate lines in the image, it detects lines one after another, iteratively, as follows:
while still more lines to be detected do
    (θ̂, r̂) := globalMax(accumulator A)
    for each edgel (x, y) on line x cos θ̂ + y sin θ̂ = r̂ do
        A[θ][r] := A[θ][r] − 1, for θ and r satisfying r = x cos θ + y sin θ
    end for
end while
Starting with this algorithm as the backbone, we improve it with weighted voting to compensate for the rounding error of r, and with sub-bucket precision computation, as described below.
To compensate for the rounding error of r, we distribute the Hough vote-increment value (typically 1) over a region in Hough space. For example, suppose r_exact = x cos θ + y sin θ (i.e., before rounding) and r_a and r_b are consecutive sampled r values in Hough space such that r_a ≤ r_exact < r_b. Then we increment H(r_a, θ) by an amount proportional to r_b − r_exact and increment H(r_b, θ) by an amount proportional to r_exact − r_a. That is, we distribute the increment linearly over the two adjacent buckets.
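The linear vote splitting can be sketched as follows (our own helper; the function and parameter names are not from the dissertation):

```python
def split_vote(acc_row, r_exact, r_min, d_r, weight=1.0):
    """Distribute one Hough vote linearly over the two buckets bracketing r_exact.

    acc_row: mutable 1-D accumulator slice for a fixed theta.
    r_min, d_r: define the sampling r_k = r_min + k * d_r.
    """
    pos = (r_exact - r_min) / d_r
    a = int(pos)                           # bucket index with r_a <= r_exact
    frac = pos - a                         # how far r_exact sits toward r_b
    acc_row[a] += weight * (1.0 - frac)    # proportional to r_b - r_exact
    if a + 1 < len(acc_row):
        acc_row[a + 1] += weight * frac    # proportional to r_exact - r_a
```

A vote at r_exact = 0.25 between buckets r_a = 0 and r_b = 1 thus contributes 0.75 to the first bucket and 0.25 to the second.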
To compensate for the spreading of votes caused by the quantization of θ and r, we find the bucket (θ, r) receiving the maximum number of votes, take a small window (3 by 3 in our implementation) centered on this bucket, and compute the center of mass over this small window to achieve sub-bucket precision in line parameter estimation.
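The center-of-mass refinement can be sketched as follows (a simplified version of the idea, with border handling by clipping; names are ours):

```python
import numpy as np

def subpixel_peak(acc, ti, ri, half=1):
    """Refine a Hough peak (ti, ri) by a center of mass over a (2*half+1)^2 window.

    Returns fractional (theta_index, r_index); multiplying by the quantization
    steps gives sub-bucket (theta, r) estimates.
    """
    t0, t1 = max(ti - half, 0), min(ti + half + 1, acc.shape[0])
    r0, r1 = max(ri - half, 0), min(ri + half + 1, acc.shape[1])
    win = acc[t0:t1, r0:r1].astype(float)
    total = win.sum()
    if total == 0:
        return float(ti), float(ri)
    tg, rg = np.mgrid[t0:t1, r0:r1]        # index grids over the window
    return (tg * win).sum() / total, (rg * win).sum() / total
```

Two equal adjacent peaks at r-indices 5 and 6, for instance, refine to a fractional r-index of 5.5.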
Experiments on seriously degraded images show that this method is superior to other variations of the Hough transform, in the sense that more true lines and fewer spurious lines are detected, and the accuracy of the estimated line parameters is quite good.
To further reduce the number of spurious lines detected, we note that the Hough transform detects lines in a very global way. More precisely, it takes the number of edgels which happen to lie on the same line, however sparsely or densely distributed, as the evidence for the existence of a line. This differs from human visual recognition, which exhibits grouping by proximity.
To further enhance our algorithm, we therefore apply grouping by proximity, as follows.
To detect n lines in an image, we
1. apply the algorithm described above to detect a pre-specified number, say 2n, of lines;
2. for each line (θ_i, r_i) detected,
   (a) go back to the image (edge map) and scan the edgels (x, y) on the line with a 1-D sliding window of appropriate size (depending on the image resolution);
   (b) if the number of edgels in the window is less than some fraction, say 1/3, of the window size, take (x, y) to be a noise edgel which happens to lie on the line (θ_i, r_i);
   (c) re-accumulate the evidence supporting the line without counting the noise edgels;
3. sort these lines by their new evidence support and select the top n from among them.
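Step 2 of the procedure above can be sketched as follows (our own encoding of the scan: positions along the line are flagged 1 if an edgel is present; the window size and 1/3 threshold follow the text):

```python
def proximity_support(on_line_flags, window=9, min_fill=1.0 / 3.0):
    """Re-score a detected line using grouping by proximity.

    on_line_flags[i] is 1 if the i-th pixel position along the line is an
    edgel, else 0. An edgel counts toward the line only if the sliding window
    centered on it is at least min_fill full, so isolated noise edgels that
    happen to lie on the line are discarded.
    """
    half = window // 2
    support = 0
    for i, flag in enumerate(on_line_flags):
        if not flag:
            continue
        lo, hi = max(i - half, 0), min(i + half + 1, len(on_line_flags))
        if sum(on_line_flags[lo:hi]) >= min_fill * (hi - lo):
            support += 1
    return support
```

A dense run of edgels keeps its full support, while edgels spaced far apart along the line contribute nothing.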
The reason we simply select the top n from among the 2n lines detected by our previous algorithm is that the previous algorithm already performs well; we just want to remove the remaining spurious lines.
Figure 3.1 shows an example comparing the results of the standard Hough technique and our improved Hough technique. Figure 3.1(a) shows a synthesized image without noise. It contains roughly 40 segments. Figure 3.1(b) shows the result of the standard Hough analysis: 50 lines are detected, 17 true lines are missing, and many of the true lines are detected multiple times. Figure 3.1(c) shows the result of applying our improved Hough technique without the proximity-grouping enhancement. Again, 50 lines are detected. No true lines are missing; yet, since more lines are detected than are actually present in the source image, a few true lines are detected multiple times. Figure 3.1(d) shows the result of applying our improved Hough technique with the proximity-grouping enhancement.
Figure 3.2(a) shows the same image as Figure 3.1(a), but with serious noise. Figures 3.2(b) to 3.2(d) correspond to Figures 3.1(b) to 3.1(d). Here the improvement over the standard Hough technique is much more obvious.
3.3 Implementation and Measured Performance<br />
In our implementation, we use æç = 1 è1 degreeè and ær = 1 è1 pixelè as the quantization<br />
values for ç and r in Hough space. This appears quite suæcient for our subsequent applications.<br />
Intuitively, smaller values of æç might give better precision but can result in the<br />
spreading of peaks èrecall that image is digitized so that the edgels of a line will not lie
(a) A test image without noise. It contains roughly 40 segments.
(b) 50 lines are detected using the standard Hough transform.
(c) 50 lines are detected using our improved Hough transform without proximity grouping.
(d) 50 lines are detected using our improved Hough transform with proximity grouping.
Figure 3.1: A comparison of the results of the standard Hough technique and our improved Hough technique, when applied to an image without noise.
(a) A test image with serious noise. It contains roughly 40 segments.
(b) 50 lines are detected using the standard Hough transform.
(c) 50 lines are detected using our improved Hough transform without proximity grouping.
(d) 50 lines are detected using our improved Hough transform with proximity grouping.
Figure 3.2: A comparison of the results of the standard Hough technique and our improved Hough technique, when applied to a noisy image.
precisely on a digitized line except when the line has an orientation of 0°, 45°, 90° or 135°). Similarly, smaller values of Δr give better precision but result in the spreading of peaks.
We have performed a series of experiments on synthesized images containing polygonal objects, arbitrarily chosen from our model base (see Figure 3.3). (The same images are used in our later experiments.)
model-0 model-1 model-2 model-3 model-4
model-5 model-6 model-7 model-8 model-9
model-10 model-11 model-12 model-13 model-14
model-15 model-16 model-17 model-18 model-19
Figure 3.3: The twenty models used in our experiments. From left to right, top to bottom are model-0 to model-19.
Two kinds of noise are imposed in our experiments: positive and negative.
We generate negative noise by randomly erasing edgels from the object boundary. For example, if we want 10% of the boundary to be missing (occluded), we scan along the boundary of every object and erase an edgel whenever ran modulo 10 = 0, where ran is a random number; similarly, if we want 20% of the boundary to be missing, we do the same whenever ran modulo 5 = 0.
The positive noise we generate consists of random dot noise and segment noise. Random dot noise is additive dot noise whose position in the image is randomly selected; segment noise is added by randomly selecting a position in the image and an orientation (from 0° to 360°) for the segment, then placing a line segment whose length varies randomly from 0 to 7 pixels.
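The noise model just described can be sketched in a few lines (our own reconstruction; the function and parameter names are hypothetical, not the dissertation's):

```python
import math
import random

def add_noise(edgels, image_size=256, occlusion_mod=10,
              n_dots=200, n_segments=200, rng=None):
    """Apply the negative and positive noise described above.

    Negative noise: drop a boundary edgel whenever a random draw is 0 modulo
    occlusion_mod (mod 10 removes ~10% of the boundary, mod 5 ~20%).
    Positive noise: n_dots random dots plus n_segments random segments of
    length 0..7 pixels at random positions and orientations.
    """
    rng = rng or random.Random(0)
    kept = [p for p in edgels if rng.randrange(1 << 16) % occlusion_mod != 0]
    noise = [(rng.randrange(image_size), rng.randrange(image_size))
             for _ in range(n_dots)]
    for _ in range(n_segments):
        x, y = rng.uniform(0, image_size), rng.uniform(0, image_size)
        angle = rng.uniform(0, 2 * math.pi)
        length = rng.uniform(0, 7)
        for i in range(int(length) + 1):   # rasterize the segment coarsely
            px = int(x + i * math.cos(angle))
            py = int(y + i * math.sin(angle))
            if 0 <= px < image_size and 0 <= py < image_size:
                noise.append((px, py))
    return kept, noise
```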
Our simulations consider different combinations of noise levels. The image size used in our experiments is 256 × 256. Figure 3.4 shows a sample scene (image-0) after application of various noise levels. The other test images are shown in Figure 3.5; the same noise levels were applied to them.
In the following, we show the results of experiments on the ten images. The number of lines we select is 70. Since more lines are detected than are actually present in the source image, our experiments allow for spurious lines.
The following abbreviated table lists lines by the ranks of their votes, using "*" to represent lines actually present in the source image and "." to represent spurious lines. Standard deviations of the θ and r values and their covariances are also given; this data is of significance for the design of the later recognition stage.
• case 0: no noise imposed on the image
image-0: *************************************************.****.*..***.........<br />
image-1: ******************************.*******************.**...*..*....**....<br />
image-2: ********************************************************.*..**.*..*...<br />
image-3: *************************************************......*.........*....<br />
image-4: ****************.**********.****************.********...*..**.........<br />
image-5: **********************************************.***.*...........*..*.*.<br />
image-6: ***********************************************************.*.*..*....<br />
image-7: ******************************************************.****...........<br />
image-8: ***************************************************.**.*..*.....*...*.<br />
image-9: *********************************************..************.*.....*..*<br />
σ_θ = 1.398227 (degree) or 0.024404 (radian); σ_r = 1.672734 (pixel);
σ_θr = 0.189464 (degree·pixel) or 0.003307 (radian·pixel).
• case 1: 10% negative noise, no positive noise
(0) Image-0 with no noise
(1) Image-0 with 10% negative noise and no positive noise
(2) Image-0 with 10% negative noise and 200 pieces of positive noise
(3) Image-0 with 10% negative noise and 400 pieces of positive noise
(4) Image-0 with 10% negative noise and 800 pieces of positive noise
(5) Image-0 with 20% negative noise and no positive noise
(6) Image-0 with 20% negative noise and 200 pieces of positive noise
(7) Image-0 with 20% negative noise and 400 pieces of positive noise
(8) Image-0 with 20% negative noise and 800 pieces of positive noise
Figure 3.4: Reading from top to bottom, left to right, (0) to (8) are image-0 with different noise levels corresponding to cases 0 to 8.
Image-1 Image-2 Image-3
Image-4 Image-5 Image-6
Image-7 Image-8 Image-9
Figure 3.5: Image-1 to Image-9 before noise is imposed. The same noise levels as for Image-0 are applied during our experiments.
image-0: **********************************************.*.***.*.*..............<br />
image-1: ***************************************.***.****.....*.**...*.....*...<br />
image-2: *********************************************************..*..*.......<br />
image-3: **************************************.*******...***.*...*.*.......*..<br />
image-4: ****************.*.***************..***.*..**.*.*.*............*...*..<br />
image-5: ***************************************************.....*.*.*.**.....*<br />
image-6: ************************************************************..........<br />
image-7: ****************************************************.****....*.*......<br />
image-8: ****************************************************.*...*.*.*..*.....<br />
image-9: ***********************************..***********************.*......*.<br />
σ_θ = 1.496886 (degree) or 0.026126 (radian); σ_r = 1.803655 (pixel);
σ_θr = 0.244846 (degree·pixel) or 0.004273 (radian·pixel).
• case 2: 10% negative noise, along with 200 pieces of random dot noise and 200 pieces of segment noise
image-0: ******************************.******.*.*........*..*..........*......<br />
image-1: ***********************.*.******.********..**..*.*............*.*.....<br />
image-2: ***************************************.***.*.***.*...***.....*.......<br />
image-3: *************************************.*****.*.*..*......*.....*.......<br />
image-4: *****************.*********.****.*.*.*....**.*..**...*......*...*.....<br />
image-5: **************************************.********.**.*.....*...*....***.<br />
image-6: *********************************************.***.**.*.*..*..........*<br />
image-7: ************************************************...**...**...*..***.*.<br />
image-8: **************************************.**..***.*.*..*......*......*...<br />
image-9: ******************************.*.*..********.***.*.*..*..*.......*.**.<br />
σ_θ = 1.656363 (degree) or 0.028909 (radian); σ_r = 2.059478 (pixel);
σ_θr = 0.119053 (degree·pixel) or 0.002078 (radian·pixel).
• case 3: 10% negative noise, along with 400 pieces of random dot noise and 400 pieces of segment noise
image-0: *****************.************..****.**......*.*.....*................<br />
image-1: ***********************.*****..*********..**.......*...........*.....*<br />
image-2: *********************************.*.***.***.*.***..........*........*.<br />
image-3: *******************************.**..*.*..**...**.*.****..........**...<br />
image-4: ****************.*********.*....****..*.*..*..*...............**....*.<br />
image-5: *******************************.****.*.**..*.....*...*......*...*.**.*<br />
image-6: *********************************.*.*****.*.*..**..........*..*.*....*<br />
image-7: **********************************.*******.*****.**.....**.*......**..<br />
image-8: **********************************...**.*..*.*.*......*...............<br />
image-9: ***************************...*..*.*.**.*****.*.....*.......*........*<br />
σ_θ = 1.712783 (degree) or 0.029894 (radian); σ_r = 2.017484 (pixel);
σ_θr = 0.217823 (degree·pixel) or 0.003802 (radian·pixel).
• case 4: 10% negative noise, along with 800 pieces of random dot noise and 800 pieces of segment noise
image-0: *******************.***.***......****..*...*..........*..*.......*....<br />
image-1: ****.**.**.**..*****.*.*..*.***.**.*....*..*.*....*..*..*.............<br />
image-2: ***********.********.*..*...***..****.***.....*..............*.......*<br />
image-3: *******.************.**.*.**.***...*....*.*..*.*......................<br />
image-4: ************.*******....*.......*..******..*.............*....*....*..<br />
image-5: ***************.***********.***......**..*........*............*......<br />
image-6: ***************..*************..****..**.*..........*......*.........*<br />
image-7: **************.***.*****.*****..***********.*.*...**..*...**.*.*...*..
image-8: ***************.********.****..***.*.****.*.*.....*..*................<br />
image-9: ***********.**********.****..*.**...*.*..****.**.*..*...*..*...**.....<br />
σ_θ = 1.938883 (degree) or 0.033840 (radian); σ_r = 2.519878 (pixel);
σ_θr = -0.357589 (degree·pixel) or -0.006241 (radian·pixel).
• case 5: 20% negative noise, along with no positive noise
image-0: ***********************************************.***...................<br />
image-1: ********************************.***..******.**.*.*....**........*...*<br />
image-2: ****************************************.***********.*..*.*....*......<br />
image-3: ********************************.******.*.***..*.....**.*...*.*.*...*.<br />
image-4: *********************.****.**.*****.**.******.**.*...........**...*...<br />
image-5: ********************************************..*.....*.**.....*........<br />
image-6: **********************************************************.*..*.......<br />
image-7: ************************************.***************.*.*.*..*......*..<br />
image-8: **********************************************.*.......*........*.....<br />
image-9: ******************.*.******.********.*.*..*******.***.**.*.*.*..*.*...<br />
σ_θ = 1.563745 (degree) or 0.027293 (radian); σ_r = 1.913004 (pixel);
σ_θr = 0.219434 (degree·pixel) or 0.003830 (radian·pixel).
• case 6: 20% negative noise, along with 200 pieces of random dot noise and 200 pieces of segment noise
image-0: *************.****************.*..**.*.*.*.*..*......................*<br />
image-1: ******************************.*..***...**..**.*...........*.....*..*.<br />
image-2: ****************************.*****.*.**.**.......**.....****.*........<br />
image-3: **************************************.**.***.***.....*.......*...*...<br />
image-4: *******************.****.***..**.***...***..........................*.<br />
image-5: ***********************************.*.**.***..*....................*..<br />
image-6: **********************.***************..***...**.*..*...**.....*.*.*.*<br />
image-7: ******************************.****..***...****.*..*...*..*..*.*......<br />
image-8: *********************************.**..*.**.***.*..........*...........<br />
image-9: ***********.*****.*.******.**.*****...*...***......*......*..*.*.**.*.<br />
σ_θ = 1.754688 (degree) or 0.030625 (radian); σ_r = 2.144938 (pixel);
σ_θr = 0.349338 (degree·pixel) or 0.006097 (radian·pixel).
• case 7: 20% negative noise, along with 400 pieces of random dot noise and 400 pieces of segment noise
image-0: *********.*****************.*.***.*...*.......*...*............*....*.<br />
image-1: **********************.*.**..**.*****..*.*.*****......................<br />
image-2: *******************.**.*****..*...*.**.*....*..**.....................<br />
image-3: ***************************.**.*.**......*.........**...*........*....<br />
image-4: ******************..**..*..**..*****..**...*.*............*.*.........<br />
image-5: *****************.************.**.*........***.*............*........*<br />
image-6: **********************..*******.*********.......*.**....*.***..*..*...<br />
image-7: ****************************.*.*.***.*.***..........**.*.*....*.**....<br />
image-8: *********************..*******.*.*****.*..*..**...*..*.......*..*.....<br />
image-9: ***************..*.**.*.*.***.**.**..****....*.*.*........*...*.*.....<br />
σ_θ = 1.880869 (degree) or 0.032827 (radian); σ_r = 2.397261 (pixel);
σ_θr = -0.113044 (degree·pixel) or -0.001973 (radian·pixel).
• case 8: 20% negative noise, along with 800 pieces of random dot noise and 800 pieces of segment noise
image-0: ***********.****.******...*.****..*.....*.*...........**......*..*....<br />
image-1: ***..**.***.**.**.***.**..*.*.*.*.**...*.*.*.....*....................<br />
image-2: ******.*.****.**.**........***....**.....*........**.**.....*.........<br />
image-3: **********.****.*..*****..*.*.*..**..*.*........*.*...............*..*<br />
image-4: *************.*.*.*.......*.*..*.....*.........*..***...*.*.........**<br />
image-5: ********.**********.*****...*...*....*.*........*..**...........**.*.*<br />
image-6: ****************...*.*********...*....*.**...*....*.......*.*.....***.<br />
image-7: *************.***.***.*..*.*.***.**.**.***.*...*.........*.**.......*.<br />
image-8: ******.******....*.*****..*.....*.*.*.*...*.*.....*....*.............*<br />
image-9: ********..*******..**..*....**.**.*******...*.....*......*.....**.....<br />
σ_θ = 2.168774 (degree) or 0.037852 (radian); σ_r = 2.503542 (pixel);
σ_θr = -0.396545 (degree·pixel) or -0.006921 (radian·pixel).
3.4 Some Additional Observations Concerning the Accuracy of Hough Data
For the case when no noise is imposed, we can see that the true lines dominate the front of the list (those with high votes) and the spurious lines fall at the rear of the list. Note that some of the true lines are detected multiple times. This can happen when a line is slanted, because its edgels become dispersed in ladder fashion, and two parallel lines are then often detected.
Negative noise is more devastating to the Hough transform than positive noise. Positive noise, unless very intense, does not make true lines disappear; negative noise subtracts signal and can seriously affect the accuracy of the line parameters detected. This is especially true of short lines with a small slope: given negative noise, the staircase edge map of such a line can turn into scattered fragments that no longer support a single line hypothesis.
Indeed, our experiments showed that most of the undetected lines are slanted lines with small slopes.
While background subtraction works well to remove the bias in favor of long line segments over short line segments, it in fact implicitly applies a winner-takes-all principle. In particular, when a long line segment intersects a short line segment, their intersection pixel will be classified as belonging to the long line segment and subtracted from the image after the long line segment is detected. However, when these two line segments do not intersect each other, even though their extended lines do, the perceptual grouping technique we use reduces this adverse effect.
In regard to accuracy, we observe that line length is the key factor affecting the accuracy of Hough transform results. The Hough transform detects long line segments well even if the image is quite noisy. As line length decreases, noise affects the accuracy of the Hough transform ever more seriously.
When the center of a segment is far from the projection of the origin onto the segment's extended line, the average error of r increases. This phenomenon can be rationalized as follows (see Figure 3.6): since our algorithm loops through θ, we in effect consider a line through the origin with slope angle θ (i.e., with line parameters (θ + 90°, 0)) and project all the image edgels onto this line. If bucket (θ, r) in Hough space is detected with n votes, roughly n edgels project onto this line at positions roughly at distance r from the origin. With this viewpoint in mind, and because images are quantized, if the center of a line segment with line parameters (θ, r) is far from the intersection point of line (θ, r) and line (θ + 90°, 0), the projections (onto line (θ + 90°, 0)) of the points of that line segment disperse around the position at distance r from the origin (except when the orientation of the line segment is 0°, 45°, 90° or 135°).
[Figure 3.6 labels: r; projection of the origin on the extended line; O; center of the segment]
Figure 3.6: The thicker line has parameters (θ, r). Projections of the pixels of this line onto line (θ + 90°, 0) disperse around the position at distance r from the origin.
Chapter 4
Invariant Matching Using Line Features
The idea behind the geometric hashing method is to encode local geometric features in a manner which is invariant under the geometric transformations that model objects undergo during formation of the class of images being analyzed. This encoding can then be used in a hash function that makes possible fast retrieval of model features from the hash table, which can accordingly be viewed as an encoded model base. The technique can be applied directly to line features, without resorting to point features indirectly derived from lines, provided that we use some method of encoding line features in a way that is invariant under the transformations considered.
Potentially relevant transformation groups include rigid transformations, similarity transformations, affine transformations and projective transformations, depending on the manner in which an image is formed. Each of these classes of transformations is a subgroup of the full group of projective transformations. Since a projective transformation is a collineation (see p. 89 of [35]), all these transformations preserve lines. This allows us to use line features as inputs to the recognition procedures.
4.1 Change of Coordinates in Various Spaces
Since we will represent line features by their normal parameterization (θ, r), we show in this section how a change of coordinates in image space by a linear transformation induces a change of coordinates in (θ, r)-space. This is a preliminary to the invariant line encodings exploited in the following sections.
4.1.1 Change of Coordinates in Image Space
It is often easiest to work in homogeneous coordinates, since this allows projective transformations to be represented by matrices.
A change of coordinates in homogeneous coordinates is given by
w′ = Tw,
where w = (w1, w2, w3)^t is the homogeneous coordinate before the transformation, w′ = (w1′, w2′, w3′)^t is the homogeneous coordinate after the transformation, and T is a non-singular 3 × 3 matrix. Note that T and λT, for any λ ≠ 0, define the same transformation, since we are dealing with homogeneous coordinates.
A point (x, y)^t in image space is represented by a non-zero 3-D point w = (w1, w2, w3)^t in homogeneous coordinates, such that
x = w1 / w3 and y = w2 / w3,
where w3 ≠ 0. This representation is not unique, since λw, for any λ ≠ 0, is also a homogeneous representation of (x, y)^t.
Every non-singular 3 × 3 matrix defines a 2-D projective transformation of homogeneous coordinates. Various subgroups of the projective transformation group are defined by different restrictions on the form of the matrix T.
4.1.2 Change of Coordinates in Line Parameter Space

A line in a 2-D plane can be represented by its parameter vector and is usually parameterized as

    a₁w₁ + a₂w₂ + a₃w₃ = 0,  or  a^t w = 0,

where a = (a₁, a₂, a₃)^t is the parameter vector of the line (note that a₁ and a₂ are not both equal to 0) and w = (w₁, w₂, w₃)^t is a homogeneous coordinate of any point on the line. This representation is not unique, since λa, for any λ ≠ 0, is also a representation of the same line.

A transformation T in image space changes the coordinate of every point w on a line to w′ by w′ = Tw. Substituting w = T⁻¹w′ in a^t w = 0, we get

    a^t T⁻¹ w′ = 0,

and hence

    ((T⁻¹)^t a)^t w′ = 0,  or  a′^t w′ = 0, where a′ = (T⁻¹)^t a.

This shows that the change of coordinates of the point in 3-D parameter space is given by

    a′ = Pa,  where P = (T⁻¹)^t.
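The covariance rule a′ = (T⁻¹)^t a can be checked numerically: if a point w lies on the line a^t w = 0, then Tw lies on the line with parameters (T⁻¹)^t a. A minimal sketch in Python with NumPy (the transformation and line below are arbitrary example values, not taken from the text):

```python
import numpy as np

# An arbitrary non-singular image-space transformation T (homogeneous 3x3).
T = np.array([[1.2, -0.3, 2.0],
              [0.5,  0.9, -1.0],
              [0.0,  0.0,  1.0]])

# A line a^t w = 0 and a point w on it: x + 2y - 3 = 0 passes through (1, 1).
a = np.array([1.0, 2.0, -3.0])
w = np.array([1.0, 1.0, 1.0])
assert abs(a @ w) < 1e-12          # incidence before the transformation

# Transform the point by T and the line parameters by P = (T^-1)^t.
w_new = T @ w
a_new = np.linalg.inv(T).T @ a

print(abs(a_new @ w_new) < 1e-9)   # incidence is preserved: prints True
```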
4.1.3 Change of Coordinates in (θ, r) Space

Line features of an image are usually extracted by the Hough transform. A common implementation of the Hough transform applies the normal parameterization suggested by Duda and Hart [15], in the form

    x cos θ + y sin θ = r,

where r is the perpendicular distance of the line to the origin and θ is the angle between a normal to the line and the positive x-axis.

This unique parameterization of lines relates to the preceding parameterization as follows. Let F : R² → R³ be the mapping such that

    F((θ, r)^t) = (cos θ, sin θ, −r)^t.

If we restrict the domain of θ to [0, π), then F⁻¹ exists. Define another mapping G : R³ → R² by F⁻¹ as

    G((a₁, a₂, a₃)^t) = F⁻¹((a₁/√(a₁²+a₂²), a₂/√(a₁²+a₂²), a₃/√(a₁²+a₂²))^t)       if a₂ ≥ 0,
    G((a₁, a₂, a₃)^t) = F⁻¹((−a₁/√(a₁²+a₂²), −a₂/√(a₁²+a₂²), −a₃/√(a₁²+a₂²))^t)    otherwise,

where a₁ and a₂ are not both equal to 0. Then

    G((a₁, a₂, a₃)^t) = (tan⁻¹(a₂/a₁), −a₃/√(a₁²+a₂²))^t      if a₂ > 0,
    G((a₁, a₂, a₃)^t) = (0, −a₃/a₁)^t                          if a₂ = 0,
    G((a₁, a₂, a₃)^t) = (tan⁻¹(−a₂/−a₁), a₃/√(a₁²+a₂²))^t     if a₂ < 0,

where the range of tan⁻¹ is [0, π) and tan⁻¹(∞) = π/2.

Then G maps a point a = (a₁, a₂, a₃)^t in parameter space, where a₁ and a₂ are not both equal to 0, to (θ, r)^t = G(a) in (θ, r)-space such that

    a₁w₁ + a₂w₂ + a₃w₃ = 0  and  cos θ w₁ + sin θ w₂ − r w₃ = 0

define the same line.

Let (θ, r)^t be the (θ, r)-parameter defining a line (in fact, the line x cos θ + y sin θ = r). Then F((θ, r)^t) = a, where a = (cos θ, sin θ, −r)^t, is a point in parameter space. A transformation P changes the coordinate of a to a′ = Pa in parameter space. Substituting F((θ, r)^t) for a in a′ = Pa, we get

    a′ = P F((θ, r)^t),

and hence (θ′, r′)^t = G(a′) = G(P F((θ, r)^t)) defines the same line as a′ (or λa′, λ ≠ 0). Thus a transformation of a point in parameter space results in a transformation of (θ, r)^t in (θ, r)-space. The change of coordinates of (θ, r)^t in (θ, r)-space is given by

    (θ′, r′)^t = H((θ, r)^t),    (4.1)

where H = G ∘ P ∘ F.¹
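For reference, the maps F, G and H = G ∘ P ∘ F can be coded directly. The sketch below (Python with NumPy; the function names are mine, not the dissertation's) checks that G inverts F and that H reduces to the identity when P is the identity matrix:

```python
import numpy as np

def F(theta, r):
    """Map (theta, r) to the line's parameter vector (cos t, sin t, -r)."""
    return np.array([np.cos(theta), np.sin(theta), -r])

def G(a):
    """Map a parameter vector a (a1, a2 not both 0) back to (theta, r), theta in [0, pi)."""
    a = np.asarray(a, float) / np.hypot(a[0], a[1])
    if a[1] < 0 or (a[1] == 0 and a[0] < 0):
        a = -a                       # pick the representative with theta in [0, pi)
    return np.arctan2(a[1], a[0]) % np.pi, -a[2]

def H(P, theta, r):
    """Change of coordinates in (theta, r)-space: H = G o P o F (Eq. 4.1)."""
    return G(P @ F(theta, r))

theta, r = 2.1, -0.7
print(G(F(theta, r)))                # recovers (2.1, -0.7)
print(H(np.eye(3), theta, r))        # identity P leaves (theta, r) unchanged
```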
¹We abuse notation by using P(v) to denote Pv (the matrix P multiplied by the vector v). We will continue to use P(v) or Pv interchangeably when no ambiguity occurs.

4.2 Encoding Lines by a Basis of Lines

To apply the geometric hashing technique to line features, we need to encode the features in a manner invariant under the transformation group considered. One way of encoding is to find a canonical basis and a transformation (in the transformation group under consideration) that maps the basis to the canonical basis, then apply that transformation to the line being encoded. For example, let T be a projective transformation that maps a basis b to T(b). If P is the unique transformation such that P(b) = b_c and P′ is the unique transformation such that P′(T(b)) = b_c, where b_c is the chosen canonical basis, then we have P = P′ ∘ T, and any other line l and its correspondent l′ = T(l) will be mapped to P(l) by P and to P′(l′) by P′, respectively. Since P′(l′) = P′(T(l)) = P′ ∘ T(l) = P(l), this is an invariant encoding [25].

Various subgroups of the projective transformation group require different bases. We discuss them in the following section and give formulae for the corresponding invariants. Throughout the following discussion, we represent a line by its normal parameterization (θ, r) with θ in [0, π) and r in R. If the computed invariant (θ′, r′) has θ′ outside [0, π), we adjust it by adding π or −π and adjust the value of r′ by flipping its sign accordingly.
4.3 Line Invariants under Various Transformation Groups

4.3.1 Rigid Line Invariants

Any correspondence of two ordered pairs of non-parallel lines determines a rigid transformation up to a 180°-rotation ambiguity, and this ambiguity can be broken if we know the position of a third line during verification.

One way to encode a third line in terms of a pair of non-parallel lines in a rigid-invariant way is to find a transformation P which maps the first basis line to the x-axis and maps the second basis line so that its intersection with the first basis line is the origin, and then to apply P to the third line.

A rigid transformation T in image space has the form

    T = [ R  b ]  =  [ cos ψ  −sin ψ  b₁ ]
        [ 0  1 ]     [ sin ψ   cos ψ  b₂ ]
                     [   0       0     1 ]

where R is the rotation matrix and b the translation vector. This corresponds to the parameter-space transformation P defined by (cf. section 4.1.2)

    P = (T⁻¹)^t = [ (R⁻¹)^t       0 ]  =  [ R       0 ]    (4.2)
                  [ −b^t (R⁻¹)^t  1 ]     [ −b^t R  1 ]

so that

    P⁻¹ = T^t = [ R^t  0 ]
                [ b^t  1 ].

Let a line be parameterized as x cos θ + y sin θ − r = 0, and let the two basis lines be represented by their parameters a₁ and a₂, such that

    a₁ = (cos θ₁, sin θ₁, −r₁)^t,
    a₂ = (cos θ₂, sin θ₂, −r₂)^t,

which are to be mapped by P to

    e₁ = (cos(π/2), sin(π/2), 0)^t = (0, 1, 0)^t,
    e₂ = (cos φ, sin φ, 0)^t = (C, S, 0)^t.

Note that φ is fixed, though unknown. In this way, P maps the first basis line to the x-axis (or, y = 0) and the intersection of the two basis lines to the origin.
Since ax + by + c = 0 and λax + λby + λc = 0, λ ≠ 0, represent the same line, we have

    λ₁ P a₁ = e₁,
    λ₂ P a₂ = e₂,

where λ₁ and λ₂ are non-zero constants. Equivalently,

    P (λ₁a₁, λ₂a₂) = (e₁, e₂).

Then

    (λ₁a₁, λ₂a₂) = P⁻¹ (e₁, e₂)

                 = [ cos ψ   sin ψ  0 ] [ 0  C ]
                   [ −sin ψ  cos ψ  0 ] [ 1  S ]
                   [  b₁      b₂    1 ] [ 0  0 ]

                 = [ sin ψ   C cos ψ + S sin ψ  ]
                   [ cos ψ  −C sin ψ + S cos ψ  ]
                   [  b₂     C b₁ + S b₂        ].

Thus

    λ₁ = sin ψ / cos θ₁ = cos ψ / sin θ₁ = b₂ / (−r₁),    (4.3)

    λ₂ = (C cos ψ + S sin ψ) / cos θ₂ = (−C sin ψ + S cos ψ) / sin θ₂ = (C b₁ + S b₂) / (−r₂).    (4.4)

From Eq. (4.4), we obtain

    C = λ₂ cos(θ₂ + ψ),    (4.5)
    S = λ₂ sin(θ₂ + ψ),    (4.6)
    b₁ = (−λ₂ r₂ − S b₂) / C,    (4.7)

and from Eq. (4.3), we have sin ψ sin θ₁ = cos ψ cos θ₁ and thus ψ = π/2 − θ₁ or ψ = −π/2 − θ₁.

When ψ = π/2 − θ₁, we have λ₁ = 1 and b₂ = −r₁. Substituting them and Eq. (4.5), Eq. (4.6) in Eq. (4.7), we have b₁ = −(r₂ − r₁ cos(θ₁ − θ₂)) / sin(θ₁ − θ₂). Thus, from Eq. (4.2) we obtain

    P = [  sin θ₁                               −cos θ₁                                0 ]
        [  cos θ₁                                sin θ₁                                0 ]
        [ csc(θ₁−θ₂)(r₂ sin θ₁ − r₁ sin θ₂)    csc(θ₁−θ₂)(−r₂ cos θ₁ + r₁ cos θ₂)     1 ].

When ψ = −π/2 − θ₁, we have λ₁ = −1 and b₂ = r₁. Substituting them and Eq. (4.5), Eq. (4.6) in Eq. (4.7), we have b₁ = (r₂ − r₁ cos(θ₁ − θ₂)) / sin(θ₁ − θ₂). Thus, from Eq. (4.2) we obtain

    P = [ −sin θ₁                                cos θ₁                                0 ]
        [ −cos θ₁                               −sin θ₁                                0 ]
        [ csc(θ₁−θ₂)(r₂ sin θ₁ − r₁ sin θ₂)    csc(θ₁−θ₂)(−r₂ cos θ₁ + r₁ cos θ₂)     1 ].
From section 4.1.3, the invariant (θ′, r′)^t of a line (θ, r)^t can be obtained from the basis lines as follows:

    (θ′, r′)^t = H((θ, r)^t) = G(P (cos θ, sin θ, −r)^t) = G(a′),

where

    a′ = ( −sin(θ − θ₁),
            cos(θ − θ₁),
           −r − r₂ csc(θ₁−θ₂) sin(θ − θ₁) + r₁ csc(θ₁−θ₂) sin(θ − θ₂) )^t

or

    a′ = (  sin(θ − θ₁),
           −cos(θ − θ₁),
           −r − r₂ csc(θ₁−θ₂) sin(θ − θ₁) + r₁ csc(θ₁−θ₂) sin(θ − θ₂) )^t,

depending upon which P is used, and hence

    θ′ = tan⁻¹( cos(θ−θ₁) / (−sin(θ−θ₁)) ) = tan⁻¹( sin(θ−θ₁+π/2) / cos(θ−θ₁+π/2) ) = θ − θ₁ + π/2,
    r′ = r + r₂ csc(θ₁−θ₂) sin(θ−θ₁) − r₁ csc(θ₁−θ₂) sin(θ−θ₂),

or

    θ′ = tan⁻¹( −cos(θ−θ₁) / sin(θ−θ₁) ) = tan⁻¹( sin(θ−θ₁−π/2) / cos(θ−θ₁−π/2) ) = θ − θ₁ − π/2,
    r′ = r + r₂ csc(θ₁−θ₂) sin(θ−θ₁) − r₁ csc(θ₁−θ₂) sin(θ−θ₂).

We may store each encoded invariant (θ′, r′) redundantly in two entries of the hash table, (θ′, r′) and (θ′, −r′), during preprocessing. Then we may hit a match with either (θ′, r′) or (θ′, −r′) as the computed scene invariant during recognition. (Note that with a 180°-rotation, the computed invariants should be (θ, r) and (θ + π, r). However, we would like to restrict θ to be in [0, π), which makes the invariants (θ, r) and (θ, −r), where θ is in [0, π).)
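The rigid invariant above is easy to check numerically. The sketch below (Python with NumPy; the helper names and the specific lines and transform are illustrative, not from the text) computes (θ′, |r′|) for a third line relative to two basis lines, maps all three lines through a rigid motion via a′ = (T⁻¹)^t a, and recomputes the invariant; |r′| is used so that the 180°-rotation sign ambiguity discussed above does not matter:

```python
import numpy as np

def to_theta_r(a):
    """Normalize a line parameter vector a = (a1, a2, a3) to (theta, r), theta in [0, pi)."""
    a = np.asarray(a, float) / np.hypot(a[0], a[1])
    if a[1] < 0 or (a[1] == 0 and a[0] < 0):
        a = -a
    return np.arctan2(a[1], a[0]) % np.pi, -a[2]

def transform_line(T, theta, r):
    """Transform the line (theta, r) by the image-space map T via a' = (T^-1)^t a."""
    return to_theta_r(np.linalg.inv(T).T @ np.array([np.cos(theta), np.sin(theta), -r]))

def rigid_invariant(basis, line):
    """(theta', |r'|) of `line` w.r.t. two non-parallel basis lines [(t1,r1), (t2,r2)]."""
    (t1, r1), (t2, r2) = basis
    t, r = line
    csc = 1.0 / np.sin(t1 - t2)
    tp = (t - t1 + np.pi / 2) % np.pi
    rp = r + r2 * csc * np.sin(t - t1) - r1 * csc * np.sin(t - t2)
    return tp, abs(rp)

# Rotation by 0.7 rad plus translation (3, -2); arbitrary example values.
c, s = np.cos(0.7), np.sin(0.7)
T = np.array([[c, -s, 3.0], [s, c, -2.0], [0.0, 0.0, 1.0]])

basis, line = [(0.3, 1.0), (1.4, -0.5)], (2.2, 2.0)
moved_basis = [transform_line(T, t, r) for t, r in basis]
moved_line = transform_line(T, *line)

print(rigid_invariant(basis, line))
print(rigid_invariant(moved_basis, moved_line))   # same pair, up to rounding
```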
4.3.2 Similarity Line Invariants

To determine a similarity transformation uniquely, we need a correspondence of a pair of triplets of lines, not all of which are parallel to each other or intersect at a common point.

Without loss of generality, we may assume that the first basis line intersects the other two basis lines. One way to encode a fourth line in a similarity-invariant way is to find a transformation P which maps the first basis line to the x-axis, maps the second basis line so that its intersection with the first basis line is the origin, maps the third basis line so that its intersection with the first basis line has Euclidean coordinates (1, 0), and then to apply P to the fourth line.

A similarity transformation T in image space has the form

    T = [ sR  b ]  =  [ s cos ψ  −s sin ψ  b₁ ]
        [ 0   1 ]     [ s sin ψ   s cos ψ  b₂ ]
                      [    0         0      1 ]

where s is the scaling factor, R the rotation matrix, and b the translation vector. This corresponds to the parameter-space transformation P defined by (cf. section 4.1.2)

    P = (T⁻¹)^t = [ (1/s)(R⁻¹)^t        0 ]  =  [ (1/s)R        0 ]    (4.8)
                  [ −(1/s) b^t (R⁻¹)^t  1 ]     [ −(1/s) b^t R  1 ]

so that

    P⁻¹ = T^t = [ sR^t  0 ]
                [ b^t   1 ].

Let a line be parameterized as x cos θ + y sin θ − r = 0 and let the three basis lines, assuming the first line is not parallel to the others, be represented by their parameters a₁, a₂ and a₃, such that

    a₁ = (cos θ₁, sin θ₁, −r₁)^t,
    a₂ = (cos θ₂, sin θ₂, −r₂)^t,
    a₃ = (cos θ₃, sin θ₃, −r₃)^t,

which are to be mapped by P to

    e₁ = (cos(π/2), sin(π/2), 0)^t = (0, 1, 0)^t,
    e₂ = (cos φ₁, sin φ₁, 0)^t = (C₁, S₁, 0)^t,
    e₃ = (cos φ₂, sin φ₂, −sin|θ₁ − θ₃|)^t = (C₂, S₂, −sin|θ₁ − θ₃|)^t.

Note that φ₁ and φ₂ are fixed, though unknown. Also note that since we fixed the third component of e₃ to be −sin|θ₁ − θ₃| < 0, φ₂ ranges within (−π/2, π/2) (or, C₂ > 0). In this way, P maps the first basis line to the x-axis, the intersection of the first and second basis lines to the origin, and the intersection of the first and third basis lines to (1, 0)^t.
Since ax + by + c = 0 and λax + λby + λc = 0, λ ≠ 0, represent the same line, we have

    λ₁ P a₁ = e₁,    λ₂ P a₂ = e₂,    λ₃ P a₃ = e₃,

where λ₁, λ₂ and λ₃ are non-zero constants. Equivalently,

    P (λ₁a₁, λ₂a₂, λ₃a₃) = (e₁, e₂, e₃).

Then

    (λ₁a₁, λ₂a₂, λ₃a₃) = P⁻¹ (e₁, e₂, e₃)

        = [ s cos ψ   s sin ψ  0 ] [ 0  C₁  C₂ ]
          [ −s sin ψ  s cos ψ  0 ] [ 1  S₁  S₂ ]
          [   b₁        b₂     1 ] [ 0  0   −sin|θ₁ − θ₃| ]

        = [ s sin ψ   sC₁ cos ψ + sS₁ sin ψ    sC₂ cos ψ + sS₂ sin ψ          ]
          [ s cos ψ  −sC₁ sin ψ + sS₁ cos ψ   −sC₂ sin ψ + sS₂ cos ψ          ]
          [  b₂       C₁b₁ + S₁b₂              C₂b₁ + S₂b₂ − sin|θ₁ − θ₃|     ].

Thus

    λ₁ = s sin ψ / cos θ₁ = s cos ψ / sin θ₁ = b₂ / (−r₁),    (4.9)

    λ₂ = s(C₁ cos ψ + S₁ sin ψ) / cos θ₂ = s(−C₁ sin ψ + S₁ cos ψ) / sin θ₂ = (C₁b₁ + S₁b₂) / (−r₂),    (4.10)

    λ₃ = s(C₂ cos ψ + S₂ sin ψ) / cos θ₃ = s(−C₂ sin ψ + S₂ cos ψ) / sin θ₃ = (C₂b₁ + S₂b₂ − sin|θ₁ − θ₃|) / (−r₃).    (4.11)
From Eq. (4.9), we have s sin ψ sin θ₁ = s cos ψ cos θ₁ and thus ψ = π/2 − θ₁ or ψ = −π/2 − θ₁. From Eq. (4.10), we obtain

    C₁ = (λ₂/s) cos(θ₂ + ψ),
    S₁ = (λ₂/s) sin(θ₂ + ψ),
    b₁ = (−λ₂ r₂ − S₁ b₂) / C₁.    (4.12)

From Eq. (4.11), we obtain

    C₂ = (λ₃/s) cos(θ₃ + ψ),
    S₂ = (λ₃/s) sin(θ₃ + ψ),
    b₁ = (−λ₃ r₃ − S₂ b₂ + sin|θ₁ − θ₃|) / C₂,    (4.13)

and since C₂² + S₂² = 1, we have λ₃ = ±s.

When ψ = π/2 − θ₁, we have λ₁ = s and b₂ = −s r₁; when ψ = −π/2 − θ₁, we have λ₁ = −s and b₂ = s r₁. In both cases, b₁ in Eq. (4.12) and Eq. (4.13) must be consistent, so we can solve to get

    λ₃ = sin(θ₁−θ₂) sin|θ₁−θ₃| / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ).

Since we require C₂ > 0, we have λ₃ = s when θ₁ < θ₃ and ψ = −π/2 − θ₁ or θ₁ > θ₃ and ψ = π/2 − θ₁; we have λ₃ = −s when θ₁ < θ₃ and ψ = π/2 − θ₁ or θ₁ > θ₃ and ψ = −π/2 − θ₁.

To summarize, we have four cases as follows:

• case 1: θ₁ < θ₃ (thus sin|θ₁ − θ₃| = −sin(θ₁ − θ₃))

  – case 1.1:

        ψ = π/2 − θ₁,
        s = sin(θ₁−θ₂) sin(θ₁−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
        b₁ = s(−r₂ + r₁ cos(θ₁−θ₂)) / sin(θ₁−θ₂),
        b₂ = −s r₁;
  – case 1.2:

        ψ = −π/2 − θ₁,
        s = −sin(θ₁−θ₂) sin(θ₁−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
        b₁ = −s(−r₂ + r₁ cos(θ₁−θ₂)) / sin(θ₁−θ₂),
        b₂ = s r₁;

• case 2: θ₁ > θ₃ (thus sin|θ₁ − θ₃| = sin(θ₁ − θ₃))

  – case 2.1:

        ψ = π/2 − θ₁,
        s = sin(θ₁−θ₂) sin(θ₁−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
        b₁ = s(−r₂ + r₁ cos(θ₁−θ₂)) / sin(θ₁−θ₂),
        b₂ = −s r₁;

  – case 2.2:

        ψ = −π/2 − θ₁,
        s = −sin(θ₁−θ₂) sin(θ₁−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
        b₁ = −s(−r₂ + r₁ cos(θ₁−θ₂)) / sin(θ₁−θ₂),
        b₂ = s r₁.

Note that case 1.1 and case 2.1 are identical, and case 1.2 and case 2.2 are identical. The transformations P derived from case 1.1 and case 1.2 are again identical. We conclude that the above four cases result in

    P = (1/D) [  A sin θ₁                            −A cos θ₁                             0 ]
              [  A cos θ₁                             A sin θ₁                             0 ]
              [ (r₂ sin θ₁ − r₁ sin θ₂) sin(θ₁−θ₃)   −(r₂ cos θ₁ − r₁ cos θ₂) sin(θ₁−θ₃)   D ]

where

    A = r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂),
    D = sin(θ₁−θ₂) sin(θ₁−θ₃).

From section 4.1.3, the invariant (θ′, r′)^t of a line (θ, r)^t can be obtained from the basis lines as follows:

    (θ′, r′)^t = H((θ, r)^t) = G(P (cos θ, sin θ, −r)^t) = G(a′),

where

    a′ = (1/D) ( −A sin(θ − θ₁),
                  A cos(θ − θ₁),
                 (−r₂ sin(θ − θ₁) + r₁ sin(θ − θ₂) − r sin(θ₁ − θ₂)) sin(θ₁ − θ₃) )^t

and hence

    θ′ = tan⁻¹( A cos(θ−θ₁) / (−A sin(θ−θ₁)) ) = tan⁻¹( sin(θ−θ₁+π/2) / cos(θ−θ₁+π/2) ) = θ − θ₁ + π/2,

    r′ = (r₂ sin(θ−θ₁) − r₁ sin(θ−θ₂) + r sin(θ₁−θ₂)) sin(θ₁−θ₃) / |A|.
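The same kind of numerical check works for the similarity invariant. In the sketch below (Python with NumPy; helper names and the specific values are illustrative, not from the text), the closed-form (θ′, r′) is computed for a fourth line relative to three basis lines, all four lines are mapped through a similarity transform via a′ = (T⁻¹)^t a, and the invariant is recomputed; the example transform is chosen so that no normalized angle wraps past π:

```python
import numpy as np

def to_theta_r(a):
    """Normalize a line parameter vector a = (a1, a2, a3) to (theta, r), theta in [0, pi)."""
    a = np.asarray(a, float) / np.hypot(a[0], a[1])
    if a[1] < 0 or (a[1] == 0 and a[0] < 0):
        a = -a
    return np.arctan2(a[1], a[0]) % np.pi, -a[2]

def transform_line(T, theta, r):
    """Transform the line (theta, r) by the image-space map T via a' = (T^-1)^t a."""
    return to_theta_r(np.linalg.inv(T).T @ np.array([np.cos(theta), np.sin(theta), -r]))

def similarity_invariant(basis, line):
    """Closed-form similarity invariant of `line` w.r.t. three basis lines (Sec. 4.3.2)."""
    (t1, r1), (t2, r2), (t3, r3) = basis
    t, r = line
    A = r1 * np.sin(t2 - t3) + r2 * np.sin(t3 - t1) + r3 * np.sin(t1 - t2)
    tp = (t - t1 + np.pi / 2) % np.pi
    rp = (r2 * np.sin(t - t1) - r1 * np.sin(t - t2)
          + r * np.sin(t1 - t2)) * np.sin(t1 - t3) / abs(A)
    return tp, rp

# Similarity transform: scale 2, rotation 0.5 rad, translation (1, -3); example values.
c, s = np.cos(0.5), np.sin(0.5)
T = np.array([[2 * c, -2 * s, 1.0], [2 * s, 2 * c, -3.0], [0.0, 0.0, 1.0]])

basis, line = [(0.2, 1.0), (0.9, -0.5), (1.7, 0.8)], (2.3, 1.2)
moved_basis = [transform_line(T, t, r) for t, r in basis]
moved_line = transform_line(T, *line)

print(similarity_invariant(basis, line))
print(similarity_invariant(moved_basis, moved_line))  # same values, up to rounding
```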
4.3.3 Affine Line Invariants

To determine an affine transformation uniquely, we need a correspondence of two ordered triplets of lines, which are not parallel to one another and do not intersect at a common point.

One way to encode a fourth line in an affine-invariant way is to find a transformation P which maps the three basis lines to a canonical basis, say x = 0, y = 0 and x + y = 1, and then to apply P to the fourth line.

An affine transformation T in image space has the form

    T = [ A  b ]  =  [ a₁₁  a₁₂  b₁ ]
        [ 0  1 ]     [ a₂₁  a₂₂  b₂ ]
                     [  0    0    1 ]

where A is the skewing matrix and b the translation vector. This corresponds to the parameter-space transformation P defined by (cf. section 4.1.2)

    P = (T⁻¹)^t = [ (A⁻¹)^t       0 ]    (4.14)
                  [ −b^t (A⁻¹)^t  1 ]
so that

    P⁻¹ = T^t = [ A^t  0 ]
                [ b^t  1 ].

Let a line be parameterized as x cos θ + y sin θ − r = 0 and let the three basis lines be represented by their parameters a₁, a₂ and a₃, such that

    a₁ = (cos θ₁, sin θ₁, −r₁)^t,
    a₂ = (cos θ₂, sin θ₂, −r₂)^t,
    a₃ = (cos θ₃, sin θ₃, −r₃)^t,

which are to be mapped by P to

    e₁ = (cos 0, sin 0, 0)^t = (1, 0, 0)^t,
    e₂ = (cos(π/2), sin(π/2), 0)^t = (0, 1, 0)^t,
    e₃ = (cos(π/4), sin(π/4), −1/√2)^t = (1/√2, 1/√2, −1/√2)^t.

In this way, P maps the first basis line to the y-axis, the second basis line to the x-axis, and the third basis line to x + y = 1.

Since ax + by + c = 0 and λax + λby + λc = 0, λ ≠ 0, represent the same line, we have

    λ₁ P a₁ = e₁,    λ₂ P a₂ = e₂,    λ₃ P a₃ = e₃,

where λ₁, λ₂ and λ₃ are non-zero constants. Equivalently,

    P (λ₁a₁, λ₂a₂, λ₃a₃) = (e₁, e₂, e₃).

Then

    (λ₁a₁, λ₂a₂, λ₃a₃) = P⁻¹ (e₁, e₂, e₃)

        = [ a₁₁  a₂₁  0 ] [ 1  0   1/√2 ]
          [ a₁₂  a₂₂  0 ] [ 0  1   1/√2 ]
          [ b₁   b₂   1 ] [ 0  0  −1/√2 ]

        = [ a₁₁  a₂₁  (a₁₁ + a₂₁)/√2   ]
          [ a₁₂  a₂₂  (a₁₂ + a₂₂)/√2   ]
          [ b₁   b₂   (b₁ + b₂ − 1)/√2 ].
Thus

    λ₁ = a₁₁ / cos θ₁ = a₁₂ / sin θ₁ = b₁ / (−r₁),    (4.15)
    λ₂ = a₂₁ / cos θ₂ = a₂₂ / sin θ₂ = b₂ / (−r₂),    (4.16)
    λ₃ = (a₁₁ + a₂₁) / (√2 cos θ₃) = (a₁₂ + a₂₂) / (√2 sin θ₃) = (b₁ + b₂ − 1) / (−√2 r₃).    (4.17)

From Eq. (4.15), we obtain

    a₁₁ = λ₁ cos θ₁,  a₁₂ = λ₁ sin θ₁  and  b₁ = −λ₁ r₁.    (4.18)

From Eq. (4.16), we obtain

    a₂₁ = λ₂ cos θ₂,  a₂₂ = λ₂ sin θ₂  and  b₂ = −λ₂ r₂.    (4.19)

From Eq. (4.17), we obtain

    a₁₁ + a₂₁ = √2 λ₃ cos θ₃,  a₁₂ + a₂₂ = √2 λ₃ sin θ₃  and  b₁ + b₂ − 1 = −√2 λ₃ r₃.    (4.20)

Substituting Eq. (4.18) and Eq. (4.19) into Eq. (4.20), we have

    λ₁ cos θ₁ + λ₂ cos θ₂ − √2 λ₃ cos θ₃ = 0,
    λ₁ sin θ₁ + λ₂ sin θ₂ − √2 λ₃ sin θ₃ = 0,
    λ₁ r₁ + λ₂ r₂ − √2 λ₃ r₃ = −1,

and hence

    λ₁ = −sin(θ₂−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
    λ₂ = −sin(θ₃−θ₁) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ).

Substituting λ₁ and λ₂ back into Eq. (4.18) and Eq. (4.19), we immediately obtain

    a₁₁ = −cos θ₁ sin(θ₂−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
    a₁₂ = −sin θ₁ sin(θ₂−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
    a₂₁ = −cos θ₂ sin(θ₃−θ₁) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
    a₂₂ = −sin θ₂ sin(θ₃−θ₁) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
    b₁ = r₁ sin(θ₂−θ₃) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ),
    b₂ = r₂ sin(θ₃−θ₁) / ( r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂) ).
Thus the transformation P in parameter space can be obtained from Eq. (4.14) as

    P = [  A csc(θ₁−θ₂) csc(θ₂−θ₃) sin θ₂     −A csc(θ₁−θ₂) csc(θ₂−θ₃) cos θ₂      0 ]
        [  A csc(θ₁−θ₂) csc(θ₁−θ₃) sin θ₁     −A csc(θ₁−θ₂) csc(θ₁−θ₃) cos θ₁      0 ]
        [  csc(θ₁−θ₂)(r₂ sin θ₁ − r₁ sin θ₂)  −csc(θ₁−θ₂)(r₂ cos θ₁ − r₁ cos θ₂)   1 ]

where

    A = r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂).

From section 4.1.3, the invariant (θ′, r′)^t of a line (θ, r)^t can be obtained from the basis lines as follows:

    (θ′, r′)^t = H((θ, r)^t) = G(P (cos θ, sin θ, −r)^t) = G(a′),

where

    a′ = ( −csc(θ₁−θ₂) csc(θ₂−θ₃) sin(θ−θ₂) (r₃ sin(θ₁−θ₂) + r₂ sin(θ₃−θ₁) + r₁ sin(θ₂−θ₃)),
           −csc(θ₁−θ₂) csc(θ₁−θ₃) sin(θ−θ₁) (r₃ sin(θ₁−θ₂) + r₂ sin(θ₃−θ₁) + r₁ sin(θ₂−θ₃)),
            r₁ csc(θ₁−θ₂) sin(θ−θ₂) − r₂ csc(θ₁−θ₂) sin(θ−θ₁) − r )^t

and hence

    θ′ = tan⁻¹( A csc(θ₁−θ₃) sin(θ₁−θ) csc(θ₁−θ₂) / ( A csc(θ₂−θ₃) sin(θ₂−θ) csc(θ₁−θ₂) ) ),    (4.21)

    r′ = (1/η) ( r₁ csc(θ₁−θ₂) sin(θ₂−θ) − r₂ csc(θ₁−θ₂) sin(θ₁−θ) + r ),    (4.22)

where

    η = √( csc²(θ₁−θ₃) sin²(θ₁−θ) + csc²(θ₂−θ₃) sin²(θ₂−θ) ) · |A csc(θ₁−θ₂)|,
    A = r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂).
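The affine invariant can be checked the same way; here the vector a′ from the formulas above is formed explicitly and normalized back to (θ′, r′). A sketch in Python with NumPy (helper names, the affine map and the line values are illustrative, not from the text):

```python
import numpy as np

def to_theta_r(a):
    """Normalize a line parameter vector a = (a1, a2, a3) to (theta, r), theta in [0, pi)."""
    a = np.asarray(a, float) / np.hypot(a[0], a[1])
    if a[1] < 0 or (a[1] == 0 and a[0] < 0):
        a = -a
    return np.arctan2(a[1], a[0]) % np.pi, -a[2]

def transform_line(T, theta, r):
    """Transform the line (theta, r) by the image-space map T via a' = (T^-1)^t a."""
    return to_theta_r(np.linalg.inv(T).T @ np.array([np.cos(theta), np.sin(theta), -r]))

def affine_invariant(basis, line):
    """Affine invariant of `line` w.r.t. three basis lines, via the vector a' of Sec. 4.3.3."""
    (t1, r1), (t2, r2), (t3, r3) = basis
    t, r = line
    A = r1 * np.sin(t2 - t3) + r2 * np.sin(t3 - t1) + r3 * np.sin(t1 - t2)
    a1 = -A * np.sin(t - t2) / (np.sin(t1 - t2) * np.sin(t2 - t3))
    a2 = -A * np.sin(t - t1) / (np.sin(t1 - t2) * np.sin(t1 - t3))
    a3 = (r1 * np.sin(t - t2) - r2 * np.sin(t - t1)) / np.sin(t1 - t2) - r
    return to_theta_r([a1, a2, a3])

# A generic affine map (skew + translation); arbitrary example values.
T = np.array([[1.3, 0.4, 2.0], [-0.2, 0.8, -1.0], [0.0, 0.0, 1.0]])

basis, line = [(0.2, 1.0), (1.1, -0.6), (2.0, 0.7)], (2.7, 1.5)
moved_basis = [transform_line(T, t, r) for t, r in basis]
moved_line = transform_line(T, *line)

print(affine_invariant(basis, line))
print(affine_invariant(moved_basis, moved_line))  # same values, up to rounding
```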
4.3.4 Projective Line Invariants

Although a projective transformation is fractional linear, one can treat it as a linear transformation by using homogeneous coordinates.

To determine a projective transformation uniquely, we need a correspondence of two ordered quadruplets of lines, such that in each quadruplet no three lines are parallel to one another or intersect at a common point (thus no three points are collinear in parameter space).

One way to encode a fifth line in a projective-invariant way is to find a transformation P which maps the four basis lines to a canonical basis, say x = 0, y = 0, x = 1 and y = 1, and then to apply P to the fifth line.

A projective transformation T in image space has the form

    T = [ a₁₁  a₁₂  a₁₃ ]
        [ a₂₁  a₂₂  a₂₃ ]
        [ a₃₁  a₃₂  a₃₃ ].

This corresponds to the parameter-space transformation P defined by (cf. section 4.1.2)

    P = (T⁻¹)^t = (1/D) [ a₂₂a₃₃ − a₂₃a₃₂   a₂₃a₃₁ − a₂₁a₃₃   a₂₁a₃₂ − a₂₂a₃₁ ]    (4.23)
                        [ a₁₃a₃₂ − a₁₂a₃₃   a₁₁a₃₃ − a₁₃a₃₁   a₁₂a₃₁ − a₁₁a₃₂ ]
                        [ a₁₂a₂₃ − a₁₃a₂₂   a₁₃a₂₁ − a₁₁a₂₃   a₁₁a₂₂ − a₁₂a₂₁ ]

where

    D = a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₂₁a₃₂ − a₁₁a₂₃a₃₂ − a₁₂a₂₁a₃₃ − a₁₃a₂₂a₃₁,

so that

    P⁻¹ = T^t = [ a₁₁  a₂₁  a₃₁ ]
                [ a₁₂  a₂₂  a₃₂ ]
                [ a₁₃  a₂₃  a₃₃ ].

Let a line be parameterized as x cos θ + y sin θ − r = 0 and let the four basis lines be represented by their parameters a₁, a₂, a₃ and a₄, such that

    a₁ = (cos θ₁, sin θ₁, −r₁)^t,
    a₂ = (cos θ₂, sin θ₂, −r₂)^t,
    a₃ = (cos θ₃, sin θ₃, −r₃)^t,
    a₄ = (cos θ₄, sin θ₄, −r₄)^t,
which are to be mapped by P to

    e₁ = (cos 0, sin 0, 0)^t = (1, 0, 0)^t,
    e₂ = (cos(π/2), sin(π/2), 0)^t = (0, 1, 0)^t,
    e₃ = (cos 0, sin 0, −1)^t = (1, 0, −1)^t,
    e₄ = (cos(π/2), sin(π/2), −1)^t = (0, 1, −1)^t.

In this way, P maps the first basis line to the y-axis, the second basis line to the x-axis, the third basis line to x = 1, and the fourth basis line to y = 1.

Since ax + by + c = 0 and λax + λby + λc = 0, λ ≠ 0, represent the same line, we have

    λ₁ P a₁ = e₁,    λ₂ P a₂ = e₂,    λ₃ P a₃ = e₃,    λ₄ P a₄ = e₄,

where λ₁, λ₂, λ₃ and λ₄ are non-zero constants. Equivalently,

    P (λ₁a₁, λ₂a₂, λ₃a₃, λ₄a₄) = (e₁, e₂, e₃, e₄).

Then

    (λ₁a₁, λ₂a₂, λ₃a₃, λ₄a₄) = P⁻¹ (e₁, e₂, e₃, e₄)

        = [ a₁₁  a₂₁  a₃₁ ] [ 1  0   1   0 ]
          [ a₁₂  a₂₂  a₃₂ ] [ 0  1   0   1 ]
          [ a₁₃  a₂₃  a₃₃ ] [ 0  0  −1  −1 ]

        = [ a₁₁  a₂₁  a₁₁ − a₃₁  a₂₁ − a₃₁ ]
          [ a₁₂  a₂₂  a₁₂ − a₃₂  a₂₂ − a₃₂ ]
          [ a₁₃  a₂₃  a₁₃ − a₃₃  a₂₃ − a₃₃ ].

We have

    λ₁ = a₁₁ / cos θ₁ = a₁₂ / sin θ₁ = a₁₃ / (−r₁),    (4.24)
    λ₂ = a₂₁ / cos θ₂ = a₂₂ / sin θ₂ = a₂₃ / (−r₂),    (4.25)
    λ₃ = (a₁₁ − a₃₁) / cos θ₃ = (a₁₂ − a₃₂) / sin θ₃ = (a₁₃ − a₃₃) / (−r₃),    (4.26)
    λ₄ = (a₂₁ − a₃₁) / cos θ₄ = (a₂₂ − a₃₂) / sin θ₄ = (a₂₃ − a₃₃) / (−r₄).    (4.27)
From Eq. (4.24), we obtain

    a₁₁ = λ₁ cos θ₁,  a₁₂ = λ₁ sin θ₁  and  a₁₃ = −λ₁ r₁.    (4.28)

From Eq. (4.25), we obtain

    a₂₁ = λ₂ cos θ₂,  a₂₂ = λ₂ sin θ₂  and  a₂₃ = −λ₂ r₂.    (4.29)

From Eq. (4.26), we obtain

    a₃₁ = a₁₁ − λ₃ cos θ₃,  a₃₂ = a₁₂ − λ₃ sin θ₃  and  a₃₃ = a₁₃ + λ₃ r₃.    (4.30)

From Eq. (4.27), we obtain

    a₃₁ = a₂₁ − λ₄ cos θ₄,  a₃₂ = a₂₂ − λ₄ sin θ₄  and  a₃₃ = a₂₃ + λ₄ r₄.    (4.31)

Since Eq. (4.30) and Eq. (4.31) must be consistent, we have, by substituting Eq. (4.28) in Eq. (4.30) and Eq. (4.29) in Eq. (4.31),

    λ₁ cos θ₁ − λ₂ cos θ₂ − λ₃ cos θ₃ = −λ₄ cos θ₄,
    λ₁ sin θ₁ − λ₂ sin θ₂ − λ₃ sin θ₃ = −λ₄ sin θ₄,
    λ₁ r₁ − λ₂ r₂ − λ₃ r₃ = −λ₄ r₄,

and hence

    λ₁ = (C₁/C₄) λ₄,  λ₂ = (C₂/C₄) λ₄  and  λ₃ = (C₃/C₄) λ₄,

where

    C₁ = −r₄ sin(θ₂−θ₃) + r₃ sin(θ₂−θ₄) − r₂ sin(θ₃−θ₄),
    C₂ = −r₄ sin(θ₁−θ₃) + r₃ sin(θ₁−θ₄) − r₁ sin(θ₃−θ₄),
    C₃ = r₄ sin(θ₁−θ₂) − r₂ sin(θ₁−θ₄) + r₁ sin(θ₂−θ₄),
    C₄ = r₁ sin(θ₂−θ₃) + r₂ sin(θ₃−θ₁) + r₃ sin(θ₁−θ₂).

Thus

    a₁₁ = cos θ₁ (C₁/C₄) λ₄,  a₁₂ = sin θ₁ (C₁/C₄) λ₄  and  a₁₃ = −r₁ (C₁/C₄) λ₄,
    a₂₁ = cos θ₂ (C₂/C₄) λ₄,  a₂₂ = sin θ₂ (C₂/C₄) λ₄  and  a₂₃ = −r₂ (C₂/C₄) λ₄,
    a₃₁ = (cos θ₁ C₁/C₄ − cos θ₃ C₃/C₄) λ₄,  a₃₂ = (sin θ₁ C₁/C₄ − sin θ₃ C₃/C₄) λ₄  and  a₃₃ = (−r₁ C₁/C₄ + r₃ C₃/C₄) λ₄.
The transformation P in parameter space can be obtained from Eq. (4.23) as

    P = (−1/(C₁C₂C₃λ₄)) [ p₁₁  p₁₂  p₁₃ ]
                        [ p₂₁  p₂₂  p₂₃ ]
                        [ p₃₁  p₃₂  p₃₃ ]

where

    p₁₁ = C₂ (C₁ r₂ sin θ₁ − C₁ r₁ sin θ₂ + C₃ r₃ sin θ₂ − C₃ r₂ sin θ₃),
    p₁₂ = −C₂ (C₁ r₂ cos θ₁ − C₁ r₁ cos θ₂ + C₃ r₃ cos θ₂ − C₃ r₂ cos θ₃),
    p₁₃ = C₂ (C₁ sin(θ₁−θ₂) + C₃ sin(θ₂−θ₃)),
    p₂₁ = C₁C₃ (r₁ sin θ₃ − r₃ sin θ₁),
    p₂₂ = −C₁C₃ (r₁ cos θ₃ − r₃ cos θ₁),
    p₂₃ = −C₁C₃ sin(θ₁−θ₃),
    p₃₁ = C₁C₂ (r₁ sin θ₂ − r₂ sin θ₁),
    p₃₂ = −C₁C₂ (r₁ cos θ₂ − r₂ cos θ₁),
    p₃₃ = −C₁C₂ sin(θ₁−θ₂).

From section 4.1.3, the invariant (θ′, r′)^t of a line (θ, r)^t can be obtained from the basis lines as follows:

    (θ′, r′)^t = H((θ, r)^t) = G(P (cos θ, sin θ, −r)^t) = G(a′),

where

    a′ = (−1/(C₁C₂C₃λ₄)) ( A·B·C, D·E·F, G·C·F )^t

with

    A = r₃ sin(θ₁−θ₂) − r₂ sin(θ₁−θ₃) + r₁ sin(θ₂−θ₃),
    B = r₄ sin(θ−θ₂) − r₂ sin(θ−θ₄) + r sin(θ₂−θ₄),
    C = r₄ sin(θ₁−θ₃) − r₃ sin(θ₁−θ₄) + r₁ sin(θ₃−θ₄),
    D = −r₃ sin(θ−θ₁) + r₁ sin(θ−θ₃) − r sin(θ₁−θ₃),
    E = r₄ sin(θ₁−θ₂) − r₂ sin(θ₁−θ₄) + r₁ sin(θ₂−θ₄),
    F = r₄ sin(θ₂−θ₃) − r₃ sin(θ₂−θ₄) + r₂ sin(θ₃−θ₄),
    G = r₂ sin(θ−θ₁) − r₁ sin(θ−θ₂) + r sin(θ₁−θ₂),

and hence

    θ′ = tan⁻¹( DEF / ABC ),

    r′ = −GCF / √( (ABC)² + (DEF)² ).
Chapter 5<br />
The Effect of Noise on the Invariants
In an environment affected by serious noise and occlusion, line features are usually detected using the Hough transform. However, detected line parameters (θ, r) always deviate slightly from their true values: in the physical process of image acquisition the positions of the endpoints of a segment are randomly perturbed, and occlusion also affects the Hough transform. As long as a segment is not too short, this induces only a slight perturbation of the line parameters of the segment. It is therefore reasonable to assume that detected line parameters differ from the "true" values by a small perturbation having a Gaussian distribution.

In the following, we derive the "spread" of the computed invariant over hash space from the perturbations of the lines which give rise to this invariant.
5.1 A Noise Model for Line Parameters

We make the following mild assumptions: the measured line parameters (θ, r) of a set of lines

1. are statistically independent;

2. are distributed according to a Gaussian distribution centered at the true value of the parameters;
CHAPTER 5. THE EFFECT OF NOISE ON THE INVARIANTS 61<br />
3. have a fixed covariance Σ = [ σ_θ²  σ_θr ; σ_θr  σ_r² ].

More precisely, let (θ, r) be the "true" value of the parameters of a line and (ΔΘ, ΔR) be the stochastic variable denoting the perturbation of (θ, r). The joint probability density function of ΔΘ and ΔR is then given by

p(ΔΘ, ΔR) = (1/(2π √|Σ|)) exp[−(1/2) (ΔΘ, ΔR) Σ⁻¹ (ΔΘ, ΔR)^t],

so that the measured parameters (θ + ΔΘ, r + ΔR) are centered around (θ, r).
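This noise model is easy to simulate. The sketch below draws noisy measurements of a line under the assumptions above; the numeric values of σ_θ², σ_r² and σ_θr are illustrative assumptions, not values taken from the text.

```python
import numpy as np

# Noise model of section 5.1 (sketch): a measured (theta, r) is the true
# value plus a zero-mean Gaussian perturbation with fixed covariance Sigma.
# The entries of Sigma below are assumed, illustrative values.
sigma_theta2, sigma_r2, sigma_theta_r = 4e-4, 2.25, 1e-3
Sigma = np.array([[sigma_theta2, sigma_theta_r],
                  [sigma_theta_r, sigma_r2]])

def perturb_line(theta, r, rng):
    """Draw one noisy measurement of the line (theta, r)."""
    d_theta, d_r = rng.multivariate_normal(np.zeros(2), Sigma)
    return theta + d_theta, r + d_r

rng = np.random.default_rng(0)
samples = np.array([perturb_line(0.5, 40.0, rng) for _ in range(20000)])
print(samples.mean(axis=0))   # close to the true (0.5, 40.0)
print(np.cov(samples.T))      # close to Sigma
```

The sample mean and sample covariance converge to (θ, r) and Σ respectively, which is exactly what assumptions 2 and 3 state.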
5.2 The Spread Function of the Invariants<br />
In the following we derive, from the perturbations of the basis lines and of the line being encoded, a closed-form formula for the perturbation of the invariants considered in the preceding chapter.
We first introduce a lemma and a corollary derived from the lemma.

Lemma: Let X_1, X_2, ..., X_n have a multivariate Gaussian distribution with mean vector u and positive definite covariance matrix Σ. Then the moment-generating function of the multivariate Gaussian p.d.f. is given by

M(v) = exp[v^t u + (v^t Σ v)/2], for all real vectors v.

Corollary: Let Y^t = (Y_1, Y_2) be such that Y = CX, where X^t = (X_1, X_2, ..., X_n) and C = [ c_11 ... c_1n ; c_21 ... c_2n ] is a real 2×n matrix. Then the random variable Y is N(Cu, CΣC^t).

Proof. The moment-generating function of the distribution of Y is given by

M(v) = E(e^{v^t Y}) = E(e^{v^t CX}) = E(e^{(C^t v)^t X}).

Applying the above lemma, we have

M(v) = exp[(C^t v)^t u + ((C^t v)^t Σ (C^t v))/2]
     = exp[v^t (Cu) + (v^t (CΣC^t) v)/2].

Thus the random variable Y is N(Cu, CΣC^t).
Q.E.D.
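The corollary can be checked numerically. The sketch below computes the predicted N(Cu, CΣCᵗ) for a linear map of a Gaussian vector and compares it against a Monte-Carlo estimate; u, Σ and C are made-up illustrative values.

```python
import numpy as np

# Corollary, numerically (sketch): if X ~ N(u, Sigma) and Y = C X with C a
# real 2 x n matrix, then Y ~ N(C u, C Sigma C^t).
def propagate(u, Sigma, C):
    """Mean and covariance of Y = C X when X ~ N(u, Sigma)."""
    return C @ u, C @ Sigma @ C.T

u = np.array([0.1, -0.2, 0.3])
Sigma = np.diag([0.04, 0.09, 0.01])
C = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.0, 2.0]])

mean_y, cov_y = propagate(u, Sigma, C)

# Monte-Carlo check: the empirical moments of C X match (C u, C Sigma C^t).
rng = np.random.default_rng(1)
Y = rng.multivariate_normal(u, Sigma, size=50000) @ C.T
print(mean_y, Y.mean(axis=0))
print(cov_y, np.cov(Y.T))
```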
Now we go back to consider the perturbation of the invariant induced by the perturbation of the lines. In the preceding chapter, we gave formulae for the invariants (θ′, r′) under various transformations. These invariants are functions of the line being encoded, (θ, r)^t, and the basis lines, (θ_i, r_i)^t, i = 1, ..., k:

(θ′, r′)^t = f((θ, r)^t, (θ_1, r_1)^t, ..., (θ_k, r_k)^t).

Equivalently, we may rewrite the above equation as follows:

θ′ = f′((θ, r)^t, (θ_1, r_1)^t, ..., (θ_k, r_k)^t),   (5.1)
r′ = f″((θ, r)^t, (θ_1, r_1)^t, ..., (θ_k, r_k)^t).   (5.2)

The exact forms of Eq. (5.1) and Eq. (5.2), together with the value of k (i.e., the number of basis lines needed), depend on the viewing transformation under consideration and are given in sections 4.3.1, 4.3.2, 4.3.3 and 4.3.4 respectively.

Introducing perturbations of the line parameters (θ, r)^t and (θ_i, r_i)^t, i = 1, ..., k, results in a perturbation of the computed invariant (θ′, r′)^t, having the following form:

θ′ + Δθ′ = f′((θ + Δθ, r + Δr)^t, (θ_1 + Δθ_1, r_1 + Δr_1)^t, ..., (θ_k + Δθ_k, r_k + Δr_k)^t),   (5.3)
r′ + Δr′ = f″((θ + Δθ, r + Δr)^t, (θ_1 + Δθ_1, r_1 + Δr_1)^t, ..., (θ_k + Δθ_k, r_k + Δr_k)^t).   (5.4)

Expanding Eq. (5.3) and Eq. (5.4) in Maclaurin series in the perturbations, ignoring second and higher order perturbation terms, and then subtracting Eq. (5.1) and Eq. (5.2) from them, we have

Δθ′ = (∂θ′/∂θ) Δθ + (∂θ′/∂r) Δr + Σ_{i=1}^{k} [ (∂θ′/∂θ_i) Δθ_i + (∂θ′/∂r_i) Δr_i ]
    = c_11 Δθ + c_12 Δr + c_13 Δθ_1 + c_14 Δr_1 + ... + c_{1,2k+1} Δθ_k + c_{1,2k+2} Δr_k,   (5.5)

Δr′ = (∂r′/∂θ) Δθ + (∂r′/∂r) Δr + Σ_{i=1}^{k} [ (∂r′/∂θ_i) Δθ_i + (∂r′/∂r_i) Δr_i ]
    = c_21 Δθ + c_22 Δr + c_23 Δθ_1 + c_24 Δr_1 + ... + c_{2,2k+1} Δθ_k + c_{2,2k+2} Δr_k,   (5.6)

where c_11 = ∂θ′/∂θ, c_12 = ∂θ′/∂r, c_21 = ∂r′/∂θ, c_22 = ∂r′/∂r, c_{1,i} = ∂θ′/∂θ_{(i−1)/2}, c_{2,i} = ∂r′/∂θ_{(i−1)/2} for i = 3, 5, ..., 2k+1, and c_{1,i} = ∂θ′/∂r_{(i−2)/2}, c_{2,i} = ∂r′/∂r_{(i−2)/2} for i = 4, 6, ..., 2k+2.
Let (ΔΘ, ΔR), (ΔΘ_i, ΔR_i), i = 1, ..., k, and (ΔΘ′, ΔR′) be stochastic variables denoting the perturbations of (θ, r), (θ_i, r_i), i = 1, ..., k, and (θ′, r′) respectively. Eq. (5.5) and Eq. (5.6) can be rewritten as

ΔΘ′ = c_11 ΔΘ + c_12 ΔR + c_13 ΔΘ_1 + c_14 ΔR_1 + ... + c_{1,2k+1} ΔΘ_k + c_{1,2k+2} ΔR_k,
ΔR′ = c_21 ΔΘ + c_22 ΔR + c_23 ΔΘ_1 + c_24 ΔR_1 + ... + c_{2,2k+1} ΔΘ_k + c_{2,2k+2} ΔR_k.

Since p(ΔΘ, ΔR) and p(ΔΘ_i, ΔR_i), i = 1, ..., k, are Gaussian and independent, the joint p.d.f. p(ΔΘ, ΔR, ΔΘ_1, ΔR_1, ..., ΔΘ_k, ΔR_k) is also Gaussian. More precisely,

p(ΔΘ, ΔR, ΔΘ_1, ΔR_1, ..., ΔΘ_k, ΔR_k)
= p(ΔΘ, ΔR) p(ΔΘ_1, ΔR_1) ··· p(ΔΘ_k, ΔR_k)
= (1 / ((2π)^{k+1} √|Σ̂|)) exp[−(1/2) V^t Σ̂⁻¹ V],

where

V = (ΔΘ, ΔR, ΔΘ_1, ΔR_1, ..., ΔΘ_k, ΔR_k)^t

and Σ̂ = diag(Σ, Σ, ..., Σ) is the (2k+2)×(2k+2) block-diagonal matrix with k+1 copies of Σ along its diagonal.

Since

(ΔΘ′, ΔR′)^t = [ c_11 c_12 ... c_{1,2k+1} c_{1,2k+2} ; c_21 c_22 ... c_{2,2k+1} c_{2,2k+2} ] V,

it follows from the above corollary that (ΔΘ′, ΔR′) also has a Gaussian distribution, with p.d.f.

p(ΔΘ′, ΔR′) = (1/(2π √|Σ′|)) exp[−(1/2) (ΔΘ′, ΔR′) Σ′⁻¹ (ΔΘ′, ΔR′)^t]   (5.7)

and covariance matrix

Σ′ = [ σ_θ′²  σ_θ′r′ ; σ_θ′r′  σ_r′² ],
where<br />
ç 2 ç 0 = èc2 + 11 c2 + æææ+ 13 c2 1;2k+1 èç2 +èc2 + ç 12 c2 + æææ+ 14 c2 1;2k+2 èç2 + r 11 c 12 + c 13 c 14 + æææ+ c 1;2k+1 c 1;2k+2èç çr ;<br />
ç 2 r 0 = èc2 21<br />
+ c 2 23<br />
+ æææ+ c 2 2;2k+1èç 2 +èc 2 ç 22<br />
+ c 2 24<br />
+ æææ+ c 2 2;2k+2èç 2 +<br />
r 21 c 22 + c 23 c 24 + æææ+ c 2;2k+1 c 2;2k+2èç çr ;<br />
ç ç<br />
0 r<br />
0 = èc 11 c 21 + c 13 c 23 + æææ+ c 1;2k+1 c 2;2k+1èç 2 +<br />
ç 12 c 22 + c 14 c 24 + æææ+ c 1;2k+2 c 2;2k+2èç 2 +<br />
r 11 c 22 + c 13 c 24 + æææ+ c 1;2k+1 c 2;2k+2 + c 21 c 12 + c 23 c 14 + æææ+ c 2;2k+1 c 1;2k+2èç çr :<br />
Recall that in computing the invariant (θ′, r′)^t we may adjust the value of θ′ by adding π or −π, with the sign of r′ flipped accordingly, so that θ′ lies in [0, π). If this is done, the covariance matrix Σ′ must be adjusted by flipping the sign of σ_θ′r′.
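The propagation above can be sketched numerically: approximate the partials c_ij by central differences, assemble the block-diagonal Σ̂, and form J Σ̂ Jᵗ. The invariant function used below is the rigid-transformation one of section 5.3.1, and the entries of Σ are assumed illustrative values.

```python
import numpy as np

def rigid_invariant(p):
    """p = (theta, r, theta1, r1, theta2, r2) -> (theta', r').
    Rigid invariant of section 5.3.1 (as reconstructed there)."""
    th, r, th1, r1, th2, r2 = p
    thp = th - th1 + np.pi / 2.0
    rp = r + (r2 * np.sin(th - th1) - r1 * np.sin(th - th2)) / np.sin(th1 - th2)
    return np.array([thp, rp])

def spread(f, p, Sigma, eps=1e-6):
    """First-order covariance of the invariant: J Sigma_hat J^t, with the
    2 x (2k+2) Jacobian J of f estimated by central differences."""
    n = len(p)
    Sigma_hat = np.kron(np.eye(n // 2), Sigma)   # independent lines
    J = np.zeros((2, n))
    for i in range(n):
        dp = np.zeros(n); dp[i] = eps
        J[:, i] = (f(p + dp) - f(p - dp)) / (2 * eps)
    return J @ Sigma_hat @ J.T

Sigma = np.array([[4e-4, 1e-4], [1e-4, 1.0]])    # assumed line-noise covariance
p = np.array([0.8, 25.0, 0.3, 10.0, 1.9, 40.0])
print(spread(rigid_invariant, p, Sigma))
```

For the rigid case the (1,1) entry reduces to 2σ_θ², in agreement with the closed form of section 5.3.1.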
5.3 Spread Functions under Various Transformation Groups<br />
In the following subsections, we specialize the general result derived in the preceding section<br />
<strong>to</strong> the various transformation groups considered earlier.<br />
5.3.1 Rigid-Invariant Spread Function<br />
From section 4.3.1, we have two sets of invariants èdue <strong>to</strong> 180 o -rotation ambiguityè for<br />
rigid transformations as follows:<br />
and<br />
ç 0 = ç , ç 1 + ç 2 ;<br />
r 0 = r + cscèç 1 , ç 2èèr 2 sinèç , ç 1è , r 1 sinèç , ç 2èè;<br />
ç 0 = ç , ç 1 , ç 2 ;<br />
r 0 = r + cscèç 1 , ç2èèr 2 sinèç , ç 1è , r 1 sinèç , ç 2èè;<br />
where èç; rè is the parameter of the line being encoded and èç i ;r i è i=1;2 are the parameters<br />
of the basis lines.
Let (ΔΘ, ΔR), (ΔΘ_i, ΔR_i), i = 1, 2, and (ΔΘ′, ΔR′) be stochastic variables denoting respectively the perturbations of (θ, r), (θ_i, r_i), i = 1, 2, and the computed invariant (θ′, r′). From section 5.2, we have

ΔΘ′ = c_11 ΔΘ + c_12 ΔR + c_13 ΔΘ_1 + c_14 ΔR_1 + c_15 ΔΘ_2 + c_16 ΔR_2,
ΔR′ = c_21 ΔΘ + c_22 ΔR + c_23 ΔΘ_1 + c_24 ΔR_1 + c_25 ΔΘ_2 + c_26 ΔR_2.

Computing the c_ij's, we obtain

c_11 = 1,
c_12 = 0,
c_13 = −1,
c_14 = 0,
c_15 = 0,
c_16 = 0,
c_21 = csc(θ_1 − θ_2)(r_2 cos(θ − θ_1) − r_1 cos(θ − θ_2)),
c_22 = 1,
c_23 = −sin(θ − θ_2) csc²(θ_1 − θ_2)(r_2 − r_1 cos(θ_1 − θ_2)),
c_24 = −sin(θ − θ_2) csc(θ_1 − θ_2),
c_25 = −sin(θ − θ_1) csc²(θ_1 − θ_2)(r_1 − r_2 cos(θ_1 − θ_2)),
c_26 = sin(θ − θ_1) csc(θ_1 − θ_2).

Thus we conclude that (ΔΘ′, ΔR′) has a Gaussian distribution with p.d.f.

p(ΔΘ′, ΔR′) = (1/(2π √|Σ′|)) exp[−(1/2) (ΔΘ′, ΔR′) Σ′⁻¹ (ΔΘ′, ΔR′)^t]

and covariance matrix

Σ′ = [ σ_θ′²  σ_θ′r′ ; σ_θ′r′  σ_r′² ],

where

σ_θ′² = 2σ_θ²,
σ_r′² = (c_21² + c_23² + c_25²) σ_θ² + (1 + c_24² + c_26²) σ_r² + 2(c_21 + c_23 c_24 + c_25 c_26) σ_θr,
σ_θ′r′ = (c_21 − c_23) σ_θ² + (c_22 − c_24) σ_θr.
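The closed-form rigid spread above translates directly into code. The sketch below evaluates the c_2j coefficients and returns the 2×2 covariance of the perturbed invariant; the noise entries passed in the example are assumed values.

```python
import numpy as np

def rigid_spread(th, r, th1, r1, th2, r2, s2_th, s2_r, s_thr):
    """Covariance of (dTheta', dR') for the rigid invariant (section 5.3.1).
    s2_th, s2_r, s_thr are the entries of the per-line noise covariance."""
    csc12 = 1.0 / np.sin(th1 - th2)
    c21 = csc12 * (r2 * np.cos(th - th1) - r1 * np.cos(th - th2))
    c22 = 1.0
    c23 = -np.sin(th - th2) * csc12**2 * (r2 - r1 * np.cos(th1 - th2))
    c24 = -np.sin(th - th2) * csc12
    c25 = -np.sin(th - th1) * csc12**2 * (r1 - r2 * np.cos(th1 - th2))
    c26 = np.sin(th - th1) * csc12
    v_th = 2.0 * s2_th
    v_r = (c21**2 + c23**2 + c25**2) * s2_th \
        + (1.0 + c24**2 + c26**2) * s2_r \
        + 2.0 * (c21 + c23 * c24 + c25 * c26) * s_thr
    cov = (c21 - c23) * s2_th + (c22 - c24) * s_thr
    return np.array([[v_th, cov], [cov, v_r]])

print(rigid_spread(0.8, 25.0, 0.3, 10.0, 1.9, 40.0, 4e-4, 1.0, 1e-4))
```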
5.3.2 Similarity-Invariant Spread Function

From section 4.3.2, we have the following formulae for the invariant for similarity transformations:

θ′ = θ − θ_1 + π/2,
r′ = B/|A|,

where

A = r_1 sin(θ_2 − θ_3) + r_2 sin(θ_3 − θ_1) + r_3 sin(θ_1 − θ_2),
B = r_2 sin(θ − θ_1) − r_1 sin(θ − θ_2) + r sin(θ_1 − θ_2),

and (θ, r) are the parameters of the line being encoded and (θ_i, r_i), i = 1, 2, 3, are the parameters of the basis lines.

Let (ΔΘ, ΔR), (ΔΘ_i, ΔR_i), i = 1, 2, 3, and (ΔΘ′, ΔR′) be stochastic variables denoting respectively the perturbations of (θ, r), (θ_i, r_i), i = 1, 2, 3, and the computed invariant (θ′, r′). From section 5.2, we have

ΔΘ′ = c_11 ΔΘ + c_12 ΔR + c_13 ΔΘ_1 + c_14 ΔR_1 + c_15 ΔΘ_2 + c_16 ΔR_2 + c_17 ΔΘ_3 + c_18 ΔR_3,
ΔR′ = c_21 ΔΘ + c_22 ΔR + c_23 ΔΘ_1 + c_24 ΔR_1 + c_25 ΔΘ_2 + c_26 ΔR_2 + c_27 ΔΘ_3 + c_28 ΔR_3.

Computing the c_ij's, we obtain

c_11 = 1,
c_12 = 0,
c_13 = −1,
c_14 = 0,
c_15 = 0,
c_16 = 0,
c_17 = 0,
c_18 = 0,
c_21 = (r_2 cos(θ − θ_1) − r_1 cos(θ − θ_2))/|A|,
c_22 = sin(θ_1 − θ_2)/|A|,
c_23 = [(r_2 cos(θ_1 − θ_3) − r_3 cos(θ_1 − θ_2)) B/A − r_2 cos(θ − θ_1) + r cos(θ_1 − θ_2)]/|A|,
c_24 = [−sin(θ_2 − θ_3) B/A − sin(θ − θ_2)]/|A|,
c_25 = [(r_3 cos(θ_1 − θ_2) − r_1 cos(θ_2 − θ_3)) B/A + r_1 cos(θ − θ_2) − r cos(θ_1 − θ_2)]/|A|,
c_26 = [sin(θ_1 − θ_3) B/A + sin(θ − θ_1)]/|A|,
c_27 = (r_1 cos(θ_2 − θ_3) − r_2 cos(θ_1 − θ_3)) B/(A|A|),
c_28 = −sin(θ_1 − θ_2) B/(A|A|).

Thus we conclude that (ΔΘ′, ΔR′) has a Gaussian distribution with p.d.f.

p(ΔΘ′, ΔR′) = (1/(2π √|Σ′|)) exp[−(1/2) (ΔΘ′, ΔR′) Σ′⁻¹ (ΔΘ′, ΔR′)^t]

and covariance matrix

Σ′ = [ σ_θ′²  σ_θ′r′ ; σ_θ′r′  σ_r′² ],
where

σ_θ′² = 2σ_θ²,
σ_r′² = (c_21² + c_23² + c_25² + c_27²) σ_θ² + (c_22² + c_24² + c_26² + c_28²) σ_r² +
        2(c_21 c_22 + c_23 c_24 + c_25 c_26 + c_27 c_28) σ_θr,
σ_θ′r′ = (c_21 − c_23) σ_θ² + (c_22 − c_24) σ_θr.
5.3.3 Affine-Invariant Spread Function

From section 4.3.3, the invariant for affine transformations has the following form:

θ′ = tan⁻¹( A csc(θ_1 − θ_3) sin(θ_1 − θ) csc(θ_1 − θ_2) / (A csc(θ_2 − θ_3) sin(θ_2 − θ) csc(θ_1 − θ_2)) ),
r′ = C/D,

where

A = r_3 sin(θ_1 − θ_2) + r_2 sin(θ_3 − θ_1) + r_1 sin(θ_2 − θ_3),
B = csc²(θ_1 − θ_3) sin²(θ_1 − θ) + csc²(θ_2 − θ_3) sin²(θ_2 − θ),
C = r_1 csc(θ_1 − θ_2) sin(θ_2 − θ) − r_2 csc(θ_1 − θ_2) sin(θ_1 − θ) + r,
D = √(A² B csc²(θ_1 − θ_2)),

and (θ, r) are the parameters of the line being encoded and (θ_i, r_i), i = 1, 2, 3, are the parameters of the basis lines.

Let (ΔΘ, ΔR), (ΔΘ_i, ΔR_i), i = 1, 2, 3, and (ΔΘ′, ΔR′) be stochastic variables denoting respectively the perturbations of (θ, r), (θ_i, r_i), i = 1, 2, 3, and the computed invariant (θ′, r′). From section 5.2, we have

ΔΘ′ = c_11 ΔΘ + c_12 ΔR + c_13 ΔΘ_1 + c_14 ΔR_1 + c_15 ΔΘ_2 + c_16 ΔR_2 + c_17 ΔΘ_3 + c_18 ΔR_3,
ΔR′ = c_21 ΔΘ + c_22 ΔR + c_23 ΔΘ_1 + c_24 ΔR_1 + c_25 ΔΘ_2 + c_26 ΔR_2 + c_27 ΔΘ_3 + c_28 ΔR_3.

Computing the c_ij's and writing E = csc(θ_1 − θ_3) csc(θ_2 − θ_3)/B, we obtain

c_11 = sin(θ_1 − θ_2) E,
c_12 = 0,
c_13 = −csc(θ_1 − θ_3) sin(θ − θ_2) sin(θ − θ_3) E,
c_14 = 0,
c_15 = csc(θ_2 − θ_3) sin(θ − θ_1) sin(θ − θ_3) E,
c_16 = 0,
c_17 = −csc(θ_1 − θ_3) csc(θ_2 − θ_3) sin(θ_1 − θ_2) sin(θ − θ_1) sin(θ − θ_2) E,
c_18 = 0,
c_21 = [−(cos(θ − θ_1) csc²(θ_1 − θ_3) sin(θ − θ_1) + cos(θ − θ_2) csc²(θ_2 − θ_3) sin(θ − θ_2)) C/B +
        csc(θ_1 − θ_2)(r_2 cos(θ − θ_1) − r_1 cos(θ − θ_2))]/D,
c_22 = 1/D,
c_23 = [−csc(θ_1 − θ_2)(r_2 cos(θ − θ_1) + cot(θ_1 − θ_2)(r_2 sin(θ − θ_1) − r_1 sin(θ − θ_2))) −
        C((r_3 cos(θ_1 − θ_2) − r_2 cos(θ_1 − θ_3))/A −
        csc²(θ_1 − θ_3) sin(θ − θ_1)(cos(θ − θ_1) + cot(θ_1 − θ_3) sin(θ − θ_1))/B) −
        cot(θ_1 − θ_2)]/D,
c_24 = [−sin(θ_2 − θ_3) C/A − csc(θ_1 − θ_2) sin(θ − θ_2)]/D,
c_25 = [csc(θ_1 − θ_2)(r_1 cos(θ − θ_2) + cot(θ_1 − θ_2)(r_2 sin(θ − θ_1) − r_1 sin(θ − θ_2))) −
        C((r_1 cos(θ_2 − θ_3) − r_3 cos(θ_1 − θ_2))/A −
        csc²(θ_2 − θ_3) sin(θ − θ_2)(cos(θ − θ_2) + cot(θ_2 − θ_3) sin(θ − θ_2))/B) +
        cot(θ_1 − θ_2)]/D,
c_26 = [sin(θ_1 − θ_3) C/A + csc(θ_1 − θ_2) sin(θ − θ_1)]/D,
c_27 = [−C((r_2 cos(θ_1 − θ_3) − r_1 cos(θ_2 − θ_3))/A +
        (cot(θ_1 − θ_3) csc²(θ_1 − θ_3) sin²(θ − θ_1) +
        cot(θ_2 − θ_3) csc²(θ_2 − θ_3) sin²(θ − θ_2))/B)]/D,
c_28 = [−sin(θ_1 − θ_2) C/A]/D.

Thus we conclude that (ΔΘ′, ΔR′) has a Gaussian distribution with p.d.f.

p(ΔΘ′, ΔR′) = (1/(2π √|Σ′|)) exp[−(1/2) (ΔΘ′, ΔR′) Σ′⁻¹ (ΔΘ′, ΔR′)^t]   (5.8)
and covariance matrix

Σ′ = [ σ_θ′²  σ_θ′r′ ; σ_θ′r′  σ_r′² ],

where

σ_θ′² = (c_11² + c_13² + c_15² + c_17²) σ_θ²,
σ_r′² = (c_21² + c_23² + c_25² + c_27²) σ_θ² + (c_22² + c_24² + c_26² + c_28²) σ_r² +
        2(c_21 c_22 + c_23 c_24 + c_25 c_26 + c_27 c_28) σ_θr,
σ_θ′r′ = (c_11 c_21 + c_13 c_23 + c_15 c_25 + c_17 c_27) σ_θ² +
         (c_11 c_22 + c_13 c_24 + c_15 c_26 + c_17 c_28) σ_θr.
5.3.4 Projective-Invariant Spread Function

From section 4.3.4, the invariant for projective transformations is as follows:

θ′ = tan⁻¹(DEF/ABC),
r′ = −GCF / √((ABC)² + (DEF)²),

where

A = r_3 sin(θ_1 − θ_2) − r_2 sin(θ_1 − θ_3) + r_1 sin(θ_2 − θ_3),
B = r_4 sin(θ − θ_2) − r_2 sin(θ − θ_4) + r sin(θ_2 − θ_4),
C = r_4 sin(θ_1 − θ_3) − r_3 sin(θ_1 − θ_4) + r_1 sin(θ_3 − θ_4),
D = −r_3 sin(θ − θ_1) + r_1 sin(θ − θ_3) − r sin(θ_1 − θ_3),
E = r_4 sin(θ_1 − θ_2) − r_2 sin(θ_1 − θ_4) + r_1 sin(θ_2 − θ_4),
F = r_4 sin(θ_2 − θ_3) − r_3 sin(θ_2 − θ_4) + r_2 sin(θ_3 − θ_4),
G = r_2 sin(θ − θ_1) − r_1 sin(θ − θ_2) + r sin(θ_1 − θ_2),

and (θ, r) are the parameters of the line being encoded and (θ_i, r_i), i = 1, 2, 3, 4, are the parameters of the basis lines.

Let (ΔΘ, ΔR), (ΔΘ_i, ΔR_i), i = 1, 2, 3, 4, and (ΔΘ′, ΔR′) be stochastic variables denoting respectively the perturbations of (θ, r), (θ_i, r_i), i = 1, 2, 3, 4, and the computed invariant (θ′, r′). From section 5.2, we have

ΔΘ′ = c_11 ΔΘ + c_12 ΔR + c_13 ΔΘ_1 + c_14 ΔR_1 + c_15 ΔΘ_2 + c_16 ΔR_2 + c_17 ΔΘ_3 + c_18 ΔR_3 + c_19 ΔΘ_4 + c_{1,10} ΔR_4,
ΔR′ = c_21 ΔΘ + c_22 ΔR + c_23 ΔΘ_1 + c_24 ΔR_1 + c_25 ΔΘ_2 + c_26 ΔR_2 + c_27 ΔΘ_3 + c_28 ΔR_3 + c_29 ΔΘ_4 + c_{2,10} ΔR_4.
Computing the c_ij's, we obtain

c_11 = [(−r_4 cos(θ − θ_2) + r_2 cos(θ − θ_4)) I/B +
        (−r_3 cos(θ − θ_1) + r_1 cos(θ − θ_3)) EF]/(HK),
c_12 = [sin(θ_2 − θ_4) I/B − sin(θ_1 − θ_3) EF]/(HK),
c_13 = [−(r_4 cos(θ_1 − θ_3) − r_3 cos(θ_1 − θ_4)) I/C +
        (r_4 cos(θ_1 − θ_2) − r_2 cos(θ_1 − θ_4)) DF −
        (r_3 cos(θ_1 − θ_2) − r_2 cos(θ_1 − θ_3)) I/A +
        (r_3 cos(θ − θ_1) − r cos(θ_1 − θ_3)) EF]/(HK),
c_14 = [−sin(θ_3 − θ_4) I/C + sin(θ_2 − θ_4) DF −
        sin(θ_2 − θ_3) I/A + sin(θ − θ_3) EF]/(HK),
c_15 = [−(−r_4 cos(θ_2 − θ_3) + r_3 cos(θ_2 − θ_4)) DE +
        (−r_4 cos(θ_1 − θ_2) + r_1 cos(θ_2 − θ_4)) DF −
        (−r_4 cos(θ − θ_2) + r cos(θ_2 − θ_4)) I/B −
        (−r_3 cos(θ_1 − θ_2) + r_1 cos(θ_2 − θ_3)) I/A]/(HK),
c_16 = [sin(θ_3 − θ_4) DE − sin(θ_1 − θ_4) DF +
        sin(θ − θ_4) I/B + sin(θ_1 − θ_3) I/A]/(HK),
c_17 = [−(r_4 cos(θ_2 − θ_3) − r_2 cos(θ_3 − θ_4)) DE +
        (r_4 cos(θ_1 − θ_3) − r_1 cos(θ_3 − θ_4)) I/C −
        (r_2 cos(θ_1 − θ_3) − r_1 cos(θ_2 − θ_3)) I/A +
        (−r_1 cos(θ − θ_3) + r cos(θ_1 − θ_3)) EF]/(HK),
c_18 = [−sin(θ_2 − θ_4) DE + sin(θ_1 − θ_4) I/C −
        sin(θ_1 − θ_2) I/A − sin(θ − θ_1) EF]/(HK),
c_19 = [−(−r_3 cos(θ_2 − θ_4) + r_2 cos(θ_3 − θ_4)) DE +
        (−r_3 cos(θ_1 − θ_4) + r_1 cos(θ_3 − θ_4)) I/C +
        (r_2 cos(θ_1 − θ_4) − r_1 cos(θ_2 − θ_4)) DF −
        (r_2 cos(θ − θ_4) − r cos(θ_2 − θ_4)) I/B]/(HK),
c_{1,10} = [sin(θ_2 − θ_3) DE − sin(θ_1 − θ_3) I/C +
        sin(θ_1 − θ_2) DF − sin(θ − θ_2) I/B]/(HK),
c_21 = [(r_4 cos(θ − θ_2) − r_2 cos(θ − θ_4)) HAC +
        (−r_3 cos(θ − θ_1) + r_1 cos(θ − θ_3)) IEF] J/L³ +
        [−r_2 cos(θ − θ_1) + r_1 cos(θ − θ_2)] CF/L,
c_22 = [sin(θ_2 − θ_4) HAC − sin(θ_1 − θ_3) IEF] J/L³ −
        sin(θ_1 − θ_2) CF/L,
c_23 = [(r_4 cos(θ_1 − θ_3) − r_3 cos(θ_1 − θ_4)) HAB +
        (r_3 cos(θ_1 − θ_2) − r_2 cos(θ_1 − θ_3)) HBC +
        (r_4 cos(θ_1 − θ_2) − r_2 cos(θ_1 − θ_4)) IDF +
        (r_3 cos(θ − θ_1) − r cos(θ_1 − θ_3)) IEF] J/L³ −
        [r_4 cos(θ_1 − θ_3) − r_3 cos(θ_1 − θ_4)] GF/L +
        [r_2 cos(θ − θ_1) − r cos(θ_1 − θ_2)] CF/L,
c_24 = [sin(θ_3 − θ_4) HAB + sin(θ_2 − θ_3) HBC +
        sin(θ_2 − θ_4) IDF + sin(θ − θ_3) IEF] J/L³ −
        sin(θ_3 − θ_4) GF/L + sin(θ − θ_2) CF/L,
c_25 = [(−r_4 cos(θ − θ_2) + r cos(θ_2 − θ_4)) HAC +
        (−r_3 cos(θ_1 − θ_2) + r_1 cos(θ_2 − θ_3)) HBC +
        (r_4 cos(θ_2 − θ_3) − r_3 cos(θ_2 − θ_4)) IDE −
        (r_4 cos(θ_1 − θ_2) − r_1 cos(θ_2 − θ_4)) IDF] J/L³ −
        [r_4 cos(θ_2 − θ_3) − r_3 cos(θ_2 − θ_4)] GC/L +
        [−r_1 cos(θ − θ_2) + r cos(θ_1 − θ_2)] CF/L,
c_26 = [−sin(θ − θ_4) HAC − sin(θ_1 − θ_3) HBC +
        sin(θ_3 − θ_4) IDE − sin(θ_1 − θ_4) IDF] J/L³ −
        sin(θ_3 − θ_4) GC/L − sin(θ − θ_1) CF/L,
c_27 = [−(r_4 cos(θ_1 − θ_3) − r_1 cos(θ_3 − θ_4)) HAB +
        (r_2 cos(θ_1 − θ_3) − r_1 cos(θ_2 − θ_3)) HBC −
        (r_4 cos(θ_2 − θ_3) − r_2 cos(θ_3 − θ_4)) IDE +
        (−r_1 cos(θ − θ_3) + r cos(θ_1 − θ_3)) IEF] J/L³ −
        [−r_4 cos(θ_2 − θ_3) + r_2 cos(θ_3 − θ_4)] GC/L −
        [−r_4 cos(θ_1 − θ_3) + r_1 cos(θ_3 − θ_4)] GF/L,
c_28 = [−sin(θ_1 − θ_4) HAB + sin(θ_1 − θ_2) HBC −
        sin(θ_2 − θ_4) IDE − sin(θ − θ_1) IEF] J/L³ +
        sin(θ_2 − θ_4) GC/L + sin(θ_1 − θ_4) GF/L,
c_29 = [(r_3 cos(θ_1 − θ_4) − r_1 cos(θ_3 − θ_4)) HAB +
        (r_2 cos(θ − θ_4) − r cos(θ_2 − θ_4)) HAC +
        (r_3 cos(θ_2 − θ_4) − r_2 cos(θ_3 − θ_4)) IDE +
        (r_2 cos(θ_1 − θ_4) − r_1 cos(θ_2 − θ_4)) IDF] J/L³ −
        [r_3 cos(θ_2 − θ_4) − r_2 cos(θ_3 − θ_4)] GC/L −
        [r_3 cos(θ_1 − θ_4) − r_1 cos(θ_3 − θ_4)] GF/L,
c_{2,10} = [sin(θ_1 − θ_3) HAB + sin(θ − θ_2) HAC +
        sin(θ_2 − θ_3) IDE + sin(θ_1 − θ_2) IDF] J/L³ −
        sin(θ_2 − θ_3) GC/L − sin(θ_1 − θ_3) GF/L,

where

H = ABC, I = DEF, J = GCF, K = 1 + I²/H², L = √(H² + I²).
Thus we conclude that (ΔΘ′, ΔR′) has a Gaussian distribution with p.d.f.

p(ΔΘ′, ΔR′) = (1/(2π √|Σ′|)) exp[−(1/2) (ΔΘ′, ΔR′) Σ′⁻¹ (ΔΘ′, ΔR′)^t]

and covariance matrix

Σ′ = [ σ_θ′²  σ_θ′r′ ; σ_θ′r′  σ_r′² ],
where

σ_θ′² = (c_11² + c_13² + ··· + c_19²) σ_θ² + (c_12² + c_14² + ··· + c_{1,10}²) σ_r² +
        2(c_11 c_12 + c_13 c_14 + ··· + c_19 c_{1,10}) σ_θr,
σ_r′² = (c_21² + c_23² + ··· + c_29²) σ_θ² + (c_22² + c_24² + ··· + c_{2,10}²) σ_r² +
        2(c_21 c_22 + c_23 c_24 + ··· + c_29 c_{2,10}) σ_θr,
σ_θ′r′ = (c_11 c_21 + c_13 c_23 + ··· + c_19 c_29) σ_θ² + (c_12 c_22 + c_14 c_24 + ··· + c_{1,10} c_{2,10}) σ_r² +
        (c_11 c_22 + c_13 c_24 + ··· + c_19 c_{2,10} + c_21 c_12 + c_23 c_14 + ··· + c_29 c_{1,10}) σ_θr.
Chapter 6<br />
Bayesian <strong>Line</strong> Invariant Matching<br />
Our recognition scheme builds upon the geometric hashing method reviewed in chapter 2, which also notes some weaknesses of the geometric hashing technique. Even when we use an improved Hough transform to detect line features in a degraded image, we still face the problem of dealing with image noise, and with the resulting noise in the invariants used for hashing.

To minimize these noise effects, we need a theoretically justified scheme for weighted voting among hash bins. We show how such a weighted voting scheme can be used to cope with the problem. This is the aim of the probabilistic procedure which we now go on to describe.
6.1 A Measure of Matching between Scene and Model Invariants<br />
In the geometric hashing method described in chapter 2, a computed scene invariant is used during recognition to index the geometric hash table and tally one vote for the entry retrieved. In the presence of noise, however, a scene invariant cannot be expected to hit exactly the hash table entry where its matching model invariant is supposed to reside. Gavrila and Groen [18] assume a bounded circular region around each hash bin and tally an equal vote for all the bins within the region centered around the bin being hashed. However, our analysis of the noise effect on computed invariants in the preceding chapter shows that the perturbation of an invariant has an approximately Gaussian distribution with a
CHAPTER 6. BAYESIAN LINE INVARIANT MATCHING 75<br />
non-zero correlation coefficient. This implies that the region of hash space that needs to be accessed is an ellipse, not a circle, centered around the "true" value of the invariant. Moreover, this perturbation also depends on the basis and on the line being encoded, which means that different invariants should be associated with hash-space regions of different size, shape and orientation for error tolerance.
The nodes in hash space that more "closely match" the value of the invariant should get a higher weighted score than those which are far away, in a manner accurately representing the behavior of the noise which affects these invariants. Let inv_s be a scene invariant and let inv_Mi be a model invariant computed from a basis B_Mi and a line l_j of M_i. Bayes' theorem [16] allows us to precisely determine the probability that inv_s results from the presence of a certain [M_i, B_Mi, l_j]:

P([M_i, B_Mi, l_j] | inv_s) = P([M_i, B_Mi, l_j]) p(inv_s | [M_i, B_Mi, l_j]) / p(inv_s),

where

• P([M_i, B_Mi, l_j]) is the a priori probability that M_i is embedded in the scene and B_Mi is selected to encode l_j. We assume that all the models are equally likely to appear and have the same number of features, a simplification that could easily be generalized. Thus P([M_i, B_Mi, l_j]) is the same for all [model, basis, line] combinations. (Note that this may not be quite true if the basis selection procedure favors long bases instead of selecting bases randomly.)

• p(inv_s | [M_i, B_Mi, l_j]) is the conditional p.d.f. of observing inv_s given that M_i appears in the scene with B_Mi being selected and l_j being encoded. It consists of two parts: the spike induced by [M_i, B_Mi, l_j] and a background distribution.

• p(inv_s) is the p.d.f. of observing inv_s, regardless of which [M_i, B_Mi, l_j] induces it.

Assume that no two invariants of the same model with a fixed basis lie near each other in hash space (i.e., model features are distinct and separate; if more than one model feature lands in approximately the same location, the features should be coalesced into a single feature). Then inv_s can be close to at most one model invariant. Thus, given inv_s, the supporting evidence that M_i appears in the scene and the basis B_Mi is selected is

P([M_i, B_Mi] | inv_s) ≈ max_j P([M_i, B_Mi, l_j] | inv_s)
                       ∝ max_j p(inv_s | [M_i, B_Mi, l_j]) / p(inv_s).   (6.1)
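The matching measure of Eq. (6.1) can be sketched as follows: the support a scene invariant lends to a stored [model, basis, line] node is the Gaussian density of the scene invariant under that node's spread function (mean = stored model invariant, covariance = stored Σ′). The entry values and covariance below are illustrative assumptions.

```python
import numpy as np

def entry_support(inv_s, inv_m, cov_m):
    """Gaussian density of scene invariant inv_s around model invariant
    inv_m with spread covariance cov_m (the spike of p(inv_s | node))."""
    d = np.asarray(inv_s) - np.asarray(inv_m)
    inv_cov = np.linalg.inv(cov_m)
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov_m)))
    return norm * np.exp(-0.5 * d @ inv_cov @ d)

cov = np.array([[8e-4, -0.02], [-0.02, 2.9]])   # an assumed Sigma'
near = entry_support([1.51, 30.5], [1.50, 30.0], cov)
far = entry_support([1.80, 45.0], [1.50, 30.0], cov)
print(near, far)
```

A scene invariant close to the stored one (in the Mahalanobis sense defined by Σ′) receives far more support than a distant one, which is exactly the elliptical, orientation-dependent weighting argued for above.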
6.2 Evidence Synthesis by Bayesian Reasoning

In the recognition stage, a certain scene basis B_s is probed and the invariants of all the remaining features with respect to this basis are computed. Each invariant provides various degrees of support for the existence of [M_i, B_Mi] combinations, assuming B_s corresponds to B_Mi. If for every [M_i, B_Mi] combination we synthesize the support from all the invariants and hypothesize the one with maximum support, we in fact exercise a maximum likelihood approach to object recognition.
In order to synthesize support coming from different invariants, we have the following formula, based on a probability derivation using Bayes' theorem [12]:

P([M_i, B_Mi] | inv_s1, ..., inv_sn) / P([M_i, B_Mi])
  = p(inv_s1, ..., inv_sn | [M_i, B_Mi]) / p(inv_s1, ..., inv_sn)
  = p(inv_s1 | [M_i, B_Mi]) ··· p(inv_sn | [M_i, B_Mi]) / p(inv_s1, ..., inv_sn)   (assuming conditional independence)
  = (1 / p(inv_s1, ..., inv_sn)) Π_{k=1}^{n} [ P([M_i, B_Mi] | inv_sk) p(inv_sk) / P([M_i, B_Mi]) ]
  = (p(inv_s1) ··· p(inv_sn) / p(inv_s1, ..., inv_sn)) Π_{k=1}^{n} [ P([M_i, B_Mi] | inv_sk) / P([M_i, B_Mi]) ]
  ∝ (p(inv_s1) ··· p(inv_sn) / p(inv_s1, ..., inv_sn)) Π_{k=1}^{n} [ max_j p(inv_sk | [M_i, B_Mi, l_j]) / (p(inv_sk) P([M_i, B_Mi])) ]   (by Eq. (6.1))
  = (1 / p(inv_s1, ..., inv_sn)) Π_{k=1}^{n} [ max_j p(inv_sk | [M_i, B_Mi, l_j]) / P([M_i, B_Mi]) ].

Thus

P([M_i, B_Mi] | inv_s1, ..., inv_sn) ∝ Π_{k=1}^{n} max_j p(inv_sk | [M_i, B_Mi, l_j]),   (6.2)

since p(inv_s1, ..., inv_sn) is independent of all the hypotheses and all the P([M_i, B_Mi])'s are identical.
In the above derivation, we have assumed conditional independence among the invariants, i.e.,

p(inv_s1, ..., inv_sn | [M_i, B_Mi]) = Π_{k=1}^{n} p(inv_sk | [M_i, B_Mi]),

which means that the expected probability distribution of a scene invariant inv_sk is independent of the other scene invariants. The intuition for this conditional independence is as follows: if the given inv_sk does not belong to the embedded model M_i, it has nothing to do with [M_i, B_Mi] or with the other scene invariants; if the given inv_sk does belong to the embedded model M_i, then, since geometric hashing applies transformation-invariant information, inv_sk is destined to appear, given the hypothesis that B_Mi is selected. Thus inv_sk is not affected by other invariants. A more formal argument is given in [28].

In computing P([M_i, B_Mi] | inv_s1, ..., inv_sn), we do not need to compute absolute probabilities but only relative probabilities. Since the logarithm function is monotonic, we can apply the logarithm to both sides of Eq. (6.2):

log P([M_i, B_Mi] | inv_s1, ..., inv_sn) = log C + Σ_{k=1}^{n} log max_j p(inv_sk | [M_i, B_Mi, l_j]),   (6.3)

turning products into sums.
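The accumulation of Eq. (6.3) can be sketched as a tally over [model, basis] hypotheses: for each scene invariant, keep only the best-matching line l_j per hypothesis and add the logarithm of its density to the hypothesis score. The density values below are made up for illustration.

```python
import math
from collections import defaultdict

def tally(votes):
    """Accumulate log-evidence per hypothesis (Eq. (6.3), log C dropped).

    votes: list of (hypothesis, {line_j: density}) pairs, one per scene
    invariant, where density approximates p(inv_sk | [M_i, B_Mi, l_j]).
    """
    score = defaultdict(float)
    for hyp, densities in votes:
        best = max(densities.values())   # max over candidate lines l_j
        score[hyp] += math.log(best)     # products become sums
    return dict(score)

votes = [(("M1", "B3"), {"l1": 3.0, "l2": 0.2}),
         (("M1", "B3"), {"l4": 2.5}),
         (("M2", "B7"), {"l1": 0.1})]
print(tally(votes))
```

The hypothesis with the largest accumulated sum is the maximum-likelihood candidate to be verified.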
6.3 The Algorithm

The pre-processing stage parallels our description of the preprocessing stage of the geometric hashing method reviewed in section 2.2. For the specific transformation group considered, we use the formulae given in section 4.3 to compute the invariants. We also compute the spread functions of these invariants using Eq. (5.7). Both pieces of information, together with the identity of [model, basis, line], are stored in a node in the hash entry indexed by each invariant. Since the spread function is Gaussian, we store its covariance matrix as a representation of the function. We note that each node records the information about a particular [model, basis, line] combination needed in our recognition stage. The hash table is arranged so that the position of each entry represents a quantized coordinate (θ, r) of hash space. In this way, range queries can easily be performed.
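The table layout just described can be sketched as follows: bins are indexed by a quantized (θ, r) pair, each bin holds its [model, basis, line] nodes, and a range query is a loop over neighboring bins. The bin sizes and the radius are assumed values for illustration.

```python
from collections import defaultdict

THETA_BIN, R_BIN = 0.05, 2.0   # assumed quantization steps for (theta, r)

table = defaultdict(list)

def bin_of(theta, r):
    """Quantized hash-space coordinate of an invariant (theta, r)."""
    return (int(theta / THETA_BIN), int(r / R_BIN))

def store(theta, r, node):
    """Pre-processing: file a [model, basis, line] node under its bin."""
    table[bin_of(theta, r)].append(node)

def range_query(theta, r, radius=1):
    """Recognition: all nodes within `radius` bins of the hashed invariant."""
    b0, b1 = bin_of(theta, r)
    hits = []
    for i in range(b0 - radius, b0 + radius + 1):
        for j in range(b1 - radius, b1 + radius + 1):
            hits.extend(table[(i, j)])
    return hits

store(1.50, 30.0, ("M1", "B3", "l2"))
store(2.90, 80.0, ("M2", "B7", "l5"))
print(range_query(1.52, 31.0))
```

In the full scheme each stored node would also carry its covariance matrix, so that the nodes returned by the range query can be scored with the Gaussian weighting of section 6.1.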
In the recognition stage, lines are detected by the Hough transform and a subset of these lines is chosen as a basis for associating invariants with all the remaining lines. Eq. (6.3)
CHAPTER 6. BAYESIAN LINE INVARIANT MATCHING 78
is used to accumulate support from all these invariants (the log C term, which is common to all the hypotheses, can be ignored). We note that a straightforward use of Eq. (6.3) runs through all the [model, basis, line] combinations, and thus the computational cost can be immense. By exploiting the geometric hash table prepared in the pre-processing stage, for each invariant inv_sk hashed into the hash space we access only its nearby nodes. For each node accessed, we credit log p(inv_sk | [M_i, B_Mi, l_j]) to the node, whose initial score value is set to 0. In computing log p(inv_sk | [M_i, B_Mi, l_j]), the background distribution function is usually negligible compared with the spike induced by [M_i, B_Mi, l_j]. We therefore use the logarithm of Eq. (5.7) to approximate log p(inv_sk | [M_i, B_Mi, l_j]). Typically this Gaussian p.d.f. falls off rapidly. We thus credit approximately the same weighted score, log c, to all those nodes not in the neighborhood of the hash. Equivalently, we may instead credit log p(inv_sk | [M_i, B_Mi, l_j]) − log c to the nearby nodes accessed and keep intact those not in the neighborhood. After all the invariants are processed, the support for all the [M_i, B_Mi] combinations is histogrammed. The top few [M_i, B_Mi]'s with sufficiently high votes are verified by transforming and superimposing the model onto the scene. A quality measure, QM, defined as the ratio of visible boundary, is computed. If the QM is above a preset threshold (e.g. 0.5), the hypothesis passes the verification. (The verification can proceed hierarchically by first verifying the existence of the basis, rejecting immediately if the basis fails verification.)
In the above process, each probe of a scene basis is hypothesized as an instance of a particular basis of a certain model. It is possible that the scene basis probed does not correspond to any basis of any model. In the following section, we give a probabilistic analysis of the number of probes needed to detect all the instances in the scene.
6.4 An Analysis of the Number of Probings Needed
Let there be n lines in the scene, n = n_1 + n_2, where the n_2 lines are spurious. Let there also be M model instances in the scene, and assume that the degree of occlusion of each instance is the same. Thus, on average, each model appearing in the scene has n_1/M lines.
The probability of choosing a scene basis, consisting of k features, which happens to correspond to a basis of a model is

    p = Π_{i=0}^{k−1} (n_1/M − i) / (n − i).
If we are to detect all M model instances at once, in the best case we have to probe only M scene bases from the scene. This happens when each basis probed corresponds to a basis of a certain model and the next basis probed corresponds to a basis of another model. However, the probability of this happening is p^M, which is obviously small. The probability of detecting exactly i different models after t trials (i.e., among the t trials, t = i′ + i″, where i′ ≥ i probes cover bases of the i models and i″ probes correspond to no basis of any model) is
    p_i = (t choose i) · i! · p^i = t(t−1)···(t−i+1) p^i,   i = 0, ..., M.
We may derive a lower bound on t by requiring p_i ≥ δ, 0 ≤ δ ≤ 1:

    t(t−1)···(t−i+1) ≥ δ/p^i.

Thus,

    t^i > δ/p^i,   or   t > δ^{1/i}/p.
The probability that at least j different models are detected is then

    q_j = 1 − Σ_{i=0}^{j−1} p_i.
In practice, we have to try more probes than theoretically predicted, because some probes, though corresponding to model bases, are bad bases (e.g., too small, too big or too much perturbed) and do not result in a stable transformation that transforms the model to fit its scene instance properly and pass verification.
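Under the assumptions above, the probe-count estimates can be computed directly. This is a sketch: the function names are hypothetical, and the lower bound on t is found by simply scanning upward from t = i until the falling-factorial condition holds.

```python
from math import prod

def basis_probe_probability(n, n1, M, k):
    """p: probability that a random k-line scene basis corresponds to a model
    basis, given n scene lines of which n1 are non-spurious and M instances
    (the estimate of section 6.4)."""
    return prod((n1 / M - i) / (n - i) for i in range(k))

def probes_needed(p, i, delta):
    """Smallest t with t(t-1)...(t-i+1) * p**i >= delta: a lower bound on the
    number of probes for detecting i instances at the stated confidence."""
    t = i
    while prod(t - j for j in range(i)) * p ** i < delta:
        t += 1
    return t
```

For example, with 60 scene lines, 40 of them genuine, 5 instances and triplet bases, p is well below one percent, which is why thousands of probes are tried in the experiments of chapter 7.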
Chapter 7

The Experiments
In this chapter, we test our approach using both synthesized and real images. We choose to implement the case of affine transformations. The group of affine transformations is a supergroup of both the rigid and similarity transformations. In addition, affine transformations are often appropriate approximations to perspective transformations under suitable viewing situations (see p. 79 of [35]) and thus can be used in recognition algorithms as substitutes for more general perspective transformations. It should be noted that a similar approach can be applied to the cases of the other transformations we discussed in chapter 4. Implementation details, with experimental results, are presented in the following.
7.1 Implementation of Affine Invariant Matching
To apply the procedure described in section 6.3 to the case of the affine group, a triplet of lines is necessary to form a basis. Eq. (4.21) and Eq. (4.22) in section 4.3.3 are used to compute the invariant, and the spread function of the invariant is given by Eq. (5.8) in section 5.3.3. Our formulae involve many trigonometric evaluations. To accelerate the process, a fine-grained look-up table of trigonometric functions is pre-computed.

For numerical stability, we should avoid using very small or very large bases for invariant computation. In our implementation, we use 256 × 256 images. If any side of the basis is less than 20 pixels, or all sides of the basis are greater than 500 pixels, we consider the basis infeasible.
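The feasibility test can be written directly from the thresholds above (a sketch; the function name and argument format are assumptions):

```python
def basis_is_feasible(side_lengths, min_side=20, max_side=500):
    """Reject numerically unstable bases: infeasible if any side is shorter
    than `min_side` pixels or every side exceeds `max_side` pixels
    (thresholds quoted for the 256 x 256 images)."""
    if any(s < min_side for s in side_lengths):
        return False
    if all(s > max_side for s in side_lengths):
        return False
    return True
```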
The hash table is implemented as a 2-D array to represent the 2-D hash space. Since
CHAPTER 7. THE EXPERIMENTS 81
during recognition we have to consider a neighborhood of a hash entry when it is retrieved, the boundary of the hash table has to be treated specially: the θ-axis of the 2-D hash table should be viewed as wrapped around with a flip, since the invariant (θ, r) is mathematically equivalent to (θ − π, −r) (e.g., (179°, 2) is equivalent to (−1°, −2), and thus neighbors, say, (0°, −2)).
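The wrap-with-flip identification can be sketched as a normalization applied before quantizing (names are assumptions):

```python
import math

def canonical_line(theta, r):
    """Map (theta, r) into [0, pi) x R using the identification
    (theta, r) ~ (theta - pi, -r), so that e.g. (179 deg, 2) and
    (-1 deg, -2) land on the same representative."""
    theta = math.fmod(theta, 2 * math.pi)
    if theta < 0:
        theta += 2 * math.pi
    if theta >= math.pi:
        theta -= math.pi
        r = -r            # flipping the normal flips the signed distance
    return theta, r
```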
During evidence synthesis, care should be taken to prevent multiple accumulation of weighted votes: for each probing of scene bases, each hash node [M_i, B_Mi, l_j] should receive a vote only once, since model feature l_j cannot match two scene features simultaneously. We thus have to keep track of the score each hash node receives and take only the maximum weighted score the node has received.
We also have to reject reflexion cases. Reflexion transformations form a subgroup of the affine transformations. However, a 2-D object, say a triangle with vertices 1, 2, 3 in clockwise sequence, preserves this sequence even when its shape is seriously skewed by viewing from a tilted camera. This problem is tackled by detecting the orientation (clockwise or counterclockwise) of a basis by its cross product. Thus no such false match between a scene basis and a model basis will be hypothesized.
Since we are dealing with highly occluded scenes, we have found that even if a hypothesis has passed verification (by verifying its boundary), it can still be a false alarm. By the viewpoint consistency principle [41], the locations of all object features in an image should be consistent with the projection from a single viewpoint. We may show that all 2-D objects lying on the same plane have an identical enlarging (or shrinking) ratio of area if viewed from the same camera, assuming the approximation of affine transformations to perspective transformations. The quantity of this ratio is |det(A)|, where A is the skewing matrix of an affine transformation T = (A, b). We call it the skewing factor and require that all the hypothesized instances have the same skewing factor. (Appendix-A provides the formula for computing T from a correspondence of a scene basis and a model basis.)
For each probe, after histogramming of the weighted votes, the top few hypotheses are verified. If the verification is successful, the skewing factor of the alleged instance is computed and used to vote for a common skewing factor.
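Voting for a common skewing factor might be sketched as follows; the binning resolution `tol` and the input format are assumptions, not values from the experiments.

```python
from collections import Counter

def global_skewing_factor(affine_matrices, tol=0.05):
    """Vote for a common |det(A)| among verified hypotheses (sketch).
    Each matrix is ((a11, a12), (a21, a22)); factors are binned to a
    resolution of `tol` before voting."""
    votes = Counter()
    for (a11, a12), (a21, a22) in affine_matrices:
        factor = abs(a11 * a22 - a12 * a21)   # |det(A)|
        votes[round(factor / tol)] += 1
    bin_index, _ = votes.most_common(1)[0]
    return bin_index * tol
```

Hypotheses whose skewing factor falls outside the winning bin would then be discarded as viewpoint-inconsistent.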
After a sufficient number of bases are probed, which is determined by a statistical estimation procedure (see section 6.4), a global skewing factor, which is taken to be the
|det(A)| with the highest number of votes, is obtained. Those hypothesized model instances voting for the global skewing factor are passed on to a disambiguation process, which we go on to describe below.
It is possible that a scene line is shared by many model instances, some of which are spurious. We can refine the result by assigning each shared line to the model instance most strongly supported by the evidence. One possible technique for disambiguation is relaxation [54]; however, relaxation is a complicated process. Since each model instance is associated with its quality measure, QM, as a measure of the strength of the evidence supporting its existence (this definition of QM favors long segments of a model instance and is similar to that used in [4], with the same justification), the following straightforward technique proves to work well.
We maintain two data structures, a QM-buffer and a frame-buffer, of the same size as the image. After initializing the QM-buffer and the frame-buffer, we process each candidate model instance in an arbitrary order by superimposing the candidate model instance onto the QM-buffer and then tracing the boundary of the model instance. For each position traced, if the QM of the current model instance is greater than the value stored in the QM-buffer, we write the id of this candidate model instance to the corresponding position of the frame-buffer and replace that value in the QM-buffer by the current QM.

After all candidate model instances have been processed, we recompute the QM of each candidate on the frame-buffer by considering only positions with the same id as this candidate. Those with new QM's less than a preset threshold (e.g. 0.5) are removed from the list of candidate model instances.
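The QM-buffer/frame-buffer procedure can be sketched as follows. The exact rule for recomputing QM from the owned positions is an assumption: here the original QM is scaled by the fraction of boundary positions the candidate retains.

```python
def disambiguate(candidates, shape, keep_threshold=0.5):
    """QM-buffer / frame-buffer disambiguation (sketch).

    Each candidate is (id, qm, boundary_pixels), where boundary_pixels is a
    list of (row, col) positions of its traced boundary. Contested pixels go
    to the candidate with the higher QM; candidates keeping too little of
    their boundary are dropped."""
    qm_buf = [[0.0] * shape[1] for _ in range(shape[0])]
    frame_buf = [[None] * shape[1] for _ in range(shape[0])]
    for cid, qm, pixels in candidates:            # first pass: claim pixels
        for (y, x) in pixels:
            if qm > qm_buf[y][x]:
                qm_buf[y][x] = qm
                frame_buf[y][x] = cid
    survivors = []
    for cid, qm, pixels in candidates:            # second pass: recompute QM
        owned = sum(1 for (y, x) in pixels if frame_buf[y][x] == cid)
        if qm * owned / len(pixels) >= keep_threshold:
            survivors.append(cid)
    return survivors
```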
7.2 Best Least-Squares Match
The transformation between a model and its scene instance can be recovered from the correspondence of the model basis and the scene basis alone. However, scene lines detected by the Hough transform are usually somewhat distorted due to noise. This results in distortion in the computed transformation. Usually this distorted transformation transforms a model to match its scene instance with the basis lines matching each other perfectly while the other lines deviate more or less from their correspondences. Knowledge of additional line correspondences between a model and its scene instance can be used to improve the accuracy of the computed transformation. In the following, we discuss several methods to minimize the errors.
Method 1
Treat each line as a point with coordinates (θ, r) in (θ, r)-space and minimize the squared distance between (θ, r) and its correspondence (θ′, r′).
Discussion
The problem with this method is that θ and r are of different metrics. In minimizing the squared distance between a point (θ, r) and its correspondence (θ′, r′), we implicitly assume equal weight on both θ and r.
Method 2
To circumvent the problem of Method 1, we note that a line (θ, r) can be uniquely represented by the point (r cos θ, r sin θ), which is the projection of the origin onto the line. To match a line (θ, r) to its correspondence (θ′, r′), we try to minimize the squared distance between (r cos θ, r sin θ) and (r′ cos θ′, r′ sin θ′).
Discussion
The drawback of this method is its dependency on the origin. The nearer a line is to the origin, the more weight is placed on r (think of the special case when both lines pass through the origin of the image).
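The two distance measures, and the origin-dependence of Method 2, can be illustrated directly (function names are assumptions):

```python
import math

def method1_sqdist(l1, l2):
    """Method 1 (sketch): squared distance between lines viewed as points in
    (theta, r)-space; note theta and r are of different metrics."""
    return (l1[0] - l2[0]) ** 2 + (l1[1] - l2[1]) ** 2

def method2_sqdist(l1, l2):
    """Method 2 (sketch): squared distance between the projections of the
    origin onto the two lines, (r cos theta, r sin theta)."""
    p1 = (l1[1] * math.cos(l1[0]), l1[1] * math.sin(l1[0]))
    p2 = (l2[1] * math.cos(l2[0]), l2[1] * math.sin(l2[0]))
    return (p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2
```

Two distinct lines through the origin have Method-2 distance zero, which is exactly the degenerate case noted above.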
Method 3 [1]
The models in a model base are usually finite in the sense that, though they are modeled by lines, they in fact consist of segments. We can minimize the squared distances from the endpoints of the transformed model line segments to their corresponding scene lines in image space.
We derive in the following the closed-form formula for the case of affine transformations. A similar technique can be applied to the cases of other transformation groups.
[1] Suggested by Professor Jaiwei Hong.
Specifically, assuming that we are looking for an affine match between n scene lines l_j and the endpoints of n segments, u_j1 and u_j2, j = 1, ..., n, we would like to find the affine transformation T = (A, b) such that the sum of the squared distances from T(u_j1) to l_j and from T(u_j2) to l_j, j = 1, ..., n, is minimized:

    E = min_T Σ_{j=1}^{n} (distance of T(u_j1) and l_j)² + (distance of T(u_j2) and l_j)².
Let line l_j have parameters (θ_j, r_j) and endpoints u_ji = (x_ji, y_ji), j = 1, ..., n and i = 1, 2. Also let T = (A, b) with

    A = [ a_11  a_12 ]   and   b = [ b_1 ]
        [ a_21  a_22 ]             [ b_2 ].

Then

    T(u_ji) = (a_11 x_ji + a_12 y_ji + b_1, a_21 x_ji + a_22 y_ji + b_2)^t

and

    E = min_T Σ_{j=1}^{n} ((cos θ_j, sin θ_j) T(u_j1) − r_j)² + ((cos θ_j, sin θ_j) T(u_j2) − r_j)².
To minimize E, we have to solve the following system of equations:

    ∂E/∂a_11 = 0,  ∂E/∂a_12 = 0,  ∂E/∂a_21 = 0,  ∂E/∂a_22 = 0,  ∂E/∂b_1 = 0,  ∂E/∂b_2 = 0.
Since E is a quadratic function in each of its unknowns, the above is a system of linear equations in six unknowns. We can rewrite it in matrix form as follows (see Appendix-B for the expressions of m_ij and n_i, i, j = 1, ..., 6):

    [ m_11 m_12 m_13 m_14 m_15 m_16 ]   [ a_11 ]   [ n_1 ]
    [ m_21 m_22 m_23 m_24 m_25 m_26 ]   [ a_12 ]   [ n_2 ]
    [ m_31 m_32 m_33 m_34 m_35 m_36 ]   [ a_21 ]   [ n_3 ]
    [ m_41 m_42 m_43 m_44 m_45 m_46 ] · [ a_22 ] = [ n_4 ]
    [ m_51 m_52 m_53 m_54 m_55 m_56 ]   [ b_1  ]   [ n_5 ]
    [ m_61 m_62 m_63 m_64 m_65 m_66 ]   [ b_2  ]   [ n_6 ]

The six unknowns can be solved for by Cramer's rule (see p. 89 of [39]) as follows:

    a_11 = det(M_1)/det(M),  a_12 = det(M_2)/det(M),
    a_21 = det(M_3)/det(M),  a_22 = det(M_4)/det(M),
    b_1  = det(M_5)/det(M),  b_2  = det(M_6)/det(M),

where M = [m_ij]_{i,j=1,...,6} and M_k is M with its k-th column substituted by [n_i]^t_{i=1,...,6}, for k = 1, ..., 6.
Discussion
We could also use a weighted sum of the squared distances; more specifically, the two squared distances provided by the endpoints of long segments would get more weight than those of short segments.
7.3 Experimental Results and Observations
We have done a series of experiments using synthesized images containing polygonal objects, modeled by line features, from the model base of twenty models shown in Figure 3.3. The way the synthesized images are generated has been explained in section 3.3.
Figure 7.1, Figure 7.2 and Figure 7.3 show three examples of the experiments on synthesized images. The perturbation of the lines detected by the Hough transform is obtained from statistics of extensive simulations on images with different levels of degradation. We also did experiments on real images containing objects from the same model base, with flakes scattered on the objects. Two examples (Figure 7.5 and Figure 7.6) are shown with the best least-squares match using Method-3 discussed above.
Figure 7.1(a) shows a composite overlapping scene of model-0, 3 and 4 (note: model-0 appears twice), which are significantly skewed. Figure 7.1(b) is the result of Hough analysis applied to this scene; 50 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines was

    Σ = [ 0.003384   −0.00624 ]
        [ −0.00624    2.5198  ].

Figure 7.1(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.1(d) shows the final result after disambiguation.
Figure 7.2(a) shows a composite overlapping scene of model-0, 1, 2, 4 and 9. Figure 7.2(b) is the result of Hough analysis applied to this scene; 60 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines is the same as in Figure 7.1. Figure 7.2(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.2(d) shows the final result after disambiguation.
Figure 7.3(a) shows a composite overlapping scene of model-1, 3, 4, 13 and 19. Figure 7.3(b) is the result of Hough analysis applied to this scene; 60 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines is the same as in Figure 7.1. Figure 7.3(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.3(d) shows the final result after disambiguation. (Note that model-19 was not detected, since half of its boundary is invisible.)
Figure 7.4(a) shows a composite overlapping scene of model-3, 12 and 15. Figure 7.4(b) is the result of Hough analysis applied to this scene; 60 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines is the same as in Figure 7.1. Figure 7.4(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.4(d) shows the final result after disambiguation.
Figure 7.5(a) shows a real image of model-0, 2, 3 and 4 with flakes scattered. Figure 7.5(b) is the result of edge detection using the Boie-Cox edge detector [9]. Figure 7.5(c) shows the result of Hough analysis applied to this scene; 50 lines are detected. Figure 7.5(d) shows the recognition result before disambiguation. Figure 7.5(e) shows the recognition result after disambiguation. Figure 7.5(f) shows the result after the best least-squares match.
Figure 7.6(a) shows a real image of model-1, 2 and 3 (note: model-3 appears twice) with flakes scattered. Figure 7.6(b) is the result of edge detection, also using the Boie-Cox edge detector. Figure 7.6(c) shows the result of Hough analysis applied to this scene; 50 lines are detected. Figure 7.6(d) shows the recognition result before disambiguation. Figure 7.6(e) shows the recognition result after disambiguation. Figure 7.6(f) shows the result after the best least-squares match.
In our first attempt at the experiments, we let the candidate model instances vote for the global skewing factor without verifying them first. A wrong global skewing factor was usually obtained, for two reasons: (1) the image is so noisy that too many lines are detected, and irrelevant lines are often chosen as a basis; (2) an affine transformation is so "powerful" that it is not difficult to find an affine transformation that maps a quadruplet of lines to another quadruplet of lines if some error tolerance is allowed. Our algorithm was modified to let only the verified candidate model instances vote for the global skewing factor. The trade-off is that verification takes greater computational time.
Figure 7.1: Experimental Example 1
Figure 7.2: Experimental Example 2
Figure 7.3: Experimental Example 3
Figure 7.4: Experimental Example 4
Figure 7.5: Experimental Example 5
Figure 7.6: Experimental Example 6

We also found from the experiments that when probing, if the probed triplet of lines happens to correspond to a basis of a certain model (i.e., a correct match), then this [model, basis] pair usually accumulates the highest weighted score among all others.
If it does not rank the highest, the cause is usually insufficient lines detected for that particular model instance; even then, its weighted vote is still among the highest few. The high-voted false alarms (i.e., matches of this scene basis to a wrong [model, basis]) are usually rejected by verification. We conclude that this weighted voting scheme is quite effective.
The method described above does not assume any grouping of features, which would be expected to greatly expedite the recognition stage (at least for scene basis selection). Lowe [40] first explicitly discussed the importance of grouping to recognition. However, if reliable segmentation is not available, intelligent grouping seems difficult. We note that in probing, for a selected basis that does not correspond to any of the model bases, it may happen that false alarms pass even the verification because of high noise in the scene. However, we point out that statistically such false alarms disperse their votes over different skewing factors, while correct matches accumulate their votes for the global skewing factor.
Chapter 8

Conclusions

8.1 Discussion and Summary
Geometric hashing is often compared to the generalized Hough transform. The main difference between them lies in the kind of "evidence" accumulated to generate hypotheses. While the generalized Hough transform accumulates "pose" evidence in a continuous, infinite parameter space, geometric hashing accumulates "feature correspondence" evidence within the discrete, finite feature space of (model identifier, basis set)'s. However, the quantization problem exists not only in the generalized Hough transform technique: geometric hashing implicitly transfers this problem to the preprocessing stage when making decisions about the quantization of the hash space of invariants.
Grimson and Huttenlocher [21] analyze the performance of geometric hashing affine-invariant matching in the presence of noise on point features and give pessimistic predictions. They assert that it is impossible to construct a sparsely populated hash table, due to the perturbation of the invariants computed from point features with uncertainty. Rigoutsos and Hummel [45] incorporate additive Gaussian noise into the model and analytically determine its effect on the computed invariants for the case where models are allowed to undergo similarity transformations. They have shown promising results.
However, there has not heretofore been an exploration of line features and their performance in seriously degraded intensity images. We have shown how the geometric hashing technique can be applied to line features by exploring line invariants under various geometric transformations, including the rigid, similarity, affine and projective cases.
CHAPTER 8. CONCLUSIONS 96
We have also done an analysis of noise sensitivity. Formulae that describe the spread of the invariants computed from the lines giving rise to them are derived for the above geometric transformations. The basic underlying assumption here is that the perturbations of line features can be modeled by a Gaussian process. Since more image points are involved in line features, lines can be extracted by our improved Hough transform with good accuracy (small perturbation) even in seriously degraded images. This is the first noise analysis of its kind for line features in geometric hashing.

The knowledge of the noise sensitivity of the invariants allows us to give a weighted voting scheme for geometric hashing using line features. We have applied our technique to the case of affine transformations, which cover both the rigid and similarity transformations and are often suitable substitutes for more general perspective transformations. The experiments show that the technique is noise resistant and suitable in an environment containing many occlusions.
8.2 Future Directions
We end this dissertation with a brief discussion of possible directions for future research.

It is obvious that the recognition stage can be easily parallelized. We may assign each basis to a processing element in a parallel computer, since each basis can be processed almost independently, or assign each hash node a processing element. Rigoutsos [43] discusses a parallel implementation of geometric hashing with point features on the Connection Machine [26], which is equipped with 16K-64K processors (for model CM-2). When the number of processors simultaneously needed exceeds the maximum number of physical processors in the machine, the machine can operate in a virtual processor mode. In this mode, a single processor emulates the work of several virtual processors by serializing operations in time and partitioning the local memory associated with that processor.

With the advance of hardware techniques and the decrease in cost, parallel realization of image processing algorithms has become more and more popular.
Another topic to be explored is the intelligent fusion of multiple feature types. Since we assume no reliable segmentation is available, it seems that only line features can be extracted and used as our primitive feature. However, if the image can be well segmented and geometric features like points, segments, circles, ovals and so on can be reliably extracted, our technique can be helpful in designing a suitable weighting function or in performing hierarchical filtering by different feature types. This may require evaluation criteria for measuring the relative merits of different feature types, and can be application-dependent.
Recognition of non-flat 3-D objects from 2-D images can also be a direction for extension. A combined use of the transformation parameter space of the Hough transform and the invariant hash space of geometric hashing may possibly lead to a recognition system which can recognize parameterized objects.

Finally, the technique of "reasoning by parts" and the geometric hashing technique can benefit from each other. A complicated object usually cannot be represented by only one primitive feature type. It may be more suitable to represent different parts of an object by different primitive feature types and to recognize each part by the geometric hashing technique using different feature types.
Appendix A
Given a correspondence between a pair of triplets of lines in general position, the affine transformation T = (A, b), where A is a 2 × 2 non-singular skewing matrix and b is a 2 × 1 translation vector, can be uniquely determined such that T maps points of the first triplet of lines to their corresponding points of the second triplet of lines.

Let the first triplet of lines be (θ_i, r_i), i = 1, 2, 3, and the second triplet of lines be (θ′_i, r′_i), i = 1, 2, 3. We have the closed-form formula for T = (A, b) as follows:
0 1<br />
A = 1 @ a 11 a 12<br />
det ,a 21 ,a 22<br />
b = 1<br />
det<br />
0 1<br />
@ ,b 1 A ;<br />
b 2<br />
A ;<br />
i ;r 0<br />
i èt i=1;2;3 .<br />
where
\begin{align*}
a_{11} ={}& \cos\theta_3 \csc(\theta'_1 - \theta'_2)\,\sin(\theta_1 - \theta_2)\,(r'_1 \sin\theta'_2 - r'_2 \sin\theta'_1) \\
 &+ \cos\theta_1 \csc(\theta'_2 - \theta'_3)\,\sin(\theta_2 - \theta_3)\,(r'_2 \sin\theta'_3 - r'_3 \sin\theta'_2) \\
 &+ \cos\theta_2 \csc(\theta'_3 - \theta'_1)\,\sin(\theta_3 - \theta_1)\,(r'_3 \sin\theta'_1 - r'_1 \sin\theta'_3), \\[4pt]
a_{21} ={}& \cos\theta_3 \csc(\theta'_1 - \theta'_2)\,\sin(\theta_1 - \theta_2)\,(r'_1 \cos\theta'_2 - r'_2 \cos\theta'_1) \\
 &+ \cos\theta_1 \csc(\theta'_2 - \theta'_3)\,\sin(\theta_2 - \theta_3)\,(r'_2 \cos\theta'_3 - r'_3 \cos\theta'_2) \\
 &+ \cos\theta_2 \csc(\theta'_3 - \theta'_1)\,\sin(\theta_3 - \theta_1)\,(r'_3 \cos\theta'_1 - r'_1 \cos\theta'_3), \\[4pt]
a_{12} ={}& \sin\theta_3 \csc(\theta'_1 - \theta'_2)\,\sin(\theta_1 - \theta_2)\,(r'_1 \sin\theta'_2 - r'_2 \sin\theta'_1) \\
 &+ \sin\theta_1 \csc(\theta'_2 - \theta'_3)\,\sin(\theta_2 - \theta_3)\,(r'_2 \sin\theta'_3 - r'_3 \sin\theta'_2) \\
 &+ \sin\theta_2 \csc(\theta'_3 - \theta'_1)\,\sin(\theta_3 - \theta_1)\,(r'_3 \sin\theta'_1 - r'_1 \sin\theta'_3),
\end{align*}
\begin{align*}
a_{22} ={}& \sin\theta_3 \csc(\theta'_1 - \theta'_2)\,\sin(\theta_1 - \theta_2)\,(r'_1 \cos\theta'_2 - r'_2 \cos\theta'_1) \\
 &+ \sin\theta_1 \csc(\theta'_2 - \theta'_3)\,\sin(\theta_2 - \theta_3)\,(r'_2 \cos\theta'_3 - r'_3 \cos\theta'_2) \\
 &+ \sin\theta_2 \csc(\theta'_3 - \theta'_1)\,\sin(\theta_3 - \theta_1)\,(r'_3 \cos\theta'_1 - r'_1 \cos\theta'_3), \\[4pt]
b_1 ={}& r_3 \csc(\theta'_1 - \theta'_2)\,\sin(\theta_1 - \theta_2)\,(r'_1 \sin\theta'_2 - r'_2 \sin\theta'_1) \\
 &+ r_1 \csc(\theta'_2 - \theta'_3)\,\sin(\theta_2 - \theta_3)\,(r'_2 \sin\theta'_3 - r'_3 \sin\theta'_2) \\
 &+ r_2 \csc(\theta'_3 - \theta'_1)\,\sin(\theta_3 - \theta_1)\,(r'_3 \sin\theta'_1 - r'_1 \sin\theta'_3), \\[4pt]
b_2 ={}& r_3 \csc(\theta'_1 - \theta'_2)\,\sin(\theta_1 - \theta_2)\,(r'_1 \cos\theta'_2 - r'_2 \cos\theta'_1) \\
 &+ r_1 \csc(\theta'_2 - \theta'_3)\,\sin(\theta_2 - \theta_3)\,(r'_2 \cos\theta'_3 - r'_3 \cos\theta'_2) \\
 &+ r_2 \csc(\theta'_3 - \theta'_1)\,\sin(\theta_3 - \theta_1)\,(r'_3 \cos\theta'_1 - r'_1 \cos\theta'_3),
\end{align*}
and
\[
\det = r_1 \sin(\theta_2 - \theta_3) + r_2 \sin(\theta_3 - \theta_1) + r_3 \sin(\theta_1 - \theta_2).
\]

Given the correspondence of the basis lines of a model object and its scene instance, we can use the above formula to transform the boundary of the model object onto the scene for verification.
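As a concrete check of the closed form, the sketch below (NumPy; the function names and the round-trip test are mine, not from the thesis) evaluates the formulas for three corresponding lines in normal form $x\cos\theta + y\sin\theta = r$ and verifies that a known affine map is recovered:

```python
import numpy as np

def affine_from_line_triplets(model, scene):
    """Closed-form T = (A, b) mapping three model lines (theta_i, r_i)
    onto three scene lines (theta'_i, r'_i).  Lines are in normal form
    x*cos(theta) + y*sin(theta) = r and must be in general position."""
    t = [m[0] for m in model]; r = [m[1] for m in model]
    s = [q[0] for q in scene]; p = [q[1] for q in scene]
    det = (r[0] * np.sin(t[1] - t[2]) + r[1] * np.sin(t[2] - t[0])
           + r[2] * np.sin(t[0] - t[1]))
    a11 = a12 = a21 = a22 = b1 = b2 = 0.0
    for i, j, k in ((0, 1, 2), (1, 2, 0), (2, 0, 1)):   # cyclic triples
        w = np.sin(t[i] - t[j]) / np.sin(s[i] - s[j])   # sin(.) csc(.')
        fs = p[i] * np.sin(s[j]) - p[j] * np.sin(s[i])  # r'_i sin th'_j - r'_j sin th'_i
        fc = p[i] * np.cos(s[j]) - p[j] * np.cos(s[i])  # r'_i cos th'_j - r'_j cos th'_i
        a11 += np.cos(t[k]) * w * fs
        a21 += np.cos(t[k]) * w * fc
        a12 += np.sin(t[k]) * w * fs
        a22 += np.sin(t[k]) * w * fc
        b1 += r[k] * w * fs
        b2 += r[k] * w * fc
    A = np.array([[a11, a12], [-a21, -a22]]) / det
    b = np.array([-b1, b2]) / det
    return A, b

def map_line(theta, r, A, b):
    """Image of the line (theta, r) under p -> A p + b, in normal form."""
    n = np.array([np.cos(theta), np.sin(theta)])
    m = np.linalg.solve(A.T, n)        # A^{-T} n, parallel to the new normal
    lam = 1.0 / np.linalg.norm(m)
    n2 = lam * m                       # unit normal of the image line
    return np.arctan2(n2[1], n2[0]), lam * r + n2 @ b

# round-trip check with an arbitrary non-singular affine map
A0 = np.array([[2.0, 1.0], [0.5, 1.0]])
b0 = np.array([3.0, -1.0])
model = [(0.0, 1.0), (np.pi / 2, 1.0), (np.pi / 4, 0.5)]
scene = [map_line(th, rr, A0, b0) for th, rr in model]
A, b = affine_from_line_triplets(model, scene)
```

The general-position requirement shows up as $\det \neq 0$ (the three model lines are pairwise non-parallel and not concurrent) and as non-vanishing $\sin(\theta'_i - \theta'_j)$ in the cosecant factors.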
Appendix B

The components of the matrix for the system of linear equations for the best least-squares match are as follows:
\begin{align*}
m_{11} &= \sum_{j=1}^{n} \cos^2\theta_j\,(x_{j1}^2 + x_{j2}^2), &
m_{31} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1}^2 + x_{j2}^2), \\
m_{12} &= \sum_{j=1}^{n} \cos^2\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), &
m_{32} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), \\
m_{13} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1}^2 + x_{j2}^2), &
m_{33} &= \sum_{j=1}^{n} \sin^2\theta_j\,(x_{j1}^2 + x_{j2}^2), \\
m_{14} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), &
m_{34} &= \sum_{j=1}^{n} \sin^2\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), \\
m_{15} &= \sum_{j=1}^{n} \cos^2\theta_j\,(x_{j1} + x_{j2}), &
m_{35} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} + x_{j2}), \\
m_{16} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} + x_{j2}), &
m_{36} &= \sum_{j=1}^{n} \sin^2\theta_j\,(x_{j1} + x_{j2}), \\
m_{21} &= \sum_{j=1}^{n} \cos^2\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), &
m_{41} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), \\
m_{22} &= \sum_{j=1}^{n} \cos^2\theta_j\,(y_{j1}^2 + y_{j2}^2), &
m_{42} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(y_{j1}^2 + y_{j2}^2), \\
m_{23} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), &
m_{43} &= \sum_{j=1}^{n} \sin^2\theta_j\,(x_{j1} y_{j1} + x_{j2} y_{j2}), \\
m_{24} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(y_{j1}^2 + y_{j2}^2), &
m_{44} &= \sum_{j=1}^{n} \sin^2\theta_j\,(y_{j1}^2 + y_{j2}^2), \\
m_{25} &= \sum_{j=1}^{n} \cos^2\theta_j\,(y_{j1} + y_{j2}), &
m_{45} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(y_{j1} + y_{j2}), \\
m_{26} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(y_{j1} + y_{j2}), &
m_{46} &= \sum_{j=1}^{n} \sin^2\theta_j\,(y_{j1} + y_{j2}), \\
m_{51} &= \sum_{j=1}^{n} \cos^2\theta_j\,(x_{j1} + x_{j2}), &
m_{61} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} + x_{j2}), \\
m_{52} &= \sum_{j=1}^{n} \cos^2\theta_j\,(y_{j1} + y_{j2}), &
m_{62} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(y_{j1} + y_{j2}), \\
m_{53} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(x_{j1} + x_{j2}), &
m_{63} &= \sum_{j=1}^{n} \sin^2\theta_j\,(x_{j1} + x_{j2}), \\
m_{54} &= \sum_{j=1}^{n} \cos\theta_j \sin\theta_j\,(y_{j1} + y_{j2}), &
m_{64} &= \sum_{j=1}^{n} \sin^2\theta_j\,(y_{j1} + y_{j2}), \\
m_{55} &= 2 \sum_{j=1}^{n} \cos^2\theta_j, &
m_{65} &= 2 \sum_{j=1}^{n} \cos\theta_j \sin\theta_j, \\
m_{56} &= 2 \sum_{j=1}^{n} \cos\theta_j \sin\theta_j, &
m_{66} &= 2 \sum_{j=1}^{n} \sin^2\theta_j,
\end{align*}
and
\begin{align*}
n_1 &= \sum_{j=1}^{n} \cos\theta_j\, r_j\,(x_{j1} + x_{j2}), &
n_3 &= \sum_{j=1}^{n} \sin\theta_j\, r_j\,(x_{j1} + x_{j2}), \\
n_2 &= \sum_{j=1}^{n} \cos\theta_j\, r_j\,(y_{j1} + y_{j2}), &
n_4 &= \sum_{j=1}^{n} \sin\theta_j\, r_j\,(y_{j1} + y_{j2}), \\
n_5 &= 2 \sum_{j=1}^{n} \cos\theta_j\, r_j, &
n_6 &= 2 \sum_{j=1}^{n} \sin\theta_j\, r_j.
\end{align*}
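The components above are exactly the normal equations $D^{\mathsf T} D\,u = D^{\mathsf T} r$ for a design matrix $D$ with one row per segment endpoint, which gives a compact way to assemble the system. The sketch below (NumPy; the function name and the synthetic test are mine, not from the thesis) builds $M$ and $n$ this way and recovers a known affine map from exact data:

```python
import numpy as np

def lsq_system(lines, endpoints):
    """Assemble the 6x6 system M u = n of Appendix B, where
    u = (a, b, c, d, e, f) parametrizes the affine map
    (x, y) -> (a x + b y + e, c x + d y + f), and segment j with
    endpoints (x_j1, y_j1), (x_j2, y_j2) is matched to the scene line
    x cos(theta_j) + y sin(theta_j) = r_j.  M = D^T D reproduces the
    listed components m_kl, and D^T r the right-hand sides n_k."""
    rows, rvec = [], []
    for (theta, r), ((x1, y1), (x2, y2)) in zip(lines, endpoints):
        c, s = np.cos(theta), np.sin(theta)
        for x, y in ((x1, y1), (x2, y2)):
            rows.append([c * x, c * y, s * x, s * y, c, s])  # residual gradient
            rvec.append(r)
    D = np.array(rows)
    return D.T @ D, D.T @ np.array(rvec)

# synthetic check: segments mapped by a known affine map, then recovered
u0 = np.array([1.2, -0.3, 0.4, 0.9, 2.0, -1.0])      # a, b, c, d, e, f
A0 = u0[:4].reshape(2, 2)
t0 = u0[4:]
segs = [((0., 0.), (2., 1.)), ((1., 3.), (4., 0.)),
        ((-1., 2.), (3., 3.)), ((2., -2.), (0., 4.))]
lines = []
for p, q in segs:
    pp = A0 @ np.array(p) + t0
    qq = A0 @ np.array(q) + t0
    d = qq - pp
    nrm = np.array([-d[1], d[0]]) / np.hypot(d[0], d[1])  # unit normal
    lines.append((np.arctan2(nrm[1], nrm[0]), nrm @ pp))  # (theta_j, r_j)
M, n = lsq_system(lines, segs)
u = np.linalg.solve(M, n)
```

Because each segment contributes two rows, at least three segments in general position are needed for $M$ to be non-singular; with noisy data the same solve yields the best least-squares match.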
Bibliography

[1] A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1984.
[2] K. Arbter, W. E. Snyder, H. Burkhardt, and G. Hirzinger. Application of Affine-Invariant Fourier Descriptors to Recognition of 3-D Objects. IEEE Trans. on PAMI, 12(7):640-647, 1990.
[3] E. M. Arkin, L. P. Chew, D. P. Huttenlocher, K. Kedem, and J. S. B. Mitchell. An Efficiently Computable Metric for Comparing Polygonal Shapes. Technical Report TR 89-1007, Computer Science Dept., Cornell University, 1989.
[4] N. Ayache and O. D. Faugeras. HYPER: A New Approach for the Recognition and Positioning of Two-Dimensional Objects. IEEE Trans. on PAMI, 8(1):44-54, 1986.
[5] D. H. Ballard. Generalizing the Hough Transform to Detect Arbitrary Shapes. Pattern Recognition, 13(2):111-122, 1981.
[6] H. G. Barrow and J. M. Tenenbaum. Computational Vision. Prentice-Hall, 1981.
[7] P. J. Besl and R. C. Jain. Three-Dimensional Object Recognition. ACM Computing Surveys, 17(1):75-154, 1985.
[8] T. O. Binford. Survey of Model-Based Image Analysis Systems. The Int. J. of Robotics Research, 1(1):18-64, 1982.
[9] R. Boie and I. Cox. Two Dimensional Optimum Edge Recognition using Matched and Wiener Filter for Machine Vision. In Proc. of the IEEE Int. Conf. on Computer Vision, 1987.
[10] R. C. Bolles and R. A. Cain. Recognizing and Locating Partially Visible Objects: The Local-Feature-Focus Method. The Int. J. of Robotics Research, 1(3):57-82, 1982.
[11] R. A. Brooks. Model-Based Three-Dimensional Interpretations of Two-Dimensional Images. IEEE Trans. on PAMI, 5(2):140-149, 1983.
[12] E. Charniak and D. McDermott. Introduction to Artificial Intelligence. Addison-Wesley, 1985.
[13] R. T. Chin and C. R. Dyer. Model-Based Recognition in Robot Vision. ACM Computing Surveys, 18(1):67-108, 1986.
[14] P. R. Cohen and E. A. Feigenbaum (Eds.). The Handbook of Artificial Intelligence, Vol. III. William Kaufmann, 1982.
[15] R. O. Duda and P. E. Hart. Use of the Hough Transform to Detect Lines and Curves in Pictures. Communications of the ACM, 15:11-15, 1972.
[16] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. Wiley, 1973.
[17] P. J. Flynn and A. K. Jain. 3-D Object Recognition using Invariant Feature Indexing of Interpretation Tables. J. of Computer Vision, Graphics and Image Processing: Image Understanding, 55(2):119-129, 1992.
[18] D. Gavrila and F. Groen. 3-D Object Recognition from 2-D Images Using Geometric Hashing. Pattern Recognition Letters, 13(4):263-278, 1992.
[19] P. G. Gottschalk, J. L. Turney, and T. N. Mudge. Efficient Recognition of Partially Visible Objects using a Logarithmic Complexity Matching Technique. The Int. J. of Robotics Research, 8(6):110-140, 1989.
[20] W. E. L. Grimson. On the Recognition of Parameterized 2-D Objects. Int. J. of Computer Vision, 2(4):353-372, 1989.
[21] W. E. L. Grimson and D. P. Huttenlocher. On the Sensitivity of Geometric Hashing. In Proc. of the IEEE Int. Conf. on Computer Vision, pages 334-338, 1990.
[22] W. E. L. Grimson and D. P. Huttenlocher. On the Sensitivity of the Hough Transform for Object Recognition. IEEE Trans. on PAMI, 12(3):255-274, 1990.
[23] W. E. L. Grimson and T. Lozano-Pérez. Localizing Overlapping Parts by Searching the Interpretation Tree. IEEE Trans. on PAMI, 9(4):469-482, 1987.
[24] A. Gueziec and N. Ayache. New Developments on Geometric Hashing for Curve Matching. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 703-704, 1993.
[25] Y. C. Hecker and R. M. Bolle. Invariant Feature Matching in Parameter Space with Application to Line Features. Technical Report RC 16447, IBM T. J. Watson Research Center, 1991.
[26] D. Hillis. The Connection Machine. MIT Press, 1985.
[27] J. Hong and X. Tan. Recognize the Similarity between Shapes under Affine Transformation. In Proc. of the IEEE Int. Conf. on Computer Vision, pages 489-493, 1988.
[28] R. Hummel and I. Rigoutsos. Geometric Hashing as a Bayesian Maximum Likelihood Object Recognition Method. Int. J. of Computer Vision. Currently under review.
[29] R. Hummel and H. Wolfson. Affine Invariant Matching. In Proc. of the DARPA IU Workshop, pages 351-364, 1988.
[30] D. P. Huttenlocher and S. Ullman. Object Recognition using Alignment. In Proc. of the IEEE Int. Conf. on Computer Vision, pages 102-111, 1987.
[31] D. P. Huttenlocher and S. Ullman. Recognizing Solid Objects by Alignment with an Image. Int. J. of Computer Vision, 5(2):195-212, 1990.
[32] J. Illingworth and J. Kittler. A Survey of the Hough Transform. J. of Computer Vision, Graphics and Image Processing, 44:87-116, 1988.
[33] A. Kalvin, E. Schonberg, J. T. Schwartz, and M. Sharir. Two Dimensional Model Based Boundary Matching using Footprints. The Int. J. of Robotics Research, 5(4):38-55, 1986.
[34] E. Kishon. Use of Three Dimensional Curves in Computer Vision. PhD thesis, Computer Science Dept., Courant Institute of Mathematical Sciences, New York University, 1989.
[35] F. Klein. Elementary Mathematics from an Advanced Standpoint: Geometry. Macmillan, 1925.
[36] Y. Lamdan. Object Recognition by Geometric Hashing. PhD thesis, Computer Science Dept., Courant Institute of Mathematical Sciences, New York University, 1989.
[37] Y. Lamdan, J. T. Schwartz, and H. J. Wolfson. Object Recognition by Affine Invariant Matching. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 335-344, 1988.
[38] Y. Lamdan and H. J. Wolfson. On the Error Analysis of Geometric Hashing. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 22-27, 1991.
[39] S. Lang. Linear Algebra. Addison-Wesley, 1966.
[40] D. G. Lowe. Perceptual Organization and Visual Recognition. Kluwer, 1985.
[41] D. G. Lowe. The Viewpoint Consistency Constraint. Int. J. of Computer Vision, 1(1):57-72, 1987.
[42] D. G. Lowe. Three-dimensional Object Recognition from Single Two-dimensional Images. Artificial Intelligence, 31:355-395, 1987.
[43] I. Rigoutsos. Massively Parallel Bayesian Object Recognition. PhD thesis, Computer Science Dept., Courant Institute of Mathematical Sciences, New York University, 1992.
[44] I. Rigoutsos and R. Hummel. A Bayesian Approach to Model Matching with Geometric Hashing. J. of Computer Vision, Graphics and Image Processing: Image Understanding. Currently under review.
[45] I. Rigoutsos and R. Hummel. Robust Similarity Invariant Matching in the Presence of Noise. In Proc. of the 8th Israeli Conf. on Artificial Intelligence and Computer Vision, 1991.
[46] I. Rigoutsos and R. Hummel. Massively Parallel Model Matching: Geometric Hashing on the Connection Machine. IEEE Computer: Special Issue on Parallel Processing for Computer Vision and Image Understanding, 1992.
[47] I. Rigoutsos and R. A. Hummel. Implementation of Geometric Hashing on the Connection Machine. In IEEE Workshop on Directions in Aut. CAD-Based Vision, 1991.
[48] T. Risse. The Hough Transform for Line Recognition: Complexity of Evidence Accumulation and Cluster Detection. J. of Computer Vision, Graphics and Image Processing, 46:327-345, 1989.
[49] J. T. Schwartz and M. Sharir. Identification of Partially Obscured Objects in Two Dimensions by Matching of Noisy 'Characteristic Curves'. The Int. J. of Robotics Research, 6(2):29-44, 1987.
[50] G. Stockman. Object Recognition and Localization via Pose Clustering. J. of Computer Vision, Graphics and Image Processing, 40:361-387, 1987.
[51] J. M. Tenenbaum and H. G. Barrow. A Paradigm for Integrating Image Segmentation and Interpretation. In Proc. of the Int. Conf. on Pattern Recognition, pages 504-513, 1976.
[52] D. W. Thompson and J. L. Mundy. Three-Dimensional Model Matching from an Unconstrained Viewpoint. In Proc. of the IEEE Int. Conf. on Robotics and Automation, pages 208-220, 1987.
[53] L. W. Tucker, C. R. Feynman, and D. M. Fritzsche. Object Recognition using the Connection Machine. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 871-878, 1988.
[54] S. W. Zucker, R. Hummel, and A. Rosenfeld. An Application of Relaxation Labeling to Line and Curve Enhancement. IEEE Trans. on Computers, 26(4):394-403, 1977.