
A Probabilistic Approach to
Geometric Hashing using Line Features

by

Frank Chee-Da Tsai

A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy

Department of Computer Science
New York University
September 1993

Approved:

Professor Jacob T. Schwartz
Research Advisor

© Frank Chee-Da Tsai
All Rights Reserved, 1993.


Dedication

This dissertation is dedicated to Professor Jacob T. Schwartz for his invaluable advice, to my parents for their endless love, and to Buddha, whose wisdom cheers me through the dark patches of my life.


Acknowledgements

My most sincere gratitude goes to my research advisor, Professor Jacob T. Schwartz, for his generous guidance, without which this dissertation would not have been possible. To him I also owe my understanding of how to identify research directions of scientific interest.

I would also like to express my gratitude to Professor Wen-Hsiang Tsai of National Chiao-Tung University, Hsin-Chu, Taiwan, Republic of China. I took his course "Image Processing" when I was an undergraduate senior; it gave me my first taste of applying computer technology to the processing of images. After two years of R.O.T.C. military service upon graduation, I came to the Courant Institute in 1987 to further my studies. Professor Stéphane Mallat's "Computer Vision" course again sparked my interest in the field of image analysis.

Thanks are also due to Professor Jaiwei Hong, Professor Robert Hummel, Professor Richard Wallace, Professor Haim Wolfson, Dr. Isidore Rigoutsos, Mr. Ronie Hecker and Mr. Jyh-Jong Liu for help and discussions of various kinds.

I also thank the staff of the Courant Robotics Laboratory for the days we worked together, especially Dr. Xiaonan Tan for her encouragement.

Finally, I thank my parents, truly and sincerely, for their patience, support and constant encouragement throughout this work and my whole life.


Abstract

One of the most important goals of computer vision research is object recognition. Most current object recognition algorithms assume reliable image segmentation, which in practice is often not available. This research exploits the combination of the Hough method with the geometric hashing technique for model-based object recognition in seriously degraded intensity images.

We describe the analysis, design and implementation of a recognition system that can recognize, in a seriously degraded intensity image, multiple objects modeled by collections of lines.

We first examine the factors affecting line extraction by the Hough transform and propose various techniques to cope with them. Line features are then used as the primitive features from which we compute the geometric invariants used by the geometric hashing technique. Various geometric transformations, including rigid, similarity, affine and projective transformations, are examined.

We then derive the "spread" of a computed invariant over the hash space caused by perturbation of the lines giving rise to that invariant. This is the first noise analysis of its kind for line features in geometric hashing. The result of the noise analysis is then used in a weighted voting scheme for the geometric hashing technique.

We have implemented the system described and carried out a series of experiments on polygonal objects modeled by lines, assuming affine approximations to perspective viewing transformations. Our experimental results show that the technique described is noise resistant and suitable in an environment containing many occlusions.


Contents

List of Figures

1 Introduction
  1.1 Model-Based Object Recognition
    1.1.1 Problem Definition
    1.1.2 Object Recognition Method
  1.2 Difficulties to Be Faced
  1.3 The Hashing Approach
  1.4 Overview of This Dissertation
    1.4.1 Line Extraction
    1.4.2 Line Invariants
    1.4.3 Effect of Noise on Line Invariants
    1.4.4 Invariant Matching with Weighted Voting

2 Prior and Related Work
  2.1 Basic Paradigms
    2.1.1 Template Matching
    2.1.2 Hypothesis-Prediction-Verification
    2.1.3 Transformation Accumulation
    2.1.4 Consistency Checking and Constraint Propagation
    2.1.5 Sub-Graph Matching
    2.1.6 Evidential Reasoning
    2.1.7 Miscellaneous
  2.2 An Overview of Geometric Hashing
    2.2.1 A Brief Description
    2.2.2 Strengths and Weaknesses
    2.2.3 Geometric Hashing Systems
  2.3 The Bayesian Model of Rigoutsos

3 Noise in the Hough Transform
  3.1 The Hough Transform
  3.2 Various Hough Transform Improvements
  3.3 Implementation and Measured Performance
  3.4 Some Additional Observations Concerning the Accuracy of Hough Data

4 Invariant Matching Using Line Features
  4.1 Change of Coordinates in Various Spaces
    4.1.1 Change of Coordinates in Image Space
    4.1.2 Change of Coordinates in Line Parameter Space
    4.1.3 Change of Coordinates in (θ, r) Space
  4.2 Encoding Lines by a Basis of Lines
  4.3 Line Invariants under Various Transformation Groups
    4.3.1 Rigid Line Invariants
    4.3.2 Similarity Line Invariants
    4.3.3 Affine Line Invariants
    4.3.4 Projective Line Invariants

5 The Effect of Noise on the Invariants
  5.1 A Noise Model for Line Parameters
  5.2 The Spread Function of the Invariants
  5.3 Spread Functions under Various Transformation Groups
    5.3.1 Rigid-Invariant Spread Function
    5.3.2 Similarity-Invariant Spread Function
    5.3.3 Affine-Invariant Spread Function
    5.3.4 Projective-Invariant Spread Function

6 Bayesian Line Invariant Matching
  6.1 A Measure of Matching between Scene and Model Invariants
  6.2 Evidence Synthesis by Bayesian Reasoning
  6.3 The Algorithm
  6.4 An Analysis of the Number of Probings Needed

7 The Experiments
  7.1 Implementation of Affine Invariant Matching
  7.2 Best Least-Squares Match
  7.3 Experimental Results and Observations

8 Conclusions
  8.1 Discussion and Summary
  8.2 Future Directions

Appendix A

Appendix B

Bibliography


List of Figures

3.1 A comparison of the results of the standard Hough technique and our improved Hough technique, when applied to an image without noise.
3.2 A comparison of the results of the standard Hough technique and our improved Hough technique, when applied to a noisy image.
3.3 The twenty models used in our experiments. From left to right, top to bottom are model-0 to model-19.
3.4 Reading from top to bottom, left to right, (0) to (8) are image-0 with different noise levels corresponding to cases 0 to 8.
3.5 Image-1 to Image-9 before noise is imposed. The same noise levels as for Image-0 are applied during our experiments.
3.6 The thicker line is the one with parameters (θ, r). Projections of the pixels of this line onto the line (θ + 90°, r) disperse around the position at distance r from the origin.
7.1 Experimental Example 1
7.2 Experimental Example 2
7.3 Experimental Example 3
7.4 Experimental Example 4
7.5 Experimental Example 5
7.6 Experimental Example 6


Chapter 1

Introduction

One of the most important goals of computer vision research is object recognition. This humanlike visual capability would enable machines to sense and analyze their environment and to take appropriate action as desired.

We consider intensity-image techniques. Range data is usually harder to obtain, and there is also reason to believe that human vision emphasizes intensity images and that most practical applications of computer vision can be tackled without range information [42].

Specifically, we consider the problem of 2-D (or flat 3-D) object recognition under various viewing transformations. We are interested in (1) large model bases (more than ten objects); (2) cluttered scenes (low signal-to-noise ratio); (3) high occlusion levels; (4) segmentation difficulties. Current approaches to image segmentation generally produce incomplete boundaries and extraneous edge indications, so any approach to object recognition has to cope with segmentation defects. To work in environments containing many occlusions, we impose minimal segmentation requirements: only positional information of edgels is assumed to be available.

We will describe the analysis, design and implementation of a recognition system that can recognize, in a seriously degraded intensity image, multiple objects which can be modeled as collections of lines.



1.1 Model-Based Object Recognition

Recognition can be achieved by establishing correspondences between many kinds of predicted and measured object properties, including shape, color, texture, connectivity, context, motion, or shading. Here we will focus upon the problem of achieving spatial correspondence, which is often a prerequisite to examining correspondences along the other dimensions.

Prior research [7,13] has indicated that a model-based approach to object recognition can be very effective in overcoming occlusion, complication, and inadequate or erroneous low-level processing. Most commercial vision systems are model-based ones, in which recognition involves matching an input image to a set of predefined models of objects. A key goal in this approach is to precompile a description of a known set of objects, then to use these object models to recognize in an image each instance of an object and to specify its position and orientation relative to the viewer.

1.1.1 Problem Definition

Object recognition can be conceptualized in various ways; a brief survey of the literature on this subject demonstrates the point [7,8,13]. Here we adopt the definition given by Besl and Jain [7].

Their definition is motivated by observation of human visual capabilities. Two main steps are involved. The first step is learning: when a new object is given, the human visual system gathers information about that object from many different viewpoints. This process is usually referred to as model formation. The second step is identification.

The following definition of the single-arbitrary-view model-based object recognition problem is motivated by the above:

1. Given any collection of labeled solid objects, (a) each object can be examined as long as the object is not deformed; (b) labeled models can be created using information from this examination.

2. Given digitized sensor data corresponding to one particular, but arbitrary, field of view of the real world, given any data stored previously during the model formation process, and given the list of distinguishable objects, the following questions must be answered:

   - Does the object appear in the digitized sensor data?
   - If so, how many times does it occur?
   - For each occurrence, (1) determine the location in the sensor data (image), (2) determine the location (or translation parameters) of that object with respect to a known coordinate system (if possible with the given sensor), and (3) determine the orientation (or rotation parameters) with respect to a known coordinate system.

This problem statement allows the possibility of using computers to solve it; at any rate the problem is clearly solvable by human beings.

1.1.2 Object Recognition Method

In most object-recognition systems, one can distinguish five sub-processes: image formation, feature extraction, model representation, scene analysis and hypothesis verification.

Image Formation

This process creates images via sensors, which can be sensitive to a variety of signals such as visible-spectrum light, X-rays, etc. As stated, we will concentrate on intensity images.

Feature Extraction

This process acts on the sensor data and extracts relevant features. The geometric features we are interested in might in principle be lines, segments, curvature extrema, curvature discontinuities, conics and so on. But we will use lines exclusively.

Model Representation

This process converts models to the quantities which will be used to recognize them. It is closely related to the feature-extraction process, since the same features are supposed to be extracted from a scene for matching.

Models based on geometric properties of an object's visible surfaces or silhouette are commonly used because they describe objects in terms of their constituent shape features. Although many other models of regions and images emphasizing gray-level properties (e.g. texture and color) have been proposed, solving the problem of spatial correspondence is often a prerequisite to their use. Thus, recognition of objects in a scene is always apt to involve construction of a shape description of objects from sensed data and then matching the description to stored object models.

A shape representation scheme usually involves at least two components:

1. the primitive units, e.g. vertices (0-dimensional), curves and edges (1-dimensional), surfaces (2-dimensional, e.g. planar or quadric surfaces) and volumes (3-dimensional, e.g. generalized cylinders);

2. the way those primitives are combined to form the object models.

Such a representation scheme must satisfy the following two criteria to be considered of good quality:

1. Near-Uniqueness: Ideally each object in the world should have a limited number of representations; otherwise, when image features are derived, there will also be many choices of configuration for each representation, which increases the computational complexity of matching configurations of image features to model representations.

2. Continuity: Similar objects should have similar representations, and very different objects should have very different representations.

Scene Analysis

This process involves an algorithm for performing matching between models and scene descriptions. It is the most crucial process in an object recognition system and is the focus of this thesis. It recovers the transformations the model objects undergo during the image-formation process. Sometimes a quality measure can be associated with each model instance detected; this measure can be used as a measure of belief for further decision making in an autonomous system. We will classify basic matching techniques and review them in chapter 2. The techniques we are interested in are those allowing partial occlusion, viewing distortion and data perturbation.

In this dissertation, we will focus on using the geometric hashing technique for matching. The geometric hashing technique is very good as a filter capable of eliminating many candidate hypotheses as to the identity of the objects [38].



Hypotheses Verification

This process evaluates the quality of surviving hypotheses and either accepts or rejects them. Usually an object recognition system projects the hypothesized models onto the scene, and the fraction of the model accounted for by the available scene signals is computed. If this fraction is below a pre-set (usually empirically obtained) threshold, the hypothesis fails verification.

1.2 Difficulties to Be Faced

What makes the visual recognition task difficult? Basically, there are four factors:

1. noise, e.g. sensor noise and bad lighting conditions;

2. clutter, e.g. industrial parts occluded by flakes generated in the milling process;

3. occlusion, e.g. industrial parts overlapping each other;

4. spurious data, e.g. presence of unfamiliar objects in the scene.

The amount of effort required to recognize and locate the objects increases as these four factors become more serious. In dealing with them, we have to consider three problems [13]:

1. What kind of "features" should, and can reliably, be extracted from an image in order to adequately describe the spatial relations in a scene?

2. What constitutes an effective representation of these features and their relationships for an object model?

3. How should the correspondence between scene features and model features be established in order to recover model instances in a complex scene?

1.3 The Hashing Approach

In order to recognize 3-D objects in 2-D images, we must perform a comparison either in 2-D or in 3-D. Since it is easier to project 3-D models onto a 2-D image than it is to reconstruct 3-D shapes from 2-D scenes, it is logical, in model-based vision, to perform the comparisons in image space. If we simply project the 3-D models onto image space during the recognition process, we must guess at the projection parameters, and incur the computational expense of multiple projections at run-time. If we precompute many projections, then we potentially substitute many 2-D models for relatively fewer 3-D models.

A more effective method of performing object recognition was proposed by Schwartz and Sharir [33]. The technique exploits the idea of hashing. By appropriately selecting a hash function, dictionary operations¹ can be performed in O(1) time on the average. By hashing transformation invariants, instead of raw data subject to a viewing transformation, the hash function "hashes" to a fixed "bucket" of the hash table before and after whatever viewing transformations are allowed. Thus object models can be represented by their constituent geometric features, some appropriate subset of which is hashed and pre-stored in the hash table in a redundant way. During recognition, the same hash operations are applied to a subset of scene features, and candidate matching model features are retrieved from the hash table to hypothesize their correspondence.

This hashing technique trades space for time. We may even pre-compute many projections, substituting many 2-D models for relatively fewer 3-D models.

1.4 Overview of This Dissertation

In this dissertation, we consider highly degraded intensity images containing multiple objects. We emphasize the use of four techniques:

- use of an improved Hough transform to detect line features in a noisy image;

- use of geometric invariants derived from line features under various viewing transformations;

- use of the effect of line feature statistics in the perturbation analysis of the computed invariants;

- use of Bayesian reasoning as the basis for line feature matching.

¹ Consisting of "insert", "delete" and "member" operations [1].



1.4.1 Line Extraction

In noisy scenes, the locations of point features can be hard to detect. Line features are more robust and can be extracted by the Hough transform method with greater accuracy. Thus we choose lines as the primitive features to be used.

In chapter 3, we first briefly review the Hough transform technique for detecting line features. Then we point out several factors that adversely affect the performance of the method and propose several heuristics to cope with those factors and improve the performance. A series of experiments relating to this point are presented.
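To fix ideas before chapter 3, the following is a minimal sketch of the standard Hough accumulation for lines in the (θ, r) parameterization, not the improved variant proposed there; the grid resolutions and range bound are illustrative assumptions.

import numpy as np

def hough_lines(edgels, n_theta=180, n_r=256, r_max=512.0):
    """Accumulate Hough votes for lines x*cos(theta) + y*sin(theta) = r.

    edgels: iterable of (x, y) edge-pixel positions.
    Returns the accumulator array and the theta bin centers.
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    acc = np.zeros((n_theta, n_r), dtype=np.int32)
    for x, y in edgels:
        r = x * cos_t + y * sin_t                   # one r value per theta bin
        r_bin = np.clip(((r + r_max) / (2.0 * r_max) * (n_r - 1)).astype(int),
                        0, n_r - 1)
        acc[np.arange(n_theta), r_bin] += 1         # each edgel votes along a sinusoid
    return acc, thetas

# Peaks of `acc` above a threshold yield the detected (theta, r) line parameters.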

1.4.2 Line Invariants

In chapter 4, we first examine the way in which coordinate changes act on various spaces of potential interest for recognition. This allows us to define a method of encoding a line using a combination of other lines in a way invariant under suitable geometric transformations. Potentially interesting transformations considered include rigid, similarity, affine and projective transformations.

1.4.3 Effect of Noise on Line Invariants

In chapter 5, we model the statistical behavior of line parameters detected by the Hough transform in a noisy image using a Gaussian random process with mild assumptions. We analyze the statistics of the computed invariants and show that, to first order, these have a Gaussian distribution.

Analytical formulae for the various transformation groups, including rigid, similarity, affine and projective transformations, are given.

1.4.4 Invariant Matching with Weighted Voting

In chapter 6, we use the results of chapter 5 to formulate a Bayesian maximum-likelihood pattern classification as the basis of a weighted voting scheme for matching line features by geometric hashing.

We have implemented a system that makes use of these ideas to perform object recognition, assuming affine approximations to more general perspective viewing transformations. Both synthesized and real images were used in experiments. Experimental results are given in chapter 7. It is seen that the technique is noise resistant and usable in environments containing many occlusions.


Chapter 2

Prior and Related Work

Numerous techniques have been proposed for object recognition. A brief survey and classification of those techniques is given below. These techniques are not independent of each other; most vision systems combine several of them.

We divide the analysis into three sections. The first section examines object recognition using various classical schemes, and the remaining two sections discuss the background and existing work in the field of geometric hashing. In this thesis, we study the Hough transform methods discussed in section 2.1.3 and the geometric hashing described in section 2.2. Our work builds directly upon the Bayesian weighted voting scheme of Rigoutsos, which we describe in section 2.3.

2.1 Basic Paradigms

2.1.1 Template Matching

Template matching involves matching an image to a stored representation and evaluating some fit function.

According to their flexibility, templates can be classified into four categories [14]:

- Total templates require an exact match between a scene and a template. Any displacement or orientation error of a pattern in the scene will result in rejection.

- Partial templates move a template across the scene, computing cross-correlation factors. Points of maximum cross-correlation value are considered as locations where the desired pattern appears; multiple matches against the scene are thus possible.

- Piece templates represent a pattern by its components. Usually the component templates are weighted by size and scored against a prototype list of expected features. This method is less sensitive to distortions of the pattern in the scene than the more limited techniques listed above. The subgraph matching technique described later can be viewed as an extension of this technique.

- Flexible templates are designed to handle the problem of scene deviations from prototypes. Starting with a good prototype of a known object, templates can be parametrically modified to obtain a better fit until no more improvement is obtained. (This technique has been successfully applied to the sorting of chromosome images.) A difficulty is that flexible approaches tend to be more time-expensive than rigid approaches.

Most of these template matching techniques apply only in two-dimensional cases and have little place in three-dimensional analysis, where perspective distortion comes into play. One can store a dense set of tessellations of possible views, but this results in enormous computational time in matching. For basic techniques of template matching, see [15].
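As a concrete illustration of the partial-template idea above, here is a minimal normalized cross-correlation sketch; the brute-force scan and the grayscale-array interface are illustrative assumptions, not a quotation from [14] or [15].

import numpy as np

def cross_correlate(scene, template):
    """Slide `template` over `scene`, returning normalized cross-correlation.

    scene, template: 2-D grayscale arrays. Output values near 1.0 mark
    candidate occurrences of the pattern.
    """
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t * t).sum())
    sh, sw = scene.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            w = scene[i:i + th, j:j + tw] - scene[i:i + th, j:j + tw].mean()
            denom = np.sqrt((w * w).sum()) * t_norm
            out[i, j] = (w * t).sum() / denom if denom > 0 else 0.0
    return out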

2.1.2 Hypothesis-Prediction-Verification

The hypothesis-prediction-verification approach is among the most commonly used techniques in vision systems. It combines both bottom-up and top-down paradigms.

Hypotheses are generated bottom-up among structures in a geometric hierarchy, from structures of edges to curves, from structures of curves to surfaces, and from structures of surfaces to objects. Predictions work top-down from object models to images. Once a hypothesis has been made, predictions and measurements provide new information for identification.

Many such schemes use geometric features such as arcs, lines and corners to construct model representations. These features are usually portions of the object's boundary.

In the HYPER system [4], both model and scene are represented in the same way, by approximating the boundary with polygons. The ten longest segments are chosen as so-called privileged segments, which are the focus features used to detect a prospective correspondence (hypothesis) between object models and the scene. Using this correspondence, a transformation can be computed and nearby features are sought (prediction). The privileged segments, together with their nearby features, are combined to compute a new transformation. Then the verification step follows.

Verification is often accomplished using an alignment method. Usually there is a trade-off between the hypothesis generation stage and the verification stage. If hypotheses are generated in a quick-and-dirty manner, then the verification stage requires more effort. If we want the verification stage to be less painstaking, then more reliable features have to be detected and more accurate hypotheses produced.

In the alignment method of Huttenlocher and Ullman [30,31], affine approximations to more general perspective transformations are considered, using alignments of triplets of points. Models are processed sequentially during recognition. For each model, an exhaustive enumeration of all the possible pairings of three non-collinear points of the model and the scene is exercised. Thus the alignment method relies heavily on verification. Once a transformation is determined by the correspondence of model and scene features, the model is transformed and aligned to superimpose the image. Verification of the entire edge contour, rather than just a few local feature points, is then performed to reduce the false alarm rate. To cope with the high computational cost of the verification stage, a hierarchical scheme is used: starting with a relatively simple and rapid check, they eliminate many false matches, and then conclude with a more accurate and slower check.

To sum up, the hypothesis-prediction-verification cycle relates scene images to object models and object models to scene images step by step, with refinement at each step. Note also that any scheme using the hypothesis-prediction-verification paradigm can be tailored to a parallel implementation by using overwhelming computing power to generate a great number of hypotheses concurrently and verify them concurrently.

2.1.3 Transformation Accumulation

Transformation accumulation is also called pose clustering [50] or the generalized Hough transform [5]. It is characterized by a "parallel" accumulation of low-level pose evidence, followed by a clustering step which selects pose hypotheses with strong support from the set of evidence.

This method can be viewed as the inverse of template matching, which moves the model template around all the positions of the scene and directly computes a value (usually cross-correlation) as a measure of matching. Transformation accumulation, instead of trying all possible positions of the model in the scene, computes which positions are consistent with model features and scene features. Moreover, its use of all of the evidence without regard to order of arrival can be advantageous when there are occlusions.

The features used to generate pose hypotheses can be of high or low level. If the features used are of a high enough level, much less computational effort is required for the accumulation stage. However, the trade-off is that higher-level features are usually more difficult to extract.

The major problem of this technique usually lies in ensuring that all visible hypotheses are considered within acceptable limits of storage and computational resources. However, as computer power continues to grow while its cost continues to drop, this problem has been growing steadily less significant.

In the generalized Hough transform [5], the transformation between a model and the scene is usually described by a set of transformation parameters. For example, four parameters are needed for the case of two-dimensional similarity transformations: two parameters for translation, one for rotation, and one for scaling. Each transformation parameter is quantized in the so-called Hough space. The scene features vote for those parameters which are consistent with the pairings of these scene features and hypothesized model features. One problem of the generalized Hough transform is the size of the Hough space. In two-dimensional cases, we face a three-dimensional Hough space for rigid transformations, a four-dimensional Hough space for similarity transformations, and a six-dimensional Hough space for affine transformations.
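As an illustration of such pose-space voting, here is a minimal sketch for the two-dimensional rigid case (one rotation and two translation parameters); the bin sizes and the point-pair feature choice are illustrative assumptions.

import numpy as np
from collections import Counter
from itertools import combinations

def rigid_pose_votes(scene_pts, model_pts, angle_bins=36, trans_bin=8.0):
    """Vote in a quantized (rotation, tx, ty) space for 2-D rigid poses.

    Each pairing of a scene point pair with a model point pair determines
    one candidate rigid transformation, which casts one vote.
    """
    votes = Counter()
    for s1, s2 in combinations(scene_pts, 2):
        for m1, m2 in combinations(model_pts, 2):
            ds, dm = np.subtract(s2, s1), np.subtract(m2, m1)
            if abs(np.hypot(*ds) - np.hypot(*dm)) > trans_bin:
                continue  # a rigid map preserves lengths; prune inconsistent pairings
            theta = np.arctan2(ds[1], ds[0]) - np.arctan2(dm[1], dm[0])
            c, s = np.cos(theta), np.sin(theta)
            tx = s1[0] - (c * m1[0] - s * m1[1])  # translation after rotating m1
            ty = s1[1] - (s * m1[0] + c * m1[1])
            key = (int((theta % (2 * np.pi)) / (2 * np.pi) * angle_bins),
                   int(tx // trans_bin), int(ty // trans_bin))
            votes[key] += 1
    return votes  # strongly supported bins become pose hypotheses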

Tucker et al. [53] use local boundary features to constrain an object's position and orientation, which is then used as the basis for hypothesis generation. Their system takes advantage of the highly parallel processing power of the Connection Machine [26] to generate numerous transformation hypotheses concurrently and verify them concurrently. The number of processors required for each model is equal to the number of features of the model. Each scene feature participates, in parallel and independently, in each processor to match model features, generating transformation hypotheses which, once verified, are used to accumulate evidence voting for the pose transformation of the object.
used for evidence accumulation for voting for the pose transformation of the object.



Thompson and Mundy [52] describe a system for locating objects in a relatively unconstrained environment. The availability of a three-dimensional surface model of a polyhedral object is assumed. The primitive feature used is the so-called vertex-pair, which consists of two vertices: one is characterized by its position coordinate; the other, in addition to its position coordinate, includes the two edges that define the vertex. This feature serves as the basis for computing the affine viewing transformation from the model to the scene. Through voting in the transformation space, candidate transformations are selected.

A common critique of this paradigm is that, when the scene is noisy, the accumulation of "evidence" contributed by random noise can result in false alarms. Grimson and Huttenlocher [22] give a formal analysis of the likelihood of false positive responses of the generalized Hough transform for object recognition. However, we can use this paradigm as an early stage of processing (i.e., as a filter), followed by a scrutinizing verification stage.

2.1.4 Consistency Checking and Constraint Propagation

Many model-based vision schemes are based on searching the set of possible interpretations, which is usually combinatorially large.

As in other areas of artificial intelligence, making use of large amounts of world knowledge can often lead not only to increased robustness but also to a reduction in the search space that must be explored during the process of interpretation. It is usually possible to analyze a number of constraints or consistency conditions that must be satisfied to make a correct interpretation. Effective application of consistency checking or propagation of constraints during searching can often prune the search space greatly.

Lowe [41] emphasizes the importance of the viewpoint consistency constraint, which requires that the locations of all object features in an image be consistent with the projection from a single viewpoint. The application of this constraint allows the spatial information in an image to be compared with prior knowledge of an object's shape to the full degree of available image resolution. Lowe also argues that viewpoint consistency plays a central role in most instances of human visual recognition.

Grimson [20] extended his previous work RAF [23] to handle some classes of parameterized objects. He approaches the recognition problem as a search problem using the so-called interpretation tree. Since this search is inherently an exponential process, he analyzed a set of geometric constraints based on the local shape of parts of objects to prune large subtrees from consideration without having to explore them.

The relaxation labeling technique (e.g. [51,54]) is yet another example of this type. Many visual recognition problems can be viewed as constraint-satisfaction problems. For example, when labeling a block-world picture, a relaxation algorithm iteratively assigns values to mutually constrained objects using local information alone, in such a way as to ensure a consistent set of values for which no constraints are violated. The values assigned to the objects by relaxation algorithms can be discrete or probabilistic. In the former case, discrete labels are assigned to objects, while in the latter case, a level of certainty (or probability) is attached to each label. Relaxation algorithms can proceed in a parallel-iterative manner, propagating matching constraints in parallel.
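To make the probabilistic variant concrete, here is a minimal sketch of one parallel relaxation update; the compatibility tensor with values in [-1, 1] and the particular support-weighted update rule are illustrative assumptions, shown for illustration only.

import numpy as np

def relax_step(p, compat, neighbors):
    """One parallel update of label probabilities p[i, a].

    p:         (n_objects, n_labels) current probabilities.
    compat:    (n_objects, n_objects, n_labels, n_labels) compatibilities in [-1, 1].
    neighbors: list of neighbor index lists, one per object.
    """
    n, m = p.shape
    q = np.zeros_like(p)
    for i in range(n):
        for a in range(m):
            # Support for label a at object i, gathered from neighboring labelings.
            q[i, a] = sum(compat[i, j, a, b] * p[j, b]
                          for j in neighbors[i] for b in range(m))
    new_p = np.clip(p * (1.0 + q), 0.0, None)        # reward compatible labels
    return new_p / new_p.sum(axis=1, keepdims=True)  # renormalize to probabilities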

2.1.5 Sub-Graph Matching

Both object models and scene images can be expressed by graphs of nodes and arcs. Nodes represent detected features (usually geometric structures) and arcs represent the geometric relations between these features, e.g. [11]. The task then reduces to finding an embedding or fitting of a model in a description of the scene.

Since a model contains both local and relational features in the form of a graph, matching depends not only on the presence of particular boundary features but also on their interrelations (e.g. distance). The requirement for successful recognition is that a sufficient set of key local features be visible and in correct relative positions, which may allow for a specified amount of distortion. For example, Bolles and Cain [10] proposed the use of a hierarchy of local focus features for model representation. Once the local focus features have been detected, nearby features are predicted and searched for in the scene. If the best focus features are occluded, the second-best focus features are used.

An advantage of this technique is its robustness to relative occlusion and distortion. Computational inefficiency can be a drawback, since graph-matching or subgraph isomorphism procedures have to be executed. Effective strategies to reduce the time complexity are therefore required. In [10], once an occurrence of the best focus feature is located, the system simply runs down the appropriate list of nearby features and a transformation is computed, followed by a verification step. Thus exploration of the whole relational graph is avoided. Barrow and Tenenbaum [6] propose the use of a hierarchical graph-searching technique which decomposes the model into independent components.

2.1.6 Evidential Reasoning

Evidential reasoning is a method for combining information from different sources of evidence to update probabilistic expectations. The attractive feature of using evidential reasoning for computer vision is that it allows us to combine information of varying reliability from many sources, even though no particular item of evidence is by itself strong enough for recognizing a particular object.

For example, SCERPO by Lowe [40] is a search-based matching system. The search space of SCERPO involves two major components: (1) the space of possible viewpoints; (2) the space of selecting one model from the model database. Lowe tackles the first component by perceptual organization and suggests that the second component be treated by a technique of evidential reasoning. Each piece of evidence contributes to the presence of a certain object or objects, which can be described quantitatively by a probabilistic expectation. The combining of evidence can thus be described quantitatively by combining probabilistic expectations. The probability ranking is constantly updated to reflect new evidence found. For example, as soon as one object is recognized, it provides contextual information which updates the rankings and aids in the search process. Ranking and updating may take time; however, as the list of possible objects grows, the cost of the evidential reasoning may be amortized and its use becomes more important.

To sum up, evidential reasoning may deserve serious consideration for incorporation into a system under development. It shows promise for carrying out the objective of combining many sources of information, including color, texture, shape and prior knowledge, in a flexible way to achieve recognition.

2.1.7 Miscellaneous

Dual Models

If the partial-visibility problem is to be attacked, a recognition system using global features alone is not feasible. Only local features can be used in the matching procedure in order to cope with overlapping parts and occlusions. One is free to choose different local features such as points, lines, curves, etc., or their combinations, as the matching features, according to the nature of the objects in the model base. However, two factors of representation, completeness and efficiency, have to be taken into consideration. Completeness of representation calls for features rich enough to enable discrimination between similar objects, while efficiency suggests using some minimal characteristic information to match models against the scene (so that the inherent complexity of the matching algorithm will be small).

A main drawback of representation by local features can be its incompleteness, since objects usually cannot be fully described by a set of local features alone. One way to cope with this problem is to use both global and local feature representations: local features are used in the matching procedure, while the additional complete representation is used, after the matching process, for verification and further refinement of the object localization [4,10,19,31,36].

Dual Scenes

Object recognition algorithms usually fall into two categories: those using intensity images and those using range data. However, we may use both of them as long as it helps. For example, Kishon [34] combines the use of both range and intensity data to extract 3-D curves from a scene. These curves are of much higher quality than if they were extracted from the range data alone. The combined information is also used to classify the 3-D curves as shadow, occluding, fold or painted curves.

Abstracted Features

An abstracted feature is a feature which does not physically exist, but is inferred. For example, in preparing a model representation for polygonal objects, we may extend non-neighboring edges to obtain their intersection and use the contained angle as a feature. Some more advanced applications of abstracted features include the footprint of concavity [37], the Fourier Descriptor used for silhouette description of aircraft in [2], and the representation of turning angle as a function of arc length used in [3,19,49].



Normalization

There are various reasons for the use of normalization.

In some cases, it is used to preserve certain properties. For example, in the probabilistic relaxation labeling scheme, the accumulated contributions from neighboring points have to be normalized so that the updating rule results in a probability (i.e., a value in [0, 1]).

In other cases, it can be viewed as a technique for removing some uncertain factors. For example, suppose we use the Fourier Descriptor to describe the boundary of an object. In order to make such a representation insensitive to variations such as changes in size, rotation, translation and so on, we have to perform normalization operations so that the contour has a "standard" size, orientation and starting point. By normalizing the Fourier component F(1) to have unity magnitude, we remove the factor of scaling variation. Similarly, the s-θ graph (the representation of turning angle as a function of arc length) has to be shifted vertically by adding an offset to θ so that the reference point on the contour takes a standard value, such as zero. Since the rotation of a contour in Cartesian space corresponds to a simple shift in the ordinate (θ) of the s-θ graph, this normalization process removes the factor of rotation [19].

Other good reasons to use normalization involve making some operations easier to perform. For example, in performing point set matching, Hong and Tan [27] first normalize both point sets to their canonical forms; then the canonical forms are matched. After normalization, the matching between two canonical forms reduces to a simple rotation of one to match the other.
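A minimal sketch of the Fourier-descriptor scale normalization just described, assuming the contour is sampled as complex numbers z = x + iy; zeroing the DC term F(0) additionally removes translation.

import numpy as np

def normalized_descriptors(contour_xy):
    """Fourier descriptors of a closed contour, normalized for scale.

    contour_xy: (n, 2) array of boundary points, ordered along the contour.
    """
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    F = np.fft.fft(z)
    F[0] = 0.0                 # drop the DC term: removes translation
    return F / np.abs(F[1])    # unit |F(1)|: removes the scaling factor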

2.2 An Overview of Geometric Hashing

2.2.1 A Brief Description

The geometric hashing idea has its origins in work of Schwartz and Sharir [49]. Application of the geometric hashing idea to model-based visual recognition was described by Lamdan, Schwartz and Wolfson. This section outlines the method; a more complete description can be found in Lamdan's dissertation [36], and a parallel implementation can be found in [46].

The geometric hashing method proceeds in two stages: a preprocessing stage and a recognition stage. In the preprocessing stage, we construct a model representation by computing and storing redundant, transformation-invariant model information in a hash table. During the subsequent recognition stage, the same invariants are computed from features in a scene and used as indexing keys to retrieve from the hash table the possible matches with the model features. If a model's features score sufficiently many hits, we hypothesize the existence of an instance of that model in the scene.

The Pre-processing Stage

Models are processed one by one. New models added to the model base can be processed and encoded into the hash table independently. For each model M and for every feasible basis b consisting of k points (k depends on the transformations the model objects undergo during formation of the class of images to be analyzed), we

(i) compute the invariants of all the remaining points in terms of the basis b;

(ii) use the computed invariants to index the hash table entries, in each of which we record a node (M, b).

Note that all feasible bases have to be used. In particular, all the permutations (up to k!) of the k inputs used to calculate the invariants have to be considered. The complexity of this stage is O(m^(k+1)) per model, where m is the number of points extracted from a model. However, since this stage is executed off-line, its complexity is of little significance.

The Recognition Stage

Given a scene with n feature points extracted, we

(i) choose a feasible set of k points as a basis b;

(ii) compute the invariants of all the remaining points in terms of this basis b;

(iii) use each computed invariant to index the hash table and hit all (M_i, b_j)'s that are stored in the entry retrieved;

(iv) histogram all (M_i, b_j)'s with the number of hits received;

(v) establish a hypothesis of the existence of an instance of model M_i in the scene if (M_i, b_j), for some j, peaks in the histogram with sufficiently many hits;

and repeat from step (i) if all hypotheses established in step (v) fail verification.

The complexity of this stage is O(n) + O(t) per probe, where n is the number of points extracted from the scene and t is the complexity of verifying an object instance.
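To make the two stages concrete, here is a minimal sketch for the simplest classical case, point features under similarity transformations with bases of k = 2 points; the hash quantization cell is an illustrative assumption, and this is not the line-feature scheme developed in later chapters.

import numpy as np
from collections import defaultdict
from itertools import permutations

def invariant(p, b0, b1):
    """Coordinates of point p in the frame defined by basis (b0, b1);
    these coordinates are invariant under similarity transformations."""
    e1 = b1 - b0
    e2 = np.array([-e1[1], e1[0]])        # perpendicular axis, same length
    d = p - b0
    s = e1.dot(e1)
    return (d.dot(e1) / s, d.dot(e2) / s)

def quantize(u, v, cell=0.1):
    return (round(u / cell), round(v / cell))

def preprocess(models):
    """models: dict name -> (m, 2) point array. Returns the hash table."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):   # every feasible basis
            for k in range(len(pts)):
                if k in (i, j):
                    continue
                key = quantize(*invariant(pts[k], pts[i], pts[j]))
                table[key].append((name, (i, j)))       # record node (M, b)
    return table

def recognize_probe(table, scene_pts, i, j):
    """One probe: histogram hits for the scene basis (i, j)."""
    votes = defaultdict(int)
    for k in range(len(scene_pts)):
        if k in (i, j):
            continue
        key = quantize(*invariant(scene_pts[k], scene_pts[i], scene_pts[j]))
        for node in table.get(key, ()):
            votes[node] += 1
    return votes  # peaks become (model, basis) hypotheses to be verified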

2.2.2 Strengths and Weaknesses

At a glance, geometric hashing seems similar to the transformation accumulation techniques discussed in section 2.1.3. However, the similarity lies only in the use of "accumulating evidence" by voting. The techniques discussed in section 2.1.3 accumulate "pose" evidence, while geometric hashing accumulates "feature correspondence" evidence. The former always votes for the parameters of transformations, while the latter votes for (model identifier, basis set) pairs, from which transformation parameters can be computed once the correspondence between the scene feature set and the model basis set is hypothesized.

Strengths

Most of the methods discussed in section 2.1 are search-based: model features are searched to match scene features, and this search process goes through each model in the model base sequentially. For example, the interpretation tree technique of Grimson [23] has inherent exponential complexity because it pairs each scene feature with each model feature combinatorially. Geometric hashing greatly accelerates the search of the model base by using a hashing technique to select candidate model features to match scene features. This process is at worst sublinear in the size of the model base.

Like the transformation accumulation techniques, geometric hashing's voting scheme copes well with occlusion and with the fragility of existing image segmentation techniques. The accumulation of evidence without regard to order makes parallel implementation easy.

Weaknesses

Errors in feature extraction commonly perturb the computed invariants and degrade recognition performance, since the perturbed invariant values are used to index the hash table to retrieve candidate model features. Performance is similarly degraded by the quantization noise introduced in constructing the hash table. Thus the use of point features does not seem viable in a seriously degraded image; see [21].

The inherently non-uniform distribution of computed invariants results in non-uniform bin occupancy for a uniformly quantized hash table [47]. This degrades the index selectivity of the hashing scheme used.

2.2.3 Geometric Hashing Systems

Since the introduction of the geometric hashing method by Wolfson and Lamdan, a number of applications and improvements have been developed at the Robotics Research Laboratory of New York University and at other research laboratories.

Gavrila and Groen [18] apply the hashing technique to the recognition of 3-D objects from single 2-D images. They determine limits of discriminability by experiment in order to generate viewpoint-centered models.

Gueziec and Ayache [24] address the problem of fast rigid matching of 3-D curves in medical images. They incorporate six differential invariants associated with the surface and introduce an original table in which they hash values for the six transformation parameters.

Hummel and Wolfson [29] discuss both object matching and curve matching by geometric hashing.

Flynn and Jain [17] use invariant features and the hashing technique to generate hypotheses without resorting to a voting procedure. They conclude that this method is more efficient than a constrained search technique such as the interpretation tree technique [23].

More recently, the work of Rigoutsos develops geometric hashing by incorporating a Bayesian model for weighted voting, which we describe in the next section.

2.3 The Bayesian Model of Rigoutsos

The thesis work of Rigoutsos [43] from 1992 considerably extends the capability of the geometric hashing scheme.

To cope with the positional uncertainty of point features, a Gaussian distribution is used to model the perturbation of the coordinates of each point. That is, the measured values are assumed to be distributed according to a Gaussian distribution centered at the true values and having standard deviation $\sigma$. More precisely, let $(x_i, y_i)$ be the "true" location of the $i$-th feature point in the scene, and let $(X_i, Y_i)$ be the continuous random variables denoting the measured coordinates of the $i$-th feature. The joint probability density function of $X_i$ and $Y_i$ is then given by

\[
f(X_i, Y_i) = \frac{1}{2\pi\sigma^2}\,\exp\!\left(-\frac{(X_i - x_i)^2 + (Y_i - y_i)^2}{2\sigma^2}\right).
\]

Using this error model, he formulates a weighted voting scheme for the evidence accumulation in geometric hashing: the contribution of a scene point's hash $\nu$ to the model/basis hypothesis recorded at an entry $\mu$ of the hash space is weighted using a formula of the form

\[
\log\!\left[\,1 - c + A\,\exp\!\left(-\tfrac{1}{2}(\mu - \nu)\,\Sigma^{-1}(\mu - \nu)^t\right)\right],
\]

where $c$ and $A$ are constants that depend on the scene density and $\Sigma$ is the covariance matrix, in hash-space coordinates, of the expected distribution of $\nu$ based on the error model for $(x, y)$, the feature point in the scene.

He shows that when weighted voting is used according to the above formula, the evidence accumulation can be interpreted as a Bayesian a posteriori classification scheme: the accumulated totals are related to

\[
\log[\mathrm{Prob}(H_k \mid E_1, \ldots, E_n)],
\]

where the hypothesis $H_k$ represents the proposition that a certain model/basis occurs in the scene and matches the chosen basis, and the $E_i$'s are pieces of evidence given by scene invariants.
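In outline, each weighted vote can be computed as in the following sketch; the constants c and A and the covariance Sigma are placeholder values here, whereas Rigoutsos derives them from the scene density and the error model.

    import numpy as np

    def log_vote(mu, nu, Sigma, c=0.01, A=10.0):
        # Weighted vote of scene hash nu for the hypothesis stored at entry mu.
        d = np.asarray(mu, dtype=float) - np.asarray(nu, dtype=float)
        m = d @ np.linalg.inv(Sigma) @ d          # squared Mahalanobis distance
        return np.log(1.0 - c + A * np.exp(-0.5 * m))

    # Example: a scene hash 0.3 bins away along each axis, isotropic uncertainty.
    w = log_vote((1.0, 2.0), (1.3, 2.3), Sigma=0.25 * np.eye(2))

Hashes far from the entry contribute a weight near log(1 - c), a small penalty, while nearby hashes contribute strongly positive evidence.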

The above formulation was implemented on an 8K-processor Connection Machine (CM-2) and can recognize objects that have undergone a similarity transformation, from a library of 32 models. The models used for the experiments are military aircraft and production automobiles. Features are extracted by using the Boie-Cox edge detector [9] and locating the points of high curvature; the features are the coordinate pairs of these points [44].

In this thesis, we also make use of a Bayesian a posteriori maximum likelihood recognition scheme based on geometric hashing. We use somewhat simplified versions of the formulae given by Rigoutsos, and we make use of line features as opposed to point features. However, we also assume a Gaussian perturbation model of the feature values, which in our case implies an independent perturbation of the $(\theta, r)$ variables. Since no reliable segmentation is assumed to be available, we extract our line features by means of an improved Hough transform technique that operates on seriously degraded images. Objects modeled by lines that have undergone affine transformations are used for the experiments. A viewpoint consistency constraint is also applied to further filter out false alarms.


Chapter 3

Noise in the Hough Transform

This chapter overviews the effect of noise on the Hough transform, which is used to detect line features in an edge map.

We use a series of simulations to estimate the reliability and accuracy of the Hough technique. By reliability we mean its ability to detect lines while coping with occlusion and additive noise; by accuracy we mean the deviation of the detected line parameters from their true values. Although much has been published on this question and theoretical analyses have been given (see [22]), important questions on these two issues remain open, and simulation remains a revealing technique.

In section 3.1, we briefly review the history and technique of the Hough transform. Section 3.2 describes the factors that affect the performance of the Hough transform and the ways it can be improved. A series of simulations on images, covering a range of line lengths, orientations and positions under increasing noise levels (both positive and negative noise), has been performed. Section 3.3 shows our experimental results, which are summarized in section 3.4.

3.1 The Hough Transform

The well-known Hough technique was introduced by Paul Hough in a US patent filed in 1962. Hough's initial application was the analysis of bubble chamber photographs of particle tracks; such images contain a large amount of noise. Hough proposed to use amplifiers, delays, signal generators, and so on to perform what we now call the Hough transform in an analog form.

A common digital implementation of the Hough transform applies the normal parameterization suggested by Duda and Hart [15], in the form

\[
x\cos\theta + y\sin\theta = r,
\]

where $r$ is the perpendicular distance of the line to the origin and $\theta$ is the angle between a normal to the line and the positive $x$-axis.

This normal parameterization has several advantages: $r$ and $\theta$ vary uniformly as the line orientation and position change, and neither goes to infinity as the line becomes horizontal or vertical. It is easy to see that points on a particular line all map to sinusoids that intersect in a common point in Hough space, and the coordinates $(\theta, r)$ of that intersection point give the parameters of the line.

In standard implementations of the Hough method, Hough space is quantized. Each $(x_i, y_i)$ is mapped to a sampled, quantized sinusoid. The detailed algorithm is as follows:

1. Quantize the 2-D Hough space (i.e., parameter space) between $0^\circ$ and $180^\circ$ for $\theta$ and between $-\frac{N}{\sqrt{2}}$ and $+\frac{N}{\sqrt{2}}$ for $r$ ($N \times N$ is the image size).

2. Form an accumulator array $A[\theta][r]$, whose buckets are initialized to 0.

3. For each edgel $(x, y)$ in the image, increment all buckets in the accumulator array along the appropriate sinusoid, i.e.,

\[
A[\theta][r] := A[\theta][r] + 1
\]

for $\theta$ and $r$ satisfying $r = x\cos\theta + y\sin\theta$ within the limits of the quantization.

4. Locate local maxima in the accumulator array; these correspond to collinear edgels in the image (the values of the accumulator array provide a measure of the number of edgels on each line).
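A direct NumPy rendering of these four steps might look as follows; the centred origin and the 1-degree, 1-pixel quantization anticipate the choices reported in section 3.3, and the sketch is illustrative rather than the implementation used in our experiments.

    import numpy as np

    def hough_accumulate(edge_map):
        N = edge_map.shape[0]                        # assume an N x N image
        thetas = np.deg2rad(np.arange(180))          # 1-degree quantization
        r_max = int(np.ceil(N / np.sqrt(2))) + 1     # r in [-N/sqrt(2), N/sqrt(2)]
        A = np.zeros((180, 2 * r_max + 1))
        ys, xs = np.nonzero(edge_map)
        xs, ys = xs - N / 2.0, ys - N / 2.0          # origin at the image centre
        for x, y in zip(xs, ys):
            r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
            A[np.arange(180), r + r_max] += 1        # vote along the sinusoid
        return A, thetas, r_max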

We note that, due to the quantization of Hough space and to noise in the image, the sinusoids generated by points of the same line do not in general intersect precisely at a common point in Hough space. Much of the literature addresses this problem by suggesting additional processing (see [32] for a survey).


3.2 Various Hough Transform Improvements

Since the above straightforward implementation of the Hough transform needs improvement in highly degraded environments [22], we have investigated the factors affecting the performance of the technique and have implemented and experimented with a series of variations of the Hough transform.

Factors that adversely affect the performance (reliability and accuracy) of the Hough transform include:

- Short lines inherently result in low peaks in Hough space $H(\theta, r)$, which hinders the detection of short lines relative to long lines.

- When $r = x\cos\theta + y\sin\theta$ is rounded to pick a particular bucket in $H(\theta, r)$ for a specific $\theta$, the fractional part of $r$ is lost and all subsequent computations are done on the rounded $r$ values.

- $\theta$ is also sampled. If a line has an orientation $\theta_0$ not exactly equal to any sampled $\theta$, the points on the line, when mapped to Hough space, will not have identical $r$ values. Instead, their $r$ values will spread out near the true $r$ value.

- Using the coordinates $(\theta, r)$ of the peak in Hough space as the line parameters limits the precision of the parameters detected, since Hough space is quantized.

To strengthen the low peaks in Hough space that result from short lines in a source image (compared to the inherently high peaks that result from long lines), we apply background subtraction, removing the contribution made to Hough space by the edgels along a long line once that line has been detected. This method is attributed to Risse [48]. To implement the idea, a global maximum in Hough space, say bucket $(\theta, r)$, is located; then its corresponding line in the image is identified and all accumulator buckets which were incremented while processing edgels lying along this line are decremented. This algorithm differs from the algorithm described in the previous section in that, instead of detecting local maxima in the accumulator array as candidate lines, it detects lines one after another iteratively, as follows:

    while still more lines to be detected do
        (theta_hat, r_hat) := globalMax(accumulator A);
        for each edgel (x, y) on the line x cos(theta_hat) + y sin(theta_hat) = r_hat do
            A[theta][r] := A[theta][r] - 1, for theta and r satisfying r = x cos(theta) + y sin(theta)
        end for
    end while
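In Python, and reusing the hough_accumulate sketch of section 3.1, the loop might be rendered as below; edgels_on_line is an assumed helper that returns the edge pixels lying, within the quantization limits, on a given line (in the same centred coordinates).

    import numpy as np

    def detect_lines(edge_map, n_lines, edgels_on_line):
        A, thetas, r_max = hough_accumulate(edge_map)
        found = []
        for _ in range(n_lines):
            ti, ri = np.unravel_index(np.argmax(A), A.shape)  # global maximum
            theta_hat, r_hat = thetas[ti], ri - r_max
            found.append((theta_hat, r_hat))
            # Decrement every bucket incremented by edgels on the detected line.
            for x, y in edgels_on_line(edge_map, theta_hat, r_hat):
                r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
                A[np.arange(len(thetas)), r + r_max] -= 1
        return found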

Starting with this algorithm as the backbone, we improve it with weighted voting to compensate for the rounding error of $r$ and to achieve sub-bucket precision, as described below.

To compensate for the rounding error of $r$, we distribute the Hough vote-increment value (typically 1) over a region in Hough space. For example, suppose $r_{exact} = x\cos\theta + y\sin\theta$ (i.e., before rounding) and $r_a$ and $r_b$ are consecutive values of sampled $r$ in Hough space such that $r_a \le r_{exact} < r_b$. Then we increment $H(r_a, \theta)$ by an amount proportional to $r_b - r_{exact}$ and increment $H(r_b, \theta)$ by an amount proportional to $r_{exact} - r_a$. That is, we distribute the increments linearly over the two related buckets.

To compensate for the spreading of votes caused by the quantization of $\theta$ and $r$, we find the bucket $(\theta, r)$ receiving the maximum number of votes, place a small window (3 by 3 in our implementation) centered on this bucket, and compute the center of mass over this window to achieve sub-bucket precision in the line parameter estimates.
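Both refinements are sketched below under the same assumptions as the earlier accumulator code: fractional votes are split linearly between the two enclosing r buckets, and a 3-by-3 centre of mass around an interior peak bucket yields sub-bucket parameter estimates.

    import numpy as np

    def vote_linear(A, theta_idx, r_exact, r_max):
        ra = int(np.floor(r_exact))               # r_a <= r_exact < r_b = r_a + 1
        frac = r_exact - ra
        A[theta_idx, ra + r_max] += 1.0 - frac    # proportional to r_b - r_exact
        A[theta_idx, ra + 1 + r_max] += frac      # proportional to r_exact - r_a

    def subbucket_peak(A, ti, ri):
        # Centre of mass over a 3x3 window around an interior peak bucket (ti, ri).
        win = A[ti - 1:ti + 2, ri - 1:ri + 2].astype(float)
        dt, dr = np.meshgrid([-1, 0, 1], [-1, 0, 1], indexing="ij")
        return (ti + (win * dt).sum() / win.sum(),
                ri + (win * dr).sum() / win.sum())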

Experiments on seriously degraded images show that this method is superior to other variations of the Hough transform, in the sense that more true lines and fewer spurious lines are detected, and the accuracy of the estimated line parameters is quite good.

To further reduce the number of spurious lines detected, we observe that the Hough transform detects lines in a very global way. More precisely, it counts the number of edgels which happen to lie on the same line, sparsely or densely, as the evidence for the existence of a line. This differs from human visual recognition, which exhibits grouping by proximity.

To further enhance our algorithm, we therefore use grouping by proximity, as follows. To detect n lines in an image, we

1. apply the algorithm described above to detect a pre-specified larger number, say 2n, of lines;

2. for each line $(\theta_i, r_i)$ detected,

   (a) go back to the image (edge map) and scan the edgels $(x, y)$ on the line with a 1-D sliding window of appropriate size (depending on the image resolution);

   (b) if the number of edgels in the window is less than some fraction, say 1/3, of the window size, take $(x, y)$ to be a noise edgel which happens to lie on the line $(\theta_i, r_i)$;

   (c) re-accumulate the evidence supporting the line without counting the noise edgels;

3. sort the lines by their new evidence support and select the top n from among them.

The reason we simply select the top n from among the 2n lines detected by our previous algorithm is that the previous algorithm already performs well, and we only want to remove the remaining spurious lines.

Figure 3.1 compares the results of the standard Hough technique and our improved Hough technique. Figure 3.1(a) shows a synthesized image without noise; it contains roughly 40 segments. Figure 3.1(b) shows the result of the standard Hough analysis: 50 lines are detected, 17 true lines are missing, and many of the true lines are detected multiple times. Figure 3.1(c) shows the result of applying our improved Hough technique without the proximity-grouping enhancement. Again, 50 lines are detected; no true lines are missing, yet, since more lines are detected than are actually present in the source image, a few true lines are detected multiple times. Figure 3.1(d) shows the result of applying our improved Hough technique with the proximity-grouping enhancement.

Figure 3.2(a) shows the same image as Figure 3.1(a), but with serious noise added. Figures 3.2(b) to 3.2(d) correspond to Figures 3.1(b) to 3.1(d). The performance improvement over the standard Hough technique is even more obvious there.

3.3 Implementation and Measured Performance

In our implementation, we use $\Delta\theta = 1$ (1 degree) and $\Delta r = 1$ (1 pixel) as the quantization steps for $\theta$ and $r$ in Hough space. This appears quite sufficient for our subsequent applications. Intuitively, smaller values of $\Delta\theta$ might give better precision but can result in the spreading of peaks (recall that the image is digitized, so that the edgels of a line will not lie precisely on a digitized line except when the line has an orientation of $0^\circ$, $45^\circ$, $90^\circ$ or $135^\circ$). Similarly, smaller values of $\Delta r$ give better precision but result in the spreading of peaks.

[Figure 3.1: A comparison of the results of the standard Hough technique and our improved Hough technique, applied to an image without noise. Panels: (a) a test image without noise, containing roughly 40 segments; (b) 50 lines detected using the standard Hough transform; (c) 50 lines detected using our improved Hough transform without proximity grouping; (d) 50 lines detected using our improved Hough transform with proximity grouping.]

[Figure 3.2: The same comparison applied to a noisy image. Panels: (a) a test image with serious noise, containing roughly 40 segments; (b) 50 lines detected using the standard Hough transform; (c) 50 lines detected using our improved Hough transform without proximity grouping; (d) 50 lines detected using our improved Hough transform with proximity grouping.]

We have performed a series of experiments on synthesized images containing polygonal objects arbitrarily chosen from our model base (see Figure 3.3). (The same images are used in our later experiments.)

[Figure 3.3: The twenty models used in our experiments, model-0 to model-19, arranged from left to right and top to bottom.]

Two kinds of noise are imposed in our experiments: positive and negative.

We generate negative noise by randomly erasing edgels from the object boundaries. For example, if we want 10% of the boundary to be missing (occluded), we scan along the boundary of every object and erase an edgel whenever ran modulo 10 = 0, where ran is a random number; similarly, if we want 20% of the boundary to be missing, we do the same whenever ran modulo 5 = 0.

The positive noise we generate consists of random dot noise and segment noise. Random dot noise is additive dot noise whose position in the image is randomly selected; segment noise is added by randomly selecting a position in the image and an orientation (from $0^\circ$ to $360^\circ$) for the segment, then placing a line segment of length varying randomly from 0 to 7 pixels.

Our simulations consider different combinations of noise levels. The image size used in our experiments is 256 x 256. Figure 3.4 shows a sample scene (image-0) after application of the various noise levels. The other test images are shown in Figure 3.5; the same noise levels were applied to them.

In the following, we show the results of experiments on the ten images. The number of lines we select is 70. Since more lines are then detected than are actually present in the source images, our experiments allow for spurious lines.

The following abbreviated tables list the detected lines by the ranks of their votes, using "*" to represent lines actually present in the source image and "." to represent spurious lines. Standard deviations of the $\theta$ and $r$ values and their covariances are also given; these data are of significance for the design of the later recognition stage.

- case 0: no noise imposed on the image

image-0: *************************************************.****.*..***.........
image-1: ******************************.*******************.**...*..*....**....
image-2: ********************************************************.*..**.*..*...
image-3: *************************************************......*.........*....
image-4: ****************.**********.****************.********...*..**.........
image-5: **********************************************.***.*...........*..*.*.
image-6: ***********************************************************.*.*..*....
image-7: ******************************************************.****...........
image-8: ***************************************************.**.*..*.....*...*.
image-9: *********************************************..************.*.....*..*

$\sigma_\theta$ = 1.398227 (degree) or 0.024404 (radian); $\sigma_r$ = 1.672734 (pixel); $\sigma_{\theta r}$ = 0.189464 (degree.pixel) or 0.003307 (radian.pixel).

[Figure 3.4: Image-0 at the nine noise levels corresponding to cases 0 to 8, reading from top to bottom, left to right: (0) no noise; (1) 10% negative noise and no positive noise; (2) 10% negative noise and 200 pieces of positive noise; (3) 10% negative noise and 400 pieces of positive noise; (4) 10% negative noise and 800 pieces of positive noise; (5) 20% negative noise and no positive noise; (6) 20% negative noise and 200 pieces of positive noise; (7) 20% negative noise and 400 pieces of positive noise; (8) 20% negative noise and 800 pieces of positive noise.]

[Figure 3.5: Image-1 to Image-9 before noise is imposed. The same noise levels as for Image-0 are applied during our experiments.]

- case 1: 10% negative noise, no positive noise

image-0: **********************************************.*.***.*.*..............
image-1: ***************************************.***.****.....*.**...*.....*...
image-2: *********************************************************..*..*.......
image-3: **************************************.*******...***.*...*.*.......*..
image-4: ****************.*.***************..***.*..**.*.*.*............*...*..
image-5: ***************************************************.....*.*.*.**.....*
image-6: ************************************************************..........
image-7: ****************************************************.****....*.*......
image-8: ****************************************************.*...*.*.*..*.....
image-9: ***********************************..***********************.*......*.

$\sigma_\theta$ = 1.496886 (degree) or 0.026126 (radian); $\sigma_r$ = 1.803655 (pixel); $\sigma_{\theta r}$ = 0.244846 (degree.pixel) or 0.004273 (radian.pixel).

- case 2: 10% negative noise, along with 200 pieces of random dot noise and 200 pieces of segment noise

image-0: ******************************.******.*.*........*..*..........*......
image-1: ***********************.*.******.********..**..*.*............*.*.....
image-2: ***************************************.***.*.***.*...***.....*.......
image-3: *************************************.*****.*.*..*......*.....*.......
image-4: *****************.*********.****.*.*.*....**.*..**...*......*...*.....
image-5: **************************************.********.**.*.....*...*....***.
image-6: *********************************************.***.**.*.*..*..........*
image-7: ************************************************...**...**...*..***.*.
image-8: **************************************.**..***.*.*..*......*......*...
image-9: ******************************.*.*..********.***.*.*..*..*.......*.**.

$\sigma_\theta$ = 1.656363 (degree) or 0.028909 (radian); $\sigma_r$ = 2.059478 (pixel); $\sigma_{\theta r}$ = 0.119053 (degree.pixel) or 0.002078 (radian.pixel).

- case 3: 10% negative noise, along with 400 pieces of random dot noise and 400 pieces of segment noise

image-0: *****************.************..****.**......*.*.....*................
image-1: ***********************.*****..*********..**.......*...........*.....*
image-2: *********************************.*.***.***.*.***..........*........*.
image-3: *******************************.**..*.*..**...**.*.****..........**...
image-4: ****************.*********.*....****..*.*..*..*...............**....*.
image-5: *******************************.****.*.**..*.....*...*......*...*.**.*
image-6: *********************************.*.*****.*.*..**..........*..*.*....*
image-7: **********************************.*******.*****.**.....**.*......**..
image-8: **********************************...**.*..*.*.*......*...............
image-9: ***************************...*..*.*.**.*****.*.....*.......*........*

$\sigma_\theta$ = 1.712783 (degree) or 0.029894 (radian); $\sigma_r$ = 2.017484 (pixel); $\sigma_{\theta r}$ = 0.217823 (degree.pixel) or 0.003802 (radian.pixel).

- case 4: 10% negative noise, along with 800 pieces of random dot noise and 800 pieces of segment noise

image-0: *******************.***.***......****..*...*..........*..*.......*....
image-1: ****.**.**.**..*****.*.*..*.***.**.*....*..*.*....*..*..*.............
image-2: ***********.********.*..*...***..****.***.....*..............*.......*
image-3: *******.************.**.*.**.***...*....*.*..*.*......................
image-4: ************.*******....*.......*..******..*.............*....*....*..
image-5: ***************.***********.***......**..*........*............*......
image-6: ***************..*************..****..**.*..........*......*.........*
image-7: **************.***.*****.*****..***********.*.*...**..*...**.*.*...*..
image-8: ***************.********.****..***.*.****.*.*.....*..*................
image-9: ***********.**********.****..*.**...*.*..****.**.*..*...*..*...**.....

$\sigma_\theta$ = 1.938883 (degree) or 0.033840 (radian); $\sigma_r$ = 2.519878 (pixel); $\sigma_{\theta r}$ = -0.357589 (degree.pixel) or -0.006241 (radian.pixel).

- case 5: 20% negative noise, no positive noise

image-0: ***********************************************.***...................
image-1: ********************************.***..******.**.*.*....**........*...*
image-2: ****************************************.***********.*..*.*....*......
image-3: ********************************.******.*.***..*.....**.*...*.*.*...*.
image-4: *********************.****.**.*****.**.******.**.*...........**...*...
image-5: ********************************************..*.....*.**.....*........
image-6: **********************************************************.*..*.......
image-7: ************************************.***************.*.*.*..*......*..
image-8: **********************************************.*.......*........*.....
image-9: ******************.*.******.********.*.*..*******.***.**.*.*.*..*.*...

$\sigma_\theta$ = 1.563745 (degree) or 0.027293 (radian); $\sigma_r$ = 1.913004 (pixel); $\sigma_{\theta r}$ = 0.219434 (degree.pixel) or 0.003830 (radian.pixel).

- case 6: 20% negative noise, along with 200 pieces of random dot noise and 200 pieces of segment noise

image-0: *************.****************.*..**.*.*.*.*..*......................*
image-1: ******************************.*..***...**..**.*...........*.....*..*.
image-2: ****************************.*****.*.**.**.......**.....****.*........
image-3: **************************************.**.***.***.....*.......*...*...
image-4: *******************.****.***..**.***...***..........................*.
image-5: ***********************************.*.**.***..*....................*..
image-6: **********************.***************..***...**.*..*...**.....*.*.*.*
image-7: ******************************.****..***...****.*..*...*..*..*.*......
image-8: *********************************.**..*.**.***.*..........*...........
image-9: ***********.*****.*.******.**.*****...*...***......*......*..*.*.**.*.

$\sigma_\theta$ = 1.754688 (degree) or 0.030625 (radian); $\sigma_r$ = 2.144938 (pixel); $\sigma_{\theta r}$ = 0.349338 (degree.pixel) or 0.006097 (radian.pixel).

- case 7: 20% negative noise, along with 400 pieces of random dot noise and 400 pieces of segment noise

image-0: *********.*****************.*.***.*...*.......*...*............*....*.
image-1: **********************.*.**..**.*****..*.*.*****......................
image-2: *******************.**.*****..*...*.**.*....*..**.....................
image-3: ***************************.**.*.**......*.........**...*........*....
image-4: ******************..**..*..**..*****..**...*.*............*.*.........
image-5: *****************.************.**.*........***.*............*........*
image-6: **********************..*******.*********.......*.**....*.***..*..*...
image-7: ****************************.*.*.***.*.***..........**.*.*....*.**....
image-8: *********************..*******.*.*****.*..*..**...*..*.......*..*.....
image-9: ***************..*.**.*.*.***.**.**..****....*.*.*........*...*.*.....

$\sigma_\theta$ = 1.880869 (degree) or 0.032827 (radian); $\sigma_r$ = 2.397261 (pixel); $\sigma_{\theta r}$ = -0.113044 (degree.pixel) or -0.001973 (radian.pixel).

- case 8: 20% negative noise, along with 800 pieces of random dot noise and 800 pieces of segment noise

image-0: ***********.****.******...*.****..*.....*.*...........**......*..*....
image-1: ***..**.***.**.**.***.**..*.*.*.*.**...*.*.*.....*....................
image-2: ******.*.****.**.**........***....**.....*........**.**.....*.........
image-3: **********.****.*..*****..*.*.*..**..*.*........*.*...............*..*
image-4: *************.*.*.*.......*.*..*.....*.........*..***...*.*.........**
image-5: ********.**********.*****...*...*....*.*........*..**...........**.*.*
image-6: ****************...*.*********...*....*.**...*....*.......*.*.....***.
image-7: *************.***.***.*..*.*.***.**.**.***.*...*.........*.**.......*.
image-8: ******.******....*.*****..*.....*.*.*.*...*.*.....*....*.............*
image-9: ********..*******..**..*....**.**.*******...*.....*......*.....**.....

$\sigma_\theta$ = 2.168774 (degree) or 0.037852 (radian); $\sigma_r$ = 2.503542 (pixel); $\sigma_{\theta r}$ = -0.396545 (degree.pixel) or -0.006921 (radian.pixel).

3.4 Some Additional Observations Concerning the Accuracy of Hough Data

For the case in which no noise is imposed, we can see that true lines dominate the front of the list (those with high votes) and spurious lines sit at the rear. Note that some of the true lines are detected multiple times. This can happen when a line is slanting, because its edgels become dispersed in ladder fashion, so that two parallel lines are often detected.

Negative noise is more devastating to the Hough transform than positive noise. Positive noise, unless very intense, does not make true lines disappear; negative noise subtracts signal and can seriously affect the accuracy of the line parameters detected. This is especially true of short lines with a small slope, whose ladder-like edge maps can degenerate under negative noise into fragments that support two slightly different line hypotheses. Indeed, our experiments showed that most of the undetected lines are slanting lines with small slopes.

While background subtraction works well to remove the bias in favor of long line segments over short ones, it implicitly applies a winner-takes-all principle. In particular, when a long line segment intersects a short line segment, their intersection pixel will be classified as belonging to the long segment and subtracted from the image after the long segment is detected. When the two segments do not intersect each other, even though their extended lines do, the perceptual grouping technique we use reduces this adverse effect.

In regard to accuracy, we observe that line length plays the key role in the accuracy of Hough transform results. The Hough transform works well for detecting long line segments even if the image is quite noisy. As line length decreases, noise affects the accuracy of the Hough transform ever more seriously.

When the center of a segment is far away from the projection of the origin onto the extended line of the segment, the average error in $r$ increases. This phenomenon can be rationalized as follows (see Figure 3.6). Since our algorithm loops through $\theta$, we in fact consider a line through the origin with slope angle $\theta$ (i.e., with line parameters $(\theta + 90^\circ, 0)$) and project all the image edgels onto this line. If bucket $(\theta, r)$ in Hough space is detected with $n$ votes, roughly $n$ edgels are projected onto this line at positions roughly a distance $r$ from the origin. With this viewpoint in mind, and because of image quantization, if the center of a line segment with parameters $(\theta, r)$ is far from the intersection of line $(\theta, r)$ and line $(\theta + 90^\circ, 0)$, the projections of the segment's points onto line $(\theta + 90^\circ, 0)$ disperse around the position at distance $r$ from the origin (except when the orientation of the segment is $0^\circ$, $45^\circ$, $90^\circ$ or $135^\circ$).

[Figure 3.6: The thicker line has parameters $(\theta, r)$. Projections of the pixels of this line onto the line $(\theta + 90^\circ, 0)$ disperse around the position at distance $r$ from the origin; the figure marks the origin O, the center of the segment, and the projection of the origin onto the extended line.]


Chapter 4

Invariant Matching Using Line Features

The idea behind the geometric hashing method is to encode local geometric features in a manner which is invariant under the geometric transformations that model objects undergo during formation of the class of images being analyzed. This encoding can then be used in a hash function that makes possible fast retrieval of model features from the hash table, which can accordingly be viewed as an encoded model base. The technique can be applied directly to line features, without resorting to point features indirectly derived from lines, provided that we use some method of encoding line features in a way invariant under the transformations considered.

Potentially relevant transformation groups include rigid transformations, similarity transformations, affine transformations and projective transformations, depending on the manner in which an image is formed. Each of these classes of transformations is a subgroup of the full group of projective transformations. Since a projective transformation is a collineation (see p. 89 of [35]), all of these transformations preserve lines. This allows us to use line features as inputs to the recognition procedures.

4.1 Change of Coordinates in Various Spaces

Since we will represent line features by their normal parameterization $(\theta, r)$, we show in this section how a change of coordinates in image space by a linear transformation induces a change of coordinates in $(\theta, r)$-space. This is preliminary to the invariant line encodings exploited in the following sections.

4.1.1 Change of Coordinates in Image Space

It is often easiest to work in homogeneous coordinates, since this allows projective transformations to be represented by matrices. A change of coordinates in homogeneous coordinates is given by

\[
\mathbf{w}' = T\mathbf{w},
\]

where $\mathbf{w} = (w_1, w_2, w_3)^t$ is the homogeneous coordinate vector before the transformation, $\mathbf{w}' = (w_1', w_2', w_3')^t$ is the homogeneous coordinate vector after the transformation, and $T$ is a non-singular $3 \times 3$ matrix. Note that $T$ and $\lambda T$, for any $\lambda \neq 0$, define the same transformation, since we are dealing with homogeneous coordinates.

A point $(x, y)^t$ in image space is represented by a non-zero 3-D point $\mathbf{w} = (w_1, w_2, w_3)^t$ in homogeneous coordinates, such that

\[
x = \frac{w_1}{w_3} \quad \text{and} \quad y = \frac{w_2}{w_3},
\]

where $w_3 \neq 0$. This representation is not unique, since $\lambda\mathbf{w}$, for any $\lambda \neq 0$, is also a homogeneous representation of $(x, y)^t$.

Every non-singular $3 \times 3$ matrix defines a 2-D projective transformation of homogeneous coordinates. Various subgroups of the projective transformation group are defined by different restrictions on the form of the matrix $T$.

4.1.2 Change of Coordinates in Line Parameter Space

A line in the 2-D plane can be represented by its parameter vector and is usually parameterized as

\[
a_1 w_1 + a_2 w_2 + a_3 w_3 = 0, \quad \text{or} \quad \mathbf{a}^t\mathbf{w} = 0,
\]

where $\mathbf{a} = (a_1, a_2, a_3)^t$ is the parameter vector of the line (note that $a_1$ and $a_2$ are not both equal to 0) and $\mathbf{w} = (w_1, w_2, w_3)^t$ is a homogeneous coordinate vector of any point on the line. This representation is not unique, since $\lambda\mathbf{a}$, for any $\lambda \neq 0$, is also a representation of the same line.

A transformation $T$ in image space changes the coordinates of every point $\mathbf{w}$ on a line to $\mathbf{w}'$ by $\mathbf{w}' = T\mathbf{w}$. Substituting $\mathbf{w} = T^{-1}\mathbf{w}'$ in $\mathbf{a}^t\mathbf{w} = 0$, we get

\[
\mathbf{a}^t T^{-1}\mathbf{w}' = 0, \quad \text{hence} \quad ((T^{-1})^t\mathbf{a})^t\,\mathbf{w}' = 0, \quad \text{or} \quad \mathbf{a}'^t\mathbf{w}' = 0, \ \text{where } \mathbf{a}' = (T^{-1})^t\mathbf{a}.
\]

This shows that the change of coordinates of the point in 3-D parameter space is given by

\[
\mathbf{a}' = P\mathbf{a}, \quad \text{where } P = (T^{-1})^t.
\]
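A quick numerical check of this rule, for an arbitrary non-singular T chosen here purely as an example: a point on the line $\mathbf{a}^t\mathbf{w} = 0$ is mapped by T onto the line with parameter vector $(T^{-1})^t\mathbf{a}$.

    import numpy as np

    T = np.array([[2.0,  0.3,  1.0],
                  [-0.4, 1.5, -2.0],
                  [0.0,  0.0,  1.0]])                 # an affine example
    theta, r = 0.7, 5.0
    a = np.array([np.cos(theta), np.sin(theta), -r])  # x cos(theta) + y sin(theta) = r
    a_new = np.linalg.inv(T).T @ a                    # a' = (T^{-1})^t a
    w = np.array([r / np.cos(theta), 0.0, 1.0])       # a point on the original line
    assert abs(a @ w) < 1e-9                          # w lies on the line a
    assert abs(a_new @ (T @ w)) < 1e-9                # its image Tw lies on a'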

4.1.3 Change of Coordinates in $(\theta, r)$ Space

Line features of an image are usually extracted by the Hough transform. A common implementation of the Hough transform applies the normal parameterization suggested by Duda and Hart [15], in the form

\[
x\cos\theta + y\sin\theta = r,
\]

where $r$ is the perpendicular distance of the line to the origin and $\theta$ is the angle between a normal to the line and the positive $x$-axis.

This unique parameterization of lines relates to the preceding parameterization as follows. Let $F : \mathbb{R}^2 \to \mathbb{R}^3$ be the mapping

\[
F((\theta, r)^t) = (\cos\theta, \sin\theta, -r)^t.
\]

If we restrict the domain of $\theta$ to $[0, \pi)$, then $F^{-1}$ exists. Define another mapping $G : \mathbb{R}^3 \to \mathbb{R}^2$ via $F^{-1}$ as

\[
G((a_1, a_2, a_3)^t) =
\begin{cases}
F^{-1}\!\left(\left(\dfrac{a_1}{\sqrt{a_1^2+a_2^2}}, \dfrac{a_2}{\sqrt{a_1^2+a_2^2}}, \dfrac{a_3}{\sqrt{a_1^2+a_2^2}}\right)^{\!t}\right) & \text{if } a_2 \ge 0, \\[3mm]
F^{-1}\!\left(\left(\dfrac{-a_1}{\sqrt{a_1^2+a_2^2}}, \dfrac{-a_2}{\sqrt{a_1^2+a_2^2}}, \dfrac{-a_3}{\sqrt{a_1^2+a_2^2}}\right)^{\!t}\right) & \text{otherwise},
\end{cases}
\]

where $a_1$ and $a_2$ are not both equal to 0. Then

\[
G((a_1, a_2, a_3)^t) =
\begin{cases}
\left(\tan^{-1}\!\left(\dfrac{a_2}{a_1}\right), \dfrac{-a_3}{\sqrt{a_1^2+a_2^2}}\right)^{\!t} & \text{if } a_2 > 0, \\[2mm]
\left(0, \dfrac{-a_3}{a_1}\right)^{\!t} & \text{if } a_2 = 0, \\[2mm]
\left(\tan^{-1}\!\left(\dfrac{-a_2}{-a_1}\right), \dfrac{a_3}{\sqrt{a_1^2+a_2^2}}\right)^{\!t} & \text{if } a_2 < 0,
\end{cases}
\]

where the range of $\tan^{-1}$ is $[0, \pi)$ and $\tan^{-1}(\infty) = \frac{\pi}{2}$.

Thus $G$ maps a point $\mathbf{a} = (a_1, a_2, a_3)^t$ in parameter space, where $a_1$ and $a_2$ are not both equal to 0, to $(\theta, r)^t = G(\mathbf{a})$ in $(\theta, r)$-space such that

\[
a_1 w_1 + a_2 w_2 + a_3 w_3 = 0 \quad \text{and} \quad w_1\cos\theta + w_2\sin\theta - r w_3 = 0
\]

define the same line.

Let $(\theta, r)^t$ be the $(\theta, r)$-parameters defining a line (in fact, the line $x\cos\theta + y\sin\theta = r$). Then $F((\theta, r)^t) = \mathbf{a}$, where $\mathbf{a} = (\cos\theta, \sin\theta, -r)^t$, is a point in parameter space. A transformation $P$ changes the coordinates of $\mathbf{a}$ to $\mathbf{a}' = P\mathbf{a}$ in parameter space. Substituting $F((\theta, r)^t)$ for $\mathbf{a}$ in $\mathbf{a}' = P\mathbf{a}$, we get $\mathbf{a}' = PF((\theta, r)^t)$, and hence $(\theta', r')^t = G(\mathbf{a}') = G(PF((\theta, r)^t))$ defines the same line as $\mathbf{a}'$ (or $\lambda\mathbf{a}'$, $\lambda \neq 0$).

Thus a transformation of a point in parameter space results in a transformation of $(\theta, r)^t$ in $(\theta, r)$-space. The change of coordinates of $(\theta, r)^t$ in $(\theta, r)$-space is given by

\[
(\theta', r')^t = H((\theta, r)^t), \tag{4.1}
\]

where $H = G \circ P \circ F$. (We abuse notation slightly by writing $P(\mathbf{v})$ for the matrix product $P\mathbf{v}$; we use $P(\mathbf{v})$ and $P\mathbf{v}$ interchangeably when no ambiguity occurs.)
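The maps F and G are straightforward to realize numerically, which makes H easy to compose for any parameter-space transformation P. The following sketch uses atan2 in place of the branch analysis above, with the same $[0, \pi)$ convention.

    import numpy as np

    def F(theta, r):
        # (theta, r) -> line parameter vector (cos(theta), sin(theta), -r).
        return np.array([np.cos(theta), np.sin(theta), -r])

    def G(a):
        # Parameter vector -> (theta, r) with theta in [0, pi).
        a1, a2, a3 = a / np.hypot(a[0], a[1])   # normalize so cos^2 + sin^2 = 1
        if a2 < 0 or (a2 == 0 and a1 < 0):      # flip sign to keep sin(theta) >= 0
            a1, a2, a3 = -a1, -a2, -a3
        return np.arctan2(a2, a1) % np.pi, -a3

    def H(P, theta, r):
        # The induced change of coordinates in (theta, r)-space, Eq. (4.1).
        return G(P @ F(theta, r))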

4.2 Encoding Lines by a Basis of Lines

To apply the geometric hashing technique to line features, we need to encode the features in a manner invariant under the transformation group considered. One way of encoding is


CHAPTER 4. INVARIANT MATCHING USING LINE FEATURES 43<br />

<strong>to</strong> ænd a canonical basis and a transformation èin the transformation group under considerationè<br />

that maps the basis <strong>to</strong> the canonical basis, then apply that transformation <strong>to</strong> the<br />

line being encoded. For example, let T be a projective transformation that maps a basis<br />

b <strong>to</strong> Tèbè. If P is the unique transformation such that Pèbè = b c and P 0 is the unique<br />

transformation such that P 0 èTèbèè = b c , where b c is the chosen canonical basis, then we<br />

have P = P 0 æ T and any other line l and its correspondence l 0<br />

= Tèlè will be mapped <strong>to</strong><br />

Pèlè by P and P 0 èl 0 èby P 0<br />

respectively. Since P 0 èl 0 è=P 0 èTèlèè = P 0 æ Tèlè =Pèlè, this<br />

is an invariant encodingë25ë.<br />

Various subgroups of the projective transformation group require diæerent bases. We<br />

discuss them in the following section and give formulae for the corresponding invariants.<br />

Throughout the following discussion, we represent a line by its normal parameterization<br />

èç; rè with ç in ë0; çè and r in R. If the computed invariant èç 0 ;r 0 è has ç 0<br />

not in ë0::çè, we<br />

adjust it by adding ç or ,ç and adjust the value of r 0<br />

by æipping its sign accordingly.<br />

4.3 Line Invariants under Various Transformation Groups

4.3.1 Rigid Line Invariants

Any correspondence of two ordered pairs of non-parallel lines determines a rigid transformation up to a $180^\circ$-rotation ambiguity, and this ambiguity can be broken if we know the position of a third line during verification.

One way to encode a third line in terms of a pair of non-parallel lines in a rigid-invariant way is to find a transformation $P$ which maps the first basis line to the $x$-axis and maps the second basis line so that its intersection with the first basis line is the origin, and then to apply $P$ to the third line.

A rigid transformation $T$ in image space has the form

\[
T = \begin{pmatrix} R & \mathbf{b} \\ \mathbf{0}^t & 1 \end{pmatrix}
  = \begin{pmatrix} \cos\phi & -\sin\phi & b_1 \\ \sin\phi & \cos\phi & b_2 \\ 0 & 0 & 1 \end{pmatrix},
\]

where $R$ is the rotation matrix and $\mathbf{b}$ the translation vector. This corresponds to the parameter-space transformation $P$, defined by (cf. section 4.1.2)

\[
P = (T^{-1})^t = \begin{pmatrix} (R^{-1})^t & \mathbf{0} \\ -\mathbf{b}^t(R^{-1})^t & 1 \end{pmatrix}
  = \begin{pmatrix} R & \mathbf{0} \\ -\mathbf{b}^t R & 1 \end{pmatrix},
\]

so that

\[
P^{-1} = T^t = \begin{pmatrix} R^t & \mathbf{0} \\ \mathbf{b}^t & 1 \end{pmatrix}. \tag{4.2}
\]

Let a line be parameterized as $x\cos\theta + y\sin\theta - r = 0$ and let the two basis lines be represented by their parameter vectors $\mathbf{a}_1$ and $\mathbf{a}_2$, such that

\[
\mathbf{a}_1 = (\cos\theta_1, \sin\theta_1, -r_1)^t, \qquad
\mathbf{a}_2 = (\cos\theta_2, \sin\theta_2, -r_2)^t,
\]

which are to be mapped by $P$ to

\[
\mathbf{e}_1 = (\cos\tfrac{\pi}{2}, \sin\tfrac{\pi}{2}, 0)^t = (0, 1, 0)^t, \qquad
\mathbf{e}_2 = (\cos\psi, \sin\psi, 0)^t = (C, S, 0)^t.
\]

Note that $\psi$ is fixed, though unknown. In this way, $P$ maps the first basis line to the $x$-axis (or, $y = 0$) and the intersection of the two basis lines to the origin.

Since $ax + by + c = 0$ and $\lambda ax + \lambda by + \lambda c = 0$, $\lambda \neq 0$, represent the same line, we have

\[
\lambda_1 P\mathbf{a}_1 = \mathbf{e}_1, \qquad \lambda_2 P\mathbf{a}_2 = \mathbf{e}_2,
\]

where $\lambda_1$ and $\lambda_2$ are non-zero constants. Equivalently,

\[
P(\lambda_1\mathbf{a}_1, \lambda_2\mathbf{a}_2) = (\mathbf{e}_1, \mathbf{e}_2).
\]

Then

\[
(\lambda_1\mathbf{a}_1, \lambda_2\mathbf{a}_2) = P^{-1}(\mathbf{e}_1, \mathbf{e}_2)
= \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ b_1 & b_2 & 1 \end{pmatrix}
  \begin{pmatrix} 0 & C \\ 1 & S \\ 0 & 0 \end{pmatrix}
= \begin{pmatrix} \sin\phi & C\cos\phi + S\sin\phi \\ \cos\phi & -C\sin\phi + S\cos\phi \\ b_2 & Cb_1 + Sb_2 \end{pmatrix}.
\]


Thus

\[
\lambda_1 = \frac{\sin\phi}{\cos\theta_1} = \frac{\cos\phi}{\sin\theta_1} = \frac{b_2}{-r_1}, \tag{4.3}
\]
\[
\lambda_2 = \frac{C\cos\phi + S\sin\phi}{\cos\theta_2} = \frac{-C\sin\phi + S\cos\phi}{\sin\theta_2} = \frac{Cb_1 + Sb_2}{-r_2}. \tag{4.4}
\]

From Eq. (4.4), we obtain

\[
C = \lambda_2\cos(\theta_2 + \phi), \tag{4.5}
\]
\[
S = \lambda_2\sin(\theta_2 + \phi), \tag{4.6}
\]
\[
b_1 = \frac{-\lambda_2 r_2 - Sb_2}{C}, \tag{4.7}
\]

and from Eq. (4.3) we have $\sin\phi\sin\theta_1 = \cos\phi\cos\theta_1$, and thus $\phi = \frac{\pi}{2} - \theta_1$ or $\phi = -\frac{\pi}{2} - \theta_1$.

When $\phi = \frac{\pi}{2} - \theta_1$, we have $\lambda_1 = 1$ and $b_2 = -r_1$. Substituting these and Eqs. (4.5), (4.6) into Eq. (4.7), we obtain $b_1 = -(r_2 - r_1\cos(\theta_1 - \theta_2))/\sin(\theta_1 - \theta_2)$. Thus, from Eq. (4.2) we obtain

\[
P = \begin{pmatrix}
\sin\theta_1 & -\cos\theta_1 & 0 \\
\cos\theta_1 & \sin\theta_1 & 0 \\
\csc(\theta_1-\theta_2)(r_2\sin\theta_1 - r_1\sin\theta_2) & \csc(\theta_1-\theta_2)(-r_2\cos\theta_1 + r_1\cos\theta_2) & 1
\end{pmatrix}.
\]

When $\phi = -\frac{\pi}{2} - \theta_1$, we have $\lambda_1 = -1$ and $b_2 = r_1$. Substituting these and Eqs. (4.5), (4.6) into Eq. (4.7), we obtain $b_1 = (r_2 - r_1\cos(\theta_1 - \theta_2))/\sin(\theta_1 - \theta_2)$. Thus, from Eq. (4.2) we obtain

\[
P = \begin{pmatrix}
-\sin\theta_1 & \cos\theta_1 & 0 \\
-\cos\theta_1 & -\sin\theta_1 & 0 \\
\csc(\theta_1-\theta_2)(r_2\sin\theta_1 - r_1\sin\theta_2) & \csc(\theta_1-\theta_2)(-r_2\cos\theta_1 + r_1\cos\theta_2) & 1
\end{pmatrix}.
\]

From section 4.1.3, the invariant $(\theta', r')^t$ of a line $(\theta, r)^t$ can be obtained from the basis lines as

\[
(\theta', r')^t = H((\theta, r)^t) = G(P(\cos\theta, \sin\theta, -r)^t) = G(\mathbf{a}'),
\]

where

\[
\mathbf{a}' = \begin{pmatrix}
-\sin(\theta-\theta_1) \\
\cos(\theta-\theta_1) \\
-r - r_2\csc(\theta_1-\theta_2)\sin(\theta-\theta_1) + r_1\csc(\theta_1-\theta_2)\sin(\theta-\theta_2)
\end{pmatrix}
\]

or

\[
\mathbf{a}' = \begin{pmatrix}
\sin(\theta-\theta_1) \\
-\cos(\theta-\theta_1) \\
-r - r_2\csc(\theta_1-\theta_2)\sin(\theta-\theta_1) + r_1\csc(\theta_1-\theta_2)\sin(\theta-\theta_2)
\end{pmatrix},
\]

depending upon which $P$ is used, and hence

\[
\theta' = \tan^{-1}\!\left(\frac{\cos(\theta-\theta_1)}{-\sin(\theta-\theta_1)}\right)
       = \tan^{-1}\!\left(\frac{\sin(\theta-\theta_1+\frac{\pi}{2})}{\cos(\theta-\theta_1+\frac{\pi}{2})}\right)
       = \theta - \theta_1 + \frac{\pi}{2},
\]
\[
r' = r + r_2\csc(\theta_1-\theta_2)\sin(\theta-\theta_1) - r_1\csc(\theta_1-\theta_2)\sin(\theta-\theta_2),
\]

or

\[
\theta' = \tan^{-1}\!\left(\frac{-\cos(\theta-\theta_1)}{\sin(\theta-\theta_1)}\right)
       = \tan^{-1}\!\left(\frac{\sin(\theta-\theta_1-\frac{\pi}{2})}{\cos(\theta-\theta_1-\frac{\pi}{2})}\right)
       = \theta - \theta_1 - \frac{\pi}{2},
\]
\[
r' = r + r_2\csc(\theta_1-\theta_2)\sin(\theta-\theta_1) - r_1\csc(\theta_1-\theta_2)\sin(\theta-\theta_2).
\]

We may store each encoded invariant $(\theta', r')$ redundantly in two entries of the hash table, $(\theta', r')$ and $(\theta', -r')$, during preprocessing. Then we may hit a match with either $(\theta', r')$ or $(\theta', -r')$ as the computed scene invariant during recognition. (Note that under a $180^\circ$ rotation, the computed invariants would be $(\theta, r)$ and $(\theta + \pi, r)$. However, we restrict $\theta$ to lie in $[0, \pi)$, which makes the invariants $(\theta, r)$ and $(\theta, -r)$, with $\theta$ in $[0, \pi)$.)
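For reference, the first of the two encodings above can be computed as in the following sketch; the folding step at the end implements the adjustment of theta' into $[0, \pi)$, with the sign of r' flipped accordingly, as described in section 4.2.

    import numpy as np

    def rigid_invariant(theta, r, theta1, r1, theta2, r2):
        # First of the two P's: theta' = theta - theta1 + pi/2.
        tp = theta - theta1 + np.pi / 2.0
        csc = 1.0 / np.sin(theta1 - theta2)    # basis lines must not be parallel
        rp = (r + r2 * csc * np.sin(theta - theta1)
                - r1 * csc * np.sin(theta - theta2))
        k = np.floor(tp / np.pi)               # fold theta' into [0, pi),
        return tp - k * np.pi, rp * (-1.0) ** k  # flipping r' sign accordingly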

4.3.2 Similarity Line Invariants

To determine a similarity transformation uniquely, we need a correspondence between a pair of triplets of lines, not all of which are parallel to each other or intersect at a common point. Without loss of generality, we may assume that the first basis line intersects the other two basis lines. One way to encode a fourth line in a similarity-invariant way is to find a transformation $P$ which maps the first basis line to the $x$-axis, maps the second basis line so that its intersection with the first basis line is the origin, and maps the third basis line so that its intersection with the first basis line has Euclidean coordinates $(1, 0)$, and then to apply $P$ to the fourth line.


A similarity transformation $T$ in image space has the form

\[
T = \begin{pmatrix} sR & \mathbf{b} \\ \mathbf{0}^t & 1 \end{pmatrix}
  = \begin{pmatrix} s\cos\phi & -s\sin\phi & b_1 \\ s\sin\phi & s\cos\phi & b_2 \\ 0 & 0 & 1 \end{pmatrix},
\]

where $s$ is the scaling factor, $R$ the rotation matrix, and $\mathbf{b}$ the translation vector. This corresponds to the parameter-space transformation $P$, defined by (cf. section 4.1.2)

\[
P = (T^{-1})^t = \begin{pmatrix} \frac{1}{s}(R^{-1})^t & \mathbf{0} \\ -\frac{1}{s}\mathbf{b}^t(R^{-1})^t & 1 \end{pmatrix}
  = \begin{pmatrix} \frac{1}{s}R & \mathbf{0} \\ -\frac{1}{s}\mathbf{b}^t R & 1 \end{pmatrix},
\]

so that

\[
P^{-1} = T^t = \begin{pmatrix} sR^t & \mathbf{0} \\ \mathbf{b}^t & 1 \end{pmatrix}. \tag{4.8}
\]

Let a line be parameterized as $x\cos\theta + y\sin\theta - r = 0$ and let the three basis lines, the first of which is assumed not to be parallel to the others, be represented by their parameter vectors $\mathbf{a}_1$, $\mathbf{a}_2$ and $\mathbf{a}_3$, such that

\[
\mathbf{a}_1 = (\cos\theta_1, \sin\theta_1, -r_1)^t, \qquad
\mathbf{a}_2 = (\cos\theta_2, \sin\theta_2, -r_2)^t, \qquad
\mathbf{a}_3 = (\cos\theta_3, \sin\theta_3, -r_3)^t,
\]

which are to be mapped by $P$ to

\[
\mathbf{e}_1 = (\cos\tfrac{\pi}{2}, \sin\tfrac{\pi}{2}, 0)^t = (0, 1, 0)^t,
\]
\[
\mathbf{e}_2 = (\cos\psi_1, \sin\psi_1, 0)^t = (C_1, S_1, 0)^t,
\]
\[
\mathbf{e}_3 = (\cos\psi_2, \sin\psi_2, -\sin|\theta_1 - \theta_3|)^t = (C_2, S_2, -\sin|\theta_1 - \theta_3|)^t.
\]

Note that $\psi_1$ and $\psi_2$ are fixed, though unknown. Also note that, since we fixed the third component of $\mathbf{e}_3$ to be $-\sin|\theta_1 - \theta_3| < 0$, $\psi_2$ ranges within $(-\frac{\pi}{2}, \frac{\pi}{2})$ (or, $C_2 > 0$). In this way, $P$ maps the first basis line to the $x$-axis; the intersection of the first and second basis lines to the origin; and the intersection of the first and third basis lines to $(1, 0)^t$.


Since $ax + by + c = 0$ and $\lambda ax + \lambda by + \lambda c = 0$, $\lambda \neq 0$, represent the same line, we have

\[
\lambda_1 P\mathbf{a}_1 = \mathbf{e}_1, \qquad
\lambda_2 P\mathbf{a}_2 = \mathbf{e}_2, \qquad
\lambda_3 P\mathbf{a}_3 = \mathbf{e}_3,
\]

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are non-zero constants. Equivalently,

\[
P(\lambda_1\mathbf{a}_1, \lambda_2\mathbf{a}_2, \lambda_3\mathbf{a}_3) = (\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3).
\]

Then

\[
(\lambda_1\mathbf{a}_1, \lambda_2\mathbf{a}_2, \lambda_3\mathbf{a}_3) = P^{-1}(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)
= \begin{pmatrix} s\cos\phi & s\sin\phi & 0 \\ -s\sin\phi & s\cos\phi & 0 \\ b_1 & b_2 & 1 \end{pmatrix}
  \begin{pmatrix} 0 & C_1 & C_2 \\ 1 & S_1 & S_2 \\ 0 & 0 & -\sin|\theta_1-\theta_3| \end{pmatrix}
\]
\[
= \begin{pmatrix}
s\sin\phi & sC_1\cos\phi + sS_1\sin\phi & sC_2\cos\phi + sS_2\sin\phi \\
s\cos\phi & -sC_1\sin\phi + sS_1\cos\phi & -sC_2\sin\phi + sS_2\cos\phi \\
b_2 & C_1 b_1 + S_1 b_2 & C_2 b_1 + S_2 b_2 - \sin|\theta_1-\theta_3|
\end{pmatrix}.
\]

Thus

\[
\lambda_1 = \frac{s\sin\phi}{\cos\theta_1} = \frac{s\cos\phi}{\sin\theta_1} = \frac{b_2}{-r_1}, \tag{4.9}
\]
\[
\lambda_2 = \frac{s(C_1\cos\phi + S_1\sin\phi)}{\cos\theta_2} = \frac{s(-C_1\sin\phi + S_1\cos\phi)}{\sin\theta_2} = \frac{C_1 b_1 + S_1 b_2}{-r_2}, \tag{4.10}
\]
\[
\lambda_3 = \frac{s(C_2\cos\phi + S_2\sin\phi)}{\cos\theta_3} = \frac{s(-C_2\sin\phi + S_2\cos\phi)}{\sin\theta_3} = \frac{C_2 b_1 + S_2 b_2 - \sin|\theta_1-\theta_3|}{-r_3}. \tag{4.11}
\]


From Eq. (4.9), we have $s\sin\phi\sin\theta_1 = s\cos\phi\cos\theta_1$, and thus $\phi = \frac{\pi}{2} - \theta_1$ or $\phi = -\frac{\pi}{2} - \theta_1$. From Eq. (4.10), we obtain

\[
C_1 = \frac{\lambda_2}{s}\cos(\theta_2 + \phi), \qquad
S_1 = \frac{\lambda_2}{s}\sin(\theta_2 + \phi), \qquad
b_1 = \frac{-\lambda_2 r_2 - S_1 b_2}{C_1}. \tag{4.12}
\]

From Eq. (4.11), we obtain

\[
C_2 = \frac{\lambda_3}{s}\cos(\theta_3 + \phi), \qquad
S_2 = \frac{\lambda_3}{s}\sin(\theta_3 + \phi), \qquad
b_1 = \frac{-\lambda_3 r_3 - S_2 b_2 + \sin|\theta_1-\theta_3|}{C_2}, \tag{4.13}
\]

and since $C_2^2 + S_2^2 = 1$, we have $\lambda_3 = \pm s$.

When $\phi = \frac{\pi}{2} - \theta_1$, we have $\lambda_1 = s$ and $b_2 = -sr_1$; when $\phi = -\frac{\pi}{2} - \theta_1$, we have $\lambda_1 = -s$ and $b_2 = sr_1$. In both cases, the two expressions for $b_1$ in Eq. (4.12) and Eq. (4.13) must be consistent, and we can solve to get

\[
\lambda_3 = \frac{\sin(\theta_1-\theta_2)\sin|\theta_1-\theta_3|}{r_1\sin(\theta_2-\theta_3) + r_2\sin(\theta_3-\theta_1) + r_3\sin(\theta_1-\theta_2)}.
\]

Since we require $C_2 > 0$, we have $\lambda_3 = s$ when $\theta_1 < \theta_3$ and $\phi = -\frac{\pi}{2} - \theta_1$, or $\theta_1 > \theta_3$ and $\phi = \frac{\pi}{2} - \theta_1$; and we have $\lambda_3 = -s$ when $\theta_1 < \theta_3$ and $\phi = \frac{\pi}{2} - \theta_1$, or $\theta_1 > \theta_3$ and $\phi = -\frac{\pi}{2} - \theta_1$.

To summarize, we have four cases as follows.

Case 1: $\theta_1 < \theta_3$ (thus $\sin|\theta_1 - \theta_3| = -\sin(\theta_1 - \theta_3)$).

Case 1.1:
$$
\begin{aligned}
\psi &= \tfrac{\pi}{2} - \theta_1,\\
s &= \frac{\sin(\theta_1 - \theta_2)\sin(\theta_1 - \theta_3)}
{r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2)},\\
b_1 &= \frac{s\,(-r_2 + r_1\cos(\theta_1 - \theta_2))}{\sin(\theta_1 - \theta_2)},\\
b_2 &= -s r_1.
\end{aligned}
$$

Case 1.2:
$$
\begin{aligned}
\psi &= -\tfrac{\pi}{2} - \theta_1,\\
s &= -\frac{\sin(\theta_1 - \theta_2)\sin(\theta_1 - \theta_3)}
{r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2)},\\
b_1 &= -\frac{s\,(-r_2 + r_1\cos(\theta_1 - \theta_2))}{\sin(\theta_1 - \theta_2)},\\
b_2 &= s r_1.
\end{aligned}
$$

Case 2: $\theta_1 > \theta_3$ (thus $\sin|\theta_1 - \theta_3| = \sin(\theta_1 - \theta_3)$).

Case 2.1: $\psi = \tfrac{\pi}{2} - \theta_1$, with $s$, $b_1$, $b_2$ given by the same expressions as in case 1.1.

Case 2.2: $\psi = -\tfrac{\pi}{2} - \theta_1$, with $s$, $b_1$, $b_2$ given by the same expressions as in case 1.2.

Note that case 1.1 and case 2.1 are identical, and case 1.2 and case 2.2 are identical. The transformations $P$ derived from case 1.1 and case 1.2 are again identical. We conclude that the above four cases result in
$$
P = \frac{1}{D}\begin{pmatrix}
A\sin\theta_1 & -A\cos\theta_1 & 0\\
A\cos\theta_1 & A\sin\theta_1 & 0\\
(r_2\sin\theta_1 - r_1\sin\theta_2)\sin(\theta_1 - \theta_3) & -(r_2\cos\theta_1 - r_1\cos\theta_2)\sin(\theta_1 - \theta_3) & D
\end{pmatrix},
$$


where
$$
\begin{aligned}
A &= r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2),\\
D &= \sin(\theta_1 - \theta_2)\sin(\theta_1 - \theta_3).
\end{aligned}
$$
From section 4.1.3, the invariant $(\theta', r')^t$ of a line $(\theta, r)^t$ can be obtained from the basis lines as follows:
$$
(\theta', r')^t = H((\theta, r)^t) = G(P(\cos\theta, \sin\theta, -r)^t) = G(a'),
$$
where
$$
a' = \frac{1}{D}\begin{pmatrix}
-A\sin(\theta - \theta_1)\\
A\cos(\theta - \theta_1)\\
\big(-r_2\sin(\theta - \theta_1) + r_1\sin(\theta - \theta_2) - r\sin(\theta_1 - \theta_2)\big)\sin(\theta_1 - \theta_3)
\end{pmatrix}
$$
and hence
$$
\theta' = \tan^{-1}\!\Big(\frac{A\cos(\theta - \theta_1)}{-A\sin(\theta - \theta_1)}\Big)
        = \tan^{-1}\!\Big(\frac{\sin(\theta - \theta_1 + \tfrac{\pi}{2})}{\cos(\theta - \theta_1 + \tfrac{\pi}{2})}\Big)
        = \theta - \theta_1 + \tfrac{\pi}{2},
$$
$$
r' = \big(r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2) + r\sin(\theta_1 - \theta_2)\big)\sin(\theta_1 - \theta_3)\,/\,|A|.
$$
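As a sanity check on this closed form, the following sketch evaluates the similarity invariant numerically. It is a minimal transcription in Python with NumPy, not code from the thesis; the final fold of $\theta'$ into $[0, \pi)$, with the sign of $r'$ flipped, anticipates the adjustment discussed in section 5.2.

```python
import numpy as np

def similarity_invariant(theta, r, basis):
    """Similarity invariant (theta', r') of line (theta, r) w.r.t. three
    basis lines [(theta1, r1), (theta2, r2), (theta3, r3)].
    Hypothetical helper transcribing the closed form above."""
    (t1, r1), (t2, r2), (t3, r3) = basis
    A = r1 * np.sin(t2 - t3) + r2 * np.sin(t3 - t1) + r3 * np.sin(t1 - t2)
    theta_p = theta - t1 + np.pi / 2.0
    r_p = ((r2 * np.sin(theta - t1) - r1 * np.sin(theta - t2)
            + r * np.sin(t1 - t2)) * np.sin(t1 - t3)) / abs(A)
    # Fold theta' into [0, pi), flipping the sign of r' accordingly
    # (the adjustment described in section 5.2).
    theta_p = np.mod(theta_p, 2.0 * np.pi)
    if theta_p >= np.pi:
        theta_p -= np.pi
        r_p = -r_p
    return theta_p, r_p
```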

4.3.3 Affine Line Invariants

To determine an affine transformation uniquely, we need a correspondence of two ordered triplets of lines, which are not parallel to one another and do not intersect at a common point.

One way to encode a fourth line in an affine-invariant way is to find a transformation $P$ which maps the three basis lines to a canonical basis, say $x = 0$, $y = 0$ and $x + y = 1$, and then apply $P$ to the fourth line.

An affine transformation $T$ in image space has the form
$$
T = \begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}
  = \begin{pmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ 0 & 0 & 1 \end{pmatrix},
$$
where $A$ is the skewing matrix; $b$, the translation vector. This corresponds to the parameter-space transformation $P$, defined by (cf. section 4.1.2)
$$
P = (T^{-1})^t = \begin{pmatrix} (A^{-1})^t & 0 \\ -b^t(A^{-1})^t & 1 \end{pmatrix}, \tag{4.14}
$$
so that
$$
P^{-1} = T^t = \begin{pmatrix} A^t & 0 \\ b^t & 1 \end{pmatrix}.
$$
Let a line be parameterized as $x\cos\theta + y\sin\theta - r = 0$ and let three additional basis lines be represented by their parameters $a_1$, $a_2$ and $a_3$, such that
$$
a_1 = (\cos\theta_1, \sin\theta_1, -r_1)^t,\quad
a_2 = (\cos\theta_2, \sin\theta_2, -r_2)^t,\quad
a_3 = (\cos\theta_3, \sin\theta_3, -r_3)^t,
$$
which are to be mapped by $P$ to
$$
\begin{aligned}
e_1 &= (\cos 0, \sin 0, 0)^t = (1, 0, 0)^t,\\
e_2 &= (\cos\tfrac{\pi}{2}, \sin\tfrac{\pi}{2}, 0)^t = (0, 1, 0)^t,\\
e_3 &= (\cos\tfrac{\pi}{4}, \sin\tfrac{\pi}{4}, -\tfrac{1}{\sqrt{2}})^t = (\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}})^t.
\end{aligned}
$$
In this way, $P$ maps the first basis line to the $y$-axis; the second basis line to the $x$-axis; the third basis line to $x + y = 1$.

Since $ax + by + c = 0$ and $\lambda ax + \lambda by + \lambda c = 0$, $\lambda \neq 0$, represent the same line, we have
$$
\lambda_1 P a_1 = e_1, \quad \lambda_2 P a_2 = e_2, \quad \lambda_3 P a_3 = e_3,
$$
where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are non-zero constants. Equivalently,
$$
P(\lambda_1 a_1, \lambda_2 a_2, \lambda_3 a_3) = (e_1, e_2, e_3).
$$
Then
$$
\begin{aligned}
(\lambda_1 a_1, \lambda_2 a_2, \lambda_3 a_3) &= P^{-1}(e_1, e_2, e_3)\\
&= \begin{pmatrix} a_{11} & a_{21} & 0 \\ a_{12} & a_{22} & 0 \\ b_1 & b_2 & 1 \end{pmatrix}
   \begin{pmatrix} 1 & 0 & \tfrac{1}{\sqrt{2}} \\ 0 & 1 & \tfrac{1}{\sqrt{2}} \\ 0 & 0 & -\tfrac{1}{\sqrt{2}} \end{pmatrix}\\
&= \begin{pmatrix}
a_{11} & a_{21} & \tfrac{1}{\sqrt{2}}(a_{11} + a_{21})\\
a_{12} & a_{22} & \tfrac{1}{\sqrt{2}}(a_{12} + a_{22})\\
b_1 & b_2 & \tfrac{1}{\sqrt{2}}(b_1 + b_2 - 1)
\end{pmatrix}.
\end{aligned}
$$


Thus
$$
\lambda_1 = \frac{a_{11}}{\cos\theta_1} = \frac{a_{12}}{\sin\theta_1} = \frac{b_1}{-r_1}, \tag{4.15}
$$
$$
\lambda_2 = \frac{a_{21}}{\cos\theta_2} = \frac{a_{22}}{\sin\theta_2} = \frac{b_2}{-r_2}, \tag{4.16}
$$
$$
\lambda_3 = \frac{\tfrac{1}{\sqrt{2}}(a_{11} + a_{21})}{\cos\theta_3}
          = \frac{\tfrac{1}{\sqrt{2}}(a_{12} + a_{22})}{\sin\theta_3}
          = \frac{\tfrac{1}{\sqrt{2}}(b_1 + b_2 - 1)}{-r_3}. \tag{4.17}
$$
From Eq. (4.15), we obtain
$$
a_{11} = \lambda_1\cos\theta_1, \quad a_{12} = \lambda_1\sin\theta_1 \quad\text{and}\quad b_1 = -\lambda_1 r_1. \tag{4.18}
$$
From Eq. (4.16), we obtain
$$
a_{21} = \lambda_2\cos\theta_2, \quad a_{22} = \lambda_2\sin\theta_2 \quad\text{and}\quad b_2 = -\lambda_2 r_2. \tag{4.19}
$$
From Eq. (4.17), we obtain
$$
a_{11} + a_{21} = \lambda_3\sqrt{2}\cos\theta_3, \quad
a_{12} + a_{22} = \lambda_3\sqrt{2}\sin\theta_3 \quad\text{and}\quad
b_1 + b_2 - 1 = -\lambda_3\sqrt{2}\, r_3. \tag{4.20}
$$
Substituting Eq. (4.18) and Eq. (4.19) into Eq. (4.20), we have
$$
\begin{aligned}
\lambda_1\cos\theta_1 + \lambda_2\cos\theta_2 - \lambda_3\sqrt{2}\cos\theta_3 &= 0,\\
\lambda_1\sin\theta_1 + \lambda_2\sin\theta_2 - \lambda_3\sqrt{2}\sin\theta_3 &= 0,\\
\lambda_1 r_1 + \lambda_2 r_2 - \lambda_3\sqrt{2}\, r_3 &= -1,
\end{aligned}
$$
and hence
$$
\begin{aligned}
\lambda_1 &= -\sin(\theta_2 - \theta_3)\,/\,\big(r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2)\big),\\
\lambda_2 &= -\sin(\theta_3 - \theta_1)\,/\,\big(r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2)\big).
\end{aligned}
$$
Substituting $\lambda_1$ and $\lambda_2$ back into Eq. (4.18) and Eq. (4.19), we immediately obtain
$$
\begin{aligned}
a_{11} &= -\cos\theta_1\sin(\theta_2 - \theta_3)\,/\,(\cdots), &
a_{12} &= -\sin\theta_1\sin(\theta_2 - \theta_3)\,/\,(\cdots),\\
a_{21} &= -\cos\theta_2\sin(\theta_3 - \theta_1)\,/\,(\cdots), &
a_{22} &= -\sin\theta_2\sin(\theta_3 - \theta_1)\,/\,(\cdots),\\
b_1 &= r_1\sin(\theta_2 - \theta_3)\,/\,(\cdots), &
b_2 &= r_2\sin(\theta_3 - \theta_1)\,/\,(\cdots),
\end{aligned}
$$
where each $(\cdots)$ denotes the same denominator $r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2)$.


Thus the transformation $P$ in parameter space can be obtained from Eq. (4.14) as
$$
P = \begin{pmatrix}
A\csc(\theta_1 - \theta_2)\csc(\theta_2 - \theta_3)\sin\theta_2 & -A\csc(\theta_1 - \theta_2)\csc(\theta_2 - \theta_3)\cos\theta_2 & 0\\
A\csc(\theta_1 - \theta_2)\csc(\theta_1 - \theta_3)\sin\theta_1 & -A\csc(\theta_1 - \theta_2)\csc(\theta_1 - \theta_3)\cos\theta_1 & 0\\
\csc(\theta_1 - \theta_2)(r_2\sin\theta_1 - r_1\sin\theta_2) & -\csc(\theta_1 - \theta_2)(r_2\cos\theta_1 - r_1\cos\theta_2) & 1
\end{pmatrix},
$$
where
$$
A = r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2).
$$
From section 4.1.3, the invariant $(\theta', r')^t$ of a line $(\theta, r)^t$ can be obtained from the basis lines as follows:
$$
(\theta', r')^t = H((\theta, r)^t) = G(P(\cos\theta, \sin\theta, -r)^t) = G(a'),
$$
where
$$
a' = \begin{pmatrix}
-\csc(\theta_1 - \theta_2)\csc(\theta_2 - \theta_3)\sin(\theta - \theta_2)\,\big(r_3\sin(\theta_1 - \theta_2) + r_2\sin(\theta_3 - \theta_1) + r_1\sin(\theta_2 - \theta_3)\big)\\
-\csc(\theta_1 - \theta_2)\csc(\theta_1 - \theta_3)\sin(\theta - \theta_1)\,\big(r_3\sin(\theta_1 - \theta_2) + r_2\sin(\theta_3 - \theta_1) + r_1\sin(\theta_2 - \theta_3)\big)\\
r_1\csc(\theta_1 - \theta_2)\sin(\theta - \theta_2) - r_2\csc(\theta_1 - \theta_2)\sin(\theta - \theta_1) - r
\end{pmatrix}
$$
and hence
$$
\theta' = \tan^{-1}\!\big(A\csc(\theta_1 - \theta_3)\sin(\theta_1 - \theta)\csc(\theta_1 - \theta_2)\,/\,
A\csc(\theta_2 - \theta_3)\sin(\theta_2 - \theta)\csc(\theta_1 - \theta_2)\big), \tag{4.21}
$$
$$
r' = \frac{1}{\eta}\big(r_1\csc(\theta_1 - \theta_2)\sin(\theta_2 - \theta) - r_2\csc(\theta_1 - \theta_2)\sin(\theta_1 - \theta) + r\big), \tag{4.22}
$$
where
$$
\eta = \sqrt{\csc^2(\theta_1 - \theta_3)\sin^2(\theta_1 - \theta) + \csc^2(\theta_2 - \theta_3)\sin^2(\theta_2 - \theta)}\;
\big|A\csc(\theta_1 - \theta_2)\big|,
$$
$$
A = r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2).
$$
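The affine invariant of Eqs. (4.21) and (4.22) can be transcribed the same way. The sketch below is illustrative only (the implementation in chapter 7 additionally uses trigonometric look-up tables); `arctan2` resolves the quadrant of the $\tan^{-1}$ in Eq. (4.21).

```python
import numpy as np

def affine_invariant(theta, r, basis):
    """Affine invariant (theta', r') of line (theta, r) w.r.t. three basis
    lines, following Eqs. (4.21)-(4.22). Hypothetical helper, not thesis code."""
    (t1, r1), (t2, r2), (t3, r3) = basis
    A = r1 * np.sin(t2 - t3) + r2 * np.sin(t3 - t1) + r3 * np.sin(t1 - t2)
    csc12 = 1.0 / np.sin(t1 - t2)
    csc13 = 1.0 / np.sin(t1 - t3)
    csc23 = 1.0 / np.sin(t2 - t3)
    # Numerator/denominator of Eq. (4.21), i.e. the first two components
    # of a' = P (cos t, sin t, -r)^t up to a common positive factor.
    num = A * csc13 * np.sin(t1 - theta) * csc12
    den = A * csc23 * np.sin(t2 - theta) * csc12
    theta_p = np.arctan2(num, den)
    eta = np.sqrt(csc13**2 * np.sin(t1 - theta)**2
                  + csc23**2 * np.sin(t2 - theta)**2) * abs(A * csc12)
    r_p = (r1 * csc12 * np.sin(t2 - theta)
           - r2 * csc12 * np.sin(t1 - theta) + r) / eta
    if theta_p < 0:                  # fold theta' into [0, pi), flip r'
        theta_p += np.pi
        r_p = -r_p
    return theta_p, r_p
```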


4.3.4 Projective Line Invariants

Although a projective transformation is fractional linear, one can treat it as a linear transformation by using homogeneous coordinates.

To determine a projective transformation uniquely, we need a correspondence of two ordered quadruplets of lines such that, in each quadruplet, no three lines are parallel to one another or intersect at a common point (thus no three points are collinear in parameter space).

One way to encode a fifth line in a projective-invariant way is to find a transformation $P$ which maps the four basis lines to a canonical basis, say $x = 0$, $y = 0$, $x = 1$ and $y = 1$, and then apply $P$ to the fifth line.

A projective transformation $T$ in image space has the form
$$
T = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
$$
This corresponds to the parameter-space transformation $P$, defined by (cf. section 4.1.2)
$$
P = (T^{-1})^t = \frac{1}{D}\begin{pmatrix}
a_{22}a_{33} - a_{23}a_{32} & a_{23}a_{31} - a_{21}a_{33} & a_{21}a_{32} - a_{22}a_{31}\\
a_{13}a_{32} - a_{12}a_{33} & a_{11}a_{33} - a_{13}a_{31} & a_{12}a_{31} - a_{11}a_{32}\\
a_{12}a_{23} - a_{13}a_{22} & a_{13}a_{21} - a_{11}a_{23} & a_{11}a_{22} - a_{12}a_{21}
\end{pmatrix}, \tag{4.23}
$$
where
$$
D = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31},
$$
so that
$$
P^{-1} = T^t = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}.
$$
Let a line be parameterized as $x\cos\theta + y\sin\theta - r = 0$ and let four additional basis lines be represented by their parameters $a_1$, $a_2$, $a_3$ and $a_4$, such that
$$
a_i = (\cos\theta_i, \sin\theta_i, -r_i)^t, \quad i = 1, 2, 3, 4,
$$


which are to be mapped by $P$ to
$$
\begin{aligned}
e_1 &= (\cos 0, \sin 0, 0)^t = (1, 0, 0)^t,\\
e_2 &= (\cos\tfrac{\pi}{2}, \sin\tfrac{\pi}{2}, 0)^t = (0, 1, 0)^t,\\
e_3 &= (\cos 0, \sin 0, -1)^t = (1, 0, -1)^t,\\
e_4 &= (\cos\tfrac{\pi}{2}, \sin\tfrac{\pi}{2}, -1)^t = (0, 1, -1)^t.
\end{aligned}
$$
In this way, $P$ maps the first basis line to the $y$-axis; the second basis line to the $x$-axis; the third basis line to $x = 1$; the fourth basis line to $y = 1$.

Since $ax + by + c = 0$ and $\lambda ax + \lambda by + \lambda c = 0$, $\lambda \neq 0$, represent the same line, we have
$$
\lambda_1 P a_1 = e_1, \quad \lambda_2 P a_2 = e_2, \quad \lambda_3 P a_3 = e_3, \quad \lambda_4 P a_4 = e_4,
$$
where $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are constants. Equivalently,
$$
P(\lambda_1 a_1, \lambda_2 a_2, \lambda_3 a_3, \lambda_4 a_4) = (e_1, e_2, e_3, e_4).
$$
Then
$$
\begin{aligned}
(\lambda_1 a_1, \lambda_2 a_2, \lambda_3 a_3, \lambda_4 a_4) &= P^{-1}(e_1, e_2, e_3, e_4)\\
&= \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
   \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & -1 & -1 \end{pmatrix}\\
&= \begin{pmatrix}
a_{11} & a_{21} & a_{11} - a_{31} & a_{21} - a_{31}\\
a_{12} & a_{22} & a_{12} - a_{32} & a_{22} - a_{32}\\
a_{13} & a_{23} & a_{13} - a_{33} & a_{23} - a_{33}
\end{pmatrix}.
\end{aligned}
$$
We have
$$
\lambda_1 = \frac{a_{11}}{\cos\theta_1} = \frac{a_{12}}{\sin\theta_1} = \frac{a_{13}}{-r_1}, \tag{4.24}
$$
$$
\lambda_2 = \frac{a_{21}}{\cos\theta_2} = \frac{a_{22}}{\sin\theta_2} = \frac{a_{23}}{-r_2}, \tag{4.25}
$$
$$
\lambda_3 = \frac{a_{11} - a_{31}}{\cos\theta_3} = \frac{a_{12} - a_{32}}{\sin\theta_3} = \frac{a_{13} - a_{33}}{-r_3}, \tag{4.26}
$$
$$
\lambda_4 = \frac{a_{21} - a_{31}}{\cos\theta_4} = \frac{a_{22} - a_{32}}{\sin\theta_4} = \frac{a_{23} - a_{33}}{-r_4}. \tag{4.27}
$$
,r4


From Eq. (4.24), we obtain
$$
a_{11} = \lambda_1\cos\theta_1, \quad a_{12} = \lambda_1\sin\theta_1 \quad\text{and}\quad a_{13} = -\lambda_1 r_1. \tag{4.28}
$$
From Eq. (4.25), we obtain
$$
a_{21} = \lambda_2\cos\theta_2, \quad a_{22} = \lambda_2\sin\theta_2 \quad\text{and}\quad a_{23} = -\lambda_2 r_2. \tag{4.29}
$$
From Eq. (4.26), we obtain
$$
a_{31} = a_{11} - \lambda_3\cos\theta_3, \quad a_{32} = a_{12} - \lambda_3\sin\theta_3 \quad\text{and}\quad a_{33} = a_{13} + \lambda_3 r_3. \tag{4.30}
$$
From Eq. (4.27), we obtain
$$
a_{31} = a_{21} - \lambda_4\cos\theta_4, \quad a_{32} = a_{22} - \lambda_4\sin\theta_4 \quad\text{and}\quad a_{33} = a_{23} + \lambda_4 r_4. \tag{4.31}
$$
Since Eq. (4.30) and Eq. (4.31) must be consistent, we have, by substituting Eq. (4.28) into Eq. (4.30) and Eq. (4.29) into Eq. (4.31),
$$
\begin{aligned}
\lambda_1\cos\theta_1 - \lambda_2\cos\theta_2 - \lambda_3\cos\theta_3 &= -\lambda_4\cos\theta_4,\\
\lambda_1\sin\theta_1 - \lambda_2\sin\theta_2 - \lambda_3\sin\theta_3 &= -\lambda_4\sin\theta_4,\\
\lambda_1 r_1 - \lambda_2 r_2 - \lambda_3 r_3 &= -\lambda_4 r_4,
\end{aligned}
$$
and hence
$$
\lambda_1 = \frac{C_1}{C_4}\lambda_4, \quad \lambda_2 = \frac{C_2}{C_4}\lambda_4 \quad\text{and}\quad \lambda_3 = \frac{C_3}{C_4}\lambda_4,
$$
where
$$
\begin{aligned}
C_1 &= -r_4\sin(\theta_2 - \theta_3) + r_3\sin(\theta_2 - \theta_4) - r_2\sin(\theta_3 - \theta_4),\\
C_2 &= -r_4\sin(\theta_1 - \theta_3) + r_3\sin(\theta_1 - \theta_4) - r_1\sin(\theta_3 - \theta_4),\\
C_3 &= r_4\sin(\theta_1 - \theta_2) - r_2\sin(\theta_1 - \theta_4) + r_1\sin(\theta_2 - \theta_4),\\
C_4 &= r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2).
\end{aligned}
$$

Thus
$$
\begin{aligned}
a_{11} &= \cos\theta_1\,\frac{C_1}{C_4}\lambda_4, &
a_{12} &= \sin\theta_1\,\frac{C_1}{C_4}\lambda_4, &
a_{13} &= -r_1\,\frac{C_1}{C_4}\lambda_4,\\
a_{21} &= \cos\theta_2\,\frac{C_2}{C_4}\lambda_4, &
a_{22} &= \sin\theta_2\,\frac{C_2}{C_4}\lambda_4, &
a_{23} &= -r_2\,\frac{C_2}{C_4}\lambda_4,\\
a_{31} &= \Big(\cos\theta_1\frac{C_1}{C_4} - \cos\theta_3\frac{C_3}{C_4}\Big)\lambda_4, &
a_{32} &= \Big(\sin\theta_1\frac{C_1}{C_4} - \sin\theta_3\frac{C_3}{C_4}\Big)\lambda_4, &
a_{33} &= \Big(-r_1\frac{C_1}{C_4} + r_3\frac{C_3}{C_4}\Big)\lambda_4.
\end{aligned}
$$

The transformation $P$ in parameter space can be obtained from Eq. (4.23) as
$$
P = \frac{-1}{C_1 C_2 C_3 \lambda_4}\begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix},
$$
where
$$
\begin{aligned}
p_{11} &= C_2(C_1 r_2\sin\theta_1 - C_1 r_1\sin\theta_2 + C_3 r_3\sin\theta_2 - C_3 r_2\sin\theta_3),\\
p_{12} &= -C_2(C_1 r_2\cos\theta_1 - C_1 r_1\cos\theta_2 + C_3 r_3\cos\theta_2 - C_3 r_2\cos\theta_3),\\
p_{13} &= C_2\big(C_1\sin(\theta_1 - \theta_2) + C_3\sin(\theta_2 - \theta_3)\big),\\
p_{21} &= C_1 C_3(r_1\sin\theta_3 - r_3\sin\theta_1),\\
p_{22} &= -C_1 C_3(r_1\cos\theta_3 - r_3\cos\theta_1),\\
p_{23} &= -C_1 C_3\sin(\theta_1 - \theta_3),\\
p_{31} &= C_1 C_2(r_1\sin\theta_2 - r_2\sin\theta_1),\\
p_{32} &= -C_1 C_2(r_1\cos\theta_2 - r_2\cos\theta_1),\\
p_{33} &= -C_1 C_2\sin(\theta_1 - \theta_2).
\end{aligned}
$$

From section 4.1.3, the invariant $(\theta', r')^t$ of a line $(\theta, r)^t$ can be obtained from the basis lines as follows:
$$
(\theta', r')^t = H((\theta, r)^t) = G(P(\cos\theta, \sin\theta, -r)^t) = G(a'),
$$
where
$$
a' = \frac{-1}{C_1 C_2 C_3 \lambda_4}\begin{pmatrix} ABC \\ DEF \\ GCF \end{pmatrix}
$$
with
$$
\begin{aligned}
A &= r_3\sin(\theta_1 - \theta_2) - r_2\sin(\theta_1 - \theta_3) + r_1\sin(\theta_2 - \theta_3),\\
B &= r_4\sin(\theta - \theta_2) - r_2\sin(\theta - \theta_4) + r\sin(\theta_2 - \theta_4),\\
C &= r_4\sin(\theta_1 - \theta_3) - r_3\sin(\theta_1 - \theta_4) + r_1\sin(\theta_3 - \theta_4),\\
D &= -r_3\sin(\theta - \theta_1) + r_1\sin(\theta - \theta_3) - r\sin(\theta_1 - \theta_3),\\
E &= r_4\sin(\theta_1 - \theta_2) - r_2\sin(\theta_1 - \theta_4) + r_1\sin(\theta_2 - \theta_4),\\
F &= r_4\sin(\theta_2 - \theta_3) - r_3\sin(\theta_2 - \theta_4) + r_2\sin(\theta_3 - \theta_4),\\
G &= r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2) + r\sin(\theta_1 - \theta_2),
\end{aligned}
$$
and hence
$$
\theta' = \tan^{-1}(DEF/ABC), \qquad
r' = \frac{-GCF}{\sqrt{(ABC)^2 + (DEF)^2}}.
$$
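For completeness, here is a transcription of the projective invariant as well; again a hypothetical helper rather than thesis code, with the seven products $A, \ldots, G$ written out directly.

```python
import numpy as np

def projective_invariant(theta, r, basis):
    """Projective invariant (theta', r') of line (theta, r) w.r.t. four
    basis lines, transcribing the A..G expressions above."""
    (t1, r1), (t2, r2), (t3, r3), (t4, r4) = basis
    s = np.sin
    A = r3*s(t1-t2) - r2*s(t1-t3) + r1*s(t2-t3)
    B = r4*s(theta-t2) - r2*s(theta-t4) + r*s(t2-t4)
    C = r4*s(t1-t3) - r3*s(t1-t4) + r1*s(t3-t4)
    D = -r3*s(theta-t1) + r1*s(theta-t3) - r*s(t1-t3)
    E = r4*s(t1-t2) - r2*s(t1-t4) + r1*s(t2-t4)
    F = r4*s(t2-t3) - r3*s(t2-t4) + r2*s(t3-t4)
    G = r2*s(theta-t1) - r1*s(theta-t2) + r*s(t1-t2)
    theta_p = np.arctan2(D*E*F, A*B*C)
    r_p = -G*C*F / np.hypot(A*B*C, D*E*F)
    if theta_p < 0:                  # keep theta' in [0, pi), flip r'
        theta_p += np.pi
        r_p = -r_p
    return theta_p, r_p
```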


Chapter 5

The Effect of Noise on the Invariants

In an environment affected by serious noise and occlusion, line features are usually detected using the Hough transform. However, the detected line parameters $(\theta, r)$ always deviate slightly from their true values, since in the physical process of image acquisition the positions of the endpoints of a segment are usually randomly perturbed, and occlusion also affects the process of the Hough transform. As long as a segment is not too short, this induces only a slight perturbation of the line parameters of the segment. Thus it is reasonable for us to assume that detected line parameters differ from the "true" values by a small perturbation having a Gaussian distribution.

In the following, we derive the "spread" of the computed invariant over hash space from the "perturbation" of the lines which give rise to this invariant.

5.1 A Noise Model for Line Parameters

We make the following mild assumptions: the measured line parameters $(\theta, r)$ of a set of lines

1. are statistically independent;

2. are distributed according to a Gaussian distribution centered at the true value of the parameter;

3. have a fixed covariance $\Sigma = \begin{pmatrix} \sigma_\theta^2 & \sigma_{\theta r} \\ \sigma_{\theta r} & \sigma_r^2 \end{pmatrix}$.

More precisely, let $(\theta, r)$ be the "true" value of the parameters of a line and $(\Delta\Theta, \Delta R)$ be the stochastic variable denoting the perturbation of $(\theta, r)$. The joint probability density function of $\Delta\Theta$ and $\Delta R$ is then given by
$$
p(\Delta\Theta, \Delta R) = \frac{1}{2\pi\sqrt{|\Sigma|}}
\exp\big[-\tfrac{1}{2}(\Delta\Theta, \Delta R)\,\Sigma^{-1}(\Delta\Theta, \Delta R)^t\big]
$$
and is centered around $(\theta, r)$.
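A minimal sketch of this noise model in NumPy: independent zero-mean Gaussian perturbations with a fixed $2\times 2$ covariance are added to each measured line. The covariance entries below are illustrative placeholders, not values from the experiments.

```python
import numpy as np

# Illustrative fixed covariance [[var_theta, cov], [cov, var_r]].
sigma = np.array([[np.radians(1.0)**2, 0.002],
                  [0.002,              1.5**2]])

def perturb_lines(lines, rng=np.random.default_rng(0)):
    """Add zero-mean Gaussian noise with covariance `sigma` to each
    (theta, r) line, independently across lines (assumptions 1-3)."""
    noise = rng.multivariate_normal(np.zeros(2), sigma, size=len(lines))
    return np.asarray(lines, dtype=float) + noise
```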

5.2 The Spread Function of the Invariants

In the following, we derive, from the perturbations of the basis lines and the line being encoded, a closed-form formula for the perturbation of the invariants considered in the preceding chapter.

We first introduce a lemma and a corollary derived from the lemma.

Lemma: Let $X_1, X_2, \ldots, X_n$ have a multivariate Gaussian distribution with vector $u$ of means and positive definite covariance matrix $\Sigma$. Then the moment-generating function of the multivariate Gaussian p.d.f. is given by
$$
M(v) = \exp\Big[v^t u + \frac{v^t \Sigma v}{2}\Big], \quad\text{for all real vectors } v.
$$

Corollary: Let $Y^t = (Y_1, Y_2)$ such that $Y = CX$, where $X^t = (X_1, X_2, \ldots, X_n)$ and
$$
C = \begin{pmatrix} c_{11} & \ldots & c_{1n} \\ c_{21} & \ldots & c_{2n} \end{pmatrix},
$$
a real $2\times n$ matrix. Then the random variable $Y$ is $N(Cu, C\Sigma C^t)$.

Proof. The moment-generating function of the distribution of $Y$ is given by
$$
M(v) = E(e^{v^t Y}) = E(e^{v^t CX}) = E(e^{(C^t v)^t X}).
$$
Applying the above lemma, we have
$$
M(v) = \exp\Big[(C^t v)^t u + \frac{(C^t v)^t \Sigma (C^t v)}{2}\Big]
     = \exp\Big[v^t (Cu) + \frac{v^t (C\Sigma C^t) v}{2}\Big].
$$
Thus the random variable $Y$ is $N(Cu, C\Sigma C^t)$. Q.E.D.
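The corollary is exactly the covariance-propagation rule $Y \sim N(Cu, C\Sigma C^t)$, which a two-line helper can state directly; the matrices below are arbitrary examples for illustration.

```python
import numpy as np

def propagate(C, u, Sigma):
    """If X ~ N(u, Sigma) and Y = C X, then Y ~ N(C u, C Sigma C^t)."""
    return C @ u, C @ Sigma @ C.T

C = np.array([[1.0, 0.5,  0.0],
              [0.0, 1.0, -2.0]])            # arbitrary 2x3 example
u = np.array([0.1, 0.0, 0.3])
Sigma = np.diag([0.01, 0.04, 0.02])
mean_Y, cov_Y = propagate(C, u, Sigma)       # N(Cu, C Sigma C^t)
```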


Now we go back to consider the perturbation of the invariant arising from the perturbation of the lines. In the preceding chapter, we gave formulae for the invariants $(\theta', r')$ under the various transformations. These invariants are functions of the line being encoded, $(\theta, r)^t$, and the basis lines, $(\theta_i, r_i)^t$, $i = 1, \ldots, k$:
$$
(\theta', r')^t = f\big((\theta, r)^t, (\theta_1, r_1)^t, \ldots, (\theta_k, r_k)^t\big).
$$
Equivalently we may rewrite the above equation as follows:
$$
\theta' = f'\big((\theta, r)^t, (\theta_1, r_1)^t, \ldots, (\theta_k, r_k)^t\big), \tag{5.1}
$$
$$
r' = f''\big((\theta, r)^t, (\theta_1, r_1)^t, \ldots, (\theta_k, r_k)^t\big). \tag{5.2}
$$
The exact forms of Eq. (5.1) and Eq. (5.2), together with the value of $k$ (i.e., the number of basis lines needed), depend on the viewing transformation under consideration and are given in sections 4.3.1, 4.3.2, 4.3.3 and 4.3.4 respectively.

Introducing perturbations of the line parameters $(\theta, r)^t$ and $(\theta_i, r_i)^t$, $i = 1, \ldots, k$, results in a perturbation of the computed invariant $(\theta', r')^t$, having the following form:
$$
\theta' + \Delta\theta' = f'\big((\theta + \Delta\theta, r + \Delta r)^t, (\theta_1 + \Delta\theta_1, r_1 + \Delta r_1)^t, \ldots, (\theta_k + \Delta\theta_k, r_k + \Delta r_k)^t\big), \tag{5.3}
$$
$$
r' + \Delta r' = f''\big((\theta + \Delta\theta, r + \Delta r)^t, (\theta_1 + \Delta\theta_1, r_1 + \Delta r_1)^t, \ldots, (\theta_k + \Delta\theta_k, r_k + \Delta r_k)^t\big). \tag{5.4}
$$
Expanding Eq. (5.3) and Eq. (5.4) in Maclaurin series, ignoring second- and higher-order perturbation terms, and then subtracting Eq. (5.1) and Eq. (5.2) from them, we have
$$
\Delta\theta' = \frac{\partial\theta'}{\partial\theta}\Delta\theta + \frac{\partial\theta'}{\partial r}\Delta r
+ \sum_{i=1}^{k}\Big(\frac{\partial\theta'}{\partial\theta_i}\Delta\theta_i + \frac{\partial\theta'}{\partial r_i}\Delta r_i\Big)
= c_{11}\Delta\theta + c_{12}\Delta r + c_{13}\Delta\theta_1 + c_{14}\Delta r_1 + \ldots + c_{1,2k+1}\Delta\theta_k + c_{1,2k+2}\Delta r_k, \tag{5.5}
$$
$$
\Delta r' = \frac{\partial r'}{\partial\theta}\Delta\theta + \frac{\partial r'}{\partial r}\Delta r
+ \sum_{i=1}^{k}\Big(\frac{\partial r'}{\partial\theta_i}\Delta\theta_i + \frac{\partial r'}{\partial r_i}\Delta r_i\Big)
= c_{21}\Delta\theta + c_{22}\Delta r + c_{23}\Delta\theta_1 + c_{24}\Delta r_1 + \ldots + c_{2,2k+1}\Delta\theta_k + c_{2,2k+2}\Delta r_k, \tag{5.6}
$$
where $c_{11} = \partial\theta'/\partial\theta$, $c_{12} = \partial\theta'/\partial r$, $c_{21} = \partial r'/\partial\theta$, $c_{22} = \partial r'/\partial r$; $c_{1,i} = \partial\theta'/\partial\theta_{(i-1)/2}$ and $c_{2,i} = \partial r'/\partial\theta_{(i-1)/2}$ for $i = 3, 5, \ldots, 2k+1$; and $c_{1,i} = \partial\theta'/\partial r_{(i-2)/2}$ and $c_{2,i} = \partial r'/\partial r_{(i-2)/2}$ for $i = 4, 6, \ldots, 2k+2$.

Let $(\Delta\Theta, \Delta R)$, $(\Delta\Theta_i, \Delta R_i)$, $i = 1, \ldots, k$, and $(\Delta\Theta', \Delta R')$ be stochastic variables denoting the perturbations of $(\theta, r)$, $(\theta_i, r_i)$, $i = 1, \ldots, k$, and $(\theta', r')$ respectively. Eq. (5.5) and Eq. (5.6) can be rewritten as
$$
\begin{aligned}
\Delta\Theta' &= c_{11}\Delta\Theta + c_{12}\Delta R + c_{13}\Delta\Theta_1 + c_{14}\Delta R_1 + \ldots + c_{1,2k+1}\Delta\Theta_k + c_{1,2k+2}\Delta R_k,\\
\Delta R' &= c_{21}\Delta\Theta + c_{22}\Delta R + c_{23}\Delta\Theta_1 + c_{24}\Delta R_1 + \ldots + c_{2,2k+1}\Delta\Theta_k + c_{2,2k+2}\Delta R_k.
\end{aligned}
$$
Since $p(\Delta\Theta, \Delta R)$ and the $p(\Delta\Theta_i, \Delta R_i)$, $i = 1, \ldots, k$, are Gaussian and independent, the joint p.d.f. $p(\Delta\Theta, \Delta R, \Delta\Theta_1, \Delta R_1, \ldots, \Delta\Theta_k, \Delta R_k)$ is also Gaussian. More precisely,
$$
p(\Delta\Theta, \Delta R, \Delta\Theta_1, \Delta R_1, \ldots, \Delta\Theta_k, \Delta R_k)
= p(\Delta\Theta, \Delta R)\,p(\Delta\Theta_1, \Delta R_1)\cdots p(\Delta\Theta_k, \Delta R_k)
= \frac{1}{(2\pi)^{k+1}\sqrt{|\hat\Sigma|}}\exp\big[-\tfrac{1}{2}\,V^t\hat\Sigma^{-1}V\big],
$$
where
$$
V = (\Delta\Theta, \Delta R, \Delta\Theta_1, \Delta R_1, \ldots, \Delta\Theta_k, \Delta R_k)^t,
\qquad
\hat\Sigma = \begin{pmatrix}
\sigma_\theta^2 & \sigma_{\theta r} & & & 0 & 0\\
\sigma_{\theta r} & \sigma_r^2 & & & 0 & 0\\
& & \ddots & & &\\
0 & 0 & & & \sigma_\theta^2 & \sigma_{\theta r}\\
0 & 0 & & & \sigma_{\theta r} & \sigma_r^2
\end{pmatrix}_{(2k+2)\times(2k+2)}.
$$
Since
$$
(\Delta\Theta', \Delta R')^t = \begin{pmatrix}
c_{11} & c_{12} & \ldots & c_{1,2k+1} & c_{1,2k+2}\\
c_{21} & c_{22} & \ldots & c_{2,2k+1} & c_{2,2k+2}
\end{pmatrix} V,
$$
from the above corollary we can show that $(\Delta\Theta', \Delta R')$ also has a Gaussian distribution, having p.d.f.
$$
p(\Delta\Theta', \Delta R') = \frac{1}{2\pi\sqrt{|\Sigma'|}}
\exp\big[-\tfrac{1}{2}(\Delta\Theta', \Delta R')\,\Sigma'^{-1}(\Delta\Theta', \Delta R')^t\big] \tag{5.7}
$$
and covariance matrix
$$
\Sigma' = \begin{pmatrix} \sigma_{\theta'}^2 & \sigma_{\theta' r'} \\ \sigma_{\theta' r'} & \sigma_{r'}^2 \end{pmatrix},
$$


where
$$
\begin{aligned}
\sigma_{\theta'}^2 &= (c_{11}^2 + c_{13}^2 + \cdots + c_{1,2k+1}^2)\sigma_\theta^2 + (c_{12}^2 + c_{14}^2 + \cdots + c_{1,2k+2}^2)\sigma_r^2\\
&\quad + 2(c_{11}c_{12} + c_{13}c_{14} + \cdots + c_{1,2k+1}c_{1,2k+2})\sigma_{\theta r},\\
\sigma_{r'}^2 &= (c_{21}^2 + c_{23}^2 + \cdots + c_{2,2k+1}^2)\sigma_\theta^2 + (c_{22}^2 + c_{24}^2 + \cdots + c_{2,2k+2}^2)\sigma_r^2\\
&\quad + 2(c_{21}c_{22} + c_{23}c_{24} + \cdots + c_{2,2k+1}c_{2,2k+2})\sigma_{\theta r},\\
\sigma_{\theta' r'} &= (c_{11}c_{21} + c_{13}c_{23} + \cdots + c_{1,2k+1}c_{2,2k+1})\sigma_\theta^2 + (c_{12}c_{22} + c_{14}c_{24} + \cdots + c_{1,2k+2}c_{2,2k+2})\sigma_r^2\\
&\quad + (c_{11}c_{22} + c_{13}c_{24} + \cdots + c_{1,2k+1}c_{2,2k+2} + c_{21}c_{12} + c_{23}c_{14} + \cdots + c_{2,2k+1}c_{1,2k+2})\sigma_{\theta r}.
\end{aligned}
$$
Recall that in computing the invariant $(\theta', r')^t$, we may adjust the value of $\theta'$ by adding $\pi$ or $-\pi$, with the sign of $r'$ flipped accordingly, so that $\theta'$ lies in $[0, \pi)$. If this is the case, then the covariance matrix $\Sigma'$ must be adjusted by flipping the sign of $\sigma_{\theta' r'}$.
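Numerically, this propagation can be reproduced without coding the closed-form partial derivatives: approximate the $c_{ij}$ of Eqs. (5.5) and (5.6) by central differences of the invariant function and form $C\hat\Sigma C^t$. A sketch under that assumption (`inv_fn` is any of the invariant helpers above; the finite differences assume $\theta'$ does not wrap across the $[0, \pi)$ boundary during the perturbation):

```python
import numpy as np

def spread_of_invariant(inv_fn, theta, r, basis, sigma, eps=1e-6):
    """First-order spread (covariance) of an invariant, per Eqs. (5.5)-(5.7):
    build the 2 x (2k+2) Jacobian C by central differences, then return
    Sigma' = C diag(Sigma, ..., Sigma) C^t."""
    params = np.array([theta, r] + [x for tr in basis for x in tr])
    k = len(basis)

    def f(p):
        b = [(p[2 + 2*i], p[3 + 2*i]) for i in range(k)]
        return np.array(inv_fn(p[0], p[1], b))

    C = np.zeros((2, 2*k + 2))
    for j in range(2*k + 2):
        dp = np.zeros_like(params)
        dp[j] = eps
        C[:, j] = (f(params + dp) - f(params - dp)) / (2.0 * eps)

    sigma_hat = np.kron(np.eye(k + 1), sigma)   # block-diagonal covariance
    return C @ sigma_hat @ C.T                  # 2x2 covariance of (theta', r')
```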

5.3 Spread Functions under Various Transformation Groups

In the following subsections, we specialize the general result derived in the preceding section to the various transformation groups considered earlier.

5.3.1 Rigid-Invariant Spread Function

From section 4.3.1, we have two sets of invariants (due to the 180°-rotation ambiguity) for rigid transformations, as follows:
$$
\theta' = \theta - \theta_1 + \tfrac{\pi}{2},\qquad
r' = r + \csc(\theta_1 - \theta_2)\big(r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2)\big),
$$
and
$$
\theta' = \theta - \theta_1 - \tfrac{\pi}{2},\qquad
r' = r + \csc(\theta_1 - \theta_2)\big(r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2)\big),
$$
where $(\theta, r)$ is the parameter of the line being encoded and $(\theta_i, r_i)$, $i = 1, 2$, are the parameters of the basis lines.

Let $(\Delta\Theta, \Delta R)$, $(\Delta\Theta_i, \Delta R_i)$, $i = 1, 2$, and $(\Delta\Theta', \Delta R')$ be stochastic variables denoting respectively the perturbations of $(\theta, r)$, $(\theta_i, r_i)$, $i = 1, 2$, and the computed invariant $(\theta', r')$. From section 5.2, we have
$$
\begin{aligned}
\Delta\Theta' &= c_{11}\Delta\Theta + c_{12}\Delta R + c_{13}\Delta\Theta_1 + c_{14}\Delta R_1 + c_{15}\Delta\Theta_2 + c_{16}\Delta R_2,\\
\Delta R' &= c_{21}\Delta\Theta + c_{22}\Delta R + c_{23}\Delta\Theta_1 + c_{24}\Delta R_1 + c_{25}\Delta\Theta_2 + c_{26}\Delta R_2.
\end{aligned}
$$
Computing the $c_{ij}$'s, we obtain
$$
\begin{aligned}
c_{11} &= 1, \quad c_{12} = 0, \quad c_{13} = -1, \quad c_{14} = 0, \quad c_{15} = 0, \quad c_{16} = 0,\\
c_{21} &= \csc(\theta_1 - \theta_2)\big(r_2\cos(\theta - \theta_1) - r_1\cos(\theta - \theta_2)\big), \qquad c_{22} = 1,\\
c_{23} &= -\sin(\theta - \theta_2)\csc^2(\theta_1 - \theta_2)\big(r_2 - r_1\cos(\theta_1 - \theta_2)\big),\\
c_{24} &= -\sin(\theta - \theta_2)\csc(\theta_1 - \theta_2),\\
c_{25} &= -\sin(\theta - \theta_1)\csc^2(\theta_1 - \theta_2)\big(r_1 - r_2\cos(\theta_1 - \theta_2)\big),\\
c_{26} &= \sin(\theta - \theta_1)\csc(\theta_1 - \theta_2).
\end{aligned}
$$
Thus we conclude that $(\Delta\Theta', \Delta R')$ has a Gaussian distribution with p.d.f.
$$
p(\Delta\Theta', \Delta R') = \frac{1}{2\pi\sqrt{|\Sigma'|}}
\exp\big[-\tfrac{1}{2}(\Delta\Theta', \Delta R')\,\Sigma'^{-1}(\Delta\Theta', \Delta R')^t\big]
$$
and covariance matrix
$$
\Sigma' = \begin{pmatrix} \sigma_{\theta'}^2 & \sigma_{\theta' r'} \\ \sigma_{\theta' r'} & \sigma_{r'}^2 \end{pmatrix},
$$
where
$$
\begin{aligned}
\sigma_{\theta'}^2 &= 2\sigma_\theta^2,\\
\sigma_{r'}^2 &= (c_{21}^2 + c_{23}^2 + c_{25}^2)\sigma_\theta^2 + (1 + c_{24}^2 + c_{26}^2)\sigma_r^2 + 2(c_{21} + c_{23}c_{24} + c_{25}c_{26})\sigma_{\theta r},\\
\sigma_{\theta' r'} &= (c_{21} - c_{23})\sigma_\theta^2 + (c_{22} - c_{24})\sigma_{\theta r}.
\end{aligned}
$$

5.3.2 Similarity-Invariant Spread Function

From section 4.3.2, we have the following formulae for the invariant for similarity transformations:
$$
\theta' = \theta - \theta_1 + \tfrac{\pi}{2},\qquad
r' = \sin(\theta_1 - \theta_3)\,B/|A|,
$$
with
$$
\begin{aligned}
A &= r_1\sin(\theta_2 - \theta_3) + r_2\sin(\theta_3 - \theta_1) + r_3\sin(\theta_1 - \theta_2),\\
B &= r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2) + r\sin(\theta_1 - \theta_2),
\end{aligned}
$$
where $(\theta, r)$ is the parameter of the line being encoded and $(\theta_i, r_i)$, $i = 1, 2, 3$, are the parameters of the basis lines.

Let $(\Delta\Theta, \Delta R)$, $(\Delta\Theta_i, \Delta R_i)$, $i = 1, 2, 3$, and $(\Delta\Theta', \Delta R')$ be stochastic variables denoting respectively the perturbations of $(\theta, r)$, $(\theta_i, r_i)$, $i = 1, 2, 3$, and the computed invariant $(\theta', r')$. From section 5.2, we have
$$
\begin{aligned}
\Delta\Theta' &= c_{11}\Delta\Theta + c_{12}\Delta R + c_{13}\Delta\Theta_1 + c_{14}\Delta R_1 + c_{15}\Delta\Theta_2 + c_{16}\Delta R_2 + c_{17}\Delta\Theta_3 + c_{18}\Delta R_3,\\
\Delta R' &= c_{21}\Delta\Theta + c_{22}\Delta R + c_{23}\Delta\Theta_1 + c_{24}\Delta R_1 + c_{25}\Delta\Theta_2 + c_{26}\Delta R_2 + c_{27}\Delta\Theta_3 + c_{28}\Delta R_3.
\end{aligned}
$$
Computing the $c_{ij}$'s, we obtain
$$
\begin{aligned}
c_{11} &= 1, \quad c_{12} = 0, \quad c_{13} = -1, \quad c_{14} = c_{15} = c_{16} = c_{17} = c_{18} = 0,\\
c_{21} &= \big(r_2\cos(\theta - \theta_1) - r_1\cos(\theta - \theta_2)\big)/|A|,\\
c_{22} &= \sin(\theta_1 - \theta_2)/|A|,\\
c_{23} &= \big[\big(r_2\cos(\theta_1 - \theta_3) - r_3\cos(\theta_1 - \theta_2)\big)B/A - r_2\cos(\theta - \theta_1) + r\cos(\theta_1 - \theta_2)\big]/|A|,\\
c_{24} &= \big[-\sin(\theta_2 - \theta_3)\,B/A - \sin(\theta - \theta_2)\big]/|A|,\\
c_{25} &= \big[\big(r_3\cos(\theta_1 - \theta_2) - r_1\cos(\theta_2 - \theta_3)\big)B/A + r_1\cos(\theta - \theta_2) - r\cos(\theta_1 - \theta_2)\big]/|A|,\\
c_{26} &= \big[\sin(\theta_1 - \theta_3)\,B/A + \sin(\theta - \theta_1)\big]/|A|,\\
c_{27} &= \big(r_1\cos(\theta_2 - \theta_3) - r_2\cos(\theta_1 - \theta_3)\big)B/(A|A|),\\
c_{28} &= -\sin(\theta_1 - \theta_2)\,B/(A|A|).
\end{aligned}
$$
Thus we conclude that $(\Delta\Theta', \Delta R')$ has a Gaussian distribution with p.d.f.
$$
p(\Delta\Theta', \Delta R') = \frac{1}{2\pi\sqrt{|\Sigma'|}}
\exp\big[-\tfrac{1}{2}(\Delta\Theta', \Delta R')\,\Sigma'^{-1}(\Delta\Theta', \Delta R')^t\big]
$$
and covariance matrix
$$
\Sigma' = \begin{pmatrix} \sigma_{\theta'}^2 & \sigma_{\theta' r'} \\ \sigma_{\theta' r'} & \sigma_{r'}^2 \end{pmatrix},
$$
where
$$
\begin{aligned}
\sigma_{\theta'}^2 &= 2\sigma_\theta^2,\\
\sigma_{r'}^2 &= (c_{21}^2 + c_{23}^2 + c_{25}^2 + c_{27}^2)\sigma_\theta^2 + (c_{22}^2 + c_{24}^2 + c_{26}^2 + c_{28}^2)\sigma_r^2
+ 2(c_{21}c_{22} + c_{23}c_{24} + c_{25}c_{26} + c_{27}c_{28})\sigma_{\theta r},\\
\sigma_{\theta' r'} &= (c_{21} - c_{23})\sigma_\theta^2 + (c_{22} - c_{24})\sigma_{\theta r}.
\end{aligned}
$$

5.3.3 Affine-Invariant Spread Function

From section 4.3.3, the invariant for affine transformations has the following form:
$$
\theta' = \tan^{-1}\!\big(A\csc(\theta_1 - \theta_3)\sin(\theta_1 - \theta)\csc(\theta_1 - \theta_2)\,/\,
A\csc(\theta_2 - \theta_3)\sin(\theta_2 - \theta)\csc(\theta_1 - \theta_2)\big),
\qquad
r' = C/D,
$$
with
$$
\begin{aligned}
A &= r_3\sin(\theta_1 - \theta_2) + r_2\sin(\theta_3 - \theta_1) + r_1\sin(\theta_2 - \theta_3),\\
B &= \csc^2(\theta_1 - \theta_3)\sin^2(\theta_1 - \theta) + \csc^2(\theta_2 - \theta_3)\sin^2(\theta_2 - \theta),\\
C &= r_1\csc(\theta_1 - \theta_2)\sin(\theta_2 - \theta) - r_2\csc(\theta_1 - \theta_2)\sin(\theta_1 - \theta) + r,\\
D &= \sqrt{A^2\,B\,\csc^2(\theta_1 - \theta_2)},
\end{aligned}
$$
where $(\theta, r)$ is the parameter of the line being encoded and $(\theta_i, r_i)$, $i = 1, 2, 3$, are the parameters of the basis lines.

Let $(\Delta\Theta, \Delta R)$, $(\Delta\Theta_i, \Delta R_i)$, $i = 1, 2, 3$, and $(\Delta\Theta', \Delta R')$ be stochastic variables denoting respectively the perturbations of $(\theta, r)$, $(\theta_i, r_i)$, $i = 1, 2, 3$, and the computed invariant $(\theta', r')$. From section 5.2, we have
$$
\begin{aligned}
\Delta\Theta' &= c_{11}\Delta\Theta + c_{12}\Delta R + c_{13}\Delta\Theta_1 + c_{14}\Delta R_1 + c_{15}\Delta\Theta_2 + c_{16}\Delta R_2 + c_{17}\Delta\Theta_3 + c_{18}\Delta R_3,\\
\Delta R' &= c_{21}\Delta\Theta + c_{22}\Delta R + c_{23}\Delta\Theta_1 + c_{24}\Delta R_1 + c_{25}\Delta\Theta_2 + c_{26}\Delta R_2 + c_{27}\Delta\Theta_3 + c_{28}\Delta R_3.
\end{aligned}
$$
Computing the $c_{ij}$'s and writing $E = \csc(\theta_1 - \theta_3)\csc(\theta_2 - \theta_3)/B$, we obtain
$$
\begin{aligned}
c_{11} &= \sin(\theta_1 - \theta_2)E,\\
c_{12} &= 0,\\
c_{13} &= -\csc(\theta_1 - \theta_3)\sin(\theta - \theta_2)\sin(\theta - \theta_3)E,\\
c_{14} &= 0,\\
c_{15} &= \csc(\theta_2 - \theta_3)\sin(\theta - \theta_1)\sin(\theta - \theta_3)E,\\
c_{16} &= 0,\\
c_{17} &= -\csc(\theta_1 - \theta_3)\csc(\theta_2 - \theta_3)\sin(\theta_1 - \theta_2)\sin(\theta - \theta_1)\sin(\theta - \theta_2)E,\\
c_{18} &= 0,\\
c_{21} &= \big[-\big(\cos(\theta - \theta_1)\csc^2(\theta_1 - \theta_3)\sin(\theta - \theta_1)
+ \cos(\theta - \theta_2)\csc^2(\theta_2 - \theta_3)\sin(\theta - \theta_2)\big)C/B\\
&\qquad + \csc(\theta_1 - \theta_2)\big(r_2\cos(\theta - \theta_1) - r_1\cos(\theta - \theta_2)\big)\big]/D,\\
c_{22} &= 1/D,\\
c_{23} &= \big[-\csc(\theta_1 - \theta_2)\big(r_2\cos(\theta - \theta_1)
+ \cot(\theta_1 - \theta_2)(r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2))\big)\\
&\qquad - C\big((r_3\cos(\theta_1 - \theta_2) - r_2\cos(\theta_1 - \theta_3))/A\\
&\qquad\quad - \csc^2(\theta_1 - \theta_3)\sin(\theta - \theta_1)(\cos(\theta - \theta_1)
+ \cot(\theta_1 - \theta_3)\sin(\theta - \theta_1))/B\big) - \cot(\theta_1 - \theta_2)\big]/D,\\
c_{24} &= \big[-\sin(\theta_2 - \theta_3)\,C/A - \csc(\theta_1 - \theta_2)\sin(\theta - \theta_2)\big]/D,\\
c_{25} &= \big[\csc(\theta_1 - \theta_2)\big(r_1\cos(\theta - \theta_2)
+ \cot(\theta_1 - \theta_2)(r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2))\big)\\
&\qquad - C\big((r_1\cos(\theta_2 - \theta_3) - r_3\cos(\theta_1 - \theta_2))/A\\
&\qquad\quad - \csc^2(\theta_2 - \theta_3)\sin(\theta - \theta_2)(\cos(\theta - \theta_2)
+ \cot(\theta_2 - \theta_3)\sin(\theta - \theta_2))/B\big) + \cot(\theta_1 - \theta_2)\big]/D,\\
c_{26} &= \big[\sin(\theta_1 - \theta_3)\,C/A + \csc(\theta_1 - \theta_2)\sin(\theta - \theta_1)\big]/D,\\
c_{27} &= \big[-C\big((r_2\cos(\theta_1 - \theta_3) - r_1\cos(\theta_2 - \theta_3))/A\\
&\qquad + (\cot(\theta_1 - \theta_3)\csc^2(\theta_1 - \theta_3)\sin^2(\theta - \theta_1)
+ \cot(\theta_2 - \theta_3)\csc^2(\theta_2 - \theta_3)\sin^2(\theta - \theta_2))/B\big)\big]/D,\\
c_{28} &= -\sin(\theta_1 - \theta_2)\,C/(AD).
\end{aligned}
$$
Thus we conclude that $(\Delta\Theta', \Delta R')$ has a Gaussian distribution with p.d.f.
$$
p(\Delta\Theta', \Delta R') = \frac{1}{2\pi\sqrt{|\Sigma'|}}
\exp\big[-\tfrac{1}{2}(\Delta\Theta', \Delta R')\,\Sigma'^{-1}(\Delta\Theta', \Delta R')^t\big] \tag{5.8}
$$
and covariance matrix
$$
\Sigma' = \begin{pmatrix} \sigma_{\theta'}^2 & \sigma_{\theta' r'} \\ \sigma_{\theta' r'} & \sigma_{r'}^2 \end{pmatrix},
$$
where
$$
\begin{aligned}
\sigma_{\theta'}^2 &= (c_{11}^2 + c_{13}^2 + c_{15}^2 + c_{17}^2)\sigma_\theta^2,\\
\sigma_{r'}^2 &= (c_{21}^2 + c_{23}^2 + c_{25}^2 + c_{27}^2)\sigma_\theta^2 + (c_{22}^2 + c_{24}^2 + c_{26}^2 + c_{28}^2)\sigma_r^2
+ 2(c_{21}c_{22} + c_{23}c_{24} + c_{25}c_{26} + c_{27}c_{28})\sigma_{\theta r},\\
\sigma_{\theta' r'} &= (c_{11}c_{21} + c_{13}c_{23} + c_{15}c_{25} + c_{17}c_{27})\sigma_\theta^2
+ (c_{11}c_{22} + c_{13}c_{24} + c_{15}c_{26} + c_{17}c_{28})\sigma_{\theta r}.
\end{aligned}
$$

5.3.4 Projective-Invariant Spread Function

From section 4.3.4, the invariant for projective transformations is as follows:
$$
\theta' = \tan^{-1}(DEF/ABC), \qquad
r' = \frac{-GCF}{\sqrt{(ABC)^2 + (DEF)^2}},
$$
with
$$
\begin{aligned}
A &= r_3\sin(\theta_1 - \theta_2) - r_2\sin(\theta_1 - \theta_3) + r_1\sin(\theta_2 - \theta_3),\\
B &= r_4\sin(\theta - \theta_2) - r_2\sin(\theta - \theta_4) + r\sin(\theta_2 - \theta_4),\\
C &= r_4\sin(\theta_1 - \theta_3) - r_3\sin(\theta_1 - \theta_4) + r_1\sin(\theta_3 - \theta_4),\\
D &= -r_3\sin(\theta - \theta_1) + r_1\sin(\theta - \theta_3) - r\sin(\theta_1 - \theta_3),\\
E &= r_4\sin(\theta_1 - \theta_2) - r_2\sin(\theta_1 - \theta_4) + r_1\sin(\theta_2 - \theta_4),\\
F &= r_4\sin(\theta_2 - \theta_3) - r_3\sin(\theta_2 - \theta_4) + r_2\sin(\theta_3 - \theta_4),\\
G &= r_2\sin(\theta - \theta_1) - r_1\sin(\theta - \theta_2) + r\sin(\theta_1 - \theta_2),
\end{aligned}
$$
where $(\theta, r)$ is the parameter of the line being encoded and $(\theta_i, r_i)$, $i = 1, 2, 3, 4$, are the parameters of the basis lines.

Let $(\Delta\Theta, \Delta R)$, $(\Delta\Theta_i, \Delta R_i)$, $i = 1, 2, 3, 4$, and $(\Delta\Theta', \Delta R')$ be stochastic variables denoting respectively the perturbations of $(\theta, r)$, $(\theta_i, r_i)$, $i = 1, 2, 3, 4$, and the computed invariant $(\theta', r')$. From section 5.2, we have
$$
\begin{aligned}
\Delta\Theta' &= c_{11}\Delta\Theta + c_{12}\Delta R + c_{13}\Delta\Theta_1 + c_{14}\Delta R_1 + c_{15}\Delta\Theta_2
+ c_{16}\Delta R_2 + c_{17}\Delta\Theta_3 + c_{18}\Delta R_3 + c_{19}\Delta\Theta_4 + c_{1,10}\Delta R_4,\\
\Delta R' &= c_{21}\Delta\Theta + c_{22}\Delta R + c_{23}\Delta\Theta_1 + c_{24}\Delta R_1 + c_{25}\Delta\Theta_2
+ c_{26}\Delta R_2 + c_{27}\Delta\Theta_3 + c_{28}\Delta R_3 + c_{29}\Delta\Theta_4 + c_{2,10}\Delta R_4.
\end{aligned}
$$
Computing the $c_{ij}$'s, we obtain
$$
\begin{aligned}
c_{11} &= \big[(-r_4\cos(\theta - \theta_2) + r_2\cos(\theta - \theta_4))I/B
+ (-r_3\cos(\theta - \theta_1) + r_1\cos(\theta - \theta_3))EF\big]/HK,\\
c_{12} &= \big[\sin(\theta_2 - \theta_4)I/B - \sin(\theta_1 - \theta_3)EF\big]/HK,\\
c_{13} &= \big[-(r_4\cos(\theta_1 - \theta_3) - r_3\cos(\theta_1 - \theta_4))I/C
+ (r_4\cos(\theta_1 - \theta_2) - r_2\cos(\theta_1 - \theta_4))DF\\
&\qquad - (r_3\cos(\theta_1 - \theta_2) - r_2\cos(\theta_1 - \theta_3))I/A
+ (r_3\cos(\theta - \theta_1) - r\cos(\theta_1 - \theta_3))EF\big]/HK,\\
c_{14} &= \big[-\sin(\theta_3 - \theta_4)I/C + \sin(\theta_2 - \theta_4)DF
- \sin(\theta_2 - \theta_3)I/A + \sin(\theta - \theta_3)EF\big]/HK,\\
c_{15} &= \big[-(-r_4\cos(\theta_2 - \theta_3) + r_3\cos(\theta_2 - \theta_4))DE
+ (-r_4\cos(\theta_1 - \theta_2) + r_1\cos(\theta_2 - \theta_4))DF\\
&\qquad - (-r_4\cos(\theta - \theta_2) + r\cos(\theta_2 - \theta_4))I/B
- (-r_3\cos(\theta_1 - \theta_2) + r_1\cos(\theta_2 - \theta_3))I/A\big]/HK,\\
c_{16} &= \big[\sin(\theta_3 - \theta_4)DE - \sin(\theta_1 - \theta_4)DF
+ \sin(\theta - \theta_4)I/B + \sin(\theta_1 - \theta_3)I/A\big]/HK,\\
c_{17} &= \big[-(r_4\cos(\theta_2 - \theta_3) - r_2\cos(\theta_3 - \theta_4))DE
+ (r_4\cos(\theta_1 - \theta_3) - r_1\cos(\theta_3 - \theta_4))I/C\\
&\qquad - (r_2\cos(\theta_1 - \theta_3) - r_1\cos(\theta_2 - \theta_3))I/A
+ (-r_1\cos(\theta - \theta_3) + r\cos(\theta_1 - \theta_3))EF\big]/HK,\\
c_{18} &= \big[-\sin(\theta_2 - \theta_4)DE + \sin(\theta_1 - \theta_4)I/C
- \sin(\theta_1 - \theta_2)I/A - \sin(\theta - \theta_1)EF\big]/HK,\\
c_{19} &= \big[-(-r_3\cos(\theta_2 - \theta_4) + r_2\cos(\theta_3 - \theta_4))DE
+ (-r_3\cos(\theta_1 - \theta_4) + r_1\cos(\theta_3 - \theta_4))I/C\\
&\qquad + (r_2\cos(\theta_1 - \theta_4) - r_1\cos(\theta_2 - \theta_4))DF
- (r_2\cos(\theta - \theta_4) - r\cos(\theta_2 - \theta_4))I/B\big]/HK,\\
c_{1,10} &= \big[\sin(\theta_2 - \theta_3)DE - \sin(\theta_1 - \theta_3)I/C
+ \sin(\theta_1 - \theta_2)DF - \sin(\theta - \theta_2)I/B\big]/HK,\\
c_{21} &= \big[(r_4\cos(\theta - \theta_2) - r_2\cos(\theta - \theta_4))HAC
+ (-r_3\cos(\theta - \theta_1) + r_1\cos(\theta - \theta_3))IEF\big]J/L^3\\
&\qquad + \big[-r_2\cos(\theta - \theta_1) + r_1\cos(\theta - \theta_2)\big]CF/L,\\
c_{22} &= \big[\sin(\theta_2 - \theta_4)HAC - \sin(\theta_1 - \theta_3)IEF\big]J/L^3
- \sin(\theta_1 - \theta_2)CF/L,\\
c_{23} &= \big[(r_4\cos(\theta_1 - \theta_3) - r_3\cos(\theta_1 - \theta_4))HAB
+ (r_3\cos(\theta_1 - \theta_2) - r_2\cos(\theta_1 - \theta_3))HBC\\
&\qquad + (r_4\cos(\theta_1 - \theta_2) - r_2\cos(\theta_1 - \theta_4))IDF
+ (r_3\cos(\theta - \theta_1) + r\cos(\theta_1 - \theta_3))IEF\big]J/L^3\\
&\qquad - \big[r_4\cos(\theta_1 - \theta_3) - r_3\cos(\theta_1 - \theta_4)\big]GF/L
+ \big[r_2\cos(\theta - \theta_1) - r\cos(\theta_1 - \theta_2)\big]CF/L,\\
c_{24} &= \big[\sin(\theta_3 - \theta_4)HAB + \sin(\theta_2 - \theta_3)HBC
+ \sin(\theta_2 - \theta_4)IDF + \sin(\theta - \theta_3)IEF\big]J/L^3\\
&\qquad - \sin(\theta_3 - \theta_4)GF/L + \sin(\theta - \theta_2)CF/L,\\
c_{25} &= \big[(-r_4\cos(\theta - \theta_2) + r\cos(\theta_2 - \theta_4))HAC
+ (-r_3\cos(\theta_1 - \theta_2) + r_1\cos(\theta_2 - \theta_3))HBC\\
&\qquad + (r_4\cos(\theta_2 - \theta_3) - r_3\cos(\theta_2 - \theta_4))IDE
- (r_4\cos(\theta_1 - \theta_2) - r_1\cos(\theta_2 - \theta_4))IDF\big]J/L^3\\
&\qquad - \big[r_4\cos(\theta_2 - \theta_3) - r_3\cos(\theta_2 - \theta_4)\big]GC/L
+ \big[-r_1\cos(\theta - \theta_2) + r\cos(\theta_1 - \theta_2)\big]CF/L,\\
c_{26} &= \big[-\sin(\theta - \theta_4)HAC - \sin(\theta_1 - \theta_3)HBC
+ \sin(\theta_3 - \theta_4)IDE - \sin(\theta_1 - \theta_4)IDF\big]J/L^3\\
&\qquad - \sin(\theta_3 - \theta_4)GC/L - \sin(\theta - \theta_1)CF/L,\\
c_{27} &= \big[-(r_4\cos(\theta_1 - \theta_3) - r_1\cos(\theta_3 - \theta_4))HAB
+ (r_2\cos(\theta_1 - \theta_3) - r_1\cos(\theta_2 - \theta_3))HBC\\
&\qquad - (r_4\cos(\theta_2 - \theta_3) - r_2\cos(\theta_3 - \theta_4))IDE
+ (-r_1\cos(\theta - \theta_3) + r\cos(\theta_1 - \theta_3))IEF\big]J/L^3\\
&\qquad - \big[-r_4\cos(\theta_2 - \theta_3) + r_2\cos(\theta_3 - \theta_4)\big]GC/L
- \big[-r_4\cos(\theta_1 - \theta_3) + r_1\cos(\theta_3 - \theta_4)\big]GF/L,\\
c_{28} &= \big[-\sin(\theta_1 - \theta_4)HAB + \sin(\theta_1 - \theta_2)HBC
- \sin(\theta_2 - \theta_4)IDE - \sin(\theta - \theta_1)IEF\big]J/L^3\\
&\qquad + \sin(\theta_2 - \theta_4)GC/L + \sin(\theta_1 - \theta_4)GF/L,\\
c_{29} &= \big[(r_3\cos(\theta_1 - \theta_4) - r_1\cos(\theta_3 - \theta_4))HAB
+ (r_2\cos(\theta - \theta_4) - r\cos(\theta_2 - \theta_4))HAC\\
&\qquad + (r_3\cos(\theta_2 - \theta_4) - r_2\cos(\theta_3 - \theta_4))IDE
+ (r_2\cos(\theta_1 - \theta_4) - r_1\cos(\theta_2 - \theta_4))IDF\big]J/L^3\\
&\qquad - \big[r_3\cos(\theta_2 - \theta_4) - r_2\cos(\theta_3 - \theta_4)\big]GC/L
- \big[r_3\cos(\theta_1 - \theta_4) - r_1\cos(\theta_3 - \theta_4)\big]GF/L,\\
c_{2,10} &= \big[\sin(\theta_1 - \theta_3)HAB + \sin(\theta - \theta_2)HAC
+ \sin(\theta_2 - \theta_3)IDE + \sin(\theta_1 - \theta_2)IDF\big]J/L^3\\
&\qquad - \sin(\theta_2 - \theta_3)GC/L - \sin(\theta_1 - \theta_3)GF/L,
\end{aligned}
$$
where
$$
H = ABC, \quad I = DEF, \quad J = GCF, \quad K = 1 + I^2/H^2, \quad L = \sqrt{H^2 + I^2}.
$$
Thus we conclude that $(\Delta\Theta', \Delta R')$ has a Gaussian distribution with p.d.f.
$$
p(\Delta\Theta', \Delta R') = \frac{1}{2\pi\sqrt{|\Sigma'|}}
\exp\big[-\tfrac{1}{2}(\Delta\Theta', \Delta R')\,\Sigma'^{-1}(\Delta\Theta', \Delta R')^t\big]
$$
and covariance matrix
$$
\Sigma' = \begin{pmatrix} \sigma_{\theta'}^2 & \sigma_{\theta' r'} \\ \sigma_{\theta' r'} & \sigma_{r'}^2 \end{pmatrix},
$$
where
$$
\begin{aligned}
\sigma_{\theta'}^2 &= (c_{11}^2 + c_{13}^2 + \cdots + c_{19}^2)\sigma_\theta^2 + (c_{12}^2 + c_{14}^2 + \cdots + c_{1,10}^2)\sigma_r^2
+ 2(c_{11}c_{12} + c_{13}c_{14} + \cdots + c_{19}c_{1,10})\sigma_{\theta r},\\
\sigma_{r'}^2 &= (c_{21}^2 + c_{23}^2 + \cdots + c_{29}^2)\sigma_\theta^2 + (c_{22}^2 + c_{24}^2 + \cdots + c_{2,10}^2)\sigma_r^2
+ 2(c_{21}c_{22} + c_{23}c_{24} + \cdots + c_{29}c_{2,10})\sigma_{\theta r},\\
\sigma_{\theta' r'} &= (c_{11}c_{21} + c_{13}c_{23} + \cdots + c_{19}c_{29})\sigma_\theta^2
+ (c_{12}c_{22} + c_{14}c_{24} + \cdots + c_{1,10}c_{2,10})\sigma_r^2\\
&\quad + (c_{11}c_{22} + c_{13}c_{24} + \cdots + c_{19}c_{2,10} + c_{21}c_{12} + c_{23}c_{14} + \cdots + c_{29}c_{1,10})\sigma_{\theta r}.
\end{aligned}
$$


Chapter 6

Bayesian Line Invariant Matching

Our recognition scheme builds upon the geometric hashing method, as reviewed in chapter 2, which also notes some weaknesses of the geometric hashing technique. Even when we use an improved Hough transform to detect line features in a degraded image, we still face the problem of dealing with image noise and with the resulting noise in the invariants used for hashing.

To minimize these noise effects, we need a theoretically justified scheme for weighted voting among hash bins. We show how such a weighted voting scheme can be used to cope with the problem. This is the aim of the probabilistic procedure which we now go on to describe.

6.1 A Measure of Matching between Scene and Model Invariants

In the geometric hashing method described in chapter 2, during recognition a computed scene invariant is used to index the geometric hash table and tally one vote for the entry retrieved. However, in the presence of noise, a scene invariant cannot exactly hit the hash table entry where its matching model invariant is supposed to reside. Gavrila and Groen [18] assume a bounded circular region around each hash bin and tally an equal vote for all the bins within the region centered around the bin being hashed. However, our analysis of the noise effect on computed invariants in the preceding chapter shows that the perturbation of an invariant has an approximately Gaussian distribution with a non-zero correlation coefficient. This implies that the region of hash space that would need to be accessed is an ellipse, instead of a circle, centered around the "true" value of the invariant. Moreover, this perturbation also depends on the basis and the line being encoded, which means that different invariants should be associated with hash-space regions of different size, shape and orientation for error tolerance.

The nodes in hash space that more "closely match" the value of the invariant should get a higher weighted score than those which are far away, in a manner accurately representing the behavior of the noise which affects these invariants. Let $inv_s$ be a scene invariant and let $inv_{M_i}$ be a model invariant computed from a basis $B_{M_i}$ and a line $l_j$ of $M_i$. Bayes' theorem [16] allows us to precisely determine the probability that $inv_s$ results from the presence of a certain $[M_i, B_{M_i}, l_j]$:
$$
P([M_i, B_{M_i}, l_j] \mid inv_s)
= \frac{P([M_i, B_{M_i}, l_j])\; p(inv_s \mid [M_i, B_{M_i}, l_j])}{p(inv_s)},
$$
where

• $P([M_i, B_{M_i}, l_j])$ is the a priori probability that $M_i$ is embedded in the scene and $B_{M_i}$ is selected to encode $l_j$. We assume that all the models are equally likely to appear and have the same number of features, which is a simplification that could be easily generalized. Thus $P([M_i, B_{M_i}, l_j])$ is the same for all [model, basis, line] combinations. (Note that this may not be quite true if the basis selection procedure favors long bases instead of selecting bases randomly.)

• $p(inv_s \mid [M_i, B_{M_i}, l_j])$ is the conditional p.d.f. of observing $inv_s$ given that $M_i$ appears in the scene with $B_{M_i}$ being selected and $l_j$ being encoded. It consists of two parts: the spike induced by $[M_i, B_{M_i}, l_j]$ and a background distribution.

• $p(inv_s)$ is the p.d.f. of observing $inv_s$, regardless of which $[M_i, B_{M_i}, l_j]$ induces it.

Assume that no two invariants of the same model and with a fixed basis lie near each other in hash space (i.e., model features are distinct and separate; if more than one model feature lands in approximately the same location, then the features should be coalesced into a single feature); then $inv_s$ can be close to at most one model invariant. Thus, given $inv_s$, the supporting evidence that $M_i$ appears in the scene and the basis $B_{M_i}$ is selected is
$$
P([M_i, B_{M_i}] \mid inv_s) \approx \max_j P([M_i, B_{M_i}, l_j] \mid inv_s)
\propto \frac{\max_j p(inv_s \mid [M_i, B_{M_i}, l_j])}{p(inv_s)}. \tag{6.1}
$$

6.2 Evidence Synthesis by Bayesian Reasoning

In the recognition stage, a certain scene basis $B_s$ is probed and the invariants of all the remaining features with respect to this basis are computed. Each invariant provides various degrees of support for the existence of $[M_i, B_{M_i}]$ combinations, assuming $B_s$ corresponds to $B_{M_i}$. If for every $[M_i, B_{M_i}]$ combination we synthesize support from all the invariants and hypothesize the one with maximum support, we in fact exercise a maximum likelihood approach to object recognition.

In order to synthesize support coming from different invariants, we have the following formula, based on a probability derivation using Bayes' theorem [12]:
$$
\begin{aligned}
\frac{P([M_i, B_{M_i}] \mid inv_{s_1}, \ldots, inv_{s_n})}{P([M_i, B_{M_i}])}
&= \frac{p(inv_{s_1}, \ldots, inv_{s_n} \mid [M_i, B_{M_i}])}{p(inv_{s_1}, \ldots, inv_{s_n})}\\
&= \frac{p(inv_{s_1} \mid [M_i, B_{M_i}]) \cdots p(inv_{s_n} \mid [M_i, B_{M_i}])}{p(inv_{s_1}, \ldots, inv_{s_n})}
\quad\text{(assuming conditional independence)}\\
&= \frac{1}{p(inv_{s_1}, \ldots, inv_{s_n})}\prod_{k=1}^{n}\frac{P([M_i, B_{M_i}] \mid inv_{s_k})\, p(inv_{s_k})}{P([M_i, B_{M_i}])}\\
&= \frac{p(inv_{s_1}) \cdots p(inv_{s_n})}{p(inv_{s_1}, \ldots, inv_{s_n})}\prod_{k=1}^{n}\frac{P([M_i, B_{M_i}] \mid inv_{s_k})}{P([M_i, B_{M_i}])}\\
&\propto \frac{p(inv_{s_1}) \cdots p(inv_{s_n})}{p(inv_{s_1}, \ldots, inv_{s_n})}\prod_{k=1}^{n}\frac{\max_j p(inv_{s_k} \mid [M_i, B_{M_i}, l_j])}{p(inv_{s_k})\, P([M_i, B_{M_i}])}
\quad\text{(by Eq. (6.1))}\\
&= \frac{1}{p(inv_{s_1}, \ldots, inv_{s_n})}\prod_{k=1}^{n}\frac{\max_j p(inv_{s_k} \mid [M_i, B_{M_i}, l_j])}{P([M_i, B_{M_i}])}.
\end{aligned}
$$
Thus
$$
P([M_i, B_{M_i}] \mid inv_{s_1}, \ldots, inv_{s_n})
\propto \prod_{k=1}^{n}\max_j\, p(inv_{s_k} \mid [M_i, B_{M_i}, l_j]), \tag{6.2}
$$
since $p(inv_{s_1}, \ldots, inv_{s_n})$ is independent of all the hypotheses and all $P([M_i, B_{M_i}])$'s are identical.

In the above derivation, we have assumed conditional independence among the invariants, i.e.,
$$
p(inv_{s_1}, \ldots, inv_{s_n} \mid [M_i, B_{M_i}]) = \prod_{k=1}^{n} p(inv_{s_k} \mid [M_i, B_{M_i}]),
$$
which means that the expected probability distribution of a scene invariant $inv_{s_k}$ is independent of the other scene invariants. The intuition for this conditional independence to hold is the following: if the given $inv_{s_k}$ does not belong to the embedded model $M_i$, it has nothing to do with $[M_i, B_{M_i}]$ or with the other scene invariants; if the given $inv_{s_k}$ does belong to the embedded model $M_i$, then, since geometric hashing applies transformation-invariant information, $inv_{s_k}$ is destined to appear given the hypothesis that $B_{M_i}$ is selected. Thus $inv_{s_k}$ is not affected by the other invariants. A more formal argument is given in [28].

In computing $P([M_i, B_{M_i}] \mid inv_{s_1}, \ldots, inv_{s_n})$, we do not need to compute absolute probabilities but only relative probabilities. Since the logarithm function is monotonic, we can apply the logarithm to both sides of Eq. (6.2):
$$
\log P([M_i, B_{M_i}] \mid inv_{s_1}, \ldots, inv_{s_n})
= \log C + \sum_{k=1}^{n}\log\max_j\, p(inv_{s_k} \mid [M_i, B_{M_i}, l_j]), \tag{6.3}
$$
turning products into sums.
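A sketch of the resulting weighted-voting rule: each scene invariant credits nearby hash nodes with the Gaussian log-density of Eq. (5.7), the maximum over lines $l_j$ is kept per $[M_i, B_{M_i}]$, and the per-invariant terms are summed as in Eq. (6.3). The `nearby_nodes` range query is a hypothetical interface, not the thesis data structure.

```python
import numpy as np

def log_gauss2(d, cov):
    """Log of the 2-D Gaussian density of Eq. (5.7) at offset
    d = inv_s - inv_M (the offset is assumed already reduced for
    the theta wraparound)."""
    det = cov[0, 0]*cov[1, 1] - cov[0, 1]*cov[1, 0]
    inv = np.array([[cov[1, 1], -cov[0, 1]],
                    [-cov[1, 0], cov[0, 0]]]) / det
    return -np.log(2.0*np.pi*np.sqrt(det)) - 0.5 * d @ inv @ d

def accumulate_support(scene_invs, nearby_nodes):
    """Eq. (6.3): for each scene invariant, credit each nearby hash node
    [model, basis, line] with the Gaussian log-density, keep the max over
    lines l_j per (model, basis), and sum across scene invariants."""
    score = {}
    for inv_s in scene_invs:
        best = {}
        for model, basis, line, inv_m, cov in nearby_nodes(inv_s):
            w = log_gauss2(np.asarray(inv_s) - np.asarray(inv_m), cov)
            key = (model, basis)
            best[key] = max(best.get(key, -np.inf), w)   # max over l_j
        for key, w in best.items():
            score[key] = score.get(key, 0.0) + w         # sum over inv_sk
    return score
```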

6.3 The Algorithm

The pre-processing stage parallels our description of the preprocessing stage of the geometric hashing method reviewed in section 2.2. For a specific transformation group, we use the formulae given in section 4.3 to compute the invariants. We also compute the spread functions of these invariants using Eq. (5.7). Both pieces of information, together with the identity of [model, basis, line], are stored in a node in the hash entry indexed by each invariant. Since the spread function is Gaussian, we store its covariance matrix as a representation of the function. We note that each node records the information about a particular [model, basis, line] combination necessary for our recognition stage. The hash table is so arranged that the position of each entry in fact represents a quantized coordinate $(\theta, r)$ of hash space. In this way, range queries can be easily performed.

In the recognition stage, lines are detected by the Hough transform and a subset of these lines is chosen as a basis for associating invariants with all remaining lines. Eq. (6.3) is used to accumulate support from all these invariants (the $\log C$ term, which is common to all the hypotheses, can be ignored). We note that a straightforward use of Eq. (6.3) runs through all the [model, basis, line] combinations, and thus the computational cost can be immense. By exploiting the geometric hash table prepared in the pre-processing stage, for each invariant $inv_{s_k}$ hashed into the hash space, we access only its nearby nodes. For each node accessed, we credit $\log p(inv_{s_k} \mid [M_i, B_{M_i}, l_j])$ to the node, whose initial score value is set to 0. In computing $\log p(inv_{s_k} \mid [M_i, B_{M_i}, l_j])$, the background distribution function is usually negligible compared with the spike induced by $[M_i, B_{M_i}, l_j]$. We therefore use the logarithm of Eq. (5.7) to approximate $\log p(inv_{s_k} \mid [M_i, B_{M_i}, l_j])$. Typically this Gaussian p.d.f. falls off rapidly. We thus approximately credit the same weighted score, $\log c$, to all those nodes not in the neighborhood of the hash. Equivalently, we may instead credit $\log p(inv_{s_k} \mid [M_i, B_{M_i}, l_j]) - \log c$ to those nearby nodes accessed and keep intact those not in the neighborhood. After all the invariants are processed, the support for all the $[M_i, B_{M_i}]$ combinations is histogramed. The top few $[M_i, B_{M_i}]$'s with sufficiently high votes are verified by transforming and superimposing the model onto the scene. A quality measure, QM, defined as the ratio of visible boundary, is computed. If the QM is above a preset threshold (e.g. 0.5), the hypothesis passes the verification. (The verification can proceed in a hierarchical way by first verifying the existence of the basis and rejecting immediately if the basis fails verification.)

In the above process, each probe of a scene basis is hypothesized as an instance of a particular basis of a certain model. It is possible that the scene basis probed does not correspond to any basis of any model. In the following section, we give a probabilistic analysis of the number of probes needed to detect all the instances in the scene.

6.4 An Analysis of the Number of Probings Needed

Let there be $n$ lines in the scene, $n = n_1 + n_2$, where $n_2$ lines are spurious. Let there also be $M$ model instances in the scene, and assume that the degree of occlusion of each instance is the same. Thus on average each model appearing in the scene contributes $n_1/M$ lines.

The probability of choosing a scene basis, consisting of $k$ features, which happens to correspond to a basis of a model is
$$
p = \prod_{i=0}^{k-1}\frac{\frac{n_1}{M} - i}{n - i}.
$$
If we are to detect all $M$ model instances at once, in the best case we have to probe only $M$ scene bases from the scene. This happens when each time a basis is probed it corresponds to a basis of a certain model, and the next time another basis is probed it corresponds to a basis of another model. However, the probability for this to happen is $p^M$, which is obviously small. The probability of detecting exactly $i$ different models after $t$ trials (i.e., among the $t$ trials, $t = i' + i''$, $i' \geq i$ probes cover bases of each of the $i$ models and $i''$ probes correspond to no bases of any model) is
$$
p_i = \binom{t}{i}\, i!\; p^i, \quad i = 0, \ldots, M.
$$
We may derive a lower bound on $t$ by requiring $p_i \geq \delta$, $0 \leq \delta \leq 1$:
$$
t(t-1)\cdots(t-i+1) \geq \delta/p^i.
$$
Thus,
$$
t^i > \delta/p^i \quad\text{or}\quad t > \sqrt[i]{\delta}\,/\,p.
$$
The probability that at least $j$ different models are detected is then
$$
q_j = 1 - \sum_{i=0}^{j-1} p_i.
$$
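The quantities $p$, $p_i$ and $q_j$ are straightforward to tabulate; the helper below is an illustrative transcription of the formulas above (note that the $p_i$ as defined are estimates rather than a normalized distribution):

```python
from math import comb, factorial

def probe_probabilities(n1, n2, M, k, t):
    """Section 6.4 quantities (hypothetical helper):
    p   - chance a random k-line scene basis corresponds to a model basis,
    p_i - chance of detecting exactly i different models in t probes,
    q_j - chance of detecting at least j different models."""
    n = n1 + n2
    p = 1.0
    for i in range(k):
        p *= (n1 / M - i) / (n - i)
    p_i = [comb(t, i) * factorial(i) * p**i for i in range(M + 1)]
    q_j = [1.0 - sum(p_i[:j]) for j in range(M + 1)]
    return p, p_i, q_j
```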

In practice, we have to try more probes than theoretically predicted, because some probes, though corresponding to some model bases, are bad bases (e.g., too small, too big or too much perturbed) and do not result in a stable transformation that transforms the model to properly fit its scene instance and pass verification.


Chapter 7

The Experiments

In this chapter, we test our approach using both synthesized and real images. We choose to implement the case of affine transformations. The group of affine transformations is a supergroup of both the rigid and the similarity transformations. In addition, affine transformations are often appropriate approximations to perspective transformations under suitable viewing situations (referring to p. 79 of [35]) and thus can be used in recognition algorithms as substitutes for more general perspective transformations. It should be noted that a similar approach can be applied to the cases of the other transformations we discussed in chapter 4. Implementation details, with experimental results, are presented in the following.
Implementation details, with experimental results, are presented in the following.<br />

7.1 Implementation of Aæne Invariant Matching<br />

To apply the procedure described in section 6.3 <strong>to</strong> the case of aæne group, a triplet of<br />

lines is necessary <strong>to</strong> form a basis. Eq. è4.21è and Eq. è4.22è in section 4.3.3 are used <strong>to</strong><br />

compute the invariant and the spread function of the invariant isgiven by Eq. è5.8è in<br />

section 5.3.3. Our formulae involve many trigonometric evaluations. To accelerate the<br />

process, a æne-grained look-up table of trigonometric functions is pre-computed.<br />

Considering numerical stability, we should avoid prevent <strong>using</strong> small bases or large<br />

bases for invariant computing. In our implementation, we use 256 æ 256 images. If any<br />

side of the basis is less than 20 pixels or all sides of the basis is greater than 500 pixels, we<br />

consider it infeasible.<br />

The hash table is implemented as a 2-D array to represent the 2-D hash space. Since during recognition we have to consider a neighborhood of a hash entry when it is retrieved, the boundary of the hash table has to be treated specially: the $\theta$-axis of the 2-D hash table should be viewed as wrapped around with a flip, since the invariant $(\theta, r)$ is mathematically equivalent to $(\theta - \pi, -r)$ (e.g., $(179°, 2)$ is equivalent to $(-1°, -2)$, thus neighboring, say, $(0°, -2)$).
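A sketch of this wrap-with-flip indexing on a quantized table (bin counts are illustrative; the $r$ rows are assumed symmetric about $r = 0$ so that negating $r$ mirrors the row index):

```python
N_THETA, N_R = 180, 256          # theta bins over [0, pi), r bins over [-R, R)

def wrap_bin(it, ir):
    """Map an out-of-range (theta-bin, r-bin) onto the table: stepping past
    theta = pi (or below 0) flips the r row, since (theta, r) ~ (theta - pi, -r)."""
    if it < 0:
        it += N_THETA
        ir = N_R - 1 - ir        # mirror r about the r = 0 row
    elif it >= N_THETA:
        it -= N_THETA
        ir = N_R - 1 - ir
    return it, ir

def neighborhood(it, ir, half=2):
    """All bins within a (2*half+1)^2 window around (it, ir), with the
    theta wrap-flip applied; bins falling off the r range are dropped."""
    for dt in range(-half, half + 1):
        for dr in range(-half, half + 1):
            jt, jr = wrap_bin(it + dt, ir + dr)
            if 0 <= jr < N_R:
                yield jt, jr
```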

During evidence synthesis, care should be taken to prevent multiple accumulation of weighted votes: for each probing of scene bases, each hash node [M_i, b_{M_i}, l_j] should receive a vote only once, since model feature l_j cannot match two or more scene features simultaneously. We thus have to keep track of the score each hash node receives and take only the maximum weighted score the node has received.
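A sketch of this per-probe bookkeeping; the node keys and weights are schematic:

    from collections import defaultdict

    def accumulate_probe(votes, accumulator):
        # votes: ((model_id, basis_id, feature_id), weight) pairs produced
        # while processing one probe; a node may appear several times when
        # a scene feature falls near several hash entries.  Only the
        # maximum weight per node is allowed to count; it is then added to
        # the global (model_id, basis_id) score.
        best = {}
        for node, w in votes:
            if w > best.get(node, 0.0):
                best[node] = w
        for (model_id, basis_id, _feature), w in best.items():
            accumulator[(model_id, basis_id)] += w
        return accumulator

    # Usage: the duplicated node below is counted once, with weight 0.7.
    acc = accumulate_probe([(("M1", 0, 3), 0.7), (("M1", 0, 3), 0.4)],
                           defaultdict(float))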

We also have to reject reflexion cases. Reflexion transformations form a subgroup of the affine transformations. However, a 2-D object, say a triangle with vertices 1, 2, 3 in clockwise sequence, preserves this sequence even when its shape is seriously skewed by viewing from a tilted camera; a reversed sequence can therefore only arise from a spurious reflexive match. This problem is tackled by detecting the orientation (clockwise or counterclockwise) of a basis from the sign of a cross product, so that no such false match between a scene basis and a model basis is hypothesized.
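A sketch of the orientation test; representing a basis by three points (e.g., the pairwise intersections of its three lines) is our own assumption:

    def orientation(p1, p2, p3):
        # Sign of the z-component of (p2 - p1) x (p3 - p1): positive for a
        # counterclockwise triplet, negative for clockwise.
        cross = ((p2[0] - p1[0]) * (p3[1] - p1[1])
                 - (p2[1] - p1[1]) * (p3[0] - p1[0]))
        return 1 if cross > 0 else -1

    def same_handedness(scene_pts, model_pts):
        # Hypothesize a match only when both bases have the same
        # orientation, which rejects mirror-image correspondences.
        return orientation(*scene_pts) == orientation(*model_pts)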

Since we are dealing with highly occluded scenes, we have found that even if a hypothesis has passed verification (by verifying its boundary), it can still be a false alarm. By the viewpoint consistency principle [41], the locations of all object features in an image should be consistent with the projection from a single viewpoint. We may show that all 2-D objects lying on the same plane have an identical enlarging (or shrinking) ratio of area when viewed from the same camera, assuming the approximation of perspective transformations by affine transformations. The value of this ratio is |det(A)|, where A is the skewing matrix of an affine transformation T = (A, b). We call it the skewing factor and require that all the hypothesized instances have the same skewing factor. (Appendix A provides the formula for computing T from a correspondence of a scene basis and a model basis.)

For each probe, after histogramming of the weighted votes, the top few hypotheses are verified. If the verification is successful, the skewing factor of the alleged instance is computed and used to vote for a common skewing factor.

After a sufficient number of bases are probed, which is determined by a statistical estimation procedure (see section 6.4), a global skewing factor, taken to be the |det(A)| with the highest number of votes, is obtained. Those hypothesized model instances voting for the global skewing factor are passed on to a disambiguation process, which we go on to describe below.

It is possible that a scene line is shared by many model instances, some of which are spurious. We can refine the result by assigning each shared line to the model instance most strongly supported by the evidence. One possible technique for disambiguation is relaxation [54]; however, relaxation is a complicated process. Instead, we exploit the fact that each model instance is associated with a quality measure, QM, which measures the strength of the evidence supporting its existence (this definition of QM favors long segments of a model instance and is similar to that used in [4], with the same justification). The following straightforward technique proves to work well.

We maintain two data structures, a QM-buffer and a frame-buffer, of the same size as the image. After initializing the QM-buffer and the frame-buffer, we process each candidate model instance in an arbitrary order by superimposing the candidate model instance onto the QM-buffer and tracing the boundary of the model instance. For each position traced, if the QM of the current model instance is greater than the value stored in the QM-buffer, we write the id of this candidate model instance to the corresponding position of the frame-buffer and replace the value in the QM-buffer by the current QM.

After all candidate model instances have been processed, we recompute the QM of each candidate on the frame-buffer by considering only positions carrying the same id as this candidate. Those whose new QMs are less than a preset threshold (e.g., 0.5) are removed from the list of candidate model instances.
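A sketch of the two-buffer pass; how the QM is recomputed from the positions a candidate still owns is our own assumption (here, the old QM scaled by the fraction of owned boundary positions):

    def disambiguate(candidates, width, height, qm_threshold=0.5):
        # candidates: list of (instance_id, qm, boundary_pixels), where
        # boundary_pixels is the traced model boundary as (x, y) positions.
        qm_buf = [[0.0] * width for _ in range(height)]
        frame_buf = [[None] * width for _ in range(height)]
        # Pass 1: each pixel is awarded to the instance with the highest QM.
        for inst_id, qm, pixels in candidates:
            for (x, y) in pixels:
                if qm > qm_buf[y][x]:
                    qm_buf[y][x] = qm
                    frame_buf[y][x] = inst_id
        # Pass 2: recompute each candidate's QM from the positions it owns
        # (the scaling rule below is an assumption) and threshold it.
        survivors = []
        for inst_id, qm, pixels in candidates:
            owned = sum(1 for (x, y) in pixels if frame_buf[y][x] == inst_id)
            if qm * owned / max(len(pixels), 1) >= qm_threshold:
                survivors.append(inst_id)
        return survivors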

7.2 Best Least-Squares Match

The transformation between a model and its scene instance can be recovered from the correspondence of the model basis and the scene basis alone. However, scene lines detected by the Hough transform are usually somewhat distorted due to noise, and this distorts the computed transformation. Typically, such a distorted transformation maps a model onto its scene instance with the basis lines matching each other perfectly while the other lines deviate more or less from their correspondences. Knowledge of additional line correspondences between a model and its scene instance can be used to improve the accuracy of the computed transformation. In the following, we discuss several methods to minimize the errors.

Method 1

Treat each line as a point with coordinates (θ, r) in (θ, r)-space and minimize the squared distance between (θ, r) and its correspondence (θ', r').

Discussion

The problem with this method is that θ and r are of different metrics. By minimizing the squared distance between a point (θ, r) and its correspondence (θ', r'), we implicitly assume equal weight on both θ and r.

Method 2

To circumvent the problem of Method 1, we note that a line (θ, r) can be uniquely represented by the point (r cos θ, r sin θ), which is the projection of the origin onto the line. To match line (θ, r) to its correspondence (θ', r'), we try to minimize the squared distance between (r cos θ, r sin θ) and (r' cos θ', r' sin θ').

Discussion

The drawback of this method is its dependency on the origin. The nearer the line is to the origin, the more weight is on r (think of the special case where both lines pass through the origin of the image).

Method 3 ¹

The models in a model base are usually finite, in the sense that, though they are modeled by lines, they in fact consist of segments. We can minimize the squared distances of the endpoints of the transformed model line segments to their corresponding scene lines in image space.

In the following we derive the closed-form formula for the case of affine transformations. A similar technique can be applied to the other transformation groups.

¹ Suggested by Professor Jaiwei Hong.


Specifically, assuming that we are looking for an affine match between n scene lines l_j and the endpoints u_j1 and u_j2 of n segments, j = 1, ..., n, we would like to find the affine transformation T = (A, b) such that the sum of the squared distances between T(u_j1) and l_j and between T(u_j2) and l_j, j = 1, ..., n, is minimized:

    E = min_T Σ_{j=1}^{n} (distance of T(u_j1) and l_j)² + (distance of T(u_j2) and l_j)².

Let line l_j have parameters (θ_j, r_j) and endpoints u_ji = (x_ji, y_ji), j = 1, ..., n and i = 1, 2. Also let T = (A, b) such that

    A = ( a11  a12 )        b = ( b1 )
        ( a21  a22 )  and       ( b2 ).

Then

    T(u_ji) = (a11 x_ji + a12 y_ji + b1, a21 x_ji + a22 y_ji + b2)^t

and

    E = min_T Σ_{j=1}^{n} ((cos θ_j, sin θ_j) T(u_j1) − r_j)² + ((cos θ_j, sin θ_j) T(u_j2) − r_j)².

To minimize E, we have to solve the following system of equations:

    ∂E/∂a11 = 0,  ∂E/∂a12 = 0,  ∂E/∂a21 = 0,  ∂E/∂a22 = 0,  ∂E/∂b1 = 0  and  ∂E/∂b2 = 0.

Since E is a quadratic function in each of its unknowns, the above is a system of linear equations in six unknowns. We can rewrite it in matrix form as follows (see Appendix B for the expressions of m_ij and n_i, i, j = 1, ..., 6):

    ( m11 m12 m13 m14 m15 m16 )   ( a11 )   ( n1 )
    ( m21 m22 m23 m24 m25 m26 )   ( a12 )   ( n2 )
    ( m31 m32 m33 m34 m35 m36 ) · ( a21 ) = ( n3 )
    ( m41 m42 m43 m44 m45 m46 )   ( a22 )   ( n4 )
    ( m51 m52 m53 m54 m55 m56 )   ( b1  )   ( n5 )
    ( m61 m62 m63 m64 m65 m66 )   ( b2  )   ( n6 )

The six unknowns can be solved for by Cramer's rule (see p. 89 of [39]) as follows:

    a11 = det(M1)/det(M),  a12 = det(M2)/det(M),
    a21 = det(M3)/det(M),  a22 = det(M4)/det(M),
    b1  = det(M5)/det(M),  b2  = det(M6)/det(M),

where M = [m_ij], i, j = 1, ..., 6, and M_k is M with its k-th column substituted by (n1, ..., n6)^t, for k = 1, ..., 6.
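For concreteness, a sketch that assembles the normal equations above and solves them (using Gaussian elimination rather than Cramer's rule, purely for brevity):

    import math

    def best_lsq_affine(lines, segments):
        # lines: [(theta_j, r_j)]; segments: [((x_j1, y_j1), (x_j2, y_j2))].
        # Returns (a11, a12, a21, a22, b1, b2) minimizing E of section 7.2.
        # Needs at least three lines in general position so M is non-singular.
        M = [[0.0] * 6 for _ in range(6)]
        rhs = [0.0] * 6
        for (theta, r), (u1, u2) in zip(lines, segments):
            c, s = math.cos(theta), math.sin(theta)
            for (x, y) in (u1, u2):
                # Residual: c*(a11*x + a12*y + b1) + s*(a21*x + a22*y + b2) - r;
                # g is its gradient w.r.t. (a11, a12, a21, a22, b1, b2).
                g = [c * x, c * y, s * x, s * y, c, s]
                for i in range(6):
                    rhs[i] += g[i] * r
                    for j in range(6):
                        M[i][j] += g[i] * g[j]
        # Solve M p = rhs by Gaussian elimination with partial pivoting.
        for k in range(6):
            p = max(range(k, 6), key=lambda i: abs(M[i][k]))
            M[k], M[p] = M[p], M[k]
            rhs[k], rhs[p] = rhs[p], rhs[k]
            for i in range(k + 1, 6):
                f = M[i][k] / M[k][k]
                for j in range(k, 6):
                    M[i][j] -= f * M[k][j]
                rhs[i] -= f * rhs[k]
        sol = [0.0] * 6
        for i in range(5, -1, -1):
            sol[i] = (rhs[i] - sum(M[i][j] * sol[j]
                                   for j in range(i + 1, 6))) / M[i][i]
        return tuple(sol)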

Discussion

We could also use a weighted sum of the squared distances; more specifically, the two squared distances contributed by the endpoints of long segments would receive more weight than those of short segments.

7.3 Experimental Results and Observations

We have done a series of experiments using synthesized images containing polygonal objects, modeled by line features, from the model base of twenty models shown in Figure 3.3. The way the synthesized images are generated has been explained in section 3.3.

Figure 7.1, Figure 7.2 and Figure 7.3 show three examples of the experiments on synthesized images. The perturbation of the lines detected by the Hough transform is obtained from statistics of extensive simulations on images at different levels of degradation. We also did experiments on real images containing objects from the same model base, with flakes scattered on the objects. Two examples (Figure 7.5 and Figure 7.6) are shown with the best least-squares match using Method 3 discussed above.

Figure 7.1(a) shows a composite overlapping scene of model-0, 3 and 4 (note: model-0 appears twice), which are significantly skewed. Figure 7.1(b) is the result of Hough analysis applied to this scene; 50 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines was

    Σ = (  0.003384  −0.00624 )
        ( −0.00624    2.5198  ).

Figure 7.1(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.1(d) shows the final result after disambiguation.

Figure 7.2(a) shows a composite overlapping scene of model-0, 1, 2, 4 and 9. Figure 7.2(b) is the result of Hough analysis applied to this scene; 60 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines is the same as in Figure 7.1. Figure 7.2(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.2(d) shows the final result after disambiguation.


Figure 7.3(a) shows a composite overlapping scene of model-1, 3, 4, 13 and 19. Figure 7.3(b) is the result of Hough analysis applied to this scene; 60 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines is the same as in Figure 7.1. Figure 7.3(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.3(d) shows the final result after disambiguation. (Note that model-19 was not detected, since half of its boundary is invisible.)

Figure 7.4(a) shows a composite overlapping scene of model-3, 12 and 15. Figure 7.4(b) is the result of Hough analysis applied to this scene; 60 lines were detected. The covariance matrix used for the deviation of the detected lines from the true lines is the same as in Figure 7.1. Figure 7.4(c) shows the model instances hypothesized in the recognition stage; 5000 probes were tried. Figure 7.4(d) shows the final result after disambiguation.

Figure 7.5(a) shows a real image of model-0, 2, 3 and 4 with flakes scattered. Figure 7.5(b) is the result of edge detection using the Boie-Cox edge detector [9]. Figure 7.5(c) shows the result of Hough analysis applied to this scene; 50 lines are detected. Figure 7.5(d) shows the recognition result before disambiguation. Figure 7.5(e) shows the recognition result after disambiguation. Figure 7.5(f) shows the result after the best least-squares match.

Figure 7.6(a) shows a real image of model-1, 2 and 3 (note: model-3 appears twice) with flakes scattered. Figure 7.6(b) is the result of edge detection, also using the Boie-Cox edge detector. Figure 7.6(c) shows the result of Hough analysis applied to this scene; 50 lines are detected. Figure 7.6(d) shows the recognition result before disambiguation. Figure 7.6(e) shows the recognition result after disambiguation. Figure 7.6(f) shows the result after the best least-squares match.

In our first attempt at these experiments, we let the candidate model instances vote for the global skewing factor without verifying them first. A wrong global skewing factor was usually obtained, for two reasons: (1) the image is so noisy that too many lines are detected and irrelevant lines are often chosen as a basis; (2) an affine transformation is so "powerful" that it is not difficult to find an affine transformation that maps a quadruplet of lines to another quadruplet of lines if some error tolerance is allowed. Our algorithm was therefore modified to let only the verified candidate model instances vote for the global skewing factor. The trade-off is that verification takes additional computational time.

Figure 7.1: Experimental Example 1 (panels (a)-(d))

Figure 7.2: Experimental Example 2 (panels (a)-(d))

Figure 7.3: Experimental Example 3 (panels (a)-(d))

Figure 7.4: Experimental Example 4 (panels (a)-(d))

Figure 7.5: Experimental Example 5 (panels (a)-(f))

Figure 7.6: Experimental Example 6 (panels (a)-(f))

We also found from the experiments that, when probing, if the probed triplet of lines happens to correspond to a basis of a certain model (i.e., a correct match), then this [model, basis] pair usually accumulates the highest weighted score among all others. If it does not rank the highest, the cause is usually insufficient lines detected for that particular model instance; even then, its weighted vote is still among the highest few. The high-voted false alarms (i.e., matches of this scene basis to a wrong [model, basis]) are usually rejected by verification. We conclude that this weighted voting scheme is quite effective.

The method described above does not assume any grouping of features, although grouping would be expected to greatly expedite the recognition stage (at least for scene basis selection). Lowe [40] first explicitly discussed the importance of grouping for recognition. However, if reliable segmentation is not available, intelligent grouping seems difficult. We note that in probing, for a selected basis that does not correspond to any model basis, it may happen that false alarms pass even the verification because of high noise in the scene. However, statistically such false alarms disperse their votes over different skewing factors, while correct matches accumulate their votes on the global skewing factor.


Chapter 8

Conclusions

8.1 Discussion and Summary

Geometric hashing is often compared to the generalized Hough transform. The main difference between them lies in the kind of "evidence" accumulated to generate hypotheses. While the generalized Hough transform accumulates "pose" evidence in a continuous, infinite parameter space, geometric hashing accumulates "feature correspondence" evidence within the discrete, finite feature space of (model identifier, basis set) pairs. However, the quantization problem exists not only in the generalized Hough transform technique: geometric hashing implicitly transfers this problem to the preprocessing stage, when decisions are made about the quantization of the hash space of invariants.

Grimson and Huttenlocher [21] analyze the performance of geometric hashing affine-invariant matching in the presence of noise on point features and give pessimistic predictions. They assert that it is impossible to construct a sparsely populated hash table, due to the perturbation of the invariants computed from point features with uncertainty. Rigoutsos and Hummel [45] incorporate additive Gaussian noise into the model, analytically determine its effect on the computed invariants for the case where models are allowed to undergo similarity transformations, and show promising results.

However, there has not heretofore been an exploration of line features and their performance in seriously degraded intensity images. We have shown how the geometric hashing technique can be applied to line features by deriving line invariants under various geometric transformations, including the rigid, similarity, affine and projective cases.


We have also carried out an analysis of noise sensitivity. Formulae that describe the spread of the invariants computed from the lines giving rise to them are derived for the above geometric transformations. The basic underlying assumption is that the perturbations of line features can be modeled by a Gaussian process. Since more image points are involved in line features, lines can be extracted by our improved Hough transform with good accuracy (small perturbation) even in seriously degraded images. This is the first noise analysis of line features for geometric hashing.

The knowledge of the noise sensitivity of the invariants allows us to give a weighted voting scheme for geometric hashing using line features. We have applied our technique to the case of affine transformations, which cover both rigid and similarity transformations and are often suitable substitutes for the more general perspective transformations. The experiments show that the technique is noise resistant and suitable in an environment containing many occlusions.

8.2 Future Directions

We end this dissertation with a brief discussion of possible directions for future research.

It is obvious that the recognition stage can be easily parallelized: we may assign each basis to a processing element in a parallel computer, since each basis can be processed almost independently, or assign each hash node to a processing element. Rigoutsos [43] discusses a parallel implementation of geometric hashing with point features on the Connection Machine [26], which is equipped with 16K-64K processors (for model CM-2). When the number of processors simultaneously needed exceeds the maximum number of physical processors in the machine, the machine can operate in a virtual-processor mode, in which a single processor emulates the work of several virtual processors by serializing operations in time and partitioning the local memory associated with that processor. With the advance of hardware techniques and the decrease of cost, parallel realization of image processing algorithms has become more and more popular.

Another <strong>to</strong>pic <strong>to</strong> be explored is the intelligent fusion of multiple feature types. Since<br />

we assume no reliable segmentation is available, it seems that only line features can be<br />

extracted and used as out primitive feature. However, if the image can be well segmented<br />

and geometric features like points, segments, circles, ovals and so on can be reliably extracted,<br />

our technique can be helpful <strong>to</strong> design a suitable weighting function or <strong>to</strong> perform


CHAPTER 8. CONCLUSIONS 97<br />

hierarchical æltering by diæerent feature types. This may require evaluation criteria for<br />

measuring the relative merit of diæerent feature types and can be application-dependent.<br />

Recognition of non-flat 3-D objects from 2-D images is another direction for extension. A combined use of the transformation parameter space of the Hough transform and the invariant hash space of geometric hashing may possibly lead to a recognition system that can recognize parameterized objects.

Finally, the technique of "reasoning by parts" and the geometric hashing technique can benefit from each other. A complicated object usually cannot be represented by only one primitive feature type. It may be more suitable to represent different parts of an object by different primitive feature types and to recognize each part by the geometric hashing technique using the appropriate feature type.


Appendix A

Given a correspondence between two triplets of lines in general position, the affine transformation T = (A, b), where A is a 2 × 2 non-singular skewing matrix and b is a 2 × 1 translation vector, can be uniquely determined such that T maps points of the first triplet of lines to their corresponding points of the second triplet of lines.

Let the first triplet of lines be (θ_i, r_i), i = 1, 2, 3, and the second triplet of lines be (θ'_i, r'_i), i = 1, 2, 3. We have the closed-form formula for T = (A, b) as follows:

    A = (1/det) (  a11   a12 )          b = (1/det) ( −b1 )
                ( −a21  −a22 )  and                 (  b2 ),

where

    a11 = cos θ_3 csc(θ'_1 − θ'_2) sin(θ_1 − θ_2)(r'_1 sin θ'_2 − r'_2 sin θ'_1) +
          cos θ_1 csc(θ'_2 − θ'_3) sin(θ_2 − θ_3)(r'_2 sin θ'_3 − r'_3 sin θ'_2) +
          cos θ_2 csc(θ'_3 − θ'_1) sin(θ_3 − θ_1)(r'_3 sin θ'_1 − r'_1 sin θ'_3),

    a12 = sin θ_3 csc(θ'_1 − θ'_2) sin(θ_1 − θ_2)(r'_1 sin θ'_2 − r'_2 sin θ'_1) +
          sin θ_1 csc(θ'_2 − θ'_3) sin(θ_2 − θ_3)(r'_2 sin θ'_3 − r'_3 sin θ'_2) +
          sin θ_2 csc(θ'_3 − θ'_1) sin(θ_3 − θ_1)(r'_3 sin θ'_1 − r'_1 sin θ'_3),

    a21 = cos θ_3 csc(θ'_1 − θ'_2) sin(θ_1 − θ_2)(r'_1 cos θ'_2 − r'_2 cos θ'_1) +
          cos θ_1 csc(θ'_2 − θ'_3) sin(θ_2 − θ_3)(r'_2 cos θ'_3 − r'_3 cos θ'_2) +
          cos θ_2 csc(θ'_3 − θ'_1) sin(θ_3 − θ_1)(r'_3 cos θ'_1 − r'_1 cos θ'_3),

    a22 = sin θ_3 csc(θ'_1 − θ'_2) sin(θ_1 − θ_2)(r'_1 cos θ'_2 − r'_2 cos θ'_1) +
          sin θ_1 csc(θ'_2 − θ'_3) sin(θ_2 − θ_3)(r'_2 cos θ'_3 − r'_3 cos θ'_2) +
          sin θ_2 csc(θ'_3 − θ'_1) sin(θ_3 − θ_1)(r'_3 cos θ'_1 − r'_1 cos θ'_3),

    b1  = r_3 csc(θ'_1 − θ'_2) sin(θ_1 − θ_2)(r'_1 sin θ'_2 − r'_2 sin θ'_1) +
          r_1 csc(θ'_2 − θ'_3) sin(θ_2 − θ_3)(r'_2 sin θ'_3 − r'_3 sin θ'_2) +
          r_2 csc(θ'_3 − θ'_1) sin(θ_3 − θ_1)(r'_3 sin θ'_1 − r'_1 sin θ'_3),

    b2  = r_3 csc(θ'_1 − θ'_2) sin(θ_1 − θ_2)(r'_1 cos θ'_2 − r'_2 cos θ'_1) +
          r_1 csc(θ'_2 − θ'_3) sin(θ_2 − θ_3)(r'_2 cos θ'_3 − r'_3 cos θ'_2) +
          r_2 csc(θ'_3 − θ'_1) sin(θ_3 − θ_1)(r'_3 cos θ'_1 − r'_1 cos θ'_3),

    det = r_1 sin(θ_2 − θ_3) + r_2 sin(θ_3 − θ_1) + r_3 sin(θ_1 − θ_2).

Given the correspondence of the basis lines of a model object and its scene instance, we can use the above formula to transform the boundary of the model object onto the scene for verification.
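For concreteness, a small Python sketch that transcribes the expressions above directly (general position is assumed, so no csc argument vanishes); in our checks, simple transformations such as a pure rotation, a translation and a shear are reproduced as the expected (A, b):

    import math

    def affine_from_line_triplets(first, second):
        # first, second: triplets [(theta_i, r_i)] of lines in general
        # position; returns (A, b) mapping the first triplet to the second.
        (t1, r1), (t2, r2), (t3, r3) = first
        (p1, q1), (p2, q2), (p3, q3) = second     # (theta'_i, r'_i)
        det = (r1 * math.sin(t2 - t3) + r2 * math.sin(t3 - t1)
               + r3 * math.sin(t1 - t2))

        def term(lead, pa, pb, ta, tb, qa, qb, trig):
            # lead * csc(theta'_a - theta'_b) * sin(theta_a - theta_b)
            #      * (r'_a * trig(theta'_b) - r'_b * trig(theta'_a))
            return (lead / math.sin(pa - pb) * math.sin(ta - tb)
                    * (qa * trig(pb) - qb * trig(pa)))

        def cyc(leads, trig):
            # Sum of the three cyclic terms; leads are the factors for
            # k = 3, 1, 2 respectively.
            l3, l1, l2 = leads
            return (term(l3, p1, p2, t1, t2, q1, q2, trig)
                    + term(l1, p2, p3, t2, t3, q2, q3, trig)
                    + term(l2, p3, p1, t3, t1, q3, q1, trig))

        a11 = cyc((math.cos(t3), math.cos(t1), math.cos(t2)), math.sin)
        a12 = cyc((math.sin(t3), math.sin(t1), math.sin(t2)), math.sin)
        a21 = cyc((math.cos(t3), math.cos(t1), math.cos(t2)), math.cos)
        a22 = cyc((math.sin(t3), math.sin(t1), math.sin(t2)), math.cos)
        b1 = cyc((r3, r1, r2), math.sin)
        b2 = cyc((r3, r1, r2), math.cos)
        A = [[a11 / det, a12 / det], [-a21 / det, -a22 / det]]
        b = [-b1 / det, b2 / det]
        return A, b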


Appendix B

The components of the matrix, and of the right-hand side, for the system of linear equations for the best least-squares match are as follows (all sums run over j = 1, ..., n):

    m11 = Σ_j cos²θ_j (x²_j1 + x²_j2),              m31 = Σ_j cos θ_j sin θ_j (x²_j1 + x²_j2),
    m12 = Σ_j cos²θ_j (x_j1 y_j1 + x_j2 y_j2),      m32 = Σ_j cos θ_j sin θ_j (x_j1 y_j1 + x_j2 y_j2),
    m13 = Σ_j cos θ_j sin θ_j (x²_j1 + x²_j2),      m33 = Σ_j sin²θ_j (x²_j1 + x²_j2),
    m14 = Σ_j cos θ_j sin θ_j (x_j1 y_j1 + x_j2 y_j2),  m34 = Σ_j sin²θ_j (x_j1 y_j1 + x_j2 y_j2),
    m15 = Σ_j cos²θ_j (x_j1 + x_j2),                m35 = Σ_j cos θ_j sin θ_j (x_j1 + x_j2),
    m16 = Σ_j cos θ_j sin θ_j (x_j1 + x_j2),        m36 = Σ_j sin²θ_j (x_j1 + x_j2),

    m21 = Σ_j cos²θ_j (x_j1 y_j1 + x_j2 y_j2),      m41 = Σ_j cos θ_j sin θ_j (x_j1 y_j1 + x_j2 y_j2),
    m22 = Σ_j cos²θ_j (y²_j1 + y²_j2),              m42 = Σ_j cos θ_j sin θ_j (y²_j1 + y²_j2),
    m23 = Σ_j cos θ_j sin θ_j (x_j1 y_j1 + x_j2 y_j2),  m43 = Σ_j sin²θ_j (x_j1 y_j1 + x_j2 y_j2),
    m24 = Σ_j cos θ_j sin θ_j (y²_j1 + y²_j2),      m44 = Σ_j sin²θ_j (y²_j1 + y²_j2),
    m25 = Σ_j cos²θ_j (y_j1 + y_j2),                m45 = Σ_j cos θ_j sin θ_j (y_j1 + y_j2),
    m26 = Σ_j cos θ_j sin θ_j (y_j1 + y_j2),        m46 = Σ_j sin²θ_j (y_j1 + y_j2),

    m51 = Σ_j cos²θ_j (x_j1 + x_j2),                m61 = Σ_j cos θ_j sin θ_j (x_j1 + x_j2),
    m52 = Σ_j cos²θ_j (y_j1 + y_j2),                m62 = Σ_j cos θ_j sin θ_j (y_j1 + y_j2),
    m53 = Σ_j cos θ_j sin θ_j (x_j1 + x_j2),        m63 = Σ_j sin²θ_j (x_j1 + x_j2),
    m54 = Σ_j cos θ_j sin θ_j (y_j1 + y_j2),        m64 = Σ_j sin²θ_j (y_j1 + y_j2),
    m55 = 2 Σ_j cos²θ_j,                            m65 = 2 Σ_j cos θ_j sin θ_j,
    m56 = 2 Σ_j cos θ_j sin θ_j,                    m66 = 2 Σ_j sin²θ_j,

    n1 = Σ_j cos θ_j r_j (x_j1 + x_j2),             n3 = Σ_j sin θ_j r_j (x_j1 + x_j2),
    n2 = Σ_j cos θ_j r_j (y_j1 + y_j2),             n4 = Σ_j sin θ_j r_j (y_j1 + y_j2),
    n5 = 2 Σ_j cos θ_j r_j,                         n6 = 2 Σ_j sin θ_j r_j.


Bibliography

[1] A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1984.

[2] K. Arbter, W. E. Snyder, H. Burkhardt, and G. Hirzinger. Application of Affine-Invariant Fourier Descriptors to Recognition of 3-D Objects. IEEE Trans. on PAMI, 12(7):640-647, 1990.

[3] E. M. Arkin, L. P. Chew, D. P. Huttenlocher, K. Kedem, and J. S. B. Mitchell. An Efficiently Computable Metric for Comparing Polygonal Shapes. Technical Report TR 89-1007, Computer Science Dept., Cornell University, 1989.

[4] N. Ayache and O. D. Faugeras. HYPER: A New Approach for the Recognition and Positioning of Two-Dimensional Objects. IEEE Trans. on PAMI, 8(1):44-54, 1986.

[5] D. H. Ballard. Generalizing the Hough Transform to Detect Arbitrary Shapes. Pattern Recognition, 13(2):111-122, 1981.

[6] H. G. Barrow and J. M. Tenenbaum. Computational Vision. Prentice-Hall, 1981.

[7] P. J. Besl and R. C. Jain. Three-Dimensional Object Recognition. ACM Computing Surveys, 17(1):75-154, 1985.

[8] T. O. Binford. Survey of Model-Based Image Analysis Systems. The Int. J. of Robotics Research, 1(1):18-64, 1982.

[9] R. Boie and I. Cox. Two Dimensional Optimum Edge Recognition using Matched and Wiener Filters for Machine Vision. In Proc. of the IEEE Int. Conf. on Computer Vision, 1987.

[10] R. C. Bolles and R. A. Cain. Recognizing and Locating Partially Visible Objects: The Local-Feature-Focus Method. The Int. J. of Robotics Research, 1(3):57-82, 1982.

[11] R. A. Brooks. Model-Based Three-Dimensional Interpretations of Two-Dimensional Images. IEEE Trans. on PAMI, 5(2):140-149, 1983.

[12] E. Charniak and D. McDermott. Introduction to Artificial Intelligence. Addison-Wesley, 1985.

[13] R. T. Chin and C. R. Dyer. Model-Based Recognition in Robot Vision. ACM Computing Surveys, 18(1):67-108, 1986.

[14] P. R. Cohen and E. A. Feigenbaum (Eds.). The Handbook of Artificial Intelligence, Vol. III. William Kaufmann, 1982.

[15] R. O. Duda and P. E. Hart. Use of the Hough Transform to Detect Lines and Curves in Pictures. Communications of the ACM, 15:11-15, 1972.

[16] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. Wiley, 1973.

[17] P. J. Flynn and A. K. Jain. 3-D Object Recognition using Invariant Feature Indexing of Interpretation Tables. J. of Computer Vision, Graphics and Image Processing: Image Understanding, 55(2):119-129, 1992.

[18] D. Gavrila and F. Groen. 3-D Object Recognition from 2-D Images Using Geometric Hashing. Pattern Recognition Letters, 13(4):263-278, 1992.

[19] P. G. Gottschalk, J. L. Turney, and T. N. Mudge. Efficient Recognition of Partially Visible Objects using a Logarithmic Complexity Matching Technique. The Int. J. of Robotics Research, 8(6):110-140, 1989.

[20] W. E. L. Grimson. On the Recognition of Parameterized 2-D Objects. Int. J. of Computer Vision, 2(4):353-372, 1989.

[21] W. E. L. Grimson and D. P. Huttenlocher. On the Sensitivity of Geometric Hashing. In Proc. of the IEEE Int. Conf. on Computer Vision, pages 334-338, 1990.

[22] W. E. L. Grimson and D. P. Huttenlocher. On the Sensitivity of the Hough Transform for Object Recognition. IEEE Trans. on PAMI, 12(3):255-274, 1990.

[23] W. E. L. Grimson and T. Lozano-Pérez. Localizing Overlapping Parts by Searching the Interpretation Tree. IEEE Trans. on PAMI, 9(4):469-482, 1987.

[24] A. Gueziec and N. Ayache. New Developments on Geometric Hashing for Curve Matching. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 703-704, 1993.

[25] Y. C. Hecker and R. M. Bolle. Invariant Feature Matching in Parameter Space with Application to Line Features. Technical Report RC 16447, IBM T. J. Watson Research Center, 1991.

[26] D. Hillis. The Connection Machine. MIT Press, 1985.

[27] J. Hong and X. Tan. Recognize the Similarity between Shapes under Affine Transformation. In Proc. of the IEEE Int. Conf. on Computer Vision, pages 489-493, 1988.

[28] R. Hummel and I. Rigoutsos. Geometric Hashing as a Bayesian Maximum Likelihood Object Recognition Method. Int. J. of Computer Vision. Currently under review.

[29] R. Hummel and H. Wolfson. Affine Invariant Matching. In Proc. of the DARPA IU Workshop, pages 351-364, 1988.

[30] D. P. Huttenlocher and S. Ullman. Object Recognition using Alignment. In Proc. of the IEEE Int. Conf. on Computer Vision, pages 102-111, 1987.

[31] D. P. Huttenlocher and S. Ullman. Recognizing Solid Objects by Alignment with an Image. Int. J. of Computer Vision, 5(2):195-212, 1990.

[32] J. Illingworth and J. Kittler. A Survey of the Hough Transform. J. of Computer Vision, Graphics and Image Processing, 44:87-116, 1988.

[33] A. Kalvin, E. Schonberg, J. T. Schwartz, and M. Sharir. Two Dimensional Model Based Boundary Matching using Footprints. The Int. J. of Robotics Research, 5(4):38-55, 1986.

[34] E. Kishon. Use of Three Dimensional Curves in Computer Vision. PhD thesis, Computer Science Dept., Courant Institute of Mathematical Sciences, New York University, 1989.

[35] F. Klein. Elementary Mathematics from an Advanced Standpoint: Geometry. Macmillan, 1925.

[36] Y. Lamdan. Object Recognition by Geometric Hashing. PhD thesis, Computer Science Dept., Courant Institute of Mathematical Sciences, New York University, 1989.

[37] Y. Lamdan, J. T. Schwartz, and H. J. Wolfson. Object Recognition by Affine Invariant Matching. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 335-344, 1988.

[38] Y. Lamdan and H. J. Wolfson. On the Error Analysis of Geometric Hashing. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 22-27, 1991.

[39] S. Lang. Linear Algebra. Addison-Wesley, 1966.

[40] D. G. Lowe. Perceptual Organization and Visual Recognition. Kluwer, 1985.

[41] D. G. Lowe. The Viewpoint Consistency Constraint. Int. J. of Computer Vision, 1(1):57-72, 1987.

[42] D. G. Lowe. Three-Dimensional Object Recognition from Single Two-Dimensional Images. Artificial Intelligence, 31:355-395, 1987.

[43] I. Rigoutsos. Massively Parallel Bayesian Object Recognition. PhD thesis, Computer Science Dept., Courant Institute of Mathematical Sciences, New York University, 1992.

[44] I. Rigoutsos and R. Hummel. A Bayesian Approach to Model Matching with Geometric Hashing. J. of Computer Vision, Graphics and Image Processing: Image Understanding. Currently under review.

[45] I. Rigoutsos and R. Hummel. Robust Similarity Invariant Matching in the Presence of Noise. In Proc. of the 8th Israeli Conf. on Artificial Intelligence and Computer Vision, 1991.

[46] I. Rigoutsos and R. Hummel. Massively Parallel Model Matching: Geometric Hashing on the Connection Machine. IEEE Computer: Special Issue on Parallel Processing for Computer Vision and Image Understanding, 1992.

[47] I. Rigoutsos and R. A. Hummel. Implementation of Geometric Hashing on the Connection Machine. In IEEE Workshop on Directions in Automated CAD-Based Vision, 1991.

[48] T. Risse. The Hough Transform for Line Recognition: Complexity of Evidence Accumulation and Cluster Detection. J. of Computer Vision, Graphics and Image Processing, 46:327-345, 1989.

[49] J. T. Schwartz and M. Sharir. Identification of Partially Obscured Objects in Two Dimensions by Matching of Noisy 'Characteristic Curves'. The Int. J. of Robotics Research, 6(2):29-44, 1987.

[50] G. Stockman. Object Recognition and Localization via Pose Clustering. J. of Computer Vision, Graphics and Image Processing, 40:361-387, 1987.

[51] J. M. Tenenbaum and H. G. Barrow. A Paradigm for Integrating Image Segmentation and Interpretation. In Proc. of the Int. Conf. on Pattern Recognition, pages 504-513, 1976.

[52] D. W. Thompson and J. L. Mundy. Three-Dimensional Model Matching from an Unconstrained Viewpoint. In Proc. of the IEEE Int. Conf. on Robotics and Automation, pages 208-220, 1987.

[53] L. W. Tucker, C. R. Feynman, and D. M. Fritzsche. Object Recognition using the Connection Machine. In Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, pages 871-878, 1988.

[54] S. W. Zucker, R. Hummel, and A. Rosenfeld. An Application of Relaxation Labeling to Line and Curve Enhancement. IEEE Trans. on Computers, 26(4):394-403, 1977.
