
Underwater Systems Department
A.G. Allais
27/09/2006 – DOP/CM/SM/PRAO/06.224

Project Exocet/D
Deliverable N° 2D3
Report on image mosaic building

Diffusion:
P.M. Sarradin  DOP/CB/EEP/LEP
M. Perrier     DOP/CM/SM/PRAO

Confidential
Restricted
Public


Date: 27/09/2006
Reference: DOP/CM/SM/PRAO/06.224
Analytic N°: E010403A1
Contract N°:
Subject/Title:
Abstract:
Key-words:
Number of pages: 16
Number of figures:
Number of annex:

Project Exocet/D
Deliverable N° 2D3
Report on image mosaic building

Revisions
File name: 2D3.doc
Writer: A.G. Allais

Grade | Object   | Date     | Written by  | Checked by | Approved by
1.0   | Creation | 14/12/05 | A.G. Allais | M. Perrier | M. Perrier

THIS DOCUMENT, PROPERTY OF IFREMER, MAY NOT BE REPRODUCED OR COMMUNICATED WITHOUT ITS AUTHORIZATION



TABLE OF CONTENTS

1. INTRODUCTION
2. WHAT IS A GEO-REFERENCED VIDEO MOSAIC?
3. MOSAICING ALGORITHMS
   3.1. KLT algorithm
      3.1.1. Principle
      3.1.2. Algorithm description
         3.1.2.1. Detection of point features
         3.1.2.2. Tracking of features
         3.1.2.3. Global displacement computation by least square method
   3.2. RMR algorithm
      3.2.1. Principle
      3.2.2. Algorithm description
         3.2.2.1. Model of motion
         3.2.2.2. Robust estimation
   3.3. Metric conversion - camera self-calibration
      3.3.1. Extraction and matching of points
      3.3.2. Scene geometry
      3.3.3. Points validation
      3.3.4. Intrinsic parameters estimation
      3.3.5. Extrinsic parameters estimation
   3.4. Experiments
   3.5. Fusion with navigation data
4. MATISSE SOFTWARE®
   4.1. General architecture
   4.2. User interface
5. CONCLUSION
6. BIBLIOGRAPHY



1. INTRODUCTION


In this report, we address the issue of managing seabed video records. During at-sea trials, scientists record large amounts of video that they need in order to analyse the ocean floor. As storage capacity keeps growing, the number of video records acquired during at-sea trials keeps increasing, and scientists spend more and more time analysing them. The aim of video mosaicing is therefore to provide a tool that produces a map whose extent is far larger than the camera field of view, so that the scientist gets a global view of the scene, while at the same time reducing and compressing the stored video.

2. WHAT IS A GEO-REFERENCED VIDEO MOSAIC?

In order to simplify the scientists' exploitation of the numerous video DVDs resulting from at-sea trials, we have developed a tool that provides a larger view of the seabed than the restricted field of view of a video camera. The aim of video mosaicing is to build images whose extent is far larger than a single snapshot of a video recording. The resulting image represents a larger area of the seabed and is called a mosaic.

The principle used to build mosaics is simple. The acquired video stream can be seen as a succession of images that have a large part in common. The idea is to estimate the part that has been added from one image to the next; this new part of the current image is then added to and merged with the previous image. Every N images, a mosaic is completed and another one can begin. This step is performed by image processing techniques.

The other main issue of video mosaicing is to locate the mosaics on the seabed so that the scientists can combine them with other geo-referenced data such as bathymetry, samples, or physical and chemical data. This can be done in two ways. In the simplest case, the operator gives the position, heading and altitude of the first point, and the mosaic location is then computed by image processing. This is not the best approach, however, since image processing errors occur and accumulate through the whole process. To overcome this drawback, the second approach consists in merging navigation data with the image data.

The whole process is developed in the following part and is summarized in the sketch hereafter (Figure 1).



Figure 1: Illustration of video mosaicing. From a stream of images (Image1, Image2, Image3, ...), one mosaic is built from 3 images and located at (X0,Y0); then two mosaics are built from 3 images each, located at (X0,Y0) and (X1,Y1).

3. MOSAICING ALGORITHMS

Many image processing techniques allow us to compute the geometric relationship between two images. Hereafter, two different methods have been investigated and integrated into the MATISSE Software®, which has been developed at IFREMER. The first one relies on a feature tracking algorithm (KLT), while the second one, the RMR method, estimates the movement with a robust optical flow algorithm. Both algorithms estimate a displacement in pixels between two successive images. In order to give the mosaics a metric dimension, the displacement in pixels must be converted into meters. This step is performed using the camera parameters and the altitude provided by the navigation.

3.1. KLT algorithm

3.1.1. Principle

This algorithm is based upon research performed by Kanade, Lucas and Tomasi (KLT). It consists in detecting and tracking point features through an image stream [SHI94]. Points are selected if they meet a criterion which characterizes a locally textured area; they are then tracked through the sequence. Once a list of matched points is obtained from successive images, a global displacement is computed to register the successive images and build a mosaic.



3.1.2. Algorithm description

3.1.2.1. Detection of point features

According to the KLT algorithm, a feature is selected if it is easily "trackable" through an image stream. The features therefore consist of windows, several pixels on a side, that are locally textured. More precisely, a window is selected if its mean gradient is high enough and has no particular dominant direction.

Mathematically, we consider the gradient vector within a window of typically 7x7 pixels:

$$ g = \begin{bmatrix} g_x \\ g_y \end{bmatrix} $$

We note Z the following matrix:

$$ Z = \iint_{W} \begin{bmatrix} g_x^2 & g_x g_y \\ g_x g_y & g_y^2 \end{bmatrix} \, \omega \, dA $$

Where:
• g_x is the image intensity gradient along the x-axis,
• g_y is the image intensity gradient along the y-axis,
• W is the computation window,
• ω is a weight function,
• dA is the area element of the computation window.

This method relies on the fact that the eigenvalues of the Z matrix are directly linked to the texture of the area over which they are computed. To give the eigenvalues a simple interpretation: two small eigenvalues indicate a non-textured area, whereas one large and one small eigenvalue are characteristic of an area with a single dominant direction. In our case, only windows that are textured but have no particular direction must be selected, that is, we are interested in areas where both eigenvalues are large. So, within an image, small areas (7x7 pixels, for instance) are selected only if both eigenvalues are greater than a given threshold.
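As an illustration, the sketch below (Python with NumPy/SciPy, not part of the original report) applies this criterion: it computes the Z matrix over every 7x7 window and keeps the locations whose smaller eigenvalue exceeds a threshold (both eigenvalues are then above it). The window size, threshold and uniform weight are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def select_features(image, win=7, threshold=1e-2, max_features=200):
    """Select windows whose structure tensor Z has two large eigenvalues
    (KLT-style 'good features to track'). Illustrative single-scale sketch."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)                       # intensity gradients
    k = np.ones((win, win))                         # uniform weight omega = 1
    gxx = convolve2d(gx * gx, k, mode="same")       # window sums of g_x^2
    gyy = convolve2d(gy * gy, k, mode="same")       # window sums of g_y^2
    gxy = convolve2d(gx * gy, k, mode="same")       # window sums of g_x*g_y
    # Smallest eigenvalue of Z = [[gxx, gxy], [gxy, gyy]] at every pixel;
    # requiring lambda_min > threshold enforces "both eigenvalues large".
    tr, det = gxx + gyy, gxx * gyy - gxy * gxy
    lambda_min = tr / 2 - np.sqrt(np.maximum((tr / 2) ** 2 - det, 0.0))
    ys, xs = np.where(lambda_min > threshold)
    order = np.argsort(lambda_min[ys, xs])[::-1][:max_features]
    return list(zip(xs[order], ys[order]))          # (x, y) feature positions
```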

3.1.2.2. Tracking of features

The features that have been selected in the first part of the algorithm are then tracked through the video stream or the image sequence. The displacement between two successive images is assumed to be quite small since the camera moves slowly. Hence, nearly all the points of an image I are also in the next image J, and they are linked by a translation vector d.

Let us now give the relationship linking two images at two successive moments.

Let I(x - ξ, y - η, t) be the image at time t and I(x, y, t + τ) the image at time t + τ.

Put J(x) = I(x, y, t + τ) and I(x - d) = I(x - ξ, y - η, t), where d = (ξ, η) is the displacement vector of the point x = (x, y) between the two time steps t and t + τ.


We can note that J(x) = I(x - d) + n(x), where n(x) represents the noise.

For each window W, d is obtained by minimizing

$$ \iint_{W} n(x)^2 \, \omega \, dA $$

This leads to solving the following equation:

$$ G d = e $$

with:

$$ G = \iint_{W} g\, g^{T} \, \omega \, dA, \qquad e = \iint_{W} (I - J)\, g \, \omega \, dA $$

This equation is solved for each selected window. So, for each window selected in the first image, we can compute the local displacement between the first image and the second one.

The point detection and tracking steps of the KLT algorithm are illustrated in Figure 2.

Figure 2: Detection and tracking of points in a coral reef sequence
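For reference, a minimal sketch of this local solve is given below (Python/NumPy, not from the report). It builds G and e over a single window and solves G d = e in one step, whereas a practical tracker iterates and usually works on image pyramids; the window size and uniform weighting are assumptions.

```python
import numpy as np

def window_displacement(I, J, gx, gy, x0, y0, win=7):
    """Solve G d = e for one window centred at (x0, y0).
    I, J: consecutive grayscale float images; gx, gy: intensity gradients of I.
    Single-step sketch (no iteration, no pyramid), uniform weight omega = 1."""
    h = win // 2
    sl = (slice(y0 - h, y0 + h + 1), slice(x0 - h, x0 + h + 1))
    gxw, gyw = gx[sl].ravel(), gy[sl].ravel()
    diff = (I[sl] - J[sl]).ravel()                       # (I - J) over the window
    # G = sum of g g^T, e = sum of (I - J) g over the window.
    # G is invertible for well-textured windows, which is exactly what the
    # feature selection step (two large eigenvalues) guarantees.
    G = np.array([[np.sum(gxw * gxw), np.sum(gxw * gyw)],
                  [np.sum(gxw * gyw), np.sum(gyw * gyw)]])
    e = np.array([np.sum(diff * gxw), np.sum(diff * gyw)])
    return np.linalg.solve(G, e)                         # displacement (dx, dy) in pixels
```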

3.1.2.3. Global displacement computation by least square method

Once the points are matched between two successive images, a global displacement is computed in order to register the images.

The displacement is modelled as a 4-parameter rigid global 2D transformation, that is, a transformation composed of a translation, a rotation and a scale factor.

This global displacement is computed by an iterative least square method; each iteration refines the result. In addition, the method is completed by an acceptance criterion used to validate the matches and discard the false matches, in order to make the computation more robust.
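A possible sketch of this registration step is shown below (Python with NumPy and OpenCV, not part of the report). OpenCV's estimateAffinePartial2D fits exactly a 4-parameter transform (translation, rotation, uniform scale), and its RANSAC option plays the role of the acceptance criterion that discards false matches; the reprojection threshold is an illustrative value.

```python
import numpy as np
import cv2

def global_displacement(pts_prev, pts_next):
    """Fit a 4-parameter (translation, rotation, scale) 2D transform to matched
    points while rejecting false matches. Stand-in for the report's iterative
    least-squares with an acceptance criterion."""
    src = np.asarray(pts_prev, dtype=np.float32).reshape(-1, 1, 2)
    dst = np.asarray(pts_next, dtype=np.float32).reshape(-1, 1, 2)
    M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC,
                                             ransacReprojThreshold=2.0)
    # M is a 2x3 matrix [[s*cos, -s*sin, tx], [s*sin, s*cos, ty]] (None on failure)
    scale = np.hypot(M[0, 0], M[0, 1])                # uniform scale factor
    angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))  # rotation in degrees
    tx, ty = M[0, 2], M[1, 2]                         # translation in pixels
    return scale, angle, (tx, ty), inliers.ravel().astype(bool)
```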

3.2. RMR algorithm

3.2.1. Principle

The second method we have investigated to build mosaics is the Robust Multi-Resolution (RMR) method, which is based upon the estimation of the optical flow [ODO95]. The advantage of this method is that the motion is estimated from the whole image.


In the RMR algorithm, the first stage consists in choosing a motion model. The aim is then to estimate the parameters of this model using the classic robust estimation methods of the image and signal processing domain. This step is combined with a coarse-to-fine estimation using multi-resolution levels of images.

3.2.2. Algorithm description

3.2.2.1. Model of motion

In the first step of the algorithm, a motion model is chosen. In the RMR algorithm, we consider the class of 2D polynomial motion models and deal only with the 2D affine model, which is not too complex yet representative of a large part of motion transformations:

$$ \begin{cases} u(X_i) = a_1 + a_2 x_i + a_3 y_i \\ v(X_i) = a_4 + a_5 x_i + a_6 y_i \end{cases} $$

With matrix notation, this can be stated as:

$$ V(X_i) = \begin{bmatrix} u(X_i) \\ v(X_i) \end{bmatrix} = B(X_i)\, A $$

where

$$ A^{T} = (a_1 \; a_2 \; a_3 \; a_4 \; a_5 \; a_6) \qquad \text{and} \qquad B_i = B(X_i) = \begin{bmatrix} 1 & x_i & y_i & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & x_i & y_i \end{bmatrix} $$

For each point X_i, one can write the flow constraint equation linking the spatial and temporal intensity gradients:

$$ V(X_i) \cdot \nabla I(X_i) + I_t(X_i) = 0 $$

that is,

$$ I_x(X_i)\, u(X_i) + I_y(X_i)\, v(X_i) + I_t(X_i) = 0 $$

where ∇I(X_i) is the spatial gradient vector of the intensity, I_t(X_i) is the partial derivative of the intensity with respect to time, and V(X_i) is the flow vector field.

3.2.2.2. Robust estimation

The goal of robust estimation is to find the parameter vector Θ which best fits the model M(X_i, Θ) to the observations y_i. In our case, Θ = (A^T, 0)^T.

The estimation of the parameter Θ is achieved by a maximum likelihood estimator:

$$ \hat{\Theta} = \arg\min_{\Theta} \sum_i \rho\big( y_i - M(X_i, \Theta) \big) $$

where ρ is called the M-estimator and corresponds to the maximum likelihood estimation.


This estimation is performed at every scale of a multi-resolution image pyramid, so that the estimation is refined at each scale.
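The sketch below (Python/NumPy, not from the report) illustrates one way such a robust estimation can be carried out at a single scale, using the flow constraint of paragraph 3.2.2.1 and a Tukey biweight M-estimator solved by iteratively re-weighted least squares. It is a simplified assumption-laden version of the RMR method, which additionally works coarse-to-fine over an image pyramid.

```python
import numpy as np

def robust_affine_flow(Ix, Iy, It, n_iter=10, c=4.685):
    """Estimate the 6 affine motion parameters A from the flow constraint
    I_x u + I_y v + I_t = 0 with a Tukey biweight M-estimator, solved by
    iteratively re-weighted least squares. Single-scale sketch only."""
    h, w = Ix.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x, y = xs.ravel().astype(float), ys.ravel().astype(float)
    ix, iy, it = Ix.ravel(), Iy.ravel(), It.ravel()
    # Each pixel gives one equation: [ix, ix*x, ix*y, iy, iy*x, iy*y] . A = -it
    D = np.column_stack([ix, ix * x, ix * y, iy, iy * x, iy * y])
    b = -it
    wts = np.ones(len(b))
    A = np.zeros(6)
    for _ in range(n_iter):
        A, *_ = np.linalg.lstsq(D * wts[:, None], b * wts, rcond=None)  # weighted LS
        r = D @ A - b                                   # residuals
        s = 1.4826 * np.median(np.abs(r)) + 1e-12       # robust scale (MAD)
        u = np.clip(r / (c * s), -1.0, 1.0)
        wts = (1.0 - u ** 2) ** 2                       # Tukey biweight weights
    return A   # (a1..a6): u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y
```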

3.3. Metric conversion - camera self-calibration

The displacement calculated by image processing is given in pixels. The aim of video mosaicing is to give the mosaics a metric dimension in order to allow quantitative measurements within the images. Thus, the displacement and the image size in pixels must be converted into meters (see Figure 3).

Figure 3: Link between one pixel and its real metric size on the seabed (camera at altitude h, focal length f; one pixel on the image plane corresponds to a length l on the seabed plane)

Thus we have the relationship:

$$ l\,(\mathrm{m \cdot pix^{-1}}) = \frac{SizeOfAPixel\,(\mathrm{m \cdot pix^{-1}})}{f\,(\mathrm{m})} \cdot h\,(\mathrm{m}) $$

Given that

$$ SizeOfAPixel\,(\mathrm{m \cdot pix^{-1}}) = \frac{f\,(\mathrm{m})}{\alpha\,(\mathrm{pix})} $$

we deduce that:

$$ l\,(\mathrm{m \cdot pix^{-1}}) = \frac{h\,(\mathrm{m})}{\alpha\,(\mathrm{pix})} $$

where α is an intrinsic parameter of the camera which has to be estimated and h is the altitude of the camera.
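In code, the conversion is straightforward once α and the altitude are known. The sketch below (Python, not from the report) assumes α comes from the self-calibration described next and the altitude from the navigation data.

```python
def pixels_to_meters(dx_pix, dy_pix, altitude_m, alpha_pix):
    """Convert a displacement measured in pixels into meters on the seabed.
    alpha_pix: camera intrinsic (focal length expressed in pixels), assumed
    known from self-calibration; altitude_m: camera altitude from navigation."""
    meters_per_pixel = altitude_m / alpha_pix      # l = h / alpha
    return dx_pix * meters_per_pixel, dy_pix * meters_per_pixel
```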

Since it is not always possible to deploy a calibration pattern on the seabed, an algorithm for camera self-calibration has been investigated [PES03].

The self-calibration algorithm allows us to determine the intrinsic and extrinsic parameters of a vertical camera mounted on an underwater vehicle. This method needs only a sequence of a few images of the seabed and is based upon the determination of the epipolar geometry between two successive images of the sequence.

The diagram below presents all the steps of the self-calibration method:

Figure 4: Steps of the camera self-calibration method (images of the observed scene, point matching, point matches validation, scene geometry estimation, intrinsic parameters estimation, extrinsic parameters estimation)

3.3.1. Extraction and matching of points

In our application, image sequences are composed of small displacements between two successive images. The extraction and matching of points are carried out by the KLT algorithm detailed in paragraph 3.1. This algorithm extracts features in the first image and tracks them across the sequence.

Despite the complexity of underwater images, a great number of features are positively tracked through the sequence. Moreover, the points are well distributed in the image.

3.3.2. Scene geometry

The scene geometry can be represented algebraically by the fundamental matrix F. The fundamental matrix links the coordinates q_i and q'_i of a same 3D point Q_i in two images:

$$ q_i'^{T} F \, q_i = 0 \qquad \forall i \in [1, n] $$

The estimation of the fundamental matrix is based on Hartley's normalized 8-point algorithm and uses the points detected and tracked by the KLT algorithm.

This algorithm requires a set of at least eight matched points q_i ↔ q'_i. In order to increase the estimation accuracy, a set of about 30 to 50 points can be used, but using more points also leads to the presence of false matches, which perturb the estimation of F. To compensate, a criterion has been integrated to validate the points and remove the false matches.

3.3.3. Points validation

The validation of features is carried out by an algorithm based on RANSAC (RANdom SAmple Consensus). The selection of point matches is based on how accurately they satisfy the equation representing the scene geometry (see 3.3.2). This equation is evaluated for all the matched points while estimating the fundamental matrix, which allows the matching error to be determined. A list of good features is then constituted to estimate the best fundamental matrix. As a result, only the "best" matches are kept to estimate the final fundamental matrix.
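As an illustration, the sketch below (Python with OpenCV, not from the report) estimates F from the KLT matches with a RANSAC-validated 8-point scheme and returns only the validated matches; the distance threshold and confidence values are illustrative.

```python
import numpy as np
import cv2

def estimate_scene_geometry(pts1, pts2):
    """Estimate the fundamental matrix from KLT point matches and keep only
    the matches validated by RANSAC. OpenCV's estimator combines the
    normalized 8-point algorithm with RANSAC outlier rejection."""
    p1 = np.asarray(pts1, dtype=np.float32)
    p2 = np.asarray(pts2, dtype=np.float32)
    F, mask = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 1.0, 0.99)
    inliers = mask.ravel().astype(bool)          # the validated ("best") matches
    return F, p1[inliers], p2[inliers]
```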



3.3.4. Intrinsic parameters estimation

There are five intrinsic parameters: the focal distance f, the scale factors k_u and k_v along the image axes u and v, and the coordinates u_0 and v_0 of the principal point of the image. The intrinsic parameters estimation is carried out using the Mendonça and Cipolla algorithm [4] applied to a set of five images taken at given intervals from a dense sequence. This algorithm is based on the minimization of a cost function which takes the intrinsic parameters as arguments and the fundamental matrices as parameters. The cost function is:

$$ C(K) = \sum_{i=1}^{n} \sum_{j>i}^{n} w_{ij} \, \frac{\sigma^{1}_{ij} - \sigma^{2}_{ij}}{\sigma^{2}_{ij}} $$

With:
• K = (αu, αv, u0, v0), where αu and αv correspond respectively to the products of the scale factors along the axes u and v by the focal length, and u0 and v0 to the coordinates of the intersection of the optical axis with the image plane,
• w_ij is the degree of confidence of the estimation of the fundamental matrix F_ij,
• σ1_ij > σ2_ij are the non-zero singular values of the essential matrix E_ij.
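A sketch of evaluating this cost for a candidate set of intrinsic parameters is given below (Python/NumPy, not from the report). It assumes the fundamental matrices F_ij and their confidence weights w_ij have already been estimated; in practice this function would be wrapped in a numerical minimizer over K.

```python
import numpy as np

def self_calibration_cost(K, F_list, w_list):
    """Mendonca-Cipolla style cost: for each image pair, form the essential
    matrix E = K^T F K and penalise the relative gap between its two non-zero
    singular values (they are equal for the true intrinsic parameters).
    K: 3x3 intrinsic matrix; F_list: fundamental matrices; w_list: weights."""
    cost = 0.0
    for F, w in zip(F_list, w_list):
        E = K.T @ F @ K
        s = np.linalg.svd(E, compute_uv=False)   # singular values, descending
        cost += w * (s[0] - s[1]) / s[1]
    return cost
```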

3.3.5. Extrinsic parameters estimation

The extrinsic parameters are composed of the rotations and translations of the camera around the three axes (twelve parameters).

The extrinsic parameters estimation represents the last step of the self-calibration algorithm. It is a function of the intrinsic parameters and of the fundamental matrix F:

$$ E = [t]_{x} R = K^{T} F K $$

With:
• [t]_x: the antisymmetric matrix associated with the translation vector t,
• R: the rotation matrix,
• K: the intrinsic parameters matrix.

The algorithm first determines the translation t. Afterwards, the rotation matrix is estimated by minimizing:

$$ \sum_{i=1}^{3} \left\| E_i - R^{T} [t]_{x,i} \right\|^{2} $$

where E_i and [t]_{x,i} are the i-th row vectors of the matrices E and [t]_x.
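For illustration, the sketch below (Python with NumPy and OpenCV, not part of the report) forms the essential matrix from F and K and decomposes it into a rotation and an up-to-scale translation with OpenCV's decomposeEssentialMat, used here as a stand-in for the row-wise minimization above.

```python
import numpy as np
import cv2

def extrinsic_parameters(F, K):
    """Recover the camera rotation and (up-to-scale) translation between two
    views from the fundamental matrix F and the intrinsic matrix K.
    Sketch: E = K^T F K is decomposed with OpenCV; the correct (R, t) pair is
    normally chosen afterwards with a cheirality test (points must lie in
    front of both cameras)."""
    E = K.T @ F @ K
    R1, R2, t = cv2.decomposeEssentialMat(E)   # two rotation hypotheses, unit-norm t
    return (R1, R2), t
```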

3.4. Experiments

A statistical comparative study of this camera self-calibration method is presented in [PES03]. Studies with simulated data have shown that some trajectories of the underwater vehicle are better suited than others to the estimation of the intrinsic parameters of the camera.


The results obtained with real data show that a rotation around the optical axis, combined with roll and pitch angles, allows all the intrinsic parameters of the camera to be estimated concurrently with good accuracy. The table below presents the errors, expressed as a percentage of the parameter value, estimated for this type of movement.

                    εαu %    εαv %    εu0 %    εv0 %
θz + (θx, θy)       2.06 %   2.06 %   1.65 %   1.19 %

Table 1: Errors in intrinsic parameters estimation
(θz: rotation around the optical axis, θx: roll angle, θy: pitch angle)

3.5. Fusion with navigation data

Once the displacement is computed by image processing and the real geographic size is provided thanks to the altitude measurement and the camera parameters, one could think that this is enough to obtain mosaics correctly located on the seabed. In fact, image processing techniques introduce errors that accumulate during the mosaicing process. In order to reduce these errors and to obtain an accurate geo-referencing, navigation data can be used and fused with the displacement given by image processing. That is why, to correct the displacements calculated through an image sequence, we have introduced a Kalman filter [WEL94], which is well suited to the problem of estimating the variables of a dynamic system (one that varies with time). In dynamic systems, the system variables are called "state variables".

The question addressed by the Kalman filter is: "Given our knowledge of the behaviour of the system, and given our measurements, what is the best estimate of the state variables?"

Mathematically, the aim of the Kalman filter is to estimate an a posteriori state vector x̂_k from the a priori estimation x̂_k^- and a weighted difference between the measurement z_k at time k and the prediction H_k x̂_k^-, where the subscript k denotes the time step. Thus, at time k, the measurement z_k is known, and the a posteriori estimation x̂_k and the a priori estimation x̂_{k+1}^- must be calculated.

The two equations hereafter link the state vector and the measurement vector:

$$ x_{k+1} = A_k x_k + B u_k + w_k $$
$$ z_k = H_k x_k + v_k $$

This yields the two sets of equations of the classic formulation of Kalman filtering. The first ones are the time update equations, which predict the state vector x̂_{k+1}^-, while the second ones are the measurement update equations, which correct the state vector x̂_k.



Time update equations ("predict"):

$$ \hat{x}_{k+1}^{-} = A_k \hat{x}_k + B u_k $$
$$ P_{k+1}^{-} = A_k P_k A_k^{T} + Q_k $$

Measurement update equations ("correct"):

$$ K_k = P_k^{-} H_k^{T} \left( H_k P_k^{-} H_k^{T} + R_k \right)^{-1} $$
$$ \hat{x}_k = \hat{x}_k^{-} + K_k \left( z_k - H_k \hat{x}_k^{-} \right) $$
$$ P_k = \left( I - K_k H_k \right) P_k^{-} $$
In these formulae, the variables stand for:

• x̂_k^-: a priori state vector at time k (given the process before step k)
• x̂_k: a posteriori state vector at time k (given the measurement at time k)
• z_k: measurement vector at time k
• u_k: control input vector
• A_k: matrix linking the states at times k and k+1
• B: matrix linking the control input to the state vector
• K_k: Kalman gain matrix
• P_k^-: covariance matrix of the prediction error
• P_k: covariance matrix of the a posteriori error
• H_k: matrix linking the state to the measurement
• w_k: process noise, assumed to be white and Gaussian
• v_k: measurement noise, assumed to be white and Gaussian
• R_k: covariance matrix of the measurement noise
• Q_k: covariance matrix of the process noise

In the case of video mosaicing, the state variables are the position (X_utm and Y_utm) and a term to correct the pixel size. The measurement vector consists of X_utm and Y_utm given by the navigation system.
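A minimal sketch of one predict/correct cycle is given below (Python/NumPy, not from the report). The three-component state and the measurement matrix shown are illustrative assumptions consistent with the description above, not the exact matrices used in MATISSE.

```python
import numpy as np

def kalman_step(x, P, z, A, B, u, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter.
    x, P: previous a posteriori state and covariance; z: measurement;
    A, B, H, Q, R: state, control, measurement and noise covariance matrices."""
    # Time update ("predict")
    x_prior = A @ x + B @ u
    P_prior = A @ P @ A.T + Q
    # Measurement update ("correct")
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x)) - K @ H) @ P_prior
    return x_post, P_post

# Illustrative setup: state = [X_utm, Y_utm, pixel-size correction],
# measurement = [X_utm, Y_utm] from the navigation system.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
```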

The experiment we have carried out consists in performing a route with the underwater vehicle and then performing the same route in the opposite direction. We can see in Figure 5 that the mosaic drifts if navigation is not used in the algorithm, whereas when dead-reckoning navigation is used in the Kalman filter, the mosaic drift is well corrected.


Figure 5: Mosaics obtained without using navigation (a) and using Kalman filtering (b)

4. MATISSE SOFTWARE®

4.1. General architecture

The MATISSE Software® [ALL04] has been developed to integrate all the algorithms of geo-referenced mosaicing. It is flexible and can be used with many underwater vehicles, as long as they are equipped with a down-facing camera and a continuous navigation system (dead-reckoning navigation).

In Figure 6, the diagram represents the conditions of use of MATISSE Software® with a ROV. The video stream and the navigation data are transferred up to the surface via the umbilical tether. They are processed in-line to produce geo-referenced mosaics. An option consists of recording the video stream on a video DVD and the navigation data as messages on a CD. Playing back video and navigation data together makes off-line mosaic building possible.

Figure 6: MATISSE Software® used with a ROV (the down-looking camera and the navigation data feed MATISSE at the surface for the real-time creation of geo-referenced video mosaics)


4.2. User interface

MATISSE Software® provides a user-friendly interface, which is presented in Figure 7. The main window (black background) allows the user to control and check the mosaic processing. On the left, the MATISSE data architecture is displayed. In the bottom left-hand corner, the video stream is visualized. The upper part of the interface is dedicated to specific menus and predefined sets of parameters for mosaic creation.

The MATISSE outputs consist of geo-referenced images (tiff images and tfw geo-referencing files) and of network messages sent when a mosaic is created. These messages can be used, for example, by a GIS in order to integrate the geo-referenced mosaics on-line with other geo-referenced data in a dedicated environment.
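As an illustration of the tfw output, the sketch below (Python, not from the report) writes the standard six-line world file for a mosaic whose pixel size and upper-left corner position are known; the file name and values are purely illustrative.

```python
def write_tfw(path, pixel_size_m, x_upper_left, y_upper_left):
    """Write a .tfw world file (six lines) geo-referencing a mosaic tiff:
    pixel size along x, two rotation terms, negative pixel size along y,
    and the easting/northing of the centre of the upper-left pixel."""
    lines = [
        f"{pixel_size_m:.6f}",    # pixel size in the x direction (metres)
        "0.000000",               # rotation term
        "0.000000",               # rotation term
        f"{-pixel_size_m:.6f}",   # negative pixel size in the y direction
        f"{x_upper_left:.3f}",    # x (easting) of the upper-left pixel centre
        f"{y_upper_left:.3f}",    # y (northing) of the upper-left pixel centre
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# Example: a mosaic with 5 mm pixels whose upper-left corner is at (650000.0, 4800000.0)
write_tfw("mosaic_0001.tfw", 0.005, 650000.0, 4800000.0)
```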

Figure 7: MATISSE Software® interface

5. CONCLUSION

In this report, we have detailed several methods to build geo-referenced mosaics. This has resulted in the development of a user-friendly software which is used at IFREMER with the ROV victor6000 and has been tested with other underwater vehicles.



6. BIBLIOGRAPHY

[ALL04] Allais, A.G., Borgetto, M., Opderbecke, J., Pessel, N., Rigaud, V., "Seabed video mosaicking with MATISSE: a technical overview and cruise results", Proc. of the 14th International Offshore and Polar Engineering Conference, ISOPE-2004, vol. 2, pp. 417-421, Toulon, France, May 23-28, 2004.

[ODO95] Odobez, J.M., Bouthémy, P., "Robust Multiresolution Estimation of Parametric Motion Models", Journal of Visual Communication and Image Representation, Vol. 6, No. 4, pp. 348-365, Dec. 1995.

[PES03] Pessel, N., Opderbecke, J., Aldon, M.J., "An Experimental Study of a Robust Self-Calibration Method for a Single Camera", Proc. of the 3rd International Symposium on Image and Signal Processing and Analysis, ISPA 2003, sponsored by IEEE and EURASIP, Rome, Italy, September 18-20, 2003.

[SHI94] Shi, J., Tomasi, C., "Good Features to Track", Proc. IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Seattle, 1994.

[WEL94] Welch, G., Bishop, G., "An Introduction to the Kalman Filter", UNC-CH Computer Science Technical Report 95-041, 1995.
