
Underwater Systems Department
A.G. Allais
27/09/2006 – DOP/CM/SM/PRAO/06.224

Project Exocet/D
Deliverable N° 2D3
Report on image mosaic building

Diffusion:
P.M. Sarradin DOP/CB/EEP/LEP
M. Perrier DOP/CM/SM/PRAO

Confidential
Restricted
Public


Date: 27/09/2006
Reference: DOP/CM/SM/PRAO/06.224
Analytic N°: E010403A1
Contract N°:
Subject/Title:
Abstract:
Key-words:
Number of pages: 16
Number of figures:
Number of annex:

Project Exocet/D
Deliverable N° 2D3
Report on image mosaic building

Revisions
File name: 2D3.doc
Writer: A.G. Allais

Grade / Object / Date / Written by / Checked by / Approved by
1.0 / Creation / 14/12/05 / A.G. Allais / M. Perrier / M. Perrier

This document, property of IFREMER, may not be reproduced or communicated without its authorization.



TABLE OF CONTENTS

1. INTRODUCTION 4
2. WHAT IS A GEO-REFERENCED VIDEO MOSAIC? 4
3. MOSAICING ALGORITHMS 5
3.1. KLT algorithm 5
3.1.1. Principle 5
3.1.2. Algorithm description 6
3.1.2.1. Detection of point features 6
3.1.2.2. Tracking of features 6
3.1.2.3. Global displacement computation by least squares method 7
3.2. RMR algorithm 7
3.2.1. Principle 7
3.2.2. Algorithm description 8
3.2.2.1. Model of motion 8
3.2.2.2. Robust estimation 8
3.3. Metric conversion - camera self-calibration 9
3.3.1. Extraction and matching of points 10
3.3.2. Scene geometry 10
3.3.3. Points validation 10
3.3.4. Intrinsic parameters estimation 11
3.3.5. Extrinsic parameters estimation 11
3.4. Experiments 11
3.5. Fusion with navigation data 12
4. MATISSE SOFTWARE® 14
4.1. General architecture 14
4.2. User interface 15
5. CONCLUSION 15
6. BIBLIOGRAPHY 16



1. INTRODUCTION

In this report, we address the issue of managing seabed video records. During at-sea trials, a large amount of video is recorded by scientists who need it to analyze the ocean floor. As storage capacity increases, the number of video records grows with every trial, and scientists spend more and more time analyzing them. The aim of video mosaicing is therefore to provide a tool that builds a map whose extent is far larger than the camera field of view, giving the scientist a global view of the scene while at the same time reducing and compressing the video storage requirements.

2. WHAT IS A GEO-REFERENCED VIDEO MOSAIC?

In order to simplify the scientists' exploitation of the numerous video DVDs resulting from at-sea trials, we have developed a tool that provides a larger view of the seabed than the restricted field of view of a video camera. The aim of video mosaicing is to build images whose extent is far larger than a single snapshot of a video recording. The resulting image represents a larger area of the seabed and is called a mosaic.

The principle used to build mosaics is quite simple. The acquired video stream can be seen as a succession of images that have a large part in common. The idea is to estimate the part that has been added from one image to the next. The new part of the current image is added and merged with the previous image. Every N images, a mosaic is completed and another one can begin. This step is performed by image processing techniques.

The other main issue of video mosaicing is to locate the mosaics on the seabed so that scientists can combine them with other geo-referenced data such as bathymetry, samples, or physical and chemical data. This can be done in two ways. In the simplest case, the operator gives the position, heading and altitude of the first point, and the mosaic location is then calculated by image processing alone. This is not the best approach, since errors due to image processing accumulate through the whole process. To overcome this drawback, the other way consists in merging navigation data with the image data.

The whole process is developed in the following part and summarized in the sketch hereafter (Figure 1).



Figure 1: Illustration of video mosaicing (Image1, Image2, Image3, …; one mosaic built from 3 images, located at (X0, Y0); two mosaics built from 3 images each, located at (X0, Y0) and (X1, Y1))

3. MOSAICING ALGORITHMS

Many image processing techniques allow us to calculate the geometric relationship between two images. Hereafter, two different methods have been investigated and integrated into the MATISSE Software®, which has been developed at IFREMER. The first one relies on a feature tracking algorithm (KLT), while the second one, the RMR method, estimates the movement with a robust optical flow algorithm. Both algorithms estimate a displacement in pixels between two successive images. In order to provide the mosaics with a metric dimension, the displacement in pixels needs to be converted into meters. This step is performed using the camera parameters and the altitude provided by navigation.

3.1. KLT algorithm

3.1.1. Principle

This algorithm is based on research by Kanade, Lucas and Tomasi (KLT). It consists in detecting and tracking point features through an image stream [SHI94]. The points are selected if they meet a criterion which characterizes a locally textured area. Then, the points are tracked through the sequence. When a list of matched points is obtained from successive images, a global displacement is computed to register the successive images and to build a mosaic.



3.1.2. Algorithm description

3.1.2.1. Detection of point features

According to the KLT algorithm, a feature is selected if it is easily "trackable" through an image stream. So, the features consist of windows of a few pixels per side which are locally textured. More precisely, a window is selected if its mean gradient is high enough and has no particular direction.

Mathematically speaking, we consider the gradient vector within a window of typically 7x7 pixels:

$$\mathbf{g} = \begin{bmatrix} g_x \\ g_y \end{bmatrix}$$

We note Z the following matrix:

$$Z = \iint_W \begin{bmatrix} g_x^2 & g_x g_y \\ g_x g_y & g_y^2 \end{bmatrix} \, \omega \, dA$$

Where:
• $g_x$ is the image intensity gradient along the x-axis,
• $g_y$ is the image intensity gradient along the y-axis,
• $W$ is the computation window,
• $\omega$ is a weighting function,
• $dA$ is the area element of the computation window.

This method relies on the fact that the eigenvalues of the Z matrix are directly linked to the texture of the area over which they are computed.

To give a simple interpretation of the eigenvalues: two small eigenvalues indicate a non-textured area, whereas one high and one small eigenvalue are characteristic of an area with one dominant direction. In our case, only windows that are textured but have no particular direction must be selected, that is to say we are interested in areas having two high eigenvalues. So, within an image, small areas (7x7 pixels, for instance) are selected only if both eigenvalues are greater than a given threshold.
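For illustration, this selection criterion can be sketched in a few lines of Python/NumPy. This is a hypothetical reconstruction, not the MATISSE implementation; the window size, the threshold value and the uniform weight ω are assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def select_klt_features(img, win=7, threshold=1e-2, max_features=200):
    """Return (row, col) positions whose Z matrix has two large eigenvalues."""
    gy, gx = np.gradient(img.astype(float))        # intensity gradients along y and x
    # Entries of Z averaged over a win x win window (uniform weight omega,
    # proportional to the integral over W in the text)
    zxx = uniform_filter(gx * gx, size=win)
    zxy = uniform_filter(gx * gy, size=win)
    zyy = uniform_filter(gy * gy, size=win)
    # Smallest eigenvalue of the symmetric 2x2 matrix [[zxx, zxy], [zxy, zyy]]
    trace = zxx + zyy
    det = zxx * zyy - zxy * zxy
    lam_min = 0.5 * (trace - np.sqrt(np.maximum(trace * trace - 4.0 * det, 0.0)))
    # Both eigenvalues large => textured area with no dominant direction,
    # so it is enough to threshold the smaller one
    rows, cols = np.nonzero(lam_min > threshold)
    order = np.argsort(lam_min[rows, cols])[::-1][:max_features]
    return list(zip(rows[order], cols[order]))
```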

3.1.2.2. Tracking of features

The features that have been selected in the first part of the algorithm are then tracked through the video stream or the image sequence. The displacement between two successive images is assumed to be quite small since the camera moves slowly. So, nearly all the points of an image I are also present in the next image J, and they are linked by a translation vector d.

Now, let us give the relationship linking two images at two time instants.

Let $I(x - \xi, y - \eta, t)$ be the image at time $t$ and $I(x, y, t + \tau)$ the image at time $t + \tau$.

Put $J(\mathbf{x}) = I(x, y, t + \tau)$ and $I(\mathbf{x} - \mathbf{d}) = I(x - \xi, y - \eta, t)$, where $\mathbf{d} = (\xi, \eta)$ is the displacement vector of the point $\mathbf{x} = (x, y)$ between the two time steps $t$ and $t + \tau$.

We can note that $J(\mathbf{x}) = I(\mathbf{x} - \mathbf{d}) + n(\mathbf{x})$, where $n(\mathbf{x})$ represents the noise.

For each window $W$, $\mathbf{d}$ is obtained by minimizing $\iint_W n(\mathbf{x})^2 \, \omega \, dA = \iint_W \left[ J(\mathbf{x}) - I(\mathbf{x} - \mathbf{d}) \right]^2 \omega \, dA$.

This leads to solving the following equation:

$$G \, \mathbf{d} = \mathbf{e}$$

with:

$$G = \iint_W \mathbf{g} \, \mathbf{g}^T \, \omega \, dA, \qquad \mathbf{e} = \iint_W (I - J) \, \mathbf{g} \, \omega \, dA$$

This equation is solved for each selected window. So, for each window selected in the first image, we can compute the local displacement between the first image and the second one.

The steps of point detection and tracking of the KLT algorithm are illustrated in Figure 2.

Figure 2: Detection and tracking of points in a coral reef sequence
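A hedged sketch of the detection and tracking steps, using OpenCV's implementations of the Shi-Tomasi detector and of the pyramidal Lucas-Kanade tracker, could look as follows; the parameter values are illustrative and not those used in MATISSE.

```python
import cv2

def track_features(img_prev, img_next, max_corners=200):
    """img_prev, img_next: 8-bit greyscale frames of the sequence."""
    # Feature detection (Shi-Tomasi criterion, as in section 3.1.2.1)
    pts_prev = cv2.goodFeaturesToTrack(img_prev, maxCorners=max_corners,
                                       qualityLevel=0.01, minDistance=7)
    # Pyramidal Lucas-Kanade tracking of those features into the next frame
    pts_next, status, _err = cv2.calcOpticalFlowPyrLK(img_prev, img_next,
                                                      pts_prev, None)
    ok = status.ravel() == 1
    return pts_prev[ok].reshape(-1, 2), pts_next[ok].reshape(-1, 2)
```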

3.1.2.3. Global displacement computation by least squares method

When the points are matched between two successive images, a global displacement is computed in order to register the images.

The displacement is modelled as a 4-parameter rigid global 2D transformation, that is to say a transformation composed of a translation, a rotation and a scale factor.

This global displacement is computed by an iterative least squares method, each iteration refining the result. Besides, this method is completed by an acceptance criterion which validates the matches and discards the false ones in order to make the computation more robust.
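A minimal sketch of this step is given below, assuming a parameterisation (a, b, tx, ty) of the similarity transform with a = s·cosθ and b = s·sinθ, solved by linear least squares with a simple rejection loop for false matches; this is an illustrative reconstruction, not Ifremer's exact routine.

```python
import numpy as np

def fit_similarity(p, q, iterations=3, reject_factor=2.5):
    """p, q: (N, 2) arrays of matched points; returns scale, angle, translation."""
    keep = np.ones(len(p), dtype=bool)
    for _ in range(iterations):
        px, py = p[keep, 0], p[keep, 1]
        # Linear system for q = [[a, -b], [b, a]] p + (tx, ty)
        A = np.zeros((2 * keep.sum(), 4))
        A[0::2] = np.column_stack([px, -py, np.ones_like(px), np.zeros_like(px)])
        A[1::2] = np.column_stack([py,  px, np.zeros_like(px), np.ones_like(px)])
        b = q[keep].reshape(-1)
        (a, bb, tx, ty), *_ = np.linalg.lstsq(A, b, rcond=None)
        # Acceptance criterion: residuals on all points, far-off matches discarded
        pred = np.column_stack([a * p[:, 0] - bb * p[:, 1] + tx,
                                bb * p[:, 0] + a * p[:, 1] + ty])
        err = np.linalg.norm(pred - q, axis=1)
        tol = reject_factor * max(np.median(err[keep]), 1e-6)
        keep = err < tol
    return np.hypot(a, bb), np.arctan2(bb, a), (tx, ty), keep
```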

3.2. RMR algorithm

3.2.1. Principle

The second method we have investigated to build mosaics is the Robust Multi-Resolution (RMR) method, which is based upon the estimation of the optical flow [ODO95]. The advantage of this method is that the motion is estimated from the whole image.



In the RMR algorithm, the first stage consists in choosing a motion model. The aim is then to estimate the parameters of the model using the classic methods of robust estimation from the field of image and signal processing. This step is combined with a coarse-to-fine estimation using multi-resolution levels of images.

3.2.2. Algorithm description

3.2.2.1. Model of motion

In the first step of the algorithm, a motion model is chosen. In the RMR algorithm, we consider the class of 2D polynomial motion models and we deal only with the 2D affine model, which is not too complex yet representative of a large class of motion transformations:

$$\begin{cases} u(X_i) = a_1 + a_2 x_i + a_3 y_i \\ v(X_i) = a_4 + a_5 x_i + a_6 y_i \end{cases}$$

And with matrix notation, it can be stated as:

$$V(X_i) = \begin{bmatrix} u(X_i) \\ v(X_i) \end{bmatrix} = B(X_i) \, A$$

where $A^T = (a_1 \; a_2 \; a_3 \; a_4 \; a_5 \; a_6)$ and $B(X_i) = \begin{bmatrix} 1 & x_i & y_i & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & x_i & y_i \end{bmatrix}$.

For each point $X_i$, one can write the flow constraint equation linking spatial and temporal intensity gradients:

$$V(X_i) \cdot \nabla I(X_i) + I_t(X_i) = 0,$$

i.e.

$$I_x(X_i)\, u(X_i) + I_y(X_i)\, v(X_i) + I_t(X_i) = 0,$$

where $\nabla I(X_i)$ is the spatial gradient vector of the intensity, $I_t(X_i)$ is the partial derivative of the intensity with respect to time, and $V(X_i)$ is the vector field.

3.2.2.2. Robust estimation

The goal of robust estimation is to find the parameter vector $\Theta$ which best fits the model $M(X_i, \Theta)$ to the observations $y_i$.

In our case, $\Theta = (A^T, 0)^T$.

The estimation of the parameter $\Theta$ is achieved by a maximum likelihood estimator:

$$\hat{\Theta} = \arg\min_{\Theta} \sum_i \rho\left( y_i - M(X_i, \Theta) \right)$$

where $\rho$ is called the M-estimator and corresponds to the maximum likelihood estimation.


This estimation is performed at all the scales of a multi-resolution image pyramid, so that the estimation is refined at each scale.
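The sketch below illustrates one robust estimation of the affine parameters by iteratively re-weighted least squares (IRLS) on the flow constraint equation; in the RMR method this step is embedded in the coarse-to-fine pyramid, the second image being warped with the current estimate at each level. The code is an assumed reconstruction, not Odobez and Bouthémy's implementation, and the Tukey-like weight and its scale c are illustrative.

```python
import numpy as np

def robust_affine_motion(I, J, irls_iters=10, c=4.0):
    """I, J: successive greyscale frames as float 2D arrays; returns a1..a6."""
    Iy, Ix = np.gradient(I)                        # spatial intensity gradients
    It = J - I                                     # temporal intensity gradient
    h, w = I.shape
    y, x = np.mgrid[0:h, 0:w]
    # Flow constraint Ix*u + Iy*v + It = 0 with u = a1 + a2*x + a3*y and
    # v = a4 + a5*x + a6*y, written as one linear equation per pixel: B A = -It
    B = np.column_stack([Ix.ravel(), (Ix * x).ravel(), (Ix * y).ravel(),
                         Iy.ravel(), (Iy * x).ravel(), (Iy * y).ravel()])
    b = -It.ravel()
    wgt = np.ones_like(b)
    for _ in range(irls_iters):                    # iteratively re-weighted LS
        sw = np.sqrt(wgt)
        A, *_ = np.linalg.lstsq(B * sw[:, None], b * sw, rcond=None)
        r = B @ A - b                              # residuals of the constraint
        wgt = np.where(np.abs(r) < c, (1.0 - (r / c) ** 2) ** 2, 0.0)  # Tukey-like
    return A                                       # [a1, a2, a3, a4, a5, a6]
```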

3.3. Metric conversion - camera self-calibration

The displacement calculated by image processing is given in pixels. The aim of video mosaicing is to provide the mosaics with a metric dimension in order to make quantitative measurements within the images. Thus, the displacement and the image size in pixels must be converted into meters (see Figure 3).

Figure 3 shows a camera of focal length f at altitude h above the seabed plane; one pixel in the image plane corresponds to a metric length l on the seabed.

Figure 3: Link between one pixel and its real metric size on the seabed

Thus we have the relationship:

$$l\,(\mathrm{m \cdot pix^{-1}}) = \frac{\mathrm{SizeOfAPixel}\,(\mathrm{m \cdot pix^{-1}})}{f\,(\mathrm{m})} \cdot h\,(\mathrm{m}).$$

Given that $\mathrm{SizeOfAPixel}\,(\mathrm{m \cdot pix^{-1}}) = \dfrac{f\,(\mathrm{m})}{\alpha\,(\mathrm{pix})}$, we deduce that:

$$l\,(\mathrm{m \cdot pix^{-1}}) = \frac{h\,(\mathrm{m})}{\alpha\,(\mathrm{pix})},$$

where $\alpha$ is an intrinsic parameter of the camera which has to be estimated and $h$ is the altitude of the camera.
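As a numerical illustration of this relationship (with arbitrary values, not measured ones):

```python
# Arbitrary illustrative values, not calibration results
alpha = 800.0   # intrinsic parameter alpha, in pixels (estimated by self-calibration)
h = 2.0         # camera altitude above the seabed, in metres (from navigation)
l = h / alpha   # metres covered on the seabed by one pixel
print(f"one pixel covers about {l * 1000:.2f} mm on the seabed")  # 2.50 mm
```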

Since it is not always possible to deploy a calibration pattern on the seabed, an algorithm for camera self-calibration has been investigated [PES03].

The self-calibration algorithm allows us to determine the intrinsic and extrinsic parameters of a vertical camera mounted on an underwater vehicle. This method needs only a sequence of a few images of the seabed and is based upon the determination of the epipolar geometry between two successive images of the sequence.

The diagram below presents all the steps of the self-calibration method:



Images of the observed scene → point matching → point matches validation → scene geometry estimation → intrinsic parameters estimation → extrinsic parameters estimation

Figure 4: Steps of the camera self-calibration method

3.3.1. Extraction and matching of points

In our application, image sequences are composed of small displacements between two successive images. The extraction and the matching of points are carried out by the KLT algorithm, which is detailed in paragraph 3.1. This algorithm extracts features in the first image and tracks them across the sequence.

Despite the complexity of underwater images, a great number of features are successfully tracked through the sequence. Moreover, the points are well distributed in the image.

3.3.2. Scene geometry

The scene geometry can be represented algebraically by the fundamental matrix $F$. The fundamental matrix links the coordinates $q_i$ and $q_i'$ of a same 3D point $Q$ in two images:

$$q_i'^{\,T} F \, q_i = 0 \qquad \forall i \in [1, n]$$

The estimation of the fundamental matrix is based on Hartley's normalized 8-point algorithm and uses the points detected and tracked by the KLT algorithm.

This algorithm requires a set of at least eight matched points $q_i \leftrightarrow q_i'$. In order to increase the estimation accuracy, a set of about 30 to 50 points can be used. But using more than 8 points leads to the presence of false matches which perturb the estimation of $F$. To compensate, a criterion has been integrated to validate the points and remove false matches.

3.3.3. Points validation

The validation of features is carried out by an algorithm based on RANSAC (RANdom SAmple Consensus). The selection of point matches relies on the accuracy of the equation representing the scene geometry (see 3.3.2): for a candidate fundamental matrix, this equation is evaluated for all the matched points and gives the matching error of each pair. A list of good features is then constituted to estimate the best fundamental matrix. As a result, only the "best" matches are kept to estimate the final fundamental matrix.
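The following sketch combines sections 3.3.2 and 3.3.3 using OpenCV, which embeds the 8-point algorithm in a RANSAC loop so that the fundamental matrix is estimated and the false matches rejected at the same time; the thresholds are illustrative, and this is not necessarily the exact validation criterion used in the self-calibration algorithm.

```python
import cv2

def estimate_scene_geometry(pts_prev, pts_next):
    """pts_prev, pts_next: (N, 2) float32 NumPy arrays of matched points, N >= 8."""
    # 8-point algorithm inside a RANSAC loop: F estimation and rejection of
    # false matches are performed together
    F, inlier_mask = cv2.findFundamentalMat(pts_prev, pts_next,
                                            cv2.FM_RANSAC, 1.0, 0.99)
    inliers = inlier_mask.ravel().astype(bool)
    # Only the validated "best" matches are kept for the final estimation
    return F, pts_prev[inliers], pts_next[inliers]
```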



3.3.4. Intrinsic parameters estimation

There are five intrinsic parameters: the focal distance $f$, the scale factors $k_u$ and $k_v$ along the image axes $u$ and $v$, and the coordinates $u_0$ and $v_0$ of the principal point of the image. The intrinsic parameters estimation is carried out using the Mendonça and Cipolla algorithm [4] applied to a set of five images taken at given intervals from a dense sequence. This algorithm is based on the minimization of a cost function which takes the intrinsic parameters as arguments and the fundamental matrices as parameters. The cost function is:

$$C(K) = \sum_{i=1}^{n} \sum_{j>i}^{n} w_{ij} \, \frac{\sigma_{ij}^{1} - \sigma_{ij}^{2}}{\sigma_{ij}^{2}}$$

With:
• $K = (\alpha_u, \alpha_v, u_0, v_0)$, where $\alpha_u$, $\alpha_v$, $u_0$, $v_0$ correspond respectively to the products of the scale factors along the axes $u$ and $v$ by the focal length, and to the coordinates of the intersection of the optical axis with the image plane,
• $w_{ij}$ is the degree of confidence of the estimation of the fundamental matrix $F_{ij}$,
• $\sigma_{ij}^{1} > \sigma_{ij}^{2}$ are the non-zero singular values of the essential matrix $E_{ij}$.
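A sketch of this cost function is given below (assumed form: the weights w_ij and the parameterisation of K are simplified; F_list stands for the fundamental matrices estimated between pairs of the five images):

```python
import numpy as np

def calib_cost(params, F_list, weights):
    """Mendonca-Cipolla-style cost C(K) for intrinsics (alpha_u, alpha_v, u0, v0)."""
    au, av, u0, v0 = params
    K = np.array([[au, 0.0, u0],
                  [0.0, av, v0],
                  [0.0, 0.0, 1.0]])
    cost = 0.0
    for F, w in zip(F_list, weights):
        E = K.T @ F @ K                            # essential matrix from F and K
        s = np.linalg.svd(E, compute_uv=False)     # singular values, descending
        cost += w * (s[0] - s[1]) / s[1]           # the two non-zero singular values
    return cost

# The intrinsic parameters are then obtained by minimising calib_cost over
# (alpha_u, alpha_v, u0, v0), e.g. with scipy.optimize.minimize.
```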

3.3.5. Extrinsic parameters estimation

The extrinsic parameters are composed of the rotations and translations of the camera around the three axes (twelve parameters).

The extrinsic parameters estimation represents the last step of the self-calibration algorithm. It is a function of the intrinsic parameters and of the fundamental matrix $F$:

$$E = [t]_{\times} R = K^T F K$$

With:
• $[t]_{\times}$: the antisymmetric matrix associated with the translation vector $t$,
• $R$: the rotation matrix,
• $K$: the intrinsic parameters matrix.

The algorithm first determines the translation $t$. Afterwards, the rotation matrix is estimated by minimizing:

$$\sum_{i=1}^{3} \left\| E_i^{T} - R^{T} [t]_{\times i}^{T} \right\|^2$$

where $E_i$ and $[t]_{\times i}$ are the i-th row vectors of the matrices $E$ and $[t]_{\times}$.
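The sketch below gives an assumed reconstruction of this step: t is taken as the left null vector of E (since tᵀE = 0), and R as the rotation that best maps the rows of [t]× onto those of E, obtained by an orthogonal Procrustes solution; this is not necessarily Ifremer's exact minimisation routine.

```python
import numpy as np

def skew(t):
    """Antisymmetric matrix [t]x associated with the translation vector t."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def extrinsics_from_E(E):
    """Recover (R, t) from an essential matrix E = [t]x R (t up to scale and sign)."""
    U, _s, _Vt = np.linalg.svd(E)
    t = U[:, 2]                                    # left null vector of E (t^T E = 0)
    # Rotation minimising sum_i || E_i - ([t]x)_i R ||^2 (orthogonal Procrustes)
    M = skew(t).T @ E
    Um, _Sm, Vmt = np.linalg.svd(M)
    R = Um @ np.diag([1.0, 1.0, np.linalg.det(Um @ Vmt)]) @ Vmt
    return R, t
```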

3.4. Experiments

A statistical comparative study of this camera self-calibration method is presented in [PES03]. Studies with simulated data have allowed us to show that some trajectories of the underwater vehicle are better suited to the estimation of the intrinsic parameters of the camera.



The results obtained with real data show that a rotation around the optical axis combined with roll and pitch angles allows all the intrinsic parameters of the camera to be estimated concurrently with a good accuracy. The table below presents the errors, expressed as a percentage of the parameter value, estimated in this case of movement.

Movement          εαu      εαv      εu0      εv0
θz + (θx, θy)     2.06 %   2.06 %   1.65 %   1.19 %

Table 1: Errors in intrinsic parameters estimation
(θz: rotation around the optical axis, θx: roll angle, θy: pitch angle)

3.5. Fusion with navigation data

When the displacement is computed by image processing and the real geographic size is provided thanks to the altitude measurement and the camera parameters, one could think that this is enough to obtain mosaics correctly located on the seabed. In fact, image processing techniques introduce errors that accumulate during the mosaicing process. In order to reduce these errors and to obtain an accurate geo-referencing, navigation data can be used and fused with the displacement given by image processing. That is why, to correct the displacements calculated through an image sequence, we have introduced a Kalman filter [WEL94], which is well suited to the problem of estimating the variables of a dynamic system (a system that varies with time). In dynamic systems, the system variables are denoted by the term "state variables".

The question addressed by the Kalman filter is: "Given our knowledge of the behaviour of the system, and given our measurements, what is the best estimate of the state variables?"

Mathematically speaking, the aim of the Kalman filter is to estimate a posteriori a state vector $\hat{x}_k$ from the a priori estimate $\hat{x}_k^-$ and a weighted difference between the measurement $z_k$ at time $k$ and the prediction $H_k \hat{x}_k^-$, where the subscript $k$ denotes the time step.

Thus, at time $k$, the measurement $z_k$ is known, and the a posteriori estimate $\hat{x}_k$ and the a priori estimate $\hat{x}_{k+1}^-$ must be calculated.

The two equations hereafter link the state vector and the measurement vector:

$$x_{k+1} = A_k x_k + B u_k + w_k$$
$$z_k = H_k x_k + v_k$$

This yields the two sets of equations of the classic formulation of Kalman filtering. The first set contains the time update equations, which predict the state vector $\hat{x}_{k+1}^-$, while the second set contains the measurement update equations, which correct the state vector $\hat{x}_k$.


Time update equations ("predict"):

$$\hat{x}_{k+1}^- = A_k \hat{x}_k + B u_k$$
$$P_{k+1}^- = A_k P_k A_k^T + Q_k$$

Measurement update equations ("correct"):

$$K_k = P_k^- H_k^T \left( H_k P_k^- H_k^T + R_k \right)^{-1}$$
$$\hat{x}_k = \hat{x}_k^- + K_k \left( z_k - H_k \hat{x}_k^- \right)$$
$$P_k = \left( I - K_k H_k \right) P_k^-$$
In these formulae, the variables stand for:
• $\hat{x}_k^-$: a priori state vector at time k (given the process before step k)
• $\hat{x}_k$: a posteriori state vector at time k (given the measurement at time k)
• $z_k$: measurement vector at time k
• $u_k$: control input vector
• $A_k$: matrix linking the states at times k and k+1
• $B$: matrix linking the control input to the state vector
• $K_k$: Kalman gain matrix
• $P_k^-$: covariance matrix of the prediction error
• $P_k$: covariance matrix of the a posteriori error
• $H_k$: matrix linking the state to the measurement
• $w_k$: process noise, assumed to be white and Gaussian
• $v_k$: measurement noise, assumed to be white and Gaussian
• $R_k$: covariance matrix of the measurement noise
• $Q_k$: covariance matrix of the process noise

In the case of video mosaicing, the state variables are the position (X_utm and Y_utm) and a term to correct the pixel size. The measurement vector consists of X_utm and Y_utm given by the navigation system.
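A generic predict/correct step following the equations above can be sketched as follows; the matrices A, B, H and the noise covariances Q, R are placeholders to be instantiated for the mosaicing state vector (X_utm, Y_utm, pixel-size correction), not the values used in MATISSE.

```python
import numpy as np

def kalman_step(x_hat, P, z, u, A, B, H, Q, R):
    """One predict/correct cycle of the Kalman filter described above."""
    # Time update ("predict")
    x_pred = A @ x_hat + B @ u
    P_pred = A @ P @ A.T + Q
    # Measurement update ("correct")
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x_new)) - K @ H) @ P_pred
    return x_new, P_new
```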

The experiment we carried out consists in performing a route with the underwater vehicle and then performing the same route in the opposite direction. We can notice in Figure 5 that the mosaic drifts if navigation is not used in the algorithm, whereas when dead-reckoning navigation is used in the Kalman filter, the mosaic drift is well corrected.


Figure 5: Mosaics obtained (a) without using navigation and (b) using Kalman filtering

4. MATISSE SOFTWARE®

4.1. General architecture

The MATISSE Software® [ALL04] has been developed to integrate all the algorithms of geo-referenced mosaicing. It is flexible and can be used with many underwater vehicles, as long as they are equipped with a down-facing camera and a continuous navigation system (dead-reckoning navigation).

In Figure 6, the diagram represents the conditions of use of the MATISSE Software® with a ROV. The video stream and the navigation data are transferred up to the surface via the umbilical tether. They are processed on-line to produce geo-referenced mosaics. An option consists of recording the video stream on a video DVD and the navigation data as messages on a CD. Playing back video and navigation data together makes off-line mosaic building possible.

Figure 6: MATISSE Software® used with a ROV (the down-looking camera and the navigation system feed MATISSE for real-time creation of geo-referenced video mosaics)



4.2. User interface

MATISSE Software® offers a user-friendly interface, which is presented in Figure 7. The main window (black background) allows the user to control and check the mosaic processing. On the left, the MATISSE data architecture is displayed. In the bottom left-hand corner, the video stream is visualized. The upper part of the interface is dedicated to specific menus and predefined sets of parameters for mosaic creation.

The MATISSE outputs consist of geo-referenced images (TIFF images and TFW geo-referencing files) and of network messages sent when a mosaic is created. These messages can be used, for example, by a GIS in order to integrate the geo-referenced mosaics on-line with other geo-referenced data in a dedicated environment.

Figure 7: MATISSE Software® interface

5. CONCLUSION

In this report, we have detailed methods to build geo-referenced mosaics. This work has resulted in the development of user-friendly software which is used at IFREMER with the ROV victor6000 and has been tested with other underwater vehicles.



6. BIBLIOGRAPHY

[ALL04] Allais, A.G., Borgetto, M., Opderbecke, J., Pessel, N., Rigaud, V., "Seabed video mosaicking with MATISSE: a technical overview and cruise results", Proc. of the 14th International Offshore and Polar Engineering Conference, ISOPE-2004, vol. 2, pp. 417-421, Toulon, France, May 23-28, 2004.

[ODO95] Odobez, J.M., Bouthémy, P., "Robust Multiresolution Estimation of Parametric Motion Models", Journal of Visual Communication and Image Representation, Vol. 6, N°4, pp. 348-365, Dec. 1995.

[PES03] Pessel, N., Opderbecke, J., Aldon, M.J., "An Experimental Study of a Robust Self-Calibration Method for a Single Camera", Proc. of the 3rd International Symposium on Image and Signal Processing and Analysis, ISPA'2003, sponsored by IEEE and EURASIP, Rome, Italy, September 18-20, 2003.

[SHI94] Shi, J., Tomasi, C., "Good features to track", Proc. IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Seattle, 1994.

[WEL94] Welch, G., Bishop, G., "An Introduction to the Kalman Filter", UNC-CH Computer Science Technical Report 95-041, 1995.
