
Sirgue, Laurent, 2003. Inversion de la forme d'onde dans le ...


Doctoral thesis of the Université Paris XI

presented by

Laurent Sirgue

for the degree of

Doctor of the Université Paris XI

Speciality: Earth Sciences

Inversion de la forme d’onde dans le domaine fréquentiel de données sismiques grands offsets

(Frequency-domain waveform inversion of large-offset seismic data)

defended on 3 July 2003 before a jury composed of:

Patrick Lailly, Examiner

Gilles Lambaré, Reviewer

Raúl Madariaga, Thesis director

Gerhard Pratt, Co-director

Jean Virieux, Reviewer

Hermann Zeyen, Examiner

Thesis prepared at the Laboratoire de Géologie of the Ecole Normale Supérieure de Paris and at Queen’s University, Canada


Copyright © 2003, Laurent SIRGUE


Abstract

The standard imaging approach in exploration seismology relies on a decomposition by spatial scales: the determination of the low wavenumbers of the velocity field is followed by the reconstruction of the high wavenumbers. In this standard approach, intermediate spatial wavenumbers are ignored, partly because they are known to have little effect on the data for standard near-normal incidence geometries. However, for models presenting a complex structure, the recovery of the high wavenumbers may be significantly improved by the determination of intermediate wavenumbers. These can potentially be recovered by local, non-linear waveform inversion of wide-angle data. However, waveform inversion schemes based on the computation of the gradient of the misfit function are limited by the non-linearity of the inverse problem, which is in turn governed by the minimum frequency in the data and the starting velocity model. It is critical to invert from the lowest to the highest frequencies of the seismic data. For very low temporal frequencies (< 7 Hz), the problem is reasonably linear so that waveform inversion may be applied using a starting model obtained from traveltime tomography. The frequency domain is particularly advantageous as the inversion from the low to the high frequencies is then very efficient.

It is possible to discretise the frequencies with a much larger sampling interval than that dictated by the sampling theorem and still obtain an imaging result that does not suffer from aliasing (wrap-around) in the depth domain. The number of input frequencies can be reduced when a range of offsets is available; this creates a redundancy of information in the wavenumber coverage of the target. In order to optimise the use of this information, I define a new discretisation strategy that depends on the maximum effective offset present in the surface seismic survey. This strategy is defined by deriving an analytical prediction of the wavenumber coverage of a 1-D target, for a single frequency and for a given acquisition geometry. This analytic expression is obtained using the results of diffraction tomography, and is adapted to formulate an expression for the gradient image in waveform inversion. The main idea of the selection strategy is: the larger the range of offsets, the fewer frequencies are required. Validation tests carried out on a 1-D velocity model show the efficiency of the frequency domain when a very limited number of frequencies is adequately chosen. In these tests, frequency domain inversion performs as well as time domain inversion at a much smaller computational cost. This strategy remains efficient in 2D structures. This is demonstrated by the very good results that are obtained in the 2D Marmousi model from a wide-angle seismic acquisition survey, using only 3 frequencies.
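The selection idea stated above (the larger the range of offsets, the fewer frequencies are required) can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's own code: it assumes the rule in the form later published by Sirgue and Pratt (2004), where, for a 1-D target at depth z in a homogeneous background, a frequency f covers vertical wavenumbers from 2 k0 alpha_min to 2 k0, with alpha_min = 1/sqrt(1 + (h_max/z)^2) and h_max the maximum half offset, and where the next frequency is taken as f_next = f/alpha_min so that the coverage stays continuous. The function name, parameter names and survey values are hypothetical.

```python
import math


def select_frequencies(f_start, f_max, max_half_offset, target_depth):
    """Return a sparse list of frequencies (Hz) with continuous wavenumber coverage.

    Assumed rule (not quoted from the text above): f_next = f / alpha_min,
    with alpha_min = 1 / sqrt(1 + (max_half_offset / target_depth) ** 2).
    """
    if f_start <= 0 or f_max < f_start:
        raise ValueError("need 0 < f_start <= f_max")
    if max_half_offset <= 0 or target_depth <= 0:
        # With zero offset alpha_min = 1 and the sequence would never advance.
        raise ValueError("max_half_offset and target_depth must be positive")

    # growth = 1 / alpha_min: ratio between two consecutive frequencies.
    growth = math.sqrt(1.0 + (max_half_offset / target_depth) ** 2)

    frequencies = [f_start]
    while frequencies[-1] * growth <= f_max:
        frequencies.append(frequencies[-1] * growth)
    return frequencies


# Hypothetical surveys: target at 4 km depth, 5 Hz to 10 Hz band,
# maximum half offset of 3 km (narrow) versus 8 km (wide).
narrow = select_frequencies(5.0, 10.0, max_half_offset=3.0, target_depth=4.0)
wide = select_frequencies(5.0, 10.0, max_half_offset=8.0, target_depth=4.0)
print(len(narrow), len(wide))
```

The sequence stops at the last frequency that does not exceed f_max; whether to append f_max itself to close the band is a design choice left open in this sketch.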

Real seismic data do not contain very low frequencies and waveform inversion at higher frequencies is likely to fail due to convergence into a local minimum. Preconditioning techniques must hence be applied in order to enhance the efficacy of waveform inversion starting from realistic frequencies. Because the high wavenumbers dominate the gradient image, the latter must be smoothed to ensure the proper reconstruction of lower wavenumbers. The determination of the low wavenumbers is delicate as these correspond to the most non-linear components of the model. These non-linearities may be mitigated by applying adequate preconditioning to the data residuals that focuses the inversion on the early arrivals. A numerical test carried out on an extended version of the 2D Marmousi model, in which a dense wide-angle survey is modelled, shows that the proposed preconditioning strategy significantly improves the result of waveform inversion. This test nevertheless also shows that the starting model remains an important aspect of waveform inversion.

The potential of waveform inversion in solving a given imaging problem may be evaluated by defining the requirements of the starting model that ensure the success of the inversion. To this end, I develop a linearity study that analyses the evolution of the misfit function with respect to an increasing degree of smoothness of the true model.

Such a study is carried out on a 1D velocity model representing the sub-basalt imaging problem. The analysis leads to the conclusion that waveform inversion may be used for the recovery of the intra-basalt velocities, although the determination of the overburden sediments is more difficult. Due to the presence of the basalt layer, only a migration-like image (i.e., the high wavenumbers) can be obtained for the sub-basalt sediments. This study also emphasises the importance of the low frequencies, as the requirements of the starting model are much more demanding for higher frequencies.


Thesis summary (Résumé de thèse)

The standard approach in seismic imaging relies on a decomposition of the velocity model by scale: the determination of the low wavenumbers is followed by a reconstruction of the high wavenumbers. In such an approach, the intermediate wavenumbers are ignored, partly because they have little effect on conventional short-offset seismic data. However, for models with a complex structure, the determination of the high wavenumbers can be significantly improved by the contribution of the intermediate wavenumbers. The latter can be determined by non-linear waveform inversion of wide-angle seismic data, which is itself limited by the non-linearity of the inverse problem. This non-linearity is governed by the minimum frequency in the data and by the starting velocity model.

For very low frequencies, below 7 Hz, the problem is linear enough for waveform inversion to be applied from a starting model determined by traveltime tomography. It is therefore very important to invert starting from the lowest frequencies present in the data. The frequency domain is then very efficient for inverting from the low to the high frequencies.

Moreover, it is possible to discretise the frequencies with a sampling interval larger than that dictated by the sampling theorem while still obtaining an imaging result free of aliasing (wrap-around) in depth. The number of frequencies can be reduced when a range of offsets is available, which creates a redundancy of information in the spectral coverage of the object to be reconstructed. In order to optimise the use of this information, a frequency discretisation strategy is defined that depends on the maximum effective offset present in the seismic acquisition. This strategy is defined by deriving an analytical formula for the spectral coverage of a one-dimensional object, for a given frequency and acquisition geometry. This analytical expression is obtained from the results of diffraction tomography, which uses the plane-wave approximation in a homogeneous medium. This formulation is adapted so as to be applicable to the gradient vector of waveform inversion. The main idea of the selection strategy is the following: the larger the range of offsets, the fewer frequencies are required. The proposed method makes use of the well-known wavelet stretch effect that appears during NMO correction or migration. Because of the similarities between the gradient vector at the first iteration and the migration operator, this strategy can easily be implemented in migration.

Validation tests are carried out on a 1-D velocity model and show the efficiency of the frequency domain when a limited number of frequencies is adequately chosen. In these tests, the frequency domain gives a result comparable to that of the time domain at a much lower computational cost. This strategy remains efficient for 2-D models: very good results are obtained for the 2-D Marmousi model from OBC-type wide-angle seismic data, using only 3 frequencies (the starting frequency used is 5 Hz).

These results show the value of the frequency domain since, in waveform inversion as in migration, the computational cost is directly proportional to the number of frequencies used.

Unfortunately, real seismic data do not contain very low frequencies. Preconditioning techniques must therefore be applied in order to improve the efficacy of the inversion starting from realistic frequencies (~ 7 Hz).

A singular value decomposition of the Fréchet derivative matrix for a single frequency shows that the high wavenumbers correspond to the eigenvectors associated with the large eigenvalues. The gradient vector is therefore dominated by the high wavenumbers, whose weight must be reduced. This can be achieved by smoothing the gradient vector in order to ensure a proper reconstruction of the low wavenumbers. The determination of the low wavenumbers remains delicate because they correspond to the components of the model that are the most non-linear with respect to the data. The linearity can nevertheless be improved by applying a preconditioning of the data residuals that emphasises the information contained in the first arrivals. This approach is useful because the first arrivals correspond to the most linear component of the data.

An inversion at 7 Hz is carried out on an extended version of the 2-D Marmousi model, in which a wide-angle acquisition is modelled. This experiment shows that the preconditioning methods considerably improve the result of waveform inversion. These tests also show that the starting velocity model remains a fundamental aspect of waveform inversion.

The starting model is a fundamental aspect of waveform inversion. The potential of waveform inversion for a given imaging problem can be evaluated by defining the criteria on the starting model that ensure the success of the inversion. To this end, a linearity study is carried out, based on the analysis of the evolution of the misfit function with respect to the degree of smoothing of the true model.

Such a study is applied to a 1-D velocity model representing the sub-basalt imaging problem. The linearity analysis shows that waveform inversion can be used to determine the velocities of the basalt layer, although the determination of the sediments above the basalt is more difficult. Because of the presence of the basalt layer, only a migration-like image (high wavenumbers) can be obtained for the sediments below the basalt. Indeed, the basalt prevents the sub-basalt from being illuminated by wide-angle data. This study also emphasises the importance of the low frequencies, given that the criteria on the starting model are more restrictive for the high frequencies.




Acknowledgments

First and foremost, I would like to thank Gerhard Pratt for his assistance throughout the thesis project. I am most grateful to him for welcoming me into his laboratory, here at Queen’s University, Kingston, Canada, where most of the research for the project was conducted. His advice and insight on the topic of waveform inversion were invaluable and greatly appreciated. Also, I want to offer him my sincerest thanks for accepting the burden of correcting the spelling and grammar, as well as commenting on the manuscript.

Secondly, I am most thankful to Georges Pascal who welcomed me as a Ph.D. student into the Laboratoire de Géologie de l’Ecole Normale Supérieure de Paris. I wish him all the best for the future.

I would also like to thank the members of the jury for agreeing to review my work, especially Raúl Madariaga for assuming the task of thesis advisor for my defense.

In addition, I would like to offer a special thank you to François Audebert and Philippe Herrmann from CGG for funding my project through the CIFRE program as well as for providing me with the means to carry out my research in Canada. Likewise, I am most grateful to Paul Williamson for my four-month stay at the Geosciences Research Center of Total in London.

These acknowledgments would not be complete without thanking my friends who, throughout my PhD, made my life in Kingston so enjoyable and memorable. Thanks to Stéphane and Doug for their friendship and entertaining coffee breaks, to Graham for our technical discussions and his kindness. Many thanks to ZZ Rob and Alfredo for being such easy-going housemates. To Marc, thanks for all the fun times we had. Thanks to Nadine for her patience and her generosity.



Contents

1 Introduction
1.1 Determination of the macro-model
1.2 Determination of the high wavenumbers by migration
1.3 Quantitative migration and linearised inversion
1.4 Migration/Inversion and non-linearity
1.5 Non-linear waveform inversion
1.6 Non-linear inversion of wide-angle data
1.7 Aim of the thesis
1.8 Summary of the thesis

2 Inverse theory
2.1 Introduction
2.2 Inversion of linear problems
2.2.1 Singular value decomposition
2.2.2 Local methods
2.3 Inversion of non-linear problems
2.3.1 Newton method
2.3.2 Linearised inversion
2.3.3 Non-linear gradient method
2.3.4 Importance of the starting model
2.3.5 Numerical examples
2.4 Frequency domain waveform inversion
2.4.1 The forward problem
2.4.2 The inverse problem
2.5 Conclusion


3 A strategy for selecting temporal frequencies
3.1 Introduction
3.2 Gradient vector and wavenumber illumination
3.2.1 Analysis of the gradient within the Born approximation
3.3 The 1-D case
3.3.1 Wavenumber illumination for the 1-D case
3.4 Strategy for choosing frequencies for 1-D imaging
3.5 Numerical test I: 1-D models
3.5.1 The synthetic data
3.5.2 Validation of the plane wave predictions
3.5.3 A remark on “image stretch”
3.5.4 1-D waveform inversion
3.5.5 Sensitivity to noise
3.6 Numerical test II: 2-D models
3.6.1 Marmousi velocity model
3.6.2 Determination of the starting model by traveltime inversion
3.6.3 Waveform inversion
3.7 Discussion
3.7.1 Efficiency of the strategy for 2-D heterogeneous media
3.7.2 Practical implementation of the frequency selection strategy
3.7.3 The equivalence between gradient images and migration
3.8 Conclusion

4 Waveform inversion starting from realistic frequencies
4.1 Introduction
4.2 SVD of the Fréchet derivative matrix
4.2.1 1-D medium in homogeneous background
4.2.2 1-D medium with a constant vertical velocity gradient
4.3 Non-linearity of the waveform inverse problem
4.3.1 The effect of cycle skipping
4.3.2 Analytic linearity study of the 3D homogeneous Green’s function
4.3.3 Non-linearity and wavenumber domain
4.3.4 Non-linearity and offset
4.4 Tools for the mitigation of non-linearities
4.4.1 Gradient preconditioning by wavenumber filtering
4.4.2 Time damping of the data residuals
4.4.3 f-k filtering of the data residuals
4.4.4 Offset windowing of the data residuals
4.4.5 Model parameterisation
4.5 Definition of a strategy
4.6 Numerical experiments
4.6.1 1-D velocity model
4.6.2 The 2-D extended Marmousi model
4.7 Conclusion

5 Waveform inversion and starting models: the sub-basalt imaging problem
5.1 Introduction
5.2 The sub-basalt imaging problem
5.2.1 Transmission loss at the top basalt interface and multiples
5.2.2 Interface scattering at the top of the basalt
5.2.3 Body wave scattering within the basalt
5.2.4 Sub-basalt imaging and waveform inversion
5.3 The 1-D sub-basalt numerical experiment
5.4 Inversion starting from a “correct” macro model
5.4.1 Misfit function and model smoothness
5.4.2 The top basalt and bottom basalt interfaces
5.5 Wavefield separation by layer stripping
5.5.1 The sediment layer
5.5.2 The basalt layer
5.5.3 The sub-basalt layer
5.6 Waveform inversion and starting model requirements
5.6.1 Frequency dependence of the starting model requirements
5.6.2 Waveform inversion at 7 Hz
5.7 Discussion: starting model and standard methods
5.8 Conclusion



6 Conclusions
6.1 Wavenumber, frequency and offset
6.2 Waveform inversion and starting frequency
6.3 Waveform inversion and starting model
6.4 Towards the waveform inversion of real data

A Image stretch and NMO stretch

Bibliography


List of Figures

2.1 The Inverse and the Forward Problem in waveform inversion.
2.2 Illustration of the Singular Value Decomposition.
2.3 Scheme of the Gauss-Newton method applied to a linear problem.
2.4 Scheme of the gradient method applied to a linear problem.
2.5 Construction of the gradient vector from eigenvectors and eigenvalues of the Hessian.
2.6 Example of a 2-D, linear inverse problem.
2.7 Local methods applied to the 2-D, linear inverse problem.
2.8 Evolution of the misfit function with iterations using local methods.
2.9 Scheme of linearised inversion.
2.10 Example of a non-linear inverse problem.
2.11 Results of local methods with an accurate starting model.
2.12 Results with a starting model where the quadratic assumption fails.
2.13 Results with a starting model that is close to a local minimum.
3.1 Wavenumber illumination for a single frequency and source/receiver pair.
3.2 Wavenumber illumination for different incident/scattering angles.
3.3 The 1-D basic scattering experiment.
3.4 Illustration of the frequency discretisation strategy.
3.5 Velocity model after Freudenreich and Singh, 2000.
3.6 Illustration of the computation of the 5 Hz gradient.
3.7 A set of amplitude normalized gradient images.
3.8 Frequency sequence generated by equation (3.18).
3.9 Sequential frequency domain inversion results.
3.10 Time-like waveform inversion.


3.11 Impact of noise on the reconstruction of the 1-D target.

3.12 Marmousi wide angle synthetic data.

3.13 Example of the finite difference solver of the eikonal.

3.14 Evolution of the traveltime misfit function with respect to smoothness of the true Marmousi model.

3.15 First arrival traveltime inversion.

3.16 Waveform inversion in Marmousi.

3.17 Waveform inversion results in the Marmousi model at progressively higher frequencies.

3.18 Velocity profiles at three locations in the Marmousi model.

3.19 Data residuals in the Marmousi model.

4.1 SVD at 3 Hz.

4.2 SVD at 5 Hz.

4.3 SVD at 10 Hz.

4.4 1-D velocity model with a constant vertical velocity gradient.

4.5 SVD at 3 Hz in the model with a vertical velocity gradient.

4.6 SVD at 5 Hz in the model with a vertical velocity gradient.

4.7 SVD at 10 Hz in the model with a vertical velocity gradient.

4.8 SVD at 3 Hz in model with velocity gradient. Slowness parameterisation.

4.9 SVD at 5 Hz in model with velocity gradient. Slowness parameterisation.

4.10 SVD at 10 Hz in model with velocity gradient. Slowness parameterisation.

4.11 Illustration of the cycle skipping problem.

4.12 Misfit function of the linearization of the 3D homogeneous Green's function.

4.13 The 1-D Marmousi velocity model.

4.14 Non-linearity and wavenumbers.

4.15 Non-linearity and offset.

4.16 2-D low-pass wavenumber filter of the gradient vector.

4.17 1-D inversion with gradient preconditioning by wavenumber filtering.

4.18 Finite difference modelling with different values of τ.

4.19 Misfit function of the far offset at 7 Hz and time damping.

4.20 1-D velocity model containing a shallow and deep heterogeneity.

4.21 Preconditioning of the data residuals using an f-k filter.



4.22 Standard waveform inversion at 7 Hz in the 1-D Marmousi model.

4.23 1-D waveform inversion at 7 Hz with preconditioning of the data residuals and gradient vector.

4.24 2D extended Marmousi model.

4.25 Standard waveform inversion at 7 Hz.

4.26 Preconditioned Inversion at 7 Hz starting from the extended “FAST” model.

4.27 Velocity profiles at 3 locations in the model.

4.28 Shot gathers at 3 locations in the true model.

4.29 Inversion at 7 Hz starting from an improved macro model.

5.1 Converted waves in sub-basalt imaging.

5.2 The sub-basalt synthetic data.

5.3 Inversion starting from an accurate macro-model.

5.4 Evolution of the misfit function as the entire model is decreasingly smoothed.

5.5 Evolution of the misfit function as the sediment and top basalt interface are decreasingly smoothed.

5.6 Illustration of the idealised layer stripping strategy.

5.7 Evolution of the misfit function as the sediment layer is decreasingly smoothed.

5.8 Evolution of the misfit function as the basalt layer is decreasingly smoothed.

5.9 Evolution of the misfit function as the sub-basalt layer is decreasingly smoothed.

5.10 Adequate velocity models.

5.11 Waveform inversion at 7 Hz for a wrong and an adequate starting model.

5.12 Wavefield in the sediments in time in the true model and inversion results with the bad starting model.




List of Tables

2.1 Condition of existence of $G^{-1}$ in equation 2.5 (p. 48 of Scales et al., 2001).

2.2 Review of possible cases.

2.3 Possible model parameterisations.

3.1 Frequency discretisation sequences for the near and far offset surveys. The last frequency 8 Hz is chosen arbitrarily so that both experiments use identical frequency ranges within the spectrum of the source signature.

4.1 Number of nonzero singular values with frequency and offset range.

4.3 Strategy of preconditioning for the waveform inversion for the 2-D extended Marmousi experiment.

5.1 Synthetic data characteristics.



Chapter 1

Introduction

Seismic exploration techniques are routinely used to determine the geological structure of the subsurface in the search for hydrocarbon reservoirs. In a seismic acquisition experiment, a controlled source initiates a wave that propagates within the earth and is ultimately recorded at the surface by receivers. The recorded signal is then processed in order to determine a model representing the geological structure of the subsurface. This model is most often characterised by the propagation velocities and is, in practice, usually decomposed by spatial scale: the low wavenumbers define the large scales of the model, while the high wavenumbers describe the fine details of the structure.
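The decomposition by spatial scale can be illustrated with a short numerical sketch. The following Python fragment is an illustration only and is not part of the thesis: it splits a synthetic 1-D velocity profile into a low-wavenumber part (a stand-in for the macro-model) and a high-wavenumber residual using a sharp Fourier cut-off. The function name `split_by_wavenumber`, the cut-off `k_cut` and the test profile are all assumptions chosen for the example.

```python
import numpy as np

def split_by_wavenumber(velocity, dz, k_cut):
    """Split a 1-D velocity profile into low- and high-wavenumber parts.

    velocity : samples of v(z) in m/s
    dz       : depth sampling in m
    k_cut    : cut-off wavenumber in cycles/m (hypothetical choice)
    """
    spectrum = np.fft.rfft(velocity)
    k = np.fft.rfftfreq(len(velocity), d=dz)
    # Keep only the components at or below the cut-off (including the mean).
    low_spectrum = np.where(k <= k_cut, spectrum, 0.0)
    low = np.fft.irfft(low_spectrum, n=len(velocity))
    high = velocity - low
    return low, high

# Hypothetical profile (assumption, not from the thesis):
# a linear trend plus thin alternating layers.
z = np.arange(0.0, 2000.0, 10.0)
v = 1500.0 + 0.8 * z + 100.0 * np.sign(np.sin(2.0 * np.pi * z / 200.0))

low, high = split_by_wavenumber(v, dz=10.0, k_cut=1.0 / 800.0)
```

By construction the two parts sum back to the original profile. Because the test profile is not periodic, the sharp Fourier cut-off produces edge effects; a tapered filter would usually be preferred in practice.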

This decomposition by scale is justified by the information content of the experiment. Jannane et al. (1989) showed that typical near offset seismic reflection data are sensitive only to the very low and the very high wavenumbers; the intermediate wavenumbers are invisible to the data. As a result, the standard approach in exploration imaging has evolved into a recovery of the low wavenumbers of the velocity field (the macro-model) followed by the reconstruction of the very high wavenumber components. During the last decades, this decomposition has been successfully applied to many areas of seismic exploration presenting simple geological structure. More recently, seismic exploration targets have included geographical areas in which the subsurface contains more complex features. Examples include areas with significant salt flows (Ward et al., 1994) or significant amounts of volcanics (Planke et al., 1999), both of which cause severe imaging problems. The strong heterogeneity as well as the 2-D/3-D nature of the media significantly diminish the efficacy of standard imaging techniques. The most widely used method for imaging complex media is prestack depth migration. It is, however, still applied within the standard approach, in which no attempt is made to recover the intermediate wavenumbers. The accuracy of prestack imaging has been a popular topic of investigation, and synthetic seismic data generated in very complex velocity models have been made available to the research community. Some examples are the SEG/EAGE salt and overthrust models, and the Marmousi model (Versteeg, 1994). These models are representative of the current challenges of imaging and are widely used as benchmarks for testing the accuracy of current methods.

One of the main difficulties in imaging is that the accuracy of the reconstruction of the high wavenumbers by migration strongly depends on the accuracy of the macro-model. Versteeg (1993) used the Marmousi model to show that prestack depth migration may be significantly improved when the macro-model contains some of the intermediate wavenumbers. On the other hand, none of the techniques routinely employed for the determination of the macro-model is capable of recovering these wavenumbers.

The aim of this thesis is to apply waveform inversion of wide-angle seismic data to reconstruct a continuous range of wavenumbers of a complex structure. The goal is to produce a velocity model which contains both the low and intermediate wavenumbers of the velocity field, so that the reconstruction of the high wavenumbers may be significantly improved. I will use two synthetic case studies. The first is an extension of the Marmousi model, in which additional wide-angle data are used to supplement the original survey geometry. The second is used to illustrate the issues involved in imaging through high velocity subsurface basaltic deposits.

1.1 Determination of the macro-model

The determination of the very low wavenumber components, referred to as the macro-model or velocity model, aims to solve the kinematic aspect of the problem. The methods employed for the determination of the macro-model thus utilise the traveltime information of the data. The most frequently used technique is stacking velocity analysis, in which the velocities are defined from the normal moveout (NMO) of the reflection hyperbola (Yilmaz, 1987). The stacking velocities may then be converted to interval velocities using Dix's formula (Dix, 1955). However, such a velocity model is defined in time and must be converted to depth; moreover, the NMO correction assumes that the medium is tabular and therefore 1-D. Because the macro-model is often associated with a complex structure, the 2-D/3-D nature of the low wavenumbers must be properly taken into account.
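The conversion from stacking to interval velocities can be written compactly. The sketch below is illustrative and is not taken from the thesis. It assumes a horizontally layered medium in which the stacking velocities approximate RMS velocities, so that Dix's formula reads $v_{\mathrm{int},n}^2 = (V_n^2 t_n - V_{n-1}^2 t_{n-1})/(t_n - t_{n-1})$, where $V_n$ and $t_n$ are the RMS velocity and the zero-offset two-way time at the base of layer $n$. The function name and the three-layer test model are hypothetical.

```python
import numpy as np

def dix_interval_velocities(t0, v_rms):
    """Interval velocities from RMS velocities using Dix's formula.

    t0    : zero-offset two-way times at the base of each layer (s), increasing
    v_rms : RMS (approximately stacking) velocities at those times (m/s)
    """
    t0 = np.asarray(t0, dtype=float)
    v_rms = np.asarray(v_rms, dtype=float)
    # Differences of V^2 * t and of t between successive reflectors;
    # prepend=0 treats the surface (t = 0) as the top of the first layer.
    num = np.diff(v_rms**2 * t0, prepend=0.0)
    den = np.diff(t0, prepend=0.0)
    return np.sqrt(num / den)

# Hypothetical three-layer model (assumption) used to check the round trip.
v_int_true = np.array([1500.0, 2000.0, 3000.0])   # m/s
thickness = np.array([300.0, 500.0, 900.0])       # m
dt_layer = 2.0 * thickness / v_int_true           # two-way time in each layer
t0 = np.cumsum(dt_layer)
v_rms = np.sqrt(np.cumsum(v_int_true**2 * dt_layer) / t0)

v_int = dix_interval_velocities(t0, v_rms)
```

The round trip recovers the interval velocities only because the test model satisfies the 1-D assumption exactly, which is the limitation pointed out in the text.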



More advanced techniques such as traveltime tomography have therefore been developed and are able to handle laterally varying media (Bishop et al., 1985; Farra and Madariaga, 1988). Traveltime tomography consists of solving an inverse problem that seeks to minimise the mismatch between observed and calculated traveltimes. The observed traveltimes are picked in the data volume on coherent events such as reflection hyperbolas or diving/refracted arrivals. The calculated traveltimes are computed using ray tracing techniques based on the high frequency asymptotic approximation of the wave equation (Červený et al., 1977; Chapman, 1985). Such modelling methods are accurate if the model varies slowly with respect to the wavelength of the propagating wave and are therefore not suited to the simulation of propagation within highly heterogeneous media (Chapman, 1985). For this reason, the velocity model estimated from ray based tomography is required to be smooth; the blocky nature of the model may, however, be accounted for by including velocity discontinuities (interfaces).

A widely used 2-D traveltime inversion technique is the program rayinvr developed by Zelt and Smith (1992), which allows the simultaneous inversion of various types of ray propagation: reflections, refractions and diving waves. Its application requires the picking of traveltimes of events that must be identified with features in the model: for example, reflections as well as refractions must be associated with reflectors (interfaces). Such a method therefore demands strong a priori knowledge of the velocity structure, since the number of layers and lateral discontinuities must be known, which may be difficult to determine in the presence of a complex structure.
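The inverse problem underlying traveltime tomography can be sketched in a few lines. The example below is a deliberately simplified illustration and not the method of any program cited above: it assumes fixed ray paths, so that traveltimes are linear in slowness ($t = G s$, with $G$ holding the ray length in each cell), and it solves for a damped least-squares slowness update. The function name, the matrix `G` and the damping value are hypothetical. In ray based tomography the rays depend on the model, so such an update would be repeated with re-traced rays.

```python
import numpy as np

def tomography_update(G, t_obs, s0, damping=1e-3):
    """One damped least-squares slowness update for fixed ray paths.

    Minimises ||t_obs - G (s0 + ds)||^2 + damping^2 ||ds||^2 over ds.
    G     : ray length in each cell, shape [n_rays, n_cells] (m)
    t_obs : picked traveltimes (s)
    s0    : starting slowness model (s/m)
    """
    n_cells = G.shape[1]
    residual = t_obs - G @ s0
    # Augment the system with the damping rows and solve it in one call.
    A = np.vstack([G, damping * np.eye(n_cells)])
    b = np.concatenate([residual, np.zeros(n_cells)])
    ds, *_ = np.linalg.lstsq(A, b, rcond=None)
    return s0 + ds

# Hypothetical ray lengths (m) through three cells for four rays (assumption).
G = np.array([[1000.0,    0.0,    0.0],
              [1000.0, 1000.0,    0.0],
              [1000.0, 1000.0, 1000.0],
              [1200.0, 1100.0,    0.0]])
s_true = 1.0 / np.array([1500.0, 2500.0, 4000.0])
t_obs = G @ s_true                      # noise-free synthetic picks
s0 = np.full(3, 1.0 / 2000.0)           # homogeneous starting model

s_est = tomography_update(G, t_obs, s0)
```

With noise-free picks and well-conditioned ray coverage the true slownesses are recovered; poorer coverage or noisy picks would require stronger damping or smoothing.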

Traveltime tomography relying on the first arrival traveltimes alone may also be used (Zelt and Barton, 1998). The calculated traveltimes are computed using a finite difference solver of the eikonal equation (Vidale, 1990; Podvin and Lecomte, 1991); diving rays are traced by following the normal to the wavefronts from the receiver back to the source. This method presents a great advantage: the picked first arrival traveltimes are not required to be associated with particular horizons of the model. However, large offset data are necessary to ensure proper illumination of the deep part of the model. In addition, first arrival information is not very effective in determining low velocity zones, which tend to be avoided by first arrival diving rays.

Alternative methods have emerged that aim to avoid the “hand” picking and interpretation of coherent events in the data. Billette and Lambaré (1998) proposed the stereotomography approach, which relies on the picking of locally coherent events and may be performed automatically. This picking does not require the association of traveltime picks with continuous reflections in the data volume or the introduction of interfaces in the model.

In order to avoid the traveltime picking that is a prerequisite to tomographic inversion, migration velocity analysis methods have been proposed (Al-Yahya, 1989; Symes and Carazzone, 1991; Docherty et al., 1997; Chauris et al., 2002). These approaches rely on the optimisation of trace coherence in the migrated domain. An inverse problem is posed that seeks to update the velocity model in a way that optimises the flatness/semblance of common image gathers (CIGs). The determination of the macro-model by migration velocity analysis is often considered the most desirable method by the seismic industry, since it optimises the process of migration, which is the most widely used technique for the recovery of the high wavenumbers.

1.2 Determination of the high wavenumbers by migration

The recovery of the very high wavenumbers is the complementary step to the determination of the macro-model. The reflectivity field provides information on the fine structure of the model, thereby allowing the localisation of reflectors in depth. Migration is the most commonly used technique for the determination of the high wavenumber components. It was applied to zero offset data after NMO correction (Claerbout and Doherty, 1972) and relies on the concept of exploding reflectors (Claerbout, 1985). The principle of summation is the basis of Kirchhoff migration: the magnitude of a diffraction point is determined by Kirchhoff summation (Schneider, 1978), which sums the amplitudes of the data along the diffraction hyperbola. Alternative poststack migration methods exist, however, such as f-k migration (Stolt, 1978; Gazdag, 1978), reverse time migration (Baysal et al., 1983) and finite difference migration (Claerbout, 1985). Because poststack migration is applied after NMO correction of the data, this approach is not accurate for imaging dipping reflectors. Dip-moveout correction (DMO), or prestack partial migration (PSPM), has emerged in an attempt to correct the effect of dips on the data, thus improving poststack imaging methods (Yilmaz and Claerbout, 1980; Hale, 1984).
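The summation principle described above can be sketched for the simplest possible case. The fragment below is an illustration under strong assumptions and is not a true-amplitude Kirchhoff migration: it assumes zero offset data, a constant velocity, nearest-sample lookup and no amplitude weighting or anti-aliasing. The function name and the single-diffractor test data are hypothetical.

```python
import numpy as np

def diffraction_stack(data, x, dt, v, x_img, z_img):
    """Zero-offset diffraction summation in a constant-velocity medium.

    data         : zero-offset section, shape [n_traces, n_samples]
    x            : trace positions (m)
    dt           : time sampling (s)
    v            : constant velocity (m/s)
    x_img, z_img : image point coordinates (m)
    """
    n_traces, n_samples = data.shape
    traces = np.arange(n_traces)
    image = np.zeros((len(x_img), len(z_img)))
    for i, xi in enumerate(x_img):
        for j, zj in enumerate(z_img):
            # Two-way time from each surface position to the image point.
            t = 2.0 * np.sqrt(zj**2 + (x - xi)**2) / v
            it = np.rint(t / dt).astype(int)
            inside = it < n_samples
            # Sum the recorded amplitudes along the diffraction hyperbola.
            image[i, j] = data[traces[inside], it[inside]].sum()
    return image

# Hypothetical test data (assumption): one point diffractor at x = 500 m, z = 300 m.
v, dt = 2000.0, 0.004
x = np.arange(0.0, 1001.0, 20.0)
data = np.zeros((len(x), 300))
t_diff = 2.0 * np.sqrt(300.0**2 + (x - 500.0)**2) / v
data[np.arange(len(x)), np.rint(t_diff / dt).astype(int)] = 1.0

z_img = np.arange(100.0, 601.0, 20.0)
image = diffraction_stack(data, x, dt, v, x, z_img)
i_max, j_max = np.unravel_index(image.argmax(), image.shape)
```

The image is largest at the diffractor position because only there does the summation curve coincide with the recorded hyperbola on every trace. With an inaccurate velocity the two curves no longer match, which is the kinematic dependence on the macro-model discussed in this chapter.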

Prestack migration is now commonly used as it is the most efficient method for the imaging of complex media (Schultz and Sherwood, 1980; Wiggins, 1984). The migration of prestack data often uses the asymptotic approximation of the wave equation for Kirchhoff migration (Carter and Frazer, 1984; Beylkin, 1985; Miller et al., 1987; Keho and Beydoun, 1988). Some limitations of prestack migration have nevertheless been pointed out, as ray methods exploit the first arrivals, which may not be the most energetic arrivals (Geoltrain and Brac, 1993; Moser, 1991; Gray and May, 1994). Alternative approaches based on the one-way approximation of the wave equation have been shown to be more robust since they account for later arrivals (Ehinger et al., 1996). However, even when the macro-model correctly explains the kinematics of the data, migration was not originally expected to provide a quantitative estimate of the reflectivity field. Therefore, the result of migration cannot be linked to physical parameters that may more rigorously describe the model.

1.3 Quantitative migration and linearised inversion

Many techniques have been proposed to provide quantitative estimates of the model parameters. Most of these techniques rely on (1) ray theory and (2) the Born approximation (Born, 1923) or the Kirchhoff approximation (Schneider, 1978). They can all be defined within the context of linearised inverse problem theory (Lailly, 1983; Tarantola, 1984b), where the inverse of the linear forward operator must be estimated. In an early work, Cohen and Bleistein (1979) defined a direct inverse operator for zero offset data. Thereafter, inversion procedures for prestack data were proposed (Clayton and Stolt, 1981; Beylkin, 1985; Ikelle et al., 1986; Miller et al., 1987). These developments initiated the concept of migration/inversion, which uses kinematic Kirchhoff migration with amplitudes estimated using the Born or the Kirchhoff approximation (Bleistein, 1987; Bleistein et al., 1987; Beydoun and Mendes, 1989). These types of migration methods are commonly used because of their computational efficiency in handling very large data sets (Thierry et al., 1999). As an alternative to the direct inverse method, iterative methods are often considered more robust (Lambaré et al., 1992; Jin et al., 1992; Nemeth et al., 1999). Iterative methods rely on the explicit minimisation of a least-squares misfit function and are more efficient in handling incomplete data sets. As discussed in the previous section, one of the difficulties of ray-based migration is accounting for multipathing. More advanced ray tracing methods have therefore been proposed based on the concept of wavefront construction (Vinje et al., 1993; Lambaré et al., 1996) and have been shown to improve the quantitative estimates of migration/inversion (Operto et al., 2000).

However, the accuracy of the estimation of the model parameters using linearised inversion is limited by the validity of the Born or the Kirchhoff approximation. For example, the Born approximation only accounts for first order scattering and is only valid in the presence of small velocity perturbations (Keller, 1969; Beylkin and Ostroglio, 1985; Beydoun and Tarantola, 1988). On the other hand, the Kirchhoff approximation accounts for the reflectivity, which cannot directly be related to velocity. Amplitude versus offset (AVO) analysis is then required to quantify the model in terms of velocity (see for example Yilmaz (1987)).
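To make the first order nature of the Born approximation explicit, the following derivation is a sketch under stated assumptions rather than the formulation used later in the thesis. It assumes a constant-density acoustic medium in the frequency domain, squared slowness $m = 1/c^2$ as the model parameter, a source term $s$, and a background Green's function $G_0$ satisfying $(\nabla^2 + \omega^2 m_0)\, G_0(\mathbf{x},\mathbf{x}') = -\delta(\mathbf{x}-\mathbf{x}')$. The symbols and sign conventions are chosen for this illustration only.

```latex
\begin{align}
\left(\nabla^2 + \omega^2 m\right) u &= -s,
  \qquad m = m_0 + \delta m, \quad u = u_0 + \delta u, \\
\left(\nabla^2 + \omega^2 m_0\right) \delta u
  &= -\omega^2\, \delta m \left(u_0 + \delta u\right), \\
\delta u(\mathbf{x}) &= \omega^2 \int G_0(\mathbf{x},\mathbf{x}')\,
  \delta m(\mathbf{x}') \left[u_0(\mathbf{x}') + \delta u(\mathbf{x}')\right] d\mathbf{x}', \\
\delta u(\mathbf{x}) &\approx \omega^2 \int G_0(\mathbf{x},\mathbf{x}')\,
  \delta m(\mathbf{x}')\, u_0(\mathbf{x}')\, d\mathbf{x}'
  \qquad \text{(Born approximation)}.
\end{align}
```

The last line drops the scattered field $\delta u$ inside the integral, so that the data perturbation becomes linear in $\delta m$. This is justified only when $\delta u$ is small compared with $u_0$, and it discards multiple scattering.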

1.4 Migration/Inversion and non-linearity

It is well known that the determination of model parameters from seismic data is a non-linear inverse problem. The issue of non-linearity is partially addressed in imaging by decomposing the process into a determination of the macro-model followed by (quantitative) migration. It is commonly accepted that migration will fail, at least kinematically, if the low wavenumber components of the model are inaccurate. Therefore, the non-linearity of the inverse problem is expected to be addressed by the determination of the macro-model. In addition, the linearised inversion approach assumes that the quantification of the model parameters is a linear process once the kinematic aspects of the data are resolved. The non-linear inverse problem is therefore only partly solved by decomposing the problem into a recovery of the low wavenumbers followed by the reconstruction of the high wavenumbers. However, as illustrated by Stork (1992a) in the case of flattening of CIGs, the problem of determining the macro-model is non-unique and can easily provide a wrong answer. As a result, migration may locate the reflectors at a wrong depth and linearised inversion may provide an inaccurate evaluation of the model parameters.

1.5 Non-linear waveform inversion

As an alternative to imaging techniques, the non-linear waveform inversion approach has been proposed. The objective of waveform inversion is to estimate a quantitative model of the subsurface in a way that minimises the differences (residuals) between observed and calculated data, by modelling the physics of the measurement. This minimisation is achieved by locating the global minimum of the misfit function. The calculated data must therefore be generated using a fully non-linear modelling algorithm, usually a finite difference solver of the wave equation (Kelly et al., 1976). Some of the complexity of wave propagation in complex media is thus taken into account, and both the phase and the amplitude misfit of the data must be minimised.

Among non-linear waveform inversion methods, the global minimum may most accurately be found by performing a global search of the misfit function (Cary and Chapman, 1988; Sen and Stoffa, 1991). These methods carry out a systematic exploration of the multidimensional misfit function using, for example, Monte Carlo, genetic or simulated annealing algorithms. Although these techniques are capable of handling non-linear behaviour by inverting for both low and high wavenumbers, they require a number of forward modelling runs of the same order as the number of model parameters involved. Another approach is to decouple the low and high wavenumber information, inverting for each parameter set alternately (Snieder et al., 1989; Cao et al., 1990; Hicks and Pratt, 2001). This latter approach achieves the required decoupling by a re-parametrisation in time of the velocity model, thus ensuring that the zero offset traveltimes remain fixed during the reflectivity reconstruction. This method assumes a near-vertical propagation of the near offset events, and is thus limited to situations in which the 1-D approximation can be applied locally.

Because of the high computational cost of calculating synthetic seismic data, non-linear
waveform inversion is usually formulated as an iterative “descent” method, in which the
minimisation of residuals is achieved through the repeated calculation of a local gradient. The
gradient at each iteration provides the direction of minimisation of the “objective” functional,
usually the $L_2$ norm of the data residuals, possibly combined with some form of regularising
constraints. Since 1983, it has been recognised that the calculation of the gradient is of the
same computational order as the forward modelling task, and that the gradient is closely related
to seismic migration (Lailly, 1983; Tarantola, 1984a). Success depends upon the topography of
the misfit function at the location of an initial guess of the velocity model (the starting
model). In the presence of strong non-linearities, the inversion may fail and convergence into a
local minimum may occur (Gauthier et al., 1986). The starting model for such a scheme must
therefore be located in the neighbourhood of the global minimum, i.e., a descent path leading to
the global minimum must exist if the method is to succeed.
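The dependence of a descent method on its starting model can be illustrated with a toy problem. The following is a minimal sketch and is not taken from the thesis: the one-parameter misfit function, the step length and the two starting values are hypothetical, chosen only so that the function has a global minimum at m = 0 surrounded by local minima.

```python
import math

def misfit(m):
    # Hypothetical 1-D misfit: a broad quadratic bowl plus an oscillation
    # that creates local minima. The global minimum is at m = 0.
    return 0.05 * m**2 + 1.0 - math.cos(2.0 * m)

def gradient(m):
    # Analytical derivative of the misfit above.
    return 0.1 * m + 2.0 * math.sin(2.0 * m)

def descend(m_start, step=0.05, n_iter=2000):
    # Plain steepest descent with a fixed step length.
    m = m_start
    for _ in range(n_iter):
        m -= step * gradient(m)
    return m

near = descend(1.0)  # starting model inside the basin of the global minimum
far = descend(4.0)   # starting model inside a neighbouring basin

print(f"start 1.0 -> m = {near:.4f}, misfit = {misfit(near):.4f}")
print(f"start 4.0 -> m = {far:.4f}, misfit = {misfit(far):.4f}")
```

In this sketch, the first run reaches the global minimum while the second stops in a local minimum with a non-zero misfit, because no descent path links its starting value to the global minimum.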

Iterative non-linear waveform inversion was first applied to near offset reflection data to
recover the high wavenumber components of the model (Tarantola, 1986; Mora, 1987; Pica et al.,
1990; Crase et al., 1990). The advantage of the non-linear approach over the linearised approach
is that it will locate the global minimum more efficiently, since the misfit function is not
required to be locally quadratic. In other words, the recovery of the high wavenumbers is not
limited by the validity of the Born approximation, since the forward problem used does take into
account multiple scattering. However, the determination of the macro-model remains a critical
aspect of the non-linear inversion of near offset data, as the inversion will converge into a
local minimum if the starting macro-model velocities are not sufficiently accurate. Non-linear
inversion of near offset data therefore suffers from the same limitation as migration/inversion
techniques and depends strongly on the accuracy of the macro-model.

1.6 Non-linear inversion of wide-angle data

Mora (1988) recognised that the inversion of transmission data may allow the recovery of some<br />

of the low wavenumbers of the velocity mo<strong>de</strong>l. Non-linear waveform inversion may therefore<br />

be utilised for the recovery of both the low and the high wavenumbers as in Mora’s (1989)<br />

expression<br />

inversion = migration + tomography.<br />

The standard concept of imaging, which decomposes the model of the subsurface into the very low
and the very high wavenumbers, may be redefined: waveform inversion may be used to describe the
velocity model over a continuous range of wavenumbers so that the “hole” of wavenumber
information present in classical methods no longer exists. The determination of the intermediate
wavenumbers is particularly important for the recovery of the high wavenumbers. This was shown
by Versteeg (1993), who demonstrated that prestack depth migration in the Marmousi model may be
significantly improved when spatial wavelengths up to 200 m are present in the macro-model.

In order to exploit the information contained in transmission data such as diving and refracted
waves, waveform inversion has been applied to wide-angle (large offset) seismic data (Mora,
1988, 1989; Sun and McMechan, 1992; Pratt et al., 1996; Shipp and Singh, 2002). The introduction
of large offset data in the misfit function, however, increases the non-linearity of the inverse
problem. Since the traveltime error is the result of the integration of slowness errors over the
ray path, errors contained in the starting velocity model are expected to have a greater effect
on the large offset data, which correspond to the longest ray paths. Convergence into a local
minimum will yield a wrong update of the low and intermediate wavenumbers of the velocity model
and thus will have a dramatic effect on the reconstruction of the high wavenumbers. Therefore,
local minima must be very carefully avoided during the recovery of the low and intermediate
wavenumbers to ensure an accurate reconstruction of the high wavenumbers.
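The growth of the traveltime error with path length can be made concrete with a back-of-the-envelope calculation. This is only a sketch under strong assumptions that are not from the thesis: straight rays, a homogeneous medium, and a hypothetical starting velocity that is 2% too fast (2040 m/s instead of 2000 m/s).

```python
def traveltime_error(path_length_m, v_true, v_start):
    # For a straight ray in a homogeneous medium, t = L / v, so the
    # traveltime error is the slowness error multiplied by the path length.
    return path_length_m * (1.0 / v_start - 1.0 / v_true)

# Hypothetical ray path lengths in metres (short, intermediate, long).
path_lengths = [1000.0, 5000.0, 20000.0]
errors = [traveltime_error(L, v_true=2000.0, v_start=2040.0) for L in path_lengths]

for L, dt in zip(path_lengths, errors):
    print(f"path length {L:7.0f} m -> traveltime error {dt * 1000.0:7.1f} ms")
```

The same relative velocity error produces a traveltime error twenty times larger on the longest path than on the shortest one, which is why the large offset data are the most exposed to errors in the starting model.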

The linearity of the waveform inverse problem also varies with the temporal frequency of the
data, as low frequencies are more linear than high frequencies. The inversion of the lowest
frequencies in the data thus appears to be desirable in order to increase the chance of
convergence into the global minimum (Bunks et al., 1995). The typical seismic data bandwidth is
nevertheless limited by acquisition practice.
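A single-frequency toy problem gives some intuition for why low frequencies behave more linearly. The sketch below is an illustration only, not the method of the thesis. It assumes that the data are unit sinusoids and that the only unknown is a time shift; the mean-square misfit between two such sinusoids is then 1 - cos(2πfτ), whose basin around τ = 0 extends to half a period. The 0.08 s starting error is a hypothetical value.

```python
import math

def misfit(tau, freq):
    # Mean-square difference between two unit sinusoids of frequency `freq`
    # shifted by `tau` seconds (analytical expression).
    return 1.0 - math.cos(2.0 * math.pi * freq * tau)

def recovered_shift(tau_start, freq, n_iter=200):
    # Steepest descent on the time shift, with a step scaled by the
    # maximum curvature of the misfit so that the iteration is stable.
    omega = 2.0 * math.pi * freq
    tau = tau_start
    for _ in range(n_iter):
        grad = omega * math.sin(omega * tau)
        tau -= 0.5 / omega**2 * grad
    return tau

tau_start = 0.08  # hypothetical initial traveltime error in seconds
shift_5hz = recovered_shift(tau_start, 5.0)  # half period = 0.100 s
shift_7hz = recovered_shift(tau_start, 7.0)  # half period = 0.071 s

print(f"5 Hz: final shift = {shift_5hz:.4f} s")
print(f"7 Hz: final shift = {shift_7hz:.4f} s")
```

With these assumed numbers the 5 Hz run returns to zero shift, whereas the 7 Hz run locks onto the neighbouring cycle one period away: the same starting error is acceptable at the lower frequency and not at the higher one.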

Another important aspect of waveform inversion is the determination of the starting model, where
the initial descent direction of the misfit function is computed. The starting model must be
obtained using the standard methods discussed previously for the determination of the
macro-model (stacking velocity analysis, traveltime inversion, etc.). The requirements on the
starting model are often very vague, as it is merely considered that it must be in the
neighbourhood of the global minimum. This raises the important question of the compatibility
between the waveform inversion of the lowest frequencies available in the data and the
limitations of classical methods in terms of accuracy of the macro-model. It is primarily this
question that I seek to investigate in this thesis.

1.7 Aim of the thesis<br />

The objective of this thesis is to apply local, non-linear waveform inversion methods to
wide-angle seismic data for the recovery of a continuous range of wavenumbers of a complex
velocity model. Two velocity models will be used to investigate this question: the 2-D Marmousi
model and a 1-D velocity model representing the sub-basalt imaging problem.

The 2-D Marmousi numerical experiment explores the application of waveform inversion using a
realistic starting velocity model obtained from a standard method such as traveltime tomography.
The 1-D sub-basalt experiment analyses the requirements on the starting model that ensure the
success of waveform inversion.

The method of waveform inversion in the frequency domain will be used (Pratt and Worthington,
1990), as it presents significant advantages in terms of computational efficiency. This work
will examine the potential for applying waveform inversion in the frequency domain. The
difficulties caused by the non-linearity of the waveform inverse problem will also be
investigated, particularly with respect to the minimum frequency available in the data. This
thesis also seeks to propose a methodology to incorporate a range of preconditioning tools in
the inversion process (Tarantola, 1987; Mora, 1987) that may improve the linearity of the
inverse problem.



1.8 Summary of the thesis<br />

This thesis begins by reviewing the basics of inverse theory in Chapter 2. The resolution of
linear inverse problems using the singular value decomposition (SVD) method is first described.
This is followed by the definition of local methods using Gauss-Newton and gradient techniques.
A useful link is then established between the SVD and local methods. The resolution of
non-linear inverse problems using the local methods is then detailed. Numerical examples of the
resolution of a linear and a non-linear problem using local methods are shown. This review of
inverse theory provides the elements for the introduction of frequency domain waveform
inversion.

In Chapter 3, I aim to further understand the relation between temporal frequencies, offset and
wavenumber reconstruction of the model. A strategy for selecting frequencies is derived that
assures a continuous reconstruction of the wavenumber spectrum of the model. This strategy makes
use of the redundancy of information present when a range of offsets is available, which allows
temporal frequencies to be discretised with a much larger sampling interval than that dictated
by the sampling theorem. It is shown that the larger the maximum effective offset is, the fewer
frequencies are required. The accuracy of the strategy, exact in a 1-D earth, is validated on a
1-D numerical experiment. The strategy is also useful in more general complex media, as it is
successfully applied to the 2-D Marmousi model using a starting model that was obtained from
traveltime tomography. In this chapter, however, all waveform inversions start from
unrealistically low frequencies. A minimum frequency of 5 Hz is used in the 2-D Marmousi
experiment. Such a frequency is considered too low to be present in real seismic data.

In Chapter 4, I therefore investigate waveform inversion starting from a more realistic
frequency: 7 Hz. It is shown that at this frequency, waveform inversion can easily converge into
a local minimum. The aim of this chapter is therefore to further understand the mechanism
governing the inversion process in order to design a strategy that improves the linearity of the
inverse problem. An SVD is applied to simple 1-D models and indicates that the gradient vector
is dominated by the high wavenumbers. A linearity study is also carried out, showing that the
low wavenumbers are the most non-linear components of the model. As a result, the gradient
vector must be smoothed in order to assure the proper reconstruction of the low wavenumber
components. In order to mitigate the risk of convergence into a local minimum during the
recovery of the low wavenumbers, adequate preconditioning of the data residuals may be applied
that focuses on the inversion of the early arrivals.



In Chapter 5, I explore the feasibility of an application of waveform inversion to the
sub-basalt imaging problem. A non-linearity study is carried out that aims to define the
requirements on the starting model that ensure the success of waveform inversion. This study is
performed on a 1-D velocity model and emphasises the importance of the determination of the
starting model in the waveform inversion process. It is shown that the requirements on the
starting model depend on the minimum frequency available in the data: the higher this frequency
is, the more accurate the starting model needs to be.




Chapter 2<br />

Inverse theory<br />

2.1 Introduction<br />

The laws of physics allow us to simulate the action and interaction of physical parameters
within a given system called the model. This model is a representation of a natural system (the
earth, the subsurface, the atom, etc.) and is described by a set of model parameters (velocity,
density, conductivity, etc.). This model, once excited, yields a set of measurable quantities:
the data. The system of equations relating the data $\mathbf{d}$ to the model $\mathbf{m}$ is
called the forward problem and is written

$$ g(\mathbf{m}) = \mathbf{d} \tag{2.1} $$

where $g(\mathbf{m})$ is a set of mathematical equations dependent on $\mathbf{m}$.

The reverse action consists in finding the model $\mathbf{m}$ that explains a set of observed
data $\mathbf{d}_{obs}$ and is called the inverse problem. The complexity of the inverse problem
is closely related to the complexity of the forward problem. The latter must be known in order
to determine the inverse solution. Often, this solution is an estimate and differs from the true
model because the forward modelling provides incomplete information on the model. Furthermore,
the forward problem is an approximation and the model is a simplified representation of the true
system. In addition, the observed data are recorded by instruments and thus contain noise.

In waveform inversion, we aim to recover a quantitative representation of a model of the
subsurface. The model is most often characterised by the P-wave velocity varying with depth,
although other quantities may be accounted for (S-wave velocity, quality factor, etc.). The
information used as data are the seismic traces recorded at the earth’s surface by receivers.
These receivers record in time the signal propagated through the subsurface from a source. The
concept of forward and inverse problems in the context of waveform inversion is illustrated in
Figure 2.1.

[Figure 2.1: diagram with the labels Model, Data, Depth, Time, Theory, Forward Problem and
Inverse Problem.]

Figure 2.1: The Inverse and the Forward Problem in waveform inversion. The inverse problem seeks
to estimate a depth velocity model of the subsurface from seismic data recorded in the field.

A straightforward solution to the inverse problem is to define the inverse operator $g^{-1}$
such that the estimated solution is

$$ \mathbf{m}^\dagger = g^{-1}(\mathbf{d}). \tag{2.2} $$

These methods will be referred to as direct methods.

In order to avoid the estimation of the inverse operator, local methods may be used, which aim
to estimate the model update $\Delta\mathbf{m}$ of an a priori model $\mathbf{m}_o$ such that

$$ \mathbf{m}^\dagger = \mathbf{m}_o + \Delta\mathbf{m}. \tag{2.3} $$

The model update is found by minimising the misfit function, which measures the mismatch between
the observed and forward modelled data. These methods are local in the sense that the path
leading to the inverse solution depends on the initial guess of the model $\mathbf{m}_o$.

Direct and local methods are widely used in seismic exploration, and I will focus in this
chapter on the resolution of the inverse problem in the cases where:

• The forward problem is linear

• The forward problem is non-linear

The resolution of the waveform inversion problem, implemented in the frequency domain, will be
introduced in the final section.

2.2 Inversion of linear problems

For a linear system, the forward problem is a linear combination of model parameters and can be
expressed in discretised form as

$$ \mathbf{G}\mathbf{m} = \mathbf{d} \tag{2.4} $$

where $\mathbf{m}$ is the model vector of dimension $(n_m \times 1)$ and $\mathbf{d}$ is the
data vector of dimension $(n_d \times 1)$. $\mathbf{G}$ is the $(n_d \times n_m)$ forward
operator matrix of coefficients independent of $\mathbf{m}$, which projects a model vector into
the data space. Finding the solution to equation (2.4) consists in finding the vector
$\mathbf{m}^\dagger$ that explains the observed data $\mathbf{d}_{obs}$.

The direct method consists in finding the inverse matrix $\mathbf{G}^{-1}$ of the matrix
$\mathbf{G}$ such that the inverse solution is

$$ \mathbf{m}^\dagger = \mathbf{G}^{-1}\mathbf{d}. \tag{2.5} $$

The resolution of the inverse problem using direct methods thus implies that the matrix inverse
can be found or at least estimated. The inverse matrix $\mathbf{G}^{-1}$ is defined as

$$ \mathbf{G}^{-1}\mathbf{G} = \mathbf{I}_{n_m} \ \text{for } n_m \le n_d \quad \text{and/or} \quad \mathbf{G}\mathbf{G}^{-1} = \mathbf{I}_{n_d} \ \text{for } n_m \ge n_d \tag{2.6} $$

where $\mathbf{I}_n$ is the identity matrix of dimension $(n \times n)$.

The inverse matrix $\mathbf{G}^{-1}$ will not exist if the matrix $\mathbf{G}$ is not
invertible, in which case a solution of the type given by equation (2.5) cannot be obtained.
Furthermore, the existence of the inverse matrix does not assure the solution to be unique, as
there may be none, one or an infinite number of solutions, as shown in Table 2.1 (p. 48 of
Scales et al. (2001)). If $\mathbf{G}$ is invertible and of dimension $(n_d \times n_m)$, a
direct solver such as LU decomposition may be used (Press et al., 1992) in the case of a
well-posed inverse problem.
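As a small illustration of the direct method, the sketch below solves a square system with NumPy. The matrices and the model vector are hypothetical values, and `numpy.linalg.solve` (which relies on an LU factorisation with partial pivoting) stands in here for the LU solver mentioned above.

```python
import numpy as np

# Hypothetical well-posed square problem (n_m = n_d = 2).
G = np.array([[2.0, 1.0],
              [1.0, 3.0]])
m_true = np.array([1.0, -2.0])
d = G @ m_true  # forward problem, equation (2.4)

# Solve G m = d by factorising G rather than forming the inverse explicitly.
m_est = np.linalg.solve(G, d)

# A singular operator has no inverse, so the direct method fails.
G_singular = np.array([[1.0, 2.0],
                       [2.0, 4.0]])
try:
    np.linalg.solve(G_singular, d)
    direct_method_failed = False
except np.linalg.LinAlgError:
    direct_method_failed = True

print("estimated model:", m_est)
print("direct method failed on singular G:", direct_method_failed)
```

The second system is the situation for which the generalised inverse becomes useful, since it still returns an estimate when no inverse matrix exists.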

In the next section, I will introduce the technique of Singular Value Decomposition (SVD), which
allows one to obtain an estimate of the inverse matrix called the generalised inverse
$\mathbf{G}^\dagger$. The SVD is a powerful method since it also diagnoses the problem by
providing useful information on how well the problem is posed. It also yields an estimate of the
solution $\mathbf{m}^\dagger$, even in the case where $\mathbf{G}$ is not invertible.



Unknowns/Equations | $\mathbf{G}^{-1}$ exists if $\mathbf{G}$ has | Solution
$n_m < n_d$ | linearly independent columns | At most one
$n_m = n_d$ | linearly independent columns | Only one
$n_m > n_d$ | $n_d$ linearly independent columns | More than one

Table 2.1: Condition of existence of $\mathbf{G}^{-1}$ in equation (2.5) (p. 48 of Scales et al. (2001)).

2.2.1 Singular value decomposition

The Singular Value Decomposition (SVD) method was introduced by Lanczos (1961). I briefly review
in this section the principal results of the SVD. Further details on the method may be found in
Aki and Richards (1980); Menke (1989); Scales et al. (2001).

The singular value decomposition of any matrix $\mathbf{G}$ of dimension $(n_d \times n_m)$ is
expressed as

$$ \mathbf{G} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{V}^t \tag{2.7} $$

where the columns of $\mathbf{U}$ (dimension $(n_d \times n_d)$) are the data space eigenvectors
of $\mathbf{G}\mathbf{G}^t$, and the columns of $\mathbf{V}$ (dimension $(n_m \times n_m)$) are
the model space eigenvectors of $\mathbf{G}^t\mathbf{G}$. The matrix $\boldsymbol{\Lambda}$ of
dimension $(n_d \times n_m)$ is diagonal, and its diagonal elements are called singular values.
There are $\min(n_d, n_m)$ singular values, which are defined as the square roots of the first
$\min(n_d, n_m)$ eigenvalues of

$$ \begin{aligned} \mathbf{G}^t\mathbf{G}\,\mathbf{v}_i &= \lambda_i^2\,\mathbf{v}_i, & i &= 1, \dots, n_m \\ \mathbf{G}\mathbf{G}^t\,\mathbf{u}_i &= \lambda_i^2\,\mathbf{u}_i, & i &= 1, \dots, n_d \end{aligned} \tag{2.8} $$

The data eigenvectors $\mathbf{u}_i$ and model eigenvectors $\mathbf{v}_i$ are the columns of
$\mathbf{U}$ and $\mathbf{V}$ respectively, such that $\mathbf{U}$ and $\mathbf{V}$ each form a
basis of orthonormal vectors. Any vector of the data or model space can thus be expressed as a
linear combination of eigenvectors and we have

$$ \mathbf{U}^t\mathbf{U} = \mathbf{U}\mathbf{U}^t = \mathbf{I}_{n_d} \tag{2.9} $$

$$ \mathbf{V}^t\mathbf{V} = \mathbf{V}\mathbf{V}^t = \mathbf{I}_{n_m} \tag{2.10} $$

Note that the eigenvalues $\lambda_i^2$ defined in equation (2.8) may be positive or zero real
numbers, so that the diagonal terms of $\boldsymbol{\Lambda}$ may contain zeros. If
$n_m \neq n_d$, the numbers of eigenvectors of the data and model spaces are different. If
$p \le \min(n_d, n_m)$ is the number of non-zero singular values, the data and model spaces each
have $p$ eigenvectors associated with non-zero eigenvalues.



Figure 2.2: Illustration of the Singular Value Decomposition of the forward operator
$\mathbf{G}$. The SVD identifies the model and data null spaces $\mathbf{V}_0$ and
$\mathbf{U}_0$. $\mathbf{G}$ can be reduced to a projection operator from $\mathbf{V}_p$ to
$\mathbf{U}_p$.

Therefore, there will be $(n_m - p)$ model space eigenvectors and $(n_d - p)$ data space
eigenvectors with the eigenvalue zero. The projection of eigenvectors from the model to the data
space is

$$ \begin{cases} \mathbf{G}\mathbf{v}_i = \lambda_i\,\mathbf{u}_i & \text{for } i = 1, \dots, p \\ \mathbf{G}\mathbf{v}_i = 0 & \text{for } i = p+1, \dots, n_m \end{cases} \tag{2.11} $$

and from the data to the model space is

$$ \begin{cases} \mathbf{G}^t\mathbf{u}_i = \lambda_i\,\mathbf{v}_i & \text{for } i = 1, \dots, p \\ \mathbf{G}^t\mathbf{u}_i = 0 & \text{for } i = p+1, \dots, n_d \end{cases} \tag{2.12} $$

Equations (2.11) and (2.12) imply that, for the $p$ non-zero eigenvalues, an eigenvector of the
model space is coupled with an eigenvector of the data space that has the same eigenvalue. The
projection of an eigenvector with a zero eigenvalue has no component in the corresponding space.
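These relations can be checked numerically. The following is a minimal sketch using NumPy on a hypothetical $(2 \times 3)$ operator; the matrix values and the tolerance used to count non-zero singular values are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical forward operator with n_d = 2 data and n_m = 3 model parameters.
G = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0]])
n_d, n_m = G.shape

# Full SVD: U is (n_d x n_d), s holds min(n_d, n_m) singular values,
# and Vt is the transpose of V (n_m x n_m).
U, s, Vt = np.linalg.svd(G)
V = Vt.T
p = int(np.sum(s > 1e-12))  # number of non-zero singular values

# Equation (2.8): squared singular values are eigenvalues of G^t G.
eigenvalues = np.linalg.eigvalsh(G.T @ G)[::-1]  # sorted in descending order
squares_match = np.allclose(eigenvalues[:p], s[:p] ** 2)

# Equation (2.11): coupled eigenvectors for i <= p, null space for i > p.
coupled = all(np.allclose(G @ V[:, i], s[i] * U[:, i]) for i in range(p))
null_space = all(np.allclose(G @ V[:, i], 0.0) for i in range(p, n_m))

print("p =", p)
print("eigenvalue check:", squares_match)
print("coupling check:", coupled, "| null space check:", null_space)
```

In this example $p = n_d = 2$, so there is a one-dimensional model null space and no data null space.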

Understanding the forward problem

The matrices $\mathbf{U}$ and $\mathbf{V}$ can each be decomposed into two sub-spaces
represented by the eigenvectors with non-zero and zero eigenvalues (as shown in Figure 2.2)

$$ \mathbf{U} = (\mathbf{U}_p, \mathbf{U}_0), \qquad \mathbf{V} = (\mathbf{V}_p, \mathbf{V}_0) \tag{2.13} $$

where the columns of $\mathbf{U}_p$ $(n_d \times p)$ and $\mathbf{V}_p$ $(n_m \times p)$ are the
eigenvectors associated with non-zero eigenvalues, and the columns of $\mathbf{U}_0$
$(n_d \times (n_d - p))$ and $\mathbf{V}_0$ $(n_m \times (n_m - p))$ are the eigenvectors
associated with zero eigenvalues. $\mathbf{U}_0$ and $\mathbf{V}_0$ respectively represent the
data and model null spaces. Although equations (2.9) and (2.10) always hold, the same may not be
true for the sub-spaces $\mathbf{U}_p$ and $\mathbf{V}_p$ if either a model or a data null space
exists. Since the eigenvectors are normalised, we will always have

$$ \mathbf{U}_p^t\mathbf{U}_p = \mathbf{V}_p^t\mathbf{V}_p = \mathbf{I}_p \tag{2.14} $$

but as the eigenvectors with non-zero eigenvalues may not span the entire space we have

$$ \begin{aligned} \mathbf{U}_p\mathbf{U}_p^t &\neq \mathbf{I}_{n_d} & \text{if } \mathbf{U}_0 \text{ exists} \\ \mathbf{V}_p\mathbf{V}_p^t &\neq \mathbf{I}_{n_m} & \text{if } \mathbf{V}_0 \text{ exists} \end{aligned} \tag{2.15} $$

The determination of the eigenvectors and eigenvalues belonging to each sub-space provides us
with very useful information. It allows us to identify the portions of the model and data spaces
effectively involved in the forward problem since:

• $\mathbf{U}_p$ is sensitive to $\mathbf{V}_p$

• $\mathbf{U}_p$ is insensitive to $\mathbf{V}_0$

• $\mathbf{U}_0$ is insensitive to $\mathbf{V}_p$, $\mathbf{V}_0$

Within $\mathbf{U}_p$, the most sensitive portion of the data space is spanned by the
eigenvectors with the highest eigenvalues.

By defining the diagonal square matrix $\boldsymbol{\Lambda}_p$ of dimension $(p \times p)$, a
sub-matrix of $\boldsymbol{\Lambda}$ which has non-zero values along its diagonal, we have

$$ \boldsymbol{\Lambda} = \begin{pmatrix} \boldsymbol{\Lambda}_p & 0 \\ 0 & 0 \end{pmatrix}. \tag{2.16} $$

We can express the forward operator of equation (2.7) as

$$ \mathbf{G} = (\mathbf{U}_p, \mathbf{U}_0) \begin{pmatrix} \boldsymbol{\Lambda}_p & 0 \\ 0 & 0 \end{pmatrix} (\mathbf{V}_p, \mathbf{V}_0)^t. \tag{2.17} $$

The forward problem can then be reduced to the expression

$$ \mathbf{G} = \mathbf{U}_p\boldsymbol{\Lambda}_p\mathbf{V}_p^t. \tag{2.18} $$


Equation (2.18) shows that the SVD allows the reduction of the forward operator $\mathbf{G}$ to
its efficient state of information. The forward problem may use only the portion of the model
space constrained to $\mathbf{V}_p$ to model only the data contained in $\mathbf{U}_p$. The
other portion of the data space, $\mathbf{U}_0$, cannot be modelled.
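The reduction of the operator to its non-null sub-spaces can be sketched numerically as follows. The rank-deficient $(3 \times 3)$ matrix and the threshold separating zero from non-zero singular values are hypothetical choices made for this illustration.

```python
import numpy as np

# Hypothetical rank-deficient operator: rows 2 and 3 are identical, so one
# model direction and one data direction fall in the null spaces.
G = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10))  # number of non-zero singular values

# Split U and V into non-null and null sub-spaces, equation (2.13).
U_p, U_0 = U[:, :p], U[:, p:]
V_p, V_0 = Vt[:p, :].T, Vt[p:, :].T
Lambda_p = np.diag(s[:p])

# Reduced form of the forward operator, equation (2.18).
G_reduced = U_p @ Lambda_p @ V_p.T

print("p =", p)
print("G recovered from reduced form:", np.allclose(G_reduced, G))
print("U_p^t U_p = I_p:", np.allclose(U_p.T @ U_p, np.eye(p)))
print("U_p U_p^t = I_nd:", np.allclose(U_p @ U_p.T, np.eye(3)))
print("G V_0 = 0:", np.allclose(G @ V_0, 0.0))
print("U_0^t G = 0:", np.allclose(U_0.T @ G, 0.0))
```

The last two checks express that model components in $\mathbf{V}_0$ produce no data, and that no model can produce data along $\mathbf{U}_0$.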

Solution estimate of the inverse problem

We have seen that the SVD allows the diagnosis of the forward problem by identifying the
portions of the data and model spaces $\mathbf{U}_p$, $\mathbf{V}_p$. The SVD technique may also
be used to estimate a solution of the inverse problem. Reformulating the state of information
from the perspective of inversion, the sub-space

• $\mathbf{V}_p$ is constrained by $\mathbf{U}_p$ only

• $\mathbf{V}_0$ is undetermined

• $\mathbf{U}_0$ is useless

Since $\mathbf{V}_p$ can be determined using only the information contained in $\mathbf{U}_p$,
we define the generalised inverse $\mathbf{G}^\dagger$ as

$$ \mathbf{G}^\dagger = \mathbf{V}_p\boldsymbol{\Lambda}_p^{-1}\mathbf{U}_p^t. \tag{2.19} $$

The generalised inverse $\mathbf{G}^\dagger$ is an estimate of the inverse matrix
$\mathbf{G}^{-1}$ (which may not exist, see Table 2.1). The estimate of the inverse problem
solution $\mathbf{m}^\dagger$ is then defined as

$$ \mathbf{m}^\dagger = \mathbf{G}^\dagger\mathbf{d}. \tag{2.20} $$

In order to evaluate how well the model is resolved, equation (2.4) may be substituted into
equation (2.20), yielding

$$ \mathbf{m}^\dagger = \mathbf{G}^\dagger\mathbf{G}\mathbf{m} = \mathbf{R}_m\mathbf{m} \tag{2.21} $$

where $\mathbf{R}_m = \mathbf{G}^\dagger\mathbf{G}$ is the model resolution matrix. We can also
measure how well the estimated solution explains the data by defining $\mathbf{d}^\dagger$, the
data predicted by the generalised inverse solution

$$ \mathbf{d}^\dagger = \mathbf{G}\mathbf{m}^\dagger = \mathbf{G}\mathbf{G}^\dagger\mathbf{d} = \mathbf{R}_d\mathbf{d} \tag{2.22} $$

where $\mathbf{R}_d = \mathbf{G}\mathbf{G}^\dagger$ is the data resolution matrix.

If the model and data resolution matrices are equal to the identity matrix, the model is
perfectly resolved and the data perfectly explained. This is possible only if $p = n_m = n_d$. A
review of the possible cases is shown in Table 2.2. The interesting point is that even though
$\mathbf{G}^{-1}$ may not exist for $p < \min(n_d, n_m)$, the generalised inverse will still
provide an estimate of the solution, but neither the data nor the model will be perfectly
resolved ($\mathbf{R}_m, \mathbf{R}_d \neq \mathbf{I}$).

Case | $\mathbf{V}_0$ exists | $\mathbf{U}_0$ exists | $\mathbf{R}_m = \mathbf{I}$ | $\mathbf{R}_d = \mathbf{I}$ | $\mathbf{G}^\dagger = \mathbf{G}^{-1}$
$p = n_m < n_d$ | No | Yes | Yes | No | Yes
$p = n_m = n_d$ | No | No | Yes | Yes | Yes
$p = n_d < n_m$ | Yes | No | No | Yes | Yes
$p < \min(n_d, n_m)$ | Yes | Yes | No | No | No

Table 2.2: Review of possible cases.
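The third row of Table 2.2 ($p = n_d < n_m$) can be reproduced with a few lines of NumPy. This is a sketch on a hypothetical $(2 \times 3)$ operator and a hypothetical true model; `numpy.linalg.pinv` is used only as an independent cross-check of the generalised inverse built from equation (2.19).

```python
import numpy as np

# Hypothetical under-determined problem: n_d = 2 data, n_m = 3 parameters.
G = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0]])
m_true = np.array([1.0, 1.0, 3.0])
d = G @ m_true

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-12))
U_p, V_p = U[:, :p], Vt[:p, :].T

# Generalised inverse, equation (2.19).
G_dagger = V_p @ np.diag(1.0 / s[:p]) @ U_p.T

m_est = G_dagger @ d  # solution estimate, equation (2.20)
R_m = G_dagger @ G    # model resolution matrix, equation (2.21)
R_d = G @ G_dagger    # data resolution matrix, equation (2.22)

print("m_est =", m_est)
print("R_m =\n", R_m)
print("R_d =\n", R_d)
```

Here the data are perfectly explained ($\mathbf{R}_d = \mathbf{I}$), but the first and third parameters are only resolved through their average: the estimate is (2, 1, 2) rather than the true (1, 1, 3).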

2.2.2 Local methods<br />

The local methods rely on the optimisation of the fit between observed and calculated data. The
data are calculated in a reference model $\mathbf{m}_o$ using the forward problem, and the
optimisation process provides the model update direction $\Delta\mathbf{m}$ that minimises the
data residuals vector $\Delta\mathbf{d}$ defined as

$$ \Delta\mathbf{d} = \mathbf{d}_{calc} - \mathbf{d}_{obs}, \tag{2.23} $$

where $\mathbf{d}_{calc}$ and $\mathbf{d}_{obs}$ are, respectively, the calculated and observed
data vectors. Therefore, an initial guess of the model $\mathbf{m}_o$ (or a priori model,
starting model) is required to initiate the inversion process. In some cases, we will see that
iterations are necessary as the first model update will not complete the goal of optimisation.
The criterion used for the optimisation process is the minimisation of the misfit function (or
objective function, cost function or least squares function). This misfit function $E$ is most
often defined as the $L_2$ norm of the data residuals (Tarantola, 1987)

$$ E(\mathbf{m}) = \tfrac{1}{2}\left(\mathbf{d}_{calc}(\mathbf{m}) - \mathbf{d}_{obs}\right)^t\left(\mathbf{d}_{calc}(\mathbf{m}) - \mathbf{d}_{obs}\right) = \tfrac{1}{2}\,\Delta\mathbf{d}^t\Delta\mathbf{d} \tag{2.24} $$


The minimisation of the function $E(\mathbf{m})$ is achieved by finding the model parameters
$\mathbf{m}^\dagger$ such that $E(\mathbf{m}^\dagger)$ is minimum. The minimisation is local:
the quantities involved in the optimisation of the misfit function depend on the starting model
$\mathbf{m}_o$. All local methods require the calculation of the local gradient of the misfit
function $\nabla_m E$, indicating the direction of increase of the misfit function: the steepest
ascent. The gradient vector contains the first derivatives of the misfit function
$E(\mathbf{m})$ with respect to the model parameters $\mathbf{m}$ and is defined as

$$ \nabla_m E = \frac{\partial E}{\partial \mathbf{m}}. \tag{2.25} $$

By differentiating the misfit function expressed in equation (2.24), the gradient is given by

$$ \nabla_m E(\mathbf{m}) = \mathbf{F}^t\Delta\mathbf{d} \tag{2.26} $$

where $\mathbf{F}$ is the Fréchet derivative matrix: the derivative of the data with respect to
the model

$$ \mathbf{F} = \frac{\partial \mathbf{d}}{\partial \mathbf{m}}. \tag{2.27} $$

For a linear problem, the Fréchet derivative matrix is constant with respect to $\mathbf{m}$ and
is expressed as

$$ \mathbf{F} = \frac{\partial \mathbf{d}}{\partial \mathbf{m}} = \frac{\partial (\mathbf{G}\mathbf{m})}{\partial \mathbf{m}} = \mathbf{G}, \tag{2.28} $$

which is simply the forward operator $\mathbf{G}$ and is therefore independent of $\mathbf{m}$.
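The gradient expression for a linear problem, $\nabla_m E = \mathbf{G}^t\Delta\mathbf{d}$, can be verified against a finite-difference approximation of the misfit function. The operator, observed data, test model and perturbation size below are hypothetical values chosen for this sketch.

```python
import numpy as np

# Hypothetical linear forward operator and observed data (n_d = 3, n_m = 2).
G = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])
d_obs = np.array([1.0, 0.5, -2.0])

def misfit(m):
    # L2 misfit, equation (2.24), with d_calc = G m.
    residuals = G @ m - d_obs
    return 0.5 * residuals @ residuals

def gradient(m):
    # Equation (2.26) with F = G for a linear problem, equation (2.28).
    return G.T @ (G @ m - d_obs)

m_o = np.array([0.3, -0.7])  # arbitrary model at which the gradient is tested
h = 1e-6                     # finite-difference perturbation

# Central differences, one model parameter at a time.
fd_gradient = np.array([
    (misfit(m_o + h * e) - misfit(m_o - h * e)) / (2.0 * h)
    for e in np.eye(len(m_o))
])

print("analytical gradient:       ", gradient(m_o))
print("finite-difference gradient:", fd_gradient)
```

The finite-difference version needs two misfit evaluations per model parameter, whereas the analytical expression only requires one product with the transpose of the forward operator.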

2.2.2.1 The Gauss-Newton method<br />

Since the forward problem is linear, the misfit function is exactly quadratic and can be expressed as a second-order Taylor series

\[
E(\mathbf{m}^\dagger) = E(\mathbf{m}_o + \Delta\mathbf{m}) = E(\mathbf{m}_o) + \nabla_m E^t \Delta\mathbf{m} + \tfrac{1}{2}\,\Delta\mathbf{m}^t \mathbf{H} \Delta\mathbf{m} \tag{2.29}
\]

where $\mathbf{H}$ is the Hessian, the matrix of second derivatives, defined as

\[
\mathbf{H} = \frac{\partial (\nabla E)}{\partial \mathbf{m}} = \frac{\partial \mathbf{F}^t}{\partial \mathbf{m}} \underbrace{\left(\Delta\mathbf{d} \cdots \Delta\mathbf{d}\right)}_{n_m} + \mathbf{F}^t \mathbf{F}. \tag{2.30}
\]

For a linear problem, as the Fréchet derivative matrix is independent of $\mathbf{m}$, the Hessian reduces to

\[
\mathbf{H} = \mathbf{F}^t \mathbf{F}. \tag{2.31}
\]


The minimum of the misfit function is reached at $\mathbf{m}^\dagger$ if the gradient at this location is zero, i.e. $\nabla_m E(\mathbf{m}^\dagger) = 0$. Using equation (2.29) to express the derivative of $E$ at $\mathbf{m}^\dagger$, we have

\[
\nabla_m E(\mathbf{m}^\dagger) = \nabla_m E(\mathbf{m}_o) + \mathbf{H} \Delta\mathbf{m}^\dagger \tag{2.32}
\]

and the model update that locates the minimum of the misfit function is thus

\[
\Delta\mathbf{m}^\dagger = -\mathbf{H}^{-1} \nabla_m E(\mathbf{m}_o). \tag{2.33}
\]

The starting model is then updated, yielding the Gauss-Newton solution

\[
\mathbf{m}^\dagger = \mathbf{m}_o - \mathbf{H}^{-1} \nabla_m E. \tag{2.34}
\]

Each step involved in the calculation of the Gauss-Newton solution is illustrated in Figure 2.3. The minimum of the misfit function is expected to be found at the first model update. The gradient vector is easy to calculate, as it is simply the transpose of the forward modelling operator multiplied by the data residuals. However, determining the inverse of the Hessian may require significant additional steps.
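The one-step property of equation (2.34) can be sketched in Python for the 2 × 2 system of section 2.2.2.6. The explicit 2 × 2 inverse and the starting model are assumptions chosen to keep the example short; a larger problem would use a proper linear solver.

```python
# Forward operator and observed data of the 2x2 example (section 2.2.2.6).
G = [[1.0, 1.0], [-2.0, 3.0]]
d_obs = [2.0, 1.0]

def gauss_newton_step(m0):
    """Return m_dagger = m0 - H^{-1} grad E(m0), equation (2.34)."""
    dd = [sum(G[i][j] * m0[j] for j in range(2)) - d_obs[i] for i in range(2)]
    grad = [sum(G[i][j] * dd[i] for i in range(2)) for j in range(2)]
    # Hessian H = F^t F with F = G, equation (2.31).
    H = [[sum(G[k][i] * G[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # Explicit inverse of a 2x2 matrix.
    H_inv = [[H[1][1] / det, -H[0][1] / det],
             [-H[1][0] / det, H[0][0] / det]]
    dm = [-sum(H_inv[i][j] * grad[j] for j in range(2)) for i in range(2)]
    return [m0[i] + dm[i] for i in range(2)]

# Any starting model reaches the minimum (1, 1) in a single update.
print(gauss_newton_step([3.0, 5.0]))
print(gauss_newton_step([-4.0, 0.5]))
```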

[Figure 2.3, flowchart boxes in order: Starting Model; Forward Modeling; Residuals; Gradient; Hessian; Inverse Hessian; Model Update.]

Figure 2.3: Scheme of the Gauss-Newton method applied to a linear problem.

2.2.2.2 The gradient method<br />

The gradient method (also called the steepest descent method) avoids the determination of the inverse of the Hessian by updating the model in the steepest descent direction. Since the gradient vector points in the steepest ascent direction, the model may be updated according to

\[
\mathbf{m}^{(1)} = \mathbf{m}_o - \gamma \nabla_m E \tag{2.35}
\]

where $\gamma$ is the step length, a real scalar. For a suitably small positive $\gamma$, updating the model in the steepest descent direction ensures that the misfit function decreases, so that $E(\mathbf{m}^{(1)}) < E(\mathbf{m}_o)$. In the linear case the step length is straightforward to obtain: since the update direction is set by the gradient, $\gamma$ is the only unknown of the inverse problem. Finding the step length consists of estimating $\gamma$ such that the misfit function $E$ is minimum along the gradient direction. This is strictly equivalent to a Gauss-Newton method in which the misfit function is expressed as

\[
E(\mathbf{m}_o - \gamma \nabla_m E) = E(\mathbf{m}_o) - \gamma \left(\mathbf{F} \nabla_m E\right)^t \Delta\mathbf{d} + \tfrac{1}{2}\gamma^2 \left(\mathbf{F} \nabla_m E\right)^t \left(\mathbf{F} \nabla_m E\right) \tag{2.36}
\]

where the step length $\gamma$ is the unknown. The minimum of $E$ is obtained if

\[
\nabla_\gamma E = \frac{\partial E}{\partial \gamma} = 0 \tag{2.37}
\]

which yields a value of $\gamma$ given by

\[
\gamma = \frac{\left(\mathbf{F} \nabla_m E\right)^t \Delta\mathbf{d}}{\left(\mathbf{F} \nabla_m E\right)^t \left(\mathbf{F} \nabla_m E\right)}. \tag{2.38}
\]

The minimum of the misfit function is unlikely to be reached at the first model update, so that in general $\mathbf{m}^{(1)} \neq \mathbf{m}^\dagger$. Several model updates may be necessary to complete the optimisation process. The steps required to determine one model update form an iteration, as shown in Figure 2.4. Several iterations are often needed, and they are repeated until some convergence criteria are met.
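The iteration described above can be sketched as follows, using the step length of equation (2.38) on the 2 × 2 system of section 2.2.2.6. The helper names, the starting model and the number of iterations are assumptions made for this illustration.

```python
# Forward operator and observed data of the 2x2 example (section 2.2.2.6).
G = [[1.0, 1.0], [-2.0, 3.0]]
d_obs = [2.0, 1.0]

def matvec(A, x):
    """A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def rmatvec(A, y):
    """A^t y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def steepest_descent(m0, n_iter):
    """Gradient method with the linear step length of equation (2.38).

    Returns the final model and the misfit recorded before each update.
    """
    m = list(m0)
    history = []
    for _ in range(n_iter):
        dd = [c - o for c, o in zip(matvec(G, m), d_obs)]
        history.append(0.5 * dot(dd, dd))
        grad = rmatvec(G, dd)            # equation (2.26)
        Fg = matvec(G, grad)
        denom = dot(Fg, Fg)
        if denom == 0.0:                 # gradient is zero: already at the minimum
            break
        gamma = dot(Fg, dd) / denom      # equation (2.38)
        m = [mi - gamma * gi for mi, gi in zip(m, grad)]   # equation (2.35)
    return m, history

model, misfits = steepest_descent([3.0, 5.0], 60)
print(model)
print(misfits[:5])
```

Unlike the Gauss-Newton update, the first iteration does not land on the minimum; the misfit decreases progressively over the iterations.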

2.2.2.3 Interpretation of the Gradient and the Hessian using the SVD<br />

In this section, I will use the Singular Value Decomposition (SVD) as a tool to analyse the role of the gradient $\nabla_m E$ and the Hessian $\mathbf{H}$.

We can evaluate the gradient in terms of the Singular Value Decomposition of $\mathbf{G}$ described in section 2.2.1, as we have $\mathbf{F} = \mathbf{G}$ and $\nabla_m E = \mathbf{F}^t \Delta\mathbf{d}$. The eigenvectors of the Hessian are simply the model eigenvectors since $\mathbf{H} = \mathbf{F}^t \mathbf{F}$.

[Figure 2.4, flowchart of the iterative gradient method. Boxes: Starting Model; Forward Modeling, $\mathbf{d}_{calc} = g(\mathbf{m})$; Residuals; Misfit Function; Gradient; Steplength; Model Update; loop labelled "Iteration k+1".]

where $\boldsymbol{\Lambda}^2$ is an $(n_m \times n_m)$ diagonal matrix with the eigenvalues of $\mathbf{H}$ on the diagonal (some of which may be zero if a model null space exists) and zeros elsewhere. For a better understanding of equation (2.42), it can be written as a sum over the model eigenvectors:

\[
\nabla_m E = \sum_{i=1}^{n_m} \left( \lambda_i^2\, \mathbf{v}_i \mathbf{v}_i^t\, \Delta\mathbf{m}_{true} \right). \tag{2.43}
\]

The product $\mathbf{v}_i \mathbf{v}_i^t$ is an operator in the model space, of dimension $(n_m \times n_m)$, that projects any model vector onto the eigenvector $\mathbf{v}_i$ (Scales et al. (2001), section 4.9). As the eigenvectors form an orthonormal basis, the sum of the projections of the vector $\Delta\mathbf{m}_{true}$ onto each eigenvector $\mathbf{v}_i$ is simply the vector itself (see equation (2.10)):

\[
\Delta\mathbf{m}_{true} = \sum_{i=1}^{n_m} \left( \mathbf{v}_i \mathbf{v}_i^t\, \Delta\mathbf{m}_{true} \right) \tag{2.44}
\]

where each term $\mathbf{v}_i \mathbf{v}_i^t \Delta\mathbf{m}_{true}$ is the orthogonal projection of the vector $\Delta\mathbf{m}_{true}$ along the direction given by the eigenvector $\mathbf{v}_i$.

From equation (2.43), it is clear that the gradient vector is the sum of the projected components of $\Delta\mathbf{m}_{true}$ along the eigenvectors $\mathbf{v}_i$, each multiplied by the corresponding eigenvalue $\lambda_i^2$ of $\mathbf{H}$. Therefore, the components of the gradient vector on the eigenvector basis are the components of the true perturbation, stretched by the eigenvalues of $\mathbf{H}$, as shown in Figure 2.5. The gradient direction is the direction of the true model perturbation in a stretched model space.

Remarks on $\nabla_m E$ and the eigenvalues of $\mathbf{H}$

• If $\Delta\mathbf{m}_{true}$ has a single component, along one eigenvector with a non-zero eigenvalue, the gradient method converges in one iteration.

• If all the eigenvectors have the same eigenvalue, the gradient method converges in one iteration.

• If $\Delta\mathbf{m}_{true}$ has components in the null space, the gradient only contains information about the components of $\Delta\mathbf{m}_{true}$ along eigenvectors with non-zero eigenvalues.

• The solution $\Delta\mathbf{m}^\dagger$ is the projection of $\Delta\mathbf{m}_{true}$ onto the model subspace $\mathbf{V}_p$.

Figure 2.5: Illustration of the construction of the gradient vector from eigenvectors and eigenvalues for a 2-dimensional model space. The gradient vector is built from the components of the true model perturbation stretched by the eigenvalues of $\mathbf{H}$.

The Hessian<br />

We have seen that for the Gauss-Newton method, the gradient direction is multiplied by the inverse of the Hessian, so that the inverse solution can be written as

\[
\Delta\mathbf{m}^\dagger = \mathbf{H}^{-1} \mathbf{H}\, \Delta\mathbf{m}_{true}.
\]

The inverse Hessian undoes the stretch of the components of $\Delta\mathbf{m}_{true}$ present in the gradient vector.

If a model null space exists, $\mathbf{H}$ is not invertible and a Gauss-Newton solution of the type given by equation (2.34) cannot be obtained. Therefore, $\mathbf{H}$ must be restricted to $\mathbf{V}_p$ to be invertible, and the inverse Hessian becomes

\[
\mathbf{H}_p^{-1} = \mathbf{V}_p \boldsymbol{\Lambda}_p^{-2} \mathbf{V}_p^t. \tag{2.45}
\]

The Fréchet derivative matrix $\mathbf{F}$ may also be restricted to $\mathbf{U}_p$, $\mathbf{V}_p$ and the Gauss-Newton solution is then defined as

\[
\mathbf{m}^\dagger = \mathbf{m}_o + \mathbf{H}_p^{-1} \mathbf{F}_p^t \Delta\mathbf{d}.
\]

The Gauss-Newton solution therefore yields the same solution as the gradient method: the portion of $\Delta\mathbf{m}_{true}$ contained in $\mathbf{V}_p$. If the starting model is $\mathbf{m}_o = 0$, the solution given by local methods is identical to the generalised inverse solution. This solution nevertheless requires the dual implementation of the SVD and Gauss-Newton in order to discriminate the model null space. An alternative is to damp the Hessian by adding a constant along its diagonal:

\[
\mathbf{H}_d = \mathbf{H} + \kappa \mathbf{I} \tag{2.46}
\]

where $\kappa$ is a real, positive number. Because $\mathbf{H}_d$ has no zero eigenvalues, it is always invertible. Applying a damping term to the Hessian is equivalent to finding the Gauss-Newton solution that minimises a new weighted misfit function defined as

\[
E = \tfrac{1}{2}\left( \Delta\mathbf{d}^t \Delta\mathbf{d} + \kappa\, \Delta\mathbf{m}^t \Delta\mathbf{m} \right) \tag{2.47}
\]

which minimises both the data residuals and the length of the model update, in proportion to $\kappa$.
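The effect of the damping term of equation (2.46) can be illustrated with a small sketch. The rank-deficient operator `G` and the data below are hypothetical values chosen so that a model null space exists; they do not come from the text.

```python
# Hypothetical rank-deficient forward operator: both rows constrain only m1 + m2.
G = [[1.0, 1.0], [2.0, 2.0]]
d_obs = [2.0, 4.0]

def damped_gauss_newton(m0, kappa):
    """One Gauss-Newton update using the damped Hessian H_d = H + kappa I."""
    n = 2
    dd = [sum(G[i][j] * m0[j] for j in range(n)) - d_obs[i] for i in range(n)]
    grad = [sum(G[i][j] * dd[i] for i in range(n)) for j in range(n)]
    H = [[sum(G[k][i] * G[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    Hd = [[H[i][j] + (kappa if i == j else 0.0) for j in range(n)]
          for i in range(n)]
    det = Hd[0][0] * Hd[1][1] - Hd[0][1] * Hd[1][0]
    if det == 0.0:
        raise ZeroDivisionError("Hessian is singular; a positive kappa is needed")
    # dm = -Hd^{-1} grad, written out for a 2x2 matrix.
    dm = [-(Hd[1][1] * grad[0] - Hd[0][1] * grad[1]) / det,
          -(-Hd[1][0] * grad[0] + Hd[0][0] * grad[1]) / det]
    return [m0[i] + dm[i] for i in range(n)]

# With kappa = 0 the Hessian cannot be inverted; a small kappa gives a model
# close to the minimum-length solution (1, 1) when starting from m_o = 0.
print(damped_gauss_newton([0.0, 0.0], 0.01))
```

A larger κ pulls the update further towards zero length, which is the trade-off expressed by equation (2.47).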

Gauss-Newton solution and generalised inverse<br />

It can easily be demonstrated that the matrix $\mathbf{H}_p^{-1} \mathbf{F}_p^t$ is in fact the generalised inverse obtained from the SVD, since we have

\[
\mathbf{H}_p^{-1} \mathbf{F}_p^t = \mathbf{V}_p \boldsymbol{\Lambda}_p^{-1} \mathbf{U}_p^t = \mathbf{G}^\dagger.
\]

The Gauss-Newton solution is therefore equivalent to the solution given by the generalised inverse when there is no a priori knowledge of the model ($\mathbf{m}_o = 0$).
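The intermediate steps of this identity can be written out as follows. This is a sketch that assumes the restricted SVD of section 2.2.1 takes the form $\mathbf{F}_p = \mathbf{U}_p \boldsymbol{\Lambda}_p \mathbf{V}_p^t$ with orthonormal columns, $\mathbf{V}_p^t \mathbf{V}_p = \mathbf{I}$; that form is not restated in this section.

```latex
\begin{align*}
\mathbf{H}_p^{-1}\mathbf{F}_p^t
  &= \left(\mathbf{V}_p \boldsymbol{\Lambda}_p^{-2} \mathbf{V}_p^t\right)
     \left(\mathbf{U}_p \boldsymbol{\Lambda}_p \mathbf{V}_p^t\right)^t \\
  &= \mathbf{V}_p \boldsymbol{\Lambda}_p^{-2}
     \left(\mathbf{V}_p^t \mathbf{V}_p\right)
     \boldsymbol{\Lambda}_p \mathbf{U}_p^t \\
  &= \mathbf{V}_p \boldsymbol{\Lambda}_p^{-1} \mathbf{U}_p^t
   = \mathbf{G}^\dagger .
\end{align*}
```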

2.2.2.4 Preconditioned gradient methods<br />

We have seen that the components of the gradient along the model-space eigenvectors are the components of the true model perturbation stretched by the eigenvalues of $\mathbf{H}$. In order to mitigate this effect without recourse to the inverse Hessian, it is possible to precondition the gradient direction by:

1. Adding a vector to the gradient

2. Weighting the data residuals

3. Multiplying the gradient by a matrix

The preconditioned gradient direction can always be associated with an equivalent misfit function $\bar{E}$, for which the preconditioned gradient is the steepest ascent direction. Therefore, modifying the misfit function is equivalent to modifying the gradient direction.

Adding a vector to the gradient

If the misfit function is defined as

\[
\bar{E}(\mathbf{m}) = \tfrac{1}{2}\left(\mathbf{G}\mathbf{m} - \mathbf{d}_{obs}\right)^t \left(\mathbf{G}\mathbf{m} - \mathbf{d}_{obs}\right) + \mathbf{m}^t \mathbf{p} \tag{2.48}
\]

where $\mathbf{p}$ is an $(n_m \times 1)$ vector, the steepest ascent direction becomes

\[
\nabla_m \bar{E} = \nabla_m E + \mathbf{p}. \tag{2.49}
\]

Adding a vector $\mathbf{p}$ to the gradient vector is equivalent to minimising the normal misfit function with an additional term $\mathbf{m}^t \mathbf{p}$.

Data residuals weighting versus gradient multiplication<br />

The data residuals may be weighted by a matrix $\mathbf{W}$ of dimension $(n_d \times n_d)$ such that the transformed data residuals $\Delta\bar{\mathbf{d}}$ are

\[
\Delta\bar{\mathbf{d}} = \mathbf{W} \Delta\mathbf{d}. \tag{2.50}
\]

The misfit function minimising the $L_2$ norm of the data misfit $\Delta\bar{\mathbf{d}}$ is

\[
\bar{E} = \tfrac{1}{2}\, \Delta\bar{\mathbf{d}}^t \Delta\bar{\mathbf{d}}
= \tfrac{1}{2}\, \Delta\mathbf{d}^t \mathbf{W}^t \mathbf{W} \Delta\mathbf{d}
= \tfrac{1}{2} \left(\mathbf{G}\mathbf{m} - \mathbf{d}_{obs}\right)^t \mathbf{W}^t \mathbf{W} \left(\mathbf{G}\mathbf{m} - \mathbf{d}_{obs}\right). \tag{2.51}
\]

The gradient of $\bar{E}$ is

\[
\nabla_m \bar{E} = \mathbf{G}^t \mathbf{W}^t \mathbf{W} \left(\mathbf{G}\mathbf{m} - \mathbf{d}_{obs}\right). \tag{2.52}
\]

The gradient of $\bar{E}$ is the gradient $\nabla_m E$ preconditioned with a matrix $\mathbf{P}$ of dimension $(n_m \times n_m)$, such that

\[
\nabla_m \bar{E} = \mathbf{P} \nabla_m E
\]

where, provided $\mathbf{G}^t$ is invertible, $\mathbf{P}$ is

\[
\mathbf{P} = \mathbf{G}^t \mathbf{W}^t \mathbf{W} \left(\mathbf{G}^t\right)^{-1}.
\]

Therefore, weighting the data residuals is equivalent to preconditioning the gradient vector by multiplication with a matrix. In other words, multiplying the gradient by a matrix modifies the data residuals involved in the inversion process.

It is often considered that the ideal preconditioning operator $\mathbf{P}$ is simply the inverse Hessian $\mathbf{H}^{-1}$, so that the steepest descent direction of $\bar{E}$ points in the same direction as $\Delta\mathbf{m}^\dagger$.
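A cheap approximation to this ideal operator is a diagonal matrix. The sketch below preconditions the gradient of the 2 × 2 system of section 2.2.2.6 with the diagonal of the inverse Hessian. The step length along the preconditioned direction, obtained by minimising the quadratic misfit along that direction, is an assumption of this example rather than a formula taken from the text.

```python
# Forward operator and observed data of the 2x2 example (section 2.2.2.6).
G = [[1.0, 1.0], [-2.0, 3.0]]
d_obs = [2.0, 1.0]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def preconditioned_descent(m0, n_iter):
    """Gradient method preconditioned by P = diag(H^{-1})."""
    H = [[sum(G[k][i] * G[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    P = [H[1][1] / det, H[0][0] / det]   # diagonal terms of the 2x2 inverse
    m = list(m0)
    history = []
    for _ in range(n_iter):
        dd = [sum(G[i][j] * m[j] for j in range(2)) - d_obs[i] for i in range(2)]
        history.append(0.5 * dot(dd, dd))
        grad = [sum(G[i][j] * dd[i] for i in range(2)) for j in range(2)]
        p = [P[j] * grad[j] for j in range(2)]          # preconditioned gradient
        Fp = [sum(G[i][j] * p[j] for j in range(2)) for i in range(2)]
        denom = dot(Fp, Fp)
        if denom == 0.0:
            break
        gamma = dot(p, grad) / denom     # exact minimiser of E along -p
        m = [mi - gamma * pi for mi, pi in zip(m, p)]
    return m, history

model, misfits = preconditioned_descent([3.0, 5.0], 80)
print(model)
print(misfits[:5])
```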


2.2.2.5 The conjugate gradient method

While the steepest descent method is assured to decrease the misfit function at each iteration, the conjugate gradient method improves the convergence rate at little additional cost. The conjugate gradient method optimises the model space search by avoiding updating the model in a direction that has already been searched. The current update direction $\bar{\nabla}_m E$ is then conjugate to the previous update direction. For a linear problem, the inversion process is thus assured to converge in at most $n_m$ iterations, where $n_m$ is the number of model parameters. The conjugate gradient is defined as (Polak and Ribière, 1969)

\[
\bar{\nabla}_m E^{(k)} = \nabla_m E^{(k)} + \frac{\left(\nabla_m E^{(k)} - \nabla_m E^{(k-1)}\right)^t \nabla_m E^{(k)}}{\nabla_m E^{(k-1)t}\, \nabla_m E^{(k-1)}}\; \bar{\nabla}_m E^{(k-1)}. \tag{2.53}
\]

From equation (2.53), we can see that the conjugate gradient method belongs to the family of gradient preconditioning techniques that add a vector to the steepest ascent direction, as described in the previous section.
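Equation (2.53) can be sketched on the 2 × 2 system of section 2.2.2.6, for which two iterations are sufficient. The step length along the conjugate direction is obtained here by minimising the quadratic misfit along that direction, which is an assumption of this sketch; the helper names are also illustrative.

```python
# Forward operator and observed data of the 2x2 example (section 2.2.2.6).
G = [[1.0, 1.0], [-2.0, 3.0]]
d_obs = [2.0, 1.0]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gradient(m):
    """grad E = G^t (G m - d_obs)."""
    dd = [sum(G[i][j] * m[j] for j in range(2)) - d_obs[i] for i in range(2)]
    return [sum(G[i][j] * dd[i] for i in range(2)) for j in range(2)]

def conjugate_gradient(m0, n_iter):
    """Polak-Ribiere conjugate gradient with an exact linear step length."""
    m = list(m0)
    g_prev, c_prev = None, None
    for _ in range(n_iter):
        g = gradient(m)
        if dot(g, g) == 0.0:             # already at the minimum
            break
        if c_prev is None:
            c = list(g)                  # first direction: plain gradient
        else:
            diff = [a - b for a, b in zip(g, g_prev)]
            beta = dot(diff, g) / dot(g_prev, g_prev)      # equation (2.53)
            c = [gi + beta * ci for gi, ci in zip(g, c_prev)]
        Fc = [sum(G[i][j] * c[j] for j in range(2)) for i in range(2)]
        gamma = dot(c, g) / dot(Fc, Fc)  # exact minimiser of E along -c
        m = [mi - gamma * ci for mi, ci in zip(m, c)]
        g_prev, c_prev = g, c
    return m

# Two iterations for two model parameters.
print(conjugate_gradient([3.0, 5.0], 2))
```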

2.2.2.6 Numerical examples

In order to illustrate the resolution of a linear system using local techniques, I chose as an example the linear system of 2 equations and 2 unknowns defined as

\[
\begin{cases}
m_1 + m_2 = 2 \\
-2 m_1 + 3 m_2 = 1.
\end{cases} \tag{2.54}
\]

The solution of (2.54) is equivalent to finding the intersection of the 2 straight lines shown in Figure 2.6a. The solution is evident, as the straight lines intersect at $\mathbf{m}_{true} = (1, 1)$. The forward problem can be expressed in algebraic form using equation (2.4) with

\[
\mathbf{G} = \begin{pmatrix} 1 & 1 \\ -2 & 3 \end{pmatrix}, \quad
\mathbf{m} = \begin{pmatrix} m_1 \\ m_2 \end{pmatrix}, \quad
\mathbf{d}_{obs} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
\]

The corresponding misfit function, expressed with respect to the components of $\mathbf{m}$, is

\[
E = \tfrac{1}{2}\left(5 m_1^2 + 10 m_2^2 - 10 m_1 m_2 - 10 m_2 + 5\right).
\]

An SVD of $\mathbf{G}$ was carried out using the subroutine svdcmp (Press et al., 1992), yielding the eigenvectors and singular values

\[
\mathbf{U} \simeq \begin{pmatrix} 0.996 & 0.09 \\ -0.09 & 0.996 \end{pmatrix}, \quad
\boldsymbol{\Lambda} \simeq \begin{pmatrix} 1.382 & 0 \\ 0 & 3.618 \end{pmatrix}, \quad
\mathbf{V} \simeq \begin{pmatrix} 0.851 & -0.526 \\ 0.526 & 0.851 \end{pmatrix}.
\]
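These values can be cross-checked without an SVD routine, since the singular values of G are the square roots of the eigenvalues of H = GᵗG. The sketch below uses the closed-form eigenvalues of a symmetric 2 × 2 matrix; it is an independent check written for this example, not the svdcmp routine cited above.

```python
import math

# Forward operator of the 2x2 example.
G = [[1.0, 1.0], [-2.0, 3.0]]

# Hessian H = G^t G.
H = [[sum(G[k][i] * G[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
trace = H[0][0] + H[1][1]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
disc = math.sqrt(trace * trace - 4.0 * det)
eig_small = 0.5 * (trace - disc)
eig_large = 0.5 * (trace + disc)

# Singular values of G are the square roots of the eigenvalues of H.
s_small = math.sqrt(eig_small)
s_large = math.sqrt(eig_large)

# Model eigenvector for the smaller eigenvalue: solves (H - eig I) v = 0.
v = [-H[0][1], H[0][0] - eig_small]
norm = math.hypot(v[0], v[1])
v_small = [v[0] / norm, v[1] / norm]

print(round(s_small, 3), round(s_large, 3))
print([round(x, 3) for x in v_small])
```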


Figure 2.6: Example of a 2-D linear inverse problem where a) the solution is the intersection point $\mathbf{m} = (1, 1)$ between two straight lines, and b) the corresponding misfit function is shown with the two eigenvectors $\mathbf{v}_1$, $\mathbf{v}_2$ centred on the minimum.

Since there are no zero singular values, we have $p = n_m = n_d$. As there is neither a model nor a data null space, the solution of the inverse problem yields the true solution: $\mathbf{m}^\dagger = \mathbf{m}_{true}$ (see Table 2.2). The misfit function is shown in Figure 2.6b, with the model eigenvectors centred on its minimum.

Figure 2.7 shows the minimisation of the misfit function using (a) Gauss-Newton, (b) gradient, (c) preconditioned gradient and (d) conjugate gradient methods. As expected, the Gauss-Newton method locates the minimum in one iteration whereas the gradient method converges to the minimum in several iterations. The convergence rate is improved by preconditioning the gradient direction with the diagonal term of the inverse Hessian, and the conjugate gradient method converges in 2 iterations. The evolution of the misfit function with iterations is shown in Figure 2.8 for each method.


Figure 2.7: Local methods applied to a 2-D linear inverse problem for a) Gauss-Newton, b) gradient, c) preconditioned gradient and d) conjugate gradient methods. The gradient in c) was preconditioned with the diagonal term of the inverse Hessian.


Figure 2.8: Evolution of the misfit function with iterations for the Gauss-Newton (dotted black), the gradient (solid black), the preconditioned gradient (solid grey) and the conjugate gradient (dotted grey) methods. Vertical axis: misfit function; horizontal axis: iterations.

2.3 Inversion of non-linear problems

In the non-linear case, the forward problem is represented by equation (2.1), where $g(\mathbf{m})$ is a non-linear function of the model parameters $\mathbf{m}$. The misfit function is thus

\[
E = \tfrac{1}{2}\left(g(\mathbf{m}) - \mathbf{d}_{obs}\right)^t \left(g(\mathbf{m}) - \mathbf{d}_{obs}\right) \tag{2.55}
\]

and the Fréchet derivative matrix is

\[
\mathbf{F}(\mathbf{m}_o) = \left.\frac{\partial g(\mathbf{m})}{\partial \mathbf{m}}\right|_{\mathbf{m}_o} \tag{2.56}
\]

which cannot be directly related to the forward modelling operator and thus requires extra computational steps.

As for linear problems, non-linear local methods make use of the local gradient of the misfit function. The non-linear gradient is still given by equation (2.26), but the Fréchet derivative matrix $\mathbf{F}$ is model dependent. A dramatic effect of the non-linearity is that, in order to obtain the gradient vector, $\mathbf{F}$ must be re-defined for any new reference model since

\[
\nabla_m E(\mathbf{m}_o) = \mathbf{F}^t_{\mathbf{m}_o} \Delta\mathbf{d}(\mathbf{m}_o). \tag{2.57}
\]

We will see that the reference model becomes of fundamental importance for strongly non-linear problems.


2.3.1 Newton method

The Newton method relies on the assumption that the misfit function is approximately quadratic in the neighbourhood of $\mathbf{m}_o$ and may thus be expressed as a second-order Taylor series

\[
\begin{aligned}
E(\mathbf{m}_o + \Delta\mathbf{m}) &= E(\mathbf{m}_o) + \Delta\mathbf{m}^t \nabla_m E(\mathbf{m}_o) + \tfrac{1}{2}\,\Delta\mathbf{m}^t \mathbf{H} \Delta\mathbf{m} + O\!\left(\lVert \Delta\mathbf{m} \rVert^3\right) \\
&\simeq E(\mathbf{m}_o) + \Delta\mathbf{m}^t \nabla_m E(\mathbf{m}_o) + \tfrac{1}{2}\,\Delta\mathbf{m}^t \mathbf{H} \Delta\mathbf{m}.
\end{aligned} \tag{2.58}
\]

We can express the Hessian $\mathbf{H}$ as the sum of two matrices

\[
\mathbf{H} = \mathbf{H}_s + \mathbf{H}_a \tag{2.59}
\]

where the non-linear Hessian $\mathbf{H}_s$ and the approximate Hessian $\mathbf{H}_a$ are defined as

\[
\mathbf{H}_s = \frac{\partial \mathbf{F}^t}{\partial \mathbf{m}} \underbrace{\left(\Delta\mathbf{d} \cdots \Delta\mathbf{d}\right)}_{n_d}, \quad \mathbf{H}_a = \mathbf{F}^t \mathbf{F}. \tag{2.60}
\]

For the non-linear problem, the matrix $\mathbf{H}_s$ is non-zero since the Fréchet matrix $\mathbf{F}$ depends on $\mathbf{m}$. The demonstration is identical to that of the Gauss-Newton solution for linear problems. However, if the misfit function is not exactly quadratic, the Newton method will not converge in one iteration. An iterative process may be implemented to account for the remaining non-linearities, and the model update is given by

\[
\mathbf{m}^{(k+1)} = \mathbf{m}^{(k)} - \mathbf{H}_{(k)}^{-1} \nabla_m E^{(k)}.
\]

Remark: The misfit function is locally quadratic if and only if the forward problem is locally linear. Nevertheless, the Newton method rests on an inconsistency: the misfit function is assumed to be quadratic, yet its curvature is determined by accounting for the non-linearity of the problem.

2.3.2 Linearised inversion<br />

The linearised methods rely on the assumption that the forward problem is linear in the neighbourhood of $\mathbf{m}_o$, so that the data residuals $\Delta\mathbf{d}$ have a linear relation with the model perturbation $\Delta\mathbf{m}$ given by

\[
\Delta\mathbf{d} = \mathbf{F} \Delta\mathbf{m}. \tag{2.61}
\]

The data in the perturbed model can thus be written as

\[
\mathbf{d}(\mathbf{m}_o + \Delta\mathbf{m}) = \mathbf{d}(\mathbf{m}_o) + \mathbf{F} \Delta\mathbf{m}. \tag{2.62}
\]


Gauss-Newton method

- Demonstration 1

While the Newton method takes into account the full Hessian, the Gauss-Newton method approximates the Hessian by

\[
\mathbf{H} \simeq \mathbf{H}_a
\]

and thus neglects the non-linear Hessian term $\mathbf{H}_s$. The model update then becomes

\[
\Delta\mathbf{m} = -\mathbf{H}_a^{-1} \nabla_m E.
\]

- Demonstration 2

Equation (2.61) may be considered as a linear forward problem in which the data are $\Delta\mathbf{d}$ and the model is $\Delta\mathbf{m}$ (treated as the model itself, not as a perturbation). A solution can hence be found based on the resolution of linear systems shown in section 2.2.2.1. A new optimisation process is set up, seeking to minimise the difference between the observed data $\Delta\mathbf{d}_{obs}$ and the calculated data $\Delta\mathbf{d}_{calc}$. The new misfit function is defined as

\[
\begin{aligned}
E(\Delta\mathbf{m}) &= \tfrac{1}{2}\left(\Delta\mathbf{d}_{calc} - \Delta\mathbf{d}_{obs}\right)^t \left(\Delta\mathbf{d}_{calc} - \Delta\mathbf{d}_{obs}\right) \\
&= \tfrac{1}{2}\left(\mathbf{F}\Delta\mathbf{m} - \Delta\mathbf{d}_{obs}\right)^t \left(\mathbf{F}\Delta\mathbf{m} - \Delta\mathbf{d}_{obs}\right)
\end{aligned} \tag{2.63}
\]

where the observed data and calculated data are

\[
\Delta\mathbf{d}_{obs} = \mathbf{d}_{obs} - \mathbf{d}(\mathbf{m}_o), \quad \Delta\mathbf{d}_{calc} = \mathbf{F}\Delta\mathbf{m}. \tag{2.64}
\]

A reference model $\mathbf{m}_o$ must be chosen so that the linear assumption holds reasonably well, and the data in the reference model $\mathbf{d}(\mathbf{m}_o)$ are computed using the non-linear forward problem. The misfit function $E$ is minimum at $\Delta\mathbf{m}$ if the gradient of $E$ at this point is zero. The gradient with respect to the model perturbation is

\[
\frac{\partial E}{\partial \Delta\mathbf{m}} = \mathbf{F}^t \left(\mathbf{F}\Delta\mathbf{m} - \Delta\mathbf{d}_{obs}\right).
\]

The starting model is chosen to be the reference model so that the initial perturbation is zero: $\Delta\mathbf{m}_o = 0$. The Gauss-Newton solution yields the model update

\[
\Delta\mathbf{m}^\dagger = \mathbf{H}_a^{-1} \mathbf{F}^t \Delta\mathbf{d}_{obs}. \tag{2.65}
\]
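The update of equation (2.65), repeated with a new reference model at each pass, can be sketched as follows. The forward operator and data are those of the numerical example of section 2.3.5; the Fréchet matrix is derived analytically from that operator for this sketch, and the function names and iteration count are assumptions.

```python
import math

# Non-linear forward problem of section 2.3.5.
d_obs = [4.0, -1.0]

def g(m):
    """Forward operator g(m)."""
    return [-2.0 * math.cos(0.5 * math.pi * m[0]) + m[1], -m[0] + m[1]]

def frechet(m):
    """Frechet derivative matrix dg/dm, derived from g for this sketch."""
    return [[math.pi * math.sin(0.5 * math.pi * m[0]), 1.0], [-1.0, 1.0]]

def gauss_newton(m0, n_iter):
    """Repeated linearised Gauss-Newton updates, equations (2.64) and (2.65)."""
    m = list(m0)
    for _ in range(n_iter):
        F = frechet(m)                                    # re-defined at each reference model
        dd_obs = [o - c for o, c in zip(d_obs, g(m))]     # equation (2.64)
        rhs = [F[0][j] * dd_obs[0] + F[1][j] * dd_obs[1] for j in range(2)]
        Ha = [[F[0][i] * F[0][j] + F[1][i] * F[1][j] for j in range(2)]
              for i in range(2)]                          # approximate Hessian F^t F
        det = Ha[0][0] * Ha[1][1] - Ha[0][1] * Ha[1][0]
        if det == 0.0:
            break
        dm = [(Ha[1][1] * rhs[0] - Ha[0][1] * rhs[1]) / det,
              (-Ha[1][0] * rhs[0] + Ha[0][0] * rhs[1]) / det]   # equation (2.65)
        m = [m[0] + dm[0], m[1] + dm[1]]
    return m

# Starting model (4.5, 7) is the one used for Figure 2.11.
print(gauss_newton([4.5, 7.0], 10))
```

No damping or step-length control is applied here, so the sketch is only expected to behave well for starting models where the linear assumption holds reasonably.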


For strongly non-linear problems, the quadratic approximation is inaccurate and it may then be preferable to use a damped Hessian such as

\[
\bar{\mathbf{H}}_a = \mathbf{H}_a + \kappa \mathbf{I}. \tag{2.66}
\]

Here, the damped Hessian is not only motivated by the existence of a model null space, as in the linear case (see section 2.2.2.3). In order to counter the effect of non-linearity and the inadequacy of the quadratic approximation, the damped Gauss-Newton model update is influenced by the steepest descent direction, which ensures the decrease of the misfit function. For large $\kappa$, the damped Gauss-Newton method is closer to simple gradient methods and a step length should be defined.

Linearised gradient method<br />

An iterative linearised gradient method may be used as an alternative to the Gauss-Newton method in order to avoid the determination of the inverse Hessian. The model perturbation $\Delta\mathbf{m}^\dagger$ is determined after convergence of a linear, iterative gradient scheme such as

\[
\begin{aligned}
\Delta\mathbf{m}^{(k+1)} &= \Delta\mathbf{m}^{(k)} - \gamma^{(k)} \nabla_{\Delta m} E \\
&= \Delta\mathbf{m}^{(k)} - \gamma^{(k)} \mathbf{F}^t \left(\mathbf{F}\Delta\mathbf{m}^{(k)} - \Delta\mathbf{d}_{obs}\right).
\end{aligned}
\]

It is important to keep in mind that for the linearised gradient method, the Fréchet derivatives remain constant, as they are not re-defined at each linear iteration. Therefore, after the first iteration, the gradient does not exactly correspond to the steepest ascent direction.

Compensating for non-linearities

If the misfit function is not exactly quadratic, a non-linear iterative scheme may be implemented for the Gauss-Newton method as well as for linearised gradient methods, as shown in Figure 2.9. These non-linear iterations compensate for the linearisation. The non-linear forward modelling and the Fréchet matrix are thus re-computed in the updated reference model.

2.3.3 Non-linear gradient method

In non-linear gradient methods, the model is updated at each iteration in the direction of the steepest descent given by

\[
\mathbf{m}^{(k+1)} = \mathbf{m}^{(k)} - \gamma^{(k)} \nabla_m E^{(k)}
\]


[Figure 2.9, flowchart boxes: Starting Model; Forward Modeling, $\mathbf{d}_{calc} = g(\mathbf{m}^{(k)})$; Linear Inversion (Gauss-Newton, or Linear Gradient with Linear Iteration); Linear Model Update; Convergence; loop labelled "Non-linear Iteration k+1".]

Figure 2.9: Scheme of linearised inversion. For each reference model, a linear update is found using either the Gauss-Newton or the iterative gradient method. To compensate for non-linearities, the reference model can be updated and a new linear inversion carried out until convergence is met.

where $\gamma^{(k)}$ is the step length. Since the Fréchet derivative is re-defined at each iteration, no assumption of linearity is made and the gradient is the true steepest ascent direction. Finding the step length $\gamma$ such that the misfit function $E(\mathbf{m}^{(k+1)})$ is a minimum may be achieved by performing a line search along the gradient direction using a parabolic fit (Press et al., 1992). This method requires testing several values of $E$ along the gradient direction, and an additional forward modelling is thus needed for each test. In order to save the extra cost of several forward modellings, the linear estimate of the step length $\gamma$ can be determined using equation (2.38), which often provides an acceptable result. It is important to notice that at the first iteration, the linearised and non-linear gradients are identical.

As for the linear case, a preconditioning operator may be applied to the gradient. The inverse of the (damped) approximate Hessian is often considered to be the optimal preconditioning operator. In order to speed up the convergence rate, the conjugate gradient method may be used.
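A minimal sketch of the non-linear gradient method with the linear step-length estimate of equation (2.38) is given below. It uses the forward operator of section 2.3.5 with an analytically derived Fréchet matrix; the function names and the number of iterations are assumptions, and no line search is performed, so a decrease of the misfit at every iteration is not guaranteed.

```python
import math

# Non-linear forward problem of section 2.3.5.
d_obs = [4.0, -1.0]

def g(m):
    return [-2.0 * math.cos(0.5 * math.pi * m[0]) + m[1], -m[0] + m[1]]

def frechet(m):
    """Frechet derivative matrix dg/dm, derived from g for this sketch."""
    return [[math.pi * math.sin(0.5 * math.pi * m[0]), 1.0], [-1.0, 1.0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nonlinear_gradient(m0, n_iter):
    """Steepest descent with F re-computed at every iteration.

    Returns the final model and the misfit recorded before each update.
    """
    m = list(m0)
    history = []
    for _ in range(n_iter):
        dd = [c - o for c, o in zip(g(m), d_obs)]
        history.append(0.5 * dot(dd, dd))
        F = frechet(m)
        grad = [F[0][j] * dd[0] + F[1][j] * dd[1] for j in range(2)]
        Fg = [dot(F[i], grad) for i in range(2)]
        denom = dot(Fg, Fg)
        if denom == 0.0:
            break
        gamma = dot(Fg, dd) / denom      # linear estimate, equation (2.38)
        m = [mi - gamma * gi for mi, gi in zip(m, grad)]
    return m, history

model, misfits = nonlinear_gradient([4.5, 7.0], 20)
print(model)
print(misfits[:3])
```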

2.3.4 Importance of the starting model

Depending on the degree of non-linearity of the forward problem, the misfit function may present some local minima. A local minimum is a location in the misfit function where the gradient vanishes but which does not correspond to the lowest possible minimum (the global minimum). The success of local methods depends upon the accuracy of the starting model. If the starting model is located close to a local minimum, local methods will converge into this minimum and will thus yield a wrong solution. Therefore, the starting model must be located in the neighbourhood of the global minimum to ensure successful convergence.

For Newton and linearised methods, the misfit function must be reasonably quadratic and the model error must have a quasi-linear relation with the data residuals. The remaining non-linearities can be compensated for by carrying out non-linear iterations.

The non-linear gradient methods do not rely on the assumption of linearity and tend to be more robust than methods relying on the quadratic approximation. If the starting model is located within the basin of the global minimum, gradient methods will converge accurately.


Figure 2.10: A 2-D non-linear inverse problem. a) The solution is the intersection point between a cosine function and a straight line at $\mathbf{m} = (5, 4)$. b) The misfit function shows the global minimum surrounded by two local minima.

2.3.5 Numerical examples

A two-dimensional non-linear forward operator $g(\mathbf{m})$ and a set of observed data are defined as

\[
g(\mathbf{m}) = \begin{pmatrix} -2\cos\left(\frac{\pi}{2} m_1\right) + m_2 \\ -m_1 + m_2 \end{pmatrix}, \quad
\mathbf{d}_{obs} = \begin{pmatrix} 4 \\ -1 \end{pmatrix}.
\]

The solution of the non-linear inverse problem is the intersection point between the cosine and the straight line shown in Figure 2.10a. This inverse problem has a unique solution at $\mathbf{m}_{true} = (5, 4)$, but the direct inverse operator $g^{-1}$ cannot easily be obtained. The misfit function in Figure 2.10b shows, as expected, the global minimum at $\mathbf{m} = (5, 4)$, surrounded by two local minima.

Since the analytical solution of the forward problem is known, the Fréchet derivative matrix can easily be found and is given by

$$
F = \begin{pmatrix} \pi \sin\left(\frac{\pi}{2} m_1\right) & 1 \\ -1 & 1 \end{pmatrix},
$$

and the non-linear and approximate Hessian matrices are

$$
H_s = \begin{pmatrix} \frac{\pi^2}{2} \cos\left(\frac{\pi}{2} m_1\right) \Delta d_1 & 0 \\ 0 & 0 \end{pmatrix},
\qquad
H_a = \begin{pmatrix} \pi^2 \sin^2\left(\frac{\pi}{2} m_1\right) + 1 & \pi \sin\left(\frac{\pi}{2} m_1\right) - 1 \\ \pi \sin\left(\frac{\pi}{2} m_1\right) - 1 & 2 \end{pmatrix}.
$$

In order to show the importance of the starting model as well as the performance of the different local methods, I applied the Newton, Gauss-Newton, non-linear gradient and conjugate gradient methods to this non-linear inverse problem. Figure 2.11 shows the results of the inversion for a starting model located at $m_o = (4.5, 7)$. For this starting model, the quadratic approximation is accurate and the Newton and Gauss-Newton methods converge to the global minimum. The non-linear gradient methods also converge to the global minimum. Figure 2.12 shows the same experiment but with a starting model located at $m_o = (4, 6)$. The Newton method fails to find the global minimum as the quadratic approximation is not adequate (Figure 2.12a). The undamped Gauss-Newton method would therefore also fail; Figure 2.12b shows that it is preferable to use a Gauss-Newton method with a damped Hessian according to equation (2.66) with $\kappa = 0.5$. Although the quadratic approximation is not accurate, both the non-linear gradient and conjugate gradient methods converge to the global minimum.

Figure 2.13 shows the results for a starting model at $m_o = (7, 9)$, located in the neighbourhood of a local minimum. All methods converge to the local minimum and the inversions fail to provide the correct answer.

These results demonstrate the importance of the starting model when applying local methods to non-linear inverse problems. Non-linear gradient methods are more robust as they do not rely on the quadratic approximation, but they require the recomputation of the Fréchet derivatives at each iteration.
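The toy problem of this section is small enough to reproduce in a few lines. The following is a minimal sketch, not the code used for Figures 2.11 to 2.13: the function names, the fixed iteration count and the additive damping `kappa` (which may differ from the damping of equation (2.66)) are assumptions made for illustration. It applies the Gauss-Newton update $m \leftarrow m - (F^T F + \kappa I)^{-1} F^T (g(m) - d_{obs})$, with the Fréchet matrix obtained by differentiating $g(m)$.

```python
import math

D_OBS = (4.0, -1.0)  # observed data of the toy problem


def forward(m):
    """Forward operator g(m) of the 2-D toy problem."""
    m1, m2 = m
    return (-2.0 * math.cos(0.5 * math.pi * m1) + m2, -m1 + m2)


def frechet(m):
    """Frechet derivative matrix F = dg/dm as a 2x2 nested tuple."""
    a = math.pi * math.sin(0.5 * math.pi * m[0])
    return ((a, 1.0), (-1.0, 1.0))


def misfit(m):
    """Least-squares misfit E = 1/2 |g(m) - d_obs|^2."""
    return 0.5 * sum((g - d) ** 2 for g, d in zip(forward(m), D_OBS))


def gauss_newton(m, n_iter=20, kappa=0.0):
    """Iterate m <- m - (F^T F + kappa I)^-1 F^T (g(m) - d_obs)."""
    m = list(m)
    for _ in range(n_iter):
        (a, b), (c, d) = frechet(m)
        r = [g - obs for g, obs in zip(forward(m), D_OBS)]
        # Gradient F^T r and (optionally damped) approximate Hessian F^T F.
        g1 = a * r[0] + c * r[1]
        g2 = b * r[0] + d * r[1]
        h11 = a * a + c * c + kappa
        h12 = a * b + c * d
        h22 = b * b + d * d + kappa
        det = h11 * h22 - h12 * h12
        # Solve the 2x2 system H dm = gradient with the explicit inverse.
        m[0] -= (h22 * g1 - h12 * g2) / det
        m[1] -= (h11 * g2 - h12 * g1) / det
    return tuple(m)


m_final = gauss_newton((4.5, 7.0))
print(m_final, misfit(m_final))
```

With the starting model of Figure 2.11 and no damping, this sketch converges to $m = (5, 4)$. Other starting models can be tried by changing the argument of `gauss_newton`.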


Figure 2.11: Inversion with a starting model at $m_o = (4.5, 7)$ showing a) the Newton, b) the Gauss-Newton, c) the gradient and d) the conjugate gradient methods. For this starting model, the quadratic approximation is accurate and all methods converge rapidly to the global minimum. [Four panels; both axes run from 0 to 10.]


Figure 2.12: Inversion with a starting model at $m_o = (4, 6)$ showing a) the Newton, b) the damped Gauss-Newton, c) the gradient and d) the conjugate gradient methods. For this starting model, the quadratic approximation is not accurate and the Newton method fails to locate the global minimum. The damped Gauss-Newton and gradient methods are successful. [Four panels; both axes run from 0 to 10.]



Figure 2.13: Inversion with a starting model at $m_o = (7, 9)$ showing a) the Newton, b) the damped Gauss-Newton, c) the gradient and d) the conjugate gradient methods. This starting model is not adequate as all local methods converge to a local minimum. [Four panels; both axes run from 0 to 10.]


2.4 Frequency domain waveform inversion

The frequency domain waveform inversion method requires, as does any inverse problem, the definition of the forward problem.

2.4.1 The forward problem

In order to simulate the propagation of seismic waves, a forward problem must be chosen. The forward problem is always a simplified representation that takes into account only a limited aspect of the phenomena involved. The most complete representation (so far) of the propagation is the 3D, visco-elastic and anisotropic case. Accounting accurately for all these effects is, however, computationally very expensive, and the 2D, acoustic, isotropic approximation is often used.

We thus define the 2D, constant density, acoustic wave equation in the frequency domain as

$$
\left( \nabla^2 + \frac{\omega^2}{c(x)^2} \right) \Psi(x, s, \omega) = -S(s, \omega), \tag{2.67}
$$

where $x$ is the position vector, $c(x)$ is the velocity, $\Psi(x, s, \omega)$ is the complex wavefield and $\omega$ is the circular frequency. A point source $S(s, \omega)$ at the location $s$ is given by

$$
S(s, \omega) = \delta(x - s)\, A(s, \omega)\, e^{i\varphi(s, \omega)}, \tag{2.68}
$$

where $A$ is the amplitude term and $\varphi$ is the phase of the source wavelet.

The main difficulty with the wave equation (2.67) is that it is not in the form of a forward problem (equation (2.1)): the data are not expressed as a combination of model parameters. In order to establish the forward problem, we need to solve an independent inverse problem. In this inverse problem, the wavefield $\Psi(x, s, \omega)$ plays the role of the model parameter and the source term $S(s, \omega)$ the role of the data parameter. Defining the operator

$$
g(c(x), \omega) = \nabla^2 + \frac{\omega^2}{c(x)^2}, \tag{2.69}
$$

equation (2.67) can be expressed as

$$
g(c(x), \omega)\, \Psi(x, s, \omega) = S(s, \omega). \tag{2.70}
$$


The solution of the forward problem requires finding the inverse of the operator $g$, such that

$$
\Psi(x, s, \omega) = g^{-1}(c(x), \omega)\, S(s, \omega) = G(c(x), x, s, \omega)\, S(s, \omega), \tag{2.71}
$$

where $G(x, s)$ is the point source Green's function for a source located at $s$. The Green's function is the wavefield at the location $x$ for a point source of zero phase and unit amplitude. For a given angular frequency $\omega$, it depends non-linearly on the velocities of the model. For homogeneous media (constant velocity $c_o$), $G(x, s)$ can be expressed analytically and the 2-D and 3-D free space Green's functions are (Morse and Feshbach, 1953)

$$
G_{2D}(x, s) = \frac{i}{4} H_o^{(1)}\left( \omega \|x - s\| / c_o \right),
\qquad
G_{3D}(x, s) = \frac{\exp\left( i\omega \|x - s\| / c_o \right)}{4\pi \|x - s\|},
$$

where $H_o^{(1)}$ is the zero order Hankel function of the first kind.
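As a quick numerical check of the 3-D expression, the sketch below evaluates $G_{3D}$ with the Python standard library and verifies by central differences that it satisfies the homogeneous wave equation away from the source. The function names, the test values (5 Hz, 2000 m/s, 300 m offset) and the step `h` are illustrative assumptions. The 2-D expression would additionally need a Hankel function routine, which the standard library does not provide.

```python
import cmath
import math


def green_3d(r, omega, c0):
    """3-D free-space Green's function exp(i*omega*r/c0) / (4*pi*r)."""
    return cmath.exp(1j * omega * r / c0) / (4.0 * math.pi * r)


def helmholtz_residual(r, omega, c0, h=0.1):
    """Apply the radial operator d2/dr2 + (2/r) d/dr + (omega/c0)^2 to G.

    For a spherically symmetric field the Laplacian reduces to the first two
    terms; both derivatives are approximated by central differences.
    """
    g_minus, g_mid, g_plus = (green_3d(r + s * h, omega, c0) for s in (-1, 0, 1))
    second = (g_plus - 2.0 * g_mid + g_minus) / h**2
    first = (g_plus - g_minus) / (2.0 * h)
    return second + 2.0 * first / r + (omega / c0) ** 2 * g_mid


omega, c0, r = 2.0 * math.pi * 5.0, 2000.0, 300.0
g = green_3d(r, omega, c0)
# The residual should be small compared with the individual term k^2 |G|.
print(abs(g), abs(helmholtz_residual(r, omega, c0)) / ((omega / c0) ** 2 * abs(g)))
```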

In heterogeneous media, the Green's function cannot be obtained analytically and the forward problem must be solved numerically. Ray theory is often very useful as it is a computationally efficient method to evaluate the Green's functions (Červený et al., 1977; Chapman, 1985). Ray methods nevertheless rely on the high-frequency asymptotic approximation of the wave equation and are only accurate in smooth velocity models, i.e. when the velocity varies slowly compared to the wavelength. In order to simulate the propagation of waves in realistic (possibly discontinuous) media, the wave equation can be discretized (Marfurt, 1984) and finite difference solvers may be used (Jo et al., 1996; Stekl and Pratt, 1998). The forward problem is then the solution of a linear inverse problem in which the wavefield $\Psi(x, s)$ is the unknown.

2.4.2 The inverse problem

The misfit function of waveform inversion is defined as the sum, over source-receiver pairs, of the squared data residuals:

$$
\begin{aligned}
E &= \frac{1}{2} \sum_s \sum_r \left( \Psi_{calc}(r, s) - \Psi_{obs}(r, s) \right)^* \left( \Psi_{calc}(r, s) - \Psi_{obs}(r, s) \right) \\
  &= \frac{1}{2} \sum_s \sum_r \Delta\Psi^*(r, s)\, \Delta\Psi(r, s),
\end{aligned} \tag{2.72}
$$


where the superscript $*$ denotes complex conjugation. If the model is represented by $m(x)$, that is, as a function of spatial position $x$, the gradient of the misfit function, whose opposite is the steepest descent direction, is given by

$$
\nabla_m E(x) = \frac{\partial E}{\partial m(x)} = \mathrm{Re} \left\{ \sum_s \sum_r \frac{\partial \Psi^*(r, s)}{\partial m} \left( \Psi_{calc}(r, s) - \Psi_{obs}(r, s) \right) \right\}. \tag{2.73}
$$

The gradient in equation (2.73) is expressed as the sum of the individual source-receiver pair gradients. For a given source-receiver pair, the Fréchet derivative can be derived from equation (2.71) and can be written

$$
\frac{\partial \Psi(r, s)}{\partial m} = \frac{\partial G(r, s)}{\partial m}\, S(\omega). \tag{2.74}
$$

The main difficulty in computing the Fréchet derivative is that the Green's function cannot be expressed analytically, so that equation (2.74) cannot be evaluated analytically.

Explicit numerical computation of the Fréchet derivative

The Fréchet derivative can be computed explicitly by measuring the effect on the data of the perturbation of a single model parameter. The model parameter $m(x)$ is thus perturbed by a factor $\beta$, chosen so that the model perturbation remains small. The explicit Fréchet derivative is then

$$
\frac{\partial \Psi(r, s)}{\partial m} = \lim_{\Delta m \to 0} \left( \frac{\Delta\Psi}{\Delta m} \right) \simeq \frac{\Psi(m + \beta m) - \Psi(m)}{\beta m}. \tag{2.75}
$$

Such a computation of the Fréchet derivatives for many model parameters is however very expensive. This method requires the solution of a forward problem for each model parameter to compute the term $\Psi(m + \beta m)$. For reasons of computational efficiency, this approach cannot be envisaged for large problems in which a significant number of model parameters are involved.
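The perturbation formula (2.75) can be illustrated on a case where the exact derivative is known. The sketch below is a toy under stated assumptions: the "forward modelling" is the homogeneous 3-D Green's function with the velocity as the single model parameter, and the names `forward`, `explicit_frechet`, `analytic_frechet` and the value of `beta` are hypothetical choices, not taken from the thesis.

```python
import cmath
import math

OMEGA = 2.0 * math.pi * 5.0  # angular frequency (5 Hz), illustrative
OFFSET = 300.0               # source-receiver distance in metres, illustrative


def forward(c):
    """Toy forward modelling: homogeneous 3-D wavefield for velocity c."""
    return cmath.exp(1j * OMEGA * OFFSET / c) / (4.0 * math.pi * OFFSET)


def explicit_frechet(model, beta=1e-6):
    """Equation (2.75): perturb the model by the factor beta and difference."""
    return (forward(model + beta * model) - forward(model)) / (beta * model)


def analytic_frechet(c):
    """Exact derivative of the toy wavefield with respect to c."""
    return -1j * OMEGA * OFFSET / c**2 * forward(c)


c0 = 2000.0
fd, exact = explicit_frechet(c0), analytic_frechet(c0)
print(abs(fd - exact) / abs(exact))  # relative error of the explicit estimate
```

Each model parameter needs its own extra call to `forward`, which is what makes this approach expensive when the model has many parameters.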

Efficient calculation of the Fréchet derivative using the Born approximation

A more efficient approach is to use the Born approximation to compute the partial derivatives. The Born approximation linearises the perturbed wavefield according to the relation

$$
\Psi'(m + \Delta m, r, s) \simeq \Psi(m, r, s) + \frac{\partial \Psi}{\partial m} \Delta m. \tag{2.76}
$$

The perturbed wavefield $\Psi'(m + \Delta m)$ is the solution of the wave equation

$$
g(m + \Delta m)\, \Psi'(r, s) = S(\omega). \tag{2.77}
$$


For a small model perturbation, the operator $g$ may be linearised so that we have

$$
g(m + \Delta m) \simeq g(m) + \frac{\partial g}{\partial m} \Delta m. \tag{2.78}
$$

By introducing equation (2.78) into (2.77) we have

$$
g(m)\, \Psi'(r, s) + \frac{\partial g}{\partial m} \Delta m\, \Psi'(r, s) = S(\omega). \tag{2.79}
$$

Using equation (2.76), the perturbed wavefield in (2.79) can be expressed in its linearised form so that the perturbed wave equation may be written

$$
g(m)\, \Psi(r, s) + g(m) \frac{\partial \Psi}{\partial m} \Delta m + \frac{\partial g}{\partial m} \Delta m\, \Psi(r, s) + \frac{\partial g}{\partial m} \frac{\partial \Psi}{\partial m} \Delta m^2 = S(\omega). \tag{2.80}
$$

In equation (2.80), the unperturbed wavefield $\Psi(r, s)$ is the solution of the wave equation (2.70), so that $g(m)\Psi(r, s) = S(\omega)$. Since the second order term (in $\Delta m^2$) is neglected within the Born approximation, the partial derivative satisfies

$$
g(m) \frac{\partial \Psi(r, s)}{\partial m} = -\frac{\partial g}{\partial m} \Psi(r, s). \tag{2.81}
$$

Equation (2.81) is simply the wave equation (2.67) in which the Fréchet derivative is the partial derivative wavefield and the right hand side term is the virtual source (Pratt et al., 1998). The solution of equation (2.81) thus requires a new forward problem, the Green's function for an excitation at the receiver location, and we have

$$
\begin{aligned}
\frac{\partial \Psi(r, s)}{\partial m(x)} &= -G(x, r) \frac{\partial g}{\partial m} \Psi(x, s) \\
&= -G(x, r) \frac{\partial g}{\partial m} G(x, s)\, S(\omega).
\end{aligned} \tag{2.82}
$$

Equation (2.82) expresses the Fréchet derivative in a way that can be computationally more efficient than the explicit formula (equation (2.75)) as it involves the solution of two forward problems: the Green's functions for an excitation at the source location and at the receiver location. The calculation of the complete Fréchet matrix for all source-receiver pairs thus requires $n_s \times n_r$ Green's functions, where $n_s$ and $n_r$ are the number of sources and receivers respectively.
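The virtual-source relation can be checked numerically on a small discretized problem. The sketch below is an illustration under several assumptions that are not part of the text: a 1-D finite-difference grid, radiating ends modelled with a fixed edge velocity `c_edge`, the squared-slowness parameterisation (so that $\partial g / \partial m = \omega^2$), a unit source, and a dense pure-Python solver. All function names are hypothetical. It computes one Fréchet derivative by solving the wave equation with a virtual source (equation (2.81)), by the product of two Green's functions (equation (2.82)), and by a central finite difference for reference.

```python
import cmath
import math


def helmholtz_matrix(m, omega, h, c_edge):
    """1-D finite-difference matrix of g = d2/dx2 + omega^2 m (m = 1/c^2).

    Each end radiates: the ghost node is replaced by an outgoing plane wave
    travelling at the fixed velocity c_edge.
    """
    n = len(m)
    a = [[0j] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = -2.0 / h**2 + omega**2 * m[i]
        if i > 0:
            a[i][i - 1] = 1.0 / h**2
        if i < n - 1:
            a[i][i + 1] = 1.0 / h**2
    edge = cmath.exp(1j * omega * h / c_edge) / h**2
    a[0][0] += edge
    a[n - 1][n - 1] += edge
    return a


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


n, h, omega, c0 = 30, 10.0, 2.0 * math.pi * 5.0, 2000.0
m = [1.0 / c0**2] * n
src, rec, node = 5, 24, 15  # source, receiver and perturbed model nodes


def wavefield(model, source_node):
    """Wavefield for a unit point source at source_node."""
    rhs = [0j] * n
    rhs[source_node] = 1.0
    return solve(helmholtz_matrix(model, omega, h, c0), rhs)


psi = wavefield(m, src)

# Equation (2.81): same operator, virtual source -(dg/dm) psi at the node.
virtual = [0j] * n
virtual[node] = -omega**2 * psi[node]
dpsi_born = solve(helmholtz_matrix(m, omega, h, c0), virtual)[rec]

# Equation (2.82): product of receiver-side and source-side Green's functions.
dpsi_recip = -wavefield(m, rec)[node] * omega**2 * psi[node]

# Reference value from a central finite difference on the same parameter.
dm = 1e-4 * m[node]
m_plus, m_minus = list(m), list(m)
m_plus[node] += dm
m_minus[node] -= dm
dpsi_fd = (wavefield(m_plus, src)[rec] - wavefield(m_minus, src)[rec]) / (2.0 * dm)

print(abs(dpsi_born - dpsi_fd) / abs(dpsi_fd))
print(abs(dpsi_born - dpsi_recip) / abs(dpsi_fd))
```

The second form relies on the symmetry of the discrete operator, which is the discrete counterpart of source-receiver reciprocity.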

2.4.2.1 The linearised inversion and the Born approximation

Linearised inversion methods rely on the assumption that the forward problem is linear in the neighbourhood of the reference model. Therefore, the Born approximation (equation (2.76)) is used to establish a linear relation between the model perturbation and the data residuals. The data in the reference model $\Psi(m_o)$ are most often computed using an approximate Green's function in smooth or homogeneous media using ray theory. An estimate of the direct inverse operator may be obtained for stacked data (Cohen and Bleistein, 1979) and for unstacked data (Clayton and Stolt, 1981; Miller et al., 1987). Diffraction tomography yields a practical analytical solution in the $(\omega, k)$ domain (Devaney, 1984; Wu and Toksöz, 1987; Pratt and Worthington, 1988), assuming the reference model is homogeneous. Formulations of the inverse problem based on the explicit minimisation of the data misfit, as described in section 2.3.2, are often considered more robust (Tarantola, 1984b; Ikelle et al., 1988; Lambaré et al., 1992).

Linearised inversion based on ray theory does however suffer from a severe limitation: the asymptotic ray approximation will break down if the updated reference model contains strong heterogeneities. This prevents non-linear iterations from being carried out. Therefore, if the starting model is not adequate and contains large errors, the Born approximation will not be valid and the solution of the linearised problem will be inaccurate.

2.4.2.2 The non-linear gradient methods

As described in section 2.3.3, non-linear gradient methods require the computation of the Fréchet derivative at each iteration, which involves the computation of the Green's functions from the source and receiver locations. Since the updated model may contain strong heterogeneities, the non-linear gradient methods must be used with an accurate wave propagation solver such as a finite difference solver of the wave equation.

The gradient computation

The gradient direction can be obtained by inserting equation (2.82) into equation (2.73), so that we have

$$
\begin{aligned}
\nabla_m E(x) &= -\mathrm{Re} \left\{ \sum_s \sum_r \left( \frac{\partial g}{\partial m(x)} \Psi(x, s)\, G(x, r) \right)^* \Delta\Psi(r, s) \right\} \\
&= -\mathrm{Re} \left\{ \frac{\partial g}{\partial m} \sum_s \sum_r G^*(x, s)\, S^*(\omega)\, G(x, r)^*\, \Delta\Psi(r, s) \right\}.
\end{aligned} \tag{2.83}
$$


The gradient in equation (2.83) can be decomposed into the product of two wavefields: the forward propagated wavefield $P_f(x, s)$ of the source term, defined as

$$
P_f(x, s) = G(x, s)\, S(\omega), \tag{2.84}
$$

and the back-propagated wavefield $P_b(x, r, s)$ of the data residuals from the receiver location, given by

$$
P_b(x, r, s) = G(x, r)^*\, \Delta\Psi(r, s). \tag{2.85}
$$

The gradient then becomes

$$
\nabla_m E(x) = -\mathrm{Re} \left\{ \frac{\partial g}{\partial m} \sum_s \sum_r P_f^*(x, s)\, P_b(x, r, s) \right\} \tag{2.86}
$$

(see Pratt et al. (1996), equation 12). In equation (2.86), the gradient results from the multiplication of complex quantities, as both the forward and back-propagated wavefields are complex. Attenuation may be accounted for through the introduction of an imaginary velocity component in the wave equation, in which case the imaginary velocities are updated according to the imaginary part of the gradient (Song et al., 1995; Hicks and Pratt, 2001). Only a real velocity model is considered here, and it is updated at iteration $(k + 1)$ according to

$$
m^{(k+1)}(x) = m^{(k)}(x) - \gamma^{(k)}\, \nabla_m E(x). \tag{2.87}
$$

Model parameterisation

The estimate of the velocity model may be achieved by parameterising the inverse problem with respect to velocity, slowness or square of slowness. The choice of parameterisation influences the gradient direction through the term $\partial g / \partial m$ (Table 2.3). Since the parameterisation does not modify the shape of the misfit function, it cannot be classified as a preconditioning of the gradient; it is explicitly part of the definition of the gradient.

$$
\begin{array}{l|ccc}
 & \text{Velocity} & \text{Slowness} & \text{Slowness squared} \\
\hline
m & c & 1/c & 1/c^2 \\
\partial g / \partial m & -2\omega^2 / c^3 & 2\omega^2 / c & \omega^2
\end{array}
$$

Table 2.3: Possible model parameterisations.
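The entries of Table 2.3 follow from differentiating the operator of equation (2.69) with respect to the chosen parameter. As a short worked check, using only the definitions given above:

```latex
\begin{aligned}
m = c:     &\quad g = \nabla^2 + \omega^2 c^{-2}, &\quad \frac{\partial g}{\partial m} &= -\frac{2\omega^2}{c^3}, \\
m = 1/c:   &\quad g = \nabla^2 + \omega^2 m^2,    &\quad \frac{\partial g}{\partial m} &= 2\omega^2 m = \frac{2\omega^2}{c}, \\
m = 1/c^2: &\quad g = \nabla^2 + \omega^2 m,      &\quad \frac{\partial g}{\partial m} &= \omega^2 .
\end{aligned}
```

Only the squared-slowness parameterisation gives a derivative that does not depend on the current model.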


Efficient computation of the gradient

Equation (2.86) shows that the gradient involves, for each source-receiver pair, the multiplication of two wavefields, from the source and the receiver locations. As the forward propagated wavefield from the source, $P_f(x, s)$, does not depend on the receiver location, equation (2.86) can be written

$$
\nabla_m E(x) = -\mathrm{Re} \left\{ \frac{\partial g}{\partial m} \sum_s P_f^*(x, s) \sum_r P_b(x, r, s) \right\}. \tag{2.88}
$$

This simple manipulation shows that, for a given source gather, the forward propagated field can be multiplied by the sum of the back-propagated wavefields from the corresponding receivers. Equation (2.88) offers a critical improvement as the back-propagation from all receivers can in fact be carried out simultaneously at no additional computational cost: the cost of back-propagating the data residuals for all receivers together is the same as for a single receiver. The gradient calculation thus requires $n_s \times 2$ forward modellings instead of $n_s \times n_r$ if all the data residuals were back-propagated independently. This computational gain has a major drawback: the Fréchet derivative matrix is not known, as it would require the computation of the individual source-receiver pair gradients according to equation (2.86). The efficient computation of the gradient thus implies that the Fréchet matrix is not computed explicitly.
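The two-solves-per-source gradient of equation (2.88) can be sketched on a small discretized problem and compared with finite differences of the misfit. This is an illustration only, under assumptions that are not part of the text: a 1-D grid with radiating ends at a fixed edge velocity, the squared-slowness parameterisation ($\partial g / \partial m = \omega^2$), unit sources, a dense pure-Python solver, and a made-up low-velocity anomaly as the "true" model. All names are hypothetical.

```python
import cmath
import math


def helmholtz_matrix(m, omega, h, c_edge):
    """1-D finite-difference matrix of d2/dx2 + omega^2 m with radiating ends."""
    n = len(m)
    a = [[0j] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = -2.0 / h**2 + omega**2 * m[i]
        if i > 0:
            a[i][i - 1] = 1.0 / h**2
        if i < n - 1:
            a[i][i + 1] = 1.0 / h**2
    edge = cmath.exp(1j * omega * h / c_edge) / h**2
    a[0][0] += edge
    a[n - 1][n - 1] += edge
    return a


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


N, H, OMEGA, C0 = 24, 10.0, 2.0 * math.pi * 5.0, 2000.0
SOURCES, RECEIVERS = (2, 4), (18, 20, 22)


def forward(m, src):
    """Forward propagated wavefield P_f for a unit source at node src."""
    rhs = [0j] * N
    rhs[src] = 1.0
    return solve(helmholtz_matrix(m, OMEGA, H, C0), rhs)


def misfit(m, obs):
    """Equation (2.72): half the sum of squared residual moduli."""
    total = 0.0
    for s in SOURCES:
        psi = forward(m, s)
        total += 0.5 * sum(abs(psi[r] - obs[s][r]) ** 2 for r in RECEIVERS)
    return total


def gradient(m, obs):
    """Equation (2.88): one forward and one back-propagation per source."""
    a = helmholtz_matrix(m, OMEGA, H, C0)
    grad = [0.0] * N
    for s in SOURCES:
        p_f = forward(m, s)
        # Inject the conjugated residuals of all receivers as a single source.
        adjoint_source = [0j] * N
        for r in RECEIVERS:
            adjoint_source[r] = (p_f[r] - obs[s][r]).conjugate()
        # Conjugating the solution gives sum_r G*(x, r) dPsi(r, s).
        p_b = [v.conjugate() for v in solve(a, adjoint_source)]
        for x in range(N):
            grad[x] -= (OMEGA**2 * p_f[x].conjugate() * p_b[x]).real
    return grad


m_true = [1.0 / C0**2] * N
for x in range(9, 13):
    m_true[x] = 1.0 / 1800.0**2  # hypothetical low-velocity anomaly
obs = {s: forward(m_true, s) for s in SOURCES}
m0 = [1.0 / C0**2] * N
grad = gradient(m0, obs)


def fd_component(x, rel=1e-4):
    """Central finite difference of the misfit for model node x."""
    dm = rel * m0[x]
    up, down = list(m0), list(m0)
    up[x] += dm
    down[x] -= dm
    return (misfit(up, obs) - misfit(down, obs)) / (2.0 * dm)


worst = max(abs(grad[x] - fd_component(x)) for x in range(N))
print(worst / max(abs(g) for g in grad))
```

The function `gradient` never forms the Fréchet matrix; the finite-difference check, by contrast, needs two misfit evaluations per model parameter.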

Effective computation of the step length

The linear estimate of the step length $\gamma$ for the waveform inverse problem is often considered sufficient, although line search techniques are possible. However, since the Fréchet derivative matrix is not known, the vector $F \nabla_m E$ of equation (2.38) must be evaluated by perturbing the model in the gradient direction. An additional forward modelling must be performed to compute the data in the perturbed model, $d(m_o - \beta \nabla_m E)$. The product of the Fréchet derivative matrix with the gradient is thus given by

$$
F \nabla_m E \simeq \frac{d(m_o - \beta \nabla_m E) - d(m_o)}{-\beta}. \tag{2.89}
$$

The term $F \nabla_m E$ is then used to compute the linear estimate of the step length $\gamma$.
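The procedure can be tried on the 2-D toy problem of section 2.3.5. The sketch below assumes that the linear step length of equation (2.38) takes the usual least-squares form $\gamma = \nabla E^T \nabla E / \|F \nabla E\|^2$; that form, the function names and the value of `beta` are assumptions for illustration. The product $F \nabla E$ is obtained from one extra forward modelling, as in equation (2.89), without using $F$ itself.

```python
import math

D_OBS = (4.0, -1.0)  # observed data of the toy problem


def forward(m):
    """Forward operator g(m) of the 2-D toy problem."""
    return (-2.0 * math.cos(0.5 * math.pi * m[0]) + m[1], -m[0] + m[1])


def misfit(m):
    """Least-squares misfit E = 1/2 |g(m) - d_obs|^2."""
    return 0.5 * sum((g - d) ** 2 for g, d in zip(forward(m), D_OBS))


def gradient(m):
    """Gradient F^T (g(m) - d_obs), written out for this problem."""
    r1, r2 = (g - d for g, d in zip(forward(m), D_OBS))
    a = math.pi * math.sin(0.5 * math.pi * m[0])
    return (a * r1 - r2, r1 + r2)


def step_length(m, grad, beta=1e-6):
    """Linear step length from one extra forward modelling (equation (2.89))."""
    perturbed = forward((m[0] - beta * grad[0], m[1] - beta * grad[1]))
    f_grad = [(p - d) / (-beta) for p, d in zip(perturbed, forward(m))]
    return sum(g * g for g in grad) / sum(v * v for v in f_grad)


m0 = (4.5, 7.0)
grad = gradient(m0)
gamma = step_length(m0, grad)
m1 = (m0[0] - gamma * grad[0], m0[1] - gamma * grad[1])
print(gamma, misfit(m0), misfit(m1))
```

For this starting model, one steepest descent step with the estimated $\gamma$ reduces the misfit.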

Why the approximate Hessian cannot be efficiently computed

The approximate Hessian given by equation (2.60) requires knowledge of the Fréchet derivative matrix. Since the efficient computation of the gradient is done without building the Fréchet derivative matrix, the Hessian cannot be easily obtained. However, some waveform inversions using Gauss-Newton methods have been carried out (Pratt et al., 1998; Hicks, 1999; Hicks and Pratt, 2001). This requires the explicit computation of the Fréchet derivative matrix. Such an approach thus involves one forward modelling per model parameter and can only be applied in cases where the number of model parameters is limited.

Non-linear gradient methods and the Born approximation

It is very important to keep in mind that the non-linear, iterative gradient approach does not rely on the assumption of linearity. It is often mistakenly believed that the success of non-linear inversion methods depends upon the validity of the Born approximation, i.e. that the data residuals have a linear relation with the true model perturbation. This is incorrect, as the non-linearity is addressed by recomputing the Fréchet derivative at each iteration. The Born approximation is only used for the computation of the Fréchet derivatives. This should allow compensation for some of the errors contained in the starting model. The success of non-linear waveform inversion only depends on the existence of a descent path leading to the global minimum. Of course, since the waveform inverse problem is highly non-linear, the misfit function will present many local minima and the determination of the starting model will therefore be a critical step in the waveform inversion process.

2.5 Conclusion

The inverse problem may be solved using two main approaches. The direct methods aim to determine the direct inverse operator, while the local methods are based on the minimisation of the misfit between observed and calculated data.

In the linear case, the accuracy of the inverse solution depends on the existence of a model null space. If no model null space exists, the direct and local methods yield the exact solution, and the generalised inverse and the Gauss-Newton solver are equivalent.

In the case where a model null space exists, the solution of the generalised inverse has no components in $V_o$. The gradient method updates the starting model with components contained in $V_p$ only, so that the components in $V_o$ remain unchanged. Since the Hessian is not invertible, the Gauss-Newton solution can only be obtained if the Hessian is restricted to $V_p$, in which case it is identical to the generalised inverse solution. A more widely used approach for handling the null space is to minimise a misfit function with an additional damping term.


The solution of non-linear inverse problems introduces additional difficulties as the Fréchet derivative matrix is model dependent and the misfit function is not quadratic and may present local minima. Newton and linearised methods assume that the misfit function is locally quadratic. If this assumption holds reasonably well, these approaches will be successful, but non-linear iterations may be required to compensate for non-linearities. For strongly non-linear problems, non-linear gradient methods appear to be more robust as they do not rely on the quadratic approximation and recompute the Fréchet matrix at each iteration. In order to improve the convergence rate, the gradient could optimally be preconditioned by the inverse of a damped approximate Hessian.

The waveform inverse problem is non-linear and the efficiency of the local methods depends on the accuracy of the starting model. If linearised inversions are to be used, the starting model must be good enough that the minimum of the quadratic function corresponds to the global minimum. Because linearised inversion is most often implemented using ray theory, non-linear iterations cannot be carried out. The non-linear gradient methods appear to be more robust as they are not bound by the Born approximation. The convergence rate can be improved by preconditioning the gradient. The damped approximate Hessian cannot easily be used as a preconditioning operator as the Fréchet matrix is not explicitly computed. The success of the non-linear gradient methods depends on the location of the starting model with respect to the global minimum. If the starting model is not adequate, the inversion will converge to a local minimum and will thus yield a wrong answer.




Chapter 3

A strategy for selecting temporal frequencies

3.1 Introduction

¹Waveform inversion may be implemented either in the time domain (Tarantola, 1986; Mora, 1987; Bunks et al., 1995; Shipp and Singh, 2002), or in the frequency domain (Pratt and Worthington, 1990; Liao and McMechan, 1996); implementations for both acoustic and elastic wave equations exist. The frequency domain approach is equivalent to the time domain approach when all frequencies are inverted simultaneously (Pratt et al., 1998). Reciprocally, the inversion of a single sinusoidal component of time domain data is equivalent to the inversion of a single component of frequency domain data.

The choice of a domain for inversion allows one to apply specific methodologies to precondition either the data residuals or the gradient in ways that may improve the convergence and/or the linearity of the inverse problem. For example, time windowing the residuals is often useful (Shipp and Singh, 2002); this requires a time domain representation of the data residuals. It is nevertheless possible to devise strategies for time domain conditioning of the data by making use of complex-valued frequencies, as we will see in Chapter 4. Conversely, when using a frequency domain approach it is straightforward to carry out the inversion adopting a strategy that proceeds sequentially from the low to the high frequencies (at no additional computational cost). Since low frequency data are more linear with respect to the model perturbations than high frequency data, such a strategy dramatically improves the chance for the inversion to locate the global minimum of the full, wide band inverse problem. Such a methodology may of course also be applied in the time domain by low pass filtering of the data (Bunks et al., 1995); however, the computational efficiency of the frequency domain approach is far superior when this strategy is to be employed (Marfurt, 1984). Furthermore, the frequency domain is also well suited to compensating for variations in the source signature, which will cause certain frequencies to be under-represented in the image. Although this effect may be compensated for by preconditioning the time domain gradient (Causse et al., 1999), it is easier to apply the preconditioning in the frequency domain as we process each mono-frequency gradient individually (Hicks, 1999).

¹ This chapter is an extended version of an article accepted for publication in Geophysics (co-author G. Pratt).

One of the most important advantages of frequency domain inversion is the ability to provide an unaliased image using a very limited number of frequencies, as observed by Pratt and Worthington (1988), Liao and McMechan (1996) and Forgues et al. (1998). This observation is consistent with the results of Wu and Toksöz (1987), who demonstrated that even a single frequency of a given seismic survey geometry yields information on a finite portion of the model in the wavenumber domain. Since the computational cost of frequency domain inversion is directly proportional to the number of frequencies used, this is a significant advantage for waveform inversion of real, multi-shot seismic data.

The claim that a limited number of frequencies would suffice when inverting reflection seismic data was investigated by Freudenreich and Singh (2000), who pointed out that no strategy for selecting the appropriate frequencies had yet been developed. They concluded that the frequency domain inversion fails in the presence of a limited offset range.

In this chapter, it is shown that frequency domain inversion of reflection data using only a few frequencies (properly selected) can yield a result that is comparable to full time domain inversion, for any available range of offsets. I develop a strategy that adapts the selection of imaging frequencies to the available offset range. The main idea is that the larger the offset, the fewer frequencies are required. This strategy takes advantage of the effect of an “image stretch”, which increases with increasing offset. This effect is commonly observed in prestack depth migration (Gardner et al., 1974); in a 1-D earth, it is equivalent to NMO stretch. Stretch is often considered to have a negative impact on seismic imaging, although Haldorsen and Farmer (1989) showed that the stretch effect can be utilised to compensate for the lack of low frequencies of the source and hence to improve the spectral content of stacked data. In the strategy defined, each successive frequency is selected by evaluating the image stretch at the maximum offset for the previous frequency. Although the strategy is developed using a 1-D assumption, I will show that the same approach can also be applied effectively in 2-D, heterogeneous media. Because of the kinematic equivalence between the first iteration gradient vector and prestack depth imaging for reflection data (Tarantola, 1986), the strategy also has important implications for frequency domain implementations of prestack depth migration (such as that advocated by Schleicher et al. (1993)).

This chapter begins by deriving a specific formula for the gradient using a plane wave approximation in homogeneous background media that predicts analytically the wavenumber illumination of a given target. This allows us to define a strategy for choosing frequencies for continuous wavenumber illumination of a 1-D target. The key idea is to take advantage of the maximum offset data in the acquisition. This strategy is then validated on the 1-D imaging experiment of Freudenreich and Singh (2000). I go on in the chapter to show that the strategy can be successfully applied to data generated in the 2-D heterogeneous Marmousi model.

3.2 Gradient vector and wavenumber illumination<br />

In Chapter 2, the formulation of the gradient is applicable to arbitrary reference media: equation (2.83) may be used in cases where the background medium is inhomogeneous, provided the Green’s functions are calculated appropriately. In order to proceed, the following assumptions are now introduced:

1. We assume we may ignore amplitude effects.

2. We assume the reference medium is homogeneous, with a velocity $c_o$.

3. We assume we are in the far field, so that we may replace the Green’s functions with plane wave approximations.

Note that these are introduced only in order to proceed with the analysis that will lead to a frequency selection strategy; none of these are required for the actual inversions. Under these three assumptions both Green’s functions, $G_o(x, s)$ and $G_o(x, r)$, may be approximated by incident and scattered plane waves:

\[
\begin{aligned}
G_o(x, s) &\approx \exp\left(i k_o\, \hat{s} \cdot x\right), \\
G_o(x, r) &\approx \exp\left(i k_o\, \hat{r} \cdot x\right),
\end{aligned}
\tag{3.1}
\]

where $k_o = \omega/c_o$ is the wavenumber of the incident and scattered waves in the homogeneous reference medium, and $\hat{s}$ and $\hat{r}$ are respectively unit vectors in the incident propagation direction (source to scatterer) and the inverse scattering direction (receiver to scatterer).

Inserting equation (3.1) into equation (2.83) yields

\[
\begin{aligned}
\nabla E(x) &= -\omega^2 \sum_s \sum_r \mathrm{Re}\left\{ \exp\left(-i k_o\, \hat{s} \cdot x\right) \exp\left(-i k_o\, \hat{r} \cdot x\right) \Delta\Psi(r, s) \right\} \\
&= -\omega^2 \sum_s \sum_r \mathrm{Re}\left\{ \exp\left(-i k_o (\hat{s} + \hat{r}) \cdot x\right) \Delta\Psi(r, s) \right\}.
\end{aligned}
\tag{3.2}
\]

The only space-dependent term in equation (3.2) is the complex exponential, which, for a given source-receiver pair, oscillates at a single wavenumber, given by the vector $k_o(\hat{s} + \hat{r})$. Equation (3.2) is therefore an inverse Fourier summation in which the weight function for each source/receiver pair plane wave combination is determined by the data residuals $\Delta\Psi(r, s)$.

A single frequency, single source-receiver pair in the imaging algorithm described by equation (3.2) would therefore yield only a single wavenumber in the image, $\nabla E(x)$ (see Figure 3.1). In order to reconstruct useful images, we need to provide additional wavenumbers. There are two ways to recover a range of wavenumbers: (1) we can use a wider band of frequencies (as we do if we use a time-domain algorithm), or (2) we can use a range of different source-receiver pairs that sample the same image point at different directions, $\hat{r}$ and $\hat{s}$, as shown in Figure 3.2. A combination of both approaches seems sensible. Almost all seismic imaging strategies take advantage of the first approach: the wavenumber bandwidth of the image is provided by temporal bandwidth in the experiment. My suggestion is that some of the required image bandwidth can in fact be provided by making optimal use of different offsets in the reflection experiment to create a variety of plane wave imaging directions. Temporal frequencies may then be selected to avoid redundancy of the wavenumber information.
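The two routes to a wider wavenumber range can be sketched numerically. The following minimal Python sketch assumes the homogeneous, far-field plane wave picture of equation (3.1) and evaluates the wavenumber vector $k_o(\hat{s} + \hat{r})$ for one source-receiver pair and one image point. The function names, the (x, z) coordinate convention with z positive downwards, the geometry and the 2.5 km/s background velocity are illustrative assumptions, not values taken from the text.

```python
import math

def unit_vector(start, end):
    """Unit vector pointing from `start` to `end`; points are (x, z) tuples."""
    dx, dz = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dz)
    return (dx / length, dz / length)

def image_wavenumber(freq_hz, c0, source, receiver, point):
    """Wavenumber vector k_o (s_hat + r_hat) for a single source-receiver pair.

    Assumes a homogeneous background of velocity c0 and plane waves, as in
    equation (3.1). s_hat points from source to scatterer and r_hat from
    receiver to scatterer.
    """
    k0 = 2.0 * math.pi * freq_hz / c0
    s_hat = unit_vector(source, point)
    r_hat = unit_vector(receiver, point)
    return (k0 * (s_hat[0] + r_hat[0]), k0 * (s_hat[1] + r_hat[1]))

# Hypothetical geometry (km) and background velocity (km/s).
C0 = 2.5
SCATTERER = (5.0, 2.0)

# (1) More wavenumbers from more frequencies: same pair, two frequencies.
k_low = image_wavenumber(2.0, C0, (3.5, 0.0), (6.5, 0.0), SCATTERER)
k_high = image_wavenumber(5.0, C0, (3.5, 0.0), (6.5, 0.0), SCATTERER)

# (2) More wavenumbers from more directions: same frequency, wider offset.
k_far = image_wavenumber(5.0, C0, (0.0, 0.0), (10.0, 0.0), SCATTERER)
```

In this assumed geometry the scatterer sits below the midpoint of both pairs, so the horizontal component vanishes and only the vertical component changes, either with frequency or with offset.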

3.2.1 Analysis of the gradient within the Born approximation<br />

The gradient image of equation (3.2) is now examined by estimating the data residuals $\Delta\Psi(r, s)$ in the context of the Born approximation, through which we may write

\[
\Delta\Psi(r, s) \approx -\omega^2 \int dx\; G_o(r, x)\, G_o(x, s)\, \delta m(x),
\tag{3.3}
\]

where $\delta m(x)$ is the perturbation in the parameter $m(x) = 1/c^2(x)$ (see for example Miller et al. (1987), equation (8)). Equation (3.3) approximates the data residuals by establishing a linear relation with the model perturbation $\delta m$. Although in the inversion scheme, the data residuals



Substituting the Born estimate of the data residuals, equation (3.5), into the plane wave gradient image, equation (3.2), we obtain

\[
g(x) = \omega^4 \sum_s \sum_r \mathrm{Re}\left\{ \exp\left(-i k_o (\hat{s} + \hat{r}) \cdot x\right) \tilde{M}\left(k_o (\hat{s} + \hat{r})\right) \right\},
\tag{3.6}
\]

which, as stated above in reference to equation (3.2), is an inverse Fourier summation. We have now written this summation in a form that makes it clear that the weights in the summation are given by the Fourier components of the model. Thus the summation will yield an image of the true model, distorted however due to the limited availability of scattering directions. I may emphasise this fundamental observation:

One source-receiver pair with an incidence direction $\hat{s}$ and an inverse scattering direction $\hat{r}$ yields information on only a single wavenumber in the spectrum of the model, given by $k_o(\hat{s} + \hat{r})$.

Multiple source-receiver pairs yield information on a finite region in wavenumber space, even if only a single frequency component is used. This basic relationship is illustrated in Figure 3.1. This result is similar to demonstrations made in the context of diffraction tomography (Devaney, 1981; Wu and Toksöz, 1987) and linearised inversion in the $(\omega, k)$ domain (for example, Clayton and Stolt, 1981; Ikelle et al., 1986).

A comment on the gradient in the linear case<br />

Equation (3.6) is in the form of an inverse Fourier summation: if we have a complete and uniform sampling in the source and receiver scattering directions (an impossibility for most geophysical scenarios), then this inverse Fourier summation will undo the forward Fourier transform in $\tilde{M}(k)$, i.e.

\[
g(x) \rightarrow \omega^4\, \delta m(x),
\tag{3.7}
\]

in which case we see that the gradient will recover a scaled image of the original model. Equation (3.7) illustrates the limitation of the gradient as an imaging operator: the factor $\omega^2$ in the forward problem is not cancelled by the gradient operator; instead it is re-used. The gradient thus provides a filtered image due to the distortion of the true perturbation as described in section 2.2.2.3, using the singular value decomposition. The model space eigenvectors and singular values of the multi-frequency inverse problem (time-domain) are therefore frequency dependent and the high frequencies are associated with the highest singular values. A correct inverse modelling operator would cancel this term. In diffraction tomography and linearised inversion, the inverse operator contains a filter term inversely proportional to $\omega^2$, thus ensuring the fidelity of the reconstruction (Wu and Toksöz, 1987). In descent schemes, an iterative approach should ensure that the distortions in the gradient are eventually removed.
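The effect of complete versus incomplete wavenumber sampling on a summation of the form of equation (3.6) can be illustrated with a small discrete analogue. The sketch below is only a 1-D toy. It assumes a periodic model sampled on n points and the spectrum convention $\tilde{M}(k) = \sum_x \delta m(x) \exp(ikx)$, chosen here so that it pairs with the $\exp(-ikx)$ factor in the summation. It drops the $\omega^4$ factor, and all function names are hypothetical.

```python
import cmath
import math

def model_spectrum(dm):
    """Discrete spectrum M(k) = sum_x dm[x] exp(+2*pi*i*k*x/n), k = 0..n-1."""
    n = len(dm)
    return [sum(dm[ix] * cmath.exp(2j * math.pi * k * ix / n) for ix in range(n))
            for k in range(n)]

def fourier_summation(spectrum, wavenumbers):
    """Sum Re{exp(-ikx) M(k)} over the listed wavenumber indices.

    Normalised by n so that complete sampling returns the model itself.
    """
    n = len(spectrum)
    return [sum((cmath.exp(-2j * math.pi * k * ix / n) * spectrum[k]).real
                for k in wavenumbers) / n
            for ix in range(n)]

# Toy perturbation: a thin layer in an otherwise unperturbed column.
dm = [0.0] * 16
dm[6] = 1.0

spectrum = model_spectrum(dm)
complete = fourier_summation(spectrum, range(16))     # all wavenumbers
band_only = fourier_summation(spectrum, range(3, 6))  # a limited band
```

With every wavenumber present the summation returns the thin layer; with a limited band the layer is smeared into an oscillatory, band-limited image, which is the discrete counterpart of the distortion discussed in the text.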

In multi-frequency, or time-domain, inversions the factor $\omega^4$ implies a double application of the second time derivative operator. For this reason, time domain schemes are not very good at recovering the very low wavenumbers, which are suppressed by the derivative operators. Iterations may eventually recover some of the low wavenumbers, but without preconditioning in some form the convergence rate will be slow. Furthermore, since the low frequencies have a more linear relation with the low wavenumber components of the model, the chance of locating the global minimum will be diminished if this information is not used at an early stage of the inversion. This effect can be compensated for by low-pass filtering of the data in time (Bunks et al., 1995), and progressively including higher frequencies at a later stage. However, if the inversion only involves a few frequencies, it is always more efficient to carry this out in the frequency domain (Marfurt, 1984; Jo et al., 1996). In single frequency inversions the factor $\omega^4$ is of no consequence, as the step length used in equation (2.38) will compensate for this factor correctly by rescaling the gradient.

3.3 The 1-D case<br />

Having reviewed the basic relationship between the wavenumber spectrum of the image and the incident and scattered wave directions, we now wish to use this relationship to develop a strategy for the optimal selection of frequencies in frequency domain imaging of seismic reflection data. The fundamental assumption here will be that the strategy may be developed by examining the illumination of a 1-D model (i.e., one in which the velocity varies only as a function of depth), for which we need only recover the vertical wavenumbers from a range of source-receiver offsets. At a later stage, the strategy is tested on a model with significant 2-D structure in order to evaluate the generality of the approach.

3.3.1 Wavenumber illumination for the 1-D case<br />

As shown previously using a plane wave approximation, the contribution to the gradient image of a single source-receiver pair has only a single wavenumber component, given by $k_o(\hat{s} + \hat{r})$. If the model is 1-D, then all reflection points in the subsurface are midpoint reflection points. In other words, we may consider a common shot gather to be equivalent to a common midpoint (CMP) gather. The basic configuration is illustrated in Figure 3.3: each source-receiver pair records data from a given midpoint with a particular combination of symmetric plane waves (the equivalent wavenumber diagram for this configuration appears in Figure 3.1). For a 1-D earth, the incident and scattering angles are symmetric, so that

\[
k_o \hat{s} = (k_o \sin\theta,\; k_o \cos\theta) \quad \text{and} \quad k_o \hat{r} = (-k_o \sin\theta,\; k_o \cos\theta).
\tag{3.8}
\]

Figure 3.3: The 1-D basic scattering experiment. The incident plane wave is scattered from a 1-D thin layer at the midpoint between source and receiver. For the 1-D case the incident and scattered angles, $\theta$ and $-\theta$, are equal in magnitude.

In equation (3.8) the angles $\theta$ and $-\theta$ are the propagation directions of the source and receiver wave vectors, defined with respect to the vertical axis. From Figure 3.3 we observe that

\[
\cos\theta = \frac{z}{\sqrt{h^2 + z^2}}, \qquad \sin\theta = \frac{h}{\sqrt{h^2 + z^2}},
\tag{3.9}
\]

where $h$ is the half offset and $z$ is the depth of the scattering layer. Substituting expression (3.9) into equation (3.8) we find that the components of the vector $k_o(\hat{s} + \hat{r})$ are

\[
\begin{aligned}
k_x &= 0, \\
k_z &= 2 k_o \alpha,
\end{aligned}
\tag{3.10}
\]

with

\[
\alpha = \cos\theta = \frac{z}{\sqrt{h^2 + z^2}} = \frac{1}{\sqrt{1 + R^2}},
\tag{3.11}
\]

where $R = h/z$ is the half offset-to-depth ratio.

Equation (3.10) defines the wavenumber illumination of a given 1-D layer for a given source-receiver pair. The illumination varies with the wavenumber $k_o = \omega/c_o$ and with the factor $\alpha$, which in turn depends on the half offset-to-depth ratio, $R$, of the source-receiver pair relative to the target. The horizontal wavenumber of the gradient is always zero, as we expect for a 1-D model. Each source-receiver pair contributes a different vertical wavenumber to the gradient image; the larger the offset-to-depth ratio, the smaller the vertical wavenumber of the contribution. Thus, for a given frequency and offset, the offset-to-depth ratio falls as the target gets deeper, and the vertical wavenumber recovered will increase with depth.
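Equations (3.10) and (3.11) are simple enough to evaluate directly. The following sketch computes the vertical wavenumber contributed by one source-receiver pair and shows how it varies with offset and with depth. The function names and the 2.5 km/s background velocity are assumptions made for illustration.

```python
import math

def alpha(half_offset, depth):
    """alpha = cos(theta) = 1 / sqrt(1 + R^2), with R = h / z (equation 3.11)."""
    ratio = half_offset / depth
    return 1.0 / math.sqrt(1.0 + ratio * ratio)

def vertical_wavenumber(freq_hz, c0, half_offset, depth):
    """k_z = 2 k_o alpha for a 1-D layer (equation 3.10), with k_o = 2*pi*f/c0."""
    k0 = 2.0 * math.pi * freq_hz / c0
    return 2.0 * k0 * alpha(half_offset, depth)

C0 = 2.5  # km/s, hypothetical homogeneous background

# Fixed frequency and depth (2 km): k_z falls as the half offset grows.
kz_by_offset = [vertical_wavenumber(5.0, C0, h, 2.0) for h in (0.0, 1.5, 5.0)]

# Fixed frequency and half offset (1.5 km): k_z rises as the layer gets deeper,
# because the offset-to-depth ratio R = h / z shrinks.
kz_by_depth = [vertical_wavenumber(5.0, C0, 1.5, z) for z in (1.0, 2.0, 4.0)]
```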

3.4 Strategy for choosing frequencies for 1-D imaging<br />

In the preceding sections, the wavenumber spectral coverage, for a given source-receiver pair at a single frequency over a 1-D earth, was clarified. Using equations (3.10) and (3.11), but now for a range of offsets, we find that for a given surface seismic acquisition characterised by an offset range $[0, x_{\max}]$, the vertical wavenumber coverage $k_z$ of a 1-D thin layer for a given frequency is limited to the range $[k_{z\min}, k_{z\max}]$, where

\[
\begin{aligned}
k_{z\min} &= 2 k_o \alpha_{\min}, \\
k_{z\max} &= 2 k_o,
\end{aligned}
\tag{3.12}
\]

with

\[
\alpha_{\min} = \frac{1}{\sqrt{1 + R_{\max}^2}},
\tag{3.13}
\]

where $R_{\max} = h_{\max}/z$ is the half offset-to-depth ratio obtained at the maximum half offset $h_{\max}$, and $z$ is the depth of the target layer. The minimum and maximum wavenumbers, $k_{z\min}$ and $k_{z\max}$, are produced by the furthest and nearest offsets, respectively. Expressing equation (3.12) in terms of frequency we have

\[
\begin{aligned}
k_{z\min} &= 4\pi f \alpha_{\min} / c_o, \\
k_{z\max} &= 4\pi f / c_o,
\end{aligned}
\tag{3.14}
\]



where $f$ is the frequency expressed in Hz and $c_o$ is the velocity in the background medium. From relation (3.14), we may define the wavenumber coverage of a multi-offset acquisition as

\[
\Delta k_z \equiv |k_{z\max} - k_{z\min}| = 4\pi (1 - \alpha_{\min}) f / c_o,
\tag{3.15}
\]

while the bandwidth is

\[
\frac{k_{z\max}}{k_{z\min}} = \frac{1}{\alpha_{\min}} = \sqrt{1 + R_{\max}^2}.
\tag{3.16}
\]

The wavenumber coverage increases linearly with frequency and is therefore greater for high than for low frequencies (Wu and Toksöz, 1987), while the wavenumber bandwidth is a function only of the offset-to-depth ratio.
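The distinction between coverage and bandwidth can be checked with a short calculation. The sketch below evaluates equations (3.13) to (3.16) for one hypothetical survey at two frequencies; the function names, the 5 km maximum half offset, the 2 km target depth and the 2.5 km/s velocity are assumptions made for illustration.

```python
import math

def alpha_min(max_half_offset, depth):
    """alpha_min = 1 / sqrt(1 + R_max^2), R_max = h_max / z (equation 3.13)."""
    r_max = max_half_offset / depth
    return 1.0 / math.sqrt(1.0 + r_max * r_max)

def kz_range(freq_hz, c0, max_half_offset, depth):
    """(k_zmin, k_zmax) illuminated by one frequency (equation 3.14)."""
    kz_max = 4.0 * math.pi * freq_hz / c0
    return alpha_min(max_half_offset, depth) * kz_max, kz_max

def coverage_and_bandwidth(freq_hz, c0, max_half_offset, depth):
    """Coverage k_zmax - k_zmin (3.15) and bandwidth k_zmax / k_zmin (3.16)."""
    kz_min, kz_max = kz_range(freq_hz, c0, max_half_offset, depth)
    return kz_max - kz_min, kz_max / kz_min

# Hypothetical survey: 5 km maximum half offset, 2 km target depth, 2.5 km/s.
cov_2hz, bw_2hz = coverage_and_bandwidth(2.0, 2.5, 5.0, 2.0)
cov_4hz, bw_4hz = coverage_and_bandwidth(4.0, 2.5, 5.0, 2.0)
```

Doubling the frequency doubles the coverage, while the ratio $k_{z\max}/k_{z\min}$ stays at $\sqrt{1 + R_{\max}^2}$.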

The strategy for selecting frequencies is determined as follows: each frequency has a limited, finite band contribution to the image spectrum. In order to recover the target accurately over a broad range of wavenumbers, the continuity of the coverage of the object in the wavenumber domain must be preserved as the imaging frequencies are selected. I choose

\[
k_{z\min}(f_{n+1}) = k_{z\max}(f_n),
\tag{3.17}
\]

where $f_{n+1}$ is the next frequency to be chosen following the frequency $f_n$. The principle (illustrated in Figure 3.4) is that the maximum wavenumber of the smaller frequency must equal the minimum wavenumber of the larger frequency. This strategy thus relies on using the full range of offsets available.

Using the condition defined in equation (3.17) and substituting equation (3.14) we arrive at the relation

\[
f_{n+1} = \frac{f_n}{\alpha_{\min}}.
\tag{3.18}
\]

This leads us to a discretisation law in which the frequency increment $\Delta f_{n+1}$ is given by

\[
\begin{aligned}
\Delta f_{n+1} &= f_{n+1} - f_n \\
&= \left( \frac{1 - \alpha_{\min}}{\alpha_{\min}} \right) f_n \\
&= (1 - \alpha_{\min})\, f_{n+1}.
\end{aligned}
\tag{3.19}
\]

Equation (3.19) shows that the optimum frequency increment is not constant and increases linearly with frequency. This is an interesting result as the commonly used frequency domain sampling theorem sets the frequency increment to be constant and equal to

\[
\Delta f_{st} = \frac{1}{T_{\max}},
\tag{3.20}
\]

where $T_{\max}$ is the maximum recorded time. Equation (3.20) prevents “time-aliasing” (i.e., wrap-around in the time domain due to insufficient frequency sampling). Since $\Delta f_n$ in equation (3.19) is typically bigger than $\Delta f_{st}$ in equation (3.20), the strategy we are proposing is a discretisation that would produce time-aliasing of a single seismic trace in time, but will yield an unaliased image of the target in depth from a set of seismic traces with a range of offsets.

Figure 3.4: Illustration of the frequency discretisation strategy. A single discrete frequency yields a range of vertical wavenumbers in the image (governed by the maximum and minimum offsets in the survey). Each successive frequency is selected in such a way as to obtain a continuous coverage in vertical wavenumbers.
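The discretisation law of equation (3.18) can be written as a short loop. The sketch below is a minimal illustration, not the implementation used for the results in this chapter. The function names, the 1 Hz starting frequency, the 10 Hz upper limit and the 5 s record length are assumptions chosen for the example.

```python
import math

def select_frequencies(f_start, f_max, max_half_offset, depth):
    """Frequencies f_1, f_2, ... with f_{n+1} = f_n / alpha_min (equation 3.18)."""
    if f_start <= 0.0 or max_half_offset <= 0.0 or depth <= 0.0:
        raise ValueError("f_start, max_half_offset and depth must be positive")
    r_max = max_half_offset / depth
    a_min = 1.0 / math.sqrt(1.0 + r_max * r_max)
    frequencies = []
    f = f_start
    while f <= f_max:
        frequencies.append(f)
        f /= a_min  # next frequency: k_zmin(f_next) = k_zmax(f)
    return frequencies

def uniform_count(f_max, t_max):
    """Number of frequencies up to f_max at the constant spacing 1/T_max (3.20)."""
    return int(round(f_max * t_max))

# Hypothetical settings: 2 km target, 1-10 Hz band, 5 s of recorded time.
far = select_frequencies(1.0, 10.0, 5.0, 2.0)   # 10 km maximum offset
near = select_frequencies(1.0, 10.0, 1.5, 2.0)  # 3 km maximum offset
uniform = uniform_count(10.0, 5.0)
```

Under these assumed settings the loop returns three frequencies for the 10 km survey and eleven for the 3 km survey, against 50 for constant sampling at $1/T_{\max}$; the counts depend on the assumed starting frequency and band limits.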

3.5 Numerical test I: 1-D models

The 1-D experiment used in this section (see Figure 3.5) was originally used by Freudenreich and Singh (2000), who applied a time domain waveform inversion method (Shipp et al., 1997) to evaluate and compare the time and frequency domain inversion approaches. (Freudenreich and Singh simulated a frequency domain approach by inverting a single sinusoidal component of the time domain data.) They observed that when far offsets were included (out to 10 km), their frequency domain approach using 3 discrete frequencies did indeed provide an accurate imaging result equivalent to their time domain inversion. However, when attempting waveform inversion for the same 3 frequencies, but using only a near offset range (limited to 3 km), their inversion result suffered from oscillatory artifacts due to the limited set of input frequency components. They concluded that when only near offsets were available, the frequency domain approach was less robust.

Freudenreich and Singh raised the important question of exactly how the inversion frequencies should be chosen. In this section, I apply the strategy to the same experiment and thus provide an answer to this question.

3.5.1 The synthetic data<br />

The data for the 1-D experiment (Figure 3.5) consist of a single shot gather comprising 201 receivers, with a receiver interval of 50 m. The maximum offset range available is [0, 10] km. The target of the imaging experiment is the thin layer located at 2 km depth. The velocity model is exactly 1-D, and thus has wavenumber components only along the vertical $k_z$ axis. For all inversion results I present below, the starting model was identical to this model, but without the thin layer. All the modelling and inversion tests were performed with an absorbing boundary at the top of the model, and using a point source with a Ricker wavelet centred on 4 Hz. The time domain seismograms shown in Figure 3.5 are synthesised from the frequency domain finite difference results, using 50 frequencies within a range [0.2, 10] Hz (thus representing a maximum recording time of 5 s).
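The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 50 frequencies are evenly spaced from 0.2 Hz to 10 Hz and that the first receiver sits at zero offset; the variable names are illustrative.

```python
N_RECEIVERS = 201
RECEIVER_INTERVAL_KM = 0.05   # 50 m
N_FREQUENCIES = 50
F_FIRST_HZ, F_LAST_HZ = 0.2, 10.0

# Offsets run from 0 to (N - 1) * interval.
max_offset_km = (N_RECEIVERS - 1) * RECEIVER_INTERVAL_KM

# Evenly spaced frequencies: the spacing fixes the longest unaliased time,
# T_max = 1 / delta_f, as in equation (3.20).
delta_f_hz = (F_LAST_HZ - F_FIRST_HZ) / (N_FREQUENCIES - 1)
max_time_s = 1.0 / delta_f_hz
```

Under these assumptions the spacing is 0.2 Hz, which corresponds to the 5 s record length and the 10 km maximum offset stated in the text.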

3.5.2 Validation of the plane wave predictions

In the previous section, I quantified the wavenumber coverage $k_z$ of a single source/receiver pair within the plane wave assumption, for a 1-D model (equation (3.10)). In this section, the gradient (descent direction) is generated using equation (2.86), with point source Green’s functions, and the results are tested against equation (3.10).

Figure 3.6 depicts the first stage in the numerical computation of the gradient at 5 Hz using equation (2.86) for 2 individual source-receiver pairs (at 3 km and 10 km offsets). The 5 Hz Green’s functions were calculated by the method of frequency-domain finite differences (Pratt and Worthington, 1990; Jo et al., 1996). The background model is as described above, i.e., as in Figure 3.5a), without the thin layer. The figure shows the real parts of the forward and back propagated wavefields and the gradient, the product $\mathrm{Re}\{P_f^{*}(x, s)\, P_b(x, r, s)\}$ (see equation (2.86)). Note that to compute the back-propagated wavefield $P_b(x, r, s)$, the data residuals, $\Delta\Psi(r, s)$, were first computed for the experiment by subtraction of the data generated with and without the thin layer.

Figure 3.5: Velocity model after Freudenreich and Singh, 2000. a) Velocity profile of the 1-D model, b) acquisition geometry of the 1-D basic imaging experiment: a single source and 201 receivers were used, c) synthetic, time-domain data from the model, using a Ricker wavelet centred on 4 Hz and with 400 ms of time delay.


Figure 3.6: Illustration of the computation of the 5 Hz gradient (equation (2.86)) for the model in Figure 3.5, using two source-receiver pairs, at offsets of 3 km (top row) and 10 km (bottom row). The midpoint for each offset pair is marked.



The fundamental properties of the image in 1-D media (noted previously) may also be observed in Figure 3.6 by examining the wavepaths at the midpoint locations:

1. At the midpoint location the gradient image contains only vertical wavenumbers.

2. Large offsets generate low vertical wavenumbers and near offsets generate high wavenumbers.

3. The wavenumber of the reconstruction increases with decreasing offset-to-depth ratio (i.e., the deeper in the model we go, the higher the wavenumber for a given offset).

A full reconstruction is provided by combining gradient images such as those in Figure 3.6 for all source-receiver pairs (i.e., all offsets). The reconstructed images for 2 Hz and for 5 Hz are illustrated in Figure 3.7a). Because of the equivalence between shot gathers and midpoint gathers for 1-D models, the images may be considered to be equivalent to a set of individual 1-D images (or common image point gathers), each obtained from a midpoint gather with a different half-offset, $h$. Each result is a 2-D image which manifests the expected decrease in vertical wavenumber as the offset increases.

Figure 3.7b) shows the Fourier transform of the gradient images in depth, with an overlay representing the predicted wavenumber coverage from equation (3.10). The comparison confirms that the plane wave prediction is an adequate representation of the wavenumber coverage in the far field and with a homogeneous background velocity model. This shows that the analytic equations developed within the plane wave assumption may be safely used in the development of the frequency selection strategy.
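The plane wave prediction that Figure 3.7b) is compared against can be evaluated numerically. The sketch below assumes that equation (3.10), which is not reproduced in this section, has the form k_z = 2 k_0 α with k_0 = f/c_0 (in cycles per km) and α = 1/sqrt(1 + (h/z)^2). The function name and the background velocity of 2 km/s are illustrative assumptions only.

```python
import math

def vertical_wavenumber(freq_hz, half_offset_km, depth_km, c0_km_s=2.0):
    """Plane wave vertical wavenumber in cycles/km (assumed form of eq. 3.10).

    k_z = 2 * (f / c0) * alpha,  alpha = 1 / sqrt(1 + (h / z)**2).
    c0_km_s = 2.0 is an illustrative value, not taken from the thesis.
    """
    alpha = 1.0 / math.sqrt(1.0 + (half_offset_km / depth_km) ** 2)
    return 2.0 * freq_hz / c0_km_s * alpha

if __name__ == "__main__":
    depth = 2.0  # km, depth of the anomaly in Figure 3.5a)
    for f in (2.0, 5.0):
        row = [vertical_wavenumber(f, h, depth) for h in range(0, 6)]
        print(f"{f} Hz:", [round(k, 2) for k in row])
```

With the assumed c_0 = 2 km/s, zero offset gives k_z = 2f/c_0, and the value falls as the half-offset grows, which is the trend described for Figure 3.7.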

3.5.3 A remark on “image stretch”

Equation (3.10) predicts a decrease in the vertical wavenumber content of the gradient image with offset; this effect is evident in Figure 3.7. This shift towards low wavenumbers with offset is referred to as “image stretch”. It has long been observed that the seismic image from large offsets is lower in resolution (Buchholtz, 1972; Dunkin and Levin, 1973): this effect is usually termed “Normal Moveout (NMO) stretch” (see for example Yilmaz (1987)). As shown in Appendix A, the stretch factor 1/α in equation (3.10) is identical to that caused by NMO stretch. If the processes of NMO correction, stacking and depth conversion are applied, NMO stretch will, inevitably, add low wavenumber content to the image in a manner identical to that observed in Figure 3.7. If these processes are replaced with prestack depth migration, the effect remains (Gardner et al., 1974; Levin, 1998). Usually this is considered detrimental, and data with a large offset-to-depth ratio are muted in standard data processing. An exception to this is the work of Haldorsen and Farmer (1989), who showed that NMO stretch may be used to compensate for a lack of low frequency content in the data spectrum in order to improve the spectral content of a stacked section.

Figure 3.7: a) A set of 1-D amplitude normalised gradient images at 2 Hz (top) and 5 Hz (bottom) as a function of half-offset, h, for a midpoint gather. The 1-D model used in this test is shown in Figure 3.5a); the location of the expected anomaly is at 2 km depth. The 1-D gradient image at a single frequency, for a single source-receiver pair, contains only a single vertical wavenumber, and hence is oscillatory. b) The vertical Fourier transform of the images in a). The white line is the vertical wavenumber predicted from plane wave theory in equation (3.10). The vertical wavenumber decreases with increasing half-offset in a manner exactly equivalent to NMO stretch. [Panels: a) Half Offset (km), 0 to 5, versus Depth (km), 0 to 4; b) Half Offset (km), 0 to 5, versus Wavenumber (1/km), 0 to 5.]
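The equivalence with NMO stretch can be checked in a few lines for a single horizontal reflector. This is a sketch under stated assumptions, not the derivation of Appendix A: it assumes a constant velocity c_0, a reflector at depth z, a full offset of 2h, and that α in equation (3.10) is the cosine of the incidence angle, α = z/\sqrt{h^2 + z^2}.

```latex
\begin{align*}
t^2(h) &= t_0^2 + \frac{4h^2}{c_0^2}, \qquad t_0 = \frac{2z}{c_0},\\
\frac{\mathrm{d}t_0}{\mathrm{d}t} &= \frac{t}{t_0}
  = \sqrt{1 + \frac{4h^2}{c_0^2 t_0^2}}
  = \sqrt{1 + \frac{h^2}{z^2}}
  = \frac{1}{\alpha}.
\end{align*}
```

Under these assumptions, a wavelet recorded at half-offset h is lengthened by the factor 1/α when mapped to zero-offset time; for h = z the stretch is \sqrt{2} ≈ 1.41.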

3.5.4 1-D waveform inversion

In this section, I describe the application of the frequency discretisation strategy developed above to the basic 1-D imaging experiment in Figure 3.5, using two survey geometries: (1) a near offset survey with a maximum offset of 3 km, and (2) a far offset survey with a maximum offset of 10 km. Given the depth of the bottom of the scatterer, z = 2.2 km, equation (3.18) gives the following sequence of frequencies for the two offset ranges:

Near offset survey, [0, 3] km: f = 2, 2.4, 2.9, 3.5, 4.3, 5.2, 6.3, 7.6, 8 Hz

Far offset survey, [0, 10] km: f = 2, 5, 8 Hz

Table 3.1: Frequency discretisation sequences for the near and far offset surveys. The last frequency, 8 Hz, is chosen arbitrarily so that both experiments use identical frequency ranges within the spectrum of the source signature.

The two discretisation sequences are illustrated in Figure 3.8, in which the continuity of the wavenumber coverage is evident. The near offset survey requires 9 frequencies, whereas the far offset survey requires only 3. It is implicit in the strategy that the larger the offset range, the fewer frequencies are required.
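The sequences in Table 3.1 can be reproduced with a short script. This is a minimal sketch that assumes equation (3.18), which is not reproduced in this section, has the form f_{n+1} = f_n / α_min with α_min = 1/sqrt(1 + (h_max/z)^2), where h_max is the maximum half-offset. The function name and the rule of appending the upper limit f_max as the last frequency are illustrative assumptions.

```python
import math

def frequency_sequence(f_min, f_max, max_offset_km, depth_km):
    """Frequencies f_{n+1} = f_n / alpha_min, capped at f_max (assumed eq. 3.18)."""
    half_offset = 0.5 * max_offset_km
    alpha_min = 1.0 / math.sqrt(1.0 + (half_offset / depth_km) ** 2)
    freqs = [f_min]
    while freqs[-1] / alpha_min < f_max:
        freqs.append(freqs[-1] / alpha_min)
    freqs.append(f_max)  # last frequency fixed at the top of the band
    return freqs

if __name__ == "__main__":
    near = frequency_sequence(2.0, 8.0, max_offset_km=3.0, depth_km=2.2)
    far = frequency_sequence(2.0, 8.0, max_offset_km=10.0, depth_km=2.2)
    print("near:", [round(f, 1) for f in near])
    print("far: ", [round(f, 1) for f in far])
```

Under the same assumptions, a maximum offset of 2 km with z = 2.2 km gives 16 frequencies between 2 and 8 Hz, which matches the count quoted in Section 3.5.5.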

The inversion results for each of the surveys are shown in Figure 3.9. In order to complete the imaging of the midpoints, the full set of common image point gradient images (shown in Figure 3.7) were stacked to form a single 1-D image at an equivalent midpoint location for each frequency. In selecting frequencies, I followed the sequence given in Table 3.1, one frequency at a time (i.e., “sequentially”), iterating 20 times at each discrete frequency. Figure 3.9 shows both the final results (top row) and the individual contributions of each frequency to the image (bottom row). This selection of frequencies, although sampled far more coarsely than the frequency domain sampling theorem would require, yields a continuous wavenumber coverage of the target without aliasing (see the wavenumber domain representations in Figure 3.9). Each imaging frequency for the far offset survey clearly yields a wider band of vertical wavenumbers than does the equivalent frequency for the near offset survey.

Figure 3.8: Frequency sequence generated by equation (3.18), for a) the near offset survey (0–3 km) and b) the far offset survey (0–10 km). The near offset survey requires 9 frequencies, whereas the far offset survey requires only 3. [Panels: Frequency (Hz), 0 to 10, versus Wavenumber (1/km), 0 to 10.]


Figure 3.9: Sequential frequency domain inversion results for the near offset survey (a-d), and the far offset survey (e-h). The true perturbation is shown as a dotted line. The top row shows the final results in both the depth domain (a,e) and the wavenumber domain (b,f), while the bottom row shows the contributions at each discrete frequency, both in the depth domain (c,g) and in the wavenumber domain (d,h). [Panels: “Near Offset Range” and “Far Offset Range”, each with “Final Result” and “Contribution of each frequency”; depth domain axes are Velocity perturbation (m/s) versus Depth (km); wavenumber domain (FT) axes are Amplitude versus Wavenumber (1/km).]


The far offset survey is also more successful in recovering the magnitude of the velocity perturbation. This is because the inversion of the first frequency (2 Hz) from the far offset survey yields information on much lower wavenumbers than does the 2 Hz result from the near offset survey. This is fundamental, and it is due to the image stretching remarked on above. The low wavenumbers are equally important in recovering the true velocity perturbation.

Time domain waveform inversion can be simulated using the frequency domain approach by inverting all frequency components within the data simultaneously, rather than sequentially. By “all” frequencies, I mean a complete sampling according to the sampling theorem (3.20). The gradient for this simultaneous, “time-like” inversion procedure is the sum of the single-frequency gradients at each iteration. Figure 3.10 shows the results of the inversion for the same offset ranges using time-like inversion (i.e., inverting for 50 frequencies simultaneously). The far offset survey, once more, does a better job of recovering the magnitude of the velocity perturbation (as in frequency domain inversion, and for the same reasons). Comparing the sequential, frequency domain inversions and the time-like inversions, we note that performing the waveform inversion with a limited number of frequencies according to the strategy yields an acceptable result at a much smaller computational cost, i.e., a factor of 5 fewer frequencies for the near offset range (9 instead of 50), and a factor of 16 fewer frequencies for the far offset range (3 instead of 50).

3.5.5 Sensitivity to noise

The strategy developed here suppresses the redundancy in the wavenumber coverage of the 1-D target by reducing the number of frequencies below that given by the sampling theorem. However, some of this redundancy is desirable when noise is present in the seismic data, because it ensures the destructive summation of incoherent signal. In order to investigate the sensitivity of the frequency selection strategy to noise, the 1-D inversion was carried out after adding white noise to the data using the program suaddnoise of the Seismic Un*x package (Stockwell, 1997). The reflection hyperbola of the 1-D target with a very high level of noise (signal to noise ratio of 1) is shown in Figure 3.11b and can be compared with the original reflection in Figure 3.11a. The 1-D sequential inversion of the near offset data (0-3 km) using the 9 frequencies dictated by the strategy is shown in Figure 3.11c. Since this experiment was carried out by inverting a single frequency at a time, the reconstruction is highly sensitive to noise and yields a poor result. Although the level of noise here is very high, this raises the important question of the efficiency of single frequency waveform inversion in the presence of noise. Figure 3.11d shows that when the same 9 frequencies are inverted simultaneously, the reconstruction is much improved. The reconstructed target contains more noise than the equivalent time domain inversion using 50 frequencies (Figure 3.11f). The noise can nevertheless be attenuated by restoring some of the redundancy, using 16 frequencies in the inversion, as shown in Figure 3.11e. The sequence of 16 frequencies was generated using a maximum offset of 2 km in the calculation of α_min (equation (3.13)).

Despite the high level of noise present in the data, the simultaneous inversion using 9 frequencies does yield a good image. The frequency decimation suppresses some of the redundancy, but the stack of each offset contribution to the image nevertheless allows destructive summation of some of the random noise.

Figure 3.10: Time-like waveform inversion of the near offset range (a,b) and the far offset range (c,d). The true perturbation is shown as a dotted line. 50 frequencies were inverted simultaneously in order to simulate time domain waveform inversion. Results are shown in both the depth domain (a,c) and the wavenumber domain (b,d). [Panels: Velocity perturbation (m/s) versus Depth (km); Amplitude versus Wavenumber (1/km).]

Figure 3.11: Impact of noise on the reconstruction of the 1-D target for an inversion of the near offset data (0-3 km). a) Original reflection hyperbola on the 1-D target. b) Reflection hyperbola with white noise (signal to noise ratio of 1). c) Sequential inversion using 9 frequencies according to the selection strategy, as in Figure 3.9. d) Simultaneous inversion using the same 9 frequencies. e) Simultaneous inversion using 16 frequencies obtained with an effective maximum offset of 2 km. f) Time-like (simultaneous) inversion using 50 frequencies. [Panels: a,b) Offset (km) versus Time (s); c-f) Velocity perturbation (m/s) versus Depth (km).]
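The role of redundancy in suppressing incoherent noise can be illustrated with a toy stacking experiment. This is a minimal sketch, not the suaddnoise experiment of Figure 3.11: the unit-amplitude signal, the Gaussian noise model, the sample count and the function name are all illustrative assumptions. It shows that averaging N independent noisy copies of a signal reduces the noise standard deviation roughly as 1/sqrt(N).

```python
import math
import random

def stacked_noise_std(n_copies, n_samples=2000, noise_std=1.0, seed=0):
    """Standard deviation of the noise left after averaging n_copies traces.

    Each trace is signal + independent Gaussian noise; the signal is
    identical in every copy, so only the noise changes in the stack.
    """
    rng = random.Random(seed)
    residual = []
    for _ in range(n_samples):
        signal = 1.0  # coherent part, the same in every copy
        stack = sum(signal + rng.gauss(0.0, noise_std)
                    for _ in range(n_copies)) / n_copies
        residual.append(stack - signal)
    mean = sum(residual) / n_samples
    return math.sqrt(sum((r - mean) ** 2 for r in residual) / n_samples)

if __name__ == "__main__":
    for n in (1, 9, 16, 50):
        print(n, round(stacked_noise_std(n), 3), round(1 / math.sqrt(n), 3))
```

The counts 9, 16 and 50 echo the numbers of frequencies used in Figure 3.11, but the analogy is only qualitative: in the inversion the summation runs over offsets and frequencies rather than over identical copies of one trace.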



3.6 Numerical test II: 2-D models

This strategy explicitly ensures the continuity of the wavenumber coverage only along the k_z axis, and only for a 1-D target in a homogeneous medium. The applicability of the strategy to 2-D, heterogeneous velocity models is therefore unknown. In the 2-D case, other directions in the wavenumber spectrum of the model must be recovered. In this section, the strategy is tested using a more demanding, 2-D application.

3.6.1 Marmousi velocity model

In order to evaluate the applicability of the strategy to 2-D models, the approach is tested with synthetic seismic data from the 2-D Marmousi velocity model (Versteeg, 1994). Waveform inversion has already been applied to data from the Marmousi model using both a time domain approach (Bunks et al., 1995) and a frequency domain approach (Forgues et al., 1998). Forgues et al. (1998) performed frequency domain waveform inversion with a starting velocity model which was a smoothed version of the true model. They also used an arbitrary selection of imaging frequencies. These two over-simplifications are avoided in the test described below.

3.6.1.1 Synthetic data

A key conclusion of this analysis is that large offsets are important if we want to recover the lowest wavenumbers in the model (which, in turn, allow us to recover the true magnitude of the velocity anomalies). The original Marmousi experiment was limited in offset to only 3 km. The seismic survey in the Marmousi model was therefore modified and remodelled to create new, wide-angle data. The velocity model was restricted to 2 km in depth to ensure the efficient illumination of the model at large offset-to-depth ratios by the diving and refracted waves (Figure 3.12). These arrivals will be needed for the determination of the starting model of the waveform inversion (see below). The new synthetic survey comprises 96 sources (every 96 meters) and 384 receivers (every 24 meters), with a maximum offset of 9.2 km (simulating the geometry of an Ocean Bottom Cable (OBC)). The synthetic data shown in Figure 3.12 were created using frequency domain finite differences; time domain data were obtained from 122 frequencies in the range [0.12, 15] Hz by Fourier synthesis. A Ricker source wavelet with a spectrum centred on 5.5 Hz was used in the simulation. The forward modelling and the inversions were all performed with a free surface boundary at the top of the model; the simulations therefore include the effect of free surface multiples.

Figure 3.12: Marmousi wide angle synthetic data. a) The velocity model, unchanged from the original Marmousi model but truncated at 2 km depth; b) three representative wide angle shot gathers. The acquisition was expanded to wide offsets by recording data for all shots at all 384 receiver points along the surface (simulating an OBC recording geometry). [Panels: a) V (m/s), 1500 to 3500, Distance (km) versus Depth (km); b) Offset (km) versus Time (s) for Shot 1 at 48 m, Shot 48 at 4560 m and Shot 96 at 9168 m.]
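The acquisition numbers quoted for this survey can be cross-checked with a few lines of arithmetic. The sketch below assumes that shot n sits at 48 + 96 (n − 1) m, which reproduces the three shot positions labelled in Figure 3.12, and that the 122 frequencies are uniformly spaced multiples of Δf = 15/122 Hz. The uniform spacing and the placement of the first receiver at 0 m are assumptions made for illustration, not details stated in the text.

```python
def shot_position_m(shot_number, first_shot_m=48.0, spacing_m=96.0):
    """Position of a shot along the line, numbering shots from 1."""
    return first_shot_m + spacing_m * (shot_number - 1)

def receiver_positions_m(n_receivers=384, spacing_m=24.0, first_m=0.0):
    """Receiver positions; first_m = 0 is an assumed origin."""
    return [first_m + spacing_m * i for i in range(n_receivers)]

def frequency_axis_hz(n_freq=122, f_max=15.0):
    """Uniform frequency samples k * df, k = 1..n_freq (assumed layout)."""
    df = f_max / n_freq
    return [df * k for k in range(1, n_freq + 1)]

if __name__ == "__main__":
    receivers = receiver_positions_m()
    shots = [shot_position_m(n) for n in (1, 48, 96)]
    max_offset = max(abs(r - s) for s in shots for r in receivers)
    freqs = frequency_axis_hz()
    print("shots (m):", shots)
    print("maximum offset (km):", max_offset / 1000.0)
    print("df (Hz):", round(freqs[0], 3), "window (s):", round(1 / freqs[0], 2))
```

Under these assumptions the lowest frequency rounds to the quoted 0.12 Hz and the largest source-receiver separation is close to the quoted 9.2 km.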

3.6.2 Determination of the starting model by traveltime inversion

In order to apply waveform inversion to the wide angle Marmousi data set, a starting velocity model is required that lies in the neighbourhood of the global minimum of the objective function for the starting frequency. A suitable starting model may be obtained by combining waveform inversion with ray-based traveltime inversion (Pratt and Goulty, 1991). I used traveltime inversion of the hand-picked first arrivals, using the program “FAST” (Zelt and Barton, 1998).


Figure 3.13: Example of the finite difference solver of the eikonal equation in a smoothed Marmousi model for the 3 shot points (1, 48 and 96). The ray tracing uses the negative time gradient to define the rays connecting the receivers to the source location. [Panels: “Eikonal” and “Ray Tracing” for Shot 1, Shot 48 and Shot 96; axes are Distance (km), 0 to 9, versus Depth (km), 0 to 2.]

The forward problem

The FAST software uses a finite difference solver of the eikonal equation to forward model the first arrival times (Vidale, 1990; Podvin and Lecomte, 1991). Figure 3.13 shows the forward modelling of first arrivals in a smoothed Marmousi model. The finite difference method yields the first arrival time at each node of the model, thus allowing the determination of isochrones. Each source-receiver ray is traced from the receiver location back to the source location, following the path given by the negative time gradient (normal to the isochrones).
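The back-tracing of rays along the negative time gradient can be sketched as follows. This is an illustrative toy under stated assumptions, not the FAST implementation: the traveltime field is the analytic solution for a homogeneous medium (standing in for the finite difference eikonal solution), the gradient is taken by central differences, and the function names, step length and velocity are hypothetical.

```python
import math

def traveltime(x, z, src=(0.0, 0.0), velocity=2.0):
    """First arrival time (s) in a homogeneous medium; stand-in for an eikonal solver."""
    return math.hypot(x - src[0], z - src[1]) / velocity

def trace_ray(receiver, src=(0.0, 0.0), step=0.01, eps=1e-4, max_steps=100000):
    """Follow the negative traveltime gradient from the receiver to the source."""
    x, z = receiver
    path = [(x, z)]
    for _ in range(max_steps):
        if math.hypot(x - src[0], z - src[1]) <= step:
            path.append(src)  # close enough: snap to the source
            break
        # Central difference approximation of the traveltime gradient.
        gx = (traveltime(x + eps, z, src) - traveltime(x - eps, z, src)) / (2 * eps)
        gz = (traveltime(x, z + eps, src) - traveltime(x, z - eps, src)) / (2 * eps)
        norm = math.hypot(gx, gz)
        x -= step * gx / norm  # step against the gradient,
        z -= step * gz / norm  # i.e. normal to the isochrones
        path.append((x, z))
    return path

if __name__ == "__main__":
    ray = trace_ray(receiver=(3.0, 1.0))
    length = sum(math.dist(p, q) for p, q in zip(ray, ray[1:]))
    print("points:", len(ray), "ray length (km):", round(length, 3))
```

In a homogeneous medium the isochrones are circles about the source, so the traced ray is a straight line; in a heterogeneous model the same loop would bend with the traveltime field supplied by the eikonal solver.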

What information should we expect from traveltime inversion?

The eikonal equation is based on the high-frequency asymptotic approximation of the wave equation. This approximation is valid if the medium varies slowly with respect to the propagated wavelength. In order to investigate the type of velocity model that should be expected from the traveltime inversion, I carried out finite difference traveltime modelling with the eikonal equation in increasingly smoothed versions of the true Marmousi model. The smoothing was carried out in the vertical and horizontal directions using the program smooth2 of the Seismic Un*x (SU) package. For each smoothed model, the traveltimes are compared with the hand-picked first arrivals of the waveform data. The evolution of the traveltime misfit function with respect to the degree of smoothness of the true Marmousi model is shown in Figure 3.14. The true model (not smoothed) corresponds to a smoothing factor of 0. The misfit function decreases with smoothness until it reaches a minimum for a smoothing factor of 36, and then increases again. This is a critical observation, as the true model is not the model for which the misfit function is minimum. This leads us to a conclusion that is consistent with the asymptotic approximation: the global minimum of the traveltime misfit function does not correspond to the true model but rather to a smoothed version of it.

Traveltime inversion

The traveltime inversion was first carried out to estimate the 1-D medium that best fits the data, followed by solving for a strongly smoothed 2-D model, and then progressively less smoothed models. The final traveltime result is a relatively smooth, 2-D velocity structure that fits the picked times with an RMS residual of 27 ms (Figure 3.15a,b), a reasonable level considering the 200 ms period of the central frequency of the source. Finite difference modelling in this result produces seismic traces that closely match the first arrival waveforms, but contain virtually no reflected waves (Figure 3.15d).

3.6.3 Waveform inversion

The 2-D, smooth traveltime inversion velocity model in Figure 3.15b) was used as a starting model for the subsequent waveform inversion. The frequency sequence in the inversion was calculated using equation (3.18), with a target depth of z = 2 km (i.e., the maximum depth in the model), and a maximum offset x_max = 4 km. Although the maximum offset is in reality 9.2 km, the sequence was calculated with this “effective” offset range because the far offsets are under-represented in the OBC shooting geometry, and hence carry a disproportionately small weight in the inversion. Furthermore, the edges of the model are not illuminated by the widest offsets, especially at depth. Using these parameters, the frequency selection strategy yields three frequencies within the bandwidth of the data: 5, 7 and 10 Hz. Twenty iterations were carried out at each imaging frequency, sequentially.
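The three frequencies follow from simple arithmetic if equation (3.18), which is not reproduced in this section, is assumed to take the form f_{n+1} = f_n/α_min with α_min = 1/\sqrt{1 + (h_max/z)^2} and h_max = x_max/2. This is a sketch under that assumption; the starting frequency of 5 Hz is taken from the quoted sequence.

```latex
\begin{align*}
h_{\max} &= \tfrac{1}{2}x_{\max} = 2~\text{km}, \qquad
\alpha_{\min} = \frac{1}{\sqrt{1 + (h_{\max}/z)^2}}
             = \frac{1}{\sqrt{1 + (2/2)^2}}
             = \frac{1}{\sqrt{2}} \approx 0.707,\\
f_1 &= 5~\text{Hz}, \qquad
f_2 = \frac{f_1}{\alpha_{\min}} = 5\sqrt{2} \approx 7.1~\text{Hz}, \qquad
f_3 = \frac{f_2}{\alpha_{\min}} = 10~\text{Hz}.
\end{align*}
```

After rounding, these values agree with the quoted sequence of 5, 7 and 10 Hz.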

Figure 3.16 shows the contribution of each frequency, and the inversion results are shown in Figure 3.17. As expected, the resolution of the image improves as higher frequencies are used in the inversion. The final velocity model (following the use of the 10 Hz data) is very close to the true model. Except at the very edges of the model, where the coverage is incomplete, the full complexity of the 2-D structure is recovered. Figure 3.18 shows three representative velocity profiles within the model, and compares these to the true model and the starting model from traveltime inversion. From these it may be concluded that not only are the structures correctly imaged, but also that the magnitudes of the velocity perturbations are accurately estimated. In spite of the oscillation of the individual contributions to the image at intermediate frequencies (Figure 3.16), there is no evidence of any oscillatory artifacts (“ringing”) in these results.

Figure 3.14: a) Evolution of the traveltime misfit function with respect to increasing degree of smoothness of the true Marmousi model. b) The smoothed model for which the misfit function is minimum (smoothing factor of 36). [Panels: a) RMS residuals versus Smoothing Factor, 0 to 80; b) V (m/s), 1500 to 3500, Distance (km) versus Depth (km).]

Figure 3.15: First arrival traveltime inversion showing a) the traveltime residuals versus offset, b) the final traveltime inversion velocity model, and c) examples of ray tracing in the final model for shots 1, 48 and 96. This velocity model fits the first arrival picks to an RMS level of 27 ms. d) Finite difference seismic traces (normalised) in this velocity model are also shown, for the 3 shot positions. [Panels: a) Initial residuals and Final residuals; b) Velocity Model; c) Ray Tracing; d) Waveform Modeling, Offset (km) versus Time (s).]

Figure 3.16: Waveform inversion in Marmousi: velocity perturbation introduced following the completion of each frequency. [Panels: At 5 Hz, At 7 Hz, At 10 Hz; V (m/s), -250 to 250, Distance (km) versus Depth (km).]


3.6. NUMERICAL TEST II: 2-D MODELS 83<br />

[Figure 3.17 panels: velocity V (m/s), colour scale 1500 to 3500, plotted as Depth (km), 0 to 2.0, versus Distance (km), 0 to 9: After 5 Hz; After 5 and 7 Hz; After 5, 7 and 10 Hz; True Model.]<br />

Figure 3.17: Waveform inversion results in the Marmousi model at progressively higher frequencies.



[Figure 3.18 panels: Depth (m), 0 to 2000, versus Velocity (m/s), 1000 to 4000: At 2.3 km; At 4.5 km; At 6.9 km.]<br />

Figure 3.18: Velocity profiles at three locations in the Marmousi model (see Figure 3.17), showing<br />

the true model (grey), the starting model (dotted) and the result of the waveform inversion<br />

(solid).



An essential aspect of any inversion is an illustration of the degree to which the result actually<br />

predicts the observations. Figure 3.19 shows the data residuals for the same shot gathers<br />

shown in Figures 3.12 and 3.15, computed in the starting model and in the final velocity model<br />

(after 10 Hz). Although only 3 frequencies were used for the waveform inversion, the waveforms<br />

in the time domain (created using all 122 frequencies) show an excellent fit with the true<br />

waveforms. Some misfit remains at the edges of the model (near offsets of shots 1 and 96) due to<br />

the lack of coverage there.<br />

3.7 Discussion<br />

3.7.1 Efficiency of the strategy for 2-D heterogeneous media<br />

I have shown in this chapter that the frequency selection strategy defined yields very accurate<br />

results in both 1-D and 2-D heterogeneous media, despite the fact that the strategy was developed<br />

using the simplest case of a 1-D target embedded in a homogeneous model. The strategy<br />

assumes 1-D illumination by straight rays only, in which case the incident and scattering angles<br />

yield information only on the vertical wavenumber component of the target. In a more realistic 2-D, heterogeneous<br />

medium, other wavenumber directions must be recovered. The success of the strategy<br />

when applied to more complex models (such as the Marmousi model in the previous section)<br />

may be explained by the nature of the background velocity model. A more complex model will<br />

produce additional coverage due to two distinct effects:<br />

1. Non symmetric illumination: In complex, 2-D models the incident and scattering angles<br />

are different. Energy reaches the receivers from non-horizontal structures (containing<br />

non-vertical wavenumber components). Hence the inversion will recover additional<br />

wavenumbers away from the vertical axis.<br />

2. Ray bending: In models with an increase in velocity with depth, the incident and scattering<br />

angles are wider than those predicted in a homogeneous model due to the progressive<br />

curvature of ray paths toward the horizontal axis with depth. Wider scattering angles<br />

provide lower wavenumber information than that predicted in a homogeneous medium;<br />

at very wide angles, diving waves exist, which are scattered from the very low wavenumber<br />

components of the velocity model (i.e., they tell us about the background velocities).



[Figure 3.19 panels: Initial Residuals (left) and Final Residuals (right) for Shot 1, Shot 48 and Shot 96, plotted as Time (s), 0 to 7, versus Offset (km).]<br />

Figure 3.19: Data residuals in the Marmousi model, in the starting model (left column) and<br />

in the final model of the waveform inversion (right column) for shots 1, 48 and 96. Despite<br />

using only 3 frequencies in the waveform inversion, the modelling using 122 frequencies shows<br />

a good fit between the real and estimated data. This is a true amplitude display; the colour scales<br />

of the initial and final residuals are identical.



The application of the frequency selection strategy will thus in fact produce some redundancy<br />

in wavenumber coverage, since the wavenumber coverage of a single frequency is<br />

actually wider than the one predicted.<br />

These effects together seem to permit the effective 2-D coverage of the model observed in<br />

the Marmousi experiment, even with a limited number of frequencies selected according to a<br />

strategy based on 1-D, homogeneous models.<br />
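The link between scattering angle and wavenumber discussed in this section can be sketched numerically. The example below is a minimal illustration, assuming the usual homogeneous-background, 1-D geometry in which a source-receiver pair at half offset h illuminates a horizontal reflector at depth z with the vertical wavenumber k_z = (4 pi f / c0) z / sqrt(h^2 + z^2). This form, the function name and the numerical values are assumptions made for illustration and may differ in notation from the equations of this chapter.<br />

```python
import math


def vertical_wavenumber(freq_hz, c0, half_offset, depth):
    """Vertical wavenumber (rad/m) imaged by one source-receiver pair.

    Assumes a homogeneous background of velocity c0 (m/s) and a horizontal
    reflector at `depth` (m) below the midpoint; `half_offset` is in metres.
    """
    # Cosine of half the scattering angle for straight rays.
    cos_half_angle = depth / math.hypot(half_offset, depth)
    return 4.0 * math.pi * freq_hz / c0 * cos_half_angle


# Illustrative values only (not taken from the experiments in this chapter).
freq_hz, c0, depth = 5.0, 2000.0, 2000.0
for half_offset in (0.0, 1000.0, 2000.0, 4000.0):
    kz = vertical_wavenumber(freq_hz, c0, half_offset, depth)
    print(f"h = {half_offset:6.0f} m  ->  kz = {kz:.5f} rad/m")
```

In this sketch the wavenumber decreases monotonically with offset, so any mechanism that widens the scattering angle (such as ray bending) pushes the coverage of a single frequency towards lower wavenumbers.<br />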

3.7.2 Practical implementation of the frequency selection strategy<br />

My strategy seeks to suppress redundancy of information by taking advantage of the wavenumber<br />

coverage inherent in multi-offset surveys. Some redundancy remains when the strategy is<br />

applied in heterogeneous media; in some cases one may also wish to further increase this redundancy.<br />

For example, in the presence of non-coherent noise, redundancy will allow the destructive summation<br />

of unwanted signal. Furthermore, since the far offsets are more sensitive to errors in the<br />

background velocity model and therefore more non-linear (due to longer propagation distances),<br />

it may be preferable to decrease the dependence of the imaging on the far offset information.<br />

This can be easily done by decreasing the effective offset to depth ratio $R_{max}$ used in equation<br />

(3.13) by using a half offset $h_{max}$ that is smaller than the true maximum value (as was done<br />

in the Marmousi model). This will likely improve the quality of the imaging, particularly in<br />

the case of significant dip on the reflectors (in which case the scattering angles are somewhat<br />

smaller than for horizontal reflectors).<br />
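The effect of reducing the effective offset to depth ratio can be illustrated with a short sketch. It assumes that the selection rule takes the form f_{n+1} = f_n / alpha_min, with alpha_min = 1 / sqrt(1 + R_max^2) and R_max = h_max / z; this form is an assumption here and should be checked against equation (3.13). The function name and the frequency band are hypothetical.<br />

```python
import math


def select_frequencies(f_min, f_max, r_max):
    """Return the sequence f_min, f_min/alpha, f_min/alpha**2, ... up to f_max.

    r_max is the effective half-offset-to-depth ratio h_max / z, and
    alpha = 1 / sqrt(1 + r_max**2) is the assumed scaling between the
    maximum and minimum vertical wavenumbers covered by one frequency.
    """
    if r_max <= 0.0:
        raise ValueError("r_max must be positive, otherwise the sequence never advances")
    alpha_min = 1.0 / math.sqrt(1.0 + r_max ** 2)
    freqs = [f_min]
    while freqs[-1] / alpha_min <= f_max:
        freqs.append(freqs[-1] / alpha_min)
    return freqs


# Illustrative comparison: a smaller effective R_max gives a denser frequency set.
for r_max in (2.0, 1.0, 0.5):
    freqs = select_frequencies(5.0, 24.0, r_max)
    print(r_max, [round(f, 2) for f in freqs])
```

Under these assumptions, R_max = 1 gives a ratio of sqrt(2) between successive frequencies, and lowering R_max trades computational cost for additional redundancy in wavenumber coverage.<br />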

3.7.3 The equivalence between gradient images and migration<br />

Many authors have identified the kinematic equivalence between the first iteration gradient image<br />

and prestack migration (e.g., Tarantola, 1986; Mora, 1989). Migration maps the data, after<br />

muting of the first arrivals, to “isochrones” in the model space (Miller et al., 1987), whereas the<br />

gradient maps the data residuals (which may contain transmitted as well as reflected arrivals)<br />

to the wavepath (see Figure 3.6). Transmitted events map within the first Fresnel zones of the<br />

wavepath, while reflected events map to the higher order Fresnel zones, which are elliptical<br />

zones corresponding to the isochrones used in migration (Woodward, 1992).<br />

In the first iteration of a waveform inversion scheme the starting model is normally a highly<br />

smoothed model, as in the numerical test of the Marmousi model above. Such a model will generate<br />

accurate transmitted arrivals but will generate no reflected energy (as in Figure 3.15). As<br />

a result, the first iteration data residuals will be dominated by reflections. In the numerical test<br />

using the 1-D model above, the initial data residuals consisted only of reflected energy from the<br />

target at 2 km depth. Under these conditions the first iteration image is kinematically equivalent<br />

to a migration of the data. The gradient is not a particularly well-designed migration operator,<br />

since there are no focusing terms or amplitude terms to act as preconditioners; nevertheless<br />

the correspondence is significant, and frequency domain migration algorithms could potentially<br />

make use of the strategy developed in this chapter.<br />

This correspondence between gradient images and migration operators implies that the spectral<br />

properties of the gradient image noted in this chapter also apply to migration. Multi-offset<br />

migration also contains an “image stretch” effect (Gardner et al., 1974; Tygel et al., 1994) that<br />

is normally considered detrimental (Brown, 1994; Levin, 1998). The meaning commonly given<br />

to migration is the recovery of discontinuities in the subsurface, so that the suppression of low<br />

wavenumbers may even be considered desirable. It was shown in this chapter that the image<br />

stretch, when properly handled, will in fact contribute the required low wavenumbers to the<br />

imaged velocity structure.<br />

However, the successful application of the strategy to prestack migration would depend on<br />

the accurate combination of multi-offset images. This implies that the contribution of each<br />

offset to the image must carry true amplitude information. The result of a simple stack of the<br />

multi-offset images will fail to recover the true velocity perturbations if the images contain<br />

inaccurate relative amplitudes. This is also true for the simple gradient in waveform inversion,<br />

but the iterative implementation of the descent methods corrects for the inaccuracies in the stack.<br />

Migration is usually formulated as a one-step process, without iteration, in which case effects<br />

such as spherical divergence and the number of input traces contributing to each offset image<br />

must be explicitly compensated for. In order to use this approach to frequency selection for<br />

prestack depth migration, the required migration operators need to at least preserve amplitudes<br />

during the combination of multiple offset images (e.g., Schleicher et al., 1993).<br />

As an alternative, since the implementation of the strategy implies a significant reduction<br />

of the computational cost of the imaging, iterative migration/inversion techniques may become<br />

more feasible.



3.8 Conclusion<br />

I have defined a strategy for selecting temporal frequencies for efficient waveform inversion.<br />

This selection scheme defines a much sparser set of frequencies than does the frequency domain<br />

sampling theorem. This is possible because this approach makes use of the multi-offset<br />

coverage of each reflection point and optimises the use of information contributed by each<br />

source-receiver pair. By taking advantage of the image stretch occurring with increasing offset,<br />

the strategy adapts the selection of frequencies to the maximum offset present in the data. The<br />

idea is simple: the larger this maximum offset is, the fewer frequencies are needed. In deriving<br />

the strategy, I made the approximations of a 1-D target in a homogeneous background medium,<br />

illuminated by plane waves. I nevertheless show that this strategy is also very efficient when<br />

applied to data from 2-D heterogeneous media.<br />

Waveform inversion was used in order to illustrate numerically the accuracy of the method.<br />

I showed that, for a 1-D target in a homogeneous medium, using only the frequencies defined<br />

by my strategy produces a result comparable with the equivalent wide band, time domain inversion<br />

(using the full set of frequencies as defined by the sampling theorem).<br />

I also used the 2-D Marmousi velocity model to test the strategy in a more realistic, heterogeneous<br />

medium. This experiment yielded very good results, in which the full range of<br />

wavenumbers of the velocity model was recovered, demonstrating the imaging potential of parameterising<br />

the full wavefield using only a limited set of frequencies. A traveltime tomographic<br />

inversion of the first arrivals was used in order to create a starting model for the waveform inversion<br />

that fell within the neighbourhood of the global minimum of the misfit function at the<br />

minimum frequency used. This minimum frequency (5 Hz) was chosen so that the resolution<br />

limit imposed by traveltime tomography was compatible with the linearity requirement of the<br />

waveform inversion. This frequency is unrealistically low for real data, and in the following<br />

chapter I will investigate waveform inversion starting from a higher frequency.<br />

Because of the kinematic equivalence between gradient images in waveform inversion and<br />

prestack depth migration, the strategy can be easily applied in practice. Large offsets in the<br />

acquisition are not required in order to take advantage of the effect of stretch. Implementation<br />

of the strategy on standard reflection profiles with offsets limited to less than 3 km still allows<br />

a significant decimation of the number of required frequencies. However, in order to make<br />

use of the information arising from all offsets, amplitude preserved migration operators will be<br />

required to ensure that the relative amplitudes of the common offset images are accurate.




Chapter 4<br />

Waveform inversion starting from realistic<br />

frequencies<br />

4.1 Introduction<br />

The previous chapter demonstrated that waveform inversion may be carried out in the frequency<br />

domain using only a very limited number of temporal frequencies. The underlying empirical<br />

assumption is that if the inversion of the first frequency ($f_{min}$) yields successful convergence<br />

to the global minimum, the inversion of higher frequencies should then also be successful.<br />

The 2-D Marmousi numerical experiment of Chapter 3 explored the possibility that a suitable,<br />

smooth starting model may be determined from first arrival traveltime tomography. The fundamental<br />

challenge of waveform inversion is therefore to ensure that the starting velocity model<br />

is located in the neighbourhood of the global minimum of the misfit function when using the<br />

smallest frequency available in the data. The Marmousi experiment showed that a smooth starting<br />

model from traveltime tomography is clearly adequate for a waveform inversion starting<br />

at $f_{min} = 5$ Hz. Unfortunately, such a frequency is considered unrealistically low since real<br />

exploration seismic data contain higher minimum frequencies (> 7 Hz). Because non-linearity<br />

rapidly increases with increasing frequency, waveform inversion at higher frequencies presents<br />

more dramatic non-linearity effects, which are more likely to cause convergence into a local<br />

minimum.<br />

Assuming that the starting model is smooth, the goals of waveform inversion are to: 1-<br />

correct velocity errors, and 2- “fill in” the missing wavenumber information. It is often considered<br />

that the effect of non-linearity is caused by velocity errors contained in the macro model.<br />

This argument is however a simplification, as the lack of intermediate and high wavenumber<br />

information may also be a source of non-linearity despite the fact that the low wavenumber<br />

components are accurate. The low and intermediate wavenumbers need to be recovered using<br />

the minimum frequency available. At the same time, local minima must be avoided. Jannane<br />

et al. (1989) showed that for near offset surface seismic acquisition, seismic data with a typical<br />

bandwidth were not sensitive to the intermediate wavenumbers. The imaging potential of<br />

near offset data is thus limited to a determination of the low wavenumbers (referred to as the<br />

macro model), using traveltime methods (tomography, stacking velocity analysis, etc.) and<br />

the high wavenumbers (reflectivity), using prestack depth migration. The use of wide-angle<br />

(large offset) seismic data in waveform inversion has hence been considered (Mora, 1989; Sun<br />

and McMechan, 1992; Pratt et al., 1996; Shipp and Singh, 2002), as wide-angle data contain<br />

information from the lower wavenumbers of the model (see chapter 3).<br />

In this chapter, I will focus on the application of waveform inversion at 7 Hz for the recovery<br />

of the continuous wavenumber information of a large offset seismic survey. The frequency of 7<br />

Hz presents a great increase in the non-linearity of the waveform inverse problem compared<br />

to the frequency of 5 Hz. The issue of local minima is therefore expected to be more critical,<br />

and standard gradient methods are likely to be ineffective. Therefore, a strategy is required<br />

that improves the chance of convergence to the global minimum. The strategy proposed in this<br />

chapter relies on the preconditioning of both the gradient vector and the data residuals.<br />

As the derivation of such a methodology requires a better understanding of the behaviour of<br />

waveform inversion, I will first carry out a singular value decomposition (SVD) of the single<br />

frequency Fréchet derivative matrix for simple 1-D models. This study will allow us to identify<br />

the portion of the model space carrying the highest singular values, which therefore has the<br />

strongest influence on the gradient direction in the first iterations of the inversion process. The<br />

strong non-linearity of the forward problem will then be clarified by identifying the components<br />

of the model that have the most non-linear relation with the data. The results of both the SVD<br />

and non-linearity studies will lead to the definition of a strategy for improving the convergence<br />

of the inversion. The strategy relies on the preconditioning of both the gradient vector and the<br />

data residuals. This approach, applied to an inversion at 7 Hz of a 1-D heterogeneous velocity<br />

model, will show that the global minimum may be found more effectively. Finally, the strategy<br />

will be applied to an extended version of the 2-D Marmousi velocity model.



4.2 SVD of the Fréchet derivative matrix<br />

The SVD technique has been used for geophysical problems in the context of tomography<br />

(White, 1989; Pratt and Chapman, 1992; Stork, 1992b; Michelena, 1993) and waveform inversion<br />

(Woodward, 1989; Lebrun et al., 2001). I propose in this section to apply SVD to the<br />

Fréchet derivative matrix at a single frequency. The identification of the model eigenvectors<br />

and their associated singular values will allow us to determine the nature of the model space<br />

that will first be updated during the inversion. As shown in chapter 2, the gradient is the result<br />

of the projection of the true model perturbation on the nonzero eigenvectors, multiplied by the<br />

square of the singular values. Therefore, at the early stage of the inversion, the gradient will be<br />

dominated by the reconstruction of the component of the model corresponding to the highest<br />

singular values. The eigenvectors with lower singular values will carry a smaller weight and<br />

their convergence rate will be slower; the eigenvectors with near zero singular values will not<br />

be recovered.<br />
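The statement that the gradient is the projection of the true model perturbation on the eigenvectors, weighted by the squared singular values, can be checked numerically in the linear case. The sketch below uses a random complex matrix as a stand-in for the Fréchet matrix (an assumption made purely for illustration, not the matrix studied in this chapter) and compares the directly computed gradient with its SVD expression.<br />

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical complex Frechet matrix: 40 data values, 12 model parameters.
n_data, n_model = 40, 12
F = rng.standard_normal((n_data, n_model)) + 1j * rng.standard_normal((n_data, n_model))
dm_true = rng.standard_normal(n_model)  # "true" model perturbation (illustrative)

# Linearised data residual, then gradient as Re{F^H (F dm)}.
residual = F @ dm_true
gradient = np.real(F.conj().T @ residual)

# SVD: F = U diag(s) Vh, hence F^H F = V diag(s**2) V^H.
U, s, Vh = np.linalg.svd(F, full_matrices=False)
coeffs = Vh @ dm_true  # projection of dm on the model eigenvectors
gradient_svd = np.real(Vh.conj().T @ (s ** 2 * coeffs))  # weights are s**2

print("max difference:", np.max(np.abs(gradient - gradient_svd)))
print("relative weights:", np.round(s ** 2 / s[0] ** 2, 3))
```

The relative weights printed at the end show how strongly the components associated with small singular values are attenuated in a single gradient step.<br />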

The SVD will first be applied to the 1-D (quasi-homogeneous) velocity model of Chapter<br />

3. A vertical, constant velocity gradient will then be introduced in order to apply the decomposition<br />

to a more realistic velocity model. The SVD will be carried out for a single frequency at a<br />

time, for the near offset (0-3 km) and the far offset range (0-10 km).<br />

4.2.1 1-D medium in homogeneous background<br />

The SVD applied to the thin layer model used in section 3.5 (Figure 3.5b) requires the numerical<br />

determination of the Fréchet derivative matrix F according to equation (2.75). A single<br />

frequency, finite difference forward modelling is therefore required for the computation of the<br />

perturbed wavefield for each of the model parameters. The parameterisation is defined with respect<br />

to velocity (m = c in equation 2.75). The acquisition geometry is as described in chapter<br />

3, section 3.5, but with a free surface on the top of the model in order to take into account the<br />

influence of the ghost effect on the amplitude of the data. The $n_m = 75$ vertical velocity model<br />

parameters were perturbed by 1 % of the velocity at each node of the finite difference grid.<br />

Because the background model is quasi-homogeneous, the perturbations of the velocities only<br />

produce reflection hyperbolas and no diving waves are present in the data perturbation. The<br />

resulting Fréchet matrix F is complex valued, and the SVD was performed using the subroutine<br />

csvd of the LINPACK package (Dongarra et al., 1979). This subroutine yields the model space<br />

eigenvectors and the singular values of the Fréchet matrix. The SVD was carried out for the<br />

near offset range (0-3 km) and the far offset range (0-10 km). The approximate Hessian was<br />

also determined from the Fréchet matrix according to<br />

$$\mathbf{H}_a = \mathrm{Re}\left\{ \mathbf{F}^{t*} \mathbf{F} \right\}. \qquad (4.1)$$<br />
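The perturbation procedure and equation (4.1) can be sketched as follows. The forward operator used here is a deliberately simple toy (vertical traveltime phase through a stack of layers), standing in for the single frequency finite difference modelling; it, the layer thickness and the velocity value are assumptions for illustration only. Only the column-by-column construction of F with a 1 % perturbation and the formation of the approximate Hessian mirror the text.<br />

```python
import numpy as np


def toy_forward(c, omega=2.0 * np.pi * 5.0, dz=25.0):
    """Toy single-frequency forward model (NOT a finite-difference solver).

    Returns the complex response exp(-i*omega*T_j) at the base of each layer j,
    where T_j is the vertical traveltime through layers 0..j.
    """
    traveltime = np.cumsum(dz / c)
    return np.exp(-1j * omega * traveltime)


def frechet_by_perturbation(forward, c, rel_pert=0.01):
    """Build F column by column by perturbing each velocity by rel_pert * c."""
    d0 = forward(c)
    F = np.empty((d0.size, c.size), dtype=complex)
    for k in range(c.size):
        c_pert = c.copy()
        dc = rel_pert * c[k]
        c_pert[k] += dc
        F[:, k] = (forward(c_pert) - d0) / dc  # finite-difference derivative
    return F


c = np.full(75, 2000.0)  # 75 parameters as in the text; the velocity is illustrative
F = frechet_by_perturbation(toy_forward, c)
H_a = np.real(F.conj().T @ F)  # approximate Hessian, equation (4.1)
print(F.shape, H_a.shape)
```

With a real wave-equation solver in place of `toy_forward`, the same loop would require one forward modelling per model parameter, which is why the decomposition is restricted here to small 1-D models.<br />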

Figures 4.1, 4.2 and 4.3 show the singular value decomposition of the Fréchet matrix at 3,<br />

5 and 10 Hz for the near (a-c) and far offset (e-g) ranges. For each offset range, the model<br />

eigenvectors with the p nonzero singular values are shown in the depth domain (a,e) and in<br />

the wavenumber domain (amplitude spectrum) (b,f). In the wavenumber domain, the spectral<br />

coverage $[k_{min}, k_{max}]$ of the 1-D target (at 2 km depth) according to equation (3.14) is shown<br />

as two horizontal straight lines. The corresponding nonzero singular values are displayed in<br />

(c,g) in decibels according to the relation<br />

$$\lambda_i(\mathrm{dB}) = 10 \log\left( \frac{\lambda_i}{\lambda_1} \times 100 \right). \qquad (4.2)$$<br />

The nonzero singular values are defined as the p first singular values satisfying<br />

$$\lambda_i(\mathrm{dB}) > 0, \quad i = 1, \ldots, p. \qquad (4.3)$$<br />

This corresponds to a threshold of 1 % of the highest singular value. The Hessian computed<br />

using equation (4.1) is shown in (d,h).<br />
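As a small worked illustration of equations (4.2) and (4.3), the sketch below converts a set of singular values to decibels and counts those above the 1 % threshold. It assumes a base 10 logarithm, as is usual for decibels; the function names and the sample spectrum are hypothetical.<br />

```python
import numpy as np


def singular_values_db(singular_values):
    """Equation (4.2): 10*log10(100 * lambda_i / lambda_1), lambda_1 the largest."""
    lam = np.sort(np.asarray(singular_values, dtype=float))[::-1]
    return 10.0 * np.log10(100.0 * lam / lam[0])


def count_nonzero(singular_values):
    """Equation (4.3): number p of singular values above 1 % of the largest."""
    return int(np.sum(singular_values_db(singular_values) > 0.0))


# Hypothetical spectrum, for illustration only.
lam = np.array([10.0, 4.0, 1.0, 0.2, 0.05, 0.001])
print(np.round(singular_values_db(lam), 2))
print("p =", count_nonzero(lam))
```

In this example the threshold is 0.1 (1 % of 10), so the four largest values are counted as nonzero.<br />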

The singular values<br />

When comparing the number of singular values for each frequency with respect to the offset<br />

range (c and g), we observe that the number of nonzero singular values is larger for the far<br />

offset range than for the near offset range. Also, for the same offset range, the number of<br />

nonzero singular values increases with frequency. The number p of nonzero singular values<br />

with respect to the offset ranges is summarised in Table 4.1.<br />

The eigenvectors<br />

The eigenvectors with the highest singular values correspond to the highest wavenumbers (see<br />

for example Figure 4.2a,b, eigenvectors 1 to 5). We observe a shift of the wavenumber spectrum<br />

towards the low wavenumbers as the singular values decrease.



Frequency | Near offset | Far offset<br />

3 Hz | 8 | 11<br />

5 Hz | 10 | 15<br />

10 Hz | 15 | 26<br />

Table 4.1: Number of nonzero singular values with frequency and offset range.<br />

For the near offset range, the eigenvectors corresponding to the low wavenumbers (for example,<br />

eigenvector number 5 of Figure 4.2a) have energy in the shallow part of the model. This<br />

effect is mitigated by the introduction of the far offsets.<br />

The Hessian<br />

As discussed in Chapter 2 (section 2.2.2.3) for the linear case, the gradient vector is the true<br />

model perturbation filtered by the Hessian. The gradient vector for the model parameter $m_i$ is<br />

the result of the multiplication of the $i$-th row of the Hessian with the true perturbation and is<br />

therefore a linear combination of all model parameters. The off diagonal terms of the Hessian<br />

contribute to the strong filtering effect whereas the diagonal elements are weighting terms. The<br />

distribution of amplitudes of the elements of each row away from the main diagonal tells us the<br />

extent to which a given element of the gradient is corrupted. The display of the Hessian will<br />

tell us about the imaging potential of a single frequency inversion, which is directly linked to the<br />

wavenumber illumination detailed in Chapter 3 for homogeneous media.<br />

Although the Hessian matrix shown in panels (d) and (h) is dominated by high amplitudes along its main diagonal (see for example Figure 4.1d), the single frequency Hessian is strongly oscillatory. The oscillatory effect in the Hessian is less important with increasing frequency and increasing offset. This can be explained by the fact that the wavenumber coverage $\Delta k_z$, as defined in equation (3.14), is larger with increasing frequency and/or increasing offset range. The ringing effect of the single frequency gradient will therefore decrease as larger offsets or higher frequencies are used.

Furthermore, the Hessian presents an amplitude decay along the main diagonal. This shows that the Hessian increasingly damps the gradient parameters the deeper they are in the model. This damping is however less important for the far offset range than for the near offset range, as large offsets allow better illumination of the deepest part of the model.


96 CHAPTER 4. STARTING FROM REALISTIC FREQUENCIES<br />

Implications for waveform inversion<br />

The 1D inversion experiment carried out in Chapter 3, section 3.5, aimed to recover the velocity perturbation located at 2 km depth in a quasi-homogeneous background model. It was shown that the single frequency inversion contributes to the reconstruction of a finite portion of the wavenumber spectrum of this perturbation, which may be expressed analytically by the wavenumber coverage $[k_{min}, k_{max}]$ (see equation 3.14). This wavenumber coverage is shown as horizontal straight lines in the wavenumber spectrum of the eigenvectors in Figures 4.1 to 4.3 (b and f). The results of the SVD indicate that the gradient vector is dominated by the reconstruction of the high wavenumber part of the wavenumber coverage. As a result, the low wavenumber components carry a smaller weight in the gradient direction, and they converge more slowly.
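The link between small singular values and slow convergence can be illustrated on a toy linear problem. This is a sketch under stated assumptions, not the inversion scheme of this thesis: the matrix `J`, its singular values `sigma` and the fixed step length `alpha` are all hypothetical, and a plain steepest descent iteration is used.

```python
import numpy as np

# Hypothetical linear problem d = J m with prescribed singular values.
rng = np.random.default_rng(1)
n = 6
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = np.array([10.0, 8.0, 5.0, 2.0, 1.0, 0.5])
J = U @ np.diag(sigma) @ V.T

m_true = V @ np.ones(n)       # equal weight on every model eigenvector
d_obs = J @ m_true

m = np.zeros(n)
alpha = 1.0 / sigma[0] ** 2   # fixed step, stable for the largest singular value
n_iter = 20
for _ in range(n_iter):
    gradient = J.T @ (J @ m - d_obs)
    m = m - alpha * gradient

# Remaining error along each model eigenvector after n_iter iterations,
# compared with the analytic decay factor |1 - alpha * sigma^2|^n_iter.
error = np.abs(V.T @ (m - m_true))
predicted = np.abs(1.0 - alpha * sigma ** 2) ** n_iter
```

The components associated with the largest singular values are recovered within a few iterations, while those associated with the smallest ones have barely moved. In the SVD results above, the latter are the low wavenumber eigenvectors.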


[Figure 4.1. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.1: SVD at 3 Hz of the thin layer velocity model shown in 3.5: a) the model eigenvectors in depth domain and b) in wavenumber domain. The wavenumber coverage of the thin layer target after equation (3.14) is shown as horizontal lines. c) The nonzero singular values. d) The approximate Hessian.


[Figure 4.2. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.2: SVD at 5 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. The wavenumber coverage of the thin layer target after equation (3.14) is shown as horizontal lines. c) The nonzero singular values. d) The approximate Hessian.


[Figure 4.3. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.3: SVD at 10 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. The wavenumber coverage of the thin layer target after equation (3.14) is shown as horizontal lines. c) The nonzero singular values. d) The approximate Hessian.

[Figure 4.4. Axes: a) Velocity (km/s) against Depth (km); b) Offset (km) against Time (s).]

Figure 4.4: a) 1-D velocity model with a constant vertical velocity gradient. b) Waveform modelling using a Ricker wavelet with a peak frequency at 4 Hz.

4.2.2 1-D medium with a constant vertical velocity gradient<br />

I now wish to carry out the SVD of a more realistic model containing a vertical velocity gradient, as shown in Figure 4.4a. The time representation of the waveform modelling in this model is shown in Figure 4.4b.

The SVD results for this new velocity model at 3, 5 and 10 Hz, for the near and far offset ranges, are shown in Figures 4.5, 4.6 and 4.7. The introduction of a vertical velocity gradient produces a strong variation of the amplitude of the oscillations of the eigenvectors with depth (a,e) as the wavenumber spectra shift towards the low wavenumbers (b,f). This results in a very strong amplitude decay along the diagonal of the Hessian. This amplitude decay is stronger for the far offset range.

When comparing the wavenumber spectrum of the eigenvectors with the ones for the homogeneous medium, we observe that the eigenvectors in the model with a vertical velocity gradient contain lower wavenumbers (compare for example Figure 4.2b and Figure 4.6b). This observation is consistent with the discussion in Chapter 3, section 3.7, which argues that a model in which velocity increases with depth produces ray bending and therefore provides access to a wider range of wavenumbers.


[Figure 4.5. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.5: SVD at 3 Hz of the 1-D velocity model with a constant vertical velocity gradient as shown in Figure 4.4: a) the model eigenvectors in depth domain and b) in wavenumber domain. c) The nonzero singular values. d) The approximate Hessian.


[Figure 4.6. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.6: SVD at 5 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. c) The nonzero singular values. d) The approximate Hessian.


[Figure 4.7. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.7: SVD at 10 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. c) The nonzero singular values. d) The approximate Hessian.

Slowness parameterisation<br />

Until now, the SVD was carried out by defining the Fréchet matrix with a velocity parameterisation ($m = c$). The SVD may also be performed with respect to a slowness parameterisation ($m = 1/c$), defining the Fréchet matrix by perturbing the model by 1 % of the slowness at each node of the finite difference grid. The SVD results for this new Fréchet matrix in the model with the vertical velocity gradient, for the frequencies 3, 5 and 10 Hz, are shown in Figures 4.8, 4.9 and 4.10. The main consequence of the slowness parameterisation is that the eigenvectors are not as depth dependent as for the velocity parameterisation. The amplitude decay along the main diagonal of the Hessian is therefore much less important in the case of the slowness parameterisation.

A waveform inversion should therefore be carried out with a parameterisation in slowness to ensure proper reconstruction of the components of the model located at depth, where the background velocities are typically the highest. This observation is consistent with the role of the derivative term $\partial g/\partial m$ used for the computation of the Fréchet derivative defined in equation (2.82) (see Table 2.3). In the case of a velocity parameterisation, the term $\omega^2/c^3$ will damp the components of the gradient corresponding to the high velocities of the reference model. The slowness parameterisation mitigates this effect as the derivative term $\omega^2/c$ is used. Following the same argument, the parameterisation with respect to the square of slowness ($m = 1/c^2$) should suppress any weighting of the Fréchet derivative with respect to increasing velocity, as only the term $\omega^2$ is used.
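The three derivative terms quoted above can be recovered with a short calculation. This is a sketch that assumes the model-dependent term of the wave operator is $g = \omega^2/c^2$; equation (2.82) and Table 2.3 are not reproduced in this section, so the exact form and constants used there may differ.

```latex
\[
\begin{aligned}
m = c:     &\quad g = \frac{\omega^2}{m^2}, &\quad \frac{\partial g}{\partial m} &= -\frac{2\omega^2}{c^3}, \\
m = 1/c:   &\quad g = \omega^2 m^2,         &\quad \frac{\partial g}{\partial m} &= 2\omega^2 m = \frac{2\omega^2}{c}, \\
m = 1/c^2: &\quad g = \omega^2 m,           &\quad \frac{\partial g}{\partial m} &= \omega^2 .
\end{aligned}
\]
```

Up to sign and a factor of 2, these are the terms $\omega^2/c^3$, $\omega^2/c$ and $\omega^2$. As an illustration, between a node at 2 km/s and a node at 4 km/s the magnitude of the derivative term drops by a factor of 8 for the velocity parameterisation, by a factor of 2 for slowness, and not at all for the square of slowness.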


[Figure 4.8. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.8: Slowness parameterisation. SVD at 3 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. c) The nonzero singular values. d) The approximate Hessian.


[Figure 4.9. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.9: Slowness parameterisation. SVD at 5 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. c) The nonzero singular values. d) The approximate Hessian.


[Figure 4.10. Panels a) to d): Near Offset Range; panels e) to h): Far Offset Range. Axes: Depth (km), Wavenumber (1/km), Singular Value (dB), Singular Value Index, Eigenvectors.]

Figure 4.10: Slowness parameterisation. SVD at 10 Hz: a) the model eigenvectors in depth domain and b) in wavenumber domain. c) The nonzero singular values. d) The approximate Hessian.

4.3 Non-linearity of the waveform inverse problem

It is well known that the waveform inversion is a non-linear problem and that, as a result, the true model perturbation cannot be expressed as a linear combination of the data residuals. Furthermore, the degree of non-linearity varies and depends on the different components of the model and the data considered. This section clarifies the relations of non-linearity between the model and the data.

4.3.1 The effect of cycle skipping

The issue of non-linearity and local minima may be observed in the data domain through the phenomenon of cycle skipping (Figure 4.11). Cycle skipping occurs when the traveltime mismatch between observed and calculated data is greater than half a period of the frequency considered. In this case, the steepest descent direction will yield an inaccurate model update and will attempt to fit the calculated data to the wrong cycle of the observed data. The issue of convergence to local minima is hence closely related to the effect of cycle skipping. To avoid convergence into a local minimum, a good starting model should therefore explain the observed data to within half a cycle of the starting frequency used in the inversion.
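The half-period criterion can be reproduced with two sinusoids. The sketch below is illustrative only: the 5 Hz frequency, the number of periods and the helper names `misfit` and `misfit_slope` are assumptions, and the least-squares misfit is evaluated directly as a function of the time shift rather than through a model update.

```python
import numpy as np

def misfit(shift, freq=5.0, n_periods=4, n_samples=4000):
    """Least-squares misfit between a sine and a time-shifted copy of it."""
    period = 1.0 / freq
    t = np.linspace(0.0, n_periods * period, n_samples, endpoint=False)
    observed = np.sin(2.0 * np.pi * freq * t)
    calculated = np.sin(2.0 * np.pi * freq * (t - shift))
    return np.mean((observed - calculated) ** 2)

def misfit_slope(shift, eps=1e-6, **kwargs):
    """Centred finite-difference derivative of the misfit w.r.t. the shift."""
    return (misfit(shift + eps, **kwargs) - misfit(shift - eps, **kwargs)) / (2.0 * eps)

period = 1.0 / 5.0
slope_small = misfit_slope(0.3 * period)  # mismatch below half a period
slope_large = misfit_slope(0.7 * period)  # mismatch above half a period
```

A positive slope means that a descent step reduces the time shift, which is the desired behaviour; a negative slope means that the step increases the shift until the calculated data match the next cycle, where the misfit is again zero although the traveltime error is a full period.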

4.3.2 Analytic linearity study of the 3D homogeneous Green’s function<br />

Since the non-linearity of the waveform inverse problem is defined by the forward problem, I propose in this section to study the linearity of the 3D homogeneous, free space Green’s function $G_{3D}$

$$G_{3D}(R, f, c_o) = \frac{\exp\left(i 2\pi f R / c_o\right)}{4\pi R}, \qquad (4.4)$$

(Morse and Feshbach, 1953), where $R$ is the propagation distance, $f$ is the frequency and $c_o$ is the constant velocity. The degree of non-linearity may be measured by defining the linearization error as the difference between the true analytic Green’s function and its linearised formulation. The linearization of equation (4.4) is expressed as

$$\tilde{G}_{3D}(c_o + \Delta c, m) = G_{3D}(c_o) + \frac{\partial G_{3D}}{\partial m}\,\Delta m \qquad (4.5)$$

where $m$ is the model parameter used for the linearization. We may then define the linearization error as the $L_2$ norm of the difference between the linearised and the true wavefield, given by

$$E = \left\| \tilde{G}_{3D}(c_o + \Delta c) - G_{3D}(c_o + \Delta c) \right\|_2 . \qquad (4.6)$$


4.3. NON-LINEARITY OF THE WAVEFORM INVERSE PROBLEM 109<br />

[Figure 4.11. Panels a) and b): amplitude against Time, with the traveltime mismatch ∆T marked.]

Figure 4.11: Illustration of the cycle skipping problem showing a sine (single frequency) in time for the observed (solid) and calculated (dotted) data: the traveltime mismatch is a) less than half a cycle and b) greater than half a cycle. The gradient methods will minimise the misfit between observed and calculated data in a way which is consistent with the minimisation of traveltime error in a). If cycle skipping occurs as in b), the inversion will fit the calculated data to the wrong cycle, thus increasing the traveltime error.



Four parameters are involved in the determination of the Green’s function: the constant velocity $c_o$, the velocity perturbation $\Delta c$, the frequency $f$ and the propagation distance $R$. Figure 4.12 shows four plots of the linearization error defined in equation (4.6) with respect to the variation of one of the four parameters, the other three remaining constant (see the table in Figure 4.12). The linearization was performed with respect to a velocity (black) and a slowness (grey) parameterisation. The evolution of the linearization error shows that the non-linearity of the Green’s function increases with frequency (a), propagation distance (b) and velocity perturbation (d). On the other hand, the Green’s function is more linear as velocity increases (c). Also, the linearization with respect to slowness (grey) yields less error than the linearization with respect to velocity (black), showing that a parameterisation in slowness improves the linearity. This result is consistent with the conclusions of Beydoun and Tarantola (1988), who studied the accuracy of the Born approximation for various parameterisations. The oscillations of the misfit function (a,b,d) are due to the effect of cycle skipping: local minima can be identified at values of the misfit function for which $\partial E/\partial m = 0$. Although the Green’s function is more linear with respect to slowness, the effect of cycle skipping is not avoided.
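The comparison between the two parameterisations can be reproduced for a single set of constants. This is a minimal sketch: the function names are hypothetical, the error is taken as the modulus of the difference at a single point, and the derivatives are those obtained analytically from equation (4.4) for $m = c$ and $m = 1/c$.

```python
import numpy as np

def green_3d(R, f, c):
    """3D homogeneous free-space Green's function of equation (4.4)."""
    return np.exp(1j * 2.0 * np.pi * f * R / c) / (4.0 * np.pi * R)

def linearisation_error(R, f, c0, dc, parameterisation="velocity"):
    """Modulus of the difference between linearised and true Green's functions."""
    g0 = green_3d(R, f, c0)
    phase0 = 1j * 2.0 * np.pi * f * R
    if parameterisation == "velocity":      # m = c
        dg_dm = g0 * (-phase0 / c0 ** 2)
        dm = dc
    elif parameterisation == "slowness":    # m = 1/c
        dg_dm = g0 * phase0
        dm = 1.0 / (c0 + dc) - 1.0 / c0
    else:
        raise ValueError("unknown parameterisation")
    linearised = g0 + dg_dm * dm            # equation (4.5)
    return np.abs(linearised - green_3d(R, f, c0 + dc))

# Constants quoted with Figure 4.12: f = 10 Hz, R = 2 km, c0 = 2 km/s, dc/c0 = 10 %.
err_velocity = linearisation_error(2.0, 10.0, 2.0, 0.2, "velocity")
err_slowness = linearisation_error(2.0, 10.0, 2.0, 0.2, "slowness")
```

Looping over `f`, `R`, `c0` or `dc` while holding the others fixed gives curves of the kind described above, although the normalisation used for the published plots is not known here.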

4.3.3 Non-linearity and wavenumber domain<br />

In the previous section, the model was assumed to be homogeneous and thus had only a single wavenumber component at $k = 0$ km$^{-1}$. Heterogeneous models must be represented with a range of wavenumbers which do not play the same role in the non-linearity of the inverse problem. This characteristic of the wavenumbers may be illustrated by carrying out a linearity study on the 1-D, heterogeneous velocity model of Figure 4.13a. This model was obtained using the central velocity trace of the 2D Marmousi model and will be referred to as the 1-D Marmousi model (Figure 4.13). The misfit function is computed by perturbation of the amplitude of a single wavenumber component of the model in a similar fashion to that of Jannane et al. (1989). Offsets up to 10 km were included in the misfit function. This wavenumber perturbation is equivalent to adding a sinusoidal component to the depth model with the appropriate phase. A taper is however applied to the sine function so that the very shallow part of the model remains unchanged, thus excluding the direct arrival from the misfit function. Figure 4.14 shows the evolution of the misfit function as one of the following six wavenumbers $k = 0, 0.5, 1, 1.5, 2, 2.5$ km$^{-1}$ is perturbed, for 5 Hz (black) and 10 Hz (grey). The global minimum of the misfit function is much narrower as the lowest wavenumbers are perturbed. The low wavenumbers are therefore more non-linear than the high wavenumbers. The comparison of the misfit functions at 5 Hz and 10 Hz confirms that the low frequencies are more linear (the width of the global minimum is greater at 5 Hz).

[Figure 4.12. Axes: Misfit Function against a) Frequency (Hz), b) Distance (km), c) Velocity (km/s), d) Velocity perturbation (%).]

            a         b         c         d
f           ↗         10 Hz     10 Hz     10 Hz
R           2 km      ↗         2 km      2 km
c_o         2 km/s    2 km/s    ↗         2 km/s
∆c/c_o      10 %      10 %      10 %      ↗

Figure 4.12: Misfit function of the linearization of the 3D homogeneous Green’s function with variation of: a) The frequency $f$. b) The propagation distance $R$. c) The velocity $c_o$. d) The velocity perturbation. The constants used for each plot are shown in the table above (↗ marks the parameter that is varied).

[Figure 4.13. Axes: a) Velocity (m/s) against Depth (km); b) Offset (km) against Time (s).]

Figure 4.13: The 1-D Marmousi velocity model. a) 1-D velocity model of the middle trace of the Marmousi model. b) Waveform modelling in time.
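The construction of the wavenumber perturbation used in section 4.3.3 can be sketched as follows. Everything specific here is an assumption made for illustration: the function name `perturb_wavenumber`, the cosine taper and its 0.3 km length, the convention that $k$ is in cycles per km, and the linear placeholder model (which is not the 1-D Marmousi model). The forward modelling needed to turn each perturbed model into a misfit value is not shown.

```python
import numpy as np

def perturb_wavenumber(velocity, depth, k, amplitude, phase=0.0, taper_depth=0.3):
    """Add a tapered sinusoid of vertical wavenumber k (1/km) to a 1-D model."""
    sinusoid = amplitude * np.cos(2.0 * np.pi * k * depth + phase)
    # Cosine ramp from 0 at the surface to 1 at taper_depth, so that the
    # shallowest part of the model is left unchanged.
    ramp = np.clip(depth / taper_depth, 0.0, 1.0)
    taper = 0.5 * (1.0 - np.cos(np.pi * ramp))
    return velocity + taper * sinusoid

depth = np.linspace(0.0, 2.0, 201)    # km
velocity = 1500.0 + 1000.0 * depth    # m/s, placeholder model
perturbed = perturb_wavenumber(velocity, depth, k=0.5, amplitude=100.0)
```

Scanning `amplitude` over a range such as -300 to 300 m/s and computing the data misfit for each perturbed model would give one misfit curve per wavenumber.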

4.3.4 Non-linearity and offset<br />

The analytic study of the linearity of the 3D Green’s function showed that the non-linearity increases with the propagation distance. For surface seismic data, the far offsets are the most non-linear part of the data as they correspond to the largest propagation distances. Figure 4.15 shows the misfit function as the wavenumber $k = 0.5$ km$^{-1}$ is perturbed for the near (0-3 km) and far (3-10 km) offsets. The global minimum of the misfit function for the near offsets is wider than that for the far offsets, showing that the far offsets are more non-linear.


[Figure 4.14. Six panel pairs, one per wavenumber $k = 0, 0.5, 1, 1.5, 2, 2.5$ km$^{-1}$. Axes: a) Misfit Function against Velocity Perturbation (m/s); b) Velocity (m/s) against Depth (km).]

Figure 4.14: Non-linearity and wavenumbers. a) Evolution of the misfit function as a single wavenumber component of the model of Figure 4.13 is perturbed in amplitude. The misfit functions of the frequencies 5 Hz (black) and 10 Hz (grey) are shown. b) A taper is applied to prevent the perturbation of the very shallow part of the model, hence removing the direct arrival from the misfit function. The misfit functions are normalised with respect to their maximum value.

[Figure 4.15. Axes: Misfit Function against Velocity Perturbation (m/s).]

Figure 4.15: Non-linearity and offset. Misfit function for a wavenumber perturbation of $k = 0.5$ km$^{-1}$ for the near offset (0-3 km) (black) and the far offset (3-10 km) (grey). The far offsets are more non-linear than the near offsets. The misfit functions are normalised with respect to their maximum value.

4.4 Tools for the mitigation of non-linearities

I now attempt to improve the convergence accuracy of waveform inversion using the results of the SVD and the basic relations of non-linearity described in the previous sections. The approach used in the previous chapter, which consists of inverting from the low to the high frequencies of the source spectrum, has a great advantage because the low frequencies have a more linear relation with the model. This strategy may nevertheless not be sufficient, as the misfit function at the lowest frequency available in the data spectrum may still present local minima. The challenge of waveform inversion is to ensure that the inversion of the minimum frequency available in the data converges into the global minimum. In this section, I introduce some of the techniques that help to achieve this goal through appropriate preconditioning of the gradient vector and the data residuals.

4.4.1 Gradient preconditioning by wavenumber filtering

The singular value decomposition showed that the low wavenumbers of the model correspond to the small singular values of the inverse problem. The gradient of the misfit function will hence be dominated by the high wavenumbers. This behaviour can potentially cause convergence into local minima, as the determination of the high wavenumbers depends in turn on the accuracy of the low wavenumbers. This notion is commonly accepted in the migration community: it is well known that the quality of the migration depends strongly on the accuracy of the macro model. Waveform inversion of large offset data allows both tomographic-like and migration-like reconstructions (Mora, 1989; Pratt et al., 1996). From the arguments above, it is desirable that the tomographic regime should occur before the migration regime. It hence appears necessary to enhance the update of the low wavenumbers in order to facilitate their full convergence.

In order to achieve this goal, a smoothing operator may be applied to the gradient direction, removing the high wavenumber components of the gradient vector. This approach is similar to the multi-grid technique proposed by Bunks et al. (1995) in the context of time domain waveform inversion. The multi-grid method projects the gradient vector onto a coarse grid by smoothing and sub-sampling the finite difference modelling grid (see for example the appendix of Pratt et al. (1998)). Bunks et al. (1995) argued that such an approach improves the chance of convergence into the global minimum. Unfortunately, their numerical experiments did not validate this claim, because they used instead a strategy that inverts band-pass filtered data, from the very low (0-7 Hz!) to the high frequencies. The non-linearities were therefore avoided by inverting unrealistically low frequencies.

The strategy proposed here relies on preconditioning the gradient vector by smoothing, which is performed with a 2-D low-pass wavenumber filter. After a 2-D spatial Fourier transform of the gradient using the subroutine rlft3 of the Numerical Recipes package (Press et al., 1992), a low-pass 2-D elliptic filter is applied to the Fourier components of the gradient, as shown in Figure 4.16. A taper zone is applied to prevent artifacts of the filter. Different filters may be applied in the horizontal and vertical directions, in which case the filter is elliptic. I will however use a circular filter characterised by (k_max, ∆k), where k_max defines the maximum preserved wavenumber in all directions and ∆k defines the width of the taper zone. After filtering of the gradient in the wavenumber domain, an inverse Fourier transform is performed to recover the filtered gradient in the space domain.
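As an illustration of the circular filter described above, the following Python sketch computes the weight applied to one Fourier component of the gradient at the wavenumber (kx, kz). The function name `lowpass_weight` and the cosine shape of the taper are assumptions made for this example; the text only specifies that wavenumbers up to k_max are preserved and that a taper zone of width ∆k follows.

```python
import math


def lowpass_weight(kx, kz, k_max, dk):
    """Weight of a circular low-pass wavenumber filter with a tapered edge.

    kx, kz : horizontal and vertical wavenumbers (km^-1)
    k_max  : largest wavenumber preserved without attenuation (km^-1)
    dk     : width of the taper zone beyond k_max (km^-1)

    The cosine taper is an assumption of this sketch, not taken from the text.
    """
    k = math.hypot(kx, kz)  # radial wavenumber: the filter is circular
    if k <= k_max:
        return 1.0  # preserved zone
    if dk <= 0.0 or k >= k_max + dk:
        return 0.0  # rejected zone (or no taper requested)
    # Taper zone: smooth decay from 1 at k_max to 0 at k_max + dk
    return 0.5 * (1.0 + math.cos(math.pi * (k - k_max) / dk))
```

Multiplying each Fourier component of the gradient by this weight, then applying the inverse Fourier transform, would give the filtered gradient in the space domain.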

Figure 4.16: 2-D low-pass wavenumber filter of the gradient vector. The dashed zone represents the preserved wavenumbers and the tapering zone is shown in grey.

An example of the inversion of a simple 1-D model with a standard and a preconditioned gradient is shown in Figure 4.17 at the frequency of 7 Hz. The wavenumber filtered gradient inversion (c,d) is compared to the standard inversion (a,b). A total of 30 iterations were performed for both inversions. The preconditioned inversion was carried out in 3 sets of 10 iterations with different gradient wavenumber filtering: for the first 10 iterations the wavenumber filter (k_max, ∆k) = (1, 0.5) km⁻¹ was applied to the gradient, the next 10 iterations were carried out with a filter (k_max, ∆k) = (2, 1) km⁻¹, and for the last 10 iterations no filter was applied. The 2-D standard and preconditioned gradients of the first iteration are shown in Figure 4.17a,c. These 2-D gradients are stacked horizontally in order to carry out the 1-D inversion, so that only the vertical component of the model is updated. These results show that preconditioning by low-pass wavenumber filtering of the gradient can significantly improve the convergence accuracy of an inversion.
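The three sets of iterations and the horizontal stacking of the 2-D gradient can be sketched as follows. The names `filter_for_iteration` and `stack_horizontally` are hypothetical, and the use of a mean rather than a plain sum for the stack is an assumption of this sketch.

```python
def filter_for_iteration(iteration):
    """Return (k_max, dk) in km^-1 for a 1-based iteration number, or None.

    Schedule from the text: iterations 1-10 use (1, 0.5), iterations 11-20
    use (2, 1), and iterations 21-30 use no filter.
    """
    if not 1 <= iteration <= 30:
        raise ValueError("the example schedule covers iterations 1 to 30")
    if iteration <= 10:
        return (1.0, 0.5)
    if iteration <= 20:
        return (2.0, 1.0)
    return None  # unfiltered gradient


def stack_horizontally(gradient_2d):
    """Average a 2-D gradient (a list of depth rows) over the horizontal axis.

    The result has one value per depth sample, so that only the vertical
    (1-D) component of the model is updated.
    """
    return [sum(row) / len(row) for row in gradient_2d]
```

For instance, `stack_horizontally([[1.0, 3.0], [2.0, 2.0]])` reduces a gradient with two depth samples and two horizontal positions to one value per depth.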

4.4.2 Time damping of the data residuals

The strategy of inverting from the low to the high wavenumber components of the model may be carried out by smoothing the gradient vector in the first steps of the waveform inversion. This strategy may however not be sufficient, as the low wavenumbers correspond to the most non-linear components of the model. Therefore, some components of the data wavefield contributing to the low wavenumbers may be cycle skipped. The risk of cycle skipping is particularly critical for the far offset data because they correspond to the longest propagation distance (see section 4.3.4).

[Figure 4.17 panels: a) and c) depth (km) versus distance (km); b) and d) velocity versus depth (km).]

Figure 4.17: 1-D inversion and gradient preconditioning by wavenumber filtering. Inversion at 7 Hz with the standard gradient (a,b) and with the gradient preconditioned by wavenumber filtering (c,d). A total of 30 iterations were carried out for each experiment. a) Gradient at the first iteration. b) Result of the standard inversion. c) Filtered gradient at the first iteration. d) Result of the preconditioned gradient inversion. The true model is shown in grey, the starting model as a dotted line and the result of the inversion as a solid line.

So far, for a given offset range, the full wavefield was taken into account. All events, including reflections and refracted/diving waves, were included in the data residuals. As we will see, the risk of cycle skipping may be reduced by selecting the early arrivals in the data, i.e. the wavefields arriving close in time to the first arrivals. Focusing the inversion on the fit of the early arrivals is motivated by the fact that:

1. early arrivals are more linear than late arrivals because they optimise the trade-off between the propagation distance and propagation within high velocities, according to Fermat's principle. As shown in the linearity study of the homogeneous Green's function in section 4.3.2, the non-linearity decreases with increasing velocity and decreasing propagation distance.

2. early arrivals are fully compatible with the recovery of the low wavenumbers because they arise from the first Fresnel zones

3. early arrivals contain the most trusted information if the starting model is smooth and was obtained using first arrival traveltime tomography

I will further develop these items in the next sections. However, the selection of the early arrivals in the frequency domain requires the definition of a preconditioning operator of the data residuals. This operator may be applied in the time domain by windowing the data residuals (see for example Shipp and Singh (2002)), in which case a simultaneous inversion of several frequencies at a time is required. Higher frequencies must then be used, at the expense of an increase in the non-linearity of the inverse problem. In order to preserve the advantage of inverting the lowest frequency available, the technique of time damping will be introduced as a tool to damp the late data arrivals of a single frequency inversion.

4.4.2.1 Time damping of the late arrivals

Complex frequencies are commonly used in the time representation of frequency domain modelling to prevent time aliasing (wrap-around) (Mallick and Frazer, 1987). The time function of the wavefield is obtained using the inverse Fourier transform

$$ f(t) = \int_{-\infty}^{+\infty} e^{-i\omega t}\, \Psi(\omega)\, d\omega, \tag{4.7} $$

where f(t) is the wavefield in time and Ψ(ω) is the wavefield in the frequency domain. The use of a complex angular frequency defined as

$$ \omega' = \omega + i/\tau, \tag{4.8} $$

is equivalent to multiplying the time domain wavefield by an exponential time decay function, since we have

$$ f'(t) = e^{-t/\tau} f(t) = \int e^{-i\omega t}\, \Psi(\omega + i/\tau)\, d\omega. \tag{4.9} $$

The use of a complex ω in the waveform modelling thus allows one to damp later arrivals using a frequency domain representation, provided Ψ is calculated with the correct complex ω′. The exponential decay function starts at t = 0, where a weighting of 1 is applied. For each trace (source-receiver pair), it is possible to shift the time damping function to a new origin t_o. The equivalent time function is written as

$$ f''(t) = e^{-(t - t_o)/\tau} f(t) = \int e^{-i\omega t}\, \Psi(\omega + i/\tau)\, e^{t_o/\tau}\, d\omega. \tag{4.10} $$

In equation (4.10), the complex wavefield is weighted by the factor e^{t_o/τ} so that the equivalent exponential time decay function is equal to 1 at the time t = t_o. As for the time damping, the time shift of the exponential decay function may be implemented in the frequency domain, i.e. without recourse to a time representation, by multiplying the wavefield by the term e^{t_o/τ}. Since the time representation is not required, it is possible to apply the time damping to the single frequency wavefield Ψ(ω). Shin et al. (2002) showed that the use of a very strong time damping in the wave equation is equivalent to solving simultaneously for the eikonal and transport equations at a fixed frequency.
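The equivalence stated in equations (4.9) and (4.10) can be checked numerically on a single frequency. The sketch below is only an illustration: the function `spectrum`, its 1/2π normalisation (the forward transform paired with equation (4.7)) and the two-spike trace are assumptions made for the example. It compares the frequency domain route (complex frequency, then the factor e^{t_o/τ}) with the time domain route (explicit damping of the trace).

```python
import cmath
import math


def spectrum(samples, dt, omega):
    """Discrete form of Psi(omega) = (1/2pi) * integral f(t) exp(+i omega t) dt.

    This is the forward transform paired with equation (4.7);
    omega may be complex.
    """
    total = 0j
    for n, value in enumerate(samples):
        total += value * cmath.exp(1j * omega * n * dt)
    return total * dt / (2.0 * math.pi)


# Hypothetical trace: two arrivals of equal amplitude at 0.5 s and 2.0 s
dt, tau, t0 = 0.01, 0.25, 0.5
trace = [0.0] * 300
trace[50] = 1.0   # early arrival at t = 0.5 s
trace[200] = 1.0  # late arrival at t = 2.0 s

omega = 2.0 * math.pi * 7.0      # 7 Hz
omega_damped = omega + 1j / tau  # equation (4.8)

# Frequency domain route: complex frequency, then the factor exp(t0 / tau)
psi_shifted = spectrum(trace, dt, omega_damped) * math.exp(t0 / tau)

# Time domain route: apply exp(-(t - t0) / tau) to the trace, then real omega
damped_trace = [v * math.exp(-(n * dt - t0) / tau) for n, v in enumerate(trace)]
psi_reference = spectrum(damped_trace, dt, omega)

difference = abs(psi_shifted - psi_reference)  # rounding error only
late_weight = math.exp(-(2.0 - t0) / tau)      # weight of the late arrival
```

With τ = 0.25 s and t_o = 0.5 s, the arrival at t_o keeps a weight of 1 while the arrival at 2 s is weighted by e⁻⁶, so the damped single frequency wavefield is dominated by the early arrival.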

An example of modelling with complex frequencies is shown in Figure 4.18 for the 1-D Marmousi model (Figure 4.13a). The first arrivals t_o were picked on the wavefield with no time damping (a). The wavefield in the time domain, modelled with complex frequencies and amplitude compensation according to equation (4.10), shows that different values of τ may be used to vary the damping of the late arrivals.

[Figure 4.18 panels a-d: time (s) versus offset (km).]

Figure 4.18: Finite difference modelling with different values of τ: a) no time damping, b) τ = 0.5 s, c) τ = 0.25 s and d) τ = 0.1 s. The traveltime picks t_o are shown as a white line in a). Amplitudes are normalised.

4.4.2.2 Linearity of early arrivals

In order to demonstrate the linearity of the early arrivals, especially at far offsets, Figure 4.19 shows the misfit function at 7 Hz of the far offset data (3-10 km) as a single low wavenumber component of the 1-D Marmousi model is perturbed (k = 0.5 km⁻¹). The misfit functions with no time damping (black) and with a time damping of τ = 0.25 s (grey) show that the damping of late arrivals widens the global minimum. The application of time damping to the data residuals will therefore improve the linearity of the inverse problem for the recovery of the low wavenumbers.

[Figure 4.19 plot: misfit function versus velocity perturbation (m/s).]

Figure 4.19: Misfit function of the far offsets at 7 Hz with no time damping (black) and with τ = 0.25 s (grey). The model is perturbed at the single wavenumber k = 0.5 km⁻¹.

4.4.2.3 Wavenumber and Fresnel zones

Chapter 3 defined the contribution of the far offsets to the determination of the low wavenumbers. For a given source-receiver pair (same offset), the concept of a wavepath (Woodward, 1992; Pratt et al., 1996) (see Figure 3.6) shows that arrivals contained within the first Fresnel zone correspond to the lowest wavenumber information. Later arrivals arise from outer Fresnel zones and are associated with higher wavenumbers. Therefore a strategy aiming to first recover the low wavenumbers is fully compatible with the selection of the early arrivals. The smoothing of the gradient partially achieves this goal by removing the high wavenumber components of the gradient vector. Nevertheless, the wavenumber low-pass filtering of the gradient does not fully prevent late arrivals from being taken into account. Furthermore, as the arrivals contained in the first Fresnel zone may not be the most energetic arrivals, later arrivals with higher amplitudes may dominate the inversion (as for example in Figure 4.18a). Time damping diminishes the importance of later arrivals that may carry strong amplitudes, thus ensuring that early arrivals are taken into account early in the inversion.



4.4.3 f-k filtering of the data residuals

The f-k filtering of seismic data is routinely applied in processing to select or remove data with linear moveout in the t-x domain (see for example Yilmaz (1987)). The f-k domain can thus be used to select events with various dips, such as reflection hyperbolae at far offsets, direct arrivals and refracted waves. The f-k projection corresponds to a plane wave decomposition of the wavefield where each component is associated with an emergence angle that depends on the horizontal component of the slowness vector. Because typical velocity models present an increase of the reference velocity with depth, it is possible to discriminate events in the wavefield coming from the deepest part of the model, as they will show different dips (more horizontal) than shallower events.

Figure 4.20a shows an example of a velocity model containing a shallow and a deep heterogeneity, with a reference model presenting a constant vertical velocity gradient. The time representation of the waveform modelling in this model is shown in Figure 4.20b. The application of an f-k filter to the time representation of the wavefield, using the program sudipfilt of the Seismic Un*x package (Stockwell, 1997), allows events with specific dips to be removed, as shown in Figure 4.20c. The f-k filter applied here suppresses apparent velocities greater than dx/dt = 3 km.s⁻¹. It thus removes the reflection hyperbola of the deep heterogeneity, but it also alters the shallow reflection at near offsets.

The f-k filter may be used for single frequency waveform inversion by applying a wavenumber filter to the data residuals in the spatial receiver Fourier domain (for a single source gather). An inverse spatial Fourier transform is then carried out to recover the data residuals in the source-receiver domain. No time representation is required for the application of such a filter to the data residuals.
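A minimal sketch of this single frequency filter is given below. At a temporal frequency f, an event with apparent velocity v maps to the receiver wavenumber |k| = f/v, so removing apparent velocities greater than a threshold amounts to zeroing the wavenumbers below f divided by that threshold. The function names, the naive discrete Fourier transform, the sharp (untapered) cut and the synthetic plane waves are all assumptions made for the illustration.

```python
import cmath
import math


def dft(values, sign):
    """Naive discrete Fourier transform with exponent sign * 2*pi*i*m*n/N."""
    n_samples = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * m * n / n_samples)
            for n, v in enumerate(values))
        for m in range(n_samples)
    ]


def reject_fast_events(residuals, dx, frequency, v_cut):
    """Remove apparent velocities above v_cut from single frequency residuals.

    residuals : complex residuals of one source gather, one per receiver
    dx        : receiver spacing (km)
    frequency : temporal frequency (Hz)
    v_cut     : apparent velocity threshold (km/s)
    """
    n_rec = len(residuals)
    k_cut = frequency / v_cut  # |k| below this means apparent velocity > v_cut
    spectrum = dft(residuals, -1)
    for m in range(n_rec):
        m_signed = m if m <= n_rec // 2 else m - n_rec
        k = m_signed / (n_rec * dx)  # wavenumber of bin m (cycles/km)
        if abs(k) < k_cut:
            spectrum[m] = 0j  # sharp cut; a taper would reduce artifacts
    return [value / n_rec for value in dft(spectrum, +1)]


def plane_wave(n_rec, dx, frequency, v_apparent):
    """Single frequency plane wave crossing the receiver line."""
    k = frequency / v_apparent
    return [cmath.exp(2j * math.pi * k * n * dx) for n in range(n_rec)]


n_rec, dx, freq = 40, 0.025, 7.0
fast = plane_wave(n_rec, dx, freq, 7.0)   # apparent velocity 7 km/s
slow = plane_wave(n_rec, dx, freq, 1.75)  # apparent velocity 1.75 km/s
fast_out = reject_fast_events(fast, dx, freq, 3.0)
slow_out = reject_fast_events(slow, dx, freq, 3.0)
```

With a threshold of 3 km/s at 7 Hz, the 7 km/s plane wave is removed while the 1.75 km/s plane wave passes unchanged. The receiver spacing was chosen so that both wavenumbers fall exactly on transform bins, which avoids spectral leakage in this toy case.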

In order to illustrate the application of the f-k filter to waveform inversion, a 1-D waveform inversion at 7 Hz is carried out (Figure 4.21). The true velocity model is the model shown in Figure 4.20a, and the two heterogeneities are removed to create the starting model. The waveform inversion using the standard data residuals (Figure 4.21a) reconstructs both perturbations. If a wavenumber filter that removes events with apparent velocities greater than dx/dt = 3 km.s⁻¹ is applied to the data residuals, only the shallow heterogeneity is recovered (Figure 4.21b). Since the wavenumber filter of the residuals removed the near offset portion of the shallow reflection hyperbola, the reconstruction of the shallow heterogeneity is deficient in the high wavenumbers.


[Figure 4.20 panels: a) velocity (km/s) versus depth (km); b) and c) time (s) versus offset (km).]

Figure 4.20: a) 1-D velocity model containing a shallow and a deep heterogeneity. b) Time representation of the waveform modelling. c) Result of the f-k filter removing apparent velocities greater than dx/dt = 3 km.s⁻¹.

[Figure 4.21 panels a-b: velocity (km/s) versus depth (km).]

Figure 4.21: Preconditioning of the data residuals using an f-k filter. Waveform inversion at 7 Hz. a) Standard data residuals. b) The data residuals are f-k filtered and apparent velocities greater than 3 km/s are removed.

4.4.4 Offset windowing of the data residuals

Shipp and Singh (2002) proposed that, in order to recover the low wavenumber components of the model, the far offsets should be used in the early stages of an inversion. At a later stage, the near offset data are inverted, providing the high wavenumber information. A strategy that windows the data residuals from the far to the near offsets may be efficient provided that no cycle skipping occurs in the far offset data. The risk of cycle skipping may be mitigated by applying a time damping or windowing operator, but in such a case the far offset data will provide information only on the deepest part of the model, and the low wavenumbers of the shallow part of the model will not be constrained by the inversion. The lowest wavenumber information on the shallow part of the model is contained in the late arrivals at far offsets and is therefore subject to a greater risk of cycle skipping (see section 4.4.2.2). The use of the late arrivals at far offsets should therefore be avoided at the early stage of the inversion.

On the other hand, the early arrivals of the near offsets can bring information on the low wavenumber components of the shallow part of the model. This information corresponds to higher wavenumbers than that of the late arrivals at far offsets, but it is nevertheless more linear as it corresponds to a shorter propagation distance. There is therefore some justification for the implementation of a strategy that inverts the early arrivals from the near to the far offsets, since:

1. the early arrivals of the near offsets bring low wavenumber information on the shallow part of the model

2. the near offset data correspond to the shortest propagation distance and are less likely to be cycle skipped

3. the inversion from the near to the far offsets is consistent with the implementation of a layer stripping strategy, which proceeds from the shallowest to the deepest part of the model and aims to reduce the risk of cycle skipping of the far offset data

4. the offset windowing, by selecting a range of offsets, ensures that the far offset information associated with the small singular values is properly taken into account.
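The offset windowing itself amounts to a mask on the data residuals. A minimal sketch is given below; the function name `window_residuals_by_offset` and the sharp (untapered) edges of the window are assumptions of the example.

```python
def window_residuals_by_offset(residuals, offsets, offset_min, offset_max):
    """Zero the residuals whose absolute offset lies outside the window.

    residuals  : one (possibly complex) residual per source-receiver pair
    offsets    : signed source-receiver offsets (m), same length as residuals
    offset_min : smallest absolute offset kept (m)
    offset_max : largest absolute offset kept (m)
    """
    if len(residuals) != len(offsets):
        raise ValueError("residuals and offsets must have the same length")
    return [
        r if offset_min <= abs(x) <= offset_max else 0.0
        for r, x in zip(residuals, offsets)
    ]
```

Sliding the window from the near to the far offsets then only requires calling the function with successive offset ranges.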



4.4.5 Model parameterisation

The results of the SVD showed that a velocity parameterisation of the Fréchet derivative matrix damps the portion of the model associated with high velocities. In order to compensate for this unwanted effect, a slowness or squared slowness parameterisation should be preferred. Moreover, the linearity study of the homogeneous Green's function showed that the linearity is improved by a parameterisation in slowness. The various possible parameterisations were tested (not shown here). The results demonstrated a great improvement of the slowness over the velocity parameterisation for the determination of the magnitude of the velocities. No improvement, however, was noticed when using the squared slowness parameterisation.
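The effect of the parameterisation on the gradient can be illustrated with the chain rule. With slowness s = 1/v, a gradient taken with respect to velocity converts as ∂E/∂s = −v² ∂E/∂v, and with squared slowness m = 1/v² as ∂E/∂m = −(v³/2) ∂E/∂v, so both changes of variable give more weight to the high velocity parts of the model. The sketch below applies these conversions. The function names are hypothetical, and the example illustrates only the change of variable, not the tests mentioned in the text.

```python
def gradient_wrt_slowness(grad_v, velocity):
    """Convert dE/dv into dE/ds for s = 1/v, using dv/ds = -v**2."""
    return [-(v ** 2) * g for g, v in zip(grad_v, velocity)]


def gradient_wrt_squared_slowness(grad_v, velocity):
    """Convert dE/dv into dE/dm for m = 1/v**2, using dv/dm = -v**3 / 2."""
    return [-0.5 * (v ** 3) * g for g, v in zip(grad_v, velocity)]
```

For equal velocity gradients at two depths with velocities of 2 and 4 km/s, the slowness gradient is four times larger at the faster depth.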

4.5 Definition of a strategy

The previous section described some of the techniques available for the preconditioning of the single frequency waveform inverse problem. A strategy aiming to improve the chance of convergence into the global minimum should hence combine the preconditioning of both the gradient vector and the data residuals. The definition of a universal strategy is nevertheless delicate, as the non-linearity is highly dependent on the type of velocity model involved. The practical implementation of the mitigation of the non-linearities should therefore be adapted to the type of imaging problem involved.

In the numerical experiments shown in the next section, the proposed strategy relies on the wavenumber filtering of the gradient vector and the time damping of the data residuals. This strategy may be decomposed into two main stages:

1. the low wavenumbers are recovered using the early arrival information. Time damping and wavenumber filtering are used. The data are inverted from the near to the far offsets, thus adopting a layer stripping strategy.

2. the high wavenumbers are recovered later in the inversion by using the full near offset wavefield (without time damping).
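The two stages can be written down as a small schedule. In the sketch below, the parameter values are those listed for the 1-D experiment in the table of Figure 4.23; the data layout (`STRATEGY`, the dictionary keys) and the helper `describe` are hypothetical and serve only to make the sequence explicit.

```python
# Parameters of each step, copied from the table of Figure 4.23.
# None means that the corresponding preconditioning is switched off.
STRATEGY = [
    {"stage": 1, "offsets_m": (200, 3000), "tau_s": 0.25, "k_filter": (1.0, 2.0)},
    {"stage": 1, "offsets_m": (3000, 6000), "tau_s": 0.25, "k_filter": (1.0, 2.0)},
    {"stage": 1, "offsets_m": (6000, 10000), "tau_s": 0.25, "k_filter": (1.0, 2.0)},
    {"stage": 2, "offsets_m": (200, 3000), "tau_s": None, "k_filter": None},
]


def describe(step):
    """One-line summary of a step of the strategy."""
    lo, hi = step["offsets_m"]
    damping = "no time damping" if step["tau_s"] is None else f"tau = {step['tau_s']} s"
    filt = ("no gradient filter" if step["k_filter"] is None
            else "k_max, dk = {}, {} km^-1".format(*step["k_filter"]))
    return f"stage {step['stage']}: offsets {lo}-{hi} m, {damping}, {filt}"


summary = [describe(step) for step in STRATEGY]
```

Stage 1 walks through the three offset windows from near to far with time damping and gradient filtering switched on, and stage 2 returns to the near offsets with both switched off.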


[Figure 4.22 plot: velocity (m/s) versus depth (km).]

Figure 4.22: Standard waveform inversion at 7 Hz showing the true model (grey), the starting model (dotted line) and the result of the inversion (solid line). The inversion fails since the starting model is not close enough to the global minimum.

4.6 Numerical experiments

4.6.1 1-D velocity model

The efficacy of the strategy of preconditioning both the gradient vector and the data residuals is tested on the 1-D Marmousi velocity model. The 1-D starting model is the central trace of the smooth 2-D “FAST” model obtained in chapter 3, section 3.6.2. The issue of convergence into a local minimum may be illustrated by carrying out 40 iterations of the standard waveform inversion at 7 Hz, as shown in Figure 4.22. Using such a starting model and inverting the entire wavefield (all times, all offsets), the inversion fails to provide an accurate velocity model. The preconditioned inversion using the 2 stage approach dramatically improves the chance of convergence to the global minimum (Figure 4.23). The first stage consists of inverting from the near to the far offset data, with a sliding window selecting the data residuals within a given offset range. Time damping of the data residuals and wavenumber low-pass filtering of the gradient vector are used (the values of the parameters are detailed in the table of Figure 4.23). The parameter values were chosen empirically.


[Figure 4.23 panels a-d: velocity (m/s) versus depth (km). Panels a-c belong to Stage 1 and panel d to Stage 2.]

                                        a          b           c            d
Offset (m)                              200-3000   3000-6000   6000-10000   200-3000
Time damping: τ (s)                     0.25       0.25        0.25         No
Gradient filtering: k_max, ∆k (km⁻¹)    1, 2       1, 2        1, 2         No

Figure 4.23: 1-D waveform inversion at 7 Hz with preconditioning of the data residuals and gradient vector. a-c) Stage 1: recovery of the low wavenumbers by inverting the early arrivals using time damping and gradient wavenumber filtering. The inversion is carried out from the near to the far offsets. d) Stage 2: recovery of the high wavenumbers by inverting the full wavefield of the near offset data. The true model is shown as a grey line, the starting model as a dotted line and the result of the inversion as a solid line.


[Figure 4.24 plot: velocity V (m/s) as a function of depth (km) and distance (km).]

Figure 4.24: 2-D extended Marmousi model. The original model is contained in the black frame.

4.6.2 The 2-D extended Marmousi model

In order to test the efficiency of the preconditioned inversion on a 2-D velocity model, a dense, large offset, surface acquisition survey was carried out in an extended version of the original 2-D Marmousi model (Figure 4.24). The original 2-D Marmousi experiment defined in Chapter 3 (section 3.6.1.1) was carried out with an OBC acquisition geometry in which the very far offsets (~9 km) were very poorly represented. The original model was therefore duplicated on each side of the model to create an 18 km long model. In this new extended Marmousi model, 187 shot gathers were modelled using a finite difference solver of the acoustic wave equation. The maximum offset present in the data is 10 km. The origin of the horizontal coordinate (x = 0 km) is set at the beginning of the original model.

4.6.2.1 Inversion starting from the “FAST” model

The “FAST” starting model, which was obtained from first arrival traveltime tomography (see Chapter 3, section 3.6.2), was also extended in order to be used in this new wide angle inversion experiment. Figure 4.25 shows the result of the standard waveform inversion at 7 Hz starting from the “FAST” model. The inversion fails because the starting model is not close enough to the global minimum of the misfit function.

In order to improve the convergence of the waveform inversion starting at 7 Hz, the two-stage approach of the previous section is applied. The time damping of late arrivals is progressively decreased by gradually increasing the value of τ. At the same time, the smoothing of the gradient is relaxed by progressively allowing the update of higher wavenumbers. The different stages of the inversion are detailed in Table 4.3 and the results are shown in Figure 4.26. The reconstruction of the model is correct down to 1.5 km depth, beyond which the velocities are not well recovered. The velocity profiles (Figure 4.27) confirm that the velocities are reasonably well estimated down to 1.5 km depth. The time domain modelling in the true model and in the inversion result is shown for 3 shot gathers in Figure 4.28. A close look at the shot gathers shows that many traveltime mismatches are present, especially at the far offsets, which implies that cycle skipping occurs in the residuals.

[Figure 4.25: velocity map. Colour scale: V (m/s), 1500 to 3500. Vertical axis: Depth (km), 0 to 2.0. Horizontal axis: Distance (km), 0 to 9.]

Figure 4.25: Standard waveform inversion at 7 Hz. Only the part of the extended model corresponding to the original model is shown.

             Offset windowing (km)     Time damping τ (s)    Gradient smoothing k_max, ∆k (km⁻¹)
Stage 1.1    0.2-3 ⇒ 3-6 ⇒ 6-10        0.1                   1, 2
Stage 1.2    0.2-3 ⇒ 3-6 ⇒ 6-10        0.25                  2, 2
Stage 1.3    0.2-3 ⇒ 3-6 ⇒ 6-10        0.5                   3, 2
Stage 2      0.2-5                     X                     X

Table 4.3: Strategy of preconditioning for the waveform inversion for the 2-D extended Marmousi experiment. For each stage, the inversion proceeds from the near to the far offsets.
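The two preconditioning operators of Table 4.3 can be illustrated with a minimal sketch. It assumes that time damping weights each residual trace by exp(-(t - t0)/τ) after a reference time t0, and that the pair k_max, ∆k describes a low-pass filter with a pass band up to k_max followed by a cosine taper of width ∆k. The function names, the parameter t0, the taper shape and the reading of km⁻¹ as cycles per km are assumptions made for illustration only; they are not taken from the implementation used in this work.

```python
import numpy as np

# Stage 1 rows of Table 4.3: (tau in s, k_max and dk in km^-1).
STAGE_1 = [(0.1, 1.0, 2.0), (0.25, 2.0, 2.0), (0.5, 3.0, 2.0)]
# Offset windows (km), inverted from near to far within each stage.
OFFSET_WINDOWS_KM = [(0.2, 3.0), (3.0, 6.0), (6.0, 10.0)]


def damp_late_arrivals(residual, dt, t0, tau):
    """Weight a residual trace by exp(-(t - t0) / tau) after the time t0.

    t0 is a hypothetical reference time (for instance a first arrival time).
    """
    t = np.arange(residual.shape[-1]) * dt
    weights = np.exp(-np.clip(t - t0, 0.0, None) / tau)
    return residual * weights


def lowpass_gradient(gradient, dz, k_max, dk):
    """Keep wavenumbers up to k_max and taper to zero over a band of width dk."""
    k = np.abs(np.fft.fftfreq(gradient.size, d=dz))  # cycles per km if dz is in km
    taper = np.zeros_like(k)
    taper[k <= k_max] = 1.0
    ramp = (k > k_max) & (k < k_max + dk)
    taper[ramp] = 0.5 * (1.0 + np.cos(np.pi * (k[ramp] - k_max) / dk))
    return np.real(np.fft.ifft(np.fft.fft(gradient) * taper))


def stage_1_schedule():
    """Yield (tau, k_max, dk, offset_window) in the order of Table 4.3."""
    for tau, k_max, dk in STAGE_1:
        for window in OFFSET_WINDOWS_KM:
            yield tau, k_max, dk, window
```

A larger τ keeps more of the late arrivals and a larger k_max lets higher wavenumbers into the model update, which is how the schedule relaxes both constraints from Stage 1.1 to Stage 1.3.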

The FAST model was created in the original 2-D Marmousi model with an OBC acquisition geometry. Because of the limited length of the model, this experiment suffers from a lack of dense, large offset information (the maximum offset of 9.2 km was represented only twice in the acquisition, at the first and last shots). This causes the deeper part of the model to be poorly constrained by the traveltime inversion. Hence the starting model did not prevent cycle skipping of the early arrivals of the very far offset data. Since cycle skipping produces convergence into a local minimum, the recovery of the low wavenumbers in the deep part of the model is inaccurate. The strategy of inverting from the near to the far offset data did not prevent cycle skipping at the far offsets. This may be explained by the fact that diving waves travel horizontally and are thus very sensitive to velocity errors at a fixed depth, which cannot be fully compensated by a layer stripping approach.

[Figure 4.26: five velocity maps, panels a) to e). Colour scale: V (m/s), 1500 to 3500. Vertical axis: Depth (km), 0 to 2.0. Horizontal axis: Distance (km), 0 to 9.]

Figure 4.26: Preconditioned inversion at 7 Hz starting from the extended “FAST” model. a) Near to far offset, τ = 0.1 s, Filter(k) = 1, 2 km⁻¹. b) Near to far offset, τ = 0.25 s, Filter(k) = 2, 2 km⁻¹. c) Near to far offset, τ = 0.5 s, Filter(k) = 3, 2 km⁻¹. d) Offset 2-5 km, no preconditioning. e) True model.

[Figure 4.27: three 1-D velocity profiles, panels a) to c). Horizontal axis: Velocity (m/s), 1500 to 4000. Vertical axis: Depth (m), 0 to 2000.]

Figure 4.27: Velocity profiles at 3 locations in the model showing the true model (grey), the starting model (dotted) and the result of the preconditioned waveform inversion (solid). a) At x = 2.2 km. b) At x = 4.6 km. c) At x = 6.9 km.
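The sensitivity of far offset diving waves to velocity errors can be made concrete with a rough calculation. The sketch below treats the diving wave as a straight horizontal path in a layer of constant velocity, which is a strong simplification, and uses the usual rule of thumb that an arrival is cycle skipped once its traveltime error exceeds half a period. The path length, the layer velocity and the function names are hypothetical values chosen for illustration.

```python
def horizontal_traveltime_error(path_length_m, v_true, v_start):
    """Traveltime difference (s) along a horizontal path of the given length."""
    return abs(path_length_m / v_start - path_length_m / v_true)


def is_cycle_skipped(time_error_s, frequency_hz):
    """Rule of thumb: cycle skipping occurs when the error exceeds half a period."""
    return time_error_s > 0.5 / frequency_hz


# Hypothetical numbers: a 10 km horizontal path in a 3000 m/s layer, at 7 Hz.
for relative_error in (0.01, 0.02, 0.03):
    dt = horizontal_traveltime_error(10_000.0, 3000.0, 3000.0 * (1.0 + relative_error))
    print(f"{relative_error:.0%} velocity error: {dt * 1e3:.0f} ms, "
          f"cycle skipped at 7 Hz: {is_cycle_skipped(dt, 7.0)}")
```

Under these assumptions the half period at 7 Hz is about 71 ms, and a velocity error between 2% and 3% along the path is already enough to exceed it.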

4.6.2.2 Inversion starting from an improved starting model

The previous inversion experiment yielded a poor reconstruction of the deep velocities because the deeper part of the starting model was insufficiently accurate. In order to test the inversion strategy with a starting model that contains more accurate low wavenumbers, a waveform inversion is carried out starting from a smoothed version of the true model. The true extended Marmousi model was thus smoothed using the program smooth2 of the SU package (Figure 4.29a). The result of the standard gradient inversion is shown in Figure 4.29b. Despite the fact that the starting model now contains accurate low wavenumbers of the true model, the standard inversion fails to locate the global minimum. Figure 4.29c shows that when the preconditioning tools are applied according to the strategy detailed in the previous section, the inversion result is dramatically improved.

4.7 Conclusion

For a smooth starting model and a starting frequency of 7 Hz, the waveform inversion can easily fail and converge into a local minimum. Preconditioning operators must therefore be applied to the gradient vector and data residuals in order to improve the linearity of the inverse problem. The SVD indicates that the high wavenumbers dominate the gradient vector. Thus, when wide angle seismic data are inverted, the migration regime of the waveform inversion carries a stronger weight than the tomographic reconstruction of the model. Since the low wavenumbers need to be accurate to allow proper reconstruction of the higher wavenumbers, the preconditioning operator should ensure that the low wavenumbers have fully converged before the high wavenumbers are recovered. This can be achieved by smoothing the gradient vector. The determination of the low wavenumbers is nevertheless delicate, as they correspond to the most non-linear components of the model. Time damping can be applied to the data residuals in order to improve the linearity of the low wavenumber reconstruction.

[Figure 4.28: three pairs of shot gathers, labelled True Shot (left) and Inversion Shot (right), for shots at x = 0 km, x = 4.6 km and x = 9.2 km. Horizontal axis: Offset (km). Vertical axis: Time (s), 0 to 7.]

Figure 4.28: Shot gathers at 3 locations in the true model (left) and in the result of the preconditioned inversion (right). Amplitudes are normalised.

[Figure 4.29: four velocity maps, panels a) to d). Colour scale: V (m/s), 1500 to 3500. Vertical axis: Depth (km), 0 to 2.0. Horizontal axis: Distance (km), 0 to 9.]

Figure 4.29: Inversion at 7 Hz starting from an improved macro model. a) Starting model. b) Standard waveform inversion. c) Preconditioned inversion. d) True model.

The use of preconditioning tools in the 2-D extended Marmousi experiment, with dense, large offset data, has been shown to significantly improve the convergence accuracy of the waveform inversion. However, this experiment will need to be performed with a starting model created from a first arrival traveltime inversion in order to fully evaluate the efficacy of the preconditioning strategy in realistic conditions. The “FAST” model used in this study was in fact produced in the original Marmousi model with an OBC acquisition geometry in which the large offsets are poorly represented.


Chapter 5

Waveform inversion and starting models: the sub-basalt imaging problem

5.1 Introduction

The previous chapter showed that the non-linearity of the waveform inverse problem may be mitigated by using appropriate preconditioning techniques. This preconditioning is required because, at high frequencies, the topography of the misfit function in the neighbourhood of a smooth starting model is likely to guide gradient methods into a local minimum. The linearity may of course always be enhanced by improving the accuracy of the starting model. This raises the important question of which requirements the starting model must meet to allow convergence of the waveform inversion to the global minimum, rather than to a local minimum. Since the starting model is to be obtained from standard techniques such as stacking velocity analysis or traveltime tomography, the accuracy of the starting model is bounded by limitations in resolution, i.e. by the maximum wavenumber that can be recovered using standard methods. Both the requirements of waveform inversion and the limitations of standard techniques will govern the success of the waveform inversion.

In order to investigate the requirements of the starting model in the case of standard waveform inversion (without preconditioning), I propose in this chapter to carry out a linearity study by analysing the evolution of the misfit function with respect to the smoothness of the true model. Since the inverse problem is highly dependent on the characteristics of the model to be recovered, I will perform the study on a geophysical case that has drawn increasing interest over the past few years: the sub-basalt imaging problem.

The particularities of the sub-basalt imaging problem will first be introduced. The linearity study will then be implemented on a 1-D velocity model. This study will provide useful indications of the characteristics of a suitable starting model for waveform inversion. I will then discuss the limitations of waveform inversion when applied to such an imaging problem.

5.2 The sub-basalt imaging problem

In many areas of high hydrocarbon prospectivity, extensive lava flows covered sedimentary basins during continental breakup. The basalt may take the form of thin sills intruded in the sediments or of a very thick layer of extruded lava (Planke et al., 1999). The main geographic area of interest is the northeast Atlantic, particularly around the Faeroe Islands region where the Norwegian margin has been strongly affected by volcanism, although similar breakups have occurred along the western Australian, the western African and the Brazilian margins (White and D., 1989). In these areas, sub-basalt sediments may contain significant hydrocarbon reservoirs that have not yet been properly imaged because of the screening effect of the basalt. Many phenomena are involved in the screening of sub-basalt reflections and the full complexity of the imaging problem is yet to be understood. The effects discussed below have, however, been identified as causing standard near offset seismic data to fail to provide an accurate image of the basalt and sub-basalt structures.

5.2.1 Transmission loss at the top basalt interface and multiples

The main feature of sub-basalt imaging is the strong velocity contrast at the top basalt interface. Basalts are associated with much higher P-wave velocities (V_p > 5 km/s) than the overburden sediments. This contrast is responsible for a very strong transmission loss of the incident wave at the basalt interface. The weak amplitude of intra-basalt and sub-basalt reflections is made worse by the presence of very strong free surface and internal multiples in the data.

In recent years, large offset (wide angle) seismic acquisition has been considered in an attempt to overcome some of the difficulties of near offset data (Jarchow et al., 1994; Samson et al., 1995; Fruehn et al., 1998). At large offsets, wide angle reflections, diving waves and refractions may be exploited as they carry stronger amplitudes than the near offset data. These arrivals also have the advantage of being separated in time from the free surface multiples.


[Figure 5.1: ray path sketch through three layers (Sediment, Basalt, Sub-basalt), with each leg of the paths labelled P or S.]

Figure 5.1: Converted waves in sub-basalt imaging. The PPPPPP ray path is expected to be weak in amplitude because of the transmission loss at the top basalt interface. The symmetric conversions PSSSSP (dotted) and PSPPSP (grey) are therefore considered (after Li et al. (1998)).

The ray based traveltime inversion method of Zelt and Smith (1992) is often used and provides information on the long wavelengths of the velocity structure (Hughes et al., 1998). The velocity model produced by such methods may be combined with near incidence data to recover the fine scale of the basalt structure (Fruehn et al., 2001). However, the use of traveltime techniques requires the picking of events that must be carefully associated with horizons and layers in the model. The top of the basalt interface is easy to identify, and diving wave traveltime information may be used for the determination of the internal basalt velocity gradient. The identification of the intra-basalt and sub-basalt reflections is much more delicate. However, Fliedner and White (2001) proposed using the amplitude versus offset (AVO) information of the diving waves to obtain information on the basalt thickness and the internal velocity gradient.

As an alternative to the use of refracted and diving wave information, the exploitation of converted waves has also been investigated. The strong reflectivity of the top basalt interface allows an efficient mode conversion beyond a certain critical angle, from an incident P-wave to an S-wave travelling within the basalt (Purnell, 1992). Two main types of symmetric mode conversion have been considered: the PSPPSP and the PSSSSP waves (Figure 5.1, after Li et al. (1998)). The interpretation of the converted waves is nevertheless very delicate as they are very difficult to discriminate from internal multiples.
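As an order of magnitude for the critical angle mentioned above, Snell's law gives sin θ_c = V_upper / V_lower for a wave incident from a slower medium onto a faster one. The short sketch below assumes that the relevant angle is the one at which the transmitted P-wave in the basalt becomes evanescent; the sediment velocity of 2500 m/s is a hypothetical value, and 5000 m/s is simply the lower bound quoted earlier for basalt.

```python
import math


def critical_angle_deg(v_upper, v_lower):
    """Critical angle (degrees from the vertical) at an interface where the
    lower medium is faster than the upper one, from Snell's law."""
    if v_lower <= v_upper:
        raise ValueError("no critical angle unless the lower medium is faster")
    return math.degrees(math.asin(v_upper / v_lower))


# Hypothetical sediment P velocity over a basalt with V_p = 5000 m/s.
theta_c = critical_angle_deg(2500.0, 5000.0)
```

With these illustrative velocities the critical angle is close to 30 degrees, so the conversion regime is only reached by rays that leave the source well away from the vertical, that is at intermediate to large offsets.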


5.2.2 Interface scattering at the top of the basalt

In addition to the strong velocity contrast, the scattering caused by irregularities of the top basalt interface is presumed to play a significant role in the screening of the sub-basalt reflections. The effect of the roughness of this interface on the wave propagation may be mitigated by using wave equation datuming techniques (Martini and Bean, 2002). This approach consists of a downward continuation of the data to below the top basalt interface. The downward continuation is performed using a model containing a rough interface that was determined from a stack section. The data are then back-propagated upward to the surface using a flat top basalt interface. This technique was shown to improve the coherence of sub-basalt reflections.

5.2.3 Body wave scattering within the basalt

The basalt layer presents strong heterogeneities that cause scattering of the wavefront propagating within the basalt (Lafond et al., 1999; Martini et al., 2000). The effect of scattering is often considered as noise and is referred to as “apparent” attenuation since it is responsible for some of the loss of seismic wave amplitude in transmission (Gibson and Levander, 1988). Since the scattering affects the high frequencies more strongly (Ziolkowski and Fokkema, 1986), seismic acquisition using low frequency sources (~5-7 Hz) is now being seriously considered to mitigate the screening due to scattering (Ziolkowski et al., 2001).

5.2.4 Sub-basalt imaging and waveform inversion

Many additional problems are likely to be associated with the sub-basalt imaging problem (anisotropy, 3D effects...). However, the combination of large offsets and lower frequencies is now typically present in modern sub-basalt seismic data. This offers real potential for the use of full waveform inversion techniques, for the reasons described in the previous chapters: while the linearity of the waveform inverse problem is very much improved at low frequencies, the far offsets allow wide angle illumination and provide information about the low wavenumbers. Shipp and Singh (2002) applied time domain, elastic waveform inversion to a sub-basalt data set. The starting model was created using ray-based traveltime inversion (Zelt and Smith, 1992). The velocity model provided by this experiment was limited to a determination of the overburden sediments and the basalt layer itself. The bottom basalt interface and the sub-basalt layers were not imaged. The reconstruction nevertheless determined some of the basalt structure and showed a good fit between the synthetic and the observed data, suggesting that the inversion converged to a model reasonably close to the global minimum.

The work presented here aims to further understand the difficulties, as well as the imaging potential, that should be expected when applying waveform inversion to the sub-basalt imaging problem. Many of the screening effects described previously will be neglected, as only the nature of the waves propagating in each layer will be taken into account.

5.3 The 1-D sub-basalt numerical experiment

The velocity model chosen to represent the sub-basalt imaging problem in this chapter is shown in Figure 5.2a. The model comprises a water layer (1 km thick), a sediment layer (1.5 km thick), a basalt layer (2 km thick) and the sub-basalt sediments. Seismic data were generated in this model using a finite difference solver of the acoustic wave equation. Only a single shot gather is modelled (Figure 5.2b); the characteristics of the modelling are described in Table 5.1. The free surface multiples were not modelled in this experiment.

Number of shots                   1
Number of receivers               601
Shot/receiver depth               9 m
Receiver spacing                  25 m
Offset range                      0-15 km
Free surface on top               No
Frequency range                   0.1-10 Hz
Number of frequencies modelled    99
Grid size (n_x × n_z)             355 × 140
Grid spacing                      45 m

Table 5.1: Synthetic data characteristics.
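The entries of Table 5.1 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the receivers start at zero offset and that the extent of the grid is measured between its first and last nodes; whether the grid includes any absorbing boundary layers is not stated in the table, so the figures are indicative only. The variable names are chosen for illustration.

```python
# Values copied from Table 5.1.
n_receivers, receiver_spacing_m = 601, 25.0
nx, nz, grid_spacing_m = 355, 140, 45.0

# Distance between the first and last receiver, and between the first and last grid nodes.
max_offset_m = (n_receivers - 1) * receiver_spacing_m
model_length_m = (nx - 1) * grid_spacing_m
model_depth_m = (nz - 1) * grid_spacing_m

print(max_offset_m, model_length_m, model_depth_m)
```

The receiver spread of 15 km matches the stated offset range and fits within the width of the grid (about 15.9 km), while the depth of the grid (about 6.3 km) covers the 4.5 km of water, sediment and basalt plus part of the sub-basalt sediments.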


[Figure 5.2: panel a) velocity profile, horizontal axis Velocity (m/s), 2000 to 5000, vertical axis Depth (km), 0 to 6; panel b) shot gather, horizontal axis Offset (km), 0 to 15, vertical axis Time (s), 0 to 9, with the events WBR, TBR and BBR labelled.]

Figure 5.2: The sub-basalt synthetic data. a) Velocity model. b) Shot gather in the true model for a Ricker source wavelet with a peak frequency of 4 Hz. The water bottom reflection (WBR), the top basalt reflection (TBR) and the bottom basalt reflection (BBR) are identified on the right panel. The free surface multiples are not present in the data.


5.4 Inversion starting from a “correct” macro model

In order to illustrate the difficulties of waveform inversion applied to sub-basalt imaging, the results of single frequency waveform inversions starting from the correct macro model (low wavenumbers) are shown in Figure 5.3 for the frequencies 3, 5 and 7 Hz. The starting model is a smoothed version of the true model (the water bottom discontinuity is not smoothed and its depth is therefore assumed to be known). The final model of the inversion at 3 Hz (Figure 5.3a) is close enough to the global minimum to allow a sequential inversion of higher frequencies (Figure 5.3d). The inversion fails when starting at 5 or 7 Hz (Figure 5.3b and c), which precludes any inversion that starts at higher frequencies. In order to investigate the origin of this behaviour, the next section studies the misfit function for varying degrees of smoothness of the true model.
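The construction of such a starting model, smoothed everywhere except at the water bottom, can be sketched as follows. The example uses a simple moving average as a stand-in for the smoothing operator; it is not the operator applied by smooth2, and the layer velocities and the sample counts in the example model are hypothetical.

```python
import numpy as np


def smooth_below_water_bottom(velocity, water_bottom_index, half_width):
    """Moving-average smoothing of a 1-D profile that leaves the water layer,
    and therefore the water bottom discontinuity, untouched."""
    v = np.asarray(velocity, dtype=float).copy()
    if half_width <= 0:
        return v
    below = v[water_bottom_index:]
    kernel = np.ones(2 * half_width + 1) / (2 * half_width + 1)
    # Edge padding keeps the output the same length as the input segment.
    padded = np.pad(below, half_width, mode="edge")
    v[water_bottom_index:] = np.convolve(padded, kernel, mode="valid")
    return v


# Hypothetical blocky model: water, sediments, basalt, sub-basalt sediments.
true_model = np.concatenate([
    np.full(20, 1500.0), np.full(30, 2500.0),
    np.full(40, 5000.0), np.full(30, 3500.0),
])
starting_model = smooth_below_water_bottom(true_model, water_bottom_index=20, half_width=5)
```

Increasing `half_width` plays the role of a larger smoothing factor: the top and bottom basalt interfaces are spread over more samples while the water bottom stays sharp.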

5.4.1 Misfit function and model smoothness

Figure 5.4b shows the misfit functions for 3, 5 and 7 Hz as the entire model (sediments, basalt and sub-basalt) is decreasingly smoothed (made increasingly accurate), as shown in Figure 5.4a. The smoothing was carried out using the program smooth2 of the Seismic Un*x package (Stockwell, 1997). The degree of smoothness is defined by a smoothing factor: the model gets smoother as the smoothing factor increases. The smoothest model (smoothing factor of 10) is the macro model used as a starting model in the previous inversion tests.

The 3 Hz misfit function shows a continuous decrease with decreasing smoothness, whereas for 5 and 7 Hz the function presents a local maximum. The latter strongly suggests the presence of a local minimum in the misfit function. Any starting model located on the left side of this maximum is therefore likely to produce cycle skipping in the data and to cause the waveform inversion to converge to a local minimum. This behaviour explains the poor results of the previous inversions at 5 and 7 Hz, since the starting model is located on the wrong side of the maximum for these frequencies.
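The appearance of a local maximum at the higher frequencies can be reproduced qualitatively with a toy analogue, which is not the computation behind Figure 5.4. The sketch assumes a single monochromatic arrival of unit amplitude whose only error is a time delay, so that the misfit is |1 - exp(iωΔt)|² = 2 - 2 cos(ωΔt). The range of delays (0 to 0.12 s) is a hypothetical choice standing in for the traveltime error introduced by the smoothing.

```python
import numpy as np


def single_frequency_misfit(delay_s, frequency_hz):
    """Squared residual between two unit-amplitude monochromatic arrivals
    separated by delay_s; equal to 2 - 2 cos(2 pi f delay)."""
    phase = 2.0 * np.pi * frequency_hz * np.asarray(delay_s, dtype=float)
    return np.abs(1.0 - np.exp(1j * phase)) ** 2


def has_interior_maximum(values):
    """True if a sample is strictly larger than both of its neighbours."""
    v = np.asarray(values)
    return bool(np.any((v[1:-1] > v[:-2]) & (v[1:-1] > v[2:])))


# Hypothetical range of delays, from an accurate model (0 s) to a smooth one (0.12 s).
delays = np.linspace(0.0, 0.12, 121)
for f in (3.0, 5.0, 7.0):
    print(f, has_interior_maximum(single_frequency_misfit(delays, f)))
```

In this analogue the misfit grows monotonically with the delay as long as the delay stays below half a period, which holds over the whole range at 3 Hz but not at 5 or 7 Hz. A model whose delay lies beyond the maximum sees the misfit decrease as the error grows, which is the situation described above for a starting model on the wrong side of the maximum.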

5.4.2 The top basalt and bottom basalt interfaces<br />

In or<strong>de</strong>r to i<strong>de</strong>ntify the main origin of the non-linearity in the previous section, the same experiment<br />

is carried out again, but in this case, only the sediment and top basalt interface are<br />

<strong>de</strong>creasingly smoothed (Figure 5.5a). The misfit functions in Figure 5.5b are i<strong>de</strong>ntical in shape


142 CHAPTER 5. WAVEFORM INVERSION AND STARTING MODEL<br />

Figure 5.3: Inversion starting from an accurate macro-model. Velocity models (velocity in m/s against depth in km) showing the true model (grey), the starting model (dotted) and the result of inversion (solid) for a) 3 Hz, b) 5 Hz and c) 7 Hz. d) The result in (a) is close enough to the global minimum to allow sequential inversion of higher frequencies (5 and 7 Hz).



Figure 5.4: Evolution of the misfit function as the entire model is decreasingly smoothed. a) True model (black) and the true model decreasingly smoothed (grey), shown as velocity (m/s) against depth (km). b) Misfit functions against smoothing factor, normalised with respect to the initial value, for the frequencies 3, 5 and 7 Hz.

to the misfit functions presented when the entire model is smoothed (Figure 5.4b). The intra-basalt and sub-basalt layers appear relatively unimportant in governing the topology of the misfit function. This leads to the conclusion that the top basalt interface is the main origin of the failure of the waveform inversion and that the top basalt interface needs to be discontinuous (blocky) when using frequencies of 5 and 7 Hz (or higher). Since the bottom basalt interface corresponds to a velocity contrast of the same order as the top of the basalt, it seems reasonable to think that this interface should also be discontinuous.

5.5 Wavefield separation by layer stripping

The non-linearity of the sub-basalt waveform inverse problem is apparently caused by the top basalt interface. This interface creates the strongest reflection in the data and is therefore the easiest to determine. Even if the top of the basalt is known in depth and velocity, some non-linearity issues remain for each part of the model. In order to investigate the non-linearity associated with each layer, a separation of the wavefield is carried out by implementing an idealised layer stripping strategy, in which the model is decomposed into the sediment, the basalt and the sub-basalt layers, as shown in Figure 5.6. The wavefield of the basalt layer (Figure 5.6d) was obtained by subtracting the wavefield in the sediment (b) from the wavefield generated



Figure 5.5: Evolution of the misfit function as the sediment and top basalt interface are decreasingly smoothed. a) True model (black) and the true model decreasingly smoothed (grey), shown as velocity (m/s) against depth (km). b) Normalised misfit function against smoothing factor at 3, 5 and 7 Hz.

in the model in (c). The wavefield in the sub-basalt in (f) was obtained by subtracting the wavefield generated in (c) from the total wavefield (Figure 5.1b). The sum of these wavefields is the total wavefield shown in Figure 5.2b.
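The subtraction scheme described above can be sketched in a few lines. This is a minimal illustration, assuming that three synthetic gathers on the same time/offset grid are already available (sediment only, sediment plus basalt, and the full model); the function name `strip_layers` and the random placeholder arrays are hypothetical and do not come from the thesis.

```python
import numpy as np

def strip_layers(d_sediment, d_sediment_basalt, d_total):
    """Split a total wavefield into per-layer contributions by subtraction.

    d_sediment        : data modelled with the sediment only (panel b)
    d_sediment_basalt : data modelled with sediment and basalt (model in c)
    d_total           : data modelled in the complete model
    """
    d_sediment = np.asarray(d_sediment, dtype=float)
    d_sediment_basalt = np.asarray(d_sediment_basalt, dtype=float)
    d_total = np.asarray(d_total, dtype=float)
    d_basalt = d_sediment_basalt - d_sediment   # basalt contribution (panel d)
    d_sub_basalt = d_total - d_sediment_basalt  # sub-basalt contribution (panel f)
    return d_sediment, d_basalt, d_sub_basalt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (200, 16)  # placeholder (time samples, offsets)
    d_b = rng.normal(size=shape)
    d_c = d_b + rng.normal(size=shape)
    d_tot = d_c + rng.normal(size=shape)
    sed, bas, sub = strip_layers(d_b, d_c, d_tot)
    # By construction the three parts add back up to the total wavefield.
    print(np.allclose(sed + bas + sub, d_tot))
```

The separation is exact only because each wavefield is modelled in a nested version of the same model, which is what makes the strategy idealised.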

5.5.1 The sediment layer

Figure 5.7 shows the misfit function as the sediment layer is decreasingly smoothed. The water bottom interface is assumed to be known and remains unsmoothed. The starting point of the experiment (smoothing factor of 50) is a quasi-homogeneous layer. The 3 Hz misfit function decreases monotonically, whereas the 5 and 7 Hz misfit functions encounter a local maximum. A satisfactory starting model for 5 or 7 Hz should therefore be located beyond the maximum of the misfit function to ensure convergence toward the global minimum.
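Reading the admissible smoothness off such a curve can be automated. The sketch below assumes the misfit has been sampled at a few smoothing factors, with smoothing factor 0 taken to be the true model, and returns the position of the local maximum closest to the true model. The function name `smoothing_limit` and all numerical values are invented for illustration; they are not read from Figure 5.7.

```python
import numpy as np

def smoothing_limit(smoothing, misfit):
    """Return the smoothing factor of the local maximum of a sampled misfit
    curve that lies closest to the true model (smallest smoothing factor),
    or None if the curve has no interior local maximum."""
    s = np.asarray(smoothing, dtype=float)
    e = np.asarray(misfit, dtype=float)
    order = np.argsort(s)  # walk outwards from the least-smoothed model
    s, e = s[order], e[order]
    for i in range(1, len(s) - 1):
        if e[i] > e[i - 1] and e[i] >= e[i + 1]:
            return float(s[i])
    return None

if __name__ == "__main__":
    smoothing = [50, 40, 30, 20, 10, 0]
    misfit_3hz = [1.0, 0.9, 0.75, 0.5, 0.2, 0.0]  # invented, monotonic
    misfit_7hz = [1.0, 1.2, 1.6, 1.3, 0.6, 0.0]   # invented, with a maximum
    print(smoothing_limit(smoothing, misfit_3hz))
    print(smoothing_limit(smoothing, misfit_7hz))
```

A starting model would then need a smoothing factor smaller than the returned value; when the function returns None, the sampled curve gives no evidence of a local minimum.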

5.5.2 The basalt layer

Figure 5.8 shows the misfit function as the intra-basalt velocities are decreasingly smoothed. The misfit functions at 3 and 5 Hz both decrease towards the true model, while the 7 Hz misfit function encounters a local maximum. This maximum at 7 Hz, however, occurs for a smoother model than for the sediment layer. This can be explained by the fact that, as described in Chapter 4,



Figure 5.6: Illustration of the idealised layer stripping strategy showing the part of the model involved in grey (a, c, e; velocity in m/s against depth in km) and the corresponding wavefield (b, d, f; time in s against offset in km). a, b) The sediment only and the corresponding wavefield. c, d) The basalt. e, f) The sub-basalt. The sum of the wavefields in b, d, f is equal to the total wavefield shown in Fig. 5.2. Amplitudes are normalised.



Figure 5.7: Evolution of the misfit function as the sediment layer is decreasingly smoothed. a) True model (black) and the true model smoothed (grey), shown as velocity (m/s) against depth (km). b) Normalised misfit function against smoothing factor at 3, 5 and 7 Hz.

section 4.3.2, the non-linearity decreases with increasing velocity. As a result, the requirements for the starting model are less demanding in the basalt than in the sediment layer.
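One hedged way to read this result, which may differ from the argument of section 4.3.2, is to count wavelengths: for a fixed frequency and path length, a faster layer is crossed in fewer wavelengths, so the same relative velocity error produces a smaller phase error. The function names, the 2 km path, the 5% error and the two velocities below are illustrative assumptions, not values from the experiment.

```python
def wavelengths_propagated(freq_hz, path_m, velocity_ms):
    """Number of wavelengths along a straight path in a constant-velocity layer."""
    return freq_hz * path_m / velocity_ms

def phase_error_cycles(freq_hz, path_m, velocity_ms, rel_velocity_error):
    """Traveltime error, expressed in cycles, caused by a relative velocity error."""
    t_true = path_m / velocity_ms
    t_wrong = path_m / (velocity_ms * (1.0 + rel_velocity_error))
    return freq_hz * abs(t_wrong - t_true)

if __name__ == "__main__":
    freq, path, error = 7.0, 2000.0, 0.05  # hypothetical values
    for name, velocity in (("sediment-like", 2000.0), ("basalt-like", 5000.0)):
        n = wavelengths_propagated(freq, path, velocity)
        e = phase_error_cycles(freq, path, velocity, error)
        print(f"{name}: {n:.1f} wavelengths, phase error {e:.3f} cycles")
```

In this simplified picture the phase error scales inversely with velocity, so the faster layer stays further from the half-cycle limit associated with cycle skipping.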

5.5.3 The sub-basalt layer

Figure 5.9 shows the misfit functions as the sub-basalt velocities are decreasingly smoothed. Although all misfit functions are monotonically decreasing, the data at 5 and 7 Hz show very little sensitivity to the low and intermediate wavenumbers of the sub-basalt. This can be explained by the presence of the basalt layer, which prevents the wide-angle illumination of the sub-basalt sediments. Therefore, only waves at narrow incidence angles propagate within the sub-basalt sediments, and these provide information only about the high wavenumbers.
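The link between incidence angle and wavenumber invoked here can be written compactly. The expression below is the usual plane-wave, homogeneous-background relation for a horizontal reflector; it is assumed to be consistent with the result of Chapter 3, which is not reproduced in this section. The symbols k_0, c_0, θ, h and z are notation introduced for this sketch.

```latex
k_z = 2 k_0 \cos\theta, \qquad
k_0 = \frac{\omega}{c_0}, \qquad
\cos\theta = \frac{z}{\sqrt{h^2 + z^2}}
```

Here θ is the incidence angle at depth z for a half-offset h. Near-vertical incidence (θ close to 0) gives k_z close to 2 k_0, the highest vertical wavenumber available at that frequency, whereas low wavenumbers require large θ, that is, the wide-angle paths that the basalt layer blocks.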

5.6 Waveform inversion and starting model requirements

5.6.1 Frequency dependence of the starting model requirements

The analysis of the misfit function curves with respect to the model smoothness allows us to specify the degree of a priori information that assures success of the waveform inversion in the context of a layer stripping strategy. Figure 5.10 shows the minimum accuracy (maximum



Figure 5.8: Evolution of the misfit function as the basalt layer is decreasingly smoothed. a) True model (black) and the true model smoothed (grey), shown as velocity (m/s) against depth (km). b) Normalised misfit functions against smoothing factor at 3, 5 and 7 Hz.

Figure 5.9: Evolution of the misfit function as the sub-basalt layer is decreasingly smoothed. a) True model (black) and the true model smoothed (grey), shown as velocity (m/s) against depth (km). b) Normalised misfit functions against smoothing factor at 3, 5 and 7 Hz.



Figure 5.10: Adequate velocity models for a) the sediment, b) the basalt and c) the sub-basalt, showing the true model (grey) and the velocity models for 3, 5 and 7 Hz (velocity in m/s against depth in km).

allowed smoothness) of the velocity model for an inversion starting at 3, 5 or 7 Hz. A different level of accuracy is required for each of the sediment, basalt and sub-basalt layers. The sediment layer should contain some knowledge of the medium wavenumbers, whereas the basalt layer needs less accuracy, as some bulk information on the vertical gradient appears to be sufficient. The sub-basalt layer is the most demanding, as the main features of the heterogeneities seem to be required.

5.6.2 Waveform inversion at 7 Hz

Figure 5.11 shows a set of waveform inversions for the sediment (a), the basalt (b) and the sub-basalt (c) layers. For each layer, two inversions were carried out: the first from a poor starting model, the second from a good starting model as defined from the misfit function analysis (Figure 5.10). For the sediment and the basalt, the poor and good starting models are on either side of the local maximum of their respective misfit functions. For the sub-basalt, the limit is more difficult to define as no maximum occurs in the misfit functions. The waveform inversion results appear to validate the conclusions drawn from the linearity study. When inverting with a good starting model at 7 Hz, the sediments are recovered very accurately as the offset range is able to reconstruct a wide range of wavenumbers. The imaging in the basalt, although accurate, is limited by a maximum resolution, in turn controlled by the background velocities and the



frequency (7 Hz). For the sub-basalt layer, the starting model needs to contain much of the low and intermediate wavenumbers, as this layer is only illuminated at narrow incidence angles which, as detailed in Chapter 3, provide information only on the high wavenumbers. The sub-basalt wavefield is therefore unable to contribute to the low wavenumbers. The consequence is that the inversion will provide only a migration-like (high wavenumber) image of the sub-basalt structure.



Figure 5.11: Waveform inversion at 7 Hz when the starting model (dotted) leads the convergence towards a local minimum (left, "Poor Starting Model") and close to the global minimum (right, "Good Starting Model"). The true model is shown in grey, and the result of the inversion as a solid line. Panels show velocity (m/s) against depth (km) for (a) the sediment, with smoothing factors of 20 (left) and 6 (right), (b) the basalt, with smoothing factors of 28 (left) and 24 (right), and (c) the sub-basalt, with smoothing factors of 50 (left) and 2 (right).



5.7 Discussion: starting model and standard methods

The analysis of the misfit functions with respect to the model smoothness leads us to conclude that, for a realistic starting frequency of 7 Hz, the starting model should contain discontinuous top and bottom basalt interfaces. Ray-based traveltime methods, such as the one proposed by Zelt and Smith (1992), can easily determine the top-basalt interface. Nevertheless, the determination of the bottom basalt interface, which must be included in the starting model, remains a real issue as this reflection is often difficult to identify in the data due to its weak amplitude and the presence of strong multiples.

The determination of the sediment layer velocity structure may be delicate, as the starting model should contain some information about the medium wavenumbers. This can be explained by the relatively low velocities in the sediments, since non-linearity increases as velocity decreases. Standard methods such as stacking velocity analysis may provide useful information on the vertical velocity gradient of the sediments, but the recovery of the medium wavenumbers will be difficult. The application of preconditioning operators for the determination of the sediment layer is likely to be inefficient, as the sediment wavefield is dominated by reflected and scattered waves and no diving/refracted waves may be exploited. A selection of early arrivals is therefore poorly adapted, and the smoothing of the gradient will involve the far offset data, which are the most non-linear part of the reflection hyperbola. Therefore, the recovery of the sediments using waveform inversion is restricted to a migration-like reconstruction and the result will depend on the accuracy of the starting model. An important point is that inaccurate velocities in the sediments may not be easily noticeable when comparing predicted and observed data. Figure 5.12 shows the wavefield of the sediment in the true model and in the wrong inversion result at 7 Hz (Figure 5.11, left-hand side). The wavefield in the wrong model is difficult to differentiate from the wavefield in the true model, especially at near offset.

The requirements for the starting model are less demanding for the basalt layer, as the wavefield is more linear due to higher velocities. The vertical gradient of the basalt layer may be determined in the starting model by exploiting the traveltime of the diving/refracted waves at far offsets. During the waveform inversion process, time windowing/damping or f-k filtering of the data residuals will be necessary in order to invert for the diving/refracted waveform of the basalt and thus exclude shallower reflections and multiples (as advocated by Shipp and Singh, 2002).

Some imaging limitations exist for the sub-basalt sediments. Because the presence of the



Figure 5.12: Sediment wavefield in time (time in s against offset in km) for a) the true sediments and b) the wrong model recovered by the inversion at 7 Hz starting from a smooth model (see Figure 5.11a, left). The source is a Ricker wavelet with a peak frequency of 4 Hz.

basalt prevents any wide-angle illumination of the sub-basalt target (such as diving or refracted waves), only a migration-like reconstruction (high wavenumber) may be achieved.

5.8 Conclusion

Sub-basalt imaging is an important exploration problem as it is associated with geographical areas that may contain important hydrocarbon reservoirs. The application of waveform inversion is considered since sub-basalt seismic data contain lower frequencies and larger offsets than standard seismic acquisitions.

A linearity study of the waveform inverse problem was carried out on a 1-D experiment representing the sub-basalt imaging problem. The characteristics of an adequate starting model could then be defined. The starting model of the overburden sediments must contain some of the intermediate wavenumbers to allow an accurate reconstruction. The basalt layer is less demanding, as some information on the vertical gradient appears to be sufficient. Only the high wavenumber components of the sub-basalt layer may be recovered, due to the lack of wide-angle information illuminating this target.

This study was carried out in the context of an idealised layer stripping strategy that relies on a perfect separation of the wavefield. Moreover, the free surface multiples were not taken



into account. The characteristics of the required starting model are likely to be more demanding in reality. More realistic layer stripping strategies have yet to be studied. The results of this study nevertheless provide some useful indications as to how accurate the starting model needs to be. These requirements change with the lowest frequency available in the data.

The determination of the starting model is a critical step in the waveform inversion process. The linearity of the waveform inverse problem may be significantly enhanced by improving the accuracy of the starting model. A good starting model should thus ensure that a descent path exists that allows gradient methods to locate the global minimum of the misfit function. This study demonstrates the dependence of the starting model requirements on the starting frequency: the higher the starting frequency is, the more accurate the model needs to be.




Chapter 6

Conclusions

In this thesis, I examined the application of waveform inversion to wide-angle surface seismic data. Since the data frequency bandwidth is limited in terms of its low frequency content, wide-angle data are necessary to determine the medium wavenumber components of the velocity model. The medium wavenumbers cannot be recovered by classical near-incidence-angle acquisition since the macro model, as defined by the standard imaging principle, is typically deficient in the medium wavenumbers. This information is of significant interest because, in complex media, it is required for an accurate recovery of the high wavenumbers. Thus, the objective is to use waveform inversion for the reconstruction of a continuous wavenumber spectrum of the velocity model.

Due to the high computational cost of generating the synthetic data, the waveform inverse problem was solved using local methods based on the calculation of the steepest descent direction of the misfit function. This approach, however, is very sensitive to the issue of local minima associated with the strong non-linearity of the inversion. This non-linearity is aggravated by the introduction of wide-angle/large offset data in the inverse problem. These data correspond to the longest propagation distances and are thus more likely to present cycle skipping.

To summarise, in order to determine a continuous wavenumber spectrum of the velocity model, the following difficulties must be addressed:

1. The recovery of the high wavenumbers is a reasonably linear problem if the low and medium wavenumbers are already present and accurate in the velocity model.

2. The recovery of the medium wavenumbers requires the introduction of large offset data, which are the most non-linear part of the data.




The waveform inversion of large offset data must hence be considered very carefully. Success depends on the accuracy of the low wavenumbers and the minimum frequency accessible in the data. In order to carry out this study, the frequency domain waveform inversion approach was adopted because of its efficiency in the implementation of a strategy that inverts from the low to the high frequencies. This strategy is fundamental as the inverse problem is more linear for the low frequencies.
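The benefit of inverting from low to high frequencies can be reproduced on a one-parameter toy problem. The sketch below is an assumption-laden illustration, not the inversion code of this thesis: the only unknown is a traveltime `tau`, the data at each frequency are a single unit-amplitude monochromatic arrival, and the step length is a hypothetical fixed value chosen so that steepest descent does not overshoot. All names are invented for the example.

```python
import math

def descend(tau, tau_true, freq_hz, n_iter=200):
    """Steepest descent on E(tau) = 2 - 2 cos(2 pi f (tau - tau_true)),
    the misfit of a single monochromatic arrival; tau_true enters only
    through the observed data."""
    step = 1.0 / (16.0 * math.pi ** 2 * freq_hz ** 2)  # hypothetical fixed step
    for _ in range(n_iter):
        phase = 2.0 * math.pi * freq_hz * (tau - tau_true)
        grad = 4.0 * math.pi * freq_hz * math.sin(phase)
        tau -= step * grad
    return tau

def multiscale(tau_start, tau_true, freqs_hz):
    """Invert the frequencies one after the other, from low to high,
    using each result as the starting point of the next inversion."""
    tau = tau_start
    for f in sorted(freqs_hz):
        tau = descend(tau, tau_true, f)
    return tau

if __name__ == "__main__":
    tau_true, tau_start = 1.0, 1.12  # hypothetical traveltimes (s)
    print("7 Hz only   :", round(descend(tau_start, tau_true, 7.0), 4))
    print("3 -> 5 -> 7 :", round(multiscale(tau_start, tau_true, [3.0, 5.0, 7.0]), 4))
```

Started directly at 7 Hz, the toy inversion settles one period away from the true traveltime, whereas the sequential run reaches it, because the 3 Hz step first brings the estimate inside the narrower basins of the higher frequencies.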

6.1 Wavenumber, frequency and offset

In Chapter 3, I investigated the relation between frequency, offset and the wavenumber coverage of the model. It was shown that, within the plane wave and homogeneous background velocity approximations, the contribution of a single frequency to the reconstruction of the velocity perturbation is limited to a range of wavenumbers that may be expressed analytically. The wavenumber coverage increases linearly with frequency, which led to a strategy for selecting frequencies that takes advantage of the redundancy of information along the offset axis. This strategy exploits the well-known effect of wavelet stretch in NMO correction and migration and allows the discretisation of frequencies without degradation of the image quality.

As the maximum offset increases, the wavenumber coverage gets wider since the largest offset controls the minimum wavenumber that may be recovered. As a result, the larger the maximum offset is, the fewer frequencies are needed to ensure a continuous reconstruction of the velocity wavenumber spectrum. Because of the similarity between the first iteration of steepest descent schemes and migration, this strategy also has implications for the more commonly used migration techniques. In the frequency domain, the computational cost of inversion is directly proportional to the number of frequencies processed. The topic is therefore of fundamental interest.
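A hedged sketch of such a selection rule follows. It assumes the 1-D plane-wave picture in which a frequency f covers vertical wavenumbers between 2 k0 α_min and 2 k0, with α_min = z / sqrt(h_max² + z²) for a target depth z and maximum half-offset h_max, and it picks each new frequency so that its lowest wavenumber meets the highest wavenumber of the previous one (f_next = f / α_min). This section does not restate the formula derived in Chapter 3, so the exact form of the rule, as well as the function and parameter names, are assumptions of this example.

```python
import math

def alpha_min(max_offset_m, target_depth_m):
    """Cosine of the widest incidence angle reaching the target depth."""
    half_offset = 0.5 * max_offset_m
    return target_depth_m / math.hypot(half_offset, target_depth_m)

def select_frequencies(f_start_hz, f_max_hz, max_offset_m, target_depth_m):
    """Frequencies whose wavenumber ranges join end to end, starting at
    f_start_hz and not exceeding f_max_hz."""
    a = alpha_min(max_offset_m, target_depth_m)
    if not 0.0 < a < 1.0:
        raise ValueError("a non-zero maximum offset and a positive depth are required")
    freqs = [f_start_hz]
    while freqs[-1] / a <= f_max_hz:
        freqs.append(freqs[-1] / a)
    return freqs

if __name__ == "__main__":
    # Hypothetical survey parameters: 4 km target depth, two offset ranges.
    for max_offset in (6000.0, 12000.0):
        freqs = select_frequencies(5.0, 20.0, max_offset, 4000.0)
        print(max_offset, [round(f, 2) for f in freqs])
```

Under these assumptions, doubling the maximum offset lowers α_min and therefore spreads the selected frequencies further apart, so fewer of them are needed to span the same band.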

This study helped to further illustrate the relation between frequency, offset and wavenumber coverage. The lowest wavenumber that may be recovered from the waveform data depends on both the minimum frequency inverted and on the maximum offset present in the seismic acquisition. The importance of wide-angle data for the recovery of the low wavenumbers was illustrated on a 1-D experiment, in which it was shown that the reconstruction of velocity is much improved by the introduction of large offset data.

The same principle was illustrated using the 2-D Marmousi numerical experiment. In the Marmousi study, frequency domain waveform inversion was very effective in recovering the



2-D structure of the true model, using only 3 frequencies. In this experiment, the waveform inversion was initiated at 5 Hz with a starting velocity model obtained from traveltime tomography. In the final velocity model of the waveform inversion, a continuous range of wavenumbers was recovered, i.e. both velocities and discontinuities were recovered. However, frequencies as low as 5 Hz are unlikely to be present in real exploration data. The use of realistic frequencies in waveform inversion was therefore investigated in Chapter 4 of the thesis.

6.2 Waveform inversion and starting frequency

The inversion of wide-angle data is particularly delicate as the introduction of large offsets increases the non-linearity of the inverse problem, due to the increased risk of cycle skipping. While the inversion of the 2-D Marmousi data set yielded accurate velocities when initiated at 5 Hz, the same experiment failed when starting the inversion at 7 Hz. This may be explained by the fact that non-linearity increases dramatically with frequency and that standard gradient methods are less likely to converge properly at 7 Hz. In Chapter 4, I therefore investigated techniques for improving the linearity of the inverse problem at 7 Hz.

In order to further understand the behaviour of gradient-based waveform inversion, a singular value decomposition was applied to the mono-frequency Fréchet derivative matrix. The results show that the high wavenumbers are associated with the highest singular values and that the gradient will thus be dominated by the update of the high wavenumbers. On the other hand, as mentioned previously, the low and medium wavenumbers must be present to ensure the accurate recovery of the high wavenumbers. Therefore, the update of the high wavenumbers should be penalised during the early stages of the inversion. This may be achieved by applying a smoothing operator to the gradient vector. Unfortunately, the recovery of the low wavenumbers involves the far offset information, which is the most non-linear part of the data. The linearity of the far offset data may, however, be improved by applying time damping to the data residuals, which has the effect of focusing the inversion on the early arrivals.
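The two preconditioners mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the gradient is treated as a 1-D depth profile and smoothed with a truncated Gaussian kernel, and the damping is applied as a plain time-domain exponential weight after a given first-arrival time. The section does not specify how either operator was implemented in the thesis, so the kernel, the damping law and the names `smooth_gradient`, `damp_residual`, `sigma_samples`, `t_first` and `tau` are all hypothetical.

```python
import numpy as np

def smooth_gradient(gradient, sigma_samples):
    """Low-pass a 1-D gradient with a normalised Gaussian kernel truncated
    at three standard deviations, which penalises high-wavenumber updates."""
    half = int(3 * sigma_samples)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_samples) ** 2)
    kernel /= kernel.sum()
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(np.asarray(gradient, dtype=float), half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def damp_residual(residual, t, t_first, tau):
    """Exponentially down-weight residual energy arriving after t_first,
    so that the inversion is driven mainly by the early arrivals."""
    weight = np.exp(-np.clip(np.asarray(t) - t_first, 0.0, None) / tau)
    return np.asarray(residual) * weight

if __name__ == "__main__":
    z = np.arange(200)
    rough = np.sin(2 * np.pi * z / 100.0) + 0.5 * np.cos(np.pi * z)
    smooth = smooth_gradient(rough, sigma_samples=4.0)
    print("rough std:", rough.std().round(3), "smooth std:", smooth.std().round(3))
    t = np.linspace(0.0, 6.0, 7)
    print(damp_residual(np.ones_like(t), t, t_first=2.0, tau=1.0).round(3))
```

A smaller `tau` concentrates the fit on the earliest arrivals, and a larger `sigma_samples` restricts the update to lower wavenumbers; both would normally be relaxed as the inversion proceeds.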

Preconditioning of both the gradient image and the data residuals was shown to improve the convergence accuracy of the waveform inversion, using an extended version of the 2-D Marmousi model in which a dense, wide-angle acquisition survey was modelled.



6.3 Waveform inversion and starting model

In Chapter 5, I explored the requirements on the starting model for the waveform inversion process. Suitable starting models must be obtained using classical methods, which in general provide only a low wavenumber velocity model. The question of the characteristics of adequate starting models must therefore be addressed. These characteristics depend on the minimum frequency available in the data: the higher this frequency is, the more demanding are the requirements of the starting model. The characteristics also depend on the survey geometry, and on the velocity structure itself.
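The dependence on the minimum frequency can be made concrete with the half-period (cycle-skipping) rule of thumb. This rule is not stated in this chapter and is used here only as an illustrative assumption: the starting model is taken to be adequate when its traveltime errors stay below half a period of the lowest inverted frequency. The function name below is hypothetical.

```python
def max_traveltime_error(f_min_hz):
    """Largest traveltime error (s) that stays below half a period at f_min_hz."""
    return 0.5 / f_min_hz

# Tolerable traveltime error for a few candidate minimum frequencies.
for f in (2.0, 4.0, 7.0):
    print(f"{f:4.1f} Hz -> traveltime error below {1e3 * max_traveltime_error(f):6.1f} ms")
```

Under this assumption the tolerance shrinks in inverse proportion to the minimum frequency, which is one way of reading the statement that higher starting frequencies place more demanding requirements on the starting model.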

In order to investigate this question, a non-linearity study was carried out on a 1-D velocity model representing the sub-basalt imaging problem. The requirements of the starting model were determined by examining the evolution of the misfit function with respect to various degrees of smoothing of the true model. The results allowed me to determine the degree of resolution required for each part of the velocity model. The potential of the application of waveform inversion to the sub-basalt imaging problem could then be evaluated. My primary conclusions are that waveform inversion may be used for the determination of intra-basalt velocities, but that the reconstruction of the overburden sediments will be more difficult. Only a migration-like image of the sub-basalt velocity structure may be obtained, due to the lack of illumination at wide angles below the basalt layer.

6.4 Towards the waveform inversion of real data<br />

An adequate real seismic data set should contain large offsets and frequencies as low as 7 Hz, although ideally, lower frequencies would significantly improve the linearity of the inverse problem. The importance of the low frequencies is however largely underestimated by the seismic industry, which often relies only on the high frequencies to improve the resolution of the reconstructed image. Some technical difficulties exist in generating seismic data with very low frequencies (Ziolkowski et al., 2001), but the seismic community should certainly be aware that low frequencies offer a great potential to improve the determination of the velocity model.

The starting model is a fundamental aspect of waveform inversion, as I showed in Chapter 5. Any application of waveform inversion to real data must address the question of the determination of the starting model. Furthermore, any application to real data should be preceded by inversion tests on synthetic data in order to evaluate the difficulties that may be encountered by waveform inversion. Linearity studies such as that carried out in Chapter 5 should contribute to a better understanding of the true potential of applying waveform inversion to a specific imaging problem. More advanced analysis may be carried out by including the preconditioning of the data residuals (especially through time damping) in the non-linearity study.

The starting model for my 2-D Marmousi study was obtained using first arrival traveltime tomography, which is limited by the asymptotic ray approximation. New methods are currently under investigation in which the first arrival traveltime and amplitude are computed using a full wave propagator with very strong time damping. Shin et al. (2002) show that the use of strong time damping in the wave equation is equivalent to solving simultaneously for the eikonal and the transport equations, at a single frequency. This approach could potentially offer a good alternative to ray-based tomography and would be consistent with the use of time damping later in the waveform inversion as proposed in Chapter 4.

In this thesis, I demonstrated that the frequency domain approach is very efficient, since only a few frequencies are required for the reconstruction of the velocity model. It is also particularly appropriate for the implementation of a strategy of inverting from the lowest to the highest frequencies. However, as pointed out in Chapter 3, the inversion of a single frequency is likely to cause problems in the presence of random noise, in which case the more robust simultaneous inversion of several frequencies must be considered. The question of the number of frequencies that should be inverted, as a function of the signal-to-noise ratio, should therefore be investigated.
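The benefit of inverting from the lowest to the highest frequencies can be illustrated with a deliberately simple toy problem, which is not the inversion scheme of this thesis: a single unknown delay T is recovered from the mono-frequency datum exp(iωT) by steepest descent. The function name, the step length and all numerical values are assumptions chosen for the illustration.

```python
import numpy as np

def invert_delay(t_start, t_true, freq_hz, n_iter=20):
    """Steepest descent on E(T) = |exp(i w T) - exp(i w T_true)|^2 at one frequency."""
    w = 2.0 * np.pi * freq_hz
    t = t_start
    for _ in range(n_iter):
        grad = 2.0 * w * np.sin(w * (t - t_true))   # dE/dT
        t -= grad / (2.0 * w ** 2)                   # step scaled by the maximum curvature 2 w^2
    return t

t_true, t_start = 1.0, 1.1      # true and starting delays in seconds (arbitrary values)

# Single frequency at 7 Hz: the 0.1 s starting error exceeds half a period.
t_single = invert_delay(t_start, t_true, 7.0)

# Low-to-high continuation: each result is the starting point for the next frequency.
t_multi = t_start
for f in (2.0, 4.0, 7.0):
    t_multi = invert_delay(t_multi, t_true, f)

print("7 Hz only     : error = %.4f s" % (t_single - t_true))
print("2 -> 4 -> 7 Hz: error = %.4f s" % (t_multi - t_true))
```

In this toy case the 7 Hz run on its own settles one full period (1/7 s) away from the true delay, whereas the run that starts at 2 Hz reaches the true delay before the higher frequencies are introduced. The example says nothing about noise, which is the reason given above for inverting several frequencies simultaneously.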

All numerical experiments in this thesis were performed within the acoustic approximation of the wave equation. Waveform inversion using frequency domain elastic forward modelling would benefit from the more accurate simulation of wave propagation. Such simulations are however computationally very expensive. Elastic, time domain waveform inversions have already been carried out (Shipp and Singh, 2002). The advantages and tradeoffs of the elastic versus the acoustic approach should be further investigated.

Frequency domain and time domain waveform inversion approaches are intrinsically equivalent as they both make use of the wave equation. The differences are primarily the types of preconditioning that may be applied during the inversion process. The main advantage of the time domain over the frequency domain is that a precise time windowing of the data residuals is possible. It was shown in Chapter 4 that time windowing may be adequately performed in the frequency domain using time damping, in which case the inversion of a single frequency of the data at a time still remains possible.
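The reason a single frequency can still be inverted is that exponential time damping of a trace is equivalent to evaluating its Fourier component at a complex-valued frequency. The sketch below checks this numerically on a synthetic trace with two spikes. The sign convention of the transform, the damping constant `tau` and the choice of damping from time zero (rather than from a reference time such as the first arrival) are assumptions of this illustration.

```python
import numpy as np

def fourier_component(trace, t, omega):
    """Discrete approximation of the integral of trace(t) * exp(i * omega * t) dt.

    omega may be complex; the sign convention is an assumption of this sketch.
    """
    dt = t[1] - t[0]
    return np.sum(trace * np.exp(1j * omega * t)) * dt

t = np.arange(0.0, 4.0, 0.004)              # time axis in seconds
trace = np.zeros_like(t)
trace[125] = 1.0                             # early arrival at 0.5 s
trace[500] = 1.0                             # late arrival at 2.0 s

tau = 0.5                                    # damping constant in seconds (arbitrary)
omega = 2.0 * np.pi * 7.0                    # angular frequency for 7 Hz

# Route 1: damp the trace in time, then take the component at the real frequency.
damped_in_time = fourier_component(trace * np.exp(-t / tau), t, omega)

# Route 2: leave the trace untouched and evaluate at the complex frequency.
damped_in_freq = fourier_component(trace, t, omega + 1j / tau)

# Weight of the late arrival relative to the early one after damping.
late_to_early = np.exp(-t[500] / tau) / np.exp(-t[125] / tau)
```

The two routes agree to rounding error, and the late arrival is down-weighted by exp(-3) relative to the early one, which is the sense in which damping acts as a soft time window on the early arrivals.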




Appendix A

Image stretch and NMO stretch

In a homogeneous medium the zero offset two-way traveltime for a single horizontal reflection is given by

\[ \tau = \frac{2z}{c}, \tag{A.1} \]

where z is the depth of the reflector and c is the velocity. At finite offsets the two-way traveltime is

\[ t = \frac{2\sqrt{z^2 + h^2}}{c} = \left( \tau^2 + \left( \frac{2h}{c} \right)^2 \right)^{\frac{1}{2}}, \tag{A.2} \]

where h is the half-offset.

The NMO correction seeks to move all reflections to their zero offset times. NMO stretch occurs because reflections are associated with a wavelet of finite (non-zero) duration. Therefore the start of the reflection is moved to the correct zero offset time, whereas the end of the wavelet is moved to a different zero offset time. If we characterise the width of the wavelet as ∆t and the width of the wavelet after NMO correction as ∆τ, the NMO stretch is

\[ \frac{\Delta\tau}{\Delta t}, \tag{A.3} \]

which varies as a function of depth and offset. Both these dependencies are captured by considering the NMO stretch to be a function of two-way time, t. The instantaneous value of the NMO stretch is therefore

\[ \lim_{\Delta t \to 0} \left( \frac{\Delta\tau}{\Delta t} \right) = \frac{d\tau}{dt}. \tag{A.4} \]



We may evaluate this derivative with the use of equation (A.2), from which we obtain

\[ \frac{d\tau}{dt} = \sqrt{1 + R^2} = \frac{1}{\alpha}, \tag{A.5} \]

where R = h/z is the offset-to-depth ratio and α = cos θ as in equation (3.10).
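The intermediate step is short: writing equation (A.2) as τ² = t² − (2h/c)² and differentiating at fixed h and c gives dτ/dt = t/τ, and substituting τ = 2z/c yields √(1 + R²). The following sketch checks this numerically with a central finite difference. The function names and numerical values are arbitrary, and taking θ = arctan(h/z) is an assumption made here because it is the angle for which cos θ = 1/√(1 + R²).

```python
import numpy as np

def zero_offset_time(t, h, c):
    """Invert equation (A.2) for tau at fixed half-offset h and velocity c."""
    return np.sqrt(t ** 2 - (2.0 * h / c) ** 2)

def nmo_stretch_numeric(z, h, c, dt=1e-6):
    """Central finite-difference estimate of d(tau)/dt at the reflection time."""
    t = 2.0 * np.sqrt(z ** 2 + h ** 2) / c
    return (zero_offset_time(t + dt, h, c) - zero_offset_time(t - dt, h, c)) / (2.0 * dt)

def nmo_stretch_analytic(z, h):
    """Equation (A.5): sqrt(1 + R^2) with R = h / z."""
    return np.sqrt(1.0 + (h / z) ** 2)

z, h, c = 1000.0, 500.0, 2000.0      # depth (m), half-offset (m), velocity (m/s); arbitrary values
theta = np.arctan2(h, z)             # angle whose cosine is taken as alpha (assumed geometry)

numeric = nmo_stretch_numeric(z, h, c)
analytic = nmo_stretch_analytic(z, h)
```

The stretch factor is always at least one and grows with the offset-to-depth ratio, so the far offsets and shallow reflectors are the most strongly stretched.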

It can therefore be concluded that the shift towards the low wavenumbers with increasing offset predicted by equation (3.10) is exactly the same as that caused by NMO stretch after depth correction.


Bibliography

Aki, K., and Richards, P. G., 1980, Quantitative seismology, theory and methods: W. H. Freeman and Co.

Al-Yahya, K. M., 1989, Velocity analysis by iterative profile migration: Geophysics, 54, no. 06, 718–729.

Baysal, E., Kosloff, D. D., and Sherwood, J. W. C., 1983, Reverse time migration: Geophysics, 48, no. 11, 1514–1524.

Beydoun, W. B., and Mendes, M., 1989, Elastic ray-Born l2 migration/inversion: Geophysical Journal, 97, 151–160.

Beydoun, W. B., and Tarantola, A., 1988, First Born and Rytov approximations: Modeling and inversion in a canonical example: J. Acoust. Soc. Am., 83, no. 2, 1587–1595.

Beylkin, G., and Oristaglio, M. L., 1985, Distorted-wave Born and distorted-wave Rytov approximations: Opt. Commun., 53, 213–216.

Beylkin, G., 1985, Imaging of discontinuities in the inverse scattering problem by inversion of a causal generalized Radon transform: J. Math. Phys, 26, 99–108.

Billette, F., and Lambaré, G., 1998, Velocity macro-model estimation from seismic reflection data by stereotomography: Geophys. J. Int., 135, 671–690.

Bishop, T. N., Bube, K. P., Cutler, R. T., Langan, R. T., Love, P. L., Resnick, J. R., Shuey, R. T., Spindler, D. A., and Wyld, H. W., 1985, Tomographic determination of velocity and depth in laterally varying media: Geophysics, 50, no. 06, 903–923.

Bleistein, N., Cohen, J. K., and Hagin, F. G., 1987, Two and one-half dimensional Born inversion with an arbitrary reference: Geophysics, 52, no. 01, 26–36.


Bleistein, N., 1987, On the imaging of reflectors in the Earth: Geophysics, 52, no. 07, 931–942.

Born, M., 1923, Quantum mechanics of impact processes: Z. Phys., 38, 803–827.

Brown, R., 1994, Image quality depends on your point of view: The Leading Edge, 13, no. 06, 669–673.

Buchholtz, H., 1972, A note on signal distortion due to dynamic (NMO) correction: Geophys. Prosp., 20, 395–402.

Bunks, C., Saleck, F. M., Zaleski, S., and Chavent, G., 1995, Multiscale seismic waveform inversion: Geophysics, 60, no. 05, 1457–1473.

Cao, D., Beydoun, W. B., Singh, S. C., and Tarantola, A., 1990, A simultaneous inversion for background velocity and impedance maps: Geophysics, 55, no. 04, 458–469.

Carter, J. A., and Frazer, L. N., 1984, Accommodating lateral velocity changes in Kirchhoff migration by means of Fermat’s principle: Geophysics, 49, no. 01, 46–53.

Cary, P., and Chapman, P., 1988, Automatic 1-d waveform inversion of marine seismic refraction data: Geophysical Journal, 93, 527–546.

Causse, E., Mittet, R., and Ursin, B., 1999, Preconditioning of full-waveform inversion in viscoacoustic media: Geophysics, 64, no. 1, 130–145.

Chapman, C. H., 1985, Ray theory and its extensions: WKBJ and Maslov seismograms: J. Geophys, 58, 27–43.

Chauris, H., Noble, M. S., Lambaré, G., and Podvin, P., 2002, Migration velocity analysis from locally coherent events in 2-d laterally heterogeneous media, part i: Theoretical aspects: Geophysics, 67, no. 04, 1202–1212.

Claerbout, J. F., and Doherty, S. M., 1972, Downward continuation of moveout-corrected seismograms: Geophysics, 37, no. 05, 741–768.

Claerbout, J. F., 1985, Imaging the earth’s interior: Blackwell Scientific Publications.

Clayton, R. W., and Stolt, R. H., 1981, A Born-WKBJ inversion method for acoustic reflection data: Geophysics, 46, no. 11, 1559–1567.



Cohen, J. K., and Bleistein, N., 1979, Velocity inversion procedure for acoustic waves: Geophysics, 44, no. 06, 1077–1087.

Crase, E., Pica, A., Noble, M., McDonald, J., and Tarantola, A., 1990, Robust elastic nonlinear waveform inversion: Application to real data: Geophysics, 55, no. 05, 527–538.

Cullity, B. D., 1978, Elements of X-ray diffraction: Addison-Wesley, Mass.

Devaney, A. J., 1981, Inverse-scattering theory within the Rytov approximation: Optics Lett., 6, 374–376.

Devaney, A. J., 1984, Geophysical diffraction tomography: IEEE Trans. Geosci. Remote Sensing, GE-22, 3–13.

Dix, C. H., 1955, Seismic velocities from surface measurements: Geophysics, 20, no. 01, 68–86.

Docherty, P., Silva, R., Singh, S., Song, Z. M., and Wood, M., 1997, Migration velocity analysis using a genetic algorithm: Geophys. Prosp., 45, no. 05, 865–878.

Dongarra, J. J., Bunch, J. R., Moler, C. B., and Stewart, G. W., 1979, LINPACK users’ guide: Society for Industrial and Applied Mathematics, 11.1–11.23.

Dunkin, J. W., and Levin, F. K., 1973, Effect of normal moveout on a seismic pulse: Geophysics, 38, no. 04, 635–642.

Ehinger, A., Lailly, P., and Marfurt, K. J., 1996, Green’s function implementation of common-offset wave-equation migration: Geophysics, 61, no. 06, 1813–1821.

Ewald, P. P., 1921, Das "reziproke gitter" in der struktur theorie: Zeit. f. Kris., 56, 129–156.

Farra, V., and Madariaga, R., 1988, Non-linear reflection tomography: Geophysical Journal, 95, 135–147.

Fliedner, M. M., and White, R. S., 2001, Seismic structure of basalt flows from surface seismic data, borehole measurements, and synthetic seismogram modeling: Geophysics, 66, no. 6, 1925–1936.



Forgues, E., Scala, E., and Pratt, R. G., 1998, High resolution velocity model estimation from refraction and reflection data, in 68th Ann. Internat. Mtg, Soc. Expl. Geophys., Expanded Abstracts Soc. of Expl. Geophys., 1211–1214.

Freudenreich, Y., and Singh, S., 2000, Full waveform inversion for seismic data - frequency versus time domain, in 62nd Mtg. Eur. Assn. Geosci. Eng., Session:C0054.

Fruehn, J., White, R. S., Fliedner, M., Richardson, K. R., Cullen, E., and Latkiewicz, C., 1998, Two-ship large aperture seismic profiles - application to imaging through basalt, in 60th Mtg. Eur. Assn. Geosci. Eng., Session:01–47.

Fruehn, J., Fliedner, M. M., and White, R. S., 2001, Integrated wide-angle and near-vertical subbasalt study using large-aperture seismic data from the faeroe-shetland region: Geophysics, 66, no. 5, 1340–1348.

Gardner, G. H. F., French, W. S., and Matzuk, T., 1974, Elements of migration and velocity analysis: Geophysics, 39, no. 06, 811–825.

Gauthier, O., Virieux, J., and Tarantola, A., 1986, Two-dimensional nonlinear inversion of seismic waveforms - Numerical results: Geophysics, 51, no. 07, 1387–1403.

Gazdag, J., 1978, Wave equation migration with the phase-shift method: Geophysics, 43, no. 07, 1342–1351.

Geoltrain, S., and Brac, J., 1993, Can we image complex structures with first-arrival traveltime: Geophysics, 58, no. 04, 564–575.

Gibson, B. S., and Levander, A., 1988, Modeling and processing of scattered waves in seismic reflection surveys: Geophysics, 53, 466–478.

Gray, S. H., and May, W. P., 1994, Kirchhoff migration using eikonal equation traveltimes: Geophysics, 59, no. 05, 810–817.

Haldorsen, J. B. U., and Farmer, P. A., 1989, Resolution and NMO-stretch: imaging by stacking: Geophysical Prospecting, 37, 479–492.

Hale, D., 1984, Dip-moveout by Fourier transform: Geophysics, 49, no. 06, 741–757.



Hicks, G. J., and Pratt, R. G., 2001, Reflection waveform inversion using local descent methods: Estimating attenuation and velocity over a gas-sand deposit: Geophysics, 66, no. 2, 598–612.

Hicks, G. J., 1999, Seismic velocities from reflection waveform: The application of Newton inversion method: Ph.D. thesis, Univ. London.

Hughes, S., Barton, P. J., and Harrison, D., 1998, Exploration in the shetland-faeroe basin using densely spaced arrays of ocean-bottom seismometers: Geophysics, 63, no. 02, 490–501.

Ikelle, L. T., Diet, J. P., and Tarantola, A., 1986, Linearized inversion of multioffset seismic reflection data in the frequency-wavenumber domain: Geophysics, 51, no. 06, 1266–1276.

Ikelle, L. T., Diet, J. P., and Tarantola, A., 1988, Linearized inversion of multioffset seismic reflection data in the frequency-wavenumber domain - Depth-dependent reference medium: Geophysics, 53, no. 01, 50–64.

Jannane, M., Beydoun, W., Crase, E., Cao, D., Koren, Z., Landa, E., Mendes, M., Pica, A., Noble, M., Roeth, G., Singh, S., Snieder, R., Tarantola, A., Trezeguet, D., and Zie, M., 1989, Wavelengths of earth structures that can be resolved from seismic reflection data (short note): Geophysics, 54, no. 07, 906–910.

Jarchow, C. M., Catchings, R. D., and Lutter, W. J., 1994, Large-explosive source, wide-recording aperture, seismic profiling on the columbia plateau, washington: Geophysics, 59, no. 02, 259–271.

Jin, S., Madariaga, R., Virieux, J., and Lambaré, G., 1992, Two-dimensional asymptotic iterative elastic inversion: Geophys. J. Int., 108, 575–588.

Jo, C.-H., Shin, C., and Suh, J. H., 1996, An optimal 9-point, finite-difference, frequency-space, 2-d scalar wave extrapolator: Geophysics, 61, no. 02, 529–537.

Keho, T. H., and Beydoun, W. B., 1988, Paraxial ray Kirchhoff migration: Geophysics, 53, no. 12, 1540–1546.

Keller, J. B., 1969, Accuracy and validity of the Born and Rytov approximations: J. Opt. Soc. Am., 59, 1003–1004.

Kelly, K. R., Ward, R. W., Treitel, S., and Alford, R. M., 1976, Synthetic seismograms - A finite-difference approach: Geophysics, 41, no. 01, 2–27.



Lafond, C., Kaculini, S., and Martini, F., 1999, The effects of basalt heterogeneities on seismic imaging of deeper reflectors, in 69th Ann. Internat. Mtg Soc. of Expl. Geophys., 1433–1436.

Lailly, P., 1983, The seismic inverse problems as a sequence of before stack migration, in Bednar, J. B., Redner, R., Robinson, E., and Weglein, A., Eds., Conference on Inverse Scattering: Theory and Application: Soc. Industr. Appl. Math., Philadelphia.

Lambaré, G., Virieux, J., Madariaga, R., and Jin, S., 1992, Iterative asymptotic inversion in the acoustic approximation: Geophysics, 57, no. 09, 1138–1154.

Lambaré, G., Lucio, P. S., and Hanyga, M., 1996, Two-dimensional multivalued traveltime and amplitude maps by uniform sampling of ray field: Geophys. J. Int., 125, 584–598.

Lanczos, C., 1961, Linear differential operators: D. van Nostrand.

Lebrun, D., Richard, V., Mace, D., and Cuer, M., 2001, Svd for multioffset linearized inversion: Resolution analysis in multicomponent acquisition: Geophysics, 66, no. 3, 871–88.

Levin, S. A., 1998, Resolution in seismic imaging: Is it all a matter of perspective: Geophysics, 63, 743–749.

Li, X. Y., MacBeth, C., Hitchen, K., and Hanssen, P., 1998, Using converted shear-wave for imaging beneath basalt in deep water plays, in 68th Ann. Internat. Mtg Soc. of Expl. Geophys., 1369–1372.

Liao, Q., and McMechan, G. A., 1996, Multifrequency viscoacoustic modeling and inversion: Geophysics, 61, no. 05, 1371–1378.

Mallick, S., and Frazer, N. L., 1987, Practical aspects of reflectivity modeling: Geophysics, 52, 1355–1364.

Marfurt, K. J., 1984, Accuracy of finite-difference and finite-element modeling of the scalar and elastic wave-equations: Geophysics, 49, no. 05, 533–549.

Martini, F., and Bean, C. J., 2002, Interface scattering versus body scattering in subbasalt imaging and application of prestack wave equation datuming: Geophysics, 67, no. 05, 1593–1601.

Martini, F., C., B., Lafond, C., and Kaculini, S., 2000, Imaging below highly heterogeneous layers, in 70th Ann. Internat. Mtg Soc. of Expl. Geophys., 2448–2451.



Menke, W., 1989, Geophysical Data Analysis: Discrete Inverse Theory: Academic Press, Inc.

Michelena, R. J., 1993, Singular-value decomposition for cross-well tomography: Geophysics, 58, no. 11, 1655–1661.

Miller, D. E., Oristaglio, M., and Beylkin, G., 1987, A new slant on seismic imaging: Migration and integral geometry: Geophysics, 52, 943–964.

Mora, P. R., 1987, Nonlinear two-dimensional elastic inversion of multioffset seismic data: Geophysics, 52, no. 09, 1211–1228.

Mora, P., 1988, Elastic wave-field inversion of reflection and transmission data: Geophysics, 53, no. 06, 750–759.

Mora, P., 1989, Inversion = migration + tomography: Geophysics, 54, no. 12, 1575–1586.

Morse, P. M., and Feshbach, H., 1953, Methods of Theoretical Physics: McGraw-Hill.

Moser, T. J., 1991, Shortest path calculation of seismic rays: Geophysics, 56, no. 01, 59–67.

Nemeth, T., Wu, C., and Schuster, G. T., 1999, Least-squares migration of incomplete reflection data: Geophysics, 64, no. 1, 208–221.

Operto, S., Xu, S., and Lambaré, G., 2000, Can we quantitatively image complex structures with rays: Geophysics, 65, no. 4, 1223–1238.

Pica, A., Diet, J., and Tarantola, A., 1990, Nonlinear inversion of seismic reflection data in a laterally invariant medium: Geophysics, 55, no. 03, 284–292.

Planke, S., Alvestad, E., and Eldholm, O., 1999, Seismic characteristics of basaltic extrusive and intrusive rocks: The Leading Edge, 18, no. 3, 342–348.

Podvin, P., and Lecomte, I., 1991, Finite difference computation of traveltimes in very contrasted velocity models: a massively parallel approach and its associated tools: Geophys. J. Int., 105, 271–284.

Polak, E., and Ribière, G., 1969, Notes sur la convergence de méthodes de directions conjuguées: Rev. Fr. Inf. Rech. Oper, 16-R1, 35–43.



Pratt, R. G., and Chapman, C. H., 1992, Traveltime tomography in anisotropic media–II. Application: Geophys. J. Int., 109, 20–37.

Pratt, R. G., and Goulty, N. R., 1991, Combining wave-equation imaging with traveltime tomography to form high-resolution images from crosshole data: Geophysics, 56, no. 02, 208–224.

Pratt, R. G., and Worthington, M. H., 1988, The application of diffraction tomography to crosshole seismic data: Geophysics, 53, no. 10, 1284–1294.

Pratt, R. G., and Worthington, M. H., 1990, Inverse theory applied to multi-source cross-hole tomography. part 1: Acoustic wave-equation method: Geophys. Prosp., 38, no. 03, 287–310.

Pratt, R. G., Song, Z.-M., Williamson, P., and Warner, M., 1996, Two-dimensional velocity models from wide-angle seismic data by wavefield inversion: Geophys. J. Int., 124, 323–340.

Pratt, R. G., Shin, C., and Hicks, G. J., 1998, Gauss-Newton and full Newton methods in frequency-space seismic waveform inversion: Geophys. J. Int., 133, 341–362.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., 1992, Numerical Recipes in Fortran 77: The art of scientific computing: Cambridge Univ. Press.

Purnell, G. W., 1992, Imaging beneath a high-velocity layer using converted waves: Geophysics, 57, no. 11, 1444–1452.

Samson, C., Barton, P. J., and Karwatowski, J., 1995, Imaging beneath an opaque basaltic layer using densely sampled wide-angle obs data: Geophys. Prosp., 43, no. 04, 509–527.

Scales, J. A., Smith, M. L., and Treitel, S., 2001, Inverse problem theory: Samizdat Press.

Schleicher, J., Tygel, M., and Hubral, P., 1993, 3-D true-amplitude finite-offset migration: Geophysics, 58, no. 08, 1112–1126.

Schneider, W. A., 1978, Integral formulation for migration in two-dimensions and three-dimensions: Geophysics, 43, no. 01, 49–76.

Schultz, P. S., and Sherwood, J. W. C., 1980, Depth migration before stack: Geophysics, 45, no. 03, 376–393.

Sen, M. K., and Stoffa, P. L., 1991, Nonlinear one-dimensional seismic waveform inversion using simulated annealing: Geophysics, 56, no. 10, 1624–1638.



Shin, C., Min, D.-J., Marfurt, K. J., Lim, H. Y., Yang, D., Cha, Y., Ko, S., Yoon, K., Ha, T., and Hong, S., 2002, Traveltime and amplitude calculations using the damped wave solution: Geophysics, 67, no. 05, 1637–1647.

Shipp, R. M., and Singh, S. C., 2002, Two-dimensional full wavefield inversion of wide-aperture marine seismic streamer data: Geophys. J. Int., 151, 325–344.

Shipp, R., Singh, S., and Barton, P., 1997, Sub-basalt imaging using full wavefield inversion: 67th Ann. Internat. Mtg, Soc. Expl. Geophys., Expanded Abstracts, 1563–1566.

Snieder, R., Xie, M. Y., Pica, A., and Tarantola, A., 1989, Retrieving both the impedance contrast and background velocity: A global strategy for the seismic reflection problem: Geophysics, 54, no. 08, 991–1000.

Song, Z. M., Williamson, P. R., and Pratt, R. G., 1995, Frequency-domain acoustic-wave modeling and inversion of crosshole data: Part ii: Inversion method, synthetic experiments and real-data results: Geophysics, 60, no. 03, 796–809.

Stekl, I., and Pratt, R. G., 1998, Accurate viscoelastic modeling by frequency-domain finite differences using rotated operators: Geophysics, 63, no. 05, 1779–1794.

Stockwell, J. W., 1997, Free software in education: A case study of CWP/SU: Seismic Un*x: The Leading Edge, 16, no. 07, 1045–1049.

Stolt, R. H., 1978, Migration by Fourier transform: Geophysics, 43, no. 01, 23–48.

Stork, C., 1992a, Reflection tomography in the postmigrated domain: Geophysics, 57, no. 05, 680–692.

Stork, C., 1992b, Singular value decomposition of the velocity-reflector depth tradeoff, part 2: High-resolution analysis of a generic model: Geophysics, 57, no. 07, 933–943.

Sun, R., and McMechan, G. A., 1992, 2-D full-wavefield inversion for wide-aperture, elastic, seismic data: Geophysical Journal International, 111, 1–10.

Symes, W. W., and Carazzone, J. J., 1991, Velocity inversion by differential semblance optimization: Geophysics, 56, no. 05, 654–663.



Taranto<strong>la</strong>, A., 1984a, <strong>Inversion</strong> of seismic ref<strong>le</strong>ction data in the acoustic approximation: Geophysics,<br />

49, no. 08, 1259–1266.<br />

Taranto<strong>la</strong>, A., 1984b, Linearized inversion of seismic ref<strong>le</strong>ction data: Geophysical Prospecting,<br />

32, 998–1015.<br />

Taranto<strong>la</strong>, A., 1986, A strategy for nonlinear e<strong>la</strong>stic inversion of seismic ref<strong>le</strong>ction data: Geophysics,<br />

51, no. 10, 1893–1903.<br />

Taranto<strong>la</strong>, A., 1987, Inverse prob<strong>le</strong>m theory: Elsevier.<br />

Thierry, P., Operto, S., and Lambare, G., 1999, Fast 2-D ray+born migration/inversion in comp<strong>le</strong>x<br />

media: Geophysics, 64, no. 1, 162–181.<br />

Tygel, M., Sch<strong>le</strong>icher, J., and Hubral, P., 1994, Pulse distortion in <strong>de</strong>pth migration: Geophysics,<br />

59, no. 10, 1561–1569.<br />

˘Cervený, V., Molotkov, I. A., and P˘sen˘sik, I., 1977, Ray method in seismology: Char<strong>le</strong>s University<br />

Press, Prague.<br />

Versteeg, R. J., 1993, Sensitivity of prestack <strong>de</strong>pth migration to the velocity mo<strong>de</strong>l: Geophysics,<br />

58, no. 06, 873–882.<br />

Versteeg, R., 1994, The Marmousi experience: Velocity model determination on a synthetic complex data set: The Leading Edge, 13, no. 09, 927–936.<br />

Vidale, J. E., 1990, Finite-difference calculation of traveltimes in three dimensions: Geophysics, 55, no. 05, 521–526.<br />

Vinje, V., Iversen, E., and Gjøystdal, H., 1993, Traveltime and amplitude estimation using wavefront construction: Geophysics, 58, no. 08, 1157–1166.<br />

Ward, R. W., MacKay, S., Greenlee, S. M., and Dengo, C. A., 1994, Imaging sediments under salt: Where are we: The Leading Edge, 13, no. 08, 834–836.<br />

White, R. S., and McKenzie, D., 1989, Magmatism at rift zones: The generation of volcanic continental margins and flood basalts: J. Geophys. Res., 94, 7685–7729.<br />

White, D. J., 1989, Two-dimensional seismic refraction tomography: Geophys. J., 97, 223–245.


Wiggins, J. W., 1984, Kirchhoff integral extrapolation and migration of nonplanar data: Geophysics, 49, no. 08, 1239–1248.<br />

Woodward, M. J., 1989, Wave-equation tomography: Ph.D. thesis, Stanford University.<br />

Woodward, M. J., 1992, Wave-equation tomography: Geophysics, 57, no. 01, 15–26.<br />

Wu, R. S., and Toksöz, M. N., 1987, Diffraction tomography and multisource holography applied to seismic imaging: Geophysics, 52, no. 01, 11–25.<br />

Yilmaz, O., and Claerbout, J. F., 1980, Prestack partial migration: Geophysics, 45, no. 12, 1753–1779.<br />

Yilmaz, Ö., 1987, Seismic Data Processing: Society of Exploration Geophysicists.<br />

Zelt, C. A., and Barton, P. J., 1998, 3D seismic refraction tomography: A comparison of two methods applied to data from the Faeroe Basin: J. Geophys. Res., 103, 7187–7210.<br />

Zelt, C. A., and Smith, R. B., 1992, Seismic traveltime inversion for 2-D crustal velocity structure: Geophys. J. Int., 108, 16–34.<br />

Ziolkowski, A., and Fokkema, J. T., 1986, Tutorial: The progressive attenuation of high-frequency energy in seismic reflection data: Geophysical Prospecting, 34, 981–1001.<br />

Ziolkowski, A., Hanssen, P., Gatliff, R., Li, X., and Jakubowicz, H., 2001, The use of low frequencies for sub-basalt imaging: in 71st Ann. Internat. Mtg., Soc. of Expl. Geophys., 74–77.
