
AIR Tools - A MATLAB Package
for Algebraic Iterative
Reconstruction Techniques

Maria Saxild-Hansen

Kongens Lyngby 2010


Technical University of Denmark
Informatics and Mathematical Modelling
Building 321, DK-2800 Kongens Lyngby, Denmark
Phone +45 45253351, Fax +45 45882673
reception@imm.dtu.dk
www.imm.dtu.dk


Summary

In this master thesis a MATLAB package, AIR Tools, with implementations of several iterative algebraic reconstruction methods for discretized tomography problems is developed. The focus is mainly on two classes of methods: Simultaneous Iterative Reconstruction Technique (SIRT) and Algebraic Reconstruction Techniques (ART). The package also includes three simplified test problems from medical and seismic tomography.

For each iterative method a number of strategies for choosing the relaxation parameter and the stopping rule are presented and implemented. The relaxation parameter can be chosen as a fixed parameter or adaptively in each iteration. For the fixed case a training strategy is developed for finding the optimal parameter for a given test problem. The stopping rules provided in the package are the Discrepancy Principle, the Monotone Error Rule and the NCP criterion. For the first two rules a training strategy is also provided for finding an optimal stopping parameter.

In addition, simulation studies and comparisons of the performance of the available methods and strategies are presented and discussed.

The thesis also includes manual pages describing the use of each implemented routine.

KEYWORDS: ART methods, SIRT methods, iterative methods, semi-convergence, relaxation parameters, stopping rules, tomography.


Resumé

In this thesis a MATLAB program package, AIR Tools, is developed with implementations of several iterative algebraic reconstruction methods for discretized tomography problems. The primary focus is on two classes of methods: Simultaneous Iterative Reconstruction Technique (SIRT) and Algebraic Reconstruction Techniques (ART). The package also contains three simple test problems from medical and seismic tomography.

For each iterative method a number of strategies for choosing the relaxation parameter, as well as stopping rules, are presented and implemented. The relaxation parameter can either be chosen as a constant or adaptively in each iteration. For the constant case a training strategy has been developed for finding the optimal value for a given test problem. The stopping rules available in the package are the Discrepancy Principle, the Monotone Error Rule and the NCP criterion. For the first two rules a training strategy is provided for finding the optimal value of the stopping parameter.

Furthermore, studies and comparisons of the behaviour of the methods and strategies are presented and discussed.

The thesis also contains manual pages for each implemented function, describing its use.

KEYWORDS: ART methods, SIRT methods, iterative methods, semi-convergence, relaxation parameters, stopping rules, tomography.


Preface

This master thesis was prepared at the Department of Informatics and Mathematical Modelling, Technical University of Denmark (DTU), and marks the completion of the master's degree in Mathematical Modelling and Computation. It represents a workload of 35 ECTS points and has been prepared during a seven-month period from August 31 to March 31. The study has been conducted under the supervision of Professor Per Christian Hansen.

I would like to thank a few people for helping me with this project. I would like to thank Tommy Elfving, Professor of Scientific Computing at the Department of Mathematics, Linköping University, who, through a visit to DTU in November 2009, provided valuable insight into the theory of iterative methods. I would also like to thank Klaus Mosegaard, Professor at DTU Informatics, for his assistance in creating a seismic tomography problem and a useful test phantom, and Ph.D. student Jakob Heide Jørgensen for assistance in creating an algorithm for the tomography test problems. Finally, I would like to thank my family and friends, especially Katrine Lange and Elin A. Larsen, for their assistance and for keeping up my spirits.

Kgs. Lyngby, 31st March 2010

Maria Saxild-Hansen


List of Symbols

The following is a list of symbols used throughout the thesis. Be aware that the list only contains symbols which are used for the same purpose throughout the thesis; it is therefore not complete, since only frequently used symbols are included. Also be aware that some symbols have multiple meanings; the meaning will, however, be clear from the context.

Symbol       Quantity                                                  Dimension
A            coefficient matrix                                        m × n
a^i          i'th row of the matrix A                                  n
a_j          j'th column of the matrix A                               m
a_ij         element in the i'th row and the j'th column of A          scalar
b            right-hand side                                           m
b̄            exact right-hand side                                     m
b_i          i'th element of the vector b                              scalar
δ            the noise level                                           scalar
I            identity matrix
k            iteration number                                          scalar
λ_k          relaxation parameter                                      scalar
M            symmetric positive definite matrix for the SIRT methods   m × m
m, n         matrix dimensions                                         scalars
Φ^k(σ, λ)    iteration-error                                           scalar
φ_i          filter factor                                             scalar
Φ            diagonal matrix of filter factors                         n × n
ϖ            average number of nonzero elements in a row               scalar
Ψ^k(σ, λ)    noise-error                                               scalar
ρ            spectral radius                                           scalar
Σ            diagonal matrix with all singular values                  m × n
σ_i          singular value of a matrix                                scalar
s_j          number of nonzero elements in the j'th column             scalar
τ            the stopping parameter                                    scalar
τ_1          parameter for the modified Ψ1-based relaxation            scalar
τ_2          parameter for the modified Ψ2-based relaxation            scalar
T            symmetric positive definite matrix for the SIRT methods   n × n
U            matrix with all left singular vectors                     m × m
u_i          i'th left singular vector                                 m
V            matrix with all right singular vectors                    n × n
v_i          i'th right singular vector                                n
w            weighting vector                                          m
w_i          i'th element of the weighting vector                      scalar
x^k          solution in the k'th iteration                            n
x̄            exact solution                                            n
H_i          the i'th hyperplane
P_i(·)       projection
R_i(·)       reflection
⟨·, ·⟩       inner product, i.e. ⟨x, y⟩ = x^T y
‖·‖_2        2-norm
NNZ(·)       number of nonzero elements                                scalar


Contents

Summary
Resumé
Preface
List of Symbols
List of Figures

1 Introduction
  1.1 Structure of the Thesis

2 Theory of Inverse Problems and Regularization
  2.1 Discrete Ill-Posed Problems
  2.2 SVD and Picard Condition
  2.3 Spectral Filtering
  2.4 Iterative Methods and Semi-Convergence
  2.5 Resolution Limit

3 Iterative Methods for Reconstruction
  3.1 Simultaneous Iterative Reconstructive Technique (SIRT)
  3.2 Algebraic Reconstruction Techniques (ART)
  3.3 Considerations Towards the Package
  3.4 Block-Iterative Methods

4 Semi-Convergence and Choice of Relaxation Parameter
  4.1 Semi-Convergence for SIRT Methods
  4.2 Choice of Relaxation Parameter

5 Stopping Rules
  5.1 Stopping Rules with Training
  5.2 Normalized Cumulative Periodogram

6 Test Problems

7 Testing the Methods
  7.1 Convergence of DROP
  7.2 Symmetric Kaczmarz as a SIRT Method
  7.3 Test of the Choice of Relaxation Parameter
  7.4 Stopping Rules
  7.5 Relaxation Strategies Combined with Stopping Rules

8 Manual Pages

9 Conclusion and Future Work
  9.1 Future Work

A Appendix
  A.1 Orthogonal Projection on a Hyperplane
  A.2 Investigation of the Roots
  A.3 Work Units for the SIRT and ART Methods

Bibliography


List of Figures

2.1 SVD basis
2.2 Picard plot
2.3 Illustration of basic semi-convergence
3.1 Cimmino's reflection method
3.2 Cimmino's projection method
3.3 Kaczmarz's method
3.4 Symmetric Kaczmarz
4.1 Behaviour of Φ^k(σ, λ) and Ψ^k(σ, λ)
4.2 Ψ^k(σ, λ) as a function of σ
4.3 Relative error histories for nine values of λ
4.4 The minimum relative errors for different λ-values
4.5 Optimal number of iterations for a SIRT method
4.6 Relative error histories for an ART method
4.7 The minimum relative errors for different λ-values for an ART method
4.8 Optimal number of iterations for an ART method
4.9 Relative error histories for a SIRT method with maximum number of iterations
4.10 Minimum relative error for a SIRT method with maximum number of iterations
4.11 Optimal number of iterations for a SIRT method with maximum number of iterations
4.12 Illustration of line search
6.1 Parallel beam illustration
6.2 Fan beam illustration
6.3 Seismic tomography illustration
6.4 The two exact phantoms
7.1 Relative error histories for test of DROP
7.2 Relative error histories for test of DROP using weighting
7.3 Ψ-based relaxations for symmetric Kaczmarz
7.4 Training of relaxation parameter using Cimmino's projection method
7.5 Training of relaxation parameter using Kaczmarz's method
7.6 Training of relaxation parameter using randomized Kaczmarz
7.7 Relative errors for the SIRT methods with trained λ
7.8 Relative errors for the ART methods with trained λ
7.9 Training of relaxation parameter using Cimmino's projection method with maximum number of iterations
7.10 Training of relaxation parameter using Kaczmarz's method with maximum number of iterations
7.11 Training of relaxation parameter using randomized Kaczmarz method with maximum number of iterations
7.12 Relative error for the SIRT methods using line search
7.13 Relative error using the Ψ-based relaxations
7.14 Relative error using the modified Ψ-based relaxations
7.15 Relative errors for the SNARK test problem with different relaxation strategies
7.16 Training of stopping rule for Cimmino's projection method
7.17 Training of stopping rule for DROP
7.18 Training of stopping rule for Kaczmarz's method
7.19 Illustration of the stopping rules for the SIRT methods
7.20 Illustration of the stopping rules for the ART methods
7.21 Ψ-based relaxation with stopping rules
7.22 Line search with stopping rules
7.23 Training λ with stopping rules for SIRT methods
7.24 Training λ with stopping rules for ART methods
A.1 Illustration of projection on a hyperplane where the origin is in the hyperplane
A.2 Illustration of projection on a hyperplane where the origin is not in the hyperplane
A.3 Illustration of the roots
A.4 Zoom of the roots


Chapter 1

Introduction

In the first half of the 20th century the Polish mathematician Stefan Kaczmarz [27] and the Italian mathematician Gianfranco Cimmino [8] independently developed iterative algorithms for solving linear systems. In 1970 Gordon, Bender and Herman rediscovered Kaczmarz's method and applied it in medical imaging [17]. They called the method ART (Algebraic Reconstruction Technique), and when Hounsfield patented the first CT scanner in 1972, which earned him, together with Cormack, the Nobel Prize in 1979, the classical methods found their practical purpose in tomography [24]. The word tomography means reconstruction from slices. After the invention of the CT scanner several new methods related to the old classical ones were developed.

This master thesis deals with the classical methods of Kaczmarz and Cimmino, but also with the methods related to them. We divide the gathered methods into two main categories, the SIRT and the ART methods, and present strategies for choosing the relaxation parameter as well as different stopping rules. We will compare the performance of the different methods and strategies on a test problem derived from medical tomography.


1.1 Structure of the Thesis

The goal of the project is to develop and implement a MATLAB package containing a number of iterative methods for algebraic reconstruction used in tomography problems. This includes describing the methods in a common framework, such that the methods are presented in the same notation and the created functions have similar interfaces. Furthermore, strategies for choosing the relaxation parameter must be available, just as different stopping rules must be included. A few test problems relevant for this kind of method must also be implemented. A critical comparison of the different methods and strategies applied to different test problems will be produced. Finally, the thesis will have the form of an extended manual, containing chapters with theory as well as manual pages for each implemented routine.

The chapters of the thesis are organized in the following way:

• Chapter 2: We begin by giving a short presentation of inverse problem theory and defining the concepts of semi-convergence for iterative methods and of resolution limit.

• Chapter 3: In this chapter we introduce the theory of the gathered SIRT and ART methods which this package concerns. We also provide a brief overview of block-iterative methods.

• Chapter 4: In the next chapter we examine the semi-convergence behaviour of a subset of the SIRT methods. After this examination we introduce different strategies for choosing the relaxation parameter, where one of the strategies is based on the examination of semi-convergence.

• Chapter 5: In this chapter we introduce three strategies for the stopping rules. To devise effective stopping rules, a training strategy is introduced for two of the stopping rules.

• Chapter 6: We introduce in this chapter three different test problems, where two of the test problems arise from medical tomography and the third arises from seismic tomography.

• Chapter 7: This chapter discusses the performance of the methods. We also examine the performance of the methods when the different strategies for choosing the relaxation parameter and the different stopping rules are used. Furthermore, we compare the performance of the SIRT and the ART methods.


• Chapter 8: This chapter contains an overview of the implemented routines, followed by an individual manual page for each function in the package. The manual pages are arranged alphabetically.

• Chapter 9: This chapter contains the conclusion and suggestions for future work.

All the routines have been implemented in MATLAB 7.8. To produce the test results, examples and figures, a large number of scripts have been created, but only the relevant functions are included in the package.




Chapter 2

Theory of Inverse Problems and Regularization

Inverse problems arise in many applications in science and technology, for example in medical imaging (e.g. CT scanning), in geophysical prospecting, and in image deblurring. We will in this chapter introduce some of the fundamental concepts of inverse problems. We first introduce the concept of an inverse problem and describe what defines an ill-posed problem. Then the important tools of the SVD and the discrete Picard condition are defined, followed by a few examples of spectral filtering. Finally, we give a short description of semi-convergence for iterative methods and define the concept of resolution limit.

2.1 Discrete Ill-Posed Problems

Inverse problems arise when we need to compute information that is either internal or hidden. In the forward problem we have a known input and a known system, and we can then compute the output. In the inverse problem the output is often known with errors, and we then have to compute either the system or the input, the other one being known. For linear problems we let the system be represented by the matrix A ∈ R^{m×n}, the output by the right-hand side b ∈ R^m, which is the known data, and the input by the solution x ∈ R^n. The problem can be formulated as a system of linear equations:

    Ax = b,    (2.1)

where the matrix A typically is a discretization of an ill-posed problem, e.g. the Radon transform. The system (2.1) is said to be overdetermined when m > n and underdetermined when m < n.

The definition of a well-posed problem goes back to Hadamard, who stated that a problem is well-posed if it satisfies the following requirements:

Existence: There exists a solution to the problem.
Uniqueness: There exists only one solution to the problem.
Stability: The solution must depend continuously on the data.

If one of the three conditions is not satisfied, the problem is said to be ill-posed.

2.2 SVD and Picard Condition

An important tool in analysing inverse problems is the singular value decomposition (SVD). The SVD is defined for any matrix A ∈ R^{m×n} as

    A = ∑_{i=1}^{min{m,n}} u_i σ_i v_i^T,

where the vectors u_i and v_i are orthonormal, and

    σ_1 ≥ σ_2 ≥ ... ≥ σ_{min{m,n}} ≥ 0.

The elements σ_i are the singular values, and the rank of the matrix A is equal to the number of positive singular values. Assuming that the inverse of A exists, it is given as

    A^{-1} = ∑_{i=1}^{min{m,n}} (1/σ_i) v_i u_i^T.


[Figure 2.1: The first 9 left singular vectors u_i for the test problem shaw.]

[Figure 2.2: The Picard plot for the test problem shaw, showing σ_i, |u_i^T b| and |u_i^T b|/σ_i. (a) Picard plot with no noise; (b) Picard plot with noise level δ = 10^{-3}.]


Using this we can write the naive solution as

    x = A^{-1} b = ∑_{i=1}^{min{m,n}} (u_i^T b / σ_i) v_i.

Figure 2.1 shows the first nine left singular vectors u_i for the test problem shaw from Regularization Tools [21] with white noise level δ = 10^{-3}. We see that the singular vectors have more oscillations as i increases, while the corresponding singular values σ_i decrease.

We will now investigate the behaviour of the SVD coefficients ⟨u_i, b⟩ and ⟨u_i, b⟩/σ_i. We call a plot of these coefficients together a Picard plot. Figure 2.2 shows the Picard plot for the test problem shaw with n = 50. The left plot (a) shows the Picard plot when no noise is added to the right-hand side. We notice that the SVD coefficients |⟨u_i, b⟩| decay faster than the singular values σ_i. This continues until i ≥ 18, where the coefficients level off; we recognize the reached level as the machine precision. We also notice that the solution coefficients ⟨u_i, b⟩/σ_i decay as well, but for i ≥ 18 they increase due to the inaccuracy of the coefficients ⟨u_i, b⟩. We therefore cannot expect to get a meaningful solution to the inverse problem, since the influence of the rounding errors destroys the computed solution.

The plot to the right (b) shows the same problem, but with a noisy right-hand side. In this plot the SVD coefficients |⟨u_i, b⟩| also decay until a certain level where they level off; this level is determined by the added noise. The solution coefficients ⟨u_i, b⟩/σ_i also decay in the beginning, but increase again when the SVD coefficients |⟨u_i, b⟩| level off. In this case the computed solution is totally dominated by the SVD components corresponding to the smaller singular values.

In this connection we introduce the discrete Picard Condition.

Definition 2.1 (Discrete Picard Condition) The discrete Picard Condition is satisfied if, for all singular values σ_i greater than τ, the corresponding coefficients |⟨u_i, b⟩| on average decay faster than σ_i, where τ denotes the level at which the computed singular values level off due to rounding errors.

Notice that the Picard Condition concerns the decay, not the size, of the singular values and the coefficients |⟨u_i, b⟩|. If the discrete Picard condition is not satisfied, we cannot expect to solve a discrete ill-posed problem.
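To make the Picard plot concrete, here is a minimal MATLAB sketch of how figure 2.2 can be reproduced. It assumes that the function shaw from Regularization Tools is on the path, and the noise scaling shown is one simple choice of white noise with relative level 10^{-3}, not necessarily the exact one used for the figures:

    n = 50;
    [A,bbar,xbar] = shaw(n);                 % test problem with exact data
    e = randn(n,1);                          % white Gaussian noise
    b = bbar + 1e-3*norm(bbar)*e/norm(e);    % noisy right-hand side
    [U,S,V] = svd(A);
    sigma = diag(S);
    beta  = abs(U'*b);                       % SVD coefficients |<u_i,b>|
    semilogy(1:n, sigma, '.-', 1:n, beta, 'x', 1:n, beta./sigma, 'o')
    legend('\sigma_i', '|u_i^T b|', '|u_i^T b| / \sigma_i'), xlabel('i')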


2.3 Spectral Filtering

Due to the difficulties associated with discrete inverse problems, the naive solution x = A^{-1} b is useless, since it becomes dominated by the rounding errors. We will in this section introduce two spectral filtering methods, which can be expressed as a filtered SVD expansion of the form

    x_filter = ∑_{i=1}^{min{m,n}} φ_i (⟨u_i, b⟩/σ_i) v_i,

where φ_i are the filter factors for the corresponding method. We first introduce the truncated SVD method (TSVD).

We realised that the large errors in the naive solution came from the noisy SVD coefficients corresponding to the smallest singular values, but we also noticed that the SVD coefficients for large singular values were useful, since these coefficients fulfilled ⟨u_i, b⟩/σ_i ≃ ⟨u_i, b̄⟩/σ_i, where b is the noisy right-hand side and b̄ is the right-hand side without noise. This leads to the truncated SVD (TSVD) method, where we choose to include only the first k components of the naive solution x. With this method we therefore cut off those SVD coefficients that are dominated by inverted noise. We define the TSVD solution as

    x_k = ∑_{i=1}^{k} (⟨u_i, b⟩/σ_i) v_i,

where k is called the truncation parameter and must be chosen such that all the noise-dominated SVD coefficients are discarded. This corresponds to the following filter factors for the TSVD method:

    φ_i = 1 for i ≤ k,  φ_i = 0 for i > k.

The second method we introduce is Tikhonov regularization. For this method the filter factors are defined as

    φ_i = σ_i^2 / (σ_i^2 + ω^2), i = 1, ..., n,

where ω is the regularization parameter, which in a sense corresponds to the truncation parameter k. Tikhonov regularization corresponds to the following minimization problem:

    min_x { ‖Ax − b‖_2^2 + ω^2 ‖x‖_2^2 }.


We notice that for σ_i ≫ ω the filter factors are close to 1, and the corresponding SVD components contribute to x_filter with almost full strength. On the other hand, when σ_i ≪ ω the filter factors are close to σ_i^2/ω^2, and the SVD components are damped, or filtered.

[Figure 2.3: The basic concept of semi-convergence: the iterates x^0, x^1, x^2, ... first approach the exact solution, with an optimal iterate x^{k_opt}, before converging to the naive solution A^{-1} b.]
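As an illustration, both filtered solutions can be formed directly from the SVD. The following sketch assumes a square A, as in the shaw example above, and the values of k and ω are arbitrary choices for the example, not recommendations:

    [U,S,V] = svd(A);
    sigma = diag(S);
    c = U'*b;                                % SVD coefficients u_i^T b
    k = 10;                                  % TSVD truncation parameter (example value)
    x_tsvd = V(:,1:k)*(c(1:k)./sigma(1:k));
    omega = 1e-2;                            % Tikhonov parameter (example value)
    phi = sigma.^2./(sigma.^2 + omega^2);    % Tikhonov filter factors
    x_tikh = V*(phi.*c./sigma);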

2.4 Iterative Methods and Semi-Convergence

For large problems where it is not feasible to compute the SVD, we need other methods than the introduced TSVD and Tikhonov regularization. This leads us to iterative methods, which require a user-specified starting vector x^0; from this vector the method produces a sequence of iterates x^1, x^2, ... that converge to some solution.

For iterative methods Natterer [31] introduced the concept of semi-convergence, which describes the behaviour of the iterates x^k. The first iterates tend to be better and better approximations of the exact solution, but at some point the iterates start to deteriorate and instead converge to the naive solution x = A^{-1} b; see figure 2.3. For iterative methods the regularization parameter is therefore the number of iterations.
regularization parameter is there<strong>for</strong>e the number of iterations.


2.5 Resolution Limit

When exploring the iterative methods which this package concerns, we need to define the concept of resolution limit. For a better understanding of this concept, recall that the relative error is defined as

    ‖x^k − x̄‖_2 / ‖x̄‖_2,

where x^k is the solution at the k'th iterate and x̄ is the exact solution.

The bound on how accurate a solution one can obtain is determined by the noise in the data, and it can be studied in terms of the SVD. We define the resolution limit to be this bound. The resolution limit depends not only on the noise, but also on the method used and on the given problem. We define the resolution limit as

    RL(A, b, method) = min_k ‖x^k − x̄‖_2 / ‖x̄‖_2.

With this definition the resolution limit depends on the method used and on the problem.
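In practice the resolution limit of a method on a given problem can be estimated by recording the smallest relative error over the iterations. A sketch, where onestep is a stand-in for one iteration of whichever method is used and is not a function from the package:

    x = zeros(n,1);  RL = inf;
    for k = 1:K
        x  = onestep(A, b, x);                       % one iteration of the method
        RL = min(RL, norm(x - xbar)/norm(xbar));     % smallest relative error so far
    end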




Chapter 3

Iterative Methods for Reconstruction

In this chapter we give a brief introduction to the theory of some iterative methods called SIRT and ART methods. The need for iterative methods arises when the dimensions of the matrix A become so large that direct factorization methods become infeasible, which is usually the case in two and three dimensions. This is typically the case when A is a discretization arising from a real-world problem. One can then use iterative methods instead of the well-known Tikhonov regularization or TSVD described in section 2.3. Whereas Tikhonov regularization has the regularization parameter ω, for the iterative methods the number of iterations k plays the role of the regularization parameter.

In the following theory we will assume that all elements of the matrix A are nonnegative. In the articles where the methods are defined they do not include user-defined weights, but we have chosen to include them in both the description and the implementation.
the description and the implementation.


3.1 Simultaneous Iterative Reconstructive Technique (SIRT)

In this section we present the class of iterative methods which we call Simultaneous Iterative Reconstructive Technique (SIRT). As the name indicates, all the methods of this class are simultaneous, which means that information from all the equations is used at the same time.

In the literature the class of SIRT methods is also referred to as Landweber-type methods, since the Landweber iteration is one of the classical methods of the SIRT class. The common property of the SIRT methods is that they can be written in the following general form:

    x^{k+1} = x^k + λ_k T A^T M (b − A x^k), k = 0, 1, ...    (3.1)

where x^k denotes the current iteration vector, x^{k+1} denotes the new iteration vector, λ_k is the relaxation parameter, and the matrices M and T are symmetric positive definite. The different methods correspond to different choices of the matrices M and T. In most of the presented methods we have T = I.

For the methods given on the form (3.1) with T = I, the following convergence theorem has been shown [4], [25].

Theorem 3.1 The iterates of the form (3.1) with T = I converge to a solution x̂ of min_x ‖Ax − b‖_M if and only if

    0 < ε ≤ λ_k ≤ 2/σ_1^2 − ε,

where ε is an arbitrarily small but fixed constant and σ_1 is the largest singular value of M^{1/2} A. If in addition x^0 ∈ R(A^T), then x̂ is the unique solution of minimum 2-norm.

Theorem 3.1 is useful since it ensures convergence of the SIRT methods in general. The condition was originally only proved to be sufficient for convergence, but in [35] it is shown that it is also necessary, as stated in the theorem.
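The general form (3.1) translates almost directly into MATLAB. The following generic loop, with T = I and a fixed relaxation parameter, is a simplified sketch for illustration, not the package's actual implementation:

    function x = sirt_general(A, M, b, lambda, K, x0)
    % Generic SIRT iteration (3.1) with T = I:
    % x^{k+1} = x^k + lambda*A'*M*(b - A*x^k).
    x = x0;
    for k = 1:K
        x = x + lambda*(A'*(M*(b - A*x)));
    end
    end

The specific SIRT methods below then differ only in how the diagonal matrix M (and, for DROP and SART, also T) is constructed.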

3.1.1 Classical Landweber Method

The classical Landweber method was first introduced by Landweber in [29], and it has often been used for image reconstruction. It can be written as follows:

    x^{k+1} = x^k + λ_k A^T (b − A x^k), k = 0, 1, ...,    (3.2)

which corresponds to setting M = T = I in (3.1).

The iterates x^k from (3.2) can be expressed as filtered SVD solutions. If we let the SVD of the matrix A take the form

    A = U Σ V^T = ∑_{i=1}^{n} u_i σ_i v_i^T,

then the filtered solution can be written as

    x^k = V Φ^k Σ^{-1} U^T b,

where Φ^k is given as

    Φ^k = diag(φ^k_1, ..., φ^k_n).

The filter factors φ^k_i for i = 1, ..., n are given as

    φ^k_i = 1 − (1 − λ σ_i^2)^k.

For small singular values σ_i we have φ^k_i ≈ k λ σ_i^2, showing that they decay at the same rate as the Tikhonov filter factors described in section 2.3.

3.1.2 Generalized Landweber

Another classical method is the generalized Landweber iteration, described in [20] and [33]. It has the following form:

    x^{k+1} = x^k + λ T A^T (b − A x^k), k = 0, 1, ...,

where λ is a constant relaxation parameter and T is a "shaping matrix" given by

    T = F(A^T A),

where F is a rational function of A^T A. We obtain the classical Landweber method when F = I.

The filter factors φ^k_i for the generalized Landweber method are given by

    φ^k_i = 1 − (1 − σ_i^2 F(σ_i^2))^k,

since the eigenvalue decomposition of F(A^T A) is given as

    F(A^T A) = ∑_{i=1}^{n} v_i F(σ_i^2) v_i^T.

We see that the generalized Landweber method gives a further impact on the filter factors, since the function F occurs in them. It is also possible to choose the function in such a way that the method approximates, say, the TSVD or the Tikhonov regularization.

3.1.3 Cimmino's Method

Another method in the SIRT class is Cimmino's method, which was introduced in [8]. Cimmino's method was originally based on reflections in hyperplanes, but there also exists a version with projections.

To introduce the two versions of Cimmino's method, we define H_i to be the hyperplane for the linear equation ⟨a^i, x⟩ = b_i:

    H_i = {x ∈ R^n | ⟨a^i, x⟩ = b_i}, for i = 1, ..., m.

We will introduce both versions of Cimmino's method, starting with the original one that uses reflections.
the original that uses reflections.


The idea behind Cimmino's reflection method is that the next iterate is found using an equal weighting of the reflections of x^k in the hyperplanes H_i. The reflection of a point z in the hyperplane H_i is

    R_i(z) = z + 2 (b_i − ⟨a^i, z⟩)/‖a^i‖_2^2 a^i.

The reflection method then uses the average of the reflections of x^k in the hyperplanes H_i to determine the direction of the step to the new iterate; figure 3.1 illustrates the concept in R^2 for a consistent problem.

[Figure 3.1: Cimmino's reflection method: the reflections R_1(z) and R_2(z) of a point z in the hyperplanes H_1 and H_2 in R^2.]

The method can then be written as follows:

    x^{k+1} = x^k + λ_k (1/m) ∑_{i=1}^{m} w_i (R_i(x^k) − x^k),

where the relaxation parameter λ_k determines how much of the step is taken from x^k to the new iterate x^{k+1}, and w_i > 0 are user-defined weights. Using the definition of the reflections we get:

    x^{k+1} = x^k + λ_k (2/m) ∑_{i=1}^{m} w_i (b_i − ⟨a^i, x^k⟩)/‖a^i‖_2^2 a^i, for k = 0, 1, ....

Cimmino's reflection method can be written using matrix notation on the form (3.1), with M = (2/m) diag(w_i/‖a^i‖_2^2) for i = 1, ..., m and T = I.

We will now introduce Cimmino's projection method. Using an equal weighting of all the equations, the next iterate in Cimmino's projection method is found using the orthogonal projections of x^k onto the hyperplanes H_i. As shown in appendix A.1, the orthogonal projection of a vector z onto the hyperplane H_i is

    P_i(z) = z + (b_i − ⟨a^i, z⟩)/‖a^i‖_2^2 a^i.    (3.3)

Cimmino's projection method uses the average of the projections of x^k onto the hyperplanes H_i to determine the direction of the step to the new iterate. Figure 3.2 illustrates the concept in R^2 for a consistent problem.

[Figure 3.2: Cimmino's projection method: the projections P_1(z) and P_2(z) of a point z onto the hyperplanes H_1 and H_2 in R^2.]

The new iterate can then be described as the current iterate plus a contribution from the average of the found step directions. We can therefore write Cimmino's projection method as

    x^{k+1} = x^k + λ_k (1/m) ∑_{i=1}^{m} w_i (P_i(x^k) − x^k),

where the relaxation parameter λ_k determines how much of the step is taken from x^k to the new iterate x^{k+1}, and w_i > 0 for i = 1, ..., m are user-defined weights.

Using the definition of the orthogonal projection (3.3) we can rewrite the expression:

    x^{k+1} = x^k + λ_k (1/m) ∑_{i=1}^{m} w_i (b_i − ⟨a^i, x^k⟩)/‖a^i‖_2^2 a^i, for k = 0, 1, ....

Using matrix notation, Cimmino's projection method has the general form (3.1) with M = (1/m) diag(w_i/‖a^i‖_2^2) for i = 1, ..., m and T = I.
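In MATLAB the matrix M for Cimmino's projection method can be built as a sparse diagonal matrix and passed to a generic loop such as sirt_general above; the unit weights and the values of λ and the iteration count are just example choices:

    [m,n]   = size(A);
    w       = ones(m,1);                     % user-defined weights w_i > 0
    rownorm = full(sum(A.^2,2));             % ||a^i||_2^2 for each row
    M = spdiags(w./(m*rownorm), 0, m, m);    % M = (1/m) diag(w_i/||a^i||_2^2)
    x = sirt_general(A, M, b, 1, 50, zeros(n,1));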

3.1.4 Component Averaging (CAV)

Component averaging (CAV) was introduced in [6] and is an extension of Cimmino's method. In Cimmino's method we use an equal weighting of the contributions from the projections. In the case where the matrix A is dense, it seems fair that all contributions P_i(x^k) − x^k are equally weighted, but for a sparse A this is not the case.

The heuristic in CAV therefore includes a factor proportional to the number of nonzero elements. We let s_j denote the number of nonzero elements of column j:

    s_j = NNZ(a_j), for j = 1, ..., n,

and define ‖a^i‖_S^2 = ∑_{j=1}^{n} a_ij^2 s_j. Using this, the CAV algorithm is as follows:

    x^{k+1}_j = x^k_j + λ_k ∑_{i=1}^{m} w_i (b_i − ⟨a^i, x^k⟩)/‖a^i‖_S^2 a^i_j, for k = 0, 1, ...,

where w_i > 0 are user-defined weights.

We see that when A is dense we recover the original Cimmino method, since s_j = m for all j = 1, ..., n, and hence ‖a^i‖_S^2 = m ‖a^i‖_2^2.

To rewrite the CAV algorithm in matrix form we define S = diag(s_1, s_2, ..., s_n), with the s_j-values defined as above. We then let

    D_S = diag(w_i/‖a^i‖_S^2) for i = 1, ..., m,

where ‖a^i‖_S^2 = (a^i)^T S a^i, and the CAV algorithm takes the matrix form

    x^{k+1} = x^k + λ_k A^T D_S (b − A x^k),

which we recognize as (3.1) with M = D_S and T = I.
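The only change from Cimmino's projection method is the column-count weighting in the row norms; a sketch, reusing w and sirt_general from above:

    s  = full(sum(A ~= 0, 1))';              % s_j = NNZ(a_j), nonzeros per column
    normS = (A.^2)*s;                        % ||a^i||_S^2 = sum_j a_ij^2 * s_j
    DS = spdiags(w./normS, 0, m, m);
    x  = sirt_general(A, DS, b, 1, 50, zeros(n,1));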

3.1.5 Diagonally Relaxed Orthogonal Projections (DROP)

Another method in the SIRT class is the diagonally relaxed orthogonal projections (DROP) method, described in [5]. This method is another extension of Cimmino's method, inspired by the CAV method. In the DROP method we also introduce a user-defined weighting of the equations, denoted w_i > 0.

The DROP method can be written as

    x^{k+1} = x^k + λ_k ∑_{i=1}^{m} w_i S^{-1} (P_i(x^k) − x^k),

where P_i(x^k) is defined as in (3.3) and S is defined as above for CAV. Using (3.3) we can rewrite the DROP algorithm into the following form:

    x^{k+1}_j = x^k_j + λ_k (1/s_j) ∑_{i=1}^{m} w_i (b_i − ⟨a^i, x^k⟩)/‖a^i‖_2^2 a^i_j,

for all j = 1, 2, ..., n. Recall that w_i > 0 for all i = 1, ..., m are user-chosen weights. When w_i = 1 for all i and the matrix A is dense, i.e. s_j = m for all j = 1, ..., n, we recover Cimmino's method.

The DROP method has the following matrix form:

    x^{k+1} = x^k + λ_k S^{-1} A^T D (b − A x^k),    (3.4)

which we recognize as the general form with T = S^{-1} and M = D = diag(w_i/‖a^i‖_2^2).

Since the DROP method has T ≠ I, we cannot use theorem 3.1, and we therefore make a further investigation of the convergence theory. By defining y^k = S^{1/2} x^k and Ā = A S^{-1/2} we can rewrite (3.4) in another matrix form:

    y^{k+1} = y^k + λ_k Ā^T D (b − Ā y^k).

For this form it is known that λ_k must lie between 0 and 2/ρ(Ā^T D Ā). Using the definition of Ā we get that ρ(Ā^T D Ā) = ρ(S^{-1} A^T D A). In [5] it is shown that for the DROP method with w_i > 0 for all i = 1, ..., m, if D = diag(w_i/‖a^i‖_2^2) ∈ R^{m×m} and S^{-1} = diag(1/s_j) ∈ R^{n×n}, where s_j ≠ 0, then ρ(S^{-1} A^T D A) ≤ max{w_i | i = 1, ..., m}. We therefore have the following convergence theorem, which replaces theorem 3.1 for the DROP method, where ‖z‖_D = ⟨z, Dz⟩^{1/2} denotes the D-norm:

Theorem 3.2 Assume that w_i > 0 for all i = 1, ..., m. If for all k ≥ 0

    0 < ε ≤ λ_k ≤ (2 − ε)/max{w_i | i = 1, ..., m},

where ε is an arbitrarily small but fixed constant, then any sequence generated by (3.4) converges to a weighted least squares solution x* = argmin{‖Ax − b‖_D | x ∈ R^n}. If in addition x^0 ∈ R(S^{-1} A^T), then x* is the unique solution of minimum S-norm.
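Since DROP has T = S^{-1} ≠ I, the generic loop above needs one extra diagonal scaling. A sketch, assuming A has no zero columns so that all s_j > 0:

    D    = spdiags(w./rownorm, 0, m, m);     % M = diag(w_i/||a^i||_2^2)
    Sinv = spdiags(1./s, 0, n, n);           % T = S^{-1} = diag(1/s_j)
    x = zeros(n,1);
    for k = 1:K
        x = x + lambda*(Sinv*(A'*(D*(b - A*x))));   % the DROP iteration (3.4)
    end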

3.1.6 Simultaneous Algebraic Reconstruction Technique (SART)

The Simultaneous Algebraic Reconstruction Technique (SART) was developed in the ART setting [1], but it can be written in the general SIRT form (3.1), and we therefore categorize it as a SIRT method.

The SART method is written in the following matrix form:

    x^{k+1} = x^k + λ_k V^{-1} A^T W (b − A x^k),

where V = diag(ς_j) and W = diag(1/ς^i), with ς^i and ς_j denoting the row and column sums:

    ς^i = ∑_{j=1}^{n} a^i_j for i = 1, ..., m,
    ς_j = ∑_{i=1}^{m} a^i_j for j = 1, ..., n.

For this method we assume that a^i ≠ 0 and a_j ≠ 0, such that A does not contain any zero rows or columns.

Since the SART method has T ≠ I, we cannot use theorem 3.1. The convergence of SART was analysed independently by Censor and Elfving in [4] and by Jiang and Wang in [26]. Both showed that SART converges for relaxation parameters within the interval (0, 2).
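With a nonnegative A that has no zero rows or columns, the SART scalings are just row and column sums; a sketch:

    rowsum = full(sum(A,2));                 % row sums
    colsum = full(sum(A,1))';                % column sums
    W    = spdiags(1./rowsum, 0, m, m);
    Vinv = spdiags(1./colsum, 0, n, n);
    x = zeros(n,1);
    for k = 1:K
        x = x + lambda*(Vinv*(A'*(W*(b - A*x))));   % the SART iteration
    end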

3.2 Algebraic Reconstruction Techniques (ART)

We now introduce a different class of methods, which we denote algebraic reconstruction techniques (ART). All methods in the ART class are fully sequential, i.e., the equations are treated one at a time, each step depending on the result of the previous one.

3.2.1 Kaczmarz's Method

The classical and best-known method of the ART class is Kaczmarz's method [27]. The method is a so-called row-action method, since each iteration consists of a "sweep" through all the rows of the matrix A. Since the method uses one equation in each step, an iteration consists of m steps. Figure 3.3 shows an example of a sweep for the consistent case with relaxation parameter λ_k = 1.

The algorithm for Kaczmarz's method updates x^k in the following way:

    x^{k,0} = x^k,
    x^{k,i} = x^{k,i−1} + λ_k (b_i − ⟨a^i, x^{k,i−1}⟩)/‖a^i‖_2^2 a^i, i = 1, 2, ..., m,
    x^{k+1} = x^{k,m}.
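One sweep of Kaczmarz's method is a simple loop over the rows; the following sketch is written for clarity, not speed, and is not the package's implementation:

    rownorm = full(sum(A.^2,2));             % ||a^i||_2^2 for each row
    for k = 1:K                              % K iterations of one sweep each
        for i = 1:m
            ai = A(i,:)';
            x  = x + lambda*((b(i) - ai'*x)/rownorm(i))*ai;
        end
    end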


[Figure 3.3: Kaczmarz's method: a sweep from x^k to x^{k+1} through the hyperplanes H_1, ..., H_6.]

If the linear system (2.1) is consistent, Kaczmarz's method converges to a solution of the system. If the system is inconsistent, every subsequence of iterates converges, but not necessarily to a least squares solution.

In the literature Kaczmarz's method is also referred to as ART, which can be confusing since ART is also the name for algebraic reconstruction techniques in general.

Experiments have shown that Kaczmarz's method converges fast in the first iterations, after which the convergence becomes very slow. This is perhaps one of the reasons why this method was often used for tomography problems, where the solution is often found within few iterations.

By using SOR theory it can be shown that Kaczmarz's method with λ constant in each iteration can be written in the form (3.1), but then M_A is no longer symmetric [13]:

    x^{k+1} = x^k + λ A^T M_A (b − A x^k),    (3.5)

where M_A = (D + λL)^{-1}. Since M_A is not symmetric, we cannot use the theory derived for the SIRT methods. It can on the other hand be proved that for 0 < λ < 2 the iterations of Kaczmarz's method (3.5) converge to a solution of

    A^T M_A (b − A x) = 0.

[Figure 3.4: Symmetric Kaczmarz: a forward and a backward sweep through the hyperplanes H_1, ..., H_6, from x^k to x^{k+1}.]

3.2.2 Symmetric Kaczmarz

A variant of the Kaczmarz method is symmetric Kaczmarz. This method is also fully sequential, and it consists of one "sweep" of Kaczmarz's method followed by another "sweep" where the equations are used in reverse order. One iteration of the symmetric Kaczmarz method therefore consists of 2m − 2 steps. Figure 3.4 shows an example of an iteration for the consistent case with relaxation parameter λ_k = 1.

The algorithm for the symmetric Kaczmarz method is the following:

    x^{k,0} = x^k,
    x^{k,i} = x^{k,i−1} + λ_k (b_i − ⟨a^i, x^{k,i−1}⟩)/‖a^i‖_2^2 a^i, i = 1, ..., m, m−1, ..., 2,    (3.6)
    x^{k+1} = x^{k,1},

where x^{k,1} denotes the result of the last step in (3.6).


Symmetric Kaczmarz was introduced in [3], and as for Kaczmarz's method it can be rewritten in the form of the SIRT methods [14], with λ_k = λ:

    x^{k+1} = x^k + λ A^T M_SA (b − A x^k),

where M_SA is symmetric. This means that the theory for the SIRT methods is valid, but it is not practical to implement the method in this way.
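A sketch of one iteration of symmetric Kaczmarz, i.e. a forward sweep followed by the reverse sweep over rows m−1 down to 2, reusing rownorm from the Kaczmarz sketch above:

    for i = [1:m, m-1:-1:2]                  % 2m-2 steps per iteration
        ai = A(i,:)';
        x  = x + lambda*((b(i) - ai'*x)/rownorm(i))*ai;
    end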

3.2.3 Randomized Kaczmarz

The next method we introduce is the randomized Kaczmarz method. Experience has shown that Kaczmarz's method can converge very slowly to the solution. The method presented here was proposed in [36] and is proved to have an exponential expected rate of convergence; moreover, the rate does not depend on the number of equations in the system. The randomized Kaczmarz method has the following form:

    x^{k+1} = x^k + (b_{r(i)} − ⟨a^{r(i)}, x^k⟩)/‖a^{r(i)}‖_2^2 a^{r(i)},

where the index r(i) is chosen from the set {1, 2, ..., m} randomly, with probability proportional to ‖a^{r(i)}‖_2^2.

For the randomized Kaczmarz method we cannot speak of iterations but only of the number of steps.

In the definition of the randomized Kaczmarz method in [36] the method is presented without a relaxation parameter λ_k, but in our implemented algorithm this relaxation parameter is present. We emphasize that no convergence results exist for this parameter, and a safe choice is therefore λ_k = 1, since this recovers the method as originally presented.
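A sketch of m random steps of the method, drawing row indices with probability proportional to the squared row norms; the safe choice λ_k = 1 mentioned above is used implicitly:

    p = rownorm/sum(rownorm);                % sampling probabilities
    for step = 1:m
        i  = find(rand <= cumsum(p), 1);     % draw row index i with probability p(i)
        ai = A(i,:)';
        x  = x + ((b(i) - ai'*x)/rownorm(i))*ai;
    end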

3.2.4 Extended Kaczmarz Method

As mentioned earlier, Kaczmarz's method cannot provide a least squares solution in the inconsistent case, and therefore an extended Kaczmarz method was proposed. In this method we also consider orthogonal projections onto hyperplanes defined with respect to the columns of A. We let a_j denote the j'th column of A. The extended Kaczmarz method is given both in a version with and a version without relaxation parameters; we will here only consider the version with relaxation parameters. We let λ denote the constant relaxation parameter for the orthogonal projections on the rows of A, and we let α denote the constant relaxation parameter for the orthogonal projections using the columns.

The extended Kaczmarz method has the following algorithm, where x^0 ∈ R^n and y^0 = b:

    y^{k,0} = y^k,
    y^{k,j} = y^{k,j−1} − α (⟨a_j, y^{k,j−1}⟩/‖a_j‖_2^2) a_j, j = 1, ..., n,
    y^{k+1} = y^{k,n},
    b^{k+1} = b − y^{k+1},
    x^{k,0} = x^k,
    x^{k,i} = x^{k,i−1} + λ (b^{k+1}_i − ⟨a^i, x^{k,i−1}⟩)/‖a^i‖_2^2 a^i, i = 1, ..., m,
    x^{k+1} = x^{k,m}.

For the extended Kaczmarz method it is proved in [34] that for any x^0 ∈ R^n and for any λ, α ∈ (0, 2) the method converges to a least squares solution. This method is not implemented in the package.

3.2.5 Multiplicative ART

Another method in the ART class is the multiplicative ART (MART) method, proposed in [17]. For this method we assume that x^0 is an n-dimensional vector of all ones and that all elements of A lie between 0 and 1, i.e. 0 ≤ a_ij ≤ 1. The multiplicative ART method is given as

    x^{k+1}_j = (b_i / ⟨a^i, x^k⟩)^{a_ij} x^k_j,

where i = (k mod m) + 1. Originally, when the method was presented, it was assumed that all elements of A are either 0 or 1, but it has later been shown that if

• all the entries of A are between 0 and 1,
• A does not have zero rows,
• the system (2.1) has a nonnegative solution,

then multiplicative ART converges to the maximum-entropy solution of Ax = b, where the entropy is defined as

    maxent(x) = − ∑_{j=1}^{n} (x_j/(n x̄)) ln(x_j/(n x̄)),

with x̄ the average value of the x_j.

Table 3.1: Work units for one iteration of the SIRT and the ART methods.

Method                 WU
Landweber              2
Cimmino                2
CAV                    2
DROP                   2
SART                   2
Kaczmarz               4
Symmetric Kaczmarz     8
Randomized Kaczmarz    4

3.3 Considerations Towards the Package

We have now introduced some SIRT and ART methods. For the package we will only use some of them; the methods left out of the package were nevertheless interesting enough that they should be described and mentioned. In the package we use the SIRT methods Landweber, Cimmino (both the reflection and the projection version), CAV, DROP and SART. We have not implemented generalized Landweber, since there is no specific description of the T matrix. For the ART methods we have implemented Kaczmarz's method, symmetric Kaczmarz and randomized Kaczmarz. Extended Kaczmarz is not implemented, since it requires a choice of two relaxation parameters. The method MART is also left out of the package, since its algorithm is very different from the other methods.

We have two classes of methods which cannot be directly compared with respect to computational work, since they have different properties. Therefore we introduce the concept of a work unit (WU). We define a work unit to be one matrix-vector multiplication. In appendix A.3 the total work units per iteration are calculated for each of the implemented methods. The results are collected in table 3.1. We notice that all SIRT methods use 2 WU per iteration, while both Kaczmarz's method and randomized Kaczmarz use 4 WU per iteration, since we define one iteration of randomized Kaczmarz to be m random selections of a row. Since symmetric Kaczmarz uses twice as many steps per iteration as Kaczmarz's method, its work units per iteration is 8. This result will be used later to compare the performance of the SIRT and the ART methods.

When comparing the methods implemented in this package, the user should notice that due to the MATLAB implementation the SIRT methods are much faster than the ART methods. The user should also be aware that this is only the case because the implementation is done in MATLAB, where loops are slow; in another language there would not be this difference in running time. When implementing the SIRT methods in MATLAB a dilemma occurs between speed and memory. When creating the matrices M and T we have mostly chosen the fastest implementation, but in case of memory trouble most of the SIRT methods also have an alternative implementation which requires less memory at the cost of a slower running time. Where such alternative code exists, it can be found in the comments in the code.

In the following chapters it might seem to the user that we prefer the SIRT methods, since most of the remaining theory is for the SIRT methods, but this is only because a corresponding theory cannot be found for the ART methods.

3.4 Block-Iterative Methods

We will now look into the field of block-iterative methods, although they are not a part of this package. The idea of this class of methods is to partition the system (2.1) into so-called blocks of equations and treat each block according to the given iterative method by passing cyclically over all the blocks. Most of the theory for block-iterative methods is based on the assumption that equations can appear in more than one block, but in the following we will always look at the case of disjoint partitioning, i.e. every equation appears in exactly one block.

For the case of disjoint partitioning we have the following structure of the system:

$$A = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_p \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{pmatrix}, \qquad A^T = \begin{pmatrix} B_1 \\ B_2 \\ \vdots \\ B_q \end{pmatrix},$$


where p denotes the number of blocks for the linear system and q denotes the number of blocks for A^T.

For t = 1,...,p we let the index block B_t ⊆ {1,...,m} be an ordered subset of the form

$$B_t = \{\, i_1^t, i_2^t, \ldots, i_{m(t)}^t \,\},$$

where m(t) is the number of elements in B_t.

We will now introduce some block-iterative methods, but since this software package does not include block-iterative methods, we will only look at a small selection. Other block-iterative methods can be found in for example [34], [18].

3.4.1 Block-Iteration

The first block-iterative method we will introduce is called the Block-Iteration. This method was first proposed by Elfving and later generalized by Eggermont, Herman and Lent. The method is also known as the ordinary Block-Kaczmarz method. For x^0 ∈ R^n the algorithm can be written as:

$$x^{k,0} = x^k$$
$$x^{k,t} = x^{k,t-1} + \lambda_t A_t^T M_t (b^t - A_t x^{k,t-1}), \qquad t = 1, 2, \ldots, p$$
$$x^{k+1} = x^{k,p},$$

where the λ_t are relaxation parameters and the M_t are given symmetric positive definite matrices. In the algorithm originally proposed by Elfving we had M_t = (A_t A_t^T)^{-1} and λ_t = λ.

For p = 1, i.e. only one block, the method is given on the standard SIRT form (3.1) with T = I, and this is called a fully simultaneous iteration. With p = m we on the other hand have a fully sequential iteration, since each block consists of only one equation.

In [14] it is proven that if

$$0 < \epsilon \le \lambda_t \le \frac{2 - \epsilon}{\rho(A_t^T M_t A_t)},$$

for t = 1,...,p, then the Block-Iteration method converges.
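As an illustration (block-iterative methods are not in the package), one block-iteration of the Block-Kaczmarz method with Elfving's original choice M_t = (A_t A_t^T)^{-1} and a common λ could be sketched in MATLAB as follows, assuming the disjoint partition is given as a cell array idx of row-index sets:

% One block-iteration of the Block-Kaczmarz method (illustrative sketch).
% idx{t} holds the row indices of block t; the dense solve is only
% suitable for small blocks and serves purely as an illustration.
p = numel(idx);
for t = 1:p
    At = A(idx{t},:);
    bt = b(idx{t});
    rt = bt - At*x;                        % residual of block t
    x  = x + lambda*(At'*((At*At')\rt));   % M_t = (A_t*A_t')^{-1}
end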

One block-iteration is defined as a pass through all data, and since the Block-Iteration method uses a single block in each block-step, every block-iteration consists of p steps. One block-iteration of the Block-Iteration with the relaxation parameter λ_k can be written as:

$$x^{k+1} = x^k + A^T \bar{M}_B (b - A x^k), \qquad \bar{M}_B = (\bar{D} + L)^{-1}, \tag{3.7}$$

where D̄ is block-diagonal and L is block-lower triangular, defined as:

$$L = \begin{pmatrix} 0 & & & \\ A_2 A_1^T & 0 & & \\ \vdots & \ddots & \ddots & \\ A_p A_1^T & \cdots & A_p A_{p-1}^T & 0 \end{pmatrix}, \qquad \bar{D} = \begin{pmatrix} \lambda_1^{-1} M_1^{-1} & & 0 \\ & \ddots & \\ 0 & & \lambda_p^{-1} M_p^{-1} \end{pmatrix}. \tag{3.8}$$

The sequence defined by (3.7) converges towards the solution of

$$A^T \bar{M}_B (b - A x) = 0.$$

3.4.2 Symmetric Block-Iteration

In Symmetric Block-Iteration one block-iteration consists of one block-iteration of the above Block-Iteration method followed by another block-iteration where the blocks appear in reverse order. This gives the algorithm the control order t = 1, 2, ..., p−1, p, p−1, ..., 1.

The algorithm for the symmetric block-iteration for x^0 ∈ R^n looks as follows:

$$x^{k,0} = x^k$$
$$x^{k,t} = x^{k,t-1} + \lambda_t A_t^T M_t (b^t - A_t x^{k,t-1}), \tag{3.9}$$
$$x^{k+1} = x^{k,1},$$

where t = 1,...,p−1, p, p−1,...,1 and x^{k,1} denotes the last step in (3.9).

One block-iteration of the Symmetric Block-Iteration method can be written in a general form, where we let

$$A A^T = L + D + L^T$$

be the splitting of AA^T into its lower block triangular, block diagonal and upper block triangular parts. The block-iteration can then be written as:

$$x^{k+1} = x^k + A^T \bar{M}_{SB} (b - A x^k). \tag{3.10}$$

Using (3.8) and D̃ = 2D̄ − D we get

$$\bar{M}_{SB} = (\bar{D} + L^T)^{-1} \tilde{D} (\bar{D} + L)^{-1},$$

where M̄_SB is symmetric positive definite.

From [14] we have that the block-iterations of Symmetric Block-Iteration (3.10) converge to a solution x of the weighted least squares problem

$$\min_x \|A x - b\|_{\bar{M}_{SB}}.$$

If in addition x^0 ∈ R(A^T), then x is the unique solution of minimal 2-norm, and the corresponding normal equations are

$$A^T \bar{M}_{SB} (b - A x) = 0.$$

3.4.3 Block-Iterative Component Averaging Methods (BICAV)

Earlier we defined the CAV method as one of the SIRT methods. The Block-Iterative Component Averaging method (BICAV), introduced in [7], is the block version of the CAV method. As for the CAV method we define the factor s_j^t; in the BICAV case s_j^t is the number of nonzero elements in the j'th column of A_t for t = 1, 2, ..., p. The BICAV method can then be written on the following form:

$$x_j^{k+1} = x_j^k + \lambda_k \sum_{i \in B_{t(k)}} \frac{b_i - \langle a^i, x^k \rangle}{\|a^i\|_S^2}\, a_j^i,$$

where ‖a^i‖_S^2 = Σ_{j=1}^n s_j^{t(k)} (a_j^i)^2, t(k) = (k mod p) + 1 and k ≥ 0. This leads us to the following matrix form:

$$x^{k+1} = x^k + \lambda_k A_{t(k)}^T M_{t(k)} (b^{t(k)} - A_{t(k)} x^k), \tag{3.11}$$

where M_{t(k)} = diag(1/‖a^i‖_S^2) over the rows i of the block.

In [4] the following convergence theorem is proven for the BICAV method:

Theorem 3.3 Let

$$0 < \epsilon \le \lambda_k \le (2 - \epsilon)/\rho(A_{t(k)}^T M_{t(k)} A_{t(k)}),$$

where ε is an arbitrarily small but fixed constant and the M_{t(k)} are given symmetric and positive definite matrices with the control t(k). Then any sequence generated by (3.11) converges to a solution of (2.1). If in addition x^0 ∈ R(A^T), then x^k converges to the solution of minimum 2-norm.

The BICAV method has the property that for p = 1 it becomes fully simultaneous, i.e. it becomes the CAV method. For p = m, on the other hand, BICAV becomes the well-known Kaczmarz's method.
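The matrix form (3.11) translates directly into MATLAB. A sketch of a single BICAV step with a fixed relaxation parameter, where the block partition given as a cell array idx and the iteration index k are assumptions of the illustration:

% One BICAV step (illustrative sketch): x <- x + lambda*At'*Mt*(bt - At*x).
t  = mod(k, p) + 1;                 % cyclic block control t(k) = (k mod p) + 1
At = A(idx{t},:);
bt = b(idx{t});
s  = full(sum(At ~= 0, 1))';        % s_j^t: nonzeros in column j of A_t
w  = (At.^2)*s;                     % ||a^i||_S^2 for each row i of the block
x  = x + lambda*(At'*((bt - At*x)./w));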

3.4.4 Block-Iterative Diagonally Relaxed Orthogonal Projections (BIDROP)

For the general SIRT methods we described a method called DROP, and we will now introduce its block-iterative generalization, which we will call Block-Iterative Diagonally Relaxed Orthogonal Projections (BIDROP).

If we let W_t be positive definite diagonal matrices and U_t be symmetric positive definite matrices for t = 1, 2, ..., p, then the algorithm for the BIDROP method looks as follows:

$$x^{k+1} = x^k + \lambda_k U_{t(k)} A_{t(k)}^T W_{t(k)} (b^{t(k)} - A_{t(k)} x^k), \tag{3.12}$$

where t(k) = (k mod p) + 1.

The following convergence theorem is derived for the BIDROP method:

Theorem 3.4 Let U be a given symmetric and positive definite matrix, and let W_t be given positive definite diagonal matrices. If for all k ≥ 0,

$$0 < \epsilon \le \lambda_k \le (2 - \epsilon)/\rho(U A_{t(k)}^T W_{t(k)} A_{t(k)}),$$

where ε is an arbitrarily small but fixed constant, then any sequence generated by (3.12) converges to a solution. If in addition x^0 ∈ R(U A^T), then the solution has minimal U^{-1}-norm.

With only one block, i.e. p = 1, and U_1 = S and W_1 = W, we have the standard DROP method.

The BIDROP method is a general method, since U_t and W_t are not specifically given. One of the variants of BIDROP is introduced in [5] and is called BIDROP1.


This method has the following scheme:

$$x^{k+1} = x^k + \lambda_k U \sum_{q=1}^{m(t(k))} \mu_q^{t(k)} \left( b_{i_q}^{t(k)} - \langle a^{i_q^{t(k)}}, x^k \rangle \right) a^{i_q^{t(k)}},$$

where μ_q^{t(k)} is defined as

$$\mu_q^{t(k)} = \frac{w_q^{t(k)}}{\|a^{i_q^{t(k)}}\|_2^2}, \quad \text{where} \quad \sum_{q=1}^{m(t(k))} w_q^{t(k)} = 1,$$

for q = 1, 2, ..., m(t). The matrix U is fixed for each block, i.e. U_t = U, and is given as

$$U = \mathrm{diag}\!\left(\frac{1}{\tau_j}\right), \qquad \tau_j = \max\{\, s_j^t \mid t = 1, \ldots, p \,\},$$

where s_j^t is the number of nonzero elements in column j of the block A_t.


Chapter 4

Semi-Convergence and Choice of Relaxation Parameter

4.1 Semi-Convergence for SIRT Methods

For the SIRT methods on the form (3.1) with T = I, theorem 3.1 ensures convergence to a solution of the least squares problem min_x ‖Ax − b‖_M, but when solving linear ill-posed problems with iterative methods we are typically more interested in the earlier mentioned semi-convergence behaviour. We will now take a closer look at the semi-convergence of the SIRT methods [16]. To make the presentation simpler we assume that m ≥ n, but the theory can be applied regardless of the dimensions.

We assume that the noise in the right-hand side is additive, i.e.,

$$b = \bar{b} + \delta b,$$

where b̄ is the noise-free right-hand side and δb is the noise component, which can be caused by discretization errors and measurement errors.

We want to analyze the semi-convergence behaviour of the SIRT scheme with T = I. To do this we assume that the relaxation parameter λ is constant for all iterations. For convenience we introduce

$$B = A^T M A \quad \text{and} \quad c = A^T M b,$$


and let the singular value decomposition (SVD) of M^{1/2}A be

$$M^{1/2} A = U \Sigma V^T,$$

where Σ = diag(σ_1, ..., σ_p, 0, ..., 0) with σ_1 ≥ σ_2 ≥ ... ≥ σ_p > 0, and rank(A) = p.

From the SIRT scheme we get the following:

$$x^k = x^{k-1} + \lambda A^T M (b - A x^{k-1}) = x^{k-1} + \lambda A^T M b - \lambda A^T M A x^{k-1} = x^{k-1} + \lambda c - \lambda B x^{k-1} = (I - \lambda B) x^{k-1} + \lambda c.$$

By direct insertion we obtain, for k = 1,

$$x^1 = (I - \lambda B) x^0 + \lambda c.$$

Similarly, for k = 2 we get:

$$x^2 = (I - \lambda B) x^1 + \lambda c = (I - \lambda B)\left[(I - \lambda B) x^0 + \lambda c\right] + \lambda c = (I - \lambda B)^2 x^0 + \left[(I - \lambda B) + I\right] \lambda c.$$

Similarly, for k = 3 we get:

$$x^3 = (I - \lambda B) x^2 + \lambda c = (I - \lambda B)^3 x^0 + \left[(I - \lambda B)^2 + (I - \lambda B) + I\right] \lambda c.$$

It can then be seen that the k'th iterate can be written as

$$x^k = (I - \lambda B)^k x^0 + \lambda \sum_{j=0}^{k-1} (I - \lambda B)^j c.$$
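This closed form is easy to verify numerically; a small self-contained MATLAB check on random data (an illustration, not part of the package):

% Check x^k = (I - lambda*B)^k*x0 + lambda*sum_{j=0}^{k-1}(I - lambda*B)^j*c.
A = rand(8,5);  M = eye(8);  b = rand(8,1);
B = A'*M*A;  c = A'*M*b;
lambda = 1/norm(B);                 % inside the convergence interval
k = 25;  n = size(B,1);
x = zeros(n,1);                     % x^0 = 0, so the first term vanishes
for j = 1:k
    x = x + lambda*(c - B*x);       % the SIRT recursion
end
S = zeros(n,n);
for j = 0:k-1
    S = S + (eye(n) - lambda*B)^j;  % the matrix geometric sum
end
norm(x - lambda*S*c)                % should be near machine precision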

Using the SVD of M^{1/2}A we can rewrite B:

$$B = \left(M^{1/2} A\right)^T \left(M^{1/2} A\right) = V \Sigma^T \Sigma V^T = V F V^T, \tag{4.1}$$

where

$$F = \mathrm{diag}\left(\sigma_1^2, \sigma_2^2, \ldots, \sigma_p^2, 0, \ldots, 0\right).$$


By using (4.1) we can then write

$$\sum_{j=0}^{k-1} (I - \lambda B)^j = \sum_{j=0}^{k-1} \left(V V^T - \lambda V F V^T\right)^j = \sum_{j=0}^{k-1} \left(V (I - \lambda F) V^T\right)^j = \sum_{j=0}^{k-1} V (I - \lambda F)^j V^T = V \left(\sum_{j=0}^{k-1} (I - \lambda F)^j\right) V^T = V E_k V^T,$$

where the i'th diagonal element of E_k is

$$\sum_{j=0}^{k-1} (1 - \lambda \sigma_i^2)^j = 1 + (1 - \lambda \sigma_i^2) + (1 - \lambda \sigma_i^2)^2 + \ldots + (1 - \lambda \sigma_i^2)^{k-1} = \frac{1 - (1 - \lambda \sigma_i^2)^k}{1 - (1 - \lambda \sigma_i^2)} = \frac{1 - (1 - \lambda \sigma_i^2)^k}{\lambda \sigma_i^2},$$

where the formula for geometric series is used to obtain the last result. The matrix E_k then has the following form:

$$E_k = \mathrm{diag}\left(\frac{1 - (1 - \lambda \sigma_1^2)^k}{\lambda \sigma_1^2}, \ldots, \frac{1 - (1 - \lambda \sigma_p^2)^k}{\lambda \sigma_p^2}, 0, \ldots, 0\right).$$

Assuming that x^0 = 0 we can then write x^k as

$$x^k = V (\lambda E_k) V^T c = V (\lambda E_k) V^T A^T M b = V (\lambda E_k) \Sigma^T U^T M^{1/2} (\bar{b} + \delta b) \tag{4.2}$$
$$= \sum_{i=1}^{p} \left(1 - (1 - \lambda \sigma_i^2)^k\right) \frac{u_i^T M^{1/2} (\bar{b} + \delta b)}{\sigma_i}\, v_i,$$

where u_i and v_i are the columns of U and V respectively, and φ_i^k = 1 − (1 − λσ_i^2)^k for i = 1, 2, ..., p are the filter factors [20, p. 138].

The minimum-norm solution to the weighted least squares problem with the noise-free right-hand side, x̄ = argmin_x ‖Ax − b̄‖_M, can, using the SVD, be written as

$$\bar{x} = V E \Sigma^T U^T M^{1/2} \bar{b}, \tag{4.3}$$

where

$$E = \mathrm{diag}\left(\frac{1}{\sigma_1^2}, \frac{1}{\sigma_2^2}, \ldots, \frac{1}{\sigma_p^2}, 0, \ldots, 0\right).$$


The error in the k'th iterate can then be expressed as

$$x^k - \bar{x} = V (\lambda E_k) \Sigma^T U^T M^{1/2} (\bar{b} + \delta b) - V E \Sigma^T U^T M^{1/2} \bar{b} = V \left[ (\lambda E_k - E) \Sigma^T U^T M^{1/2} \bar{b} + \lambda E_k \Sigma^T U^T M^{1/2} \delta b \right].$$

We then define D_1^k and D_2^k as

$$D_1^k = (\lambda E_k - E) \Sigma^T = -\mathrm{diag}\left(\frac{(1 - \lambda \sigma_1^2)^k}{\sigma_1}, \ldots, \frac{(1 - \lambda \sigma_p^2)^k}{\sigma_p}, 0, \ldots, 0\right) \tag{4.4}$$

and

$$D_2^k = \lambda E_k \Sigma^T = \mathrm{diag}\left(\frac{1 - (1 - \lambda \sigma_1^2)^k}{\sigma_1}, \ldots, \frac{1 - (1 - \lambda \sigma_p^2)^k}{\sigma_p}, 0, \ldots, 0\right). \tag{4.5}$$

Putting

$$\hat{b} = U^T M^{1/2} \bar{b}, \qquad \delta\hat{b} = U^T M^{1/2} \delta b,$$

we can then write the projected error e^{V,k} as

$$e^{V,k} \equiv V^T (x^k - \bar{x}) = D_1^k \hat{b} + D_2^k \delta\hat{b}.$$

For the later analysis we define the following functions:

$$\Phi^k(\sigma, \lambda) = \frac{(1 - \lambda \sigma^2)^k}{\sigma}, \qquad \Psi^k(\sigma, \lambda) = \frac{1 - (1 - \lambda \sigma^2)^k}{\sigma}. \tag{4.6}$$

We can then write the j'th component of e^{V,k} as

$$e_j^{V,k} = -\Phi^k(\sigma_j, \lambda)\, \hat{b}_j + \Psi^k(\sigma_j, \lambda)\, \delta\hat{b}_j,$$

where the first term is an iteration-error and the second term is a noise-error. It is the interplay between the iteration-error and the noise-error that explains the semi-convergence behaviour. Figure 4.1 shows Φ^k(σ, λ) and Ψ^k(σ, λ) for fixed λ and various σ as functions of the iteration index k. It can be seen that for small values of k the noise-error is negligible and the iteration seems to converge to the exact solution. When the noise-error reaches the order of magnitude of the approximation error, the propagated noise-error is no longer hidden in the regularized solution, and the total error starts to increase.

We now want to investigate the behaviour of the functions Φ^k(σ, λ) and Ψ^k(σ, λ).


[Figure 4.1: The behaviour of Φ^k(σ, λ) and Ψ^k(σ, λ) for fixed λ and various σ (σ = 0.0468, 0.0353, 0.0247, 0.0035) as function of the iteration index k.]

Proposition 4.1 Let

$$0 < \epsilon \le \lambda \le 2/\sigma_1^2 - \epsilon, \quad \text{and} \quad 0 < \sigma_p \le \sigma < \frac{1}{\sqrt{\lambda}}. \tag{4.7}$$

a) For λ and σ fixed, Φ^k(σ, λ) is decreasing and convex and Ψ^k(σ, λ) is increasing and concave as functions of k.

b) For all integers k > 0 it holds that Φ^k(σ, λ), Ψ^k(σ, λ) ≥ 0 and Φ^k(σ, 0) = 1/σ, Ψ^k(σ, 0) = 0.

c) For λ fixed and k > 0, Φ^k(σ, λ) is decreasing as a function of σ.

The proof of this proposition can be found in [16].

Remark 4.2 The upper bound for σ in (4.7) is σ̂ = 1/√λ. When 0 < ε ≤ λ ≤ 1/σ_1^2 then σ̂ ≥ σ_1, and when 1/σ_1^2 < λ < 2/σ_1^2 then σ̂ ≥ 1/√(2/σ_1^2) = σ_1/√2. Hence σ̂ ≥ σ_1/√2 for all relaxation parameters λ satisfying (4.7).

For small values of k the noise-errors expressed via Ψ^k(σ, λ) are negligible and the iteration approaches the exact solution. When the noise-error reaches the same order of magnitude as the approximation error, the propagated noise-error is no longer hidden in the iteration vector and the total error starts to increase.


Proposition 4.3 Assume that (4.7) of Proposition 4.1 holds, and let λ be fixed. For k ≥ 2 there exists a unique point σ_k* ∈ (0, 1/√λ) such that

$$\sigma_k^* = \arg\max_{0 < \sigma < 1/\sqrt{\lambda}} \Psi^k(\sigma, \lambda).$$


4.2 Choice of Relaxation Parameter

[Figure 4.2: The function Ψ^k(σ, λ) as function of σ for λ = 100 and k = 10, 30, 90 and 270. The dashed line illustrates 1/σ; the black dots denote the maxima of the functions.]

4.2.1 Training to Optimal Choice

The purpose of this strategy is to find a constant relaxation parameter λ = λ_k of optimal choice when the exact solution x̄ is known. But how do we define the concept of an "optimal λ-value"? Since the ART and the SIRT methods have different properties, we will treat each class of methods separately.

SIRT Methods

Usually the goal of a reconstruction method is to minimize the relative error. The challenge is to do this when the exact solution is unknown; but we can study the behaviour of the methods for problems with known solutions. Figure 4.3 shows the relative error as function of the iteration number k for 9 values of λ for three different noise levels. For all three noise levels it holds that the minimum relative error reaches the same resolution limit for many different values of λ.

Figure 4.4 illustrates the minimum relative error for different λ-values for δ = 0.03. The green lines illustrate an interval that includes ±0.015% of the resolution limit. From this we observe that for almost all λ-values the minimum relative error lies inside this interval. The only exception is when λ is close to either 0 or 2/σ_1^2. We are now convinced that the minimum relative error reaches the resolution limit for many different values of λ, and we then need another way to distinguish between the different λ-values.

[Figure 4.3: The relative error histories for nine values of λ (10, 30, 60, 80, 100, 110, 120, 130, 150) using a SIRT method; one subfigure per noise level δ = 0.03, 0.05, 0.08.]

[Figure 4.4: Illustration of the minimum relative error for different λ-values for a SIRT method. The dots denote the relative errors, while the green dashed lines show the interval of ±0.015% of the resolution limit.]

[Figure 4.5: The optimal number of iterations k_opt as function of the λ-values for a SIRT method.]

We therefore take a second look at figure 4.3. The difference between the error histories for different λ-values is the iteration number at which the minimum relative error is reached. From this we define the optimal λ-value as the λ which gives rise to the fastest convergence to the smallest relative error in the solution. "Training" is a strategy that selects the optimal λ from a test problem with a known solution. The hope is that the λ chosen this way is also a good choice for a real problem. This is the case if the test problem is chosen to reflect the properties of the real problem.

This definition leads us to a strategy in two parts, where the first part is to determine the resolution limit and the second part is to determine the λ-value which reaches the resolution limit using the smallest number of iterations. From figures 4.3 and 4.4 we conclude that λ = 1/σ_1^2 is a safe choice of relaxation parameter for determining the resolution limit, since it represents the midpoint of the convergence interval. We therefore find the minimum relative error and define the upper bound of the resolution limit to be this relative error plus 1%. We denote the upper bound of the resolution limit by ub.

For the second part of the strategy we use a modified version of the golden section search to find the value of λ that reaches the resolution limit within the smallest number of iterations [38]. The requirement for using golden section search is that the function we want to minimize is unimodal. Figure 4.5 illustrates the optimal number of iterations k_opt as a function of λ. From this


figure it seems reasonable to assume that we have a unimodal function. We also notice that the λ-value we seek lies in the right part of the interval.

In our modified golden section search we denote the search interval (a, b), which is the convergence interval for the given SIRT method. For this method we also need two interior points, which we define to be c = a + r(b − a) and d = a + (1 − r)(b − a), where r = (3 − √5)/2. The reason for this choice can be found in [38].

We then define the function values f_c and f_d of the interior points c and d to be the iteration number which corresponds to the solution with the smallest relative error with λ equal to c and d respectively. We also define the smallest relative error for each of the interior points as x_c and x_d.

In the ordinary golden section search the function values are used to reduce the interval. In our modified version we also use the knowledge of the value of the smallest relative error. We therefore reduce the interval according to the following properties, in the given order (a compact MATLAB sketch of the resulting search loop is given after the list):

If x_c > ub: The relative error for λ = c has not reached the resolution limit, and since tests have shown that the optimal value lies in the right part of the interval, we can reduce the interval to (c, b).

If x_d > ub: In this case the relative error for λ = d is outside the resolution interval. When we reach this point we know that λ = c is inside the resolution interval, and using this information we can remove the right part of the interval, such that our new interval is (a, d).

If f_c ≥ f_d: In this case both c and d are allowed values of λ, and our objective is to determine the minimum number of iterations used. If f_c is greater than or equal to f_d, then according to the unimodality we can reduce the interval to (c, b). We choose this case as the tiebreaker if f_c = f_d, since we have assumed that the optimal value lies in the right part of the interval.

If f_d > f_c: In the last case both c and d are again allowed values of λ. We reduce the interval to (a, d) according to the assumption of unimodality.

The reductions continue until the difference between c and d is very small, and the optimal value of λ is then chosen to be λ = (c + d)/2.
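A compact MATLAB sketch of this interval reduction follows. The helper [f, xmin] = trainrun(lambda), which runs the method for a given λ and returns the iteration count f at which the smallest relative error xmin is attained, is hypothetical here, and a, b, ub and tol are assumed given:

% Modified golden section search for the training strategy (sketch).
% trainrun is a hypothetical helper; (a, b) is the convergence interval.
r = (3 - sqrt(5))/2;
c = a + r*(b - a);  d = a + (1 - r)*(b - a);
while abs(d - c) > tol
    [fc, xc] = trainrun(c);   % (re-evaluated each pass for clarity)
    [fd, xd] = trainrun(d);
    if xc > ub                % c has not reached the resolution limit
        a = c;
    elseif xd > ub            % d outside the resolution interval
        b = d;
    elseif fc >= fd           % both allowed; tiebreaker keeps (c, b)
        a = c;
    else                      % fd > fc
        b = d;
    end
    c = a + r*(b - a);  d = a + (1 - r)*(b - a);
end
lambda = (c + d)/2;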


[Figure 4.6: The relative error histories for nine values of λ (0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.4, 1.7, 2) using an ART method; one subfigure per noise level δ = 0.03, 0.05, 0.08.]

ART Methods

Inspired by the modified golden section search for the SIRT algorithms, we look at figure 4.6, which shows the relative error as function of the iteration number k for nine values of λ for three different noise levels. We notice that not all values of λ reach the resolution limit. From figure 4.7 we clearly see that only a small number of λ-values reaches the so-called resolution limit. We would like to keep the definition of the optimal λ-value and the overall structure of the strategy to find it, but we need to make some changes that make the strategy fit the ART methods.

Again we keep our strategy in two parts, where the first part is to determine the resolution limit and the second part is to determine the λ-value which reaches the resolution limit using the fewest number of iterations.


[Figure 4.7: The minimum relative error as function of λ for an ART method.]

[Figure 4.8: The optimal number of iterations k_opt as function of the λ-values for an ART method.]

From figures 4.6 and 4.7 we conclude that using λ = 0.25 would be an appropriate choice. Note that the convergence interval for all ART methods is (0, 2). Again we find the smallest relative error and define the upper bound of the resolution limit ub to be this relative error plus 1%.

For the second part of the strategy we use another modified version of golden section search. From figures 4.8 and 4.7 we conclude that it is reasonable to assume that the function is unimodal, since most of the interval will be discarded because the relative error there is above the upper bound of the resolution limit. We notice that for the ART methods the λ-value we seek lies in the left part of the interval.

As before we denote the search interval (a, b), where a = 0 and b = 2. The interior points c and d, the function values f_c and f_d and the values x_c and x_d are defined as above. The reduction of the interval follows the given order:

If x_d > ub: The relative error for λ = d has not reached the resolution limit, and since tests have shown that the optimal value lies in the left part of the interval, we can reduce the interval to (a, d).

If x_c > ub: In this case the relative error for λ = c is outside the resolution interval. When we reach this point we know that λ = d is inside the resolution interval, and using this information we can remove the left part of the interval, such that our new interval is (c, b).

If f_c > f_d: In this case both c and d are allowed values of λ, and our objective is to determine the minimum number of iterations used. If f_c is greater than f_d, then according to the unimodality we can reduce the interval to (c, b).

If f_d ≥ f_c: In the last case both c and d are again allowed values of λ. We reduce the interval to (a, d) according to the assumption of unimodality. We choose this case as the tiebreaker if f_c = f_d, since we have assumed that the optimal value lies in the left part of the interval.

Again the reductions continue until the difference between c and d is very small, and the optimal value of λ is chosen to be λ = (c + d)/2.

Introducing Maximum Number of Iterations

In both the implementation of the strategy for SIRT methods and for ART methods a default number of iterations is used when the resolution limit is determined. For some problems it could be the case that, with the default number of iterations, we do not reach the point in the semi-convergence where the relative error starts to increase. It is therefore possible for the user to increase the maximum number of iterations by use of an input parameter.

This input parameter can, on the other hand, also be decreased if the user will only allow a smaller number of iterations. In this case a possible consequence could be that with the given number of iterations the solution does not reach the point in the semi-convergence where the relative error again starts to increase. If this point is not reached for λ = 1/σ_1^2 for the SIRT methods and λ = 0.25 for the ART methods, then our introduced method does not find the actual resolution limit. The problem will then not have the earlier shown properties, and the problem to solve is completely different.


[Figure 4.9: The relative error histories for nine values of λ using a SIRT method when the maximum number of iterations is 7.]

[Figure 4.10: The minimum relative error for different λ-values for a SIRT method when the maximum number of iterations is 7. The dots denote the relative errors, while the green dashed lines show the interval of ±0.015% of the resolution limit found with λ = 1/σ_1^2.]


[Figure 4.11: The optimal number of iterations k_opt as function of the λ-values for a SIRT method when the maximum number of iterations is 7.]

Figure 4.9 shows the relative errors for nine different values of λ, i.e. the same setting as in figure 4.3; the only difference is that the allowed number of iterations is 7. We observe that the minimum relative error for all nine values of λ is attained at iteration 7, which indicates that the actual minimum is not found. Figure 4.10 illustrates the minimum relative error for different values of λ. We now notice that the interval for the resolution limit no longer contains most of the relative errors. Figure 4.11 shows the optimal number of iterations as a function of λ. We observe that all λ-values give rise to the same number of iterations, 7, which is the maximum number of iterations. In this case we cannot rely on the introduced strategy for finding λ to return a reasonable result.

The implemented versions of the defined strategies contain a check that can determine whether the actual resolution limit is reached. If it is, then the original strategy is used; otherwise the program uses a different approach. In the case where the resolution limit is not reached, the used number of iterations is the same for almost every λ-value, as can be seen in figure 4.11. The relative errors at this point differ, however, and the golden section search will then consider the relative error instead of the number of iterations.

4.2.2 Line Search

The next strategy we will present is based on picking λ_k such that the error ‖x̄ − x^k‖_2 is minimized in each iteration. This type of method is also known as line search and is only derived for SIRT methods where T = I [2], [9], [10], [11]. In the following we will derive the line-search strategy for the different SIRT methods, but we will assume that the problem is consistent, i.e. Ax̄ = b,

where x̄ denotes the exact solution.

[Figure 4.12: Illustration of line search.]

In general we can write all the SIRT methods as

$$x^{k+1} = x^k + \lambda_k p^k, \tag{4.10}$$

where p^k then varies with the method. When using line search the aim is to find the minimum Euclidean distance from the next iterate to the exact solution:

$$\min \|x^{k+1} - \bar{x}\|_2.$$

By looking at figure 4.12 we see that the minimizer can also be found by choosing x^{k+1} such that the direction p^k from the existing step is orthogonal to the vector x^{k+1} − x̄, i.e.

$$\langle p^k, x^{k+1} - \bar{x} \rangle = 0.$$

Using the expression for the method (4.10) we then get:

$$\langle p^k, x^k + \lambda_k p^k - \bar{x} \rangle = \langle p^k, x^k - \bar{x} \rangle + \lambda_k \langle p^k, p^k \rangle = 0.$$

From this it follows that

$$\lambda_k = \frac{\langle p^k, \bar{x} - x^k \rangle}{\|p^k\|_2^2}.$$

We will now derive the formula for all the SIRT methods where T = I, i.e. we have that p^k = A^T M(b − Ax^k). For the numerator we get:

$$\langle A^T M (b - A x^k), \bar{x} - x^k \rangle = \langle M (b - A x^k), A(\bar{x} - x^k) \rangle = \langle M (b - A x^k), A\bar{x} - A x^k \rangle.$$


We then use that Ax̄ = b and define r^k = b − Ax^k. This gives us the following for the numerator:

$$\langle M (b - A x^k), b - A x^k \rangle = \langle M r^k, r^k \rangle.$$

For the denominator we get:

$$\|p^k\|_2^2 = \|A^T M (b - A x^k)\|_2^2 = \|A^T M r^k\|_2^2.$$

This gives us the following rule to determine λ_k:

$$\lambda_k = \frac{\langle M r^k, r^k \rangle}{\|A^T M r^k\|_2^2}. \tag{4.11}$$
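Formula (4.11) is cheap to evaluate inside the iteration. A minimal MATLAB sketch of a SIRT iteration with line-search relaxation, where A, b, a starting vector x, an iteration count kmax, and a given SPD matrix M (e.g. Cimmino's diagonal matrix) are assumptions of the illustration:

% SIRT iteration with line-search relaxation (4.11) (illustrative sketch).
for k = 1:kmax
    r      = b - A*x;                  % residual r^k
    Mr     = M*r;
    p      = A'*Mr;                    % search direction p^k = A'*M*r^k
    lambda = (r'*Mr)/(p'*p);           % lambda_k = <M r^k, r^k>/||A' M r^k||^2
    x      = x + lambda*p;
end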

4.2.3 Relaxation to Control Noise Propagation

We will now introduce two strategies for choosing the relaxation parameter λ_k. Both arise from the analysis of the semi-convergence behaviour and are only derived for the SIRT methods where T = I; in the software package the strategies can also be used for SIRT methods where T ≠ I, although the theory is then not valid. The motivation for these methods is to monitor and control the noise-part of the error. The methods are presented in [16], where all the used proofs can also be found.

The first strategy we denote Ψ1-based relaxation, and it takes the following form:

$$\lambda_k = \begin{cases} \sqrt{2}/\sigma_1^2 & \text{for } k = 0, 1 \\[4pt] \dfrac{2}{\sigma_1^2}\,(1 - \zeta_k) & \text{for } k \ge 2 \end{cases} \tag{4.12}$$

where ζ_k is the unique root in (0, 1) of the polynomial (4.9).

The following theorem ensures that the iterates produced with the strategy (4.12) converge towards the weighted least squares solution:

Theorem 4.6 The iterates produced using the Ψ1-based relaxation strategy (4.12) converge toward a solution of min_x ‖Ax − b‖_M.

We first assume that λ is fixed in the first k iterations,

$$\lambda_j = \lambda, \qquad j = 0, 1, \ldots, k-1.$$


With this assumption we can use the theory of semi-convergence from section 4.1. We let x^k and x̄^k denote the iterates from (3.1) with noisy and noise-free data respectively. The error in the k'th iterate satisfies

$$\|x^k - \bar{x}\|_2 \le \|\bar{x}^k - \bar{x}\|_2 + \|x^k - \bar{x}^k\|_2,$$

and the error is decomposed into two parts: the iteration error x̄^k − x̄ and the noise error x^k − x̄^k. Using (4.2), (4.3), (4.4) and (4.5) we get

$$\bar{x}^k - \bar{x} = V (\lambda E_k) \Sigma^T U^T M^{1/2} \bar{b} - V E \Sigma^T U^T M^{1/2} \bar{b} = V (\lambda E_k - E) \Sigma^T U^T M^{1/2} \bar{b} = V D_1^k U^T M^{1/2} \bar{b},$$

$$x^k - \bar{x}^k = V (\lambda E_k) \Sigma^T U^T M^{1/2} b - V (\lambda E_k) \Sigma^T U^T M^{1/2} \bar{b} = V D_2^k U^T M^{1/2} (b - \bar{b}) = V D_2^k U^T M^{1/2} \delta b.$$

The noise-error is then bounded by

$$\|x^k - \bar{x}^k\|_2 \le \max_{1 \le i \le p} \Psi^k(\sigma_i, \lambda)\, \|M^{1/2} \delta b\|_2.$$

We then assume that λ ∈ (0, 1/σ_1^2]; using Remark 4.2 we have σ̂ ≥ σ_1, and it then follows that for k ≥ 2

$$\max_{1 \le i \le p} \Psi^k(\sigma_i, \lambda) \le \max_{0 \le \sigma \le \sigma_1} \Psi^k(\sigma, \lambda) \le \max_{0 \le \sigma \le \hat{\sigma}} \Psi^k(\sigma, \lambda) = \Psi^k(\sigma_k^*, \lambda). \tag{4.13}$$

It then follows, using (4.6) and (4.8), that

$$\|x^k - \bar{x}^k\|_2 \le \Psi^k(\sigma_k^*, \lambda)\, \|M^{1/2} \delta b\|_2 = \frac{1 - \left(1 - \lambda\,\frac{1-\zeta_k}{\lambda}\right)^k}{\sqrt{\frac{1-\zeta_k}{\lambda}}}\, \|M^{1/2} \delta b\|_2 = \sqrt{\lambda}\, \frac{1 - \zeta_k^k}{\sqrt{1 - \zeta_k}}\, \|M^{1/2} \delta b\|_2. \tag{4.14}$$

Then consider the k'th iteration and choose λ_k from (4.12). With the assumption that λ_{j+1}/λ_j ≈ 1, which holds for (4.12), we can assume that (4.14) holds approximately. By substituting (4.12) into (4.14) we get for k ≥ 2

$$\|x^k - \bar{x}^k\|_2 \le \sqrt{\lambda_k}\, \frac{1 - \zeta_k^k}{\sqrt{1 - \zeta_k}}\, \|M^{1/2} \delta b\|_2 \approx \frac{\sqrt{2}}{\sigma_1} \sqrt{1 - \zeta_k}\, \frac{1 - \zeta_k^k}{\sqrt{1 - \zeta_k}}\, \|M^{1/2} \delta b\|_2 = \frac{\sqrt{2}}{\sigma_1} (1 - \zeta_k^k)\, \|M^{1/2} \delta b\|_2.$$


This implies that the Ψ1-based strategy gives an upper bound for the noise-part. For the case λ ∈ (1/σ_1^2, 2/σ_1^2) equation (4.13) only holds approximately. However, for the Ψ1-based relaxation we have that λ ≤ 1/σ_1^2 for small values of k.

The second strategy we denote Ψ2-based relaxation, and it takes the following form:

$$\lambda_k = \begin{cases} \sqrt{2}/\sigma_1^2 & \text{for } k = 0, 1 \\[4pt] \dfrac{2}{\sigma_1^2}\, \dfrac{1 - \zeta_k}{(1 - \zeta_k^k)^2} & \text{for } k \ge 2 \end{cases} \tag{4.15}$$

We use the same approach as for the Ψ1-based relaxation and substitute (4.15) into (4.14); we then get the following bound for the noise error using Ψ2-based relaxation:

$$\|x^k - \bar{x}^k\|_2 \le \frac{\sqrt{2}}{\sigma_1}\, \|M^{1/2} \delta b\|_2.$$

In [16] it is shown that iterates produced with the Ψ2-based relaxation converge towards the weighted least squares solution.

In [16] the possibility of using an accelerated modification of the strategies Ψ1 and Ψ2 is discussed. The idea is to choose λ̄_k = τ_k λ_k for k ≥ 2, where τ_k is the parameter to be chosen. For the Ψ1 strategy this modification means that

$$\bar{\lambda}_k = \tau_{k,1}\, \frac{2}{\sigma_1^2}\, (1 - \zeta_k), \qquad k \ge 2. \tag{4.16}$$

For τ_{k,1} < (1 − ζ_k)^{-1} we stay inside the convergence interval. By choosing the parameter τ_{k,1} to be constant for all iterations k we must use τ_{k,1} = τ_1 = (1 − ζ_1)^{-1} ≃ 1.5. For the Ψ2 strategy the modification takes the following form:

$$\bar{\lambda}_k = \tau_{k,2}\, \frac{2}{\sigma_1^2}\, \frac{1 - \zeta_k}{(1 - \zeta_k^k)^2}, \qquad k \ge 2, \tag{4.17}$$

and with τ_{k,2} < (1 − ζ_k^k)^2/(1 − ζ_k) the convergence is maintained. For a constant value of τ_{k,2} we have the upper bound τ_2 ≃ 1.18.

Even though the theory shows that the upper bounds of the constant parameters τ_1 and τ_2 are 1.5 and 1.18 respectively, experiments in [16] illustrate that it pays to allow larger values. We therefore choose τ_1 = 2 and τ_2 = 1.5 as reasonable values.
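A sketch of the resulting schedules with the accelerated factors τ_1 = 2 and τ_2 = 1.5 follows; sigma1 is the largest singular value of M^{1/2}A, and the helper zetaroot(k), returning the root ζ_k ∈ (0, 1) of the polynomial (4.9), is hypothetical here:

% Psi1- and Psi2-based relaxation schedules (illustrative sketch).
tau1 = 2;  tau2 = 1.5;
if k <= 1
    lam1 = sqrt(2)/sigma1^2;           % starting value for k = 0, 1
    lam2 = lam1;
else
    zk   = zetaroot(k);                % hypothetical root finder for (4.9)
    lam1 = tau1*(2/sigma1^2)*(1 - zk);              % accelerated Psi1 (4.16)
    lam2 = tau2*(2/sigma1^2)*(1 - zk)/(1 - zk^k)^2; % accelerated Psi2 (4.17)
end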


Chapter 5

Stopping Rules

In the previous chapter we discussed methods for choosing the relaxation parameter. In this chapter we will look at strategies for determining the optimal number of iterations k∗. We will present three strategies. The first two require some kind of knowledge of the noise level δ and also a user-chosen parameter τ; for both of these strategies we present a training strategy to choose a reasonable value of τ. In the following chapter we let ‖·‖ denote the 2-norm ‖·‖_2.

5.1 Stopping Rules with Training

In this section we will introduce a general rule to determine the appropriate stopping index k∗, and from this general rule we will focus on two already known special cases, which are all described in [15].

As in section 4.1 we assume the following additive noise model:

$$b = \bar{b} + \delta b,$$

where b̄ is the noise-free right-hand side and δb is the noise component, which may come from both discretization errors and measurement errors. We also assume that the norm of the error is known:

$$\delta = \|\delta b\|.$$

For notational convenience we assume that λ = λ_k.

Proposition 5.1 Let {x^k} be given by (3.1), where T = I, and let r^k = M^{1/2}(b − Ax^k). Put Q = M^{1/2} A A^T M^{1/2} and W = I − (λβ/(2(1−α))) Q, where α, β are given real numbers. Let b̄ ∈ R(A), let x̄ be any solution to Ax = b̄, and let −1 ≤ τ_k ≤ 1. Put e_k = x̄ − x^k and t_1 = 2λ(1−α)⟨r^k, W r^k⟩. Then

$$\|e_{k+1}\|^2 = \|e_k\|^2 - \lambda\left(d_{\alpha,\beta} - 2\tau_k \delta \|M^{1/2}\| \|r^k\|\right) - t_1, \tag{5.1}$$

where

$$d_{\alpha,\beta} = \langle r^k, (2\alpha + \beta - 1) r^k + (1 - \beta) r^{k+1} \rangle. \tag{5.2}$$

The proof can be found in [15]. From (5.1) we get

$$\|e_{k+1}\|^2 \le \|e_k\|^2 - \lambda\left(d_{\alpha,\beta} - \tau \delta \|M^{1/2}\| \|r^k\|\right) - t_1, \tag{5.3}$$

where τ = 2 max_k |τ_k|, such that τ ∈ (0, 2). This means that the error is decreasing as long as t_1 ≥ 0 and d_{α,β} − τδ‖M^{1/2}‖‖r^k‖ ≥ 0.

This leads us to the following general rule:

α, β-rule:

$$\frac{d_{\alpha,\beta}}{\|r^k\|} \le \tau \delta \|M^{1/2}\|. \tag{5.4}$$

Using the α, β-rule we search for the smallest iteration number k = k_{α,β} satisfying (5.4), i.e. the first index beyond which the monotone decrease ‖x̄ − x^{k+1}‖ < ‖x̄ − x^k‖ is no longer guaranteed. (If d_{α,β}/‖r^0‖ ≤ τδ‖M^{1/2}‖ then k_{α,β} = 0.)

Proposition 5.2 Let α, β ∈ (0, 1). Then

$$\lambda \le \lambda_1 = \frac{2(1 - \alpha)}{\beta \sigma_1^2} \;\Rightarrow\; t_1 \ge 0,$$

and

$$\lambda \le \lambda_2 = \frac{2\alpha}{(1 - \beta)\sigma_1^2} \;\Rightarrow\; d_{\alpha,\beta} \ge 0.$$

The proof can be found in [15]. Using this proposition we should take λ ≤ λ_max = min(λ_1, λ_2). It can now be seen that λ_1 ≤ 2/σ_1^2 ⇒ α + β ≥ 1 and λ_2 ≤ 2/σ_1^2 ⇒ α + β ≤ 1. This means that λ_1 ≤ λ_2 ⇒ α + β ≥ 1. From this it follows that

$$\lambda_{\max} = \begin{cases} \lambda_1 \le 2/\sigma_1^2 & \text{if } \alpha + \beta \ge 1 \\ \lambda_2 \le 2/\sigma_1^2 & \text{if } \alpha + \beta \le 1 \\ = 2/\sigma_1^2 & \text{if } \alpha + \beta = 1. \end{cases} \tag{5.5}$$

The rule corresponding to λ_max = 2/σ_1^2 is obtained with β = 1 − α, for which

$$d_{\alpha,1-\alpha} = \langle r^k, (2\alpha + 1 - \alpha - 1) r^k + (1 - 1 + \alpha) r^{k+1} \rangle = \langle r^k, \alpha r^k + \alpha r^{k+1} \rangle.$$

The ME-rule, which we will describe later, is a rule of this form.

5.1.1 The Discrepancy Principle

We will now introduce a specific variant of the α, β-rule (5.4): the well-known discrepancy principle (DP) of Morozov. To obtain the DP-rule we let α = 0.5, β = 1, and then by (5.2), d_{0.5,1} = ‖r^k‖^2 = d_DP. The stopping index k = k_{0.5,1} = k_DP is then the first index for which

DP-rule:

$$\|r^k\| \le \tau \delta \|M^{1/2}\|. \tag{5.6}$$

We note from proposition 5.2 that λ_2 = +∞ and λ_1 = 1/σ_1^2. Hence for the DP-rule the error e_k is monotonically decreasing for k = 1, 2, ..., k_DP, assuming that λ ∈ (0, 1/σ_1^2).

Since we introduced DP as a specific variant of the α, β-rule, formula (5.6) is only valid for the SIRT methods where T = I. By using the original version of the discrepancy principle we can also formulate the discrepancy principle for the remaining methods. For these methods the stopping index k = k_DP is the first index for which

$$\|A x^k - b\|_2 \le \tau \delta.$$
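Inside an iteration loop the DP check is a one-liner; a sketch of both variants, where tau, delta, and (for the SIRT case) the matrix M are assumed given:

% Discrepancy principle stopping check (illustrative sketch).
% SIRT variant (5.6): the residual is weighted by M^(1/2).
M12 = sqrtm(M);                       % M^(1/2); diagonal in practice
rk  = M12*(b - A*x);                  % r^k = M^(1/2)(b - A*x^k)
stopDP = norm(rk) <= tau*delta*norm(M12);
% ART variant: stopDP = norm(b - A*x) <= tau*delta;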

5.1.2 The Monotone Error Rule

Another specific variant of the α, β-rule (5.4) is the monotone error rule (ME) by Hämarik and Tautenhahn [23]. We let α = 1, β = 0 and get d_{1,0} = d_ME = ⟨r^k, r^k + r^{k+1}⟩. The stopping index k = k_{1,0} = k_ME is the first index for which

ME-rule:

$$\frac{d_{ME}}{\|r^k\|} \le \tau \delta \|M^{1/2}\|. \tag{5.7}$$


From proposition 5.2 we get that λ_2 = 2/σ_1^2. The expression for λ_1 cannot be used directly from proposition 5.2, and we must therefore look at the definition of t_1 given in proposition 5.1. We then have

$$t_1 = 2\lambda(1 - \alpha)\langle r^k, W r^k \rangle = 2\lambda \langle r^k, (1 - \alpha) W r^k \rangle = 2\lambda \left\langle r^k, \left((1 - \alpha) I - \frac{\lambda \beta}{2} Q\right) r^k \right\rangle.$$

In this case t_1 = 0, and it follows that λ_max = 2/σ_1^2, in accordance with (5.5).

For the ME-rule the error e_k monotonically decreases for k = 1, 2, ..., k_ME, assuming that λ ∈ (0, 2/σ_1^2). The ME-rule in this form is only valid for the SIRT methods where T = I.

A further investigation and comparison of the rules (5.6) and (5.7) can be found in [15].

5.1.3 The Training Part

To generate effective stopping rules for the DP-rule and the ME-rule we will use training to teach the rule when to stop for a certain data set, the training sample. Our hope is that it will also be successful when used on different data sets not too distant from the training sample.

From the inequality (5.3) we have that

$$\|e_k\|^2 - \|e_{k+1}\|^2 \ge P_k,$$

where

$$P_k = \lambda\left(d_{\alpha,\beta} - \tau \delta \|M^{1/2}\| \cdot \|r^k\|\right).$$

Thus P_k acts as a predictor for ‖e_k‖^2 − ‖e_{k+1}‖^2. As long as P_k > 0 the iterations should be continued, and we spot the first time where

$$P_{k-1} > 0, \qquad P_k \le 0.$$


Using this we obtain the following bounds and acceptance interval for τ:
\[
R_k = \frac{d_{\alpha,\beta}(k)}{\delta M^{1/2}\,\|r^k\|}
\;\le\; \tau \;<\;
\frac{d_{\alpha,\beta}(k-1)}{\delta M^{1/2}\,\|r^{k-1}\|} = R_{k-1}. \tag{5.8}
\]

The training process consists of the following steps. We assume that the matrix A is given.

1. Choose a test solution x̄.

2. Generate the right-hand side b̄.

3. Generate noisy samples of b̄: b^i = b̄ + δb^i, i = 1, ..., s.

4. For each sample b^i, i = 1, ..., s, compute {x^k(b^i)} by using the algorithm described by equation (3.1), where T = I. Find the index k = k* = k*(i) such that the relative error
\[
E_k(i) = \frac{\|x^k(b^i) - \bar{x}\|}{\|\bar{x}\|}
\]
is minimal.

5. Use formula (5.8) to find the corresponding interval for τ: τ = τ_i ∈ [R_{k*(i)}, R_{k*(i)−1}). Put τ̄_i = mid[R_{k*(i)}, R_{k*(i)−1}) and define τ̄ = (1/s) Σ_{i=1}^{s} τ̄_i.

6. Use τ = τ̄ in the stopping rule.

In [15] an alternative training scheme is also introduced. In this scheme points 5 and 6 above are replaced with points that use the lengths of the τ intervals instead of τ itself.

Even though the theory for this training scheme arises from SIRT methods of the form (3.1) where T = I, we will in this software package also use the strategy for the remaining methods. This requires some changes in the acceptance interval (5.8) for the ART methods, since M does not exist for these methods.
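A minimal sketch of steps 3-6 is given below, assuming a helper run_sirt that returns the iterates x^k (as columns of X) and the values R_k from (5.8) for one noisy sample; all names are illustrative, not the package's interface.

    % Minimal training sketch for tau (assumed names and helper run_sirt).
    s = 20;  taui = zeros(s,1);
    for i = 1:s
        bi = bbar + delta_b(:,i);             % step 3: noisy sample b^i
        [X, R] = run_sirt(A, bi, kmax);       % iterates x^k and R_k from (5.8)
        relerr = sqrt(sum((X - xbar).^2, 1)) / norm(xbar);
        [~, kstar] = min(relerr);             % step 4: minimal relative error
        taui(i) = (R(kstar) + R(max(kstar-1,1))) / 2;  % step 5: midpoint
    end
    tau = mean(taui);                         % step 6: trained tau-bar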



5.2 Normalized Cumulative Periodogram

When we first introduced the iterative methods in chapter 3, we mentioned that the number of iterations k plays a role comparable to Tikhonov's regularization parameter ω and the truncation parameter k for TSVD. For the choice of an optimal number of iterations for the iterative methods we therefore got the idea to use the Normalized Cumulative Periodogram (NCP), which is already used to determine the regularization parameters for Tikhonov and TSVD.

The motivation for the NCP method was to find a way to choose the regularization parameter without calculating the SVD or looking at the Picard plot.

In the NCP approach we view the residual vector r^k = b − Ax^k as a time series and consider the exact right-hand side b̄ as a signal which appears clearly different from the noise vector δb. We can do this since we know that b̄ is a smooth function. We then want to find the regularization parameter where the residual changes from being signal-like, dominated by components from b̄, to being noise-like, dominated by components of δb.

In [22] it is discussed that the singular functions are similar to the Fourier basis functions, and the discrete Fourier transform (DFT) is therefore used in the NCP method.

We let r̂^k denote the DFT of the residual vector r^k for the iterative method,
\[
\hat{r}^k = \operatorname{dft}(r^k) = \big((\hat{r}^k)_1, (\hat{r}^k)_2, \ldots, (\hat{r}^k)_m\big)^T \in \mathbb{C}^m .
\]
The power spectrum of r^k is defined as the real vector
\[
p^k = \big(|(\hat{r}^k)_1|^2, |(\hat{r}^k)_2|^2, \ldots, |(\hat{r}^k)_{q+1}|^2\big)^T,
\qquad q = \lfloor m/2 \rfloor,
\]
where q denotes the largest integer such that q ≤ m/2.

We then define the normalized cumulative periodogram (NCP) for the residual vector r^k as the vector c(r^k) ∈ R^q with elements
\[
c(r^k)_i = \frac{(p^k)_2 + \cdots + (p^k)_{i+1}}{(p^k)_2 + \cdots + (p^k)_{q+1}},
\qquad i = 1, \ldots, q .
\]

If the residual vector consists of white noise, then by definition the expected power spectrum is flat, i.e. E((p^k)_2) = E((p^k)_3) = ... = E((p^k)_{q+1}). Hence the points (i, E(c(r^k)_i)) on the NCP curve lie on the straight line from (0, 0) to (q, 1). Actual noise does not have an ideally flat spectrum, but we can still expect the NCP to be close to a straight line. A statistical test of whether the NCP follows a straight line is that, at a 5% significance level, the NCP curve must lie inside the Kolmogorov-Smirnoff limits ±1.35 q^{−1/2} around the straight line.

In practice it can be difficult to stay within the Kolmogorov-Smirnoff limits, and we will instead choose the regularization parameter for which the residual r^k best represents white noise, in the sense that the NCP is closest to a straight line. We measure the 2-norm between the NCP and the vector c_white = (1/q, 2/q, ..., 1)^T. We then define the NCP method as choosing k* = k_NCP as the minimizer of
\[
d(k) = \| c(r^k) - c_{\text{white}} \|_2 .
\]
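For a single residual vector r (a column vector) the distance d(k) could be computed as in the sketch below; the variable names are illustrative. The stopping index k_NCP then minimizes d over the computed iterations.

    % Minimal sketch of the NCP distance d(k) for one residual vector r.
    m = length(r);
    q = floor(m/2);
    p = abs(fft(r)).^2;                    % power spectrum of r^k
    c = cumsum(p(2:q+1)) / sum(p(2:q+1));  % NCP c(r^k), DC component excluded
    cwhite = (1:q)' / q;                   % NCP of ideal white noise
    d = norm(c - cwhite);                  % distance to the straight line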




Chapter 6

Test Problems

This software package includes three tomography test problems: parallel- and fan beam tomography and seismic tomography. Both parallel- and fan beam tomography arise from transmission tomography [32], [31], [28], where we study an object with nondiffractive radiation, i.e. X-rays. The loss of intensity of the X-rays is recorded by a detector and used to produce a two-dimensional image of the irradiated object. If we let I₀ denote the intensity of beam L from the source, f(x) denote the linear attenuation coefficient at the point x, and I denote the intensity of the beam after having passed the object, then
\[
\int_L f(x)\, dx = \log \frac{I_0}{I},
\]
which can also be written as
\[
\frac{I}{I_0} = \exp\Big( -\int_L f(x)\, dx \Big) .
\]
This provides us with the line integrals of the function f along the lines L. The transform that maps a function on R² into the set of its line integrals is called the Radon transform [31].

The difference between parallel- and fan beam tomography lies in the arrangement of the rays L. For parallel beam tomography the rays arise from sources arranged in parallel and with equal spacing. To get a better representation of the radiated domain, the sources can be rotated around the domain using different angles θ in such a way that the rays remain parallel. Figure 6.1 illustrates an example of a discretized domain with parallel rays for a given angle of the sources.


[Figure 6.1: Illustration of parallel beam tomography for a specific angle of the sources (a discretized domain with cells x1, ..., x36 and a ray i passing through it).]

For fan beam tomography we only have a single source. From this source a number of rays are arranged like a fan. There are two types of fan beam tomography, depending on whether the rays are equiangular or equispaced. Figure 6.2 illustrates a discretized case of fan beam tomography with equiangular rays, where the green circle illustrates the source and the red lines the rays. To get a better representation of the domain, the source can be rotated around the domain, keeping the distance to the center of the domain constant.

Seismic tomography belongs to the class of geophysical tomography problems. In seismic tomography the travel time through a domain of the subsurface of the earth is observed. From inversions of the line integrals along the seismic waves, the structure of the subsurface is estimated. The travel time t_L for ray L can be expressed as
\[
t_L = \int_L s(l)\, dl ,
\]
where s(l) is the slowness, which is the reciprocal of the velocity.


[Figure 6.2: Illustration of fan beam tomography.]

In our seismic tomography problem we consider a 2-dimensional subsurface slice. On the right side of the subsurface, s sources are positioned such that the distance between the sources is constant, and the distance from the top source to the surface and from the bottom source to the boundary of the domain is half the distance between two sources. On the left side of the subsurface and on the surface, a total of p seismographs or receivers are located under the same conditions as the sources. From each of the s sources, p rays are transmitted such that all receivers are hit. Figure 6.3 illustrates the set-up of the seismic tomography problem, where the green circles denote the sources, the blue squares denote the receivers, and the red lines denote the rays from one of the sources.

To apply the three test problems we need a formulation as a linear system of the form Ax = b. This can be done similarly for all three test problems, since only the arrangement of the rays differs. To avoid confusion we consider a domain described by the function f, which is either the object from parallel or fan beam tomography or the structure of the subsurface. We start by dividing the domain into N parts of unit length in each of the dimensions. This gives us N² square cells. All cells are numbered from 1 to N², starting with the cell in the upper left corner and ending with the cell in the bottom right corner, running along the columns, i.e. the numbering shown in figures 6.1, 6.2 and 6.3.

63


64 Test Problems<br />

x<br />

1<br />

x<br />

2<br />

x<br />

3<br />

x<br />

4<br />

x<br />

5<br />

x<br />

6<br />

x<br />

7<br />

x<br />

8<br />

x<br />

9<br />

x<br />

10<br />

x<br />

11<br />

x<br />

12<br />

x<br />

13<br />

x<br />

14<br />

x<br />

15<br />

x<br />

16 16 16 16 16 16 16 16 16 16 16 16<br />

x<br />

17<br />

x<br />

18<br />

x<br />

19<br />

x<br />

20<br />

x<br />

21<br />

x<br />

22 22 22 22 22 22 22 22 22 22 22 22<br />

x<br />

23<br />

x<br />

24<br />

x<br />

25<br />

x<br />

26<br />

x<br />

27 27 27 27 27 27 27 27 27 27 27 27<br />

x<br />

28<br />

x<br />

29<br />

x<br />

30<br />

x<br />

31<br />

x<br />

32<br />

x<br />

33<br />

x<br />

34 34 34 34 34 34 34 34 34 34 34 34<br />

x<br />

35<br />

x<br />

36<br />

Figure 6.3: Illustration of seismic tomography.<br />

Each cell j is assigned a constant value x_j, which is an approximation of the average of the function f within the j'th cell. In this way the reshaped vector x is a discretized version of the "true" function f.

For illustration we consider the i'th ray in figure 6.1, which passes through cells in the domain. We define the element a_ij as the length of the i'th ray through cell j, i.e. a_ij = 0 if ray i does not pass through cell j. The contribution from ray i through cell j is then the length multiplied by the value of cell j, i.e. a_ij · x_j. The measurement b_i is then
\[
b_i = \sum_{j=1}^{N^2} a_{ij} x_j, \qquad i = 1, \ldots, M,
\]
where M is the number of rays.
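To make the construction of the a_ij concrete, the sketch below computes the intersection lengths of a single ray with an N × N grid of unit cells by collecting the ray's crossings with the grid lines. The ray geometry, the helper variables, and the simplified cell numbering are illustrative assumptions, not the package's actual matrix generator.

    % Illustrative sketch: the nonzeros a_ij of one row of A for one ray.
    N    = 4;                              % assumed grid size (unit cells)
    xvec = ones(N^2, 1);                   % assumed cell values x_j
    p0   = [0; 1.3];                       % assumed ray entry point
    v    = [1; 0.4];  v = v / norm(v);     % assumed ray direction
    t    = unique([((0:N)-p0(1))/v(1), ((0:N)-p0(2))/v(2)]);
    t    = t(t >= 0 & isfinite(t));        % forward crossings of grid lines
    len  = diff(t);                        % segment length inside each cell
    mid  = p0 + v*(t(1:end-1) + len/2);    % midpoint of each segment
    col  = floor(mid(1,:)) + 1;            % cell column of each midpoint
    row  = floor(mid(2,:)) + 1;            % cell row of each midpoint
    in   = col>=1 & col<=N & row>=1 & row<=N & len>0;
    j    = (col(in)-1)*N + row(in);        % simplified column-wise numbering
    aij  = len(in);                        % the lengths a_ij for this ray
    bi   = aij * xvec(j);                  % b_i = sum_j a_ij * x_j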

The exact solution used depends on the chosen test problem. For the parallel and fan beam test problems the exact solution is the modified Shepp-Logan head phantom defined in [37]. The Shepp-Logan phantom is a famous model of the brain based on ellipses, often used for medical tomography, and it can be scaled for different discretizations. In this modified version the contrast is improved for a better visualization. Figure 6.4 (a) illustrates the modified Shepp-Logan phantom for N = 100.
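In MATLAB the modified Shepp-Logan phantom is available through the built-in phantom function (Image Processing Toolbox); a small sketch:

    % Generate and display the modified Shepp-Logan phantom for N = 100.
    N = 100;
    P = phantom('Modified Shepp-Logan', N);  % N-by-N head phantom image
    x = P(:);                                % column-wise reshape to a vector
    imagesc(P), axis image, colormap gray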


[Figure 6.4: The exact solutions for the test problems: (a) the modified Shepp-Logan phantom; (b) the seismic phantom subsurface (both for N = 100).]

For the seismic tomography test problem we have chosen to create our own phantom. This phantom illustrates a 2-dimensional subsurface with simple convergent boundaries of two tectonic plates with different slowness. We have chosen the case where the plates create a subduction zone, where one plate moves underneath the other. This test phantom can also be scaled for different discretizations. Figure 6.4 (b) illustrates the tectonic phantom for N = 100.
65




Chapter 7

Testing the Methods

In this chapter we will investigate the performance of the implemented iterative methods and the corresponding strategies. When performing these investigations we must pay attention to the term inverse crime. An inverse crime arises when the same model is used both to produce the simulated data and to invert them, or when the same discretization is used to simulate and to invert. Inverse crime often results in problems that are easier to solve than problems arising from real data, but if the algorithms do not work on inverse crime problems, we cannot hope that they will work on real data. In this chapter we will use a standard test problem with inverse crime.

The standard test problem will be used for almost every test case. We choose the parallel beam tomography test problem with the discretization N = 100. The angles of the sources start at 0 degrees and end at 179 degrees with a gap of 5 degrees. For each of these 36 angles we use 150 parallel rays. The generated matrix A then has the dimension 5400 × 10000, which means that the system is underdetermined. We create a noisy right-hand side by adding white Gaussian noise with noise level δ = 0.05.
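A sketch of generating this standard test problem with the package's paralleltomo routine (interface as described in the manual pages) is given below; whether the noise level is interpreted relative to ‖b̄‖ is an assumption here.

    % Sketch: the standard test problem with a noisy right-hand side.
    N = 100;  theta = 0:5:179;  p = 150;           % 36 angles, 150 rays each
    [A, bbar, xbar] = paralleltomo(N, theta, p);   % A is 5400-by-10000
    delta = 0.05;
    e = randn(size(bbar));
    b = bbar + delta * norm(bbar) * (e / norm(e)); % white Gaussian noise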


[Figure 7.1: Relative error histories for the DROP method for four different test problems (SNARK, paralleltomo, fanbeamtomo, seismictomo), together with the corresponding optimal numbers of iterations as functions of λ.]


[Figure 7.2: The minimum relative error as a function of the relaxation parameter, and the corresponding optimal number of iterations, for the SNARK head phantom and paralleltomo, when the weighting w consists of random numbers from 0 to 50.]

7.1 Convergence of DROP

In section 3.1.5, where we derived the SIRT method DROP, we mentioned that the upper bound of the convergence interval for DROP can be estimated by 2/max(w_i) for i = 1, ..., m, where w_i > 0 denotes the user-defined weighting of the equations. In this test we will look at the consequences of choosing this interval instead of the originally derived interval (0, 2/ρ(S⁻¹AᵀDA)). The advantage of using the simplified upper bound 2/max(w_i) is that we then do not have to compute the spectral radius ρ(S⁻¹AᵀDA), since this can be very expensive.
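The two bounds could be compared as in the sketch below, where D and S are the diagonal matrices from section 3.1.5 and the spectral radius is estimated iteratively with MATLAB's eigs; the variable names are illustrative.

    % Sketch (assumed names): the simplified versus the original upper bound.
    bound_simple = 2 / max(w);               % cheap bound 2/max(w_i)
    Afun = @(v) S \ (A' * (D * (A * v)));    % v -> S^(-1) A' D A v, matrix-free
    rho  = abs(eigs(Afun, size(A,2), 1));    % estimate of the spectral radius
    bound_exact = 2 / rho;                   % original interval (0, 2/rho)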

For this test we will not only use the standard test problem. We will also use a test problem from fan beam tomography, one from seismic tomography, and a special variant of the SNARK head phantom, which is not available in the software package.

The size of the convergence interval influences the choice of the relaxation parameter λ = λ_k. For the different test problems mentioned we will therefore study the relative error histories for different values of λ and look at the optimal number of iterations. In this way it should be clear what is lost by using the simplified version of the convergence interval.

Figure 7.1 illustrates the four different test problems and, for each problem, the minimum relative errors and the corresponding numbers of iterations. The vertical dotted line illustrates the upper bound 2/max(w_i). In section 4.2.1 we defined the optimal value of the relaxation parameter λ as the λ which gives rise to the fastest convergence to the smallest relative error in the solution. We notice that only for the SNARK test problem does the simplified convergence interval contain the optimal value of λ. For all the other test problems the optimal value of λ is cut off.

To get a better idea of the performance of the interval, we chose to include a weighting not identically equal to 1. The vector w is created as random numbers between 0 and 50. Figure 7.2 illustrates the minimum relative errors and the corresponding numbers of iterations for the SNARK test problem and the standard test problem. Again, the vertical dotted line illustrates the simplified upper bound of the convergence interval. For this example we see that a large part of the original convergence interval is removed, and again the optimal value of λ is cut off.

Based on these observations we conclude that using the simplified convergence interval is not a good idea if one is interested in finding an optimal value of the relaxation parameter λ, since we risk losing the optimal value. We have therefore chosen to let our implementation of the DROP method use the original but more expensive convergence interval.

7.2 Symmetric Kaczmarz as a SIRT Method

When we introduced the symmetric Kaczmarz method in section 3.2.2, we mentioned that it can be rewritten in the SIRT form (3.1) in such a way that the matrix M_SA is symmetric, which means that the derived theory for the SIRT methods is also valid for the symmetric Kaczmarz method. Since we are not interested in computing the matrix M_SA, the only strategies we can use for choosing the relaxation parameter λ_k are the Ψ1- and Ψ2-based relaxation strategies. We will not include the modified Ψ1 and Ψ2 strategies, since we do not have any good choice of the parameter τ from the paper [16], and it is not a part of this project to explore the performance of this parameter.

Figure 7.3 illustrates the relative error histories for the solutions with three different choices of λ_k. The red circles denote the relative error of the solutions when the Ψ1-based relaxation is chosen, the blue triangles when the Ψ2-based relaxation is chosen, and the pink diamonds when the constant value λ = 0.25 is chosen. We notice that for both the Ψ1- and Ψ2-based relaxations the relative error decreases and levels out as the number of iterations increases, which is the behaviour we would expect. We also notice that the relative errors for the Ψ1- and Ψ2-based relaxations do not reach the same level as the relative errors for the solutions with a constant value of λ chosen in the part of the interval where we would expect the optimal value to be. We will later compare the performance of the strategies for choosing the relaxation parameter.

[Figure 7.3: Ψ-based relaxation for symkaczmarz.]

7.3 Test of the Choice of Relaxation Parameter

In section 4.2 we introduced several methods and strategies for selecting the relaxation parameter λ_k in a reasonable way. In this section we will investigate the performance of each of them.

Training

We start by investigating our developed strategies for finding the optimal value of λ = λ_k using training. In this test case we give the algorithm the best


[Figure 7.4: Illustration of the minimum relative errors for different λ values and the corresponding numbers of iterations for Cimmino's projection method.]

[Figure 7.5: Illustration of the minimum relative errors for different λ values and the corresponding numbers of iterations for Kaczmarz's method.]

[Figure 7.6: Illustration of the minimum relative errors for different λ values and the corresponding numbers of iterations for the randomized Kaczmarz method.]

[Figure 7.7: The relative errors for the SIRT methods with the optimal λ value.]

[Figure 7.8: The relative errors for the ART methods with the optimal λ value.]

conditions for determining the optimal value of the constant relaxation parameter λ, since we use the training method on the very problem we want to solve. Since our implementation makes it possible for the user to choose a maximum number of iterations, we will investigate the behaviour of the training methods both when the maximum number of iterations is chosen in a sensible way and when it is chosen too small.

In the following investigations we have chosen not to visualize the behaviour of all the implemented iterative methods, since the performance of all methods can be represented by a few examples. Figure 7.4 illustrates the minimum relative error as a function of λ (left) and the corresponding number of iterations needed to obtain it (right) for Cimmino's projection method. In the left figure we observe that the minimum relative errors, as expected, are almost equal except at the beginning and the end of the convergence interval. The red square denotes the λ value found by the training strategy and the corresponding relative error. As we would expect, the relative error is smaller than the upper bound of the resolution limit, which is denoted by the green dashed lines. In the right figure we observe that the found value, denoted by the red diamond, is very close to the minimum number of iterations used. As mentioned, this example illustrates the typical behaviour of the SIRT methods, and from this we are very satisfied with the performance of the training strategy for the SIRT methods.
of the training strategy <strong>for</strong> the SIRT methods.


We then take a look at figure 7.5. Again the figure illustrates the minimum relative error as a function of λ (left) and the corresponding number of iterations needed to obtain it (right), but now for Kaczmarz's method. As expected, only a small interval of λ values has minimum relative errors below the upper bound of the resolution limit, and we notice that the λ found by the training strategy (the red square) lies just below this upper bound. When also considering the number of iterations, we notice that the found λ value (the red diamond) is in fact the value which is below the upper bound on the relative error and uses the minimum number of iterations. Many λ values need fewer iterations, but based on the minimum relative errors they can be eliminated, since they are above the upper bound of the resolution limit.

The behaviour of Kaczmarz's method is similar to that of the symmetric Kaczmarz method, but for the randomized Kaczmarz method we observe deviations. Figure 7.6 illustrates the behaviour of the randomized Kaczmarz method. We see that the left figure is similar to the one for Kaczmarz's method, but the right figure looks different. Since this method involves a random selection of the rows, we get a stochastic result and can only discuss the performance of the method as an average. From the right figure we can then see that the overall performance is close to that of Kaczmarz's method.

Figure 7.7 illustrates the relative errors for all the SIRT methods when the optimal value of λ is used. We notice that the performance of the methods is almost equal, except for Landweber's method, which has slower semi-convergence than the other methods. SART also returns a result which is slightly better than most methods. We also notice that Cimmino's projection method and Cimmino's reflection method return the exact same solutions, but the relaxation parameter is exactly twice as big for the projection method as for the reflection method. Returning to the formulations of the two methods, we notice that they differ only by a factor of 2. Therefore, when the optimal value can be found for one method, the optimal value for the other follows by a factor of 2.

Figure 7.8 illustrates the relative errors for all the ART methods when the optimal value of λ is used. From this we notice that Kaczmarz's method and symmetric Kaczmarz behave similarly, while randomized Kaczmarz seems to reach semi-convergence later than the other methods, but it also seems to stay at the semi-convergence level.

As mentioned in section 4.2.1, the implemented strategies take a different approach to finding the optimal value of λ if too few iterations are used. Figure 7.9 (a) illustrates the minimum relative error, and figure 7.9 (b) the relative error histories for nine different values of λ, for Cimmino's projection method. From (b) we clearly see that the minimum relative error is found after 20 iterations, which in this case is also the allowed maximum number of iterations. This obviously has an effect on figure (a), since semi-convergence is not reached for the λ values. In this case the optimal value of λ is found based on the relative error alone, and from the red square in figure (b) we conclude that the found λ is reasonable, and that our developed strategy for finding the optimal value of λ performed as expected.

[Figure 7.9: Illustration of the minimum relative errors for different λ values and the corresponding relative error histories for Cimmino's projection method when the maximum number of iterations is 20.]

[Figure 7.10: Illustration of the minimum relative errors for different λ values and the corresponding relative error histories for Kaczmarz's method when the maximum number of iterations is 4.]

[Figure 7.11: Illustration of the minimum relative errors for different λ values and the corresponding relative error histories for the randomized Kaczmarz method when the maximum number of iterations is 4.]

Figure 7.10 illustrates the minimum relative error in figure (a), and figure (b) illustrates the relative error histories for nine different values of λ for Kaczmarz's method. Again we see from figure (b) that the maximum number of iterations is reached for each value of λ, and again the optimal value of λ is found based on the minimum in figure (a). The found value, the red square, seems to be reasonable. Figure 7.11 illustrates the minimum relative error and the relative error histories for the randomized Kaczmarz method. We notice that the found value of λ has a relative error below the upper bound of the resolution limit. We also notice that the curve of minimum relative errors is flatter for randomized Kaczmarz than for Kaczmarz's method, which can make it difficult for the algorithm to determine which of the values to choose.

Line Search

The next strategy for choosing the relaxation parameter that we will investigate is line search. As mentioned when we introduced line search in section 4.2.2, this strategy can only be used for SIRT methods where T = I. Figure 7.12 illustrates the relative error histories for Landweber's method, Cimmino's projection method, Cimmino's reflection method, and the CAV method, when λ is chosen using line search.


[Figure 7.12: Relative error histories for the relaxation parameter λ chosen with line search.]

We notice that, besides the semi-convergence behaviour, the error for both of the Cimmino methods and for CAV has a zigzagging behaviour. Experience shows that this behaviour depends on the noise in the data. For small noise levels the zigzagging is almost invisible, but for larger noise levels the erratic behaviour increases. The explanation seems to be that line search assumes consistent data, which is not the case in our test problem. The conclusion on the performance of the line search strategy is thus that we get good performance for small noise levels, but not for larger noise levels.
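For the simplest case (Landweber's method with T = I and M = I) an exact line search has a closed form: λ_k minimizes ‖b − A(x^k + λAᵀr^k)‖₂ over λ. A minimal sketch of one such step, with illustrative names:

    % One Landweber step with exact line search (sketch; assumed names).
    r = b - A*x;                        % current residual r^k
    d = A' * r;                         % search direction A^T r^k
    lambda = (d' * d) / norm(A*d)^2;    % minimizer of ||r - lambda*A*d||_2
    x = x + lambda * d;                 % updated iterate x^(k+1)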

Relaxation to Control Noise Propagation

The last of the introduced strategies for choosing the relaxation parameter actually consists of four different strategies, since it includes both the Ψ1- and the Ψ2-based relaxation and their modified versions. Since we saw earlier in this chapter that the symmetric Kaczmarz method can also be used together with these strategies, we will test them on the SIRT methods and the symmetric Kaczmarz method.

Figure 7.13 illustrates the relative error histories when λ_k is chosen using the Ψ1- and Ψ2-based relaxations. We notice that the relative error remains almost constant after the minimum has been obtained for both Ψ1 and Ψ2, which shows that these strategies indeed dampen the influence of the noise-error; this reduces the sensitivity of the solution, such that the effect of choosing too many iterations is dampened.


[Figure 7.13: The relative error histories for the Ψ1- and Ψ2-based relaxations.]

[Figure 7.14: The relative error histories for the modified Ψ1- and Ψ2-based relaxations.]


Minimum relative error    Training      Line search    Modified Ψ2
Landweber                 0.3874 (70)   0.3875 (49)    0.4519 (150)
Cimmino (projection)      0.4240 (40)   0.4267 (33)    0.4503 (150)
Cimmino (reflection)      0.4240 (40)   0.4267 (33)    0.4503 (150)
CAV                       0.4235 (40)   0.4266 (33)    0.4504 (150)
DROP                      0.4262 (39)   -              0.4513 (150)
SART                      0.4048 (45)   -              0.4399 (150)
Kaczmarz                  0.4246 (3)    -              -
Symmetric Kaczmarz        0.4247 (2)    -              0.5010 (7)*
Randomized Kaczmarz       0.3957 (9)    -              -

Table 7.1: The minimum relative error for each SIRT or ART method combined with the different strategies for choosing λ. The numbers in brackets are the used numbers of iterations. The * for symmetric Kaczmarz denotes that for this method the modified Ψ2 strategy is not used; instead the Ψ1 strategy is used, since this strategy gave the best result for the symmetric Kaczmarz method.

We notice that for the SIRT methods the performance of Ψ2 is better than that of Ψ1, while for the symmetric Kaczmarz method the performance of Ψ1 is better than that of Ψ2.

For the modified versions of the Ψ1- and Ψ2-based relaxations the parameter τ is chosen based on the results from [16]. Figure 7.14 illustrates the relative error histories when λ_k is chosen using the modified Ψ1- and Ψ2-based relaxations. Again we notice that the influence of the noise-error is dampened. Comparing the modified versions with the original versions, we notice that the modified strategies reach a lower level of relative errors within the same number of iterations. The conclusion is therefore that the acceleration built into these strategies seems to be a good idea. As mentioned, we have only used a constant value of the parameter τ, but it could be interesting to see whether choosing τ_k depending on the iteration could give an even better result. This investigation, and a closer investigation of how to determine a constant "optimal" value of the parameter τ, is not a part of this project and will therefore not be pursued further.

Comparison of the Relaxation Strategies

By observing figures 7.7, 7.12 and 7.14 we can compare the performance of the different relaxation strategies, since they are applied to the same problem. When comparing the methods with the different strategies we will consider both the minimum relative error and the number of iterations used to reach this minimum.


[Figure 7.15: The relative error histories for the SNARK test problem using the different relaxation strategies: trained λ for the SIRT and ART methods, Ψ1- and Ψ2-based relaxations, their modified versions, and line search.]

The minimum relative errors and the used numbers of iterations are gathered in table 7.1. For the Ψ-based strategies we have only shown the best result, which for all the SIRT methods was obtained with the modified Ψ2 strategy, but for symmetric Kaczmarz with the Ψ1 strategy.

By looking at figure 7.7 and table 7.1 we notice that most of the methods are almost equally good when the optimal relaxation parameter is found for each method. The only method that has a smaller minimum, but found with more iterations, is the Landweber method. From figure 7.12 and the table, where line search is used to compute the relaxation parameter, we notice that again Landweber has a smaller minimum relative error than the other methods but uses more iterations. Comparing the minimum relative errors obtained with the training strategy and the line search strategy, we see that they in general give almost the same relative errors. Regarding the number of iterations, line search uses a few less than with an optimal value of the relaxation parameter.

We then compare with figure 7.14 (b), since we have already concluded that the modified Ψ2-based relaxation gives the best results of the Ψ-based relaxations. For the modified Ψ2-based relaxation we see that all methods perform equally well with this strategy. Comparing this with the other strategies, we conclude that the minimum relative error is almost the same as for the other strategies. Concerning the number of iterations, the modified Ψ2 strategy has not found the minimum after 150 iterations, since the strategy dampens the noise-error, but we also notice that not much has happened with the relative error over the last 50 iterations.

The conclusion for the SIRT methods with this test problem must be that all three introduced strategies give satisfactory results. The risk when using line search is that the method assumes consistency, which we cannot guarantee for large noise levels, and in this case the modified Ψ2 strategy seems to be a good alternative. It is interesting that the training strategy gives the best result, but one must also keep in mind that the training strategy is given optimal conditions, since training and solving are performed on the same problem. The difference between training and the other strategies is that training gives a constant relaxation parameter, where the other methods have adaptive relaxation parameters. We can therefore conclude that both a constant and an adaptive relaxation parameter can be a good choice.

For the ART methods we only have the strategy of finding an optimal relaxation parameter by using training; only for symmetric Kaczmarz are the Ψ1- and Ψ2-based relaxations defined. For this method we compare the result from figure 7.8 with figure 7.13. We notice that using the Ψ1-based relaxation we get the highest minimum relative error, obtained after 7 iterations. The number of iterations is therefore larger for the Ψ1-based relaxation than when using training to find an optimal value, where the minimum relative error was found after 2 iterations. Even though the constant relaxation strategy performs better, we must keep in mind that the Ψ-based relaxations give good results without the need of knowing the exact solution, which is required by the training strategy. For the remaining ART methods, where we only have the training strategy, one could wish that an adaptive strategy existed, since it seems to give better results.

As mentioned when the test problem was introduced, we have committed inverse crime when creating the test problem. To investigate the performance of the different relaxation strategies when the test problem is not created with inverse crime, we use the previously used SNARK test problem. Figure 7.15 illustrates the relative error histories for the different methods using the different relaxation strategies when the SNARK test problem is used. By looking at figure (a) we notice that for this test problem some methods perform better than others, meaning that the minimum relative error is smaller for some methods than for others. We notice that this is also the case for the other relaxation strategies. We also notice that the minimum relative error is almost the same whatever relaxation strategy is used. From our own results and the results in [16], which also use the SNARK test problems, we can conclude that for small noise levels line search is a very effective method, but for larger noise levels, where line search shows erratic behaviour, the Ψ-based relaxations are preferred, since the performance is almost equal but the dampening of the error is better.

We find it very interesting that the comparisons are the same for the two test problems when looking at the training strategy. Even though the training strategy did not seem to be a bad idea, the problem with this strategy is still that one must have a similar test problem to train on. The line search method is only defined for a few of the SIRT methods, while the Ψ-based relaxations seem to perform well on all SIRT methods, even though the theory is only valid for SIRT methods where T = I.

7.4 Stopping Rules

To complete the testing of the introduced strategies and methods, we also want to take a closer look at the performance of the different stopping rules introduced in chapter 5. Since two of the rules require training of a parameter, we start by looking at the performance of this training.


[Figure 7.16: The trained value of τ for different numbers of samples, for both the discrepancy principle (DP) and the monotone error rule (ME), using Cimmino's projection method.]

[Figure 7.17: The trained value of τ for different numbers of samples, for both the discrepancy principle (DP) and the monotone error rule (ME), using the DROP method.]

[Figure 7.18: The trained value of τ for different numbers of samples for the discrepancy principle (DP) using Kaczmarz's method.]

Training

As already mentioned, the stopping rules DP and ME require training of the parameter τ, but when training this parameter the user must select the number of samples s on which the parameter τ will be based. It therefore makes sense first to investigate the influence of the number of samples s.

Figure 7.16 illustrates the value of the trained parameter τ for different numbers of samples s using Cimmino's projection method. The blue circles denote the value of the parameter τ for DP and the red squares denote the value for ME. We see that, except when using only 10 samples, the trained values hardly vary. This is the case for both the DP and the ME parameter. Figure 7.17 illustrates the trained parameter τ for both DP and ME when using the DROP method. We notice that the behaviour from figure 7.16 repeats, and that only the value estimated from 10 samples varies a lot. We let the results from the DROP method and Cimmino's projection method be representative examples of the SIRT methods and conclude that using 15-20 samples is a good choice, since the running time also increases as s increases. Figure 7.18 illustrates the variation of the τ parameter for DP when using Kaczmarz's method. Using this representative example of the ART methods, we come to the same conclusion as for the SIRT methods: using 15-20 samples for the ART methods is a good choice.
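The stability study behind figures 7.16-7.18 could be reproduced with a sketch like the following, assuming a hypothetical helper train_tau that runs the training procedure from section 5.1.3 for a given number of samples:

    % Sketch (hypothetical helper train_tau): stability of tau versus s.
    svals = 10:10:50;
    tauDP = zeros(size(svals));
    for i = 1:numel(svals)
        tauDP(i) = train_tau(A, bbar, xbar, delta, svals(i), 'DP');
    end
    plot(svals, tauDP, 'o-'), xlabel('number of samples s'), ylabel('\tau')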


[Figure 7.19: Illustration of the stopping rules for the SIRT methods: (a) Landweber, (b) Cimmino's projection, (c) Cimmino's reflection, (d) CAV, (e) DROP, (f) SART. Each panel shows the minimum relative error and the stopping points for DP, ME and NCP.]
Figure 7.19: Illustration of the stopping rules <strong>for</strong> the SIRT methods.



Stopping index k∗          kopt    NCP     DP     ME
Landweber                   135     84     133    134†
Cimmino (projection)         73     67†      9     19
Cimmino (reflection)         73     67†      9     19
CAV                          74     66†      8     16
DROP                         72     67†      9     19
SART                         84     62†     37     48
Kaczmarz                      7      6†      5      -
Symmetric Kaczmarz            3      3†      2      -
Randomized Kaczmarz           6      5†      2      -

Table 7.2: The stopping index k∗ for all iterative methods. For each method the stopping rule which is closest to kopt is marked with †.

Figure 7.20: Illustration of the stopping rules for the ART methods. Each panel plots the relative error against the iteration number k, with markers for the minimum relative error and for the DP and NCP stopping indices: (a) Kaczmarz, (b) Symmetric Kaczmarz, (c) Randomized Kaczmarz.



Testing the Stopping Rules

Having determined the number of samples to use when training for the stopping rules DP and ME, we can observe the performance of the stopping rules on the different iterative methods. We again give the training method optimal conditions, since we train on and solve the same problem. For this test we use the built-in default relaxation parameter for each method.

For each method we solve the problem with only a maximum number of iterations as a stopping criterion. For all the iterations we compute the relative errors and find the minimum relative error. We then solve the same problem with each of the stopping rules and compare the result with the number of iterations for the minimum relative error. Table 7.2 contains the stopping index for each of the stopping rules for each method, together with the number of iterations kopt used to reach the minimum relative error. Figures 7.19 and 7.20 illustrate the relative error histories for the methods, and for each method it is marked where the stopping rules stopped the method.
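As a sketch of this measurement (the noisy paralleltomo problem and Cimmino's projection method with its default relaxation parameter are assumptions for illustration), the relative error history and the optimal stopping index kopt can be computed as follows:

[A b_ex x] = paralleltomo(50,0:5:179,150);
e = randn(size(b_ex)); e = e/norm(e);
b = b_ex + 0.05*norm(b_ex)*e;               % noisy right-hand side
X = cimminoProj(A,b,1:100);                 % store all 100 iterates as columns
relerr = sqrt(sum((X - repmat(x,1,100)).^2,1))/norm(x);
[minerr,kopt] = min(relerr);                % minimum relative error and its index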

We start by looking at the results for Landweber's method. From the table and from figure 7.19 (a) we see that for Landweber's method both the ME and the DP stopping rules are very close to the optimal stopping index. The NCP stopping rule stops after only 84 iterations. From the figure this can be explained by the behaviour of the relative errors, since the change in the relative error is very small after 80 iterations. This implies that even though both ME and DP are very close to the optimal stopping index, the result for NCP is not bad, since it stops after fewer iterations but with almost the same information.

For both of Cimmino's methods the NCP stopping rule is closest to the optimal stopping index. For this example the DP stopping rule allows only 9 iterations, and from figure 7.19 (b) and (c) we notice that the relative errors after this point are still decreasing significantly. The ME stopping rule allows 19 iterations, which is just before the relative errors start to level out. The same behaviour is seen for both the CAV method (figure (d)) and the DROP method (figure (e)).

For the SART method NCP again gives the stopping index closest to the optimal stopping index kopt, but in this case both DP and ME are close to the point on the relative error history where the error levels out. This means that for this method the different stopping rules return solutions of almost equal quality with respect to the error.

A conclusion on the stopping rules for the SIRT methods, based on the table and the figure, must be that the only stopping rule that gives really bad results is DP, and only for some of the SIRT methods. According to this small test a safe choice of stopping rule is the NCP, since it always stops the iterations after the relative error has leveled off. An advantage of the NCP method is also that it does not require any knowledge of the problem. Both DP and ME require training, where information about the noise level must be known, and in this test we gave them optimal conditions for determining the stopping index, since the training problem is the same as the problem being solved. Despite this advantage they perform more poorly than the NCP method.

Figure 7.20 illustrates the relative error histories for the ART methods. For the ART methods we can only use DP and NCP. We first look at Kaczmarz's method in figure (a). From this figure we notice that the NCP stopping rule is closest to the optimal stopping index kopt, but both stopping rules have reached almost the same level of relative error as at the kopt index. Figure (b) illustrates the relative error histories for symmetric Kaczmarz. In this case the NCP stopping index is actually the same as kopt, while the DP index is only one iteration smaller with the error at almost the same level as at the optimal index. Figure (c) illustrates the relative error histories for randomized Kaczmarz, and in this case the DP stopping index is closest to the optimal index kopt. The NCP index is in this case a bad stopping index, since the errors have not yet reached the point where they level out. For the ART methods we conclude that in most cases the NCP stopping rule is the most effective, but the DP stopping rule is not a bad choice. Again we must keep in mind that DP was given optimal conditions and that it requires training and knowledge of the noise level.

7.5 Relaxation Strategies Combined with Stopping Rules

We have earlier tested the performance of the stopping rules and of the strategies for determining the relaxation parameter separately, and will now investigate the performance when the strategies are used together.

Relaxation to Control Noise Propagation with Stopping Rules

Since we concluded in section 7.3 that the relaxation strategies Ψ1 and Ψ2 were good choices of relaxation strategy, we test the performance when using the modified Ψ2 strategy and the stopping rules together.



Figure 7.21: The modified Ψ2 relaxation strategy combined with the stopping rules: (a) the modified Ψ2 strategy for Cimmino's projection method, (b) the modified Ψ2 strategy for the DROP method. Each panel plots the relative error against the iteration number k, with markers for the minimum relative error and for the DP, ME and NCP stopping indices.

Figure 7.21 illustrates the relative error histories for Cimmino's projection method and for the DROP method; the stopping indices for the stopping rules are indicated by different markers. From this figure we notice that even though we allow 1000 iterations, the minimum of the relative error is not reached, since the strategy dampens the noise error and hence the semi-convergence behaviour. This implies that NCP does not stop the iterations, and that DP and ME stop after only a few iterations, where we clearly see that the relative error has not yet come close to the level reached after 1000 iterations. Since we have earlier shown that the Ψ-based relaxations are good choices of relaxation strategy, we conclude that a stopping rule is needed that can find an appropriate stopping index for these strategies.
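A sketch of the setup behind figure 7.21, using the options fields documented in the manual pages (the test problem A, b is assumed built as in the examples there):

x0 = zeros(size(A,2),1);                    % the documented default starting vector
options.lambda = 'psi2mod';                 % modified Psi_2-based relaxation
options.stoprule.type = 'NCP';
[X info] = cimminoProj(A,b,1:1000,x0,options);
info(1)    % 0 when, as observed above, NCP never fires and all 1000 iterations are used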

Line Search with Stopping Rules

Since line search turned out to give good results when the noise level is low, we are interested in investigating whether the introduced stopping rules can be used together with line search. Figure 7.22 illustrates the relative error histories and the stopping indices for all stopping rules on the SIRT methods for which line search is defined. We notice that for all four methods the NCP stopping rule stops the iterations too early, since there is still a significant decay after the NCP stopping index. On the other hand, both DP and ME give very bad results, since they stop the iterations either well before or well after the optimal index. We conclude that with this relaxation strategy none of the stopping rules gives satisfactory results, which could be caused by the zigzagging behaviour described earlier.



Figure 7.22: Illustration of the stopping rules for the SIRT methods using line search. Each panel plots the relative error against the iteration number k, with markers for the minimum relative error and for the DP, ME and NCP stopping indices: (a) Landweber, (b) Cimmino's projection, (c) Cimmino's reflection, (d) CAV.



Total Work Units           Stopping index k∗     WU
Landweber                          69∗           138
Cimmino (projection)               36             72
Cimmino (reflection)               36             72
CAV                                 7∗            14
DROP                               36             72
SART                               29∗            58
Kaczmarz                            3∗            12
Symmetric Kaczmarz                  2             16
Randomized Kaczmarz                 6∗            24

Table 7.3: The ∗ denotes that the stopping rule DP is used, while all other methods use NCP.

Comparing the Performance of SIRT and ART

We want to compare the performance of the SIRT and the ART methods, and to give the methods equal opportunity to perform well we use the training strategy for the relaxation parameter, since we showed in section 7.3 that all methods are almost equally good when the relaxation parameter is trained. Figures 7.23 and 7.24 illustrate the relative error histories for the SIRT and the ART methods. For each method the stopping indices for the different stopping rules are marked. We notice from figure 7.23 that the NCP method does not work well for the Landweber method, the CAV method and the SART method. For Landweber we notice that the DP stopping rule is very close to the minimum relative error, and we will therefore use the DP rule for Landweber. For the CAV method none of the stopping rules returns satisfactory results, but the DP rule is the closest. For the SART method we also choose DP, since it returns a result with fewer iterations than the minimum relative error but with the error at almost the same level. For the rest of the SIRT methods we choose NCP as the stopping rule, since NCP stops the iterations for these methods when the relative errors level out, and the information gained by iterating further is very small. As mentioned earlier, NCP is also the easiest stopping rule to use, since it does not require training or knowledge of the noise level. For the ART methods we choose DP for Kaczmarz, NCP for symmetric Kaczmarz and DP for randomized Kaczmarz. In all the mentioned choices we have chosen the stopping rule which is closest to the minimum relative error with the error at the same level.

To compare the SIRT and the ART methods we recall the work unit WU introduced in section 3.3. The total work of a method is then the number of used iterations multiplied by the work units per iteration for the given method. Table 7.3 shows the chosen stopping index and the total work of each method.
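As a small sketch of this bookkeeping (the per-iteration WU values are defined in section 3.3; the factor 2 used here for a SIRT method is only inferred from Table 7.3):

x0 = zeros(size(A,2),1);
options.stoprule.type = 'NCP';
[X info] = cimminoProj(A,b,[],x0,options);   % stopped by the NCP rule
totalWork = info(2)*2;                       % info(2) = number of used iterations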



Figure 7.23: Illustration of the stopping rules for the SIRT methods with a trained value of λ. Each panel plots the relative error against the iteration number k, with markers for the minimum relative error and for the DP, ME and NCP stopping indices: (a) Landweber, λ = 0.00052968; (b) Cimmino's projection, λ = 244.6826; (c) Cimmino's reflection, λ = 122.3413; (d) CAV, λ = 2.2216; (e) DROP, λ = 2.1673; (f) SART, λ = 1.8541.



Figure 7.24: Illustration of the stopping rules for the ART methods with a trained value of λ. Each panel plots the relative error against the iteration number k, with markers for the minimum relative error and for the DP and NCP stopping indices: (a) Kaczmarz, λ = 0.43769; (b) Symmetric Kaczmarz, λ = 0.32624; (c) Randomized Kaczmarz, λ = 1.



From these results we clearly see that the ART methods use fewer work units to obtain a solution of the same quality as the SIRT methods. Only the CAV method uses almost the same amount of work units, but we also recall that for this method the quality of the solution is not as good as for the other methods.

In this package the SIRT methods have an advantage, since the implementation is done in MATLAB, where the structure of the SIRT methods can be exploited to speed up the running time; this, however, is specific to MATLAB implementations.

The good performance of the ART methods presents a dilemma. Through the project we have experienced that the theory and understanding of the SIRT methods are better developed than for ART; in particular, we do not have theory for semi-convergence and adaptive relaxation strategies for the ART methods. The experiments in this chapter have shown that ART produced the fastest solutions, but by choosing the relaxation parameter adaptively the SIRT methods could produce just as accurate solutions without the need for training, although this requires more computational work. For future work one could hope for an adaptive method for the ART methods, such that they could produce results without the need for training. One could also hope for a stopping rule for the SIRT methods with adaptive relaxation parameter that is able to stop the iterations when the curve of the relative errors starts to level out, since this would minimize the computational work for these methods. In general one could hope for a better stopping rule, since our results through this chapter have shown that all the known stopping rules are unstable in finding the optimal stopping index.




Chapter 8

Manual Pages

ITERATIVE SIRT METHODS

cav            Component Averaging (CAV) iterative method
cimminoProj    Cimmino's iterative projection method
cimminoRefl    Cimmino's iterative reflection method
drop           Diagonally Relaxed Orthogonal Projections (DROP) iterative method
landweber      The Classical Landweber iterative method
sart           The Simultaneous Algebraic Reconstruction Technique (SART) iterative method

ITERATIVE ART METHODS

kaczmarz       Kaczmarz's iterative method, also known as the algebraic reconstruction technique (ART)
randkaczmarz   The randomized Kaczmarz iterative method
symkaczmarz    The symmetric Kaczmarz iterative method



TRAINING ROUTINES

trainDPME        Training strategy to estimate the best parameter when the discrepancy principle or the monotone error rule is used as stopping rule
trainLambdaART   Training strategy to find the best constant relaxation parameter λ for a given ART method
trainLambdaSIRT  Training strategy to find the best constant relaxation parameter λ for a given SIRT method

TEST PROBLEMS

fanbeamtomo      Creates a two-dimensional fan beam tomography test problem
paralleltomo     Creates a two-dimensional parallel beam tomography test problem
seismictomo      Creates a two-dimensional seismic tomography test problem

DEMO ROUTINES

ARTdemo          Demo illustrating the simple use of the ART methods
SIRTdemo         Demo illustrating the simple use of the SIRT methods
trainingdemo     Demo illustrating the use of the training routines and the subsequent use of the SIRT and ART methods

AUXILIARY ROUTINES

calczeta         Calculates the roots of a specific polynomial g(z) of degree k


The Demo functions

This MATLAB package includes three demo functions which illustrate the use of the remaining functions in the package.

The demo function ARTdemo illustrates the use of the ART methods kaczmarz, symkaczmarz and randkaczmarz. First the demo function creates a parallel beam tomography test problem using the test problem paralleltomo. Noise is then added to the right-hand side, and the noisy problem is solved using the ART methods with 10 iterations. The result is shown as four images, where one contains the exact solution and the remaining images illustrate the solutions found using the three ART methods.

The demo function SIRTdemo illustrates the use of the SIRT methods landweber, cimminoProj, cimminoRefl, cav, drop, and sart. First the demo function creates a parallel beam tomography test problem using the test problem paralleltomo. Noise is then added to the right-hand side, and the noisy problem is solved using the SIRT methods with 50 iterations. The result is shown as seven images, where one contains the exact solution and the remaining images illustrate the solutions found using the six SIRT methods.

The demo function trainingdemo illustrates the use of the training functions trainLambdaART, trainLambdaSIRT, and trainDPME, followed by solving with an ART or a SIRT method. In this demo the SIRT method used is cimminoProj and the ART method used is kaczmarz. First the demo function creates a parallel beam tomography test problem using the test problem paralleltomo, and noise is added to the right-hand side. Then the training strategy trainLambdaSIRT is used to find the relaxation parameter for cimminoProj, and trainLambdaART is used to find the relaxation parameter for kaczmarz. With this information the stopping parameter is found for each of the methods, where cimminoProj uses the ME stopping rule and kaczmarz uses the DP stopping rule. After this the problem is solved with the specified relaxation parameter and stopping rule. The result is shown as three images, where one contains the exact image and the remaining images illustrate the found solutions.
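A sketch of this workflow follows. The argument lists of the two training calls below are assumptions for illustration only; the authoritative signatures are given on their respective manual pages:

[A b_ex x] = paralleltomo(50,0:5:179,150);
e = randn(size(b_ex)); e = e/norm(e);
delta = 0.05*norm(b_ex);                    % noise level
b = b_ex + delta*e;                         % noisy right-hand side
lambda = trainLambdaSIRT(A,b_ex,x,'cimminoProj');        % assumed call
tau = trainDPME(A,b_ex,x,'ME',delta,20,'cimminoProj');   % assumed call, s = 20 samples
x0 = zeros(size(A,2),1);                    % the documented default starting vector
options.lambda = lambda;
options.stoprule.type = 'ME';
options.stoprule.taudelta = tau*delta;      % taudelta = tau*delta, as documented
[X info] = cimminoProj(A,b,[],x0,options);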




calczeta

Purpose:

Calculates the roots of a specific polynomial g(z) of degree k.

Synopsis:

z = calczeta(k)

Description:

This function calculates, by means of Newton-Raphson's method and Horner's rule, the unique root in the interval (0, 1) of the polynomial of degree k:

    g(z) = (2k − 1)z^{k−1} − (z^{k−2} + ... + z + 1) = 0.

The input k can be given as either a scalar or a vector, and the corresponding root or roots are returned in the output z.

The function calczeta is used in the functions cav, cimminoProj, cimminoRefl, drop, landweber, sart and symkaczmarz.

Algorithm:

See appendix A.2 for a further description of the algorithm used.

Examples:

Calculate the roots for the degrees 2 up to 100 and plot the found roots:

k = 2:100;
z = calczeta(k);
figure, plot(k,z,'bo')


See also:

cav, cimminoProj, cimminoRefl, drop, landweber, sart, symkaczmarz.

References:

1. See appendix A.2.

2. L. Eldén, L. Wittmeyer-Koch and H. B. Nielsen, Introduction to Numerical Computation, Studentlitteratur AB, 2004.



cav

Purpose:

Component Averaging (CAV) iterative method.

Synopsis:

[X info restart] = cav(A,b,K)
[X info restart] = cav(A,b,K,x0)
[X info restart] = cav(A,b,K,x0,options)

Algorithm:

For arbitrary x^0 ∈ R^n the algorithm for cav takes the following form:

    x^{k+1} = x^k + λ_k A^T D_S (b − A x^k),

where D_S = diag(w_1 / Σ_{j=1}^{n} s_j a_{1j}^2, ..., w_m / Σ_{j=1}^{n} s_j a_{mj}^2), S = diag(s_1, ..., s_n), and s_j is the number of nonzero elements in column j of A.
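For reference, a minimal sketch of this iteration (a dense A, unit weights w_i = 1 and a fixed relaxation parameter lambda are assumptions for illustration):

s = full(sum(A~=0,1));         % s_j: number of nonzeros in column j of A
dS = 1./((A.^2)*s');           % diagonal of D_S, entry i = 1/sum_j s_j*a_ij^2
x = zeros(size(A,2),1);
for k = 1:50
    x = x + lambda*(A'*(dS.*(b - A*x)));
end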

Description:

The function implements the Component Averaging (CAV) iterative method for solving the linear system Ax = b. The starting vector is x0; if no starting vector is given, then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options, either as a constant or as a string that determines the method to compute lambda. As default lambda is set to 1/σ̃_1^2, where σ̃_1 is an estimate of the largest singular value of D_S^{1/2} A.

The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, 2 denotes that the DP-rule stopped the iterations and 3 denotes that the ME-rule stopped the iterations. The second element in info is the number of used iterations.

The struct restart, which can be given as output, contains in the field s1 the estimated largest singular value. restart also returns a vector containing the diagonal of the matrix D_S in the field M and an empty vector in the field T. The struct restart can also be given as input in the struct options, such that the program does not have to recompute the contained values. We recommend using this only if the user has good knowledge of MATLAB and is completely sure of the use of restart as input.

Use of options:

The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2/σ_1^2. A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'linesearch', where the method linesearch is used to compute the value for λ_k in each iteration using (4.11) from section 4.2.
  - options.lambda = 'psi1', where the method psi1 computes the values for λ_k using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi1mod', where the method psi1mod computes the values for λ_k using the modified Ψ1-based relaxation (4.16) with τ1 = 2 from section 4.2. The parameter τ1 can be changed in line 353 in the code.
  - options.lambda = 'psi2', where the method psi2 computes the values for λ_k using the Ψ2-based relaxation (4.15) from section 4.2.
  - options.lambda = 'psi2mod', where the method psi2mod computes the values for λ_k using the modified Ψ2-based relaxation (4.17) with τ2 = 1.5 from section 4.2. The parameter τ2 can be changed in line 400 in the code.



- options.restart
  - options.restart.M = a vector with the diagonal of D_S.
  - options.restart.s1 = σ̃_1, where σ̃_1 is the estimated largest singular value of D_S^{1/2} A.

- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'NONE', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is the default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k∗ is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index k∗ is determined according to the discrepancy principle (DP) described in section 5.1.
    - options.stoprule.type = 'ME', where the stopping index k∗ is determined according to the monotone error rule (ME) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule types DP and ME.

- options.w
  - options.w = w, where w is an m-dimensional vector.

Examples:

Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 50 cav iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = cav(A,b,1:50);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off
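A further sketch (not from the thesis) combining the documented options fields: stop cav by the discrepancy principle, with a user-chosen τ = 1.02 and the noise level δ taken from the noise model above:

delta = 0.05*norm(b);                    % approximately the noise level in the example above
x0 = zeros(size(A,2),1);                 % the documented default starting vector
options.lambda = 'psi2';                 % adaptive Psi_2-based relaxation
options.stoprule.type = 'DP';
options.stoprule.taudelta = 1.02*delta;
[X info] = cav(A,b,[],x0,options);       % K = [ ], so X holds the last iterate only
info(1)                                  % 2 if the DP-rule stopped the iterations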


See also:

cimminoProj, cimminoRefl, drop, landweber, sart.

References:

1. Y. Censor, D. Gordon and R. Gordon, Component averaging: An efficient iterative parallel algorithm for large sparse unstructured problems, Parallel Computing, 27 (2001), p. 777-808.



cimminoProj

Purpose:

Cimmino's iterative projection method.

Synopsis:

[X info restart] = cimminoProj(A,b,K)
[X info restart] = cimminoProj(A,b,K,x0)
[X info restart] = cimminoProj(A,b,K,x0,options)

Algorithm:

For arbitrary x^0 ∈ R^n the algorithm for cimminoProj takes the following form:

    x^{k+1} = x^k + λ_k A^T M (b − A x^k),

where M = diag(w_i / (m ‖A(i,:)‖_2^2)) for i = 1, ..., m.

Description:

The function implements Cimmino's iterative projection method for solving linear systems Ax = b. The starting vector is x0; if no starting vector is given, then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options, either as a constant or as a string that determines the method to compute lambda. As default lambda is set to 1/σ̃_1^2, where σ̃_1 is an estimate of the largest singular value of M^{1/2} A.



The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, 2 denotes that the DP-rule stopped the iterations and 3 denotes that the ME-rule stopped the iterations. The second element in info is the number of used iterations.

The struct restart, which can be given as output, contains in the field s1 the estimated largest singular value. restart also returns a vector containing the diagonal of the matrix M in the field M and an empty vector in the field T. The struct restart can also be given as input in the struct options, such that the program does not have to recompute the contained values. We recommend using this only if the user has good knowledge of MATLAB and is completely sure of the use of restart as input.

Use of options:

The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2/σ_1^2. A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'linesearch', where the method linesearch is used to compute the value for λ_k in each iteration using (4.11) from section 4.2.
  - options.lambda = 'psi1', where the method psi1 computes the values for λ_k using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi1mod', where the method psi1mod computes the values for λ_k using the modified Ψ1-based relaxation (4.16) with τ1 = 2 from section 4.2. The parameter τ1 can be changed in line 362 in the code.
  - options.lambda = 'psi2', where the method psi2 computes the values for λ_k using the Ψ2-based relaxation (4.15) from section 4.2.
  - options.lambda = 'psi2mod', where the method psi2mod computes the values for λ_k using the modified Ψ2-based relaxation (4.17) with τ2 = 1.5 from section 4.2. The parameter τ2 can be changed in line 409 in the code.

- options.restart
  - options.restart.M = a vector with the diagonal of M.
  - options.restart.s1 = σ̃_1, where σ̃_1 is the estimated largest singular value of M^{1/2} A.

- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is the default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k∗ is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index k∗ is determined according to the discrepancy principle (DP) described in section 5.1.
    - options.stoprule.type = 'ME', where the stopping index k∗ is determined according to the monotone error rule (ME) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule types DP and ME.

- options.w
  - options.w = w, where w is an m-dimensional vector.

Examples:

Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 50 cimminoProj iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = cimminoProj(A,b,1:50);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off
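A further sketch (not from the thesis): reuse the restart output on a longer second run, so that the largest singular value and the diagonal of M are not recomputed (recommended only for experienced users, as noted above):

[X info restart] = cimminoProj(A,b,1:50);
x0 = zeros(size(A,2),1);      % the documented default starting vector
options.restart = restart;    % hand the precomputed values back in
X2 = cimminoProj(A,b,1:100,x0,options);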

See also:

cav, cimminoRefl, drop, landweber, sart.


References:

1. G. Cimmino, Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari, La Ricerca Scientifica, XVI, Series II, Anno IX, 1 (1938), p. 326-333.



cimminoRefl

Purpose:

Cimmino's iterative reflection method.

Synopsis:

[X info restart] = cimminoRefl(A,b,K)
[X info restart] = cimminoRefl(A,b,K,x0)
[X info restart] = cimminoRefl(A,b,K,x0,options)

Algorithm:

For arbitrary x^0 ∈ R^n the algorithm for cimminoRefl takes the following form:

    x^{k+1} = x^k + λ_k A^T M (b − A x^k),

where M = diag(2 w_i / (m ‖A(i,:)‖_2^2)) for i = 1, ..., m.

Description:

The function implements Cimmino's iterative reflection method for solving linear systems Ax = b. The starting vector is x0; if no starting vector is given, then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options, either as a constant or as a string that determines the method to compute lambda. As default lambda is set to 1/σ̃_1^2, where σ̃_1 is an estimate of the largest singular value of M^{1/2} A.



The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, 2 denotes that the DP-rule stopped the iterations and 3 denotes that the ME-rule stopped the iterations. The second element in info is the number of used iterations.

The struct restart, which can be given as output, contains in the field s1 the estimated largest singular value. restart also returns a vector containing the diagonal of the matrix M in the field M and an empty vector in the field T. The struct restart can also be given as input in the struct options, such that the program does not have to recompute the contained values. We recommend using this only if the user has good knowledge of MATLAB and is completely sure of the use of restart as input.

Use of options:

The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2/σ_1^2. A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'linesearch', where the method linesearch is used to compute the value for λ_k in each iteration using (4.11) from section 4.2.
  - options.lambda = 'psi1', where the method psi1 computes the values for λ_k using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi1mod', where the method psi1mod computes the values for λ_k using the modified Ψ1-based relaxation (4.16) with τ1 = 2 from section 4.2. The parameter τ1 can be changed in line 356 in the code.
  - options.lambda = 'psi2', where the method psi2 computes the values for λ_k using the Ψ2-based relaxation (4.15) from section 4.2.
  - options.lambda = 'psi2mod', where the method psi2mod computes the values for λ_k using the modified Ψ2-based relaxation (4.17) with τ2 = 1.5 from section 4.2. The parameter τ2 can be changed in line 403 in the code.

- options.restart
  - options.restart.M = a vector with the diagonal of M.
  - options.restart.s1 = σ̃_1, where σ̃_1 is the estimated largest singular value of M^{1/2} A.

- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is the default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k∗ is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index k∗ is determined according to the discrepancy principle (DP) described in section 5.1.
    - options.stoprule.type = 'ME', where the stopping index k∗ is determined according to the monotone error rule (ME) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule types DP and ME.

- options.w
  - options.w = w, where w is an m-dimensional vector.

Examples:

Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 50 cimminoRefl iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = cimminoRefl(A,b,1:50);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:

cav, cimminoProj, drop, landweber, sart.


References:

1. G. Cimmino, Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari, La Ricerca Scientifica, XVI, Series II, Anno IX, 1 (1938), p. 326-333.



drop

Purpose:

Diagonally Relaxed Orthogonal Projections (DROP) iterative method.

Synopsis:

[X info restart] = drop(A,b,K)
[X info restart] = drop(A,b,K,x0)
[X info restart] = drop(A,b,K,x0,options)

Algorithm:

For arbitrary x^0 ∈ R^n the algorithm for the drop method takes the following form:

    x^{k+1} = x^k + λ_k S^{−1} A^T D (b − A x^k),

where S^{−1} = diag(s_j^{−1}), s_j is the number of nonzero elements in column j of A, and D = diag(w_i / ‖A(i,:)‖_2^2) for i = 1, ..., m.

Description:

The function implements the Diagonally Relaxed Orthogonal Projections (DROP) iterative method for solving the linear system Ax = b. The starting vector is x0; if no starting vector is given, then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

struct options. If K is empty a stopping rule must be specified.<br />

The relaxation parameter is given in the field lambda in the struct options,<br />

either as a constant or as a string that determines the method to compute


115<br />

lambda. As default lambda is set to 1/ˆρ, where ˆρ is an estimate of the spectial<br />

radius of S −1 A T DA.<br />

The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, 2 denotes that the DP-rule stopped the iterations and 3 denotes that the ME-rule stopped the iterations. The second element in info is the number of used iterations.

The struct restart, which can be given as output, contains in the field s1 the estimated largest singular value. restart also returns a vector containing the diagonal of the matrix D in the field M and the diagonal of the matrix S in the field T. The struct restart can also be given as input in the struct options, such that the program does not have to recompute the contained values. We recommend using this only if the user has good knowledge of MATLAB and is completely sure of the use of restart as input.

Use of options:

The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2/ρ. A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'psi1', where the method psi1 computes the values for λ_k using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi1mod', where the method psi1mod computes the values for λ_k using the modified Ψ1-based relaxation (4.16) with τ1 = 2 from section 4.2. The parameter τ1 can be changed in line 350 in the code.
  - options.lambda = 'psi2', where the method psi2 computes the values for λ_k using the Ψ2-based relaxation (4.15) from section 4.2.
  - options.lambda = 'psi2mod', where the method psi2mod computes the values for λ_k using the modified Ψ2-based relaxation (4.17) with τ2 = 1.5 from section 4.2. The parameter τ2 can be changed in line 397 in the code.

- options.restart
  - options.restart.M = a vector containing the diagonal of D.
  - options.restart.T = a vector containing the diagonal of S^{−1}.
  - options.restart.s1 = σ̂_1, where σ̂_1 = √ρ̂.

- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is the default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k∗ is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index is determined according to the discrepancy principle (DP) described in section 5.1.
    - options.stoprule.type = 'ME', where the stopping index is determined according to the monotone error rule (ME) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule types DP and ME.

- options.w
  - options.w = w, where w is an m-dimensional vector.

Examples:

Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 50 drop iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = drop(A,b,1:50);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off
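A further sketch (not from the thesis) combining the documented options fields: the Ψ1-based relaxation together with the NCP stopping rule, which requires no knowledge of the noise level:

x0 = zeros(size(A,2),1);      % the documented default starting vector
options.lambda = 'psi1';
options.stoprule.type = 'NCP';
[X info] = drop(A,b,[],x0,options);
k_used = info(2);             % number of iterations actually used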


See also:

cav, cimminoProj, cimminoRefl, landweber, sart.

References:

1. Y. Censor, T. Elfving, G. Herman and T. Nikazad, On diagonally relaxed orthogonal projection methods, SIAM J. Sci. Comput., 30 (2007/08), p. 473-504.



fanbeamtomo

Purpose:

Creates a two-dimensional fan beam tomography test problem.

Synopsis:

[A b x theta p R w] = fanbeamtomo(N)
[A b x theta p R w] = fanbeamtomo(N,theta)
[A b x theta p R w] = fanbeamtomo(N,theta,p)
[A b x theta p R w] = fanbeamtomo(N,theta,p,R)
[A b x theta p R w] = fanbeamtomo(N,theta,p,R,w)
[A b x theta p R w] = fanbeamtomo(N,theta,p,R,w,isDisp)

Description:

This function creates a two-dimensional tomography test problem using fan beams. A 2-dimensional domain is divided into N equally spaced intervals in both dimensions, creating N^2 cells. For each specified angle theta, in degrees, a source is located at distance R·N from the center of the domain. From each source, p equiangular rays penetrate the domain with a span of w between the first and the last ray. The default value for the angles is theta = 0:359. The number of rays p has the default value round(sqrt(2)*N). The distance from the center of the domain to the sources is given in units of side lengths, and the default value of R is 2. The default value of the span w is calculated such that from (0,RN) the first ray hits the point (-N/2,N/2) and the last hits (N/2,N/2). If the input isDisp is different from 0 then the function also creates an illustration of the problem with the used angles, rays etc. As default isDisp is 0.

The function returns a coefficient matrix A with dimension nA·p × N^2, where nA is the number of used angles, the right-hand side b, and the phantom head reshaped as a vector x. The figure below illustrates the phantom head for N = 100. In case default values are used, the function also returns the used angles theta, the number of rays used for each angle p, the used distance R from the source to the center of the domain given in side lengths, and the used span of the rays w.
of the rays w.


Algorithm:

The element a_ij is defined as the length of the i'th ray through the j'th cell, with a_ij = 0 if ray i does not go through cell j. The exact solution of the head phantom is reshaped as a vector, and the i'th element of the right-hand side b is

    b_i = Σ_{j=1}^{N^2} a_{ij} x_j,   i = 1, ..., nA·p.

For further information see chapter 6.
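A quick sanity check of this relation (a sketch): the noise-free right-hand side is the forward projection of the phantom, b = A*x:

[A b x] = fanbeamtomo(64);
norm(b - A*x)/norm(b)   % should be at rounding-error level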

Examples:

Create a test problem and visualize the solution:

N = 64; theta = 0:5:359; p = 2*N; R = 2;
[A b x] = fanbeamtomo(N,theta,p,R);
imagesc(reshape(x,N,N))
colormap gray, axis image off

See also:

paralleltomo, seismictomo.

References:

1. A. C. Kak and M. Slaney, Principles of Computerized Tomographic Imaging, SIAM, 2001.

[Figure: Shepp-Logan phantom, N = 100.]



kaczmarz

Purpose:

Kaczmarz's iterative method, also known as the algebraic reconstruction technique (ART).

Synopsis:

[X info] = kaczmarz(A,b,K)
[X info] = kaczmarz(A,b,K,x0)
[X info] = kaczmarz(A,b,K,x0,options)

Algorithm:

For arbitrary x^0 ∈ R^n the algorithm kaczmarz takes the following form:

    x^{k,0} = x^k
    x^{k,i} = x^{k,i−1} + λ_k ((b_i − ⟨a^i, x^{k,i−1}⟩)/‖a^i‖_2^2) a^i,   i = 1, ..., m
    x^{k+1} = x^{k,m},

where a^i denotes the i'th row of A.

Description:

The function implements Kaczmarz's iterative method for solving the linear system Ax = b. The starting vector is x0; if no starting vector is given, then x0 = 0 is used.
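For reference, a minimal sketch of one sweep of the update above (a dense A and a fixed relaxation parameter are assumptions for illustration):

x = zeros(size(A,2),1);    % starting vector (the documented default)
lambda = 0.25;             % the documented default relaxation parameter
for i = 1:size(A,1)
    ai = A(i,:);                                  % i'th row of A
    x = x + lambda*((b(i) - ai*x)/(ai*ai'))*ai';  % project towards hyperplane i
end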

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options as a constant. As default lambda is set to 0.25.



The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations and 2 denotes that the DP-rule stopped the iterations. The second element in info is the number of used iterations.

Use of options:

The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2. A warning is given if this requirement is estimated to be violated.

- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is the default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k∗ is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index is determined according to the discrepancy principle (DP) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule type DP.

Examples:

Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 10 kaczmarz iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = kaczmarz(A,b,1:10);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:

randkaczmarz, symkaczmarz.

References:

1. S. Kaczmarz, Angenäherte Auflösung von Systemen linearer Gleichungen, Bulletin de l'Académie Polonaise des Sciences et Lettres, A35 (1937), p. 355-357.


landweber

Purpose:

The Classical Landweber iterative method.

Synopsis:

[X info restart] = landweber(A,b,K)
[X info restart] = landweber(A,b,K,x0)
[X info restart] = landweber(A,b,K,x0,options)

Algorithm:

For arbitrary x^0 ∈ R^n the algorithm for landweber takes the following form:

    x^{k+1} = x^k + λ_k A^T (b − A x^k).

Description:

The function implements the Classical Landweber iterative method <strong>for</strong> solving<br />

the linear system Ax= b. The starting vector is x0; if no starting vector is given<br />

then x0 = 0 is used.<br />

The numbers given in the vector K are iteration numbers, that specify which<br />

iterations are stored in the output matrix X. If a stopping rule is selected (see<br />

below) and K = [ ], then X contains the last iterate only.<br />

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options, either as a constant or as a string that determines the method to compute lambda. As default lambda is set to 1/σ̃1², where σ̃1 is an estimate of the largest singular value of A.
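To make the update concrete, here is a minimal sketch of the plain Landweber iteration with a fixed relaxation parameter (illustrative only; the package's landweber function adds stopping rules, relaxation strategies and the restart mechanism):

[A b x] = paralleltomo(50,0:5:179,150);
lambda = 1/normest(A)^2;              % fixed lambda from an estimate of sigma_1
xk = zeros(size(A,2),1);              % starting vector x0 = 0
for k = 1:50
    xk = xk + lambda*(A'*(b - A*xk)); % x^{k+1} = x^k + lambda*A^T(b - A*x^k)
end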

The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, 2 denotes that the DP-rule stopped the iterations and 3 denotes that the ME-rule stopped the iterations. The second element in info is the number of used iterations.

The struct restart, which can be given as output, contains in the field s1 the estimated largest singular value. restart also returns an empty vector in both the fields M and T. The struct restart can also be given as input in the struct options, such that the program does not have to recompute the contained values. We recommend using this only if the user has good knowledge of MATLAB and is completely sure of the use of restart as input.

Use of options:
The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2/σ̃1². A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'linesearch', where the method linesearch is used to compute the value for λk in each iteration using (4.11) from section 4.2.
  - options.lambda = 'psi1', where the method psi1 computes the values for λk using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi1mod', where the method psi1mod computes the values for λk using the modified Ψ1-based relaxation (4.16) with τ1 = 2 from section 4.2. The parameter τ1 can be changed in line 299 in the code.
  - options.lambda = 'psi2', where the method psi2 computes the values for λk using the Ψ2-based relaxation (4.15) from section 4.2.
  - options.lambda = 'psi2mod', where the method psi2mod computes the values for λk using the modified Ψ2-based relaxation (4.17) with τ2 = 1.5 from section 4.2. The parameter τ2 can be changed in line 344 in the code.
- options.restart
  - options.restart.s1 = σ̂1, where σ̂1 is the estimated largest singular value of A.
- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k* is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index k* is determined according to the discrepancy principle (DP) described in section 5.1.
    - options.stoprule.type = 'ME', where the stopping index k* is determined according to the monotone error rule (ME) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule types DP and ME.

Examples:
We generate a "noisy" 50 × 50 parallel beam tomography problem, compute 50 landweber iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = landweber(A,b,1:50);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
cav, cimminoProj, cimminoRefl, drop, sart.

References:
1. L. Landweber, An iteration formula for Fredholm integral equations of the first kind, American Journal of Mathematics, 73 (1951), p. 615-624.


paralleltomo

Purpose:
Creates a two-dimensional parallel beam tomography test problem.

Synopsis:
[A b x theta p w] = paralleltomo(N)
[A b x theta p w] = paralleltomo(N,theta)
[A b x theta p w] = paralleltomo(N,theta,p)
[A b x theta p w] = paralleltomo(N,theta,p,w)
[A b x theta p w] = paralleltomo(N,theta,p,w,isDisp)

Description:

This function creates a two-dimensional tomography test problem using parallel beams. A two-dimensional domain is divided into N equally spaced intervals in both dimensions, creating N² cells. For each specified angle theta in degrees, p parallel rays, arranged symmetrically around the center of the domain such that the width from the first to the last ray is w, penetrate the domain. The default values for the angles are theta = 0:179. The number of rays p has the default value round(√2·N). The default value of the width w between the first and the last ray is √2·N. If the input isDisp is different from 0 then the function also creates an illustration of the problem with the used angles and rays etc. As default isDisp is 0.

The function returns a coefficient matrix A with the dimension n_A·p × N², where n_A is the number of used angles, the right-hand side b, and the phantom head reshaped as a vector x. The figure below illustrates the phantom head for N = 100. In case the default values are used, the function also returns the used angles theta, the number of used rays for each angle p, and the used width of the rays w.

Algorithm:
The element a_ij is defined as the length of the i'th ray through the j'th cell, with a_ij = 0 if ray i does not go through cell j. The exact solution of the head phantom is reshaped as a vector, and the i'th element of the right-hand side b_i is
\[ b_i = \sum_{j=1}^{N^2} a_{ij} x_j, \qquad i = 1, \ldots, n_A \cdot p. \]
For further information see chapter 6.
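Since b is generated from the phantom by this sum, a generated problem can be sanity-checked directly (a quick check, not part of the package):

[A b x] = paralleltomo(50,0:5:179,150);
norm(A*x - b)    % should be zero up to rounding errors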

Examples:
Create a test problem and visualize the solution:

N = 64; theta = 0:5:179; p = 2*N;
[A b x] = paralleltomo(N,theta,p);
imagesc(reshape(x,N,N))
colormap gray, axis image off

See also:
fanbeamtomo, seismictomo.

References:
1. A. C. Kak and M. Slaney, Principles of Computerized Tomographic Imaging, SIAM, 2001.

[Figure: Shepp-Logan phantom, N = 100.]


randkaczmarz

Purpose:
The randomized Kaczmarz iterative method.

Synopsis:
[X info] = randkaczmarz(A,b,K)
[X info] = randkaczmarz(A,b,K,x0)
[X info] = randkaczmarz(A,b,K,x0,options)

Algorithm:
For arbitrary x^0 ∈ R^n the algorithm for randkaczmarz takes the following form:
\[ x^{k+1} = x^k + \lambda \, \frac{b_{r(i)} - \langle a^{r(i)}, x^k \rangle}{\|a^{r(i)}\|_2^2}\, a^{r(i)}, \]
where r(i) is chosen from the set {1, ..., m} randomly with probability proportional to ‖a^{r(i)}‖₂².
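The weighted row selection can be illustrated in a few lines of MATLAB; a sketch with assumed variables A, b, x and lambda (not the package code):

w = full(sum(A.^2,2));          % squared row norms of A
P = cumsum(w)/sum(w);           % cumulative distribution over the rows
r = find(rand <= P, 1);         % draw r(i) with probability proportional to w(r)
x = x + lambda*(b(r) - A(r,:)*x)/w(r) * A(r,:)';   % Kaczmarz step on row r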

Description:
The function implements the Randomized Kaczmarz iterative method for solving the linear system Ax = b. The starting vector is x0; if no starting vector is given then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options as a constant. As default lambda is set to 1, since this corresponds to the original method.

The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations and 2 denotes that the DP-rule stopped the iterations.

Use of options:
The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant.
- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k* is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index is determined according to the discrepancy principle (DP) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule type DP.

Examples:
Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 10 randkaczmarz iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = randkaczmarz(A,b,1:10);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
kaczmarz, symkaczmarz.

References:
1. T. Strohmer and R. Vershynin, A randomized solver for linear systems with exponential convergence, Lecture Notes in Computer Science 4110 (2006), p. 499-507.


sart

Purpose:
The Simultaneous Algebraic Reconstruction Technique (SART) iterative method.

Synopsis:
[X info restart] = sart(A,b,K)
[X info restart] = sart(A,b,K,x0)
[X info restart] = sart(A,b,K,x0,options)

Algorithm:
For arbitrary x^0 ∈ R^n the algorithm for sart takes the following form:
\[ x^{k+1} = x^k + \lambda_k V^{-1} A^T W (b - A x^k), \]
where \( V = \operatorname{diag}\big(\sum_{i=1}^{m} a_{ij}\big) \) for j = 1, ..., n and \( W = \operatorname{diag}\big(1 \big/ \sum_{j=1}^{n} a_{ij}\big) \) for i = 1, ..., m.

Description:

The function implements the SART (Simultaneous Algebraic Reconstruction Technique) iterative method for solving the linear system Ax = b. The starting vector is x0; if no starting vector is given then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options, either as a constant or as a string that determines the method to compute lambda. As default lambda is set to 1/ρ̂, where ρ̂ is an estimate of the spectral radius of V⁻¹AᵀWA.
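The diagonals of W and V⁻¹ (returned in the restart fields M and T, see below) can be formed directly from the row and column sums of A. A minimal sketch, assuming all a_ij ≥ 0 and no zero rows or columns, with x, b and lambda given (not the package code):

Wd = 1./full(sum(A,2));         % diagonal of W: inverse row sums
Td = 1./full(sum(A,1))';        % diagonal of V^{-1}: inverse column sums
x = x + lambda * Td .* (A'*(Wd.*(b - A*x)));   % one SART step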

The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, 2 denotes that the DP-rule stopped the iterations and 3 denotes that the ME-rule stopped the iterations. The second element in info is the number of used iterations.

The struct restart, which can be given as output, contains in the field s1 the estimated largest singular value. restart also returns a vector containing the diagonal of the matrix W in the field M and the diagonal of the matrix V⁻¹ in the field T. The struct restart can also be given as input in the struct options, such that the program does not have to recompute the contained values. We recommend using this only if the user has good knowledge of MATLAB and is completely sure of the use of restart as input.

Use of options:
The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2. A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'psi1', where the method psi1 computes the values for λk using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi1mod', where the method psi1mod computes the values for λk using the modified Ψ1-based relaxation (4.16) with τ1 = 2 from section 4.2. The parameter τ1 can be changed in line 325 in the code.
  - options.lambda = 'psi2', where the method psi2 computes the values for λk using the Ψ2-based relaxation (4.15) from section 4.2.
  - options.lambda = 'psi2mod', where the method psi2mod computes the values for λk using the modified Ψ2-based relaxation (4.17) with τ2 = 1.5 from section 4.2. The parameter τ2 can be changed in line 374 in the code.
- options.restart
  - options.restart.M = a vector containing the diagonal of W.
  - options.restart.T = a vector containing the diagonal of V⁻¹.
  - options.restart.s1 = σ̂1, where σ̂1 = √ρ̂.
- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k* is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index is determined according to the discrepancy principle (DP) described in section 5.1.
    - options.stoprule.type = 'ME', where the stopping index is determined according to the monotone error rule (ME) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule types DP and ME.

Examples:
Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 50 sart iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = sart(A,b,1:50);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
cav, cimminoProj, cimminoRefl, drop, landweber.


References:
1. A. H. Andersen and A. C. Kak, Simultaneous algebraic reconstruction technique (SART): A superior implementation of the ART algorithm, Ultrasonic Imaging, 6 (1984), p. 81-94.


seismictomo

Purpose:
Creates a two-dimensional seismic tomography test problem.

Synopsis:
[A b x s p] = seismictomo(N)
[A b x s p] = seismictomo(N,s)
[A b x s p] = seismictomo(N,s,p)
[A b x s p] = seismictomo(N,s,p,isDisp)

Description:

This function creates a two-dimensional seismic tomography test problem. A two-dimensional domain illustrating a cross section of the subsurface is divided into N equally spaced intervals in both dimensions, creating N² cells. On the right boundary s sources are located, and each source transmits waves to the p seismographs or receivers, which are scattered on the surface and on the left boundary. As default N sources and 2N receivers are chosen. If the input isDisp is different from 0 then the function also creates an illustration of the problem with the used sources and rays etc. As default isDisp is 0.

The function returns a coefficient matrix A with the dimensions p·s × N², the right-hand side b, and a created phantom of a subsurface reshaped as the vector x. The figure below illustrates the subsurface created when N = 100. In case the default values are used, the function also returns the used number of sources s and the used number of receivers p.

[Figure: Seismic phantom, N = 100.]


Algorithm:
The element a_ij is defined as the length of the i'th ray through the j'th cell, with a_ij = 0 if ray i does not go through cell j. The exact solution of the subsurface phantom is reshaped as a vector, and the i'th element of the right-hand side b_i is
\[ b_i = \sum_{j=1}^{N^2} a_{ij} x_j, \qquad i = 1, \ldots, s \cdot p. \]
For further information see chapter 6.

Examples:
Create a test problem and visualize the solution:

N = 100; s = N; p = 2*N;
[A b x] = seismictomo(N,s,p);
imagesc(reshape(x,N,N))
colormap gray, axis image off

See also:
fanbeamtomo, paralleltomo.

References:
1. See chapter 6.


symkaczmarz

Purpose:
The symmetric Kaczmarz iterative method.

Synopsis:
[X info] = symkaczmarz(A,b,K)
[X info] = symkaczmarz(A,b,K,x0)
[X info] = symkaczmarz(A,b,K,x0,options)

Algorithm:
For arbitrary x^0 ∈ R^n the algorithm for symkaczmarz takes the following form:
\[
x^{k,0} = x^k, \qquad
x^{k,i} = x^{k,i-1} + \lambda_k \, \frac{b_i - \langle a^i, x^{k,i-1} \rangle}{\|a^i\|_2^2}\, a^i,
\quad i = 1, \ldots, m-1,\, m,\, m-1, \ldots, 1, \qquad
x^{k+1} = x^{k,1}.
\]

Description:

The function implements the symmetric Kaczmarz iterative method for solving the linear system Ax = b. The starting vector is x0; if no vector is given then x0 = 0 is used.

The numbers given in the vector K are iteration numbers that specify which iterations are stored in the output matrix X. If a stopping rule is selected (see below) and K = [ ], then X contains the last iterate only.

The maximum number of iterations is determined either by the maximum number in the vector K or by the stopping rule specified in the field stoprule in the struct options. If K is empty a stopping rule must be specified.

The relaxation parameter is given in the field lambda in the struct options, either as a constant or as a string that determines the method to compute lambda. As default lambda is set to 0.25.
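One symmetric sweep can be sketched as follows, assuming A, b, x and a constant lambda are given (illustrative; the package function additionally handles storage of iterates and stopping rules):

m = size(A,1);
w = full(sum(A.^2,2));            % squared row norms
for i = [1:m, m-1:-1:1]           % forward sweep, then backward sweep
    x = x + lambda*(b(i) - A(i,:)*x)/w(i) * A(i,:)';
end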


The second output info is a vector with two elements. The first element is an indicator that denotes why the iterations were stopped. The number 0 denotes that the iterations were stopped because the maximum number of iterations was reached, 1 denotes that the NCP-rule stopped the iterations, and 2 denotes that the DP-rule stopped the iterations.

Use of options:
The following fields in options are used in this function:

- options.lambda:
  - options.lambda = c, where c is a constant satisfying 0 ≤ c ≤ 2. A warning is given if this requirement is estimated to be violated.
  - options.lambda = 'psi1', where the method psi1 computes the values for λk using the Ψ1-based relaxation (4.12) from section 4.2.
  - options.lambda = 'psi2', where the method psi2 computes the values for λk using the Ψ2-based relaxation (4.15) from section 4.2.
- options.stoprule
  - options.stoprule.type
    - options.stoprule.type = 'none', where no stopping rule is given and only the maximum number of iterations is used to stop the algorithm. This choice is default.
    - options.stoprule.type = 'NCP', where the optimal number of iterations k* is chosen according to the Normalized Cumulative Periodogram described in section 5.2.
    - options.stoprule.type = 'DP', where the stopping index is determined according to the discrepancy principle (DP) described in section 5.1.
  - options.stoprule.taudelta = τδ, where δ is the noise level and τ is user-chosen. This parameter is only needed for the stoprule type DP.

Examples:
Generate a "noisy" 50 × 50 parallel beam tomography problem, compute 10 symkaczmarz iterations and show the last iterate:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
X = symkaczmarz(A,b,1:10);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
kaczmarz, randkaczmarz.

References:
1. Å. Björck and T. Elfving, Accelerated projection methods for computing pseudoinverse solutions of systems of linear equations, BIT Vol. 19 issue 2 (1979), p. 145-163.


trainDPME

Purpose:
Training strategy to estimate the best parameter when the discrepancy principle or the monotone error rule is used as stopping rule.

Synopsis:
tau = trainDPME(A,b,x_exact,method,type,delta,s)
tau = trainDPME(A,b,x_exact,method,type,delta,s,options)

Description:<br />

This function implements the training strategy <strong>for</strong> estimation of the parameter<br />

τ, when using the discrepancy principle or the monotone error rule as stopping<br />

rule. From test solution x exact and the corresponding noise free right-hand<br />

side b s noisy samples are generated with noise level delta. From each sample<br />

the solutions <strong>for</strong> the given methodmethod are calculated and according to which<br />

type of stopping rule is chosen in type an estimate of tau is calculated and<br />

returned.<br />

A default maximum number of iterations is chosen <strong>for</strong> the SIRT methods to<br />

be 1000 and <strong>for</strong> the ART methods to 100. If the this is not enough it can be<br />

changed in line 74 <strong>for</strong> the SIRT methods and in line 87 <strong>for</strong> the ART methods.<br />

Algorithm:
See section 5.1.

Use of options:
The following fields in options are used in this function.

- options.lambda: See the chosen method method for the choices of this parameter.
- options.restart: Only available when method is a SIRT method. See the specific method for correct use.
- options.w: If the chosen method method allows weights this parameter can be set.

Examples:
Generate a "noisy" 50 × 50 parallel beam tomography problem. Then the parameter tau is found by training for the ME rule, and this parameter is used with ME to stop the iterations; the last iterate is shown.

[A b x] = paralleltomo(50,0:5:179,150);
delta = 0.05;
tau = trainDPME(A,b,x,@cimminoProj,'ME',delta,20);
e = randn(size(b)); e = e/norm(e);
b = b + delta*norm(b)*e;
options.stoprule.type = 'ME';
options.stoprule.taudelta = tau*delta;
[X info] = cimminoProj(A,b,200,[],options);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
cav, cimminoProj, cimminoRefl, drop, kaczmarz, landweber, randkaczmarz, sart, symkaczmarz.

References:
1. T. Elfving and T. Nikazad, Stopping rules for Landweber-type iteration, Inverse Problems, Vol. 23 (2007), p. 1417-1432.


trainLambdaART

Purpose:
Strategy to find the best constant relaxation parameter λ for a given ART method.

Synopsis:
lambda = trainLambdaART(A,b,x_exact,method)
lambda = trainLambdaART(A,b,x_exact,method,kmax)

Description:<br />

This function implements the training strategy <strong>for</strong> finding the optimal constant<br />

relaxation parameter λ <strong>for</strong> a given ART method, that solves the linear system<br />

Ax = b. The training strategy builts on a two part strategy.<br />

In the first part the resolution limit is calculated using kmax iterations of the<br />

iteration ART method given as a function handle in method. If kmax is not<br />

given or empty, the default value is 100.<br />

The first part of the strategy is to determine the resolution limit <strong>for</strong> the a specific<br />

value of λ.<br />

The second part of the stratgy is a modified version of a golden section search<br />

in which the optimal value of λ is found within the convergence interval of the<br />

specified iterative method. The method returns the optimal value in the output<br />

lambda.<br />
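To illustrate the second part, a generic golden section search over λ may look as follows. Here err is an assumed (hypothetical) function handle returning the error measure for a given λ; the package routine uses a modified variant of this scheme:

err = @(lam) someErrorMeasure(lam);  % hypothetical error function of lambda
phi = (sqrt(5)-1)/2;                 % golden ratio factor
lo = 0; hi = 2;                      % e.g. the convergence interval for ART
x1 = hi - phi*(hi-lo); f1 = err(x1);
x2 = lo + phi*(hi-lo); f2 = err(x2);
for it = 1:25                        % shrink the bracket around the minimum
    if f1 < f2
        hi = x2; x2 = x1; f2 = f1;
        x1 = hi - phi*(hi-lo); f1 = err(x1);
    else
        lo = x1; x1 = x2; f1 = f2;
        x2 = lo + phi*(hi-lo); f2 = err(x2);
    end
end
lambda = (lo + hi)/2;                % the returned relaxation parameter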

Algorithm:
See section 4.2.1.


Examples:
Generate a "noisy" 50 × 50 parallel beam tomography problem, train to find the optimal value of λ for the ART method kaczmarz, and use the found value when 10 iterations of the method are computed. Finally the last iterate is shown:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
lambda = trainLambdaART(A,b,x,@kaczmarz);
options.lambda = lambda;
X = kaczmarz(A,b,1:10,[],options);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
trainLambdaSIRT

References:
1. See section 4.2.1.


trainLambdaSIRT

Purpose:
Strategy to find the best constant relaxation parameter λ for a given SIRT method.

Synopsis:
lambda = trainLambdaSIRT(A,b,x_exact,method)
lambda = trainLambdaSIRT(A,b,x_exact,method,kmax)
lambda = trainLambdaSIRT(A,b,x_exact,method,kmax,options)

Description:<br />

This function implements the training strategy <strong>for</strong> finding the optimal constant<br />

relaxation parameter λ <strong>for</strong> a given SIRT method, that solves the linear system<br />

Ax = b. The training strategy builds on a two part strategy.<br />

In the first step the resolution limit is calculated using kmax iterations of the<br />

iteration SIRT method given as a function handle in method. If kmax is not<br />

given or empty, the default value is 1000.<br />

To determine the resolution limit the default value of λ is used together with<br />

the contents of options. See below <strong>for</strong> correct use of options.<br />

The second part of the stratgy is a modified version of a golden section search<br />

in which the optimal value of λ is found within the convergence interval of the<br />

specified iterative method. The method returns the optimal value in the output<br />

lambda.<br />

Algorithm:
See section 4.2.1.

Use of options:
The following fields in options are used in this function.

- options.restart: See the specific method for correct use.
- options.w: If the chosen method method allows weights this parameter can be set.

Examples:
Generate a "noisy" 50 × 50 parallel beam tomography problem, train to find the optimal value of λ for the SIRT method cimminoProj, and use the found value when 50 iterations of the method are computed. Finally the last iterate is shown:

[A b x] = paralleltomo(50,0:5:179,150);
e = randn(size(b)); e = e/norm(e);
b = b + 0.05*norm(b)*e;
lambda = trainLambdaSIRT(A,b,x,@cimminoProj);
options.lambda = lambda;
X = cimminoProj(A,b,1:50,[],options);
imagesc(reshape(X(:,end),50,50))
colormap gray, axis image off

See also:
trainLambdaART

References:
1. See section 4.2.1.




Chapter 9

Conclusion and Future Work

The goal of this thesis was to develop and implement a MATLAB package containing a number of iterative methods for algebraic reconstruction and to describe the methods individually, and we believe that we have completed the task successfully.

We have described the implemented methods and the corresponding theory. Furthermore, the theory behind the strategies for choosing the relaxation parameter is described, and for each of the implemented methods the relevant strategies are available. We have also discussed and implemented a few stopping rules, and we introduced three tomography test problems from parallel beam tomography, fan beam tomography and seismic tomography. Furthermore, manual pages for each function in the package have been created.

In our studies of the implemented methods and strategies we concluded that all the implemented strategies for choosing the relaxation parameter gave good results. One should be aware that each method has its own advantages and disadvantages. The training strategy, which we developed, requires knowledge of the exact solution, but at the same time keeps the relative error very small. Line search can only be used on a small selection of the SIRT methods, and for larger noise levels it shows erratic behaviour, but for small noise levels the performance is good. The last strategy, which arose from the studies of semi-convergence, has the advantage that the noise-error is dampened, which keeps the relative error small when the resolution limit is reached. The disadvantage is that, because of this damping, it requires many iterations to reach the same level for the relative error as the other strategies.

The studies of the stopping rules showed very unstable results, since the same stopping rule did not work equally well for all methods. The studies where we combined the relaxation parameter and the stopping rules confirmed the conclusion that none of the stopping rules produced a stable result. The NCP stopping rule often gave the best result, but when it did not, the result was far off.

We also compared the performance of the ART and the SIRT methods, where we included the workload. We concluded that the ART methods in general used fewer work units to obtain a result of the same quality as the SIRT methods. This poses a dilemma, since the SIRT methods are better understood, as more theory is available for these methods.

9.1 Future Work

Finally we will discuss how the work in this thesis can be extended, and how we think the performance of the methods could be improved.

An obvious way to continue the work from this thesis would be to look further into the block-iterative methods. We have only discussed a few of these, and perhaps the implementation of the block-iterative methods could lead to an overall better performance.

Another way to continue the work could be to investigate the area of preconditioning methods for the already implemented methods. It could be interesting to observe the effect of this extension.

If we should advise how future development in this field should proceed, we would advise the development of more stable stopping rules. As discussed earlier, the existing stopping rules are very unstable, and to obtain a good result without knowing the exact solution the choice of stopping index is very important. Another direction of development could be an adaptive strategy for choosing the relaxation parameter for the ART methods. This is as yet an unexplored field, and the results for the SIRT methods suggest that this could be a good idea.
be a good idea.


Appendix A

Appendix

A.1 Orthogonal Projection on a Hyperplane

When defining the orthogonal projection onto a hyperplane, we will first look at the case where the origin lies in the hyperplane H_i, and then at the general case where the origin does not necessarily lie in the hyperplane H_i. We recall from [30] that the hyperplane H_i is defined as
\[ H_i = \{ x \in \mathbb{R}^n \mid \langle a^i, x \rangle = b_i \}, \]
and the case where the origin lies in the hyperplane is the case b_i = 0.

Figure A.1 shows the case b_i = 0 ⇒ O ∈ H_i. In the figure O denotes the origin and z is the point z ∈ R^n which is projected onto the hyperplane; P_i(z) denotes the projection of z onto the hyperplane. We want to derive a relation for P_i(z) = z*. Since the projection is orthogonal, we can write P_i(z) as z minus


the orthogonal projection along a^i, which gives the following:
\[
P_i(z) = z - \|z - z^*\|_2 \,\frac{a^i}{\|a^i\|_2}
       = z - \cos\theta \,\|z\|_2 \,\frac{a^i}{\|a^i\|_2}
       = z - \frac{\langle a^i, z \rangle}{\|a^i\|_2^2}\, a^i .
\]
To obtain this result we have used that \( \cos\theta = \langle a^i, z \rangle / (\|a^i\|_2 \|z\|_2) \).

[Figure A.1: Projection onto the hyperplane H_i in the case where the origin lies in the hyperplane.]

We will now derive the orthogonal projection onto a hyperplane in the case where the origin O does not lie in the hyperplane. This case is illustrated in figure A.2, where we want to project z onto the hyperplane H_i. We introduce the vector z_0, which ends in the same point as z. However, z_0 does not start at the origin but at the intersection between the hyperplane H_i and the vector a^i, orthogonal to the hyperplane and passing through the origin. We denote this point x, and this gives us the following relation between z_0 and z:
\[ z = x + z_0 \quad\Longleftrightarrow\quad z_0 = z - x. \]
We define x = αa^i. This leads to the following:
\[ \langle a^i, x \rangle = \langle a^i, \alpha a^i \rangle = \alpha \|a^i\|_2^2 = b_i. \]


[Figure A.2: Projection onto the hyperplane H_i in the case where the origin does not lie in the hyperplane.]

From this we get that
\[ \alpha = \frac{b_i}{\|a^i\|_2^2}. \tag{A.1} \]
We can now determine the orthogonal projection onto the hyperplane for z_0 as:
\[ P_i(z_0) = z_0 - \frac{\langle a^i, z_0 \rangle}{\|a^i\|_2^2}\, a^i. \]
We then use that z_0 = z − x = z − αa^i:
\[ P_i(z_0) = z - \alpha a^i - \frac{\langle a^i, z - \alpha a^i \rangle}{\|a^i\|_2^2}\, a^i. \]
The projection of z onto the hyperplane is then P_i(z) = αa^i + P_i(z_0), and using


this and (A.1) we get:
\[
P_i(z) = \alpha a^i + P_i(z_0)
       = \alpha a^i + z - \alpha a^i - \frac{\langle a^i, z - \alpha a^i \rangle}{\|a^i\|_2^2}\, a^i
       = z - \frac{\langle a^i, z \rangle - \alpha \|a^i\|_2^2}{\|a^i\|_2^2}\, a^i
       = z + \frac{b_i - \langle a^i, z \rangle}{\|a^i\|_2^2}\, a^i .
\]
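The final formula is easy to verify numerically; the sketch below checks that P_i(z) lies on the hyperplane H_i (illustrative MATLAB with arbitrary data):

n = 5; a = randn(n,1); bi = randn; z = randn(n,1);
P = z + (bi - a'*z)/norm(a)^2 * a;   % the derived projection P_i(z)
abs(a'*P - bi)                       % ~0: P satisfies <a,P> = b_i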

A.2 Investigation of the Roots

This section contains an investigation of the polynomial
\[ g_{k-1}(y) = (2k-1)y^{k-1} - (y^{k-2} + \cdots + y + 1) = 0, \tag{A.2} \]
and a description of the most suitable approach to calculate the unique root in the interval (0, 1).

To investigate the behaviour of all the roots of the polynomial (A.2), we create a figure showing all the roots of the polynomials for k = 10, ..., 30; see figure A.3. In the figure every polynomial is specified by a specific color and a specific marker type. This means that roots only belong to the same polynomial if both the color and the marker type are the same. This figure illustrates that every polynomial has a real root in the interval [0, 1]. The rest of the roots are either a real root in the interval [−1, 0] or complex roots. The complex roots create a circle that lies inside the unit circle in the complex plane.

[Figure A.3: Illustration of the roots of the polynomials for k = 10, ..., 30.]

Since we are interested in the unique root in the interval [0, 1], we look at a zoom of these roots in figure A.4. We see that the unique roots are isolated from the other real roots, but some of the complex roots are rather close. We will now investigate whether this can cause problems when using Newton-Raphson's iterative method to find the unique root.

[Figure A.4: Zoom of the roots of the polynomials for k = 10, ..., 30 near the root 1.]


In each step, Newton-Raphson's iterative method is defined as:
\[ y_{k+1} = y_k - \frac{g(y_k)}{g'(y_k)}. \]

We see that when finding a complex root with Newton-Raphson's method, the starting guess y_0 has to be complex, since a polynomial maps the real numbers into the real numbers. Newton-Raphson's method will therefore, for our function (A.2), only find real roots if the starting guess is real. And if we further give the starting guess y_0 = 1, then we have isolated the unique root in the interval [0, 1]. In our implementation of Newton's method we always use 6 iterations, since experience has shown that 6 is a good choice.


To use Newton-Raphson's method we need to calculate the derivative of the function, but since the function is a polynomial, we can use Horner's algorithm to determine both the function value and the derivative [12].
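In MATLAB, the combined Horner evaluation of g and g' and the six Newton-Raphson steps can be sketched as follows (illustrative, not the package code):

k = 15;                       % example value of k
c = [2*k-1, -ones(1,k-1)];    % coefficients of g_{k-1}, highest power first
y = 1;                        % real starting guess isolating the root in (0,1)
for it = 1:6
    g = c(1); dg = 0;         % Horner's scheme for g(y) and g'(y)
    for j = 2:length(c)
        dg = dg*y + g;
        g  = g*y + c(j);
    end
    y = y - g/dg;             % Newton-Raphson step
end
y                             % the unique root in (0,1)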

A.3 Work Units for the SIRT and ART methods

To compare the performance of the SIRT and the ART methods we look at the work load of one iteration of each of the methods. We define a work unit WU to be one sparse matrix-vector multiplication. We let ϖ denote the average number of non-zero elements in a row. Since the SIRT methods can all be written in the same form, we find the work load for one iteration in the following way:

SIRT:
    r^k = b − A x^k             m + 2m·ϖ
    z^k = M r^k                 m
    v^k = A^T z^k               2m·ϖ
    q^k = T v^k                 n
    x^{k+1} = x^k + λ q^k       2n
    Total: (4ϖ + 2)m + 3n ≈ 2 · 2ϖ·m  ⇒  2 WU.

For Kaczmarz's method one step can be written in the following way:

Kaczmarz:
    r_i = b_i − ⟨a^i, x^{k,i−1}⟩                    2ϖ
    x^{k,i} = x^{k,i−1} + λ (r_i/‖a^i‖₂²) a^i       2ϖ
    Total: 4ϖ·m  ⇒  4 WU.

Kaczmarz's method requires 4 WU, since one iteration consists of m steps.

Since one iteration of symmetric Kaczmarz consists of 2m − 2 steps, the work units are:

sym. Kaczmarz:
    r_i = b_i − ⟨a^i, x^{k,i−1}⟩                    2ϖ
    x^{k,i} = x^{k,i−1} + λ (r_i/‖a^i‖₂²) a^i       2ϖ
    Total: 4ϖ·(2m − 2)  ⇒  8 WU.

Since the randomized Kaczmarz method has the same formula as Kaczmarz's method, except for the selection of the row, the calculation of the work load for one step is the same as for Kaczmarz's method. In the implementation of the randomized Kaczmarz method we define one iteration to be m steps. This means that for randomized Kaczmarz we have a work load of 4 WU for one iteration.
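For a concrete test problem, ϖ and the flop count behind one WU are easy to estimate (illustrative; assuming one sparse matrix-vector multiplication costs roughly 2·nnz(A) flops):

[A b x] = paralleltomo(50,0:5:179,150);
m = size(A,1);
varpi = nnz(A)/m;       % average number of non-zeros per row
flopsPerWU = 2*m*varpi; % roughly 2*nnz(A) flops per work unit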


Bibliography

[1] A. H. Andersen and A. C. Kak, Simultaneous algebraic reconstruction technique (SART): A superior implementation of the ART algorithm, Ultrasonic Imaging, 6 (1984), p. 81-94.

[2] G. Appleby and D. C. Smolarski, A linear acceleration row action method for projecting onto subspaces, Electron. Trans. Numer. Anal., 20 (2005), p. 253-275.

[3] Å. Björck and T. Elfving, Accelerated projection methods for computing pseudoinverse solutions of systems of linear equations, BIT Vol. 19 issue 2 (1979), p. 145-163.

[4] Y. Censor and T. Elfving, Block-iterative algorithms with diagonally scaled oblique projections for the linear feasibility problem, SIAM Vol. 24 (2002), p. 40-58.

[5] Y. Censor, T. Elfving, G. T. Herman and T. Nikazad, On diagonally relaxed orthogonal projection methods, SIAM Journal on Scientific Computing, Vol. 30 issue 1 (2007), p. 473-504.

[6] Y. Censor, D. Gordon and R. Gordon, Component averaging: An efficient iterative parallel algorithm for large sparse unstructured problems, Parallel Computing Vol. 27 issue 6 (2001), p. 777-808.

[7] Y. Censor, D. Gordon and R. Gordon, BICAV: A block-iterative parallel algorithm for sparse systems with pixel-related weighting, IEEE Transactions on Medical Imaging, Vol. 20 (2001), p. 1050-1060.

[8] G. Cimmino, Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari, La Ricerca Scientifica, XVI, Series II, Anno IX, 1 (1938), p. 326-333.

[9] A. Dax, Line search acceleration of iterative methods, Linear Algebra Appl., 130 (1990), p. 43-63.

[10] A. R. De Pierro, Métodos de projeção para a resolução de sistemas gerais de equações algébricas lineares, Thesis (tese de Doutoramento), Instituto de Matemática da UFRJ, Cidade Universitária, Rio de Janeiro, Brasil, 1981.

[11] L. T. Dos Santos, A parallel subgradient projections method for the convex feasibility problem, J. Comput. Appl. Math., 18 (1987), p. 307-320.

[12] L. Eldén, L. Wittmeyer-Koch and H. B. Nielsen, Introduction to Numerical Computation, Studentlitteratur AB, 2004.

[13] T. Elfving and T. Nikazad, Some properties of ART-type reconstruction algorithms, accepted for publication in Mathematical Methods in Biomedical Imaging and Intensity-Modulated Radiation Therapy (IMRT).

[14] T. Elfving and T. Nikazad, Some block-iterative methods used in image reconstruction, unpublished article.

[15] T. Elfving and T. Nikazad, Stopping rules for Landweber-type iteration, Inverse Problems, Vol. 23 (2007), p. 1417-1432.

[16] T. Elfving, T. Nikazad and P. C. Hansen, Semi-convergence and relaxation parameters for a class of SIRT algorithms, submitted to ETNA.

[17] R. Gordon, R. Bender and G. T. Herman, Algebraic reconstruction techniques for 3 dimensional electron microscopy and X-ray photography, Journal of Theoretical Biology, Vol. 29 (1970), p. 471-481.

[18] D. Gordon and R. Gordon, Component-averaged row projections: A robust, block-parallel scheme for sparse linear systems, SIAM Journal on Scientific Computing, Vol. 27, No. 3, p. 1092-1117.

[20] P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems, SIAM, 1998.

[21] P. C. Hansen, Regularization Tools version 4.0 for Matlab 7.3, Numerical Algorithms (2007), p. 189-194.

[22] P. C. Hansen, Discrete Inverse Problems: Insight and Algorithms, SIAM, 2010.

[23] U. Hämarik and U. Tautenhahn, On the monotone error rule for parameter choice in iterative and continuous regularization methods, BIT, 41 (2001), p. 1029-1038.

[24] G. N. Hounsfield, Computerized transverse axial scanning tomography: Part I, description of the system, Br. J. Radiol., 46 (1973), p. 1016-1022.

[25] M. Jiang and G. Wang, Convergence studies on iterative algorithms for image reconstruction, IEEE Transactions on Medical Imaging, 22 (2003), p. 569-579.

[26] M. Jiang and G. Wang, Convergence of the Simultaneous Algebraic Reconstruction Technique (SART), IEEE Transactions on Image Processing, Vol. 12 (2003), p. 957-961.

[27] S. Kaczmarz, Angenäherte auflösung von systemen linearer gleichungen, Bulletin de l'Académie Polonaise des Sciences et Lettres, A35 (1937), p. 355-357.

[28] A. C. Kak and M. Slaney, Principles of Computerized Tomographic Imaging, SIAM, 2001.

[29] L. Landweber, An iteration formula for Fredholm integral equations of the first kind, American Journal of Mathematics, Vol. 73 (1951), p. 615-624.

[30] C. D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, 2000.

[31] F. Natterer, The Mathematics of Computerized Tomography, SIAM, 2001.

[32] F. Natterer and F. Wübbeling, Mathematical Methods in Image Reconstruction, SIAM, 2001.

[33] T. S. Pan, Acceleration and filtering in the Generalized Landweber iteration using a variable shaping matrix, IEEE Transactions on Medical Imaging, Vol. 12 (1993), p. 278-286.

[34] C. Popa, Extensions of block-projections methods with relaxation parameters to inconsistent and rank-deficient least-squares problems, BIT 38 (1998), p. 151-176.

[35] G. Qu, C. Wang and M. Jiang, Necessary and sufficient convergence conditions for algebraic image reconstruction algorithms, IEEE Transactions on Image Processing, Vol. 18 issue 2 (2009), p. 435-440.

[36] T. Strohmer and R. Vershynin, A randomized solver for linear systems with exponential convergence, Lecture Notes in Computer Science 4110 (2006), p. 499-507.

[37] P. Toft, The Radon Transform, Theory and Implementation, unpublished dissertation, p. 199-201.

[38] C. F. Van Loan, Introduction to Scientific Computing - A Matrix-Vector Approach Using MATLAB, Pearson Higher Education, 1996.
