
Noname manuscript No.
(will be inserted by the editor)

Greville's Method for Preconditioning Least Squares Problems

Xiaoke Cui · Ken Hayami · Jun-Feng Yin

Received: date / Accepted: date

Abstract In this paper, we propose a preconditioning algorithm for least squares problems min_{x∈R^n} ‖b − Ax‖₂, where A may have any shape and rank. The preconditioner is constructed to be a sparse approximation to the Moore-Penrose inverse of the coefficient matrix A. For this preconditioner, we provide a theoretical analysis showing that, under our assumption, the least squares problem preconditioned by this preconditioner is equivalent to the original problem, and that the GMRES method can determine a solution to the preconditioned problem before breakdown happens. At the end of the paper, we also give numerical examples showing the performance of the method.

Keywords Least Squares Problem · Preconditioning · Moore-Penrose Inverse · Greville Algorithm · GMRES

Mathematics Subject Classification (2000) 65F08 · 65F10

This research was supported by the Grants-in-Aid for Scientific Research (C) of the Ministry of Education, Culture, Sports, Science and Technology, Japan.

Xiaoke Cui
Department of Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (Sokendai), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan, 101-8430.
Presently Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, Japan, 277-8568.
E-mail: xkcui@sml.k.u-tokyo.ac.jp

Ken Hayami
National Institute of Informatics, Japan; also Department of Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (Sokendai), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan, 101-8430.
Tel.: +81-3-4212-2553
E-mail: hayami@nii.ac.jp

Jun-Feng Yin
Department of Mathematics, Tongji University, 1239 Siping Road, Shanghai 200092, P. R. China.
E-mail: yinjf@tongji.edu.cn



1 Introduction

Consider the least squares problem

    min_{x∈R^n} ‖b − Ax‖₂,                      (1.1)

where A ∈ R^{m×n}, b ∈ R^m.

To solve least squares problems, there are two main kinds of methods: direct methods and iterative methods [4]. When A is large and sparse, iterative methods, especially Krylov subspace methods, are preferred for solving (1.1), since they require less memory and computational work than direct methods. In [14], Hayami et al. proposed using GMRES [16] to solve least squares problems with suitable preconditioners. Using a preconditioner B ∈ R^{n×m} to precondition (1.1) from the left, we can transform problem (1.1) into

    min_{x∈R^n} ‖Bb − BAx‖₂.                    (1.2)

On the other hand, we can also precondition problem (1.1) from the right and transform it into

    min_{y∈R^m} ‖b − ABy‖₂.                     (1.3)

In [14], necessary and sufficient conditions on B for (1.2) and (1.3), respectively, to be equivalent to (1.1) were given.

There are different methods to construct the preconditioner B. The simplest choice is B = A^T. If rank(A) = n, one may use B = CA^T, where C = diag(A^TA)^{−1} or, better, where C is constructed by the Robust Incomplete Factorization (RIF) [2,3], as in [14]. Another possibility is to use the incomplete Givens orthogonalization method to construct B [19]. However, so far no efficient preconditioner has been proposed for the rank deficient case. In this paper, we construct a preconditioner which also works when A is rank deficient. The idea is to use an approximate inverse of the coefficient matrix of a linear system

    Ax = b                                      (1.4)

with A ∈ R^{n×n} as a preconditioner. This kind of preconditioner was originally developed for solving nonsingular linear systems [10,15]. Since A is now a general matrix, we construct a matrix M ∈ R^{n×m} which is an approximation to the Moore-Penrose inverse [9,17] of A, and use M to precondition the least squares problem (1.1). When A is rank deficient, the preconditioner M can also be rank deficient. Note, in passing, that in [20] it was shown that for singular linear systems, singular preconditioners are sometimes better than nonsingular preconditioners.

The main contribution of this paper is to give a new way to precondition general least squares problems from the perspective of the approximate Moore-Penrose inverse. Similar to the RIF preconditioner [2,3], our method also includes an A^TA-orthogonalization process when the coefficient matrix has full column rank. When A is rank deficient, our method tries to orthogonalize the linearly independent part of A. We also give a theoretical analysis of the equivalence between the preconditioned problem and the original problem, and discuss the possibility of breakdown when using GMRES to solve the preconditioned system.

The rest of the paper is organized as follows. In Section 2, we first introduce some properties of the Moore-Penrose inverse of a rank-one updated matrix and the original Greville method. In Section 3, based on Greville's method, we propose a global algorithm and a vector-wise algorithm for constructing the preconditioner M, which is an approximate generalized inverse of A. In Section 4, we show that for a full column rank matrix A, our algorithm is similar to the RIF preconditioning algorithm [3] and includes an A^TA-orthogonalization process. In Sections 5 and 6, we prove that under a certain assumption, using our preconditioner M, the preconditioned problem is equivalent to the original problem, and the GMRES method can determine a solution to the preconditioned problem before breakdown happens. In Section 7, we consider some details of the implementation of our algorithms. Numerical results are presented in Section 8. Section 9 concludes the paper.

2 Greville's Method

Given a rectangular matrix A ∈ R^{m×n} with rank(A) = r ≤ min{m, n}, assume that the Moore-Penrose inverse of A is known. We are interested in how to compute the Moore-Penrose inverse of

    A + cd^T,  c ∈ R^m,  d ∈ R^n,               (2.1)

which is a rank-one update of A [9,12]. In [9], the following six logical possibilities are considered:

1. c ∉ R(A), d ∉ R(A^T) and 1 + d^TA†c arbitrary,
2. c ∈ R(A), d ∉ R(A^T) and 1 + d^TA†c = 0,
3. c ∈ R(A), d arbitrary and 1 + d^TA†c ≠ 0,
4. c ∉ R(A), d ∈ R(A^T) and 1 + d^TA†c = 0,
5. c arbitrary, d ∈ R(A^T) and 1 + d^TA†c ≠ 0,
6. c ∈ R(A), d ∈ R(A^T) and 1 + d^TA†c = 0.

Here R(A) denotes the range space of A. For each possibility, an expression for the Moore-Penrose inverse of the rank-one update of A is given by the following theorem [9].

Theorem 1 For A ∈ R^{m×n}, c ∈ R^m, d ∈ R^n, let k = A†c, h = d^TA†, u = (I − AA†)c, v = d^T(I − A†A), and β = 1 + d^TA†c. Note that

    c ∈ R(A) ⇔ u = 0,                           (2.2)
    d ∈ R(A^T) ⇔ v = 0.                         (2.3)

Then the generalized inverse of A + cd^T is given as follows.

1. If u ≠ 0 and v ≠ 0, then (A + cd^T)† = A† − ku† − v†h + βv†u†.
2. If u = 0, v ≠ 0 and β = 0, then (A + cd^T)† = A† − kk†A† − v†h.
3. If u = 0 and β ≠ 0, then (A + cd^T)† = A† + (1/β̄) v^Tk^TA† − (β̄/σ₁) p₁q₁^T, where p₁ = −(‖k‖₂²/β̄) v^T + k, q₁^T = −(‖v‖₂²/β̄) k^TA† + h, and σ₁ = ‖k‖₂²‖v‖₂² + |β|².
4. If u ≠ 0, v = 0 and β = 0, then (A + cd^T)† = A† − A†h†h − ku†.
5. If v = 0 and β ≠ 0, then (A + cd^T)† = A† + (1/β̄) A†h^Tu^T − (β̄/σ₂) p₂q₂^T, where p₂ = −(‖h‖₂²/β̄) A†h^T + k, q₂^T = −(‖u‖₂²/β̄) u^T + h, and σ₂ = ‖h‖₂²‖u‖₂² + |β|².
6. If u = 0, v = 0 and β = 0, then (A + cd^T)† = A† − kk†A† − A†h†h + (k†A†h†)kh.
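As a quick sanity check of Case 1 (the case used later when a_i ∉ R(A_{i−1})), the following NumPy sketch compares the update formula with a directly computed pseudoinverse; the random test matrix and the vectors c, d are arbitrary choices made only for this illustration.

    import numpy as np

    # Numerical check of Case 1 of Theorem 1 (u != 0 and v != 0).
    rng = np.random.default_rng(0)
    m, n, r = 7, 5, 3
    A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank-deficient test matrix
    c = rng.standard_normal(m)        # not in R(A) with probability one
    d = rng.standard_normal(n)        # not in R(A^T) with probability one

    Ad = np.linalg.pinv(A)
    k = Ad @ c                        # k = A^+ c
    h = d @ Ad                        # h = d^T A^+ (row vector)
    u = c - A @ (Ad @ c)              # u = (I - A A^+) c
    v = d - (d @ Ad) @ A              # v = d^T (I - A^+ A) (row vector)
    beta = 1.0 + d @ Ad @ c

    u_dag = u / (u @ u)               # pseudoinverse of the column vector u
    v_dag = v / (v @ v)               # pseudoinverse of the row vector v

    formula = Ad - np.outer(k, u_dag) - np.outer(v_dag, h) + beta * np.outer(v_dag, u_dag)
    print(np.linalg.norm(formula - np.linalg.pinv(A + np.outer(c, d))))   # ~ machine precision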



To utilize the above theorem, let

    A = Σ_{i=1}^{n} a_i e_i^T,                  (2.4)

where a_i is the i-th column of A. Further let A_i = [a_1, . . . , a_i, 0, . . . , 0] ∈ R^{m×n}. Hence we have

    A_i = Σ_{k=1}^{i} a_k e_k^T,  i = 1, . . . , n,     (2.5)

and, if we denote A_0 = 0_{m×n},

    A_i = A_{i−1} + a_i e_i^T,  i = 1, . . . , n.       (2.6)

Thus every A_i, i = 1, . . . , n, is a rank-one update of A_{i−1}. Noticing that A_0† = 0_{n×m}, we can utilize Theorem 1 to compute the Moore-Penrose inverse of A step by step and obtain A† = A_n† in the end.

In Theorem 1, substituting a_i for c and e_i for d, we can rewrite Equation (2.2) as follows:

    a_i ∈ R(A_{i−1}) ⇔ u = (I − A_{i−1}A_{i−1}†)a_i = 0.   (2.7)

No matter whether a_i is linearly dependent on the previous columns or not, we have, for any i = 1, 2, . . . , n,

    β = 1 + e_i^T A_{i−1}† a_i = 1.             (2.8)

The vector v in Equation (2.3) can be written as

    v = e_i^T (I − A_{i−1}† A_{i−1}) ≠ 0,       (2.9)

which is nonzero for any i = 1, . . . , n. Hence, we can use Case 1 and Case 3 for the cases a_i ∉ R(A_{i−1}) and a_i ∈ R(A_{i−1}), respectively.

Then from Theorem 1, denoting A_0 = 0_{m×n}, we obtain a method to compute A_i† from A_{i−1}†:

    A_i† = A_{i−1}† + (e_i − A_{i−1}†a_i)((I − A_{i−1}A_{i−1}†)a_i)†              if a_i ∉ R(A_{i−1}),
    A_i† = A_{i−1}† + (1/σ_i)(e_i − A_{i−1}†a_i)(A_{i−1}†a_i)^T A_{i−1}†          if a_i ∈ R(A_{i−1}),      (2.10)

where σ_i = 1 + ‖A_{i−1}†a_i‖₂². This method was proposed by Greville in the 1960s [13].
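For concreteness, the sketch below carries out the recursion (2.10) in NumPy without any dropping and compares the result with a directly computed pseudoinverse; the small test matrix and the tolerance used to decide whether a_i ∈ R(A_{i−1}) are assumptions of the sketch only.

    import numpy as np

    # Exact Greville recursion (2.10): build A^+ one column at a time.
    def greville_pinv(A, tol=1e-12):
        m, n = A.shape
        Ad = np.zeros((n, m))                     # A_0^+ = 0_{n x m}
        A_prev = np.zeros((m, n))                 # A_0 = 0_{m x n}
        for i in range(n):
            a = A[:, i]
            k = Ad @ a                            # k = A_{i-1}^+ a_i
            u = a - A_prev @ k                    # u = (I - A_{i-1} A_{i-1}^+) a_i
            e = np.zeros(n); e[i] = 1.0
            if np.linalg.norm(u) > tol * np.linalg.norm(a):    # a_i not in R(A_{i-1})
                Ad = Ad + np.outer(e - k, u / (u @ u))
            else:                                              # a_i in R(A_{i-1})
                sigma = 1.0 + k @ k
                Ad = Ad + np.outer(e - k, Ad.T @ k) / sigma
            A_prev[:, i] = a                      # A_i = A_{i-1} + a_i e_i^T
        return Ad

    A = np.array([[1., 2., 3.], [4., 5., 9.], [7., 8., 15.], [1., 0., 1.]])  # 3rd column = 1st + 2nd
    print(np.linalg.norm(greville_pinv(A) - np.linalg.pinv(A)))              # ~ machine precision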

3 Preconditioning Algorithm

In this section, we construct our preconditioning algorithm according to Greville's method in Section 2. First of all, we notice that in Equation (2.10) the difference between the case a_i ∉ R(A_{i−1}) and the case a_i ∈ R(A_{i−1}) lies in the second term. If we define vectors k_i, u_i, v_i and a scalar f_i for i = 1, . . . , n as

    k_i = A_{i−1}† a_i,                                      (3.1)
    u_i = a_i − A_{i−1}k_i = (I − A_{i−1}A_{i−1}†)a_i,       (3.2)
    f_i = ‖u_i‖₂²        if a_i ∉ R(A_{i−1}),
          1 + ‖k_i‖₂²    if a_i ∈ R(A_{i−1}),                (3.3)
    v_i = u_i                if a_i ∉ R(A_{i−1}),
          (A_{i−1}†)^T k_i   if a_i ∈ R(A_{i−1}),            (3.4)


then we can express A_i† in a unified form for general matrices as

    A_i† = A_{i−1}† + (1/f_i)(e_i − k_i)v_i^T,   (3.5)

and we have

    A† = Σ_{i=1}^{n} (1/f_i)(e_i − k_i)v_i^T.    (3.6)

If we define matrices K ∈ R^{n×n}, V ∈ R^{m×n}, F ∈ R^{n×n} as

    K = [k_1, . . . , k_n],                      (3.7)
    V = [v_1, . . . , v_n],                      (3.8)
    F = diag(f_1, . . . , f_n),                  (3.9)

we obtain a matrix factorization of A† as follows.

Theorem 2 Let A ∈ R^{m×n} and rank(A) ≤ min{m, n}. Using the above notations, the Moore-Penrose inverse of A has the factorization

    A† = (I − K)F^{−1}V^T.                       (3.10)

Here I is the identity matrix of order n, K is a strictly upper triangular matrix, and F is a diagonal matrix whose diagonal elements are all positive.

If A has full column rank, then

    V = A(I − K),                                (3.11)
    A† = (I − K)F^{−1}(I − K)^T A^T.             (3.12)

Proof Denote Ā_i = [a_1, . . . , a_i]. Since

    k_i = A_{i−1}† a_i                                       (3.13)
        = [a_1, . . . , a_{i−1}, 0, . . . , 0]† a_i           (3.14)
        = [Ā_{i−1}, 0, . . . , 0]† a_i                        (3.15)
        = [Ā_{i−1}†; 0] a_i                                   (3.16)
        = (k_{i,1}, . . . , k_{i,i−1}, 0, . . . , 0)^T,       (3.17)

K = [k_1, . . . , k_n] is a strictly upper triangular matrix.

Since u_i = 0 ⇔ a_i ∈ R(A_{i−1}),

    f_i = ‖u_i‖₂²        if a_i ∉ R(A_{i−1}),
          1 + ‖k_i‖₂²    if a_i ∈ R(A_{i−1}).                 (3.18)

Thus f_i (i = 1, . . . , n) are always positive, which implies that F is a diagonal matrix with positive diagonal elements.

If A has full column rank, we have

    V = [u_1, . . . , u_n]                                        (3.19)
      = [(I − A_0A_0†)a_1, . . . , (I − A_{n−1}A_{n−1}†)a_n]      (3.20)
      = [a_1 − A_0k_1, . . . , a_n − A_{n−1}k_n]                  (3.21)
      = A − [A_0k_1, . . . , A_{n−1}k_n]                          (3.22)
      = A − [A_1k_1, . . . , A_nk_n]                              (3.23)
      = A(I − K).                                                 (3.24)

The second equality from the bottom follows from the fact that K is a strictly upper triangular matrix. Now, when A has full column rank, A† can be decomposed as

    A† = (I − K)F^{−1}V^T = (I − K)F^{−1}(I − K)^T A^T.   ✷      (3.25)

Remark 1 From the above proof, it is easy to see that when A has full column rank, (I − K)^{−T}F(I − K)^{−1} is an LDL^T decomposition of A^TA.

Based on Greville's method, we can construct a preconditioning algorithm. We want to construct a sparse approximation to the Moore-Penrose inverse of A. Hence, we perform some numerical droppings in the middle of the algorithm to maintain the sparsity of the preconditioner. We call the following algorithm the global Greville preconditioning algorithm, since it forms or updates the whole matrix at a time rather than column by column.

Algorithm 1 Global Greville preconditioning algorithm
1. set M_0 = 0
2. for i = 1 : n
3.   k_i = M_{i−1}a_i
4.   u_i = a_i − A_{i−1}k_i
5.   if ‖u_i‖₂ is not small
6.     f_i = ‖u_i‖₂²
7.     v_i = u_i
8.   else
9.     f_i = 1 + ‖k_i‖₂²
10.    v_i = M_{i−1}^T k_i
11.  end if
12.  M_i = M_{i−1} + (1/f_i)(e_i − k_i)v_i^T
13.  perform numerical droppings on M_i
14. end for
15. Obtain M_n ≈ A†.
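The following NumPy sketch of Algorithm 1 is meant only as an illustration: the dropping rule (zeroing entries of M_i below a fixed threshold) is a simplification assumed here, whereas the experiments in Section 8 use the rule (8.1), and the switching test is the one discussed later in Section 7.1.

    import numpy as np

    # Sketch of the global Greville preconditioning algorithm (Algorithm 1).
    def global_greville(A, drop_tol=1e-3, tau_s=1e-10):
        m, n = A.shape
        M = np.zeros((n, m))                      # M_0 = 0
        normA = 0.0                               # running ||A_{i-1}||_F
        for i in range(n):
            a = A[:, i]
            k = M @ a                             # k_i = M_{i-1} a_i
            u = a - A[:, :i] @ k[:i]              # u_i = a_i - A_{i-1} k_i
            e = np.zeros(n); e[i] = 1.0
            if np.linalg.norm(u) > tau_s * normA * np.linalg.norm(a):   # ||u_i||_2 "not small"
                f = u @ u
                v = u
            else:                                 # column judged linearly dependent
                f = 1.0 + k @ k
                v = M.T @ k
            M = M + np.outer(e - k, v) / f        # M_i = M_{i-1} + (1/f_i)(e_i - k_i) v_i^T
            M[np.abs(M) < drop_tol] = 0.0         # simplified numerical dropping
            normA = np.hypot(normA, np.linalg.norm(a))                  # ||A_i||_F
        return M                                  # M_n ~ A^+

    A = np.random.default_rng(1).standard_normal((30, 20))
    print(np.linalg.norm(global_greville(A, drop_tol=0.0) - np.linalg.pinv(A)))  # exact when nothing is dropped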

Remark 2 In Algorithm 1, we do not need to store k_i, v_i, f_i, i = 1, . . . , n, because we form M_i explicitly.

Remark 3 In Algorithm 1, we need to update M_i at every step, but we do not need to update the whole matrix, since only the first i − 1 rows of M_{i−1} can be nonzero. Hence, to compute M_i, we update the first i − 1 rows of M_{i−1} and then add one new nonzero row as the i-th row.
one new nonzero row to be the i-th row.



Remark 4 If k_i is sparse, then when we update M_{i−1}, we only need to update the rows which correspond to the nonzero elements of k_i. Hence, the rank-one update can be efficient.

Remark 5 When A is nonsingular, the inverse of A can also be computed by the Sherman-Morrison formula, which is likewise a rank-one update algorithm. Following the Sherman-Morrison formula, an incomplete factorization preconditioner can also be developed; we refer readers to [7,8].

If we want to construct the matrices K, F and V without forming M_i explicitly, we can use a vector-wise version of the above algorithm. In Algorithm 1, the column vectors of K are constructed one column per step, and u_i, f_i, v_i are defined according to k_i. Hence, it is possible to rewrite Algorithm 1 in a vector-wise form.

Since u_i can be computed as a_i − A_{i−1}k_i, which does not refer to M_{i−1} explicitly, to vectorize Algorithm 1 we only need to form k_i, and v_i = M_{i−1}^T k_i when linear dependence occurs, without using M_{i−1} explicitly.

Consider the case when numerical droppings are not used. Since we already know that

    A† = (I − K)F^{−1}V^T
       = (I − [k_1 . . . k_n]) diag(f_1^{−1}, . . . , f_n^{−1}) [v_1, . . . , v_n]^T
       = Σ_{i=1}^{n} (e_i − k_i)(1/f_i)v_i^T,

for any integer 1 ≤ p ≤ n it is easy to see that

    A_p† = Σ_{i=1}^{p} (e_i − k_i)(1/f_i)v_i^T.                  (3.26)

Therefore, we have

    v_i = (A_{i−1}†)^T k_i                                       (3.27)
        = (Σ_{p=1}^{i−1} (e_p − k_p)(1/f_p)v_p^T)^T k_i          (3.28)
        = Σ_{p=1}^{i−1} (1/f_p) v_p (e_p − k_p)^T k_i,           (3.29)

and

    k_i = A_{i−1}† a_i                                                                               (3.30)
        = Σ_{p=1}^{i−1} (e_p − k_p)(1/f_p)v_p^T a_i                                                  (3.31)
        = Σ_{p=1}^{i−2} (e_p − k_p)(1/f_p)v_p^T a_i + (e_{i−1} − k_{i−1})(1/f_{i−1})v_{i−1}^T a_i    (3.32)
        = A_{i−2}† a_i + (e_{i−1} − k_{i−1})(1/f_{i−1})v_{i−1}^T a_i.                                (3.33)


To make this clearer, starting from the last column of K, the chain of required quantities can be shown as follows:

    k_n = A_{n−1}† a_n  requires  A_{n−2}† a_n  and  k_{n−1} = A_{n−2}† a_{n−1};
    A_{n−2}† a_n  requires  A_{n−3}† a_n  and  k_{n−2} = A_{n−3}† a_{n−2};
    k_{n−1} = A_{n−2}† a_{n−1}  requires  A_{n−3}† a_{n−1}  and  k_{n−2} = A_{n−3}† a_{n−2};
    and so on.

In other words, we need to compute every A_i† a_k, k = i + 1, . . . , n. Denote A_i† a_j, j > i, by k_{i,j}; in this sense, k_i = k_{i−1,i}. In the algorithm, k_{i,j}, j > i, is stored in the j-th column of K. If j = i + 1, then k_{i,j} = k_j and it is not updated any more; if j > i + 1, k_{i,j} is updated to k_{i+1,j} and still stored in the same position.

Based on the above discussion, and adding the numerical dropping strategy, we obtain the following algorithm, in which we omit the first subscript of k_{i,j}.

Algorithm 2 Vector-wise Greville preconditioning algorithm
1. set K = 0_{n×n}
2. for i = 1 : n
3.   u = a_i − A_{i−1}k_i
4.   if ‖u‖₂ is not small
5.     f_i = ‖u‖₂²
6.     v_i = u
7.   else
8.     f_i = ‖k_i‖₂² + 1
9.     v_i = (A_{i−1}†)^T k_i = Σ_{p=1}^{i−1} (1/f_p) v_p (e_p − k_p)^T k_i
10.  end if
11.  for j = i + 1, . . . , n
12.    k_j = k_j + (v_i^T a_j / f_i)(e_i − k_i)
13.    perform numerical droppings on k_j
14.  end for
15. end for
16. Obtain K = [k_1, . . . , k_n], F = Diag{f_1, . . . , f_n}, V = [v_1, . . . , v_n].
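A NumPy sketch of the vector-wise construction without numerical dropping is given below; it also checks the factorization A† = (I − K)F^{−1}V^T of Theorem 2 on a small rank deficient matrix. The dependence test via the tolerance tol and the test matrix itself are assumptions of the sketch.

    import numpy as np

    # Vector-wise Greville construction (Algorithm 2 without dropping).
    def vector_wise_greville(A, tol=1e-10):
        m, n = A.shape
        K = np.zeros((n, n))
        V = np.zeros((m, n))
        f = np.zeros(n)
        for i in range(n):
            a = A[:, i]
            k = K[:, i].copy()                    # k_i = A_{i-1}^+ a_i, accumulated in earlier steps
            u = a - A[:, :i] @ k[:i]
            if np.linalg.norm(u) > tol * np.linalg.norm(a):
                f[i] = u @ u                      # independent column
                V[:, i] = u
            else:                                 # dependent column: v_i = (A_{i-1}^+)^T k_i
                f[i] = 1.0 + k @ k
                for p in range(i):
                    e_p = np.zeros(n); e_p[p] = 1.0
                    V[:, i] += (V[:, p] / f[p]) * ((e_p - K[:, p]) @ k)
            e_i = np.zeros(n); e_i[i] = 1.0
            for j in range(i + 1, n):             # line 12: k_j += (v_i^T a_j / f_i)(e_i - k_i)
                K[:, j] += (V[:, i] @ A[:, j]) / f[i] * (e_i - k)
        return K, f, V

    rng = np.random.default_rng(2)
    A = rng.standard_normal((8, 4)) @ rng.standard_normal((4, 6))    # rank 4, two dependent columns
    K, f, V = vector_wise_greville(A)
    M = (np.eye(6) - K) @ np.diag(1.0 / f) @ V.T                     # Theorem 2: (I - K) F^{-1} V^T
    print(np.linalg.norm(M - np.linalg.pinv(A)))                     # ~ machine precision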

4 The Special Case When A is Full Column Rank

In this section, we take a closer look at the full column rank case. When A has full column rank, Algorithm 2 can be simplified as follows.

Algorithm 3 Vector-wise Greville preconditioning algorithm for full column rank matrices
1. set K = 0_{n×n}
2. for i = 1 : n
3.   u_i = a_i − A_{i−1}k_i
4.   f_i = ‖u_i‖₂²
5.   for j = i + 1, . . . , n
6.     k_j = k_j + (u_i^T a_j / f_i)(e_i − k_i)
7.     perform numerical droppings on k_j
8.   end for
9. end for
10. Obtain K = [k_1, . . . , k_n], F = Diag{f_1, . . . , f_n}.

In Algorithm 3,

    u_i = a_i − A_{i−1}k_i
        = [a_1, . . . , a_i, 0, . . . , 0] (−k_{i,1}, . . . , −k_{i,i−1}, 1, 0, . . . , 0)^T
        = A_i(e_i − k_i)
        = A(e_i − k_i).

If we denote e_i − k_i by z_i, then u_i = Az_i.

Line 6 of Algorithm 3 can then be rewritten as

    k_j = k_j + (u_i^T a_j / ‖u_i‖₂²)(e_i − k_i),

that is,

    e_j − k_j = e_j − k_j − (u_i^T a_j / ‖u_i‖₂²)(e_i − k_i),
    z_j = z_j − (u_i^T a_j / ‖u_i‖₂²) z_i.

Denote d_i = ‖u_i‖₂² and θ = u_i^T a_j / d_i. Then, combining all the new notations, we can rewrite the algorithm as follows.

Algorithm 4
1. set Z = I_{n×n}
2. for i = 1 : n
3.   u_i = Az_i
4.   d_i = (u_i, u_i)
5.   for j = i + 1, . . . , n
6.     θ = (u_i, a_j)/d_i
7.     z_j = z_j − θz_i
8.   end for
9. end for
10. Obtain Z = [z_1, . . . , z_n], D = Diag{d_1, . . . , d_n}.

Remark 6 Since z_i = e_i − k_i, we have Z = I − K. Denoting D = Diag{d_1, . . . , d_n}, the factorization of A† in Theorem 2 can be rewritten as

    A† = ZD^{−1}Z^TA^T.                         (4.1)


Remark 7 In Algorithm 4, the definition of θ can be rewritten as

    θ = (u_i, a_j)/d_i                          (4.2)
      = (Az_i, Ae_j)/(Az_i, Az_i)               (4.3)
      = (z_i, e_j)_{A^TA} / (z_i, z_i)_{A^TA}.  (4.4)

When A has full column rank, A^TA is symmetric positive definite, so (·, ·)_{A^TA} defines an inner product. Hence, Algorithm 4 includes an A^TA-orthogonalization procedure. Based on this observation, we conclude that for the special case in which A has full column rank, our algorithm is very similar to the RIF preconditioner proposed by Benzi and Tůma [3]. The difference lies in the implementation: our algorithm looks for an incomplete factorization of A†, while RIF looks for an LDL^T decomposition of A^TA.
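A small numerical check of this observation: for a random full column rank matrix (an arbitrary test matrix assumed here), the vectors z_i produced by Algorithm 4 are mutually A^TA-orthogonal and reproduce the factorization (4.1).

    import numpy as np

    # Algorithm 4 as an A^T A-orthogonalization of e_1, ..., e_n.
    rng = np.random.default_rng(3)
    A = rng.standard_normal((12, 5))              # full column rank with probability one

    n = A.shape[1]
    Z = np.eye(n)
    d = np.zeros(n)
    for i in range(n):
        u = A @ Z[:, i]                           # u_i = A z_i
        d[i] = u @ u                              # d_i = (u_i, u_i)
        for j in range(i + 1, n):
            theta = (u @ A[:, j]) / d[i]          # theta = (u_i, a_j) / d_i
            Z[:, j] -= theta * Z[:, i]            # z_j = z_j - theta z_i

    print(np.linalg.norm(Z.T @ (A.T @ A) @ Z - np.diag(d)))                      # Z^T A^T A Z = D
    print(np.linalg.norm(Z @ np.diag(1.0 / d) @ Z.T @ A.T - np.linalg.pinv(A)))  # checks (4.1)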

5 Equivalence Condition

Consider solving the least squares problem (1.1) by transforming it into the left-preconditioned form

    min_{x∈R^n} ‖Mb − MAx‖₂,                    (5.1)

where A ∈ R^{m×n}, M ∈ R^{n×m}, and b ∈ R^m is a right-hand-side vector.

When preconditioning a least squares problem, one important issue is whether the solution of the preconditioned problem is a solution of the original problem. For nonsingular linear systems, the condition for this equivalence is that the preconditioner M should be nonsingular. Since we are dealing with general rectangular matrices, we need other conditions to ensure that the preconditioned problem (5.1) is equivalent to the original least squares problem (1.1).

First, note the following lemma [14].

Lemma 1 The conditions

    ‖b − Ax‖₂ = min_{x∈R^n} ‖b − Ax‖₂   and   ‖Mb − MAx‖₂ = min_{x∈R^n} ‖Mb − MAx‖₂

are equivalent for all b ∈ R^m if and only if R(A) = R(M^TMA).

To analyze the equivalence between the original problem (1.1) and the preconditioned problem (5.1), where M comes from any of our algorithms, we first consider the simple case in which A has full column rank. After numerical droppings, we have

    A† ≈ M = ZF^{−1}Z^TA^T.                     (5.2)

Note that Z is a unit upper triangular matrix and F is a diagonal matrix with positive elements. Hence, we have

    M = CA^T,                                   (5.3)


where C is a nonsingular matrix. Hence, we have

    R(M^T) = R(A).                              (5.4)

For the general case, K and F are still nonsingular. However, the expression for V is not straightforward. To simplify the problem, we make the following assumption.

Assumption 1
1. There is no zero column in A.
2. Our preconditioning algorithm can detect all the linear independence in A.

The definition of v_i can be rewritten as follows:

    v_i = a_i − Ak_i ∈ R(A_i) \ R(A_{i−1})          if a_i ∉ R(A_{i−1}),
          (A_{i−1}†)^T k_i ∈ R(A_{i−1}) = R(A_i)    if a_i ∈ R(A_{i−1}).    (5.5)

Hence, we have

    span{v_1, v_2, . . . , v_i} = span{a_1, a_2, . . . , a_i}.   (5.6)

On the other hand, note line 12 of Algorithm 1,

    M_i = M_{i−1} + (1/f_i)(e_i − k_i)v_i^T,    (5.7)

which implies that every row of M_i is a linear combination of the vectors v_k^T, 1 ≤ k ≤ i, i.e.,

    R(M_i^T) = span{v_1, . . . , v_i}.          (5.8)

Based on the above discussion, we obtain the following theorem.

Theorem 3 Let A ∈ R^{m×n}, m ≥ n. If Assumption 1 holds, then

    R(M^T) = R(V) = R(A),                       (5.9)

where M is the approximate Moore-Penrose inverse constructed by Algorithm 1.

Remark 8 In Assumption 1, we assume that our algorithms can detect all the linear independence in the columns of A. Hence, we allow mistakes in which a linearly dependent column is recognized as a linearly independent column. An extreme case is that we recognize all the columns of A as linearly independent, i.e., we take A to be a full column rank matrix. In this sense, our assumption can always be satisfied.

Hence, for any matrix A ∈ R^{m×n} we have R(A) = R(M^T), and we obtain the following theorem.

Theorem 4 If Assumption 1 holds, then for all b ∈ R^m, the preconditioned least squares problem (5.1), where M is constructed by Algorithm 1, is equivalent to the original least squares problem (1.1).
original least squares problem (1.1).


Proof If Assumption 1 holds, we have

    R(A) = R(M^T).                              (5.10)

Then there exists a nonsingular matrix C ∈ R^{n×n} such that A = M^TC. Hence,

    R(M^TMA) = R(M^TMM^TC)                      (5.11)
             = R(M^TMM^T)                       (5.12)
             = R(M^TM)                          (5.13)
             = R(M^T)                           (5.14)
             = R(A).                            (5.15)

In the above equalities we used the relationships R(MM^T) = R(M) and R(M^TM) = R(M^T). By Lemma 1 we complete the proof. ✷

6 Breakdown-free Condition

In this section, we assume without loss of generality that the first r columns of A are linearly independent, so that

    R(A) = span{a_1, . . . , a_r},              (6.1)

where rank(A) = r and a_i (i = 1, . . . , r) is the i-th column of A. This is no restriction, because a column pivoting can easily be incorporated into Algorithm 1. Then we have

    a_i ∈ span{a_1, . . . , a_r},  i = r + 1, . . . , n.   (6.2)

In this case, after performing Algorithm 1 with numerical dropping, the matrix V can be written in the form

    V = [v_1, . . . , v_r, v_{r+1}, . . . , v_n].   (6.3)

If we denote [v_1, . . . , v_r] by U_r, then

    U_r = A(I − K)I_r,   I_r = [I_{r×r}; 0] ∈ R^{n×r}.   (6.4)

There exists a matrix H ∈ R^{r×(n−r)}, possibly rank deficient, such that

    [v_{r+1}, . . . , v_n] = U_rH = A(I − K)I_rH.   (6.5)

Then V is given by

    V = [v_1, . . . , v_r, v_{r+1}, . . . , v_n]    (6.6)
      = [U_r, U_rH]                                 (6.7)
      = U_r [I_{r×r}  H]                            (6.8)
      = A(I − K)I_r [I_{r×r}  H]                    (6.9)
      = A(I − K) [I_{r×r}  H; 0  0].                (6.10)


Hence,

    M = (I − K)F^{−1} [I_{r×r}  0; H^T  0] (I − K)^T A^T.   (6.11)

From the above equation, we can also see that the difference between the full column rank case and the rank deficient case lies in the block matrix

    [I_{r×r}  0; H^T  0],                       (6.12)

which would be an identity matrix if A had full column rank.

If there is no numerical dropping, M is the Moore-Penrose inverse of A,

    A† = (I − K̂)F̂^{−1} [I_{r×r}  0; Ĥ^T  0] (I − K̂)^T A^T.   (6.13)

Comparing Equation (6.11) and Equation (6.13), we have the following theorem.

Theorem 5 Let A ∈ R^{m×n}. If Assumption 1 holds, then

    R(M) = R(A†) = R(A^T),                      (6.14)

where M denotes the approximate Moore-Penrose inverse constructed by Algorithm 1.

Based on Theorem 3 and Theorem 5, we have the following theorem, which ensures that the GMRES method can determine a solution to the preconditioned problem MAx = Mb before breakdown happens, for any b ∈ R^m.

Theorem 6 Let A ∈ R^{m×n}. If Assumption 1 holds, then GMRES can determine a least squares solution to

    min_{x∈R^n} ‖MAx − Mb‖₂                     (6.15)

before breakdown happens, for all b ∈ R^m, where M is constructed by Algorithm 1.

Proof According to Theorem 2.1 in Brown and Walker [6], we only need to prove

    N(MA) = N(A^TM^T),                          (6.16)

which is equivalent to

    R(MA) = R(A^TM^T).                          (6.17)

Using the result of Theorem 3, there exists a nonsingular matrix T ∈ R^{n×n} such that A = M^TT. Hence,

    R(MA) = R(MM^TT)                            (6.18)
          = R(MM^T)                             (6.19)
          = R(M).                               (6.20)

On the other hand,

    R(A^TM^T) = R(A^TAT^{−1})                   (6.21)
              = R(A^TA)                         (6.22)
              = R(A^T).                         (6.23)

The proof is completed using Theorem 5.


Remark 9 From the proof of the above theorem, we have R(MA) = R(M), which means that the preconditioned least squares problem (5.1) is a consistent problem.

Thus, by Theorem 4 and Theorem 6, we finally obtain our main result for our Greville preconditioner M.

Theorem 7 Let A ∈ R^{m×n}. If Assumption 1 holds, then for any b ∈ R^m, preconditioned GMRES determines a least squares solution of

    min_{x∈R^n} ‖MAx − Mb‖₂                     (6.24)

before breakdown, and this solution attains min_{x∈R^n} ‖b − Ax‖₂, where M is computed by Algorithm 1.

Remark 10 Since all the proofs are based on the factorization M = (I − K)F^{−1}V^T, the theorems also hold for Algorithm 2.

7 Implementation Considerations

7.1 Detecting Linear Dependence

In Algorithm 1 and Algorithm 2, one important issue is how to recognize the columns which are linearly dependent on the previous columns. In the algorithms, we check whether ‖u_i‖₂ is small to detect rank deficiency. Simply speaking, we can set a tolerance τ in advance and switch to the "else" branch when ‖u_i‖₂ < τ. However, is this good enough to detect the linearly dependent columns of A when we perform numerical droppings? To address this issue, we first take a look at the RIF preconditioning algorithm.

The RIF preconditioner was developed for full rank matrices. However, numerical experiments showed that it also works for rank deficient matrices. Our equivalent Algorithm 3 gives a good insight into this phenomenon, since the only possibility for the RIF preconditioner to break down is when f_i = 0, which implies that u_i is a zero vector. From Algorithm 1 and Algorithm 2 we have

    u_i = a_i − A_{i−1}k_i                      (7.1)
        = a_i − A_{i−1}A_{i−1}†a_i              (7.2)
        = (I − A_{i−1}A_{i−1}†)a_i.             (7.3)

Thus, u_i is the projection of a_i onto R(A_{i−1})^⊥. Hence, in exact arithmetic u_i = 0 if and only if a_i ∈ R(A_{i−1}). Algorithm 1 and Algorithm 2 have an alternative branch for a_i ∈ R(A_{i−1}), i.e. when u_i = 0 they turn to the "else" case. However, this is not always necessary, because of the numerical droppings. With numerical droppings, the computed vector, which we denote by ū_i, is actually

    ū_i = a_i − A_{i−1}M_{i−1}a_i               (7.4)
        = a_i − A_{i−1}(A_{i−1}† + ΔE)a_i       (7.5)
        = a_i − A_{i−1}A_{i−1}†a_i − A_{i−1}ΔEa_i   (7.6)
        = −A_{i−1}ΔEa_i                         (7.7)
        ≠ 0,                                    (7.8)

where ΔE is an error matrix and (7.7) holds when a_i ∈ R(A_{i−1}). Hence, even though a_i ∈ R(A_{i−1}), since ū_i is not the exact projection of a_i onto R(A_{i−1})^⊥, the RIF algorithm will not necessarily break down when linear dependence happens.

The RIF preconditioner does not take the rank deficient or nearly rank deficient columns into consideration. Hence, if we can capture the rank deficient columns, we might achieve better acceleration of the convergence of the preconditioned problem. Assume that the M we compute by Algorithm 1 or Algorithm 2 can be viewed as an approximation to A† with error matrix E ∈ R^{n×m}, that is,

    M = A† + E.                                 (7.9)

First note a theoretical result about the perturbation lower bound of the generalized inverse.

Theorem 8 [18] If rank(A + E) ≠ rank(A), then

    ‖(A + E)† − A†‖₂ ≥ 1/‖E‖₂.                  (7.10)

By Theorem 8, if the rank of M = A† + E from our algorithm is not equal to the rank of A† (or A, since they have the same rank), then

    ‖M† − (A†)†‖₂ ≥ 1/‖E‖₂                      (7.11)
    ⇒ ‖M† − A‖₂ ≥ 1/‖E‖₂.                       (7.12)

The above inequality says that, if we denote M† = A + ΔA, then ‖ΔA‖₂ ≥ 1/‖E‖₂. Hence, when ‖E‖₂ is small, which means M is a good approximation to A†, M can be the exact generalized inverse of another matrix which is far from A, and the smaller ‖E‖₂ is, the further M† is from A. In this sense, if the rank of M is not the same as that of A, M may not be a good preconditioner.
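The discontinuity expressed by Theorem 8 is easy to observe numerically; in the sketch below, the rank-3 test matrix and the generic perturbation E (which raises the rank) are arbitrary choices made only for the illustration.

    import numpy as np

    # Illustration of Theorem 8: a rank-changing perturbation makes the
    # pseudoinverse jump by at least 1/||E||_2.
    rng = np.random.default_rng(4)
    A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 4))   # rank 3
    E = 1e-6 * rng.standard_normal((6, 4))                          # generic E: rank(A + E) = 4

    lhs = np.linalg.norm(np.linalg.pinv(A + E) - np.linalg.pinv(A), 2)
    rhs = 1.0 / np.linalg.norm(E, 2)
    print(lhs >= rhs, lhs, rhs)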

Thus, it is important to keep the rank of M equal to rank(A). Hence, when we perform our algorithm, we need to sparsify the preconditioner M, but at the same time we also want to capture as many rank deficient columns as possible, and so maintain the rank of M. To achieve this, it is clearly important to decide whether the exact value ‖u_i‖₂ = ‖(I − A_{i−1}A_{i−1}†)a_i‖₂ is close to zero or not, based on the computed value ‖ū_i‖₂ = ‖(I − A_{i−1}M_{i−1})a_i‖₂.

From the previous discussion, we have

    ū_i = u_i − A_{i−1}ΔEa_i.                   (7.13)

When a_i ∈ R(A_{i−1}), u_i = a_i − A_{i−1}A_{i−1}†a_i = 0. Then

    ū_i = −A_{i−1}ΔEa_i,                        (7.14)
    ‖ū_i‖₂ ≤ ‖A_{i−1}‖_F ‖a_i‖₂ ‖ΔE‖_F.         (7.15)

If we require ΔE to be small, we can use a tolerance τ_s: if

    ‖ū_i‖₂ ≤ τ_s ‖A_{i−1}‖_F ‖a_i‖₂,            (7.16)

we suppose that we have detected a column a_i which is in the range space of A_{i−1}. From now on we call τ_s the switching tolerance.


An appropriate switching tolerance τ_s is important for the efficiency of the preconditioner. When we choose a small τ_s, e.g. 10^{−8}, we have less chance of detecting the linearly dependent columns. On the other hand, when a relatively large τ_s, e.g. 10^{−3}, is chosen, the preconditioning algorithm tends to find more linearly dependent columns. When a large τ_s is taken, many linearly dependent columns can be found; however, it is also possible that Assumption 1 is violated, so that the original least squares problem cannot be solved. When a relatively small τ_s is taken, it is possible that few or no linearly dependent columns are found, and the preconditioner can lose some efficiency. When τ_s is fixed, a larger dropping tolerance τ_d makes it less likely that linearly dependent columns are detected, and a smaller τ_d makes it more likely. Choosing optimal τ_s and dropping tolerance τ_d to achieve a balance between them remains an open problem.

Another issue related to the dropping tolerance τ_d is how much memory is needed to store the preconditioner. From the previous discussion, we know that R(M^T) = span{v_1, . . . , v_n}. When numerical dropping is performed on V, this relation may no longer hold. Hence, in Algorithm 2 we only perform numerical droppings on the columns of K. In this way, as long as Assumption 1 holds, we have R(M^T) = span{v_1, . . . , v_n}. However, on the other hand, we cannot control the sparsity of V directly. We will discuss this issue again at the end of this section.

7.2 Right-Preconditioning

So far we have assumed A ∈ R^{m×n} with m ≥ n and only discussed left-preconditioning. When m ≥ n, it is better to perform left-preconditioning, since the size of the preconditioned problem is smaller. When m ≤ n, right-preconditioning is better, i.e., solving the preconditioned problem

    min_{y∈R^m} ‖b − ABy‖₂.                     (7.17)

When m ≤ n, Theorem 3 and Theorem 5 still hold, since their proofs did not use the fact that m ≥ n. Then, according to the following lemma from [14], it is easy to obtain analogous conclusions for right-preconditioning.

Lemma 2 min_{x∈R^n} ‖b − Ax‖₂ = min_{y∈R^m} ‖b − ABy‖₂ holds for any b ∈ R^m if and only if R(A) = R(AB).

Theorem 9 Let A ∈ R^{m×n}. If Assumption 1 holds, then for any b ∈ R^m we have min_{x∈R^n} ‖b − Ax‖₂ = min_{y∈R^m} ‖b − AMy‖₂, where M is a preconditioner constructed by Algorithm 1.

Proof From Theorem 5, we know that R(M) = R(A^T), which implies that there exists a nonsingular matrix C ∈ R^{m×m} such that M = A^TC. Hence,

    R(AM) = R(AA^TC)                            (7.18)
          = R(AA^T)                             (7.19)
          = R(A).                               (7.20)

Using Lemma 2 we complete the proof.


Theorem 10 Let A ∈ R^{m×n}. If Assumption 1 holds, then GMRES determines a solution to the right-preconditioned problem

    min_{y∈R^m} ‖b − AMy‖₂                      (7.21)

before breakdown happens.

Proof The proof is the same as that of Theorem 6. ✷

Theorem 11 Let A ∈ R^{m×n}. If Assumption 1 holds, then for any b ∈ R^m, GMRES determines a least squares solution of

    min_{y∈R^m} ‖b − AMy‖₂                      (7.22)

before breakdown, and this solution attains min_{x∈R^n} ‖b − Ax‖₂, where M is computed by Algorithm 1.
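As a minimal illustration of the right-preconditioned formulation, the sketch below applies SciPy's GMRES to the m × m operator AB and recovers x = By. Here B = A^T is used as a simple stand-in for the Greville preconditioner (the simplest choice mentioned in the introduction), so this is not the paper's implementation.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    # Right preconditioning (7.17): solve min ||b - A B y||_2 with GMRES, then x = B y.
    rng = np.random.default_rng(5)
    m, n = 30, 50                                  # underdetermined case, m < n
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)
    B = A.T                                        # stand-in preconditioner

    op = LinearOperator((m, m), matvec=lambda y: A @ (B @ y))
    y, info = gmres(op, b)
    x = B @ y
    print(info, np.linalg.norm(A @ x - b))         # consistent system: small residual (up to the GMRES tolerance)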

We would like to remark that when m < n it is preferable to apply Algorithm 1 and Algorithm 2 to A^T rather than to A, for the following reasons. By doing so, we construct M̂, an approximate generalized inverse of A^T, and we can then use M̂^T as the preconditioner for the original least squares problem.

1. In Algorithm 1 and Algorithm 2, the approximate generalized inverse is constructed row by row. Hence, we perform a loop which goes through all the columns of A once. When m ≥ n, this loop is relatively short. However, when m < n, this loop could become very long, and the preconditioning would be more time-consuming.

2. Another reason is that linear dependence will always occur in this case: if m < n, then even when A has full row rank, at least n − m columns of A are linearly dependent on the previous columns.

In our implementation we use the following left-looking variant of Algorithm 2, in which each column k_i receives all of its updates at once.

Algorithm 5 Left-looking vector-wise Greville preconditioning algorithm
1. set K = 0_{n×n}
2. for i = 1 : n
3.   for j = 1, . . . , i − 1
4.     k_i = k_i + (v_j^T a_i / f_j)(e_j − k_j)
5.     perform numerical droppings on k_i
6.   end for
7.   u = a_i − A_{i−1}k_i
8.   if ‖u‖₂ > τ_s ∗ ‖A_{i−1}‖_F ∗ ‖a_i‖₂
9.     f_i = ‖u‖₂²
10.    v_i = u
11.  else
12.    f_i = ‖k_i‖₂² + 1
13.    v_i = (A_{i−1}†)^T k_i = Σ_{p=1}^{i−1} (1/f_p) v_p (e_p − k_p)^T k_i
14.  end if
15. end for
16. Obtain K = [k_1, . . . , k_n], F = Diag{f_1, . . . , f_n}, V = [v_1, . . . , v_n].

In Algorithm 2, all the columns k_j, j = i + 1, . . . , n, are updated within one outer loop, whereas in Algorithm 5 all the updates for each column k_i are finished at one time. This left-looking approach is suggested in [1,5]; according to [1], a left-looking version is sometimes more robust and efficient. In Section 8, our preconditioners are computed by Algorithm 5.

Since we cannot control the sparsity of V, and V can be very dense, we try not to store V. Note that when a column a_j is recognized as linearly independent, the corresponding column is v_j = A(e_j − k_j). Hence, we do not have to store v_j: when we need v_j in the 4th line of Algorithm 5, we can compute θ = a_i^T A(e_j − k_j), where the row vector a_i^T A is computed once, used for the i − 1 values of j, and then discarded. For a v_j corresponding to a linearly dependent column, we do need to store it.

8 Numerical Examples

In this section, we use matrices from the University of Florida Sparse Matrix Collection [11] to test our algorithms, where zero rows are omitted, and if the original matrix has more columns than rows we use its transpose. All computations were run on a Dell Precision 690 with a 3 GHz CPU and 16 GB of memory; the programming language and compiling environment was GNU C/C++ 3.4.3 under Red Hat Linux. Detailed information on the matrices is given in Table 1.


Table 1 Information on the test matrices

Name       m      n      rank   density(%)   cond
vtp base   346    198    198    1.53         3.55 × 10^7
capri      482    271    271    1.45         8.09 × 10^3
scfxm1     600    330    330    1.38         2.42 × 10^4
scfxm2     1200   660    660    0.69         2.42 × 10^4
perold     1560   625    625    0.65         5.13 × 10^5
scfxm3     1800   990    990    0.51         1.39 × 10^3
maros      1966   846    846    0.61         1.95 × 10^6
pilot we   2928   722    722    0.44         4.28 × 10^5
stocfor2   3045   2157   2157   0.14         2.84 × 10^4
bnl2       4486   2324   2324   0.14         7.77 × 10^3
pilot      4860   1441   1441   0.63         2.66 × 10^3
d2q06c     5831   2171   2171   0.26         1.33 × 10^5
80bau3b    11934  2262   2262   0.09         567.23
beaflw     500    492    460    21.71        1.52 × 10^9
beause     505    492    459    17.93        6.74 × 10^8
lp cycle   3371   1890   1875   0.33         1.46 × 10^7
Pd rhs     5804   4371   4368   0.02         3.36 × 10^8
Model9     10939  2879   2787   0.18         7.90 × 10^20
landmark   71952  2673   2671   0.60         1.02 × 10^8

In Table 1, the matrices in the first group have full column rank, and the matrices in the second group are rank deficient. The condition number of A is given by σ_1(A)/σ_r(A), where r = rank(A). We construct the preconditioner M by Algorithm 5 and perform the BA-GMRES method [14] given below, where B is the preconditioner.

Algorithm 6 BA-GMRES
1. Choose x_0, r̃_0 = B(b − Ax_0)
2. v_1 = r̃_0/‖r̃_0‖₂
3. for i = 1, 2, . . . , k
4.   w_i = BAv_i
5.   for j = 1, 2, . . . , i
6.     h_{j,i} = (w_i, v_j)
7.     w_i = w_i − h_{j,i}v_j
8.   end for
9.   h_{i+1,i} = ‖w_i‖₂
10.  v_{i+1} = w_i/h_{i+1,i}
11.  Find y_i ∈ R^i which minimizes ‖r̃_i‖₂ = ‖ ‖r̃_0‖₂ e_1 − H̄_i y ‖₂
12.  x_i = x_0 + [v_1, . . . , v_i]y_i
13.  r_i = b − Ax_i
14.  if ‖A^Tr_i‖₂ < ε stop
15. end for
16. x_0 = x_k
17. Go to 1.
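A condensed sketch of the idea behind BA-GMRES is given below: GMRES is run on the n × n preconditioned system BAx = Bb. SciPy's restarted GMRES is used in place of Algorithm 6 and its ‖A^Tr‖-based stopping rule, and B = A^T again stands in for the Greville preconditioner, so this only illustrates the left-preconditioned formulation, not the implementation used in the experiments.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    # Left-preconditioned GMRES for min ||b - A x||_2: solve B A x = B b.
    rng = np.random.default_rng(6)
    m, n = 200, 80
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)                     # random b: the problem is inconsistent
    B = A.T                                        # stand-in for the Greville preconditioner

    op = LinearOperator((n, n), matvec=lambda x: B @ (A @ x))
    x, info = gmres(op, B @ b)
    print(info, np.linalg.norm(A.T @ (b - A @ x)) / np.linalg.norm(A.T @ b))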

BA-GMRES is a method that solves least squares problems with GMRES by preconditioning the original problem with a suitable left preconditioner.

In the following examples, the right-hand-side vectors b are generated randomly by Matlab, so that the least squares problems we solve are inconsistent. In this section, our dropping rule is to drop the i-th element of k_j, i.e., k_{j,i}, when the following two inequalities hold:

    |k_{j,i}| ‖a_i‖₂ < τ_d   and   |k_{j,i}| < τ_d′ · max_{l=1,...,i−1} |k_{l,i}|.   (8.1)

Here we have another tolerance τ_d′; in the following examples, we simply fix it to 10.0 ∗ τ_d. We use

    ‖u_i‖₂ ≤ τ_s ‖A_{i−1}‖_F ‖a_i‖₂             (8.2)

as the criterion to judge whether column a_i is linearly dependent on the previous columns, where τ_s is our switching tolerance. The stopping rule for BA-GMRES is

    ‖A^T(b − Ax)‖₂ ≤ 10^{−8} · ‖A^Tb‖₂.         (8.3)

In the following tables, we use our preconditioner with GMRES and compare it with GMRES preconditioned by RIF, denoted by "RIF-GMRES"; CGLS [4] preconditioned by the Incomplete Modified Gram-Schmidt orthogonalization preconditioner (IMGS) [15], denoted by "IMGS-CGLS"; CGLS preconditioned by diagonal scaling, denoted by "D-CGLS"; and CGLS preconditioned by the Incomplete Cholesky Factorization preconditioner [4], denoted by "IC-CGLS", all solving the original least squares problem min_{x∈R^n} ‖b − Ax‖₂. The columns τ_d and τ_s list the dropping tolerance and the switching tolerance used in our preconditioning algorithm. The column "ITS" gives the number of iterations. The columns "Pre. T" and "Tot. T" give the preconditioning time and the total time in seconds, respectively. For "RIF-GMRES", "IMGS-CGLS", "D-CGLS" and "IC-CGLS", each cell gives the number of iterations and the total CPU time; for "RIF-GMRES" and "IMGS-CGLS", we give the parameters we used in brackets. In the "nnzP/nnzA" column, we give the ratio of the number of nonzero elements in our preconditioner to that in A, where nnzP is the number of nonzero elements in K plus the number of nonzero elements in the diagonal matrix F. The ∗ indicates the fastest method for each problem.

Table 2 Numerical Results for Full Rank Matrices

matrix     τ_d    nnzP/nnzA  ITS  Pre. T  Tot. T   RIF-GMRES           IMGS-CGLS           D-CGLS        IC-CGLS
vtp base   1.e-6  18.0       4    0.03    *0.03    5, 0.04 (1.e-9)     9, 0.10 (1.e-5)     3667, 0.29    1348, 0.12
capri      1.e-3  10.0       17   0.03    0.04     9, 0.07 (1.e-4)     371, 0.04 (1.0)     496, 0.06     195, *0.03
scfxm1     1.e-3  13.3       17   0.05    0.06     7, *0.05 (1.e-5)    116, 0.08 (1.e-1)   761, 0.11     318, 0.06
scfxm2     1.e-3  16.6       27   0.12    *0.18    12, 0.26 (1.e-5)    734, 0.28 (1.0)     1332, 0.42    680, 0.23
perold     1.e-2  12.4       17   0.17    0.21     45, 0.36 (1.e-4)    134, 0.26 (1.e-1)   1058, 0.36    394, *0.14
scfxm3     1.e-4  27.0       7    0.30    *0.34    12, 0.46 (1.e-5)    878, 0.53 (1.0)     1583, 0.72    847, 0.41
maros      1.e-6  28.8       3    1.02    *1.05    10, 1.13 (1.e-7)    383, 1.47 (1.e-1)   13714, 6.71   3602, 1.8
pilot we   1.e-1  4.5        41   0.10    0.20     12, 0.4 (1.e-5)     536, 0.37 (1.0)     736, 0.38     287, *0.14
stocfor2   1.e-5  152.2      5    10.55   10.68    16, 11.5 (1.e-7)    138, 3.14 (1.e-1)   2147, 1.59    1362, *1.12
bnl2       1.e-1  3.0        114  0.4     1.22     127, 2.79 (1.e-2)   598, 1.24 (1.0)     657, 0.64     240, *0.36
pilot      1.e-2  2.9        18   1.54    1.72     359, 4.57 (1.e-1)   741, 1.49 (1.0)     822, 1.24     269, *0.75
d2q06c     1.e-3  18.7       32   4.27    5.15     19, 10.68 (1.e-5)   1209, 3.6 (1.0)     2743, 4.05    1077, *1.63
80bau3b    5.e-1  0.5        28   0.46    0.52     26, 0.54 (5.e-1)    47, 1.34 (1.0)      56, *0.09     21, 0.21

For full rank matrices, we did not list τ_s, because we simply set τ_s to a very small number and do not try to detect any linearly dependent columns in the coefficient matrices. From the numerical results in Table 2, we conclude that our preconditioner performs competitively when the matrices are ill-conditioned. We can also see that our preconditioner is usually much denser than the original matrix A when the density of A is larger than 0.5%. When the density of A is less than 0.3%, the density of our preconditioner is usually similar to or less than that of A. In Table 3 we show the numerical results for rank deficient matrices. Column "D" gives the number of linearly dependent columns that we detected.

Table 3 Numerical Results for Rank Deficient Matrices

matrix     τ_d, τ_s        nnzP/nnzA  D    ITS   Pre. T   Tot. T
beaflw     1.e-5, 1.e-10   2.25       26   209   1.22     *2.25
beause     1.e-5, 1.e-11   2.71       31   20    1.16     *1.22
lp cycle   1.e-4, 1.e-7    33.5       15   27    2.17     *2.68
Pd rhs     1.e-4, 1.e-8    0.75       3    3     0.79     0.80
model09    1.e-2, 1.e-6    2.92       0    12    0.97     *1.12
landmark   1.e-1, 1.e-8    0.05       0    143   2.83     10.37

matrix     RIF-GMRES          IMGS-CGLS            D-CGLS       IC-CGLS
beaflw     †                  †                    †            †
beause     †                  †                    †            †
lp cycle   †                  5832, 4.87 (100.0)   6799, 6.74   †
Pd rhs     †                  25, 0.93 (0.1)       242, *0.29   89, 0.54
model09    46, 2.53 (1.e-3)   †                    1267, 3.09   412, 1.22
landmark   239, 38.43 (10.0)  252, 36.68 (10.0)    252, *8.77   †

†: GMRES did not converge in 2,000 steps or D-CGLS did not converge in 25,000 steps.

From the results in Table 3 we conclude that our preconditioner works for all the tested problems and performs competitively with the other preconditioners. The density of our preconditioner is about 3 times that of the original matrix A or less, except for lp cycle.

For the matrix lp cycle, we know that the rank deficient columns are

    182   184   216   237   253
    717   754   961   1221  1239
    1260  1261  1278  1640  1859,                (8.4)

15 columns in all. Since the rank deficient columns are not unique, the columns listed above are the columns which are linearly dependent on their previous columns. In the following example, we can see that our preconditioning algorithm can detect many of them.

Table 4 Numerical Results for lp cycle

τ_d     τ_s      deficiency detected        ITS   Pre. T   Its. T   Tot. T
1.e-6   1.e-10   −1239, −1261, −1278        6     3.65     0.16     3.81
1.e-6   1.e-7    detected all exactly       4     3.7      0.10     3.8
1.e-6   1.e-5    detected 95 columns        †                       4.7

In the above table, we tested our preconditioning algorithm with the dropping tolerance fixed to τ_d = 10^{−6} and different switching tolerances τ_s. For this problem lp cycle, RIF-preconditioned GMRES did not converge in 2000 steps for all the dropping tolerances τ_d we tried. The column "deficiency detected" lists the linearly dependent columns detected by our algorithm; an entry −j means that the j-th column, which is a rank deficient column, was missed by our algorithm. For example, when τ_d = 10^{−6} and τ_s = 10^{−10}, our preconditioning algorithm detected 12 rank deficient columns, missed the 1239-th, 1261-th and 1278-th columns, and did not wrongly label any linearly independent column as dependent; hence, Assumption 1 is satisfied. When τ_d = 10^{−6} and τ_s = 10^{−7}, our preconditioning algorithm found exactly the 15 linearly dependent columns. When τ_d = 10^{−6} and τ_s = 10^{−5}, many more than 15 columns were detected; thus Assumption 1 is not satisfied, and the preconditioned GMRES did not converge. From this example we can see that the numerical results agree with our theory very well. Figure 1 gives a better insight into this situation.

[Fig. 1 Convergence curves for lp cycle with τ_d = 10^{−6}: log_{10} ‖A^Tr‖₂/‖A^Tb‖₂ versus the iteration number, for switching tolerances τ_s = 10^{−5}, 10^{−7} and 10^{−10}.]

We observe that when the switching tolerance is τ_s = 10^{−5}, ‖A^Tr‖₂/‖A^Tb‖₂ stays at the level 10^{−1} in the end, which means that the computed solution is not a solution to the original least squares problem. This phenomenon illustrates that Assumption 1 is necessary.

9 Conclusion

In this paper, we proposed a new preconditioner for least squares problems. When the matrix A has full column rank, our preconditioning method is similar to the RIF preconditioner [3]. When A is rank deficient, our preconditioner is also rank deficient. We proved that under Assumption 1, using our preconditioners, the preconditioned problems are equivalent to the original problems. Also, under the same assumption, we showed that GMRES determines a solution to the preconditioned problem. Numerical experiments confirmed our theory and Assumption 1.

Numerical results showed that our preconditioner performed competitively both for ill-conditioned full column rank matrices and for ill-conditioned rank deficient matrices. For rank deficient problems, the RIF preconditioner may not work, whereas our preconditioner worked efficiently.

References

1. M. Benzi and M. Tůma, A sparse approximate inverse preconditioner for nonsymmetric linear systems, SIAM Journal on Scientific Computing, 19 (1998), pp. 968–994.
2. M. Benzi and M. Tůma, A robust incomplete factorization preconditioner for positive definite matrices, Numerical Linear Algebra with Applications, 10 (2003), pp. 385–400.
3. M. Benzi and M. Tůma, A robust preconditioner with low memory requirements for large sparse least squares problems, SIAM Journal on Scientific Computing, 25 (2003), pp. 499–512.
4. Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, PA, 1996.
5. M. Bollhöfer and Y. Saad, On the relations between ILUs and factored approximate inverses, SIAM Journal on Matrix Analysis and Applications, 24 (2002), pp. 219–237.
6. P. N. Brown and H. F. Walker, GMRES on (nearly) singular systems, SIAM Journal on Matrix Analysis and Applications, 18 (1997), pp. 37–51.
7. R. Bru, J. Cerdán, J. Marín, and J. Mas, Preconditioning sparse nonsymmetric linear systems with the Sherman–Morrison formula, SIAM Journal on Scientific Computing, 25 (2003), pp. 701–715.
8. R. Bru, J. Marín, J. Mas, and M. Tůma, Balanced incomplete factorization, SIAM Journal on Scientific Computing, 30 (2008), pp. 2302–2318.
9. S. L. Campbell and C. D. Meyer, Jr., Generalized Inverses of Linear Transformations, Pitman, London, 1979. Reprinted by Dover, New York, 1991.
10. E. Chow and Y. Saad, Approximate inverse preconditioners via sparse-sparse iterations, SIAM Journal on Scientific Computing, 19 (1998), pp. 995–1023.
11. T. A. Davis, University of Florida sparse matrix collection, NA Digest, 92 (1994).
12. J. A. Fill and D. E. Fishkind, The Moore–Penrose generalized inverse for sums of matrices, SIAM Journal on Matrix Analysis and Applications, 21 (2000), pp. 629–635.
13. T. N. E. Greville, Some applications of the pseudoinverse of a matrix, SIAM Review, 2 (1960), pp. 15–22.
14. K. Hayami, J.-F. Yin, and T. Ito, GMRES methods for least squares problems, Tech. Report NII-2007-009E, National Institute of Informatics, Tokyo, July 2007 (also to appear in SIAM Journal on Matrix Analysis and Applications).
15. Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics, Philadelphia, PA, second ed., 2003.
16. Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM Journal on Scientific and Statistical Computing, 7 (1986), pp. 856–869.
17. G. Wang, Y. Wei, and S. Qiao, Generalized Inverses: Theory and Computations, Science Press, Beijing, 2003.
18. P.-Å. Wedin, Perturbation theory for pseudo-inverses, BIT Numerical Mathematics, 13 (1973), pp. 217–232.
19. J.-F. Yin and K. Hayami, Preconditioned GMRES methods with incomplete Givens orthogonalization method for large sparse least-squares problems, Journal of Computational and Applied Mathematics, 226 (2009), pp. 177–186.
20. N. Zhang and Y. Wei, On the convergence of general stationary iterative methods for range-Hermitian singular linear systems, Numerical Linear Algebra with Applications, published online (2009).
