Gerard L.G. Sleijpen, Peter Sonneveld and Martin B. van Gijzen, Bi ...

More documents

Recommendations

Info

Author's personal copy 1108 G.L.G. Sleijpen et al. / Applied Numerical Mathematics 60 (2010) 1100–1114 Select an x 0 Select an ˜R0 Compute r 0 = b − Ax 0 k =−1, σ k = I, Set U k = 0, C k = 0, ˜Rk = 0. repeat s = r k+1 for j = 1,...,s u = s, Compute s = Au β k = σ −1 ( ˜R∗ k k s) u = u − U k β k , U k+1 e j = u s = s − C k β k , C k+1 e j = s end for k ← k + 1 σ k = ˜R∗ k C k, α k = σ −1 ( ˜R∗ k k r k) x k+1 = x k + U k α k , r k+1 = r k − C k α k end repeat Algorithm 3. Bi-CG. 0 is the n × s zero matrix, I is the s × s identity matrix. ˜Rk is an n × s matrix such that Span( ˜Rk ) + K k (A ∗ , ˜R0 ) equals K k+1 (A ∗ , ˜R0 ). Example 16. ˜Rk = ¯p k (A ∗ ) ˜R0 with p k (λ) = (1 − ω k λ) ···(1 − ω 1 λ). We will now explain how to construct a residual vector r k in K K (A, r 0 ) with K = ks + 1 that is orthogonal to K k (A ∗ , ˜R0 ). Let C 0 be the n × s matrix with columns Ar 0 ,...,A s r 0 .ThematrixC 0 is of the form AU 0 with U 0 explicitly available: U 0 =[r 0 ,...,A s−1 r 0 ]. Now, find a vector ⃗α 0 ∈ C s such that r 1 ≡ r 0 − C 0 ⃗α 0 ⊥ ˜R0 , r 1 ∈ K s+1 (A, r 0 ), and update x 0 as x 1 = x 0 + U 0 ⃗α 0 . Note that, with σ 0 ≡ ˜R∗ 0 C 0 any vector of the form w − C 0 σ −1 0 ˜R ∗ 0 w is orthogonal to ˜R0 : I − C 0 σ −1 0 ˜R 0 isaskewprojection onto the orthogonal complement of ˜R0 . Here, for ease of explanation, we assume that σ 0 is s × s non-singular. If σ 0 is singular (or ill conditioned), then s can be reduced to overcome breakdown or loss of accuracy (see Note 4 and [10] for more details). Let v = r 1 . We construct an n × s matrix C 1 orthogonal to ˜R0 as follows: s = Av, s = s − C 0 (σ −1 0 ˜R ∗ 0 s), C 1e j = s, v = s for j = 1,...,s. ˜R ∗ 0s). Note that more stable approaches as GCR (or 1 C 1, the vector Here, C 1 e j = s indicates that the j-column of C 1 is set to the vector s: e j is the jth (s-dimensional) standard basis vector. Then C 1 is orthogonal to ˜R0 and its columns form a basis for the Krylov subspace of order s generated by A 1 ≡ (I − C 0 σ −1 0 ˜R ∗ 0 )A and A 1r 1 . Note that there is a matrix U 1 such that C 1 = AU 1 . The columns of U 1 can be computed simultaneously with the columns of C 1 : U 1 e j = v − U 0 (σ −1 0 Arnoldi) for computing a basis of this Krylov subspace could have been used as well. Now, with σ 1 ≡ ˜R∗ r 2 ≡ r 1 − C 1 (σ −1 1 ˜R ∗ 1 r 1) is orthogonal to ˜R0 as well as to ˜R1 ,itbelongstoK 2s+1 (A, r 0 ), and x 2 = x 1 + U 1 (σ −1 1 ˜R ∗ 1 r 1) is the associated approximate solution. Repeating the procedure r k+1 = r k − C k ⃗α k ⊥ ˜Rk v = r k+1 for j = 1,...,s s = Av C k+1 e j = s − C k ⃗β j ⊥ ˜Rk v = C k+1 e j end for (11) leads to the residuals as announced: r k ∈ K ks+1 (A, r 0 ), r k ⊥ K k (A ∗ , ˜R0 ). Withσ k ≡ ˜R∗ k C k, the⃗α k and ⃗β j = ⃗β (k) can be j computed as ⃗α k = σ −1 ( ˜R∗ k k r k), ⃗β j = σ −1 ( ˜R∗ k k s). Simultaneously with the computation of the columns of C k+1, the columns of a matrix U k+1 can be computed. A similar remark applies to the update of x k . The resulting algorithm can be found in Algorithm 3. The columns of C k+1 form a Krylov basis of the Krylov subspace generated by A k+1 ≡ (I − C k σ −1 ˜R ∗ k k )A and A k+1r k . The generalization of Bi-CG that is presented above is related to the method ML(k)BiCG [13]. The orthogonalization conditions on the ML(k)BiCG residuals, however, are more strict. Note 4. If σ k = ˜R∗ k C k is (close to) singular, then reduce s. We can take the following approach (which has not been included in Algorithm 3).
Author's personal copy G.L.G. Sleijpen et al. / Applied Numerical Mathematics 60 (2010) 1100–1114 1109 Determine the singular value decomposition of σ k , σ k = Q Σ V ∗ with Q and V s × s unitary matrices, Σ = diag(ν 1 ,...,ν s ), with singular values ν j ordered such that ν 1 ··· ν s 0. Find p < s such that ν p >δ>ν p+1 for some appropriate (small) δ>0. Now, replace ˜Rk by ˜Rk Q ( : , 1: p) (hereweuseMATLAB’snotation)andC k by C k ( : , 1: p). Then the ‘new’ σ k has singular values ν 1 ,...,ν p . 7. Bi-CGSTAB We are now ready to formulate the Bi-CGSTAB version of IDR for s > 1. As before, for each k, select a polynomial p k of exact degree k, and put P k ≡ p k (A). For ease of notation, replace C k = C Bi-CG , U k k = U Bi-CG , and r k k = r Bi-CG in (11) by C k k , U k , and r k , respectively. Now (11) can be reformulated as (take ˜R k = ¯p k (A ∗ ) ˜R0 ) P k r k+1 = P k r k − P k C k ⃗α k ⊥ ˜R0 v = P k r k+1 for j = 1,...,s s = Av P k C k+1 e j = s − P k C k ⃗β j ⊥ ˜R0 v = P k C k+1 e j end for (12) Note that, for obtaining a formula for computing ⃗α k and ⃗β j = ⃗β (k) there is no need to refer to (11): it suffices to refer to j the orthogonality conditions in (12). To repeat the k-loop, represented in (12), k has to be increased, that is, P k+1 r k+1 and P k+1 C k+1 have to be computed from P k r k+1 and P k C k+1 , respectively. If P k+1 = (I − ω k A)P k , then these quantities can be obtained simply by multiplication by I − ω k A. This naive approach would require s + 1 additional multiplications by A. However, in the process of updating P k C k+1 , the vectors AP k r k+1 and AP k C k+1 e j ( j = 1,...,s − 1) are computed. The following loop exploits this information. P k r k+1 = P k r k − AP k U k α k ⊥ ˜R0 v = P k r k+1 , s = Av, AP k r k+1 = s for j = 1,...,s AP k U k+1 e j = s − AP k U k β j ⊥ ˜R0 P k U k+1 e j = v − P k U k β j v = AP k U k+1 e j , s = Av, A 2 P k U k+1 e j = s end for Select an ω k , P k+1 r k+1 = P k r k+1 − ω k AP k r k+1 for i = 0, 1, A i P k+1 U k+1 = A i P k U k+1 − ω k A i+1 P k U k+1 , end for (13) Here, AU k equals C k .WeusedU k in our formulation here since we also need A 2 U k and we want to limit the number of different symbols. Computing P k+1 r k+1 as soon as P k r k+1 is available, and computing P k+1 C k+1 e j as soon as P k C k+1 e j , is available leads to the following variant that requires less memory. P k r k+1 = P k r k − P k C k α k ⊥ ˜R0 v = P k r k+1 , s = Av Select an ω k , P k+1 r k+1 = v − ω k s for j = 1,...,s v = s − P k C k β j ⊥ ˜R0 s = Av, P k+1 C k+1 e j = v − ω k s end for (14) The additional matrix P k C k+1 need not to be stored. The formation of P k C k+1 and AP k C k+1 in the last step of (13) may be superfluous if P k+1 r k+1 = P k r k+1 − ω k AP k r k+1 turns out to be small enough. Unfortunately, this step cannot be avoided in (13), whereas (14) allows a ‘break’ after the computation of P k+1 r k+1 . Nevertheless, the approach in (13) may be useful, since it may allow easier generalization to other hybrid Bi-CG variants. Note that in coding (13) the matrix update of P k+1 C k+1 can exploit BLAS2 subroutines with better parallelization properties then the sequence of vector updates in (14). With r k ≡ P k r k , S k ≡ P k C k and v k ≡ P k r k+1 , we obtain the algorithm as displayed in the left panel of Algorithm 4.
Page 1 and 2: This article appeared in a journal
Page 3 and 4: Author's personal copy G.L.G. Sleij
Page 9: Author's personal copy G.L.G. Sleij

Gerard L.G. Sleijpen, Peter Sonneveld and Martin B. van Gijzen, Bi ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?