ble. As shown, all linear systems are now solved in a relatively small number of iterations,
with the exception of F2DB, which still takes 130 steps to converge with lfil = 1 (but only
10 with lfil = 5). In addition, observe a marked improvement in the operation counts and
error norms. Note that the operation counts shown in the column Kflops do not account for
the operations required in the set-up phase to build the preconditioners. For large values of
lfil, this set-up cost may be substantial.

Matrix   Iters   Kflops   Residual   Error
F2DA      18       964    0.47E-03   0.41E-04
F3D       14      3414    0.11E-02   0.39E-03
ORS        6       341    0.13E+00   0.60E-04
F2DB     130      7167    0.45E-02   0.51E-03
FID       59     19112    0.19E+00   0.11E-03

A test run of GMRES(10)-ILUT(1) preconditioning.

If the total time to solve one linear system is considered, a typical curve of this total
time as the lfil parameter varies would look like the plot shown in Figure 10.12. As lfil
increases, a critical value is reached where the preprocessing time and the iteration time
are equal. Beyond this critical point, the preprocessing time dominates the total time. If
there are several linear systems to solve with the same matrix, then it is advantageous to
use a more accurate factorization, since the cost of the factorization will be amortized.
Otherwise, a smaller value of lfil will be more efficient.
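The amortization argument can be made concrete with a toy cost model. The timings below are invented for illustration, not taken from the tables above; only the shape of the trade-off matters.

```python
# Toy cost model for choosing lfil (all timings are hypothetical):
# the factorization is paid once, each solve is paid per right-hand side.

def total_time(setup, per_solve, n_systems):
    """Total cost: one-time ILUT set-up plus n_systems GMRES solves."""
    return setup + per_solve * n_systems

# Assumption: small lfil -> cheap set-up but slow solves;
# large lfil -> expensive set-up but fast solves.
cheap_setup, cheap_solve = 1.0, 5.0        # e.g. lfil = 1
accurate_setup, accurate_solve = 8.0, 1.0  # e.g. lfil = 5

# One system: the cheap factorization wins.
assert total_time(cheap_setup, cheap_solve, 1) < \
       total_time(accurate_setup, accurate_solve, 1)
# Five systems with the same matrix: the accurate one is amortized.
assert total_time(cheap_setup, cheap_solve, 5) > \
       total_time(accurate_setup, accurate_solve, 5)
```

The crossover point of this linear model is the analogue of the critical lfil value in Figure 10.12.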

Matrix   Iters   Kflops   Residual   Error
F2DA       7       478    0.13E-02   0.90E-04
F3D        9      2855    0.58E-03   0.35E-03
ORS        4       270    0.92E-01   0.43E-04
F2DB      10       724    0.62E-03   0.26E-03
FID       40     14862    0.11E+00   0.11E-03

A test run of GMRES(10)-ILUT(5) preconditioning.


Figure 10.12: Typical CPU time as a function of lfil. The
dashed line is the ILUT time, the dotted line is the GMRES time,
and the solid line shows the total. (Plot: CPU time versus level
of fill-in.)

THE ILUTP APPROACH

The ILUT approach may fail for many of the matrices that arise from real applications, for
one of the following reasons.

(1) The ILUT procedure encounters a zero pivot;
(2) The ILUT procedure encounters an overflow or underflow condition, because of an
    exponential growth of the entries of the factors;
(3) The ILUT preconditioner terminates normally but the incomplete factorization
    preconditioner which is computed is unstable.

An unstable ILU factorization is one for which the inverse of the computed factorization,
(LU)^{-1} = U^{-1} L^{-1}, has a very large norm, leading to poor convergence or divergence
of the outer iteration. Case (1) can be overcome to a certain degree by assigning an
arbitrary nonzero value to a zero diagonal element that is encountered. Clearly, this is
not a satisfactory remedy because of the loss in accuracy in the preconditioner. The ideal
solution in this case is to use pivoting. However, a form of pivoting is desired which
leads to an algorithm with similar cost and complexity to ILUT. Because of the data
structure used in ILUT, row pivoting is not practical. Instead, column
pivoting can be implemented rather easily.
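The crude remedy for case (1) can be sketched on a small dense factorization; this is a stand-in for ILUT's sparse data structure, and the patch value `eps` is an arbitrary choice, exactly as the text warns.

```python
def lu_with_pivot_patch(A, eps=1e-8):
    """In-place LU factorization (no pivoting) of a dense matrix stored
    as a list of lists.  A zero diagonal pivot is replaced by the
    arbitrary nonzero value eps -- the crude remedy described above.
    Returns how many pivots were patched."""
    n = len(A)
    patched = 0
    for k in range(n):
        if A[k][k] == 0.0:       # zero pivot encountered (case 1)
            A[k][k] = eps        # assign an arbitrary nonzero value
            patched += 1
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]   # multiplier, stored in the L part
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return patched

M = [[0.0, 2.0], [3.0, 4.0]]     # leading pivot is zero
assert lu_with_pivot_patch(M) == 1
# The price: the multiplier 3/eps is enormous, so the factor entries
# grow wildly -- the loss of accuracy (and potential instability, cases
# 2 and 3) that makes this remedy unsatisfactory.
assert abs(M[1][0]) > 1e6
```

Pivoting avoids this blow-up, which is what motivates ILUTP.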

Here are a few of the features that characterize the new algorithm, which is termed
ILUTP ("P" stands for pivoting). ILUTP uses a permutation array perm to hold the new
orderings of the variables, along with the reverse permutation array. At step k of the elim-
ination process the largest entry in a row is selected and is defined to be the new k-th
variable. The two permutation arrays are then updated accordingly. The matrix elements
of L and U are kept in their original numbering. However, when expanding the L-U row
which corresponds to the k-th outer step of Gaussian elimination, the elements are loaded
with respect to the new labeling, using the array perm for the translation. At the end of
the process, there are two options. The first is to leave all elements labeled with respect

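The bookkeeping with the two permutation arrays can be sketched as follows. The names perm and iperm and the dense working row are illustrative stand-ins; the real ILUTP operates on ILUT's sparse row structure.

```python
def pivot_column(row, k, perm, iperm):
    """One ILUTP-style column pivot at outer step k.

    perm[label] is the original column index carrying that label, and
    iperm is its exact inverse.  The largest entry of the working row,
    among columns not yet chosen as pivots (current label >= k), becomes
    the new k-th variable; both arrays are updated accordingly."""
    j = max((c for c in range(len(row)) if iperm[c] >= k),
            key=lambda c: abs(row[c]))
    old, lj = perm[k], iperm[j]     # column now labeled k; label of j
    perm[k], perm[lj] = j, old      # j takes label k, old takes j's label
    iperm[j], iperm[old] = k, lj    # keep iperm the inverse of perm
    return j

perm, iperm = [0, 1, 2], [0, 1, 2]  # identity ordering to start
row = [0.1, -4.0, 2.0]              # largest entry sits in column 1
assert pivot_column(row, 0, perm, iperm) == 1
assert perm == [1, 0, 2] and iperm == [1, 0, 2]
assert all(iperm[perm[i]] == i for i in range(3))  # still mutual inverses
```

Note that the matrix entries themselves are never moved: only the label arrays change, and perm translates between the original and the new numbering when a row is expanded, as described above.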