This strategy is rather complicated and involves several parameters.
In order to describe the forward and backward solution, we introduce some notation.
We start by applying the “global permutation,” i.e., the product
$$
P_{nlev-1} P_{nlev-2} \cdots P_1 P_0 ,
$$
to the right-hand side. We overwrite the result on the current solution vector, an $N$-vector called $x_0$. Now partition this vector into
$$
x_0 = \begin{pmatrix} y_0 \\ x_1 \end{pmatrix}
$$
according to the partitioning (12.20). The forward step consists of transforming the second
component of the right-hand side as
$$
x_1 := x_1 - E_0 D_0^{-1} y_0 .
$$
Now $x_1$ is partitioned in the same manner as $x_0$ and the forward elimination is continued
the same way. Thus, at each step, each $x_j$ is partitioned as
$$
x_j = \begin{pmatrix} y_j \\ x_{j+1} \end{pmatrix} .
$$
A forward elimination step defines the new $x_{j+1}$ using the old $x_{j+1}$ and $y_j$, for
$j = 0, 1, \ldots, nlev - 1$, while a backward step defines $y_j$ using the old $y_j$ and $x_{j+1}$, for
$j = nlev - 1, \ldots, 1, 0$. Algorithm 12.5 describes the general structure of the forward and backward solution sweeps. Because the global permutation was applied at the beginning, the
successive permutations need not be applied. However, the final result obtained must be
permuted back into the original ordering.
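The two sweeps are simply block forward and backward substitution for the factorization implied at each level by the reduction process; with the same notation as above, this can be summarized as
$$
P_j A_j P_j^T = \begin{pmatrix} D_j & F_j \\ E_j & C_j \end{pmatrix}
= \begin{pmatrix} I & 0 \\ E_j D_j^{-1} & I \end{pmatrix}
\begin{pmatrix} D_j & F_j \\ 0 & A_{j+1} \end{pmatrix},
\qquad A_{j+1} = C_j - E_j D_j^{-1} F_j .
$$
Forward substitution with the unit lower triangular factor leaves $y_j$ unchanged and replaces $x_{j+1}$ by $x_{j+1} - E_j D_j^{-1} y_j$; once the reduced system with $A_{j+1}$ has been solved recursively, back substitution with the upper factor yields $y_j := D_j^{-1}\left( y_j - F_j x_{j+1} \right)$.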
ALGORITHM 12.5: ILUM Linear System Solution
1. Apply global permutation to right-hand side $b$ and copy into $x_0$.
2. For $j = 0, 1, \ldots, nlev - 1$ Do: [Forward sweep]
3.     $x_{j+1} := x_{j+1} - E_j D_j^{-1} y_j$
4. EndDo
5. Solve with a relative tolerance $\epsilon$:
6.     $A_{nlev} \, x_{nlev} := x_{nlev}$.
7. For $j = nlev - 1, \ldots, 1, 0$ Do: [Backward sweep]
8.     $y_j := D_j^{-1} \left( y_j - F_j x_{j+1} \right)$.
9. EndDo
10. Permute the resulting solution vector back to the original
11.     ordering to obtain the solution $x$.
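A minimal NumPy sketch of Algorithm 12.5 may help fix ideas. Everything about the data layout here is an assumption made for illustration only: the level blocks $D_j^{-1}$, $E_j$, $F_j$ are stored as dense arrays, the global permutation is an index array, and the last reduced system is solved exactly rather than iteratively with a tolerance $\epsilon$; a real ILUM code would store sparse factors level by level.

```python
import numpy as np

def ilum_solve(perm, Dinv, E, F, A_last, b):
    """Sketch of Algorithm 12.5 (forward/backward ILUM solution sweeps).

    perm   : global permutation as an index array (product of level permutations)
    Dinv   : list of inverted diagonal blocks D_j^{-1}, one per level
    E, F   : lists of off-diagonal blocks E_j and F_j, one per level
    A_last : last reduced matrix A_nlev
    b      : right-hand side
    All names and the dense-block representation are illustrative assumptions.
    """
    x = b[perm].astype(float)          # step 1: apply the global permutation
    nlev = len(Dinv)
    # Record where each level's y_j block starts inside the working vector x.
    offsets = [0]
    for j in range(nlev):
        offsets.append(offsets[-1] + Dinv[j].shape[0])
    # Forward sweep: x_{j+1} := x_{j+1} - E_j D_j^{-1} y_j
    for j in range(nlev):
        lo, hi = offsets[j], offsets[j + 1]
        x[hi:] -= E[j] @ (Dinv[j] @ x[lo:hi])
    # Solve the last reduced system (here exactly; in practice iteratively,
    # with a relative tolerance eps).
    x[offsets[-1]:] = np.linalg.solve(A_last, x[offsets[-1]:])
    # Backward sweep: y_j := D_j^{-1} (y_j - F_j x_{j+1})
    for j in reversed(range(nlev)):
        lo, hi = offsets[j], offsets[j + 1]
        x[lo:hi] = Dinv[j] @ (x[lo:hi] - F[j] @ x[hi:])
    # Permute the result back into the original ordering.
    out = np.empty_like(x)
    out[perm] = x
    return out
```

With exact level factors and an exact last solve, the sweeps reproduce the solution of the original system, which gives a simple correctness check for an implementation.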
Computer implementations of ILUM can be rather tedious. The implementation issues
are similar to those of parallel direct solution methods for sparse linear systems.
DISTRIBUTED ILU AND SSOR
This section describes parallel variants of the block Successive Over-Relaxation (BSOR)
and ILU(0) preconditioners which are suitable for distributed memory environments.
Chapter 11 briefly discussed distributed sparse matrices. A distributed matrix is a matrix whose entries are located in the memories of different processors in a multiprocessor
system. These types of data structures are very convenient for distributed memory computers and it is useful to discuss implementations of preconditioners that are specifically
developed for them. Refer to Section 11.5.6 for the terminology used here. In particular, the
term subdomain is used in the very general sense of subgraph. For both ILU and SOR, multicoloring or level scheduling can be used at the macro level, to extract parallelism. Here,
macro level means the level of parallelism corresponding to the processors, or blocks, or
subdomains.
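As an illustration of macro-level multicoloring, the following sketch greedily colors the quotient graph of subdomains; the function name, the adjacency-dictionary input, and the greedy visiting order are assumptions for illustration, not a prescription from the text. Subdomains sharing a color have no direct coupling, so a BSOR relaxation step (or an ILU elimination step) can be applied to all of them concurrently, processing one color at a time.

```python
def color_subdomains(adjacency):
    """Greedy multicoloring of the subdomain (quotient) graph.

    adjacency: dict mapping each subdomain id to the ids of its neighbors.
    Returns a dict assigning each subdomain the smallest color not already
    used by an earlier-visited neighbor.  Purely illustrative -- a distributed
    code would derive the graph from the interface structure of local matrices.
    """
    colors = {}
    for v in sorted(adjacency):            # deterministic order for the sketch
        taken = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in taken:                  # first color absent among neighbors
            c += 1
        colors[v] = c
    return colors
```

For a 2-by-2 arrangement of subdomains, the greedy sweep produces two colors, so the four subdomains can be processed in two parallel phases instead of four sequential ones.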