16 4 17 5 18 6

1 13 2 14 3 15

Figure 12.3: Red-black coloring of a $6 \times 4$ grid. Red-black labeling of the nodes.

Since the red nodes are not coupled with other red nodes and, similarly, the black

nodes are not coupled with other black nodes, the system that results from this reordering

will have the structure

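$$
\begin{pmatrix} D_1 & F \\ E & D_2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
\qquad (12.18)
$$

in which $D_1$ and $D_2$ are diagonal matrices. The reordered matrix associated with this new
labeling is shown in Figure 12.4.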
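To make the block structure concrete, here is a minimal sketch in Python/NumPy. It assumes a 5-point finite difference operator on a $6 \times 4$ grid, stores the matrix densely for readability, and colors node $(i, j)$ red when $i + j$ is even. The function names `laplacian_5pt`, `red_black_permutation`, and `is_diagonal` are hypothetical helpers introduced only for this illustration, and indices are 0-based.

```python
import numpy as np

def laplacian_5pt(nx, ny):
    """Dense 5-point Laplacian on an nx-by-ny grid, natural (row-by-row) ordering."""
    n = nx * ny
    A = np.zeros((n, n))
    for j in range(ny):
        for i in range(nx):
            k = j * nx + i
            A[k, k] = 4.0
            if i > 0:
                A[k, k - 1] = -1.0       # west neighbor
            if i < nx - 1:
                A[k, k + 1] = -1.0       # east neighbor
            if j > 0:
                A[k, k - nx] = -1.0      # south neighbor
            if j < ny - 1:
                A[k, k + nx] = -1.0      # north neighbor
    return A

def red_black_permutation(nx, ny):
    """Natural indices of the red nodes (i + j even), then the black nodes (i + j odd)."""
    red = [j * nx + i for j in range(ny) for i in range(nx) if (i + j) % 2 == 0]
    black = [j * nx + i for j in range(ny) for i in range(nx) if (i + j) % 2 == 1]
    return np.array(red + black), len(red)

def is_diagonal(M):
    """True if M has no nonzero off-diagonal entries."""
    return np.count_nonzero(M - np.diag(np.diag(M))) == 0

nx, ny = 6, 4
A = laplacian_5pt(nx, ny)
perm, n_red = red_black_permutation(nx, ny)

# Symmetric permutation: red unknowns first, black unknowns last.
A_rb = A[np.ix_(perm, perm)]
D1, F = A_rb[:n_red, :n_red], A_rb[:n_red, n_red:]
E, D2 = A_rb[n_red:, :n_red], A_rb[n_red:, n_red:]

print(is_diagonal(D1), is_diagonal(D2))
```

Each neighbor of a node differs from it by one in exactly one grid coordinate, so the parity of $i + j$ flips and no two nodes of the same color are coupled. That is why both diagonal blocks come out diagonal in this sketch.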

Two issues will be explored regarding red-black ordering. The first is how to exploit

this structure for solving linear systems. The second is how to generalize this approach for

systems whose graphs are not necessarily 2-colorable.

Figure 12.4: Matrix associated with the red-black reordering of Figure 12.3.

12.4.2 SOLUTION OF RED-BLACK SYSTEMS

The easiest way to exploit the red-black ordering is to use the standard SSOR or ILU(0)
preconditioners for solving the block system (12.18), which is derived from the original
system. The resulting preconditioning operations are highly parallel. For example, the linear
system that arises from the forward solve in SSOR will have the form

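$$
\begin{pmatrix} D_1 & O \\ E & D_2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} .
$$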

This system can be solved by performing the following sequence of operations:

1. Solve $D_1 x_1 = b_1$.

2. Compute $\hat{b}_2 := b_2 - E x_1$.

3. Solve $D_2 x_2 = \hat{b}_2$.
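The three operations above can be sketched in a few lines of Python/NumPy. This is only an illustration under stated assumptions: the diagonals of the two diagonal blocks are stored as 1-D arrays `d1` and `d2` with nonzero entries, the coupling block `E` is a small dense array, and the function name `red_black_forward_solve` and the sample data are made up for the example.

```python
import numpy as np

def red_black_forward_solve(d1, E, d2, b1, b2):
    """Solve [[D1, 0], [E, D2]] [x1; x2] = [b1; b2] with D1, D2 diagonal.

    d1 and d2 hold the diagonals of D1 and D2 (assumed nonzero).
    """
    x1 = b1 / d1            # operation 1: diagonal scaling
    b2_hat = b2 - E @ x1    # operation 2: matrix-by-vector product
    x2 = b2_hat / d2        # operation 3: diagonal scaling
    return x1, x2

# Made-up data: 3 red unknowns and 2 black unknowns.
d1 = np.array([4.0, 4.0, 4.0])
d2 = np.array([4.0, 4.0])
E = np.array([[-1.0, -1.0, 0.0],
              [0.0, -1.0, -1.0]])
b1 = np.array([4.0, 8.0, 12.0])
b2 = np.array([1.0, 3.0])

x1, x2 = red_black_forward_solve(d1, E, d2, b1, b2)
print(x1, x2)
```

Every component of `x1` can be computed independently, and likewise every component of `x2` once the product with `E` is available, which is where the parallelism comes from.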

This consists of two diagonal scalings (operations 1 and 3) and a sparse matrix-by-vector
product. Therefore, the degree of parallelism is at least $n/2$ if an atomic task is
considered to be any arithmetic operation. The situation is identical with the ILU(0)
preconditioning. However, since the matrix has been reordered before ILU(0) is applied to it,
the resulting LU factors are not related in any simple way to those associated with the
original matrix. In fact, a simple look at the structure of the ILU factors reveals that many
more elements are dropped with the red-black ordering than with the natural ordering. The
result is that the number of iterations to achieve convergence can be much higher with
red-black ordering than with the natural ordering.

A second method that has been used in connection with the red-black ordering solves
the reduced system which involves only the black unknowns. Eliminating the red unknowns
from (12.18) results in the reduced system:

$$
(D_2 - E D_1^{-1} F)\, x_2 = b_2 - E D_1^{-1} b_1 .
$$
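For reference, the elimination can be written out step by step. The notation is an assumption of this sketch: $D_1$, $F$, $E$, $D_2$ denote the blocks of the matrix in (12.18), and $x_1$, $x_2$, $b_1$, $b_2$ the corresponding red and black parts of the unknown and right-hand side.

```latex
\begin{align*}
D_1 x_1 + F x_2 &= b_1
  &&\Longrightarrow\quad x_1 = D_1^{-1}\,(b_1 - F x_2), \\
E x_1 + D_2 x_2 &= b_2
  &&\Longrightarrow\quad E D_1^{-1}(b_1 - F x_2) + D_2 x_2 = b_2, \\
&&&\Longrightarrow\quad (D_2 - E D_1^{-1} F)\, x_2 = b_2 - E D_1^{-1} b_1 .
\end{align*}
```

Once $x_2$ is known, the red unknowns follow from the first line by a diagonal scaling.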


Note that this new system is again a sparse linear system with about half as many
unknowns. In addition, it has been observed that for “easy problems,” the reduced system
can often be solved efficiently with only diagonal preconditioning. The computation of the
reduced system is a highly parallel and inexpensive process. Note that it is not necessary
to form the reduced system. This strategy is more often employed when $D_1$ is not
diagonal, such as in domain decomposition methods, but it can also have some uses in other
situations. For example, applying the matrix to a given vector $x$ can be performed using
nearest-neighbor communication, and this can be more efficient than the standard approach
of multiplying the vector by the Schur complement matrix $D_2 - E D_1^{-1} F$. In addition,
this can save storage, which may be more critical in some cases.
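As a rough illustration of working with the reduced system without forming it, the Python/NumPy sketch below applies the Schur complement to a vector through products with the individual blocks, and compares the result with the explicitly formed matrix. The names `apply_reduced`, `d1`, `d2`, `E`, and `F` are assumptions of this example. The data correspond to a made-up 1-D chain of five nodes with stencil $(-1, 2, -1)$, where nodes 0, 2, 4 are red and nodes 1, 3 are black.

```python
import numpy as np

def apply_reduced(d1, F, E, d2, v):
    """Return (D2 - E D1^{-1} F) v without forming the Schur complement."""
    t = F @ v              # black values gathered at the red nodes
    t = t / d1             # diagonal scaling by D1^{-1}
    return d2 * v - E @ t  # scatter back to the black nodes

# Blocks of the reordered matrix for the 5-node chain (red first, then black).
d1 = np.array([2.0, 2.0, 2.0])
d2 = np.array([2.0, 2.0])
F = np.array([[-1.0, 0.0],
              [-1.0, -1.0],
              [0.0, -1.0]])
E = F.T.copy()

# Explicitly formed Schur complement, built here only for comparison.
S = np.diag(d2) - E @ np.diag(1.0 / d1) @ F

v = np.array([1.0, 2.0])
w = apply_reduced(d1, F, E, d2, v)

# Solve the reduced system for the black unknowns, then recover the red ones.
b1 = np.array([1.0, 0.0, 1.0])
b2 = np.array([0.0, 0.0])
x2 = np.linalg.solve(S, b2 - E @ (b1 / d1))
x1 = (b1 - F @ x2) / d1
```

In an iterative solver, only `apply_reduced` would be needed. The dense `S` and the direct solve appear here purely to check the matrix-free product on a tiny problem.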

12.4.3 MULTICOLORING FOR GENERAL SPARSE MATRICES