can be solved in parallel) are $\{1\}$, $\{2, 5\}$, $\{3, 6, 9\}$, $\{4, 7, 10\}$, $\{8, 11\}$, and finally $\{12\}$.

The first and last few steps may take a heavy toll on achievable speed-ups.
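The wavefronts of a rectangular grid can be enumerated directly. The following is a minimal sketch, assuming a 5-point stencil (each node depends only on its west and south neighbours) and a row-by-row numbering starting at 1; the function name `grid_wavefronts` is hypothetical and used only for illustration.

```python
def grid_wavefronts(nx, ny):
    """Group the nodes of an nx-by-ny grid into wavefronts.

    Assumptions (not from the text): nodes are numbered row by row
    starting at 1, and node (i, j) depends only on (i-1, j) and (i, j-1),
    so its wavefront index is simply i + j.
    """
    levels = [[] for _ in range(nx + ny - 1)]
    for j in range(ny):              # grid row, 0-based
        for i in range(nx):          # grid column, 0-based
            node = j * nx + i + 1    # natural (row-by-row) numbering
            levels[i + j].append(node)
    return levels


levels = grid_wavefronts(4, 3)
print(levels)
# Width of each wavefront: the first and last ones are narrow.
print([len(level) for level in levels])
```

The narrow wavefronts at both ends offer little parallel work, which is the effect on speed-up mentioned above.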

[Figure 11.10: Level scheduling for a grid problem. The grid nodes are numbered 1 to 12 row by row (1 2 3 4 / 5 6 7 8 / 9 10 11 12), shown together with the stencil.]

The idea of proceeding by levels or wavefronts is a natural one for finite difference
matrices on rectangles. The more general case of irregular matrices, discussed next, is a
textbook example of scheduling, or topological sorting, and is well known in different
forms to computer scientists.

Level Scheduling for Irregular Graphs

The simple scheme described above can be generalized for irregular grids. The objective
of the technique, called level scheduling, is to group the unknowns in subsets so that they
can be determined simultaneously. To explain the idea, consider again Algorithm 11.11 for
solving a unit lower triangular system. The $i$-th unknown can be determined once all the
other ones that participate in equation $i$ become available. In the $i$-th step, all unknowns $x_j$, $j < i$,
such that $l_{ij} \neq 0$ must be known. To use graph terminology, these unknowns are adjacent
to unknown number $i$. Since $L$ is lower triangular, the adjacency graph is a directed acyclic
graph. The edge $j \to i$ in the graph simply indicates that $x_j$ must be known before $x_i$ can
be determined. It is possible and quite easy to find a labeling of the nodes that satisfies the
property that if $\mathrm{label}(j) < \mathrm{label}(i)$, then task $j$ must be executed before task $i$. This is
called a topological sorting of the unknowns.
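As an illustration of the labeling property, the sketch below builds a topological order for a small dependency graph and checks that every edge goes from a smaller label to a larger one. The 5-unknown sparsity pattern in `preds` is a made-up example, not taken from the text, and `graphlib.TopologicalSorter` (Python 3.9+ standard library) merely stands in for any topological sorting routine.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical strictly lower triangular pattern of a 5 x 5 matrix L:
# preds[i] lists the columns j < i with a nonzero in row i (edges j -> i).
preds = {1: [], 2: [1], 3: [], 4: [2, 3], 5: [1, 4]}

# TopologicalSorter takes a mapping node -> predecessors.
order = list(TopologicalSorter(preds).static_order())
label = {node: k for k, node in enumerate(order)}

# Every edge j -> i must satisfy label(j) < label(i).
labels_ok = all(label[j] < label[i] for i in preds for j in preds[i])

# For a lower triangular matrix the natural numbering is itself a valid
# labeling, because every edge j -> i has j < i.
natural_ok = all(j < i for i in preds for j in preds[i])

print(order, labels_ok, natural_ok)
```

A topological order is generally not unique, so the printed order may differ from the natural one while still satisfying the property.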

The first step computes $x_1$ and any other unknowns for which there are no predecessors
in the graph, i.e., all those unknowns $x_i$ for which the offdiagonal elements of row $i$ are
zero. These unknowns will constitute the elements of the first level. The next step computes
in parallel all those unknowns that will have the nodes of the first level as their (only)
predecessors in the graph. The following steps can be defined similarly: The unknowns
that can be determined at step $l$ are all those that have as predecessors equations that have
been determined in steps $1, 2, \ldots, l-1$. This leads naturally to the definition of a depth
for each unknown. The depth of a vertex is defined by performing the following loop for
$i = 1, 2, \ldots, n$, after initializing $\mathrm{depth}(j)$ to zero for all $j$.

$$\mathrm{depth}(i) = 1 + \max_j \{ \mathrm{depth}(j) \ \text{for all } j < i \text{ such that } l_{ij} \neq 0 \}.$$
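The depth loop can be sketched in a few lines. This is a minimal illustration under stated assumptions: the strictly lower triangular pattern is given as predecessor lists, the maximum over an empty set of predecessors is taken to be zero (consistent with initializing all depths to zero), and both function names are hypothetical. The test pattern is a 4 x 3 grid with a 5-point stencil.

```python
def grid_predecessors(nx, ny):
    """Predecessor lists for an nx-by-ny grid numbered row by row from 1.

    Assumption: a 5-point stencil, so row i of L has offdiagonal nonzeros
    only in the columns of the south and west neighbours of node i.
    """
    preds = {}
    for j in range(ny):
        for i in range(nx):
            node = j * nx + i + 1
            preds[node] = []
            if j > 0:
                preds[node].append(node - nx)  # south neighbour
            if i > 0:
                preds[node].append(node - 1)   # west neighbour
    return preds


def compute_depths(preds, n):
    """Return [depth(1), ..., depth(n)]; preds[i] holds the j < i with l_ij != 0."""
    depth = [0] * (n + 1)          # index 0 unused; all depths start at zero
    for i in range(1, n + 1):      # natural order: predecessors come first
        depth[i] = 1 + max((depth[j] for j in preds.get(i, [])), default=0)
    return depth[1:]


depths = compute_depths(grid_predecessors(4, 3), 12)
print(depths)
```

Because every predecessor $j$ of $i$ satisfies $j < i$, a single sweep in the natural order is enough: each `depth[j]` is final by the time row `i` is visited.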

By definition, a level of the graph is the set of nodes with the same depth. A data structure
for the levels can be defined: A permutation $q(1:n)$ defines the new ordering and
$\mathrm{level}(i)$, $i = 1, \ldots, nlev + 1$, points to the beginning of the $i$-th level in that array.
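One way to build such a data structure from the depths is a stable counting sort. The sketch below is an assumption about how the two arrays might be filled, not the book's own code; it stores 1-based values so that the pointer array ends with $n + 1$, and the function name `build_levels` is hypothetical.

```python
def build_levels(depth):
    """Build the level data structure from a list of depths.

    depth[k] is the depth (>= 1) of unknown k + 1. Returns (q, level),
    both holding 1-based values: q lists the unknowns level by level and
    level[l - 1] is the position in q where level l begins; the last
    entry of level is len(q) + 1.
    """
    n = len(depth)
    nlev = max(depth, default=0)
    counts = [0] * (nlev + 1)
    for d in depth:
        counts[d] += 1
    level = [1] * (nlev + 1)
    for l in range(1, nlev + 1):
        level[l] = level[l - 1] + counts[l]
    next_free = level[:-1].copy()      # next free 1-based slot per level
    q = [0] * n
    for k, d in enumerate(depth):      # stable: keeps natural order in a level
        q[next_free[d - 1] - 1] = k + 1
        next_free[d - 1] += 1
    return q, level


# Depths of the 4 x 3 grid example with a 5-point stencil.
q, level = build_levels([1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6])
print(q)
print(level)
# Unknowns of level 3, using the 1-based pointers:
print(q[level[2] - 1 : level[3] - 1])
```

All unknowns between two consecutive pointers belong to one level and can be processed in parallel.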

[Figure: Lower triangular matrix associated with mesh of Figure 11.10. Panels: natural ordering, wavefront ordering.]

Once these level sets are found, there are two different ways to proceed. The permutation
vector $q$ can be used to permute the matrix according to the new order. In the
example mentioned in the previous subsection, this means renumbering the variables $\{1\}$,
$\{2, 5\}$, $\{3, 6, 9\}$, $\ldots$