As seen in the previous chapter, a limited amount of parallelism can be extracted from the standard preconditioners such as ILU and SSOR. Fortunately, a number of alternative techniques can be developed that are specifically targeted at parallel environments. These are preconditioning techniques that would normally not be used on a standard machine, but only for parallel computers. There are at least three such types of techniques discussed in this chapter. The simplest approach is to use a Jacobi or, even better, a block Jacobi approach. In the simplest case, a Jacobi preconditioner may consist of the diagonal or a block-diagonal of $A$. To enhance performance, these preconditioners can themselves be

[...]

be put in the general framework of “domain decomposition” approaches. These will be covered in detail in the next chapter.

Algorithms are emphasized rather than implementations. There are essentially two types of algorithms, namely, those which can be termed coarse-grain and those which can be termed fine-grain. In coarse-grain algorithms, the parallel tasks are relatively big and may, for example, involve the solution of small linear systems. In fine-grain parallelism, the subtasks can be elementary floating-point operations or consist of a few such operations. As always, the dividing line between the two classes of algorithms is somewhat blurred.


Overlapping block-Jacobi preconditioning consists of a general block-Jacobi approach as described in Chapter 4, in which the sets $S_i$ overlap. Thus, we define the index sets

$$S_i = \{\, j \mid l_i \le j \le r_i \,\}$$

with

$$l_1 = 1, \qquad r_p = n, \qquad r_i > l_{i+1}, \quad 1 \le i \le p-1,$$

where $p$ is the number of blocks. Now use the block-Jacobi method with this particular partitioning, or employ the general framework of additive projection processes of Chapter 5, and use an additive projection method onto the sequence of subspaces

$$K_i = \operatorname{span}\{V_i\}, \qquad V_i = [e_{l_i}, e_{l_i+1}, \ldots, e_{r_i}].$$
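The index sets and the matrices $V_i$ can be made concrete with a short sketch. The helper names `overlapping_index_sets` and `restriction_matrix` are hypothetical, and the rule used to choose the bounds (an even split widened by a fixed `overlap` at each interior boundary) is only one assumed way of satisfying $l_1 = 1$, $r_p = n$, $r_i > l_{i+1}$; the text does not prescribe how the bounds are chosen.

```python
import numpy as np

def overlapping_index_sets(n, p, overlap):
    """Return 1-based inclusive bounds (l_i, r_i) of p overlapping blocks of 1..n.

    Assumed construction (not from the text): start from an even split and
    widen every interior boundary by `overlap` indices on both sides.
    """
    if p < 1 or n < p or overlap < 1:
        raise ValueError("need 1 <= p <= n and overlap >= 1")
    # cuts[i] is the last index of block i in the even, nonoverlapping split.
    cuts = [(i * n) // p for i in range(p + 1)]
    bounds = []
    for i in range(p):
        l = max(1, cuts[i] + 1 - overlap)   # clamp so that l_1 = 1
        r = min(n, cuts[i + 1] + overlap)   # clamp so that r_p = n
        bounds.append((l, r))
    return bounds

def restriction_matrix(n, l, r):
    """V_i = [e_l, ..., e_r]: columns l..r (1-based) of the n x n identity."""
    return np.eye(n)[:, l - 1:r]

bounds = overlapping_index_sets(n=10, p=3, overlap=1)
V = [restriction_matrix(10, l, r) for (l, r) in bounds]
```

With these inputs the blocks are $S_1 = \{1,\ldots,4\}$, $S_2 = \{3,\ldots,7\}$, and $S_3 = \{6,\ldots,10\}$, so consecutive blocks share two indices. In practice the dense matrices $V_i$ would not be formed; index slices play the same role.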

Each of the blocks will give rise to a correction of the form

$$\xi_i^{(k+1)} = \xi_i^{(k)} + A_i^{-1} V_i^T (b - A x_k). \tag{12.1}$$
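The block correction can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: $A_i$ is taken to be $V_i^T A V_i$, as in the usual block-Jacobi setting, the function name `block_corrections` is hypothetical, and the test matrix (a small tridiagonal matrix) and the two blocks are chosen only for the example.

```python
import numpy as np

def block_corrections(A, b, x, bounds):
    """Return delta_i = A_i^{-1} V_i^T (b - A x) for each block (l_i, r_i).

    `bounds` holds 1-based inclusive index ranges; A_i = V_i^T A V_i is the
    diagonal block of A restricted to the indices of S_i (an assumption here).
    """
    r = b - A @ x                          # residual at the current iterate
    deltas = []
    for (lo, hi) in bounds:
        idx = np.arange(lo - 1, hi)        # 0-based indices of S_i
        A_i = A[np.ix_(idx, idx)]          # A_i = V_i^T A V_i
        deltas.append(np.linalg.solve(A_i, r[idx]))
    return deltas

# Illustrative data: tridiag(-1, 2, -1) of order 6, two blocks sharing indices 3 and 4.
n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = np.zeros(n)
bounds = [(1, 4), (3, 6)]
deltas = block_corrections(A, b, x, bounds)
```

Each block solve is independent of the others, so the loop body is the natural unit of parallel work. In this small example the unknowns 3 and 4 belong to both blocks, and the two block solves return different values for them.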

One problem with the above formula is related to the overlapping portions of the variables. The overlapping sections will receive two different corrections in general. According to the definition of “additive projection processes” seen in Chapter 5, the next iterate can be defined as

$$x_{k+1} = x_k + \sum_{i=1}^{p} V_i A_i^{-1} V_i^T r_k$$

where $r_k = b - A x_k$ is the residual vector at the previous iteration. Thus, the corrections for the overlapping regions simply are added together. It is also possible to weigh these contributions before adding them up. This is equivalent to redefining (12.1) into

$$\xi_i^{(k+1)} = \xi_i^{(k)} + D_i A_i^{-1} V_i^T (b - A x_k)$$

in which $D_i$ is a nonnegative diagonal matrix of weights. It is typical to weigh a nonoverlapping contribution by one and an overlapping contribution by $1/k$, where $k$ is the number