Distributed Memory Architectures
The distributed memory model refers to distributed memory message-passing architectures as well as to distributed memory SIMD computers. A typical distributed memory system consists of a large number of identical processors, each with its own memory, interconnected in a regular topology. Examples are depicted in Figures 11.4 and 11.5. In these diagrams, each processor unit can actually be viewed as a complete processor with its own memory, CPU, I/O subsystem, control unit, etc. These processors are linked to a number of "neighboring" processors which in turn are linked to other neighboring processors, etc. In "message-passing" models there is no global synchronization of the parallel tasks. Instead, computations are data driven: a processor performs a given task only when the operands it requires become available. The programmer must explicitly program all the data exchanges between processors.
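The data-driven, message-passing style described above can be sketched with ordinary Python threads and queues (a simulation only, not a real distributed machine; the two "processors", their operands, and the queue names are all hypothetical):

```python
import threading
import queue

# Two simulated "processors" that exchange data only through explicit
# messages; each computation fires only once its operand has arrived.
to_p1 = queue.Queue()
to_p0 = queue.Queue()
result = {}

def processor0():
    x = 3
    to_p1.put(x)            # explicit send to the neighboring processor
    y = to_p0.get()         # blocks until the required operand arrives
    result["p0"] = x + y    # data-driven: runs only when y is available

def processor1():
    y = 4
    to_p0.put(y)            # explicit send in the other direction
    x = to_p1.get()
    result["p1"] = x * y

t0 = threading.Thread(target=processor0)
t1 = threading.Thread(target=processor1)
t0.start(); t1.start()
t0.join(); t1.join()
```

There is no global clock coordinating the two tasks; the blocking receive alone determines when each computation proceeds, which is the essence of the data-driven model.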

In SIMD designs, a different approach is used. A host processor stores the program and each slave processor holds different data. The host then broadcasts instructions to the processors, which execute them simultaneously. One advantage of this approach is that there is no need for large memories in each node to store large programs, since the instructions are broadcast one by one to all processors.
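The broadcast scheme can be sketched in a few lines of Python (a simulation only; a real SIMD machine executes each broadcast instruction in hardware lockstep, and the sample data and instructions are hypothetical):

```python
# Each entry of local_data plays the role of one slave processor's
# private datum; the host holds the program, not the slaves.
local_data = [1.0, 2.0, 3.0, 4.0]

program = [
    lambda x: x * 2.0,   # instruction 1: scale the local datum
    lambda x: x + 1.0,   # instruction 2: shift the local datum
]

# The host broadcasts one instruction at a time; every slave applies
# it simultaneously to its own data.
for instruction in program:
    local_data = [instruction(x) for x in local_data]

print(local_data)  # [3.0, 5.0, 7.0, 9.0]
```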

Figure 11.4: An eight-processor ring (left) and a multiprocessor mesh (right).

An important advantage of distributed memory computers is their ability to exploit locality of data in order to keep communication costs to a minimum. Thus, a two-dimensional processor grid such as the one depicted in Figure 11.4 is perfectly suitable for solving discretized elliptic Partial Differential Equations (e.g., by assigning each grid point to a corresponding processor) because some iterative methods for solving the resulting linear systems will require only interchange of data between adjacent grid points.
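The locality argument can be made concrete with one Jacobi sweep for the 5-point discrete Laplacian (a minimal sketch; the grid size, right-hand side, and zero boundary values are hypothetical choices, not from the text):

```python
# One Jacobi sweep on a small 2-D grid: each grid point reads only its
# four adjacent neighbors, so a processor grid mapped onto the mesh
# needs to exchange only boundary data with adjacent processors.
n = 4                                   # interior grid is n x n
h = 1.0 / (n + 1)                       # mesh spacing
u = [[0.0] * (n + 2) for _ in range(n + 2)]   # initial guess, zero boundary
f = [[1.0] * (n + 2) for _ in range(n + 2)]   # right-hand side

def jacobi_sweep(u):
    new = [row[:] for row in u]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # Only the four nearest neighbors of (i, j) are read here.
            new[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                + u[i][j-1] + u[i][j+1]
                                + h * h * f[i][j])
    return new

u = jacobi_sweep(u)
```

Because no update ever reaches beyond distance one on the grid, communication on the processor mesh stays between adjacent processors, which is exactly the locality the text describes.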

A good general purpose multiprocessor must have powerful mapping capabilities because it should be capable of easily emulating many of the common topologies such as 2-D and 3-D grids or linear arrays, FFT-butterflies, finite element meshes, etc.

Three-dimensional configurations are also popular. A massively parallel commercial computer based on a 3-D mesh, called T3D, is marketed by CRAY Research, Inc. For 2-D and 3-D grids of processors, it is common that processors on each side of the grid are connected to those on the opposite side. When these "wrap-around" connections are included, the topology is sometimes referred to as a torus.
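The effect of the wrap-around links is easy to see in code (a minimal sketch; the grid dimensions and function names are illustrative):

```python
# Neighbors of processor (i, j) on a p x q grid, with and without the
# "wrap-around" connections that turn the mesh into a torus.
def mesh_neighbors(i, j, p, q):
    cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in cand if 0 <= a < p and 0 <= b < q]

def torus_neighbors(i, j, p, q):
    # Modular arithmetic supplies the wrap-around connections.
    return [((i - 1) % p, j), ((i + 1) % p, j),
            (i, (j - 1) % q), (i, (j + 1) % q)]

# A corner processor has only 2 mesh neighbors but 4 torus neighbors.
print(len(mesh_neighbors(0, 0, 4, 4)))   # 2
print(len(torus_neighbors(0, 0, 4, 4)))  # 4
```

On the torus every processor has the same number of neighbors, which removes the boundary special cases of the plain mesh.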

Figure 11.5: The n-cubes of dimensions n = 1, 2, 3.

Hypercubes are highly concurrent multiprocessors based on the binary n-cube topology, which is well known for its rich interconnection capabilities. A parallel processor based on the n-cube topology, called a hypercube hereafter, consists of 2^n identical processors, each interconnected with n neighbors. A 3-cube can be represented as an ordinary cube in three dimensions where the vertices are the 8 = 2^3 nodes of the 3-cube; see Figure 11.5. More generally, one can construct an n-cube as follows: First, the 2^n nodes are labeled by the binary numbers from 0 to 2^n - 1. Then a link between two nodes is drawn if and only if their binary numbers differ by one (and only one) bit.
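The one-bit rule translates directly into code (a short sketch; the function name is illustrative):

```python
# Build the edge set of the n-cube by the rule in the text: link two
# nodes iff their binary labels differ in exactly one bit.  Flipping
# each of the n bits of a label enumerates all n neighbors of a node.
def ncube_edges(n):
    edges = set()
    for v in range(2 ** n):
        for b in range(n):
            w = v ^ (1 << b)                 # flip bit b of label v
            edges.add((min(v, w), max(v, w)))  # store each link once
    return edges

e3 = ncube_edges(3)
print(len(e3))  # 12 edges for the 3-cube: n * 2**n / 2 = 3 * 8 / 2
```

Each of the 2^n nodes has exactly n neighbors, so the n-cube has n 2^{n-1} edges in total.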

®

The ¬rst property of an -cube graph is that it can be constructed recursively from

¨ ®o`

lower dimensional cubes. More precisely, consider two identical -cubes whose a

sH

G fx®`

¨

vertices are labeled likewise from 0 to . By joining every vertex of the ¬rst - a

d

f

G

®

cube to the vertex of the second having the same number, one obtains an -cube. Indeed, it

¡E

suf¬ces to renumber the nodes of the ¬rst cube as and those of the second as SY£ ££

S

G

¨—x®`

where is a binary number representing the two similar nodes of the -cubes and

£ S a

G

where denotes the concatenation of binary numbers.
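The recursive construction can be sketched as follows (prefixing a label with the bit 0 or 1 plays the role of the concatenation in the text; the function name and representation as an adjacency dictionary are illustrative choices):

```python
# Build the n-cube recursively: take two copies of the (n-1)-cube,
# prefix their labels with 0 and 1 respectively, and join every pair
# of equal-numbered vertices.
def ncube_recursive(n):
    if n == 0:
        return {0: set()}                  # a single node, no links
    sub = ncube_recursive(n - 1)
    hi = 1 << (n - 1)                      # the new leading bit
    g = {}
    for v, nbrs in sub.items():
        g[v] = set(nbrs)                   # first copy: labels 0 ^ a
        g[v | hi] = {w | hi for w in nbrs}  # second copy: labels 1 ^ a
    for v in sub:                          # join equal-numbered vertices
        g[v].add(v | hi)
        g[v | hi].add(v)
    return g

g = ncube_recursive(3)
print(all(len(nbrs) == 3 for nbrs in g.values()))  # True
```

Note that the result agrees with the direct definition: every edge produced this way connects two labels differing in exactly one bit.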

Separating an n-cube into the subgraph of all the nodes whose leading bit is 0 and the subgraph of all the nodes whose leading bit is 1, the two subgraphs are such that each node of the first is connected to one node of the second. If the edges between these two graphs are removed, the result is 2 disjoint (n-1)-cubes. Moreover, generally, for a given