h. They do, however, require variable
s to have the same names in different physical address spaces, so they are most suited for Single Program Multiple Data (SPMD) algorithms.
Other BSP primitives are better suited for different types of parallelism.
Put
and
get
are nonblocking, so the process that calls them can proceed immediately. Issuing a
put
or
get
guarantees only that the requested data operation will be completed by the end of the superstep or next barrier synchronization.
The C function,
bsp_allsums()
, hints at the flavor of BSP programming. It calculates the running sums of
p
integers stored on
p
processors. Put another way, if integer
xi
is stored on processor
i
, the result on processor
i
is
x0 + x1 + ... + xi
.
You use the primitives
bsp_pushregister
and
bsp_popregister
to register the name of the destination variable
left
across all the processors. The cost of this algorithm is
log(p)x(g + l + 1) + l
, as there are
log(p) + 1
supersteps (including one for the registration), and the addition operation costs 1 FLOP.
#include "bsp.h"
#include
#include
int bsp_allsums(int x) {
int i, left, right;
bsp_pushregister(&left,sizeof(int));
bsp_sync();
right = x;
for(i=1;i
=i) right = left + right;
}
bsp_popregister(&left);
return right;
}