Archives
 
 
 
  Special
 
 
 
  About Us
 
 
 

Newsletter
Free E-mail Newsletter from BYTE.com

 
    
           
Visit the home page Browse the four-year online archive Download platform-neutral CPU/FPU benchmarks Find information for advertisers, authors, vendors, subscribers Request free information on products written about or advertised in BYTE Submit a press release, or scan recent announcements Talk with BYTE's staff and readers about products and technologies

ArticlesParallel Processing in Bulk


November 1996 / Core Technologies / Parallel Processing in Bulk

A new programming model for parallel processing simplifies writing programs and promises code portability.

Dick Pountain

Parallel processing -- using more than one CPU to increase computation speed -- is one of those cutting-edge technologies that always seems poised to break through into the mainstream. The problem has been that writing software for parallel computers is just too difficult. It requires complex code to synchronize the data and activity of tens or hundreds of processors. Furthermore, because parallel programs aren't portable between different supercomputer architectures, no volume software market has ever taken off to sustain the effort.

A new parallel-programming model, called Bulk Synchronou s Parallelism (BSP), promises to remedy this situation. Developed by teams at Oxford in the U.K. and Harvard in the U.S., BSP offers a simple synchronization mechanism. It also has the potential to make parallel programs portable between different parallel-computer architectures.

Parallel's Pitfalls

Parallel programming's fundamental problem is that there has been no generally accepted model of what a parallel computer should look like. Some designers have favored distributed-memory machines, whose processing nodes communicate by passing messages. Others have preferred symmetric-multiprocessing (SMP) or shared-memory architectures, where all the processors read and write the same memory.

Yet others have used collections of workstations connected by a LAN to simulate a single parallel computer. The clustering of workstations into virtual parallel computers is immeasurably easier since the emergence of standard message-passing environments such as M essage Passing Interface (MPI) and Parallel Virtual Machine (PVM). Still, moving a program from one type of parallel machine to another normally involves a complete rewrite of the program.

What's needed is an abstract model of a parallel computer that describes all these schemes, hiding the physical details of particular architectures from programs to make them portable. Another desirable feature of such a model would be predictability (i.e., the ability to accurately analyze how such portable programs will perform on different real architectures). BSP provides just such a model.

BSP's Structure

The BSP model assumes a set of processor/memory pairs. These pairs are connected by a communications network that efficiently supports a global memory space and a mechanism for the barrier synchronization of all the processors. Execution of a BSP program proceeds in phases, and all global communication takes place between phases.

Each phase is called a superstep . It consists of a number of parallel-running threads that contain any number of operations. The threads perform only local communication until they reach a synchronization point or barrier. At a barrier, all the threads must wait until the last one becomes ready, at which time all global communication (i.e., accesses to the physical memory of remote processors) takes place, as shown in the figure "Supersteps Toward Synchronization." This model completely decouples communication from synchronization, so that the synchronization of individual messages ceases to be of concern to the programmer.

BSP doesn't care whether a parallel computer implements barrier synchronization via hardware or software, because this affects only absolute performance. As an example, Cray's T3D, a massively parallel supercomputer based on Digital Equipment Alpha RISC processors, supports hardware barrier synchronization by providing each processor/memory node with a special barrier register.

BSP is equally unc oncerned about the underlying mechanism used for communication. Thus, the same program could run on an Ethernet of PCs using the WinPVM library or on a T3D.

BSP Parameters

As important as BSP's support for program portability is the way it provides an analytic cost model for assessing the performance of parallel algorithms. This is something to which synchronized message-passing programs have never been amenable.

Any BSP computer is defined by three parameters: p , the number of processor/memory pairs; l , the latency, or number of time steps consumed by barrier synchronization; and g , a ratio obtained by dividing the total local operations performed by all processors per second by the total words delivered by the communications system per second. Note that g measures only a bulk property of the whole system, not the speed of individual CPUs or links. The l and g parameters are normalized with respect to a fourth paramet er, s , the number of time steps per second or MFLOPS rate, so you can compare algorithms running on different hardware.

You can consider any scalable parallel system to be a BSP computer and determine what its p , l , and g parameters are by benchmarking. You can then use these results to analyze the computational complexity of both architectures and algorithms. Designers of parallel architectures strive to reduce l  and g to a minimum. Likewise, a programmer's choice of algorithm will try to offset the bad effects of large l  and g inherent in a hardware design.

A Parallel Future

The Oxford University Computer Lab (OUCL) has set up a unit called Oxford Parallel to commercialize and spread the word about BSP. The firm offers the Oxford BSP Library for a number of machines, including a free generic version for any homogeneous parallel Unix machine that has access to PARMACS, PVM, TCP/IP, or System V Share d Memory primitives. There are also optimized native libraries for IBM's SP1/SP2, Cray's T3D, Silicon Graphics' Power Challenge, Meiko's CS/2, and other supercomputers.

Oxford Parallel is working on BSP support for SMP machines running Windows NT and for clusters of NT servers. Such a development would be timely indeed, given that the PC industry is entering an era of multimedia and 3-D graphics applications that cry out to be accelerated by parallel processing. Perhaps after all those false dawns, BSP is the technique that will bring parallel processing into the mainstream for the first time.


Where to Find


Oxford Parallel

OUCL
Oxford, U.K.
Phone:    +44 1865 273897
Fax:      +44 1865 273819

HotBYTEs
 - information on products covered or advertised in BYTE


Supersteps Toward Synchronization

illustration_link (25 Kbytes)

BSP's step-by-step scheme simplifies the synchronization of both processors and any shared (global) data.


Dick Pountain is a BYTE contributing editor based in London. You can contact him at dickp@bix.com .

Up to the Core Technologies section contentsGo to previous article: Go to next article: Programming with BSPSearchSend a comment on this articleSubscribe to BYTE or BYTE on CD-ROM   Copyrig
Flexible C++
Matthew Wilson
My approach to software engineering is far more pragmatic than it is theoretical--and no language better exemplifies this than C++.

more...

BYTE Digest

BYTE Digest editors every month analyze and evaluate the best articles from Information Week, EE Times, Dr. Dobb's Journal, Network Computing, Sys Admin, and dozens of other CMP publications—bringing you critical news and information about wireless communication, computer security, software development, embedded systems, and more!

Find out more

BYTE.com Store

BYTE CD-ROM
NOW, on one CD-ROM, you can instantly access more than 8 years of BYTE.
 
The Best of BYTE Volume 1: Programming Languages
The Best of BYTE
Volume 1: Programming Languages
In this issue of Best of BYTE, we bring together some of the leading programming language designers and implementors...

Copyright © 2005 CMP Media LLC, Privacy Policy, Your California Privacy rights, Terms of Service
Site comments: webmaster@byte.com
SDMG Web Sites: BYTE.com, C/C++ Users Journal, Dr. Dobb's Journal, MSDN Magazine, New Architect, SD Expo, SD Magazine, Sys Admin, The Perl Journal, UnixReview.com, Windows Developer Network