Title:

Two approaches to architecture-independent parallel computation

Two approaches to architecture-independent parallel computation are investigated: a constructive functional notation for specifying implicitly parallel operations on multidimensional arrays, and an extension to imperative sequential programming languages for implementing bulk-synchronous parallel algorithms. An algebra of multidimensional rectangular arrays is defined constructively, by means of an injective singleton operator which maps each value from a base type into a one-element array, and a set of join operators which map a pair of arrays into their concatenation along one of a set of dimensions. A repertoire of array operations is defined in the context of the Bird-Meertens Formalism, using array versions of polymorphic higher-order functions such as map, reduce, zip and cross. This approach gives rise to a collection of algebraic laws which can be used to guide the transformation of array expressions into different equivalent forms. In particular, the promotion laws have a natural interpretation as descriptions of different parallel realisations of an array computation. The use of the array algebra is illustrated by the derivation of two example algorithms: the LU decomposition of a matrix, and the numerical solution of an elliptic partial differential equation. The bulk-synchronous parallel (BSP) model of an abstract general-purpose parallel computer is described, along with several variants of the BSP cost model. A refinement to the cost model is proposed, introducing a 'half-bandwidth' parameter to quantify the effect of data granularity on communication cost. A simple BSP programming model is defined, with semantics specified in CSP. The programming model is realised by extending sequential programming languages with a small set of primitives for process creation and termination, bulk synchronisation, and inter-process data access.
The author has created implementations of these primitives in the Oxford BSP library, with versions for a variety of parallel systems including networked workstations, shared-memory multiprocessors and massively-parallel distributed-memory machines, and has used them to produce an architecture-independent parallel version of the molecular dynamics module of the UCSF AMBER 4.0 package.
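
The constructive array algebra summarised above can be illustrated with a minimal sketch. It assumes two-dimensional arrays represented as nested Python lists; the names singleton, join, amap and areduce are hypothetical, chosen for this illustration, and are not the notation of the thesis. The two assertions at the end check one instance each of a map promotion law and a reduce promotion law for a join along dimension 0, the latter with addition as the operator.

```python
from functools import reduce as fold

# Illustrative sketch only: 2-D rectangular arrays as nested lists.
# All function names here are assumptions made for this example.

def singleton(x):
    """Map a base value to a one-element (1 x 1) array."""
    return [[x]]

def join(dim, a, b):
    """Concatenate two arrays along dimension 0 (rows) or 1 (columns)."""
    if dim == 0:
        assert len(a[0]) == len(b[0]), "column counts must match"
        return a + b
    if dim == 1:
        assert len(a) == len(b), "row counts must match"
        return [ra + rb for ra, rb in zip(a, b)]
    raise ValueError("this sketch supports only dimensions 0 and 1")

def amap(f, a):
    """Apply f to every element of the array."""
    return [[f(x) for x in row] for row in a]

def areduce(op, a):
    """Combine all elements with op (assumed associative and commutative)."""
    return fold(op, (x for row in a for x in row))

# Build [[1, 2], [3, 4]] from singletons and joins.
row0 = join(1, singleton(1), singleton(2))
row1 = join(1, singleton(3), singleton(4))
m = join(0, row0, row1)

double = lambda x: 2 * x
add = lambda x, y: x + y

# Map promotion: mapping over a join equals joining the mapped parts.
assert amap(double, m) == join(0, amap(double, row0), amap(double, row1))

# Reduce promotion: reducing a join equals combining the reduced parts.
assert areduce(add, m) == add(areduce(add, row0), areduce(add, row1))
```

Read from left to right, each law describes a way of splitting one array computation into independent computations on the two joined parts, which is the sense in which the promotion laws can be interpreted as parallel realisations.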
