Two approaches to architecture-independent parallel computation
Two approaches to architecture-independent parallel computation are investigated: a constructive functional notation for specifying implicitly parallel operations on multidimensional arrays, and an extension to imperative sequential programming languages for implementing bulk-synchronous parallel algorithms.

An algebra of multidimensional rectangular arrays is defined constructively, by means of an injective singleton operator which maps each value from a base type into a one-element array, and a set of join operators which map a pair of arrays into their concatenation along one of a set of dimensions. A repertoire of array operations is defined in the context of the Bird-Meertens Formalism, using array versions of polymorphic higher-order functions such as map, reduce, zip and cross. This approach gives rise to a collection of algebraic laws which can be used to guide the transformation of array expressions into different equivalent forms. In particular, the promotion laws have a natural interpretation as descriptions of different parallel realisations of an array computation. The use of the array algebra is illustrated by the derivation of two example algorithms: the LU decomposition of a matrix, and the numerical solution of an elliptic partial differential equation.

The bulk-synchronous parallel model of an abstract general-purpose parallel computer is described, along with several variants of the BSP cost model. A refinement to the cost model is proposed, introducing a 'half-bandwidth' parameter to quantify the effect of data granularity on communication cost. A simple BSP programming model is defined, with semantics specified in CSP. The programming model is realised by extending sequential programming languages with a small set of primitives for process creation and termination, bulk synchronisation, and inter-process data access.
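The constructive array algebra can be sketched in Haskell for the one-dimensional case. The names below (`Unit`, `Join`, `mapA`, `redA`) are illustrative, not the thesis's own notation: arrays are generated by a singleton constructor and a join constructor, and the promotion law for reduce over join reads directly as a parallel schedule.

```haskell
-- One-dimensional sketch of the constructive array algebra
-- (hypothetical names; the thesis treats the multidimensional case).
data Arr a = Unit a                 -- singleton: a one-element array
           | Join (Arr a) (Arr a)  -- concatenation along the dimension
  deriving (Show, Eq)

-- map applies a function to every element, preserving shape.
mapA :: (a -> b) -> Arr a -> Arr b
mapA f (Unit x)   = Unit (f x)
mapA f (Join l r) = Join (mapA f l) (mapA f r)

-- reduce combines all elements with an associative operator.
-- The Join equation is the promotion law: the two halves may be
-- reduced independently (in parallel) and then combined.
redA :: (a -> a -> a) -> Arr a -> a
redA _ (Unit x)   = x
redA f (Join l r) = redA f l `f` redA f r

-- Flatten to a list, useful for checking the algebraic laws.
toList :: Arr a -> [a]
toList (Unit x)   = [x]
toList (Join l r) = toList l ++ toList r

-- Example: sum of doubled elements of the array [1,2,3].
example :: Int
example = redA (+) (mapA (*2) (Join (Join (Unit 1) (Unit 2)) (Unit 3)))
-- = 2*1 + 2*2 + 2*3 = 12
```

Because `Join` is the only way to build arrays of more than one element, any associative reduction decomposes along the construction tree, which is exactly why the promotion laws describe alternative parallel realisations.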
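For context, the standard BSP cost model charges each superstep for local computation, communication, and barrier synchronisation. The first formula below is the standard model; the second is only an illustrative guess at how a half-bandwidth parameter might enter, by analogy with Hockney's $n_{1/2}$, since the abstract does not give the thesis's exact form.

```latex
% Standard BSP superstep cost: w local operations, an h-relation of
% communication, gap (inverse bandwidth) g, and barrier latency l.
T_{\mathrm{superstep}} = w + g \cdot h + l

% Illustrative half-bandwidth refinement (hypothetical form): the
% effective gap seen by messages of m words grows as message size
% shrinks, parameterised by a half-bandwidth n_{1/2}.
g(m) = g_{\infty}\left(1 + \frac{n_{1/2}}{m}\right)
```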
The author has created implementations of these primitives in the Oxford BSP library, with versions for a variety of parallel systems including networked workstations, shared-memory multiprocessors and massively-parallel distributed-memory machines, and has used them to produce an architecture-independent parallel version of the molecular dynamics module of the UCSF AMBER 4.0 package.