Partitioning an Array onto a Mesh of Processors


Achieving an even load balance with a low communication overhead is a fundamental task in parallel computing. In this paper we consider the problem of partitioning an array into a number of blocks such that the maximum amount of work in any block is as low as possible. We review different proposed schemes for this problem and the complexity of their communication pattern. We present new approximation algorithms for computing a well balanced generalized block distribution as well as an algorithm for computing an optimal semi-generalized block distribution. The various algorithms are tested and compared on a number of different matrices.