Efficient sparse Cholesky factorization on a parallel SIMD computer
Abstract
We investigate the effect of load balancing when performing Cholesky factorization
on a SIMD computer. In particular we describe a supernodal algorithm for
performing sparse Cholesky factorization. The way the matrix is mapped
onto the processors has significant effect on its efficiency. We show that
this assignment problem can be modeled as a graph coloring problem in a
weighted graph. By a simple greedy algorithm, we obtain substantial speed-up
compared to previously suggested data mapping schemes. Experimental runs
have been made on a 16K processor MasPar MP-2 parallel computer using symmetric
test matrices with irregular sparsity structure. On these problems our
implementation achieves performance rates of well above 200 Mflops in double
precision arithmetic.