Efficient Matrix Multiplication on SIMD Computers

Abstract

We describe efficient algorithms for matrix multiplication on SIMD computers. We consider SIMD implementations of Winograd's algorithm for the case where additions are faster than multiplications, together with classical kernels and the use of Strassen's algorithm. Measured performance figures for the MasPar family of SIMD computers are presented and discussed.
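
To make the trade-off between additions and multiplications concrete, the following is a minimal serial sketch, assuming that "Winograd's algorithm" here refers to Winograd's 1968 inner-product formulation (the abstract does not say which variant is meant). The function name winograd_matmul and the list-of-lists matrix representation are illustrative choices only; this is not the SIMD implementation that the paper describes.

```python
def winograd_matmul(A, B):
    """Multiply A (m x n) by B (n x p), with n even, using Winograd's
    inner-product identity (assumed variant; illustrative only):

        sum_k a[2k]*b[2k] + a[2k+1]*b[2k+1]
          = sum_k (a[2k] + b[2k+1]) * (a[2k+1] + b[2k]) - row_f - col_f
    """
    m, n, p = len(A), len(B), len(B[0])
    if n % 2 != 0 or any(len(row) != n for row in A):
        raise ValueError("inner dimension must be even and match B")
    half = n // 2

    # Precomputed once per row of A: sum of products of adjacent pairs.
    row_f = [sum(A[i][2 * k] * A[i][2 * k + 1] for k in range(half))
             for i in range(m)]
    # Precomputed once per column of B.
    col_f = [sum(B[2 * k][j] * B[2 * k + 1][j] for k in range(half))
             for j in range(p)]

    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            # Only n/2 multiplications per entry, at the cost of extra additions.
            s = sum((A[i][2 * k] + B[2 * k + 1][j]) *
                    (A[i][2 * k + 1] + B[2 * k][j])
                    for k in range(half))
            C[i][j] = s - row_f[i] - col_f[j]
    return C


if __name__ == "__main__":
    print(winograd_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

For square matrices of order n, this sketch uses about n^3/2 + n^2 multiplications instead of the n^3 of the classical method, with a correspondingly larger number of additions, which is why such a scheme is attractive only when additions are the cheaper operation.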