Wednesday, November 7th, 14:15, Room 303, Jacobs Building

Title: Fast Parallel Matrix Multiplication

Lecturer: Oded Schwartz

Lecturer homepage: http://www.cs.berkeley.edu/~odedsc/

Affiliation: EECS Department, University of California, Berkeley


Faster algorithms can be obtained by minimizing communication, that is, by reducing the amount of data moved across the memory hierarchy and between processors.
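
As a toy illustration of this idea (not code from the talk), blocked matrix multiplication reduces traffic between slow and fast memory: each b-by-b tile brought into fast memory is reused roughly b times, cutting the number of words moved from O(n^3) to O(n^3 / b). A minimal Python sketch:

    import numpy as np

    def blocked_matmul(A, B, b=64):
        # Illustrative tiled multiply: C = A @ B computed block by block.
        # Each tile is reused ~b times once loaded, so far fewer words
        # cross the memory hierarchy than in the naive triple loop.
        n = A.shape[0]
        C = np.zeros((n, n))
        for i in range(0, n, b):
            for j in range(0, n, b):
                for k in range(0, n, b):
                    C[i:i+b, j:j+b] += A[i:i+b, k:k+b] @ B[k:k+b, j:j+b]
        return C

For instance, blocked_matmul(np.random.rand(300, 300), np.random.rand(300, 300)) agrees with the direct product up to floating-point rounding.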

The communication costs of algorithms, in terms of time or energy, are typically much higher than their arithmetic costs. We have proved lower bounds on these communication costs by analyzing geometric and expansion properties of the computation DAGs underlying the algorithms. These techniques (honored by the SIAG/LA Prize for 2009-2011 and the SPAA'11 Best Paper Award) have inspired many new algorithms in cases where existing ones proved not to be communication-optimal.
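
For context, the bandwidth-cost lower bounds from this line of work can be stated as follows (notation is ours: n is the matrix dimension, P the number of processors, and M the size of each local memory; the second bound is the Strassen-specific result of the SPAA'11 work):

    W_{classical} = \Omega\left( \frac{n^3}{P \sqrt{M}} \right)

    W_{Strassen} = \Omega\left( \left( \frac{n}{\sqrt{M}} \right)^{\log_2 7} \cdot \frac{M}{P} \right)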

Parallel matrix multiplication is one of the most studied fundamental problems in parallel computing. We obtain a new parallel algorithm, based on Strassen's fast matrix multiplication, that is communication-optimal. It exhibits perfect strong scaling within the maximum possible range. The algorithm asymptotically outperforms all previously known parallel matrix multiplication algorithms, classical and Strassen-based alike. It also demonstrates significant speedups in practice, as benchmarked on several supercomputers (Cray XT4, Cray XE6, and IBM BG/P). Our parallelization approach is simple to implement and extends to other algorithms. Both the lower bounds and the new algorithms bear directly on saving power and energy at the algorithmic level.
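
For readers unfamiliar with Strassen's method: the recursion on which the new algorithm builds replaces the eight half-sized multiplications of the classical recursion with seven, and the parallel algorithm distributes this recursion across processors. A sequential Python sketch of the recursion, assuming square matrices of power-of-two dimension (illustrative only, not the talk's implementation):

    import numpy as np

    def strassen(A, B, cutoff=128):
        # Strassen's recursion: 7 half-sized products instead of 8.
        n = A.shape[0]
        if n <= cutoff:
            return A @ B  # fall back to classical multiply on small blocks
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A11 + A22, B11 + B22, cutoff)
        M2 = strassen(A21 + A22, B11, cutoff)
        M3 = strassen(A11, B12 - B22, cutoff)
        M4 = strassen(A22, B21 - B11, cutoff)
        M5 = strassen(A11 + A12, B22, cutoff)
        M6 = strassen(A21 - A11, B11 + B12, cutoff)
        M7 = strassen(A12 - A22, B21 + B22, cutoff)
        # Reassemble the four quadrants of C from the seven products.
        return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                         [M2 + M4, M1 - M2 + M3 + M6]])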

Based on joint work with Grey Ballard, James Demmel, Olga Holtz, and Ben Lipshitz.