Impact of Mixed-Parallelism on Parallel Implementations of the Strassen and Winograd Matrix Multiplication Algorithms

Concurrency and Computation: Practice and Experience - United Kingdom
doi: 10.1002/cpe.791