Communication Avoiding Power Scaling

John Leidel, Yong Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Recent system-on-chip (SoC) techniques have permitted the continued scaling of core densities at a rate sufficient to track Moore's Law. However, this continued increase in transistor density has warranted new hardware features in order to sufficiently scale the degree of on-chip concurrency. Features such as complex multi-level caches, hierarchical core configurations and hardware-assisted threading have increased the overall energy requirements of the SoC and decreased the programmer's ability to realize efficient scaling. This increase in overall system power requirements has resulted in research and development activities associated with hardware techniques such as dynamic frequency scaling and software techniques such as power-aware, fine-grained thread scheduling algorithms. We present the basis for a third area of research: power-scaling algorithmic complexity. The goal of this research focus is to describe techniques by which one may weigh the timing and power derivatives of competing parallel algorithms in order to provide the data necessary to make algorithmic choices based upon both the projected performance and the expected power requirements. This work presents a model and associated technique to describe the relative energy performance scaling characteristics of parallel and mixed parallel-sequential algorithms. The model and equations are then applied to a study of matrix multiplication techniques on a symmetric multiprocessing platform. We utilize a tuned OpenBLAS blocking matrix multiplication, a classic parallel Strassen-Winograd technique and a Communication Avoiding Parallel Strassen (CAPS) technique to elicit the relative energy performance scaling on this platform. In doing so, we show that while a blocking matrix multiplication may provide the highest potential performance on our platform, both the Strassen and CAPS techniques have ideal energy scaling properties. Furthermore, we show that by reducing the communication requirements of Strassen multiplication, we have the ability to gain a slight improvement in power scaling over traditional Strassen implementations.

Original language: English
Title of host publication: Proceedings - 2015 International Conference on Parallel Processing Workshops, The 44th Annual Conference, ICPPW 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 9
ISBN (Electronic): 9781467375894
State: Published - Dec 8 2015
Event: 44th Annual Conference of the International Conference on Parallel Processing Workshops, ICPPW 2015 - Beijing, China
Duration: Sep 1 2015 to Sep 4 2015

Publication series

Name: Proceedings of the International Conference on Parallel Processing Workshops
ISSN (Print): 1530-2016


Conference: 44th Annual Conference of the International Conference on Parallel Processing Workshops, ICPPW 2015


  • High performance computing
  • Multithreading
  • Parallel algorithms
  • Parallel programming
  • Performance analysis

