Toward a microarchitecture for efficient execution of irregular applications

John D. Leidel, Xi Wang, Brody Williams, Yong Chen

Research output: Contribution to journal › Article › peer-review


Given the increasing importance of efficient data-intensive computing, we find that modern processor designs are not well suited to the irregular memory access patterns often found in these algorithms. Applications and algorithms that do not exhibit spatial and temporal memory request locality induce high latency and low memory bandwidth due to the high cache miss rate. In response to the performance penalties inherently present in applications with irregular memory accesses, we introduce a GoblinCore-64 (GC64) architecture and a unique memory hierarchy that are explicitly designed to exploit memory performance from irregular memory access patterns. GC64 provides a pressure-driven hardware-managed concurrency control to minimize pipeline stalls and lower the latency of context switches. A novel memory coalescing model is also introduced to enhance the performance of memory systems via request aggregations. We have evaluated the performance benefits of our approach using a series of 24 benchmarks and the results show nearly 50% memory request reductions and a performance acceleration of up to 14.6×.

Original language: English
Article number: 26
Journal: ACM Transactions on Parallel Computing
Issue number: 4
State: Published - Nov 2020


  • Data-intensive computing
  • context switching
  • irregular algorithms
  • thread concurrency

