Concurrent dynamic memory coalescing on Goblin Core-64 architecture

Xi Wang, John D. Leidel, Yong Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Scopus citations

Abstract

The majority of modern microprocessors are architected to utilize multi-level data caches as a primary optimization to reduce latency and increase the bandwidth perceived by an application. The spatial and temporal locality exploited by data caches works well in conjunction with applications that access memory in a linear fashion. However, applications that exhibit random or non-deterministic memory access patterns often induce a significant number of data cache misses, thus reducing the natural performance benefit of the data cache. In response to the performance penalties inherent in non-deterministic applications, we have constructed a unique memory hierarchy within the Goblin Core-64 (GC64) architecture explicitly designed to exploit memory performance from irregular memory access patterns. The GC64 architecture couples a RISC-V-based core with latency-hiding architectural features to a memory hierarchy built on Hybrid Memory Cube (HMC) devices. To cope with the inherent non-determinism of applications and to exploit the packetized interface presented by the HMC device, we develop a methodology and associated implementation of a dynamic memory coalescing unit for the GC64 memory hierarchy that permits us to statistically sample memory requests from non-deterministic applications and coalesce them into the largest possible HMC payload requests. In this work, we present two parallel methodologies and associated implementations for coalescing non-deterministic memory requests into the largest potential HMC request by constructing a binary tree representation of the live memory requests from disparate cores. We present the coalesced HMC memory request results from applications that exhibit linear and non-linear memory request patterns compiled for a RISC-V core in contrast with a traditional memory hierarchy.
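
The coalescing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the binary tree of live requests with a simple sort, which yields the same in-order traversal, and it runs sequentially rather than in parallel. The function name `coalesce` and the constants `FLIT_BYTES` (16-byte request granularity) and `MAX_PAYLOAD` (128-byte largest request), as well as the rule that a coalesced request may not cross a `MAX_PAYLOAD`-aligned boundary, are assumptions chosen for illustration and are not taken from this record.

```python
from typing import Iterable, List, Tuple

FLIT_BYTES = 16     # assumed request granularity, in bytes
MAX_PAYLOAD = 128   # assumed largest payload of one request, in bytes


def coalesce(addresses: Iterable[int]) -> List[Tuple[int, int]]:
    """Merge byte addresses into (base_address, size_in_bytes) requests."""
    # Map each address to its FLIT-sized block and drop duplicates. Sorting
    # stands in for an in-order walk of a binary tree of live requests.
    blocks = sorted({addr // FLIT_BYTES for addr in addresses})
    blocks_per_window = MAX_PAYLOAD // FLIT_BYTES

    requests: List[Tuple[int, int]] = []
    run_start = run_end = None

    def flush() -> None:
        # Emit the current run of contiguous blocks as one request.
        if run_start is not None:
            size = (run_end - run_start + 1) * FLIT_BYTES
            requests.append((run_start * FLIT_BYTES, size))

    for block in blocks:
        extends_run = (
            run_end is not None
            and block == run_end + 1
            # Assumed rule: stay inside one MAX_PAYLOAD-aligned window.
            and block // blocks_per_window == run_start // blocks_per_window
        )
        if extends_run:
            run_end = block
        else:
            flush()
            run_start = run_end = block
    flush()
    return requests
```

Under these assumptions, eight sequential 16-byte accesses starting at address 0 collapse into a single 128-byte request, whereas scattered accesses remain separate 16-byte requests. That contrast is the difference between linear and irregular access patterns that the abstract draws.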

Original language: English
Title of host publication: MEMSYS 2016 - Proceedings of the International Symposium on Memory Systems
Publisher: Association for Computing Machinery
Pages: 177-187
Number of pages: 11
ISBN (Electronic): 9781450343053
DOI: https://doi.org/10.1145/2989081.2989128
State: Published - Oct 3 2016
Event: 2nd International Symposium on Memory Systems, MEMSYS 2016 - Washington, United States
Duration: Oct 3 2016 → Oct 6 2016

Publication series

Name: ACM International Conference Proceeding Series
Volume: 03-06-October-2016

Conference

Conference: 2nd International Symposium on Memory Systems, MEMSYS 2016
Country: United States
City: Washington
Period: 10/3/16 → 10/6/16

Keywords

  • Data-intensive computing
  • Goblin Core-64
  • Memory coalescing
  • Microcode
  • Parallel computing
  • RISC-V

Cite this

Wang, X., Leidel, J. D., & Chen, Y. (2016). Concurrent dynamic memory coalescing on Goblin Core-64 architecture. In MEMSYS 2016 - Proceedings of the International Symposium on Memory Systems (pp. 177-187). (ACM International Conference Proceeding Series; Vol. 03-06-October-2016). Association for Computing Machinery. https://doi.org/10.1145/2989081.2989128