TY - GEN
T1 - Memory coalescing for hybrid memory cube
AU - Wang, Xi
AU - Leidel, John D.
AU - Chen, Yong
N1 - Publisher Copyright:
© 2018 Copyright held by the owner/author(s).
PY - 2018/8/13
Y1 - 2018/8/13
N2 - Arguably, many data-intensive applications pose significant challenges to conventional architectures and memory systems, especially when applications exhibit non-contiguous, irregular, and small memory access patterns. The long memory access latency can dramatically slow down the overall performance of applications. The growing desire of high memory bandwidth and low latency access stimulate the advent of novel 3D-staked memory devices such as the Hybrid Memory Cube (HMC), which provides significantly higher bandwidth compared with the conventional JEDEC DDR devices. Even though many existing studies have been devoted to achieving high bandwidth throughput of HMC, the bandwidth potential cannot be fully exploited due to the lack of highly efficient memory coalescing and interfacing methodology for HMC devices. In this research, we introduce a novel memory coalescer methodology that facilitates memory bandwidth efficiency and the overall performance through an efficient and scalable memory request coalescing interface for HMC. We present the design and implementation of this approach on RISC-V embedded cores with attached HMC devices. Our evaluation results show that the new memory coalescer eliminates 47.47% memory accesses to HMC and improves the overall performance by 13.14% on average.
AB - Arguably, many data-intensive applications pose significant challenges to conventional architectures and memory systems, especially when applications exhibit non-contiguous, irregular, and small memory access patterns. The long memory access latency can dramatically slow down the overall performance of applications. The growing desire of high memory bandwidth and low latency access stimulate the advent of novel 3D-staked memory devices such as the Hybrid Memory Cube (HMC), which provides significantly higher bandwidth compared with the conventional JEDEC DDR devices. Even though many existing studies have been devoted to achieving high bandwidth throughput of HMC, the bandwidth potential cannot be fully exploited due to the lack of highly efficient memory coalescing and interfacing methodology for HMC devices. In this research, we introduce a novel memory coalescer methodology that facilitates memory bandwidth efficiency and the overall performance through an efficient and scalable memory request coalescing interface for HMC. We present the design and implementation of this approach on RISC-V embedded cores with attached HMC devices. Our evaluation results show that the new memory coalescer eliminates 47.47% memory accesses to HMC and improves the overall performance by 13.14% on average.
KW - Cache
KW - Hybrid memory cube
KW - MSHR
KW - Memory coalescing
UR - http://www.scopus.com/inward/record.url?scp=85054830411&partnerID=8YFLogxK
U2 - 10.1145/3225058.3225062
DO - 10.1145/3225058.3225062
M3 - Conference contribution
AN - SCOPUS:85054830411
SN - 9781450365109
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018
PB - Association for Computing Machinery
T2 - 47th International Conference on Parallel Processing, ICPP 2018
Y2 - 13 August 2018 through 16 August 2018
ER -