TY - GEN
T1 - PAC
AU - Wang, Xi
AU - Leidel, John D.
AU - Williams, Brody
AU - Chen, Yong
N1 - Publisher Copyright: © 2020 ACM. Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/6/23
Y1 - 2020/6/23
N2 - Many contemporary data-intensive applications exhibit irregular and highly concurrent memory access patterns and thus challenge the performance of conventional memory systems. Driven by an expanding need for high-bandwidth memory featuring low access latency, 3D-stacked memory devices, such as the Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM), were designed to provide significantly higher throughput as compared to standard JEDEC DDR devices. However, existing memory interfaces and coalescing models, designed for conventional DDR devices, are unable to fully exploit the bandwidth potential inherent in these new 3D-stacked memory devices. In order to remedy this disparity, we introduce in this work a novel paged adaptive coalescer (PAC) infrastructure with a scalable coalescing network for 3D-stacked memory. We present the design and simulated implementation of this approach on RISC-V embedded cores with attached HMC devices. We have carried out extensive evaluations and the results show that the proposed PAC methodology yields an average coalescing efficiency of 56.01%. Further, our evaluation results also show that the PAC reduces bank conflicts and the power consumption by 85.16% and 59.21%, respectively. Overall, PAC achieves an average performance gain of 14.35% (and up to 26.06%) across 14 test suites. These results showcase the potential of the PAC methodology as applied to architecture design for increasingly critical data-intensive algorithms and applications.
KW - 3D-stacked memory
KW - data-intensive computing
KW - memory coalescing
UR - http://www.scopus.com/inward/record.url?scp=85088390817&partnerID=8YFLogxK
U2 - 10.1145/3369583.3392670
DO - 10.1145/3369583.3392670
M3 - Conference contribution
AN - SCOPUS:85088390817
T3 - HPDC 2020 - Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing
SP - 137
EP - 148
BT - HPDC 2020 - Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing
PB - Association for Computing Machinery, Inc
Y2 - 23 June 2020 through 26 June 2020
ER -