TY - GEN
T1 - HMC-Sim-2.0
T2 - 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016
AU - Leidel, John D.
AU - Chen, Yong
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/7/18
Y1 - 2016/7/18
N2 - The recent advent of stacked memory devices has led to a resurgence of research associated with the fundamental memory hierarchy and associated memory pipeline. The bandwidth advantages provided by stacked logic and DRAM devices have inspired research associated with eliminating the bandwidth bottlenecks associated with many applications in high performance computing. Further, recent efforts have focused on utilizing stacked memory devices as last-level caches. In addition to the two aforementioned focus areas, a third area of research is emerging to explore augmenting the stacked memory logic layer with additional operations. The first generation of Hybrid Memory Cube (HMC) devices provided rudimentary atomic memory operations. The Gen2 Hybrid Memory Cube devices provide more expressive atomic memory operations that include primitive integer arithmetic operations. Despite the inclusion of more expressive arithmetic operations, many users have expressed interest in more complex and potentially orthogonal custom memory cube, or CMC, operations in future revisions of the Hybrid Memory Cube specification. This work presents recent development associated with the HMC-Sim Hybrid Memory Cube simulation framework that provides users a powerful infrastructure to experiment and research augmented custom memory cube, or CMC, operations within the current Gen2 Hybrid Memory Cube device infrastructure. We provide an overview of extending the original HMC-Sim simulation infrastructure to include support for CMC operations without requiring users to modify the core simulation code base. We also present three examples of building and utilizing custom, user-defined CMC operations in sample simulations to exhibit potential application speedup with future HMC device specifications. In doing so, we present a model to replace traditional thread mutexes with custom HMC mutex commands.
AB - The recent advent of stacked memory devices has led to a resurgence of research associated with the fundamental memory hierarchy and associated memory pipeline. The bandwidth advantages provided by stacked logic and DRAM devices have inspired research associated with eliminating the bandwidth bottlenecks associated with many applications in high performance computing. Further, recent efforts have focused on utilizing stacked memory devices as last-level caches. In addition to the two aforementioned focus areas, a third area of research is emerging to explore augmenting the stacked memory logic layer with additional operations. The first generation of Hybrid Memory Cube (HMC) devices provided rudimentary atomic memory operations. The Gen2 Hybrid Memory Cube devices provide more expressive atomic memory operations that include primitive integer arithmetic operations. Despite the inclusion of more expressive arithmetic operations, many users have expressed interest in more complex and potentially orthogonal custom memory cube, or CMC, operations in future revisions of the Hybrid Memory Cube specification. This work presents recent development associated with the HMC-Sim Hybrid Memory Cube simulation framework that provides users a powerful infrastructure to experiment and research augmented custom memory cube, or CMC, operations within the current Gen2 Hybrid Memory Cube device infrastructure. We provide an overview of extending the original HMC-Sim simulation infrastructure to include support for CMC operations without requiring users to modify the core simulation code base. We also present three examples of building and utilizing custom, user-defined CMC operations in sample simulations to exhibit potential application speedup with future HMC device specifications. In doing so, we present a model to replace traditional thread mutexes with custom HMC mutex commands.
KW - Custom Memory Cube
KW - Hybrid Memory Cube
KW - Simulation
UR - http://www.scopus.com/inward/record.url?scp=84991691000&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2016.43
DO - 10.1109/IPDPSW.2016.43
M3 - Conference contribution
AN - SCOPUS:84991691000
T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
SP - 621
EP - 630
BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2016 through 27 May 2016
ER -