TY - JOUR
T1 - Collective input/output under memory constraints
AU - Lu, Yin
AU - Chen, Yong
AU - Zhuang, Yu
AU - Liu, Jialin
AU - Thakur, Rajeev
N1 - Funding Information:
This work was supported in part by the National Science Foundation (NSF; grant CNS-1162488) and in part by the US Department of Energy, Office of Science (contract DE-AC02-06CH11357).
Publisher Copyright:
© The Author(s) 2014.
PY - 2015/2/13
Y1 - 2015/2/13
N2 - Compared with current high-performance computing (HPC) systems, exascale systems are expected to have much less memory per node, which can significantly reduce collective input/output (I/O) performance. In this study, we introduce a memory-conscious collective I/O strategy that takes memory capacity and bandwidth constraints into account. The new strategy restricts aggregation data traffic to disjoint subgroups, coordinates I/O accesses across intranode and internode layers, and determines I/O aggregators at run time based on memory consumption among processes. We have prototyped the design and evaluated it with commonly used benchmarks to verify its potential. The evaluation results demonstrate that this strategy holds promise for mitigating memory pressure, alleviating contention for memory bandwidth, and improving I/O performance on projected extreme-scale systems. Given the importance of supporting increasingly data-intensive workloads and the projected memory constraints on increasingly large-scale HPC systems, this new memory-conscious collective I/O can have a significant positive impact on scientific discovery productivity.
AB - Compared with current high-performance computing (HPC) systems, exascale systems are expected to have much less memory per node, which can significantly reduce collective input/output (I/O) performance. In this study, we introduce a memory-conscious collective I/O strategy that takes memory capacity and bandwidth constraints into account. The new strategy restricts aggregation data traffic to disjoint subgroups, coordinates I/O accesses across intranode and internode layers, and determines I/O aggregators at run time based on memory consumption among processes. We have prototyped the design and evaluated it with commonly used benchmarks to verify its potential. The evaluation results demonstrate that this strategy holds promise for mitigating memory pressure, alleviating contention for memory bandwidth, and improving I/O performance on projected extreme-scale systems. Given the importance of supporting increasingly data-intensive workloads and the projected memory constraints on increasingly large-scale HPC systems, this new memory-conscious collective I/O can have a significant positive impact on scientific discovery productivity.
KW - Exascale system
KW - collective input/output
KW - data-intensive computing
KW - high-performance computing
KW - many-core architecture
KW - parallel input/output
UR - http://www.scopus.com/inward/record.url?scp=84922565727&partnerID=8YFLogxK
U2 - 10.1177/1094342014561696
DO - 10.1177/1094342014561696
M3 - Article
AN - SCOPUS:84922565727
VL - 29
SP - 21
EP - 36
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
SN - 1094-3420
IS - 1
ER -