TY - GEN
T1 - A dataflow-based runtime support on a 100P actual system
AU - Su, Zhichao
AU - Chen, Junshi
AU - Lin, Han
AU - An, Hong
AU - Han, Wenting
AU - Yu, Yang
AU - Liao, Chenzhi
AU - Chen, Yong
N1 - Funding Information:
The work is supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000403). We are grateful to the National Supercomputing Center in Wuxi for providing the experimental environment for this paper.
Publisher Copyright:
© 2017 IEEE.
PY - 2018/5/25
Y1 - 2018/5/25
N2 - Chips equipped with numerous simple cores and heterogeneous computing resources have become mainstream in current supercomputer system design. However, for many real-world scientific applications, off-the-shelf parallel models cannot adapt to such architectures effectively, which makes it challenging both to design programs and to exploit system performance. To solve this problem, a fine-grained, event-driven program execution model based on the dataflow method, Codelet, has been proposed. By providing runtime support between system interfaces and Codelet-based applications, fine-grained parallelism can be exploited and high utilization of computing resources can be obtained. In this paper, we therefore design and implement a dataflow-based runtime support, SunwayFlow, on a 100P actual system, the Sunway TaihuLight, the supercomputer with the highest computing performance in the world so far, to provide a user-friendly and promising way to utilize this supercomputer fully. To evaluate the efficiency of SunwayFlow, we choose HPCG as the case study and refactor it onto SunwayFlow. We carefully rewrite the main computing kernels of HPCG, especially the most time-consuming and intricate one, the symmetric Gauss-Seidel relaxation function, for which a speedup of 11.79X is achieved. Moreover, overall HPCG performance reaches 2.47 GFlops on a single core group and 534.98 GFlops on 256 core groups.
AB - Chips equipped with numerous simple cores and heterogeneous computing resources have become mainstream in current supercomputer system design. However, for many real-world scientific applications, off-the-shelf parallel models cannot adapt to such architectures effectively, which makes it challenging both to design programs and to exploit system performance. To solve this problem, a fine-grained, event-driven program execution model based on the dataflow method, Codelet, has been proposed. By providing runtime support between system interfaces and Codelet-based applications, fine-grained parallelism can be exploited and high utilization of computing resources can be obtained. In this paper, we therefore design and implement a dataflow-based runtime support, SunwayFlow, on a 100P actual system, the Sunway TaihuLight, the supercomputer with the highest computing performance in the world so far, to provide a user-friendly and promising way to utilize this supercomputer fully. To evaluate the efficiency of SunwayFlow, we choose HPCG as the case study and refactor it onto SunwayFlow. We carefully rewrite the main computing kernels of HPCG, especially the most time-consuming and intricate one, the symmetric Gauss-Seidel relaxation function, for which a speedup of 11.79X is achieved. Moreover, overall HPCG performance reaches 2.47 GFlops on a single core group and 534.98 GFlops on 256 core groups.
KW - Codelet
KW - Data flow
KW - HPCG
KW - Runtime support
KW - The Sunway TaihuLight
UR - http://www.scopus.com/inward/record.url?scp=85048359581&partnerID=8YFLogxK
U2 - 10.1109/ISPA/IUCC.2017.00096
DO - 10.1109/ISPA/IUCC.2017.00096
M3 - Conference contribution
AN - SCOPUS:85048359581
T3 - Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
SP - 599
EP - 606
BT - Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
A2 - Martinez, Gregorio
A2 - Hill, Richard
A2 - Fox, Geoffrey
A2 - Mueller, Peter
A2 - Wang, Guojun
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 December 2017 through 15 December 2017
ER -