Memory-conscious collective I/O for extreme scale HPC systems

Yin Lu, Yong Chen, Yu Zhuang, Rajeev Thakur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Upcoming extreme scale platforms are expected to have millions of nodes with hundreds to thousands of small cores per node. The continuing decrease in memory capacity per core and the increasing disparity between core count and off-chip memory bandwidth can lead to significant challenges for I/O operations in extreme scale systems. Collective I/O is a critical I/O optimization technique, and the extreme scale challenges require rethinking this strategy to effectively exploit the correlation among I/O accesses. In this study, considering the constraints of memory capacity and bandwidth, we introduce Memory-Conscious Collective I/O. The new collective I/O strategy restricts aggregation data traffic to disjoint subgroups, coordinates I/O accesses at the intra-node and inter-node layers, and determines I/O aggregators at run time, taking into account memory consumption and variance among processes. Preliminary results demonstrate that this strategy holds promise in mitigating memory pressure, alleviating contention for memory bandwidth, and improving I/O performance for projected extreme scale HPC systems.
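
The two ideas named in the abstract, disjoint subgroups and run-time aggregator selection based on memory, can be illustrated with a small sketch. This is not the authors' implementation: the `Proc` record, the fixed `group_size`, the `free_mem` field, and the "most free memory wins" rule are all assumptions made for illustration, and a real system would obtain these values through the MPI runtime rather than from a hard-coded list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Proc:
    """Hypothetical description of one process (not from the paper)."""
    rank: int
    node: int
    free_mem: int  # bytes assumed available for an aggregation buffer


def partition_into_subgroups(procs: List[Proc], group_size: int) -> List[List[Proc]]:
    """Split processes into disjoint subgroups of consecutive ranks."""
    ordered = sorted(procs, key=lambda p: p.rank)
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


def choose_aggregator(group: List[Proc], buffer_bytes: int) -> Optional[Proc]:
    """Pick the process with the most free memory that can hold the buffer.

    Ties go to the lowest rank. Returns None if no process in the
    subgroup has enough free memory.
    """
    candidates = [p for p in group if p.free_mem >= buffer_bytes]
    if not candidates:
        return None
    return max(candidates, key=lambda p: (p.free_mem, -p.rank))


def select_aggregators(procs: List[Proc], group_size: int,
                       buffer_bytes: int) -> List[Optional[Proc]]:
    """Choose one aggregator per disjoint subgroup."""
    groups = partition_into_subgroups(procs, group_size)
    return [choose_aggregator(g, buffer_bytes) for g in groups]


if __name__ == "__main__":
    procs = [
        Proc(rank=0, node=0, free_mem=100),
        Proc(rank=1, node=0, free_mem=400),
        Proc(rank=2, node=1, free_mem=300),
        Proc(rank=3, node=1, free_mem=300),
        Proc(rank=4, node=2, free_mem=10),
    ]
    for agg in select_aggregators(procs, group_size=2, buffer_bytes=64):
        print(agg.rank if agg is not None else "no eligible aggregator")
```

Because each process belongs to exactly one subgroup, aggregation traffic in this sketch never crosses subgroup boundaries. The selection rule is only one plausible reading of "considering memory consumption and variance among processes"; the abstract does not specify the actual criterion.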

Original language: English
Title of host publication: Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2013 - In Conjunction with ICS 2013
State: Published - 2013
Event: 3rd International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2013 - In Conjunction with ICS 2013 - Eugene, OR, United States
Duration: Jun 10 2013 → Jun 10 2013

Keywords

  • collective I/O
  • extreme scale system
  • high performance computing
  • many-core architecture
  • parallel I/O
