Hiding I/O latency with pre-execution prefetching for parallel applications

Yong Chen, Surendra Byna, Xian He Sun, Rajeev Thakur, William Gropp

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

52 Scopus citations

Abstract

Parallel applications are usually able to achieve high computational performance but suffer from large latency in I/O accesses. I/O prefetching is an effective solution for masking the latency. Most existing I/O prefetching techniques, however, are conservative, and their effectiveness is limited by low accuracy and coverage. As the processor-I/O performance gap has been increasing rapidly, data-access delay has become a dominant performance bottleneck. We argue that it is time to revisit the "I/O wall" problem and trade excess computing power for data-access speed. We propose a novel pre-execution approach for masking I/O latency. We describe the pre-execution I/O prefetching framework, the pre-execution thread construction methodology, the underlying library support, and the prototype implementation in the ROMIO MPI-IO implementation in MPICH2. Preliminary experiments show that the pre-execution approach is promising in reducing I/O access latency and has real potential.

Original language: English
Title of host publication: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
State: Published - 2008
Event: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008 - Austin, TX, United States
Duration: Nov 15 2008 to Nov 21 2008

Conference

Conference: 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
Country/Territory: United States
City: Austin, TX
Period: 11/15/08 to 11/21/08
