Cost-intelligent application-specific data layout optimization for parallel file systems

Huaiming Song, Yanlong Yin, Yong Chen, Xian He Sun

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Parallel file systems have been developed in recent years to ease the I/O bottleneck of high-end computing systems. These advanced file systems offer several data layout strategies in order to meet the performance goals of specific I/O workloads. However, a layout policy that performs well for one I/O workload may not perform as well for another. Peak I/O performance is rarely achieved because data access patterns are complex and application dependent. In this study, a cost-intelligent data access strategy based on the application-specific optimization principle is proposed to improve the I/O performance of parallel file systems. We first present examples to illustrate how performance differs under different data layouts. We then develop a cost model that estimates the completion time of data accesses under various data layouts, so that the layout can be chosen to better match the application. Static layout optimization can be used for applications with dominant data access patterns, and dynamic layout selection with hybrid replications can be used for applications with complex I/O patterns. Theoretical analysis and experimental testing have been conducted to verify the proposed cost-intelligent layout approach. Analytical and experimental results show that the proposed cost model is effective and that the application-specific data layout approach can provide up to a 74% performance improvement for data-intensive applications.
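The idea of cost-based layout selection can be illustrated with a minimal sketch. It is not the cost model from the article, whose terms the abstract does not give. It is a toy model under stated assumptions: every name here (`Request`, `Layout`, `estimated_time`, `choose_layout`), the two layouts, the round-robin server placement, and the cost terms (a fixed per-piece startup time plus transfer time at a fixed per-server bandwidth) are hypothetical and chosen only to show that the cheapest layout depends on the access pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    process: int     # rank of the issuing process (hypothetical field)
    size_bytes: int  # bytes requested


@dataclass(frozen=True)
class Layout:
    name: str
    stripe_width: int  # number of servers each request is split across


def estimated_time(layout, requests, n_servers, startup_s, bandwidth_bps):
    """Toy estimate: the busiest server determines completion time.

    Assumption: each request is split evenly into `stripe_width` pieces,
    placed round-robin starting at server (process * stripe_width), and each
    piece costs a fixed startup time plus its transfer time on that server.
    """
    load = [0.0] * n_servers
    for req in requests:
        piece = req.size_bytes / layout.stripe_width
        for j in range(layout.stripe_width):
            server = (req.process * layout.stripe_width + j) % n_servers
            load[server] += startup_s + piece / bandwidth_bps
    return max(load)


def choose_layout(layouts, requests, n_servers, startup_s, bandwidth_bps):
    """Return the candidate layout with the lowest estimated completion time."""
    return min(
        layouts,
        key=lambda l: estimated_time(l, requests, n_servers, startup_s, bandwidth_bps),
    )


N_SERVERS = 4
STARTUP_S = 0.001        # assumed per-piece startup cost, in seconds
BANDWIDTH_BPS = 100e6    # assumed per-server bandwidth, in bytes per second
SIZE = 64 * 2**20        # 64 MiB per request

CANDIDATES = [
    Layout("one_server_per_request", 1),
    Layout("stripe_across_all_servers", N_SERVERS),
]

# Workload A: a single process issues one large request.
single = [Request(process=0, size_bytes=SIZE)]
# Workload B: four processes issue one large request each, concurrently.
concurrent = [Request(process=p, size_bytes=SIZE) for p in range(4)]

best_single = choose_layout(CANDIDATES, single, N_SERVERS, STARTUP_S, BANDWIDTH_BPS)
best_concurrent = choose_layout(CANDIDATES, concurrent, N_SERVERS, STARTUP_S, BANDWIDTH_BPS)
print(best_single.name, best_concurrent.name)
```

In this toy model the single-process workload favors striping across all servers, because the transfer is parallelized, while the four-process workload favors one server per request, because striping only adds per-piece startup cost once every server is already busy. A realistic model would need further terms, such as network and storage-device costs.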

Original language: English
Pages (from-to): 285-298
Number of pages: 14
Journal: Cluster Computing
Volume: 16
Issue number: 2
DOIs
State: Published - Jun 2013

Keywords

  • Data layout
  • Data-intensive computing
  • I/O performance modeling
  • Parallel I/O
  • Parallel file systems
