I/O characteristic discovery for storage system optimizations

Jiang Zhou, Yong Chen, Dong Dai, Yu Zhuang, Weiping Wang

Research output: Contribution to journal › Article › peer-review


In this paper, we introduce a new I/O characteristic discovery methodology for performance optimizations on object-based storage systems. Unlike traditional methods that select a limited set of access attributes or rely heavily on domain knowledge about applications’ I/O behaviors, our method captures as many data-access features as possible to eliminate human bias. It uses a machine-learning-based strategy (principal component analysis, PCA) to derive the most important set of features automatically, and groups data objects with a clustering algorithm (DBSCAN) to reveal the discovered I/O characteristics. We have evaluated the proposed I/O characteristic discovery solution on the Sheepdog storage system and further implemented a data prefetching mechanism as a sample use case of this approach. Evaluation results confirm that the proposed solution can successfully identify access patterns and achieve efficient data prefetching by improving the buffer cache hit ratio up to 48.24%. The overall performance was improved by up to 42%.
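The two-stage idea described in the abstract (reduce a wide feature set with PCA, then cluster the objects with DBSCAN) can be illustrated with a minimal sketch. This is not the authors' implementation: the three per-object features (mean request size, read ratio, accesses per minute), the synthetic values, and the `eps` and `min_pts` settings are all assumptions chosen for illustration, and the abstract gives no parameters. The sketch uses only the Python standard library; a real pipeline would more likely rely on an existing library such as scikit-learn.

```python
import math
import random


def standardize(rows):
    """Scale every feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def pca(rows, k, iters=200):
    """Project standardized rows onto the top-k principal components.

    Components are found by power iteration with deflation on the
    covariance matrix (a simple stand-in for a full eigendecomposition).
    """
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    rng = random.Random(0)
    comps = []
    for _ in range(k):
        v = [rng.random() for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            v = [x / norm for x in w]
        # Eigenvalue estimate (Rayleigh quotient), then remove this component.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                  for i in range(d))
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
        comps.append(v)
    return [[sum(a * b for a, b in zip(r, c)) for c in comps] for r in rows]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nb)
    return labels


def demo():
    """Cluster 40 synthetic objects drawn from two hypothetical access patterns."""
    rng = random.Random(42)
    objs = []
    for _ in range(20):  # small, read-heavy, frequently accessed objects
        objs.append([rng.gauss(4, 0.5), rng.gauss(0.9, 0.02), rng.gauss(100, 5)])
    for _ in range(20):  # large, write-heavy, rarely accessed objects
        objs.append([rng.gauss(512, 20), rng.gauss(0.1, 0.02), rng.gauss(5, 1)])
    reduced = pca(standardize(objs), k=2)
    return dbscan(reduced, eps=0.8, min_pts=4)


if __name__ == "__main__":
    print(demo())
```

In this toy setting the two synthetic access patterns end up in two separate clusters. Objects that share a cluster label would be natural candidates for being prefetched together, which is the kind of use case the abstract mentions.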

Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: Journal of Parallel and Distributed Computing
State: Published - Feb 1 2021


  • Access pattern analysis
  • I/O characteristic discovery
  • I/O optimization
  • Object-based storage
  • Parallel/distributed file systems

