TY - JOUR
T1 - I/O characteristic discovery for storage system optimizations
AU - Zhou, Jiang
AU - Chen, Yong
AU - Dai, Dong
AU - Zhuang, Yu
AU - Wang, Weiping
N1 - Funding Information:
This research is supported in part by the National Science Foundation, USA under grants CCF-1409946, CNS-1526055, CCF-1718336, OAC-1835892, and CNS-1817094. This research is also supported by Beijing Municipal Science and Technology Commission, China under Project No. Z191100007119002 and the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDC02010900.
Publisher Copyright:
© 2020 Elsevier Inc.
PY - 2021/2/1
Y1 - 2021/2/1
N2 - In this paper, we introduce a new I/O characteristic discovery methodology for performance optimizations on object-based storage systems. Unlike traditional methods that select a limited set of access attributes or rely heavily on domain knowledge about applications’ I/O behaviors, our method captures as many data-access features as possible to eliminate human bias. It uses a machine-learning-based strategy (principal component analysis, PCA) to derive the most important set of features automatically, and groups data objects with a clustering algorithm (DBSCAN) to reveal the discovered I/O characteristics. We have evaluated the proposed I/O characteristic discovery solution on the Sheepdog storage system and further implemented a data prefetching mechanism as a sample use case of this approach. Evaluation results confirm that the proposed solution can successfully identify access patterns and achieve efficient data prefetching, improving the buffer cache hit ratio by up to 48.24%. The overall performance was improved by up to 42%.
AB - In this paper, we introduce a new I/O characteristic discovery methodology for performance optimizations on object-based storage systems. Unlike traditional methods that select a limited set of access attributes or rely heavily on domain knowledge about applications’ I/O behaviors, our method captures as many data-access features as possible to eliminate human bias. It uses a machine-learning-based strategy (principal component analysis, PCA) to derive the most important set of features automatically, and groups data objects with a clustering algorithm (DBSCAN) to reveal the discovered I/O characteristics. We have evaluated the proposed I/O characteristic discovery solution on the Sheepdog storage system and further implemented a data prefetching mechanism as a sample use case of this approach. Evaluation results confirm that the proposed solution can successfully identify access patterns and achieve efficient data prefetching, improving the buffer cache hit ratio by up to 48.24%. The overall performance was improved by up to 42%.
KW - Access pattern analysis
KW - I/O characteristic discovery
KW - I/O optimization
KW - Object-based storage
KW - Parallel/distributed file systems
UR - http://www.scopus.com/inward/record.url?scp=85092747349&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2020.08.005
DO - 10.1016/j.jpdc.2020.08.005
M3 - Article
AN - SCOPUS:85092747349
SN - 0743-7315
VL - 148
SP - 1
EP - 13
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
ER -