TY - GEN
T1 - I/O Characteristics Discovery in Cloud Storage Systems
AU - Zhou, Jiang
AU - Dai, Dong
AU - Mao, Yu
AU - Chen, Xin
AU - Zhuang, Yu
AU - Chen, Yong
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/9/7
Y1 - 2018/9/7
AB - Data growth from many cloud applications poses significant challenges to cloud storage systems. Delivering the best possible storage and I/O performance often requires understanding and leveraging the I/O characteristics of data accesses. A number of research studies have addressed this topic. However, most of them either use a limited number of data-access attributes, which restricts the general applicability of the method across applications, or rely heavily on domain knowledge or expertise about applications' I/O behaviors to select the most representative features, which introduces bias for certain workloads. To overcome these limitations, this study presents a new I/O characteristics discovery methodology. The method captures as many data-access features as possible to eliminate human bias. It uses a machine-learning-based strategy to derive the most important set of features automatically, and groups data objects with a clustering algorithm (DBSCAN) to reveal the discovered I/O characteristics. The revealed I/O characteristics can direct I/O performance optimizations in numerous scenarios, such as data prefetching and data reorganization in cloud storage systems.
KW - Cloud storage systems
KW - File systems
KW - I/O characteristics discovery
UR - http://www.scopus.com/inward/record.url?scp=85057452066&partnerID=8YFLogxK
U2 - 10.1109/CLOUD.2018.00029
DO - 10.1109/CLOUD.2018.00029
M3 - Conference contribution
AN - SCOPUS:85057452066
T3 - IEEE International Conference on Cloud Computing, CLOUD
SP - 170
EP - 177
BT - Proceedings - 2018 IEEE International Conference on Cloud Computing, CLOUD 2018 - Part of the 2018 IEEE World Congress on Services
PB - IEEE Computer Society
T2 - 11th IEEE International Conference on Cloud Computing, CLOUD 2018
Y2 - 2 July 2018 through 7 July 2018
ER -