TY - GEN
T1 - Block2Vec
AU - Dai, Dong
AU - Bao, Forrest Sheng
AU - Zhou, Jiang
AU - Chen, Yong
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/9/23
Y1 - 2016/9/23
N2 - Block correlations represent the semantic patterns in storage systems. These correlations can be exploited for data caching, pre-fetching, layout optimization, I/O scheduling, etc. In this paper, we introduce Block2Vec, a deep-learning-based strategy to mine block correlations in storage systems. The core idea of Block2Vec is twofold. First, it proposes a new way to abstract blocks, which are represented as multi-dimensional vectors instead of traditional block IDs. In this way, we are able to capture the similarity between blocks through the distances between their vectors. Second, based on this vector representation of blocks, it further trains a deep neural network to learn the best vector assignment for each block. We leverage the recently developed word embedding technique from natural language processing to train the neural network efficiently. To demonstrate the effectiveness of Block2Vec, we design a demonstrative block prediction algorithm based on the mined correlations. Empirical comparison based on simulations of real system traces shows that Block2Vec is capable of mining block-level correlations efficiently and accurately. This research shows that the deep learning strategy is a promising direction for optimizing storage system performance.
AB - Block correlations represent the semantic patterns in storage systems. These correlations can be exploited for data caching, pre-fetching, layout optimization, I/O scheduling, etc. In this paper, we introduce Block2Vec, a deep-learning-based strategy to mine block correlations in storage systems. The core idea of Block2Vec is twofold. First, it proposes a new way to abstract blocks, which are represented as multi-dimensional vectors instead of traditional block IDs. In this way, we are able to capture the similarity between blocks through the distances between their vectors. Second, based on this vector representation of blocks, it further trains a deep neural network to learn the best vector assignment for each block. We leverage the recently developed word embedding technique from natural language processing to train the neural network efficiently. To demonstrate the effectiveness of Block2Vec, we design a demonstrative block prediction algorithm based on the mined correlations. Empirical comparison based on simulations of real system traces shows that Block2Vec is capable of mining block-level correlations efficiently and accurately. This research shows that the deep learning strategy is a promising direction for optimizing storage system performance.
KW - I/O
KW - block correlation
KW - deep learning
KW - embedding
UR - http://www.scopus.com/inward/record.url?scp=84990946956&partnerID=8YFLogxK
U2 - 10.1109/ICPPW.2016.43
DO - 10.1109/ICPPW.2016.43
M3 - Conference contribution
AN - SCOPUS:84990946956
T3 - Proceedings of the International Conference on Parallel Processing Workshops
SP - 230
EP - 239
BT - Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 August 2016 through 19 August 2016
ER -