TY - GEN
T1 - SSDUP
T2 - 31st ACM International Conference on Supercomputing, ICS 2017
AU - Shi, Xuanhua
AU - Li, Ming
AU - Liu, Wei
AU - Jin, Hai
AU - Yu, Chen
AU - Chen, Yong
N1 - Publisher Copyright:
© 2017 Association for Computing Machinery.
PY - 2017/6/14
Y1 - 2017/6/14
N2 - Many high performance computing (HPC) applications are highly data intensive. Current HPC storage systems still use hard disk drives (HDDs) as their dominant storage devices, which suffer from disk head thrashing when accessing random data. New storage devices such as solid state drives (SSDs), which can handle random data access much more efficiently, have been widely deployed as the buffer to HDDs in many production HPC systems. Burst buffer has also been proposed to manage the SSD buffering of bursty write requests. Although burst buffer can improve I/O performance in many cases, we find that it has some limitations such as requiring large SSD capacity and harmonious overlapping between computation phase and data flushing stage. In this paper, we propose a scheme, called SSDUP (a traffic-aware SSD burst buffer), to improve the burst buffer by addressing the above limitations. In order to reduce the SSD capacity demand, we 5develop a novel traffic-detection method to detect the randomness in the write traffic. Based on this method, only the random writes are buffered to SSD and other writes are deemed sequential and propagated to HDDs directly. In order to overcome the difficulty of perfectly overlapping the computation phase and the flushing stage, we propose a pipeline mechanism for the SSD buffer, in which the data buffering and data flushing are performed in pipeline. Finally, in order to further improve the performance of buffering random writes in SSD, we covert the random writes to sequential writes in SSD by storing the data with a log structure. Further, we propose to use the AVL tree structure to store the sequence information of the data. We have implemented a prototype of SSDUP based on the OrangeFS and performed extensive experimental evaluation. The experimental results show that the proposed SSDUP scheme can improve the write performance by more than 50% on average.
AB - Many high performance computing (HPC) applications are highly data intensive. Current HPC storage systems still use hard disk drives (HDDs) as their dominant storage devices, which suffer from disk head thrashing when accessing random data. New storage devices such as solid state drives (SSDs), which can handle random data access much more efficiently, have been widely deployed as the buffer to HDDs in many production HPC systems. Burst buffer has also been proposed to manage the SSD buffering of bursty write requests. Although burst buffer can improve I/O performance in many cases, we find that it has some limitations such as requiring large SSD capacity and harmonious overlapping between computation phase and data flushing stage. In this paper, we propose a scheme, called SSDUP (a traffic-aware SSD burst buffer), to improve the burst buffer by addressing the above limitations. In order to reduce the SSD capacity demand, we 5develop a novel traffic-detection method to detect the randomness in the write traffic. Based on this method, only the random writes are buffered to SSD and other writes are deemed sequential and propagated to HDDs directly. In order to overcome the difficulty of perfectly overlapping the computation phase and the flushing stage, we propose a pipeline mechanism for the SSD buffer, in which the data buffering and data flushing are performed in pipeline. Finally, in order to further improve the performance of buffering random writes in SSD, we covert the random writes to sequential writes in SSD by storing the data with a log structure. Further, we propose to use the AVL tree structure to store the sequence information of the data. We have implemented a prototype of SSDUP based on the OrangeFS and performed extensive experimental evaluation. The experimental results show that the proposed SSDUP scheme can improve the write performance by more than 50% on average.
KW - Burst Buffer
KW - High Performance Computing
KW - Hybrid Storage System
KW - Solid State Drive
UR - http://www.scopus.com/inward/record.url?scp=85023755173&partnerID=8YFLogxK
U2 - 10.1145/3079079.3079087
DO - 10.1145/3079079.3079087
M3 - Conference contribution
AN - SCOPUS:85023755173
T3 - Proceedings of the International Conference on Supercomputing
BT - ICS 2017
PB - Association for Computing Machinery
Y2 - 13 June 2017 through 16 June 2017
ER -