TY - GEN
T1 - SUORA
T2 - 11th IEEE International Conference on Networking Architecture and Storage, NAS 2016
AU - Zhou, Jiang
AU - Xie, Wei
AU - Noble, Jason
AU - Echo, Kace
AU - Chen, Yong
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/8/23
Y1 - 2016/8/23
AB - The scale of data in many data centers is growing explosively with emerging applications and uses of big data technologies. Data distribution is a key issue in large-scale distributed storage systems, which must place petabytes of data or more among tens or hundreds of thousands of storage devices. Meanwhile, heterogeneous storage systems, such as those combining hard disk drives (HDDs) and storage class memories (SCMs), have become increasingly popular for massive data storage because they balance performance, capacity, and cost. Current data distribution algorithms achieve efficient, scalable, and balanced mapping, but do not adequately distinguish the differing characteristics of heterogeneous devices. This paper presents a novel data distribution algorithm called SUORA (Scalable and Uniform storage via Optimally-adaptive and Random number Addressing) that takes full advantage of heterogeneous devices. SUORA is a pseudo-random algorithm that uniformly distributes data across a hybrid, tiered storage cluster. It divides heterogeneous devices, maps them onto different buckets, and assigns them to segments within each bucket. A deterministic pseudo-random number sequence is generated to map data among segments and devices. Data is moved according to data hotness and a bucket threshold to achieve better read throughput while keeping the load balanced. By taking the distinct characteristics of heterogeneous storage devices into account, the SUORA algorithm achieves highly efficient, adaptive data distribution for data centers and heterogeneous storage systems.
KW - Data centers
KW - Data distribution algorithm
KW - Data management
KW - Data placement
KW - Heterogeneous storage
UR - http://www.scopus.com/inward/record.url?scp=84988369386&partnerID=8YFLogxK
U2 - 10.1109/NAS.2016.7549423
DO - 10.1109/NAS.2016.7549423
M3 - Conference contribution
AN - SCOPUS:84988369386
T3 - 2016 IEEE International Conference on Networking Architecture and Storage, NAS 2016 - Proceedings
BT - 2016 IEEE International Conference on Networking Architecture and Storage, NAS 2016 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 8 August 2016 through 10 August 2016
ER -