TY - GEN
T1 - GraphMeta
AU - Dai, Dong
AU - Chen, Yong
AU - Carns, Philip
AU - Jenkins, John
AU - Zhang, Wei
AU - Ross, Robert
N1 - Funding Information:
This material is based upon work supported by the U.S. Department of Energy, Office of Science, under Contract No. DE-AC02-06CH11357; and by the National Science Foundation under grants CCF-1409946 and CNS-1338078.
Publisher Copyright:
© 2016 IEEE.
PY - 2016/12/6
Y1 - 2016/12/6
N2 - High-performance computing (HPC) systems face increasingly critical metadata management challenges, especially in the approaching exascale era. These challenges arise not only from exploding metadata volumes but also from increasingly diverse metadata, which contains data provenance and user-defined attributes in addition to traditional POSIX metadata. This "rich" metadata is critical to supporting many advanced data management functions, such as data auditing and validation. In our prior work, we presented a graph-based model that could be a promising solution to uniformly manage such rich metadata because of its flexibility and generality. At the same time, however, graph-based rich metadata management introduces significant challenges. In this study, we first identify the challenges presented by the underlying infrastructure in supporting scalable, high-performance rich metadata management. To tackle these challenges, we then present GraphMeta, a graph-based engine designed for managing large-scale rich metadata. We also utilize a series of optimizations designed for rich metadata graphs. We evaluate GraphMeta with both synthetic and real HPC metadata workloads and compare it with other approaches. The results show its advantages for rich metadata management in HPC systems, including better performance and scalability than existing solutions.
AB - High-performance computing (HPC) systems face increasingly critical metadata management challenges, especially in the approaching exascale era. These challenges arise not only from exploding metadata volumes but also from increasingly diverse metadata, which contains data provenance and user-defined attributes in addition to traditional POSIX metadata. This "rich" metadata is critical to supporting many advanced data management functions, such as data auditing and validation. In our prior work, we presented a graph-based model that could be a promising solution to uniformly manage such rich metadata because of its flexibility and generality. At the same time, however, graph-based rich metadata management introduces significant challenges. In this study, we first identify the challenges presented by the underlying infrastructure in supporting scalable, high-performance rich metadata management. To tackle these challenges, we then present GraphMeta, a graph-based engine designed for managing large-scale rich metadata. We also utilize a series of optimizations designed for rich metadata graphs. We evaluate GraphMeta with both synthetic and real HPC metadata workloads and compare it with other approaches. The results show its advantages for rich metadata management in HPC systems, including better performance and scalability than existing solutions.
UR - http://www.scopus.com/inward/record.url?scp=85013194985&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2016.50
DO - 10.1109/CLUSTER.2016.50
M3 - Conference contribution
AN - SCOPUS:85013194985
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 298
EP - 307
BT - Proceedings - 2016 IEEE International Conference on Cluster Computing, CLUSTER 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 September 2016 through 15 September 2016
ER -