A virtual shared metadata storage for HDFS

Jiang Zhou, Yong Chen, Xiaoyan Gu, Weiping Wang, Dan Meng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Hadoop is a popular open-source framework that enables distributed analysis of large datasets using the MapReduce programming model. Its distributed file system, HDFS, provides high-throughput access to datasets. HDFS can deliver a high-performance metadata service but has two disadvantages. First, when the metadata server stores metadata on persistent devices, it is restricted to the read and write operations of its local disks. Second, it lacks effective methods for metadata synchronization and replication, which are critical for metadata availability and reliability. In this research, we introduce a novel Virtual Shared Storage Pool (VSSP) concept and design for storing and sharing metadata in HDFS. The VSSP is a virtual storage device that is built on existing servers and is transparent to upper layers. Two strategies, journal synchronization based on the two-phase commit (2PC) protocol and fine-grained image replication, are introduced in the VSSP according to different metadata access features. The VSSP not only reduces the overhead of metadata modification operations but also improves the I/O performance of namespace storage. Experimental results show that the VSSP improved average log-writing performance by 40.51% and 23.46% compared with BookKeeper and Hadoop QJM, respectively. The average image read and write throughput was nearly 5 times and 2.4 times better than NFS and the original approach. These results confirm that the proposed VSSP solution significantly improves metadata access performance, scalability, and reliability for HDFS.
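The abstract names journal synchronization based on the 2PC protocol but does not describe its mechanics. As a generic illustration only, the following Python sketch shows how a two-phase commit round could replicate one journal entry across several storage nodes. It is not the authors' VSSP implementation: the names `JournalReplica` and `sync_journal_entry`, the in-memory data structures, and the `healthy` failure flag are all hypothetical and chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class JournalReplica:
    """Hypothetical storage node holding one copy of the edit journal."""
    name: str
    healthy: bool = True
    staged: dict = field(default_factory=dict)   # txid -> record awaiting commit
    journal: list = field(default_factory=list)  # committed (txid, record) pairs

    def prepare(self, txid: int, record: str) -> bool:
        # Phase 1: stage the record and vote yes, or vote no if unavailable.
        if not self.healthy:
            return False
        self.staged[txid] = record
        return True

    def commit(self, txid: int) -> None:
        # Phase 2 (success): move the staged record into the journal.
        self.journal.append((txid, self.staged.pop(txid)))

    def abort(self, txid: int) -> None:
        # Phase 2 (failure): discard whatever was staged, if anything.
        self.staged.pop(txid, None)


def sync_journal_entry(replicas, txid, record) -> bool:
    """Coordinator: commit on all replicas only if every replica votes yes."""
    votes = [r.prepare(txid, record) for r in replicas]
    if all(votes):
        for r in replicas:
            r.commit(txid)
        return True
    for r in replicas:
        r.abort(txid)
    return False
```

In this sketch a single unavailable replica causes the whole entry to be aborted, so every journal copy stays identical; a real system would also need timeouts, durable logging of the coordinator's decision, and recovery, none of which are modeled here.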

Original language: English
Title of host publication: Proceedings of the 2015 IEEE International Conference on Networking, Architecture and Storage, NAS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 265-274
Number of pages: 10
ISBN (Electronic): 9781467378918
DOI: https://doi.org/10.1109/NAS.2015.7255195
State: Published - Sep 10 2015
Event: 10th IEEE International Conference on Networking, Architecture and Storage, NAS 2015 - Boston, United States
Duration: Aug 6 2015 → Aug 7 2015

Publication series

Name: Proceedings of the 2015 IEEE International Conference on Networking, Architecture and Storage, NAS 2015

Conference

Conference: 10th IEEE International Conference on Networking, Architecture and Storage, NAS 2015
Country: United States
City: Boston
Period: 08/6/15 → 08/7/15

Keywords

  • Distributed file systems
  • HDFS
  • cluster file systems
  • distributed metadata management
  • shared storage

Cite this

Zhou, J., Chen, Y., Gu, X., Wang, W., & Meng, D. (2015). A virtual shared metadata storage for HDFS. In Proceedings of the 2015 IEEE International Conference on Networking, Architecture and Storage, NAS 2015 (pp. 265-274). [7255195] (Proceedings of the 2015 IEEE International Conference on Networking, Architecture and Storage, NAS 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/NAS.2015.7255195