Object storage is considered a promising solution for next-generation (exascale) high-performance computing platforms because of its flexible, high-performance object interface. However, delivering high burst-write throughput remains a critical challenge. Although deploying more storage servers can potentially provide higher throughput, doing so can be ineffective because burst-write throughput can be limited by a small number of stragglers (storage servers that are occasionally slower than the others). In this paper, we propose a two-choice randomized dynamic I/O scheduler that distributes concurrent burst-write operations in a balanced way to avoid stragglers and hence achieve high throughput. This study makes three contributions. First, we propose a two-choice randomized dynamic I/O scheduler with collaborative probe and preassign strategies. Second, we design and implement a redirect table and a metadata maintainer to address the metadata management challenge introduced by dynamic I/O scheduling. Third, we evaluate the proposed scheduler with both simulations and experiments on an HPC cluster. The evaluation results confirm the scalability and performance benefits of the proposed I/O scheduler.
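The general idea behind two-choice randomized scheduling can be illustrated with a small simulation. The sketch below is not the paper's scheduler: it omits the collaborative probe and preassign strategies, the redirect table, and the metadata maintainer, and it models server load as a simple count of assigned writes. The function name `schedule_writes` and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def schedule_writes(num_writes, num_servers, choices, rng):
    """Assign each write to the least-loaded of `choices` randomly probed servers.

    choices=1 is plain random placement; choices=2 is the two-choice policy.
    Load is modeled as the number of writes assigned so far (an assumption).
    """
    loads = [0] * num_servers
    for _ in range(num_writes):
        # Probe `choices` distinct servers chosen uniformly at random.
        candidates = rng.sample(range(num_servers), choices)
        # Send the write to the probed server with the smallest current load.
        target = min(candidates, key=lambda s: loads[s])
        loads[target] += 1
    return loads


rng = random.Random(0)  # fixed seed so the run is repeatable
random_loads = schedule_writes(10_000, 64, 1, rng)
two_choice_loads = schedule_writes(10_000, 64, 2, rng)

# The most heavily loaded server bounds how long the whole burst takes.
print("max load, random placement:", max(random_loads))
print("max load, two-choice:      ", max(two_choice_loads))
```

In this toy model, the maximum per-server load under the two-choice policy typically stays much closer to the average than under purely random placement, which is the balancing effect a scheduler of this kind relies on to avoid stragglers.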
|Number of pages||12|
|Journal||International Conference for High Performance Computing, Networking, Storage and Analysis, SC|
|State||Published - Jan 16 2014|
|Event||International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014 - New Orleans, United States|
Duration: Nov 16 2014 → Nov 21 2014