Cost-aware client-side file caching for data-intensive applications

Yaning Huang, Hai Jin, Xuanhua Shi, Song Wu, Yong Chen

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

Parallel and distributed file systems are widely used to provide high throughput in high-performance computing and Cloud computing systems. To increase parallelism, I/O requests are partitioned into multiple sub-requests (or 'flows') and distributed across different data nodes. Therefore, the completion time of an I/O request depends on its slowest sub-request, and file system performance is extremely poor if data nodes have highly unbalanced response times. Client-side caching offers a promising direction for addressing this issue. However, current work has primarily used client-side memory as a read cache and employed a write-through policy, which provides the strictest consistency. Write-through requires a synchronous update for every write and significantly under-utilizes the client-side cache when applications are write-intensive. The write-back policy, on the other hand, can better utilize the client-side cache for applications with significant I/O requirements. However, this policy introduces the notorious cache consistency problem and is ineffective when the cache size is constrained. We propose a cost-aware client-side file caching (CCFC) strategy that is designed to perform better than conventional write-back and write-through. This strategy enables new trade-off points across the I/O performance, data consistency, and cache size dimensions. Using benchmark workloads such as MADbench2, IOR, and HPIO, we evaluate our new cache policy alongside conventional write-back and write-through. We find that the proposed CCFC strategy can achieve up to 110% throughput improvement compared to the conventional write-back and write-through policies with the same cache size on an 85-node cluster.
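The two background observations in the abstract can be illustrated with a minimal Python sketch. It is not the CCFC strategy itself, which the abstract does not specify. The names `request_completion_time` and `ToyCache`, the latency values, and the dictionary standing in for the data nodes are all hypothetical and chosen only for illustration.

```python
def request_completion_time(sub_request_times):
    """A striped request finishes only when its slowest sub-request finishes."""
    return max(sub_request_times)


# Hypothetical per-node latencies in milliseconds (illustrative values only).
balanced = [10.0, 11.0, 10.5, 9.8]
unbalanced = [10.0, 11.0, 10.5, 80.0]  # one slow data node dominates


class ToyCache:
    """Toy client-side cache contrasting write-through and write-back.

    `backend` is a plain dict standing in for the remote data nodes.
    """

    def __init__(self, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.dirty = {}          # blocks modified locally, not yet written out
        self.backend = {}        # stand-in for the parallel file system
        self.backend_writes = 0  # number of writes sent to the backend

    def write(self, block, data):
        if self.policy == "write-through":
            # Every write is propagated to the backend immediately.
            self.backend[block] = data
            self.backend_writes += 1
        else:
            # Write-back: absorb the write locally; later writes to the
            # same block overwrite the earlier dirty copy.
            self.dirty[block] = data

    def flush(self):
        """Push all dirty blocks to the backend (a no-op for write-through)."""
        for block, data in self.dirty.items():
            self.backend[block] = data
            self.backend_writes += 1
        self.dirty.clear()


if __name__ == "__main__":
    print(request_completion_time(balanced))
    print(request_completion_time(unbalanced))

    for policy in ("write-through", "write-back"):
        cache = ToyCache(policy)
        for version in range(3):
            cache.write("block0", version)  # overwrite the same block
        cache.flush()
        print(policy, cache.backend_writes)
```

In this toy model, repeated writes to the same block cost one backend write each under write-through but only one in total under write-back. The price is that the backend holds stale data until `flush` runs, which is the consistency problem the abstract refers to.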

Original language: English
Article number: 6735429
Pages (from-to): 248-251
Number of pages: 4
Journal: Proceedings of the International Conference on Cloud Computing Technology and Science, CloudCom
Volume: 2
State: Published - 2013
Event: 5th IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2013 - Bristol, United Kingdom
Duration: Dec 2 2013 → Dec 5 2013

Keywords

  • Cloud computing
  • Parallel file system
  • client-side file caching
  • data-intensive computing
  • high-performance computing
