Abstract
Parallel and distributed file systems are widely used to provide high throughput in high-performance computing and Cloud computing systems. To increase parallelism, I/O requests are partitioned into multiple sub-requests (or 'flows') and distributed across different data nodes. The completion time of an I/O request therefore depends on its slowest sub-request, and file system performance is extremely poor when data nodes have highly unbalanced response times. Client-side caching offers a promising direction for addressing this issue. However, current work has primarily used client-side memory as a read cache and employed a write-through policy, which provides the strictest consistency. Write-through requires a synchronous update for every write and significantly under-utilizes the client-side cache when applications are write-intensive. The write-back policy, on the other hand, can better utilize the client-side cache for applications with significant I/O requirements. However, this policy introduces the notorious cache consistency problem and is ineffective when the cache size is constrained. We propose a cost-aware client-side file caching (CCFC) strategy that is designed to perform better than conventional write-back and write-through. This strategy enables new trade-off points across the I/O performance, data consistency and cache size dimensions. Using benchmark workloads such as MADbench2, IOR and HPIO, we evaluate our new cache policy alongside conventional write-back and write-through. We find that the proposed CCFC strategy can achieve up to 110% throughput improvement compared to the conventional write-back and write-through policies with the same cache size on an 85-node cluster.
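The two effects described in the abstract can be illustrated with a toy model. The sketch below is not the CCFC strategy, whose design is not given here. It only shows that a striped request finishes when its slowest sub-request does, and that write-through issues one synchronous server update per write while write-back defers updates until a flush. All names (`request_completion_time`, `ToyClientCache`, `count_server_writes`) and all numbers are hypothetical.

```python
# Toy model only: names and numbers are hypothetical, not taken from the paper.

def request_completion_time(sub_request_times):
    """A striped I/O request completes only when its slowest sub-request does."""
    return max(sub_request_times)


class ToyClientCache:
    """Counts server updates under a write-through or write-back policy."""

    def __init__(self, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.dirty = set()       # blocks modified locally but not yet on the server
        self.server_writes = 0   # number of updates sent to the data nodes

    def write(self, block):
        if self.policy == "write-through":
            # Every write is propagated synchronously to the server.
            self.server_writes += 1
        else:
            # Write-back: keep the block dirty; the server copy is stale until flush.
            self.dirty.add(block)

    def flush(self):
        # Each dirty block is written to the server once, however often it changed.
        self.server_writes += len(self.dirty)
        self.dirty.clear()


def count_server_writes(policy, blocks):
    cache = ToyClientCache(policy)
    for block in blocks:
        cache.write(block)
    cache.flush()
    return cache.server_writes


if __name__ == "__main__":
    # One slow data node dominates the completion time of the whole request.
    print(request_completion_time([10.0, 11.0, 10.5, 95.0]))
    # Repeated writes to the same block are absorbed by write-back.
    workload = [1, 1, 1, 2]
    print(count_server_writes("write-through", workload))
    print(count_server_writes("write-back", workload))
```

In this model write-back saves server traffic only while dirty blocks fit in the cache and staleness is tolerable, which are the consistency and cache-size constraints the abstract points to.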
Original language | English |
---|---|
Article number | 6735429 |
Pages (from-to) | 248-251 |
Number of pages | 4 |
Journal | Proceedings of the International Conference on Cloud Computing Technology and Science, CloudCom |
Volume | 2 |
DOIs | |
State | Published - 2013 |
Event | 5th IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2013 - Bristol, United Kingdom; Duration: Dec 2 2013 → Dec 5 2013 |
Keywords
- Cloud computing
- Parallel file system
- client-side file caching
- data-intensive computing
- high-performance computing