






















Enabling efficient, high-performance data access in parallel file systems (PFS) is critical for today's high-performance computing systems. PFS client-side I/O heavily impacts the final I/O performance delivered to individual applications and the entire system. Autotuning the key client-side I/O behaviors has been extensively studied and shows promising results. However, existing work has heavily relied on extensive number of global runtime metrics to monitor and accurate modeling of applications' I/O patterns. Such heavy overheads significantly limit the ability to enable fine-grained, dynamic tuning in practical systems. In this study, we propose DIAL (Decentralized I/O AutoTuning via Learned Client-side Local Metrics) which takes a drastically different approach. Instead of trying to extract the global I/O patterns of applications, DIAL takes a decentralized approach, treating each I/O client as an independent unit and tuning configurations using only its locally observable metrics. With the help of machine learning models, DIAL enables multiple tunable units to make independent but collective decisions, reacting to what is happening in the global storage systems in a timely manner and achieving better I/O performance globally for the application.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。