

























An out-of-core stencil computation code handles large data whose size is beyond the capacity of GPU memory. Whereas, such an code requires streaming data to and from the GPU frequently. As a result, data movement between the CPU and GPU usually limits the performance. In this work, compression-based optimizations are proposed. First, an on-the-fly compression technique is applied to an out-of-core stencil code, reducing the CPU-GPU memory copy. Secondly, a single working buffer technique is used to reduce GPU memory consumption. Experimental results show that the stencil code using the proposed techniques achieved 1.1x speed and reduced GPU memory consumption by 33.0\% on an NVIDIA Tesla V100 GPU.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。