






















We present a simple library that equips MPI implementations with truly asynchronous non-blocking point-to-point operations and that is independent of the underlying communication infrastructure. It uses the MPI profiling interface (PMPI) and the MPI_THREAD_MULTIPLE thread support level, and works with current versions of Intel MPI, Open MPI, MPICH2, MVAPICH2, Cray MPI, and IBM MPI. We show performance comparisons on a commodity InfiniBand cluster and two tier-1 systems in Germany, using low-level and application benchmarks. Issues of thread/process placement and the peculiarities of different MPI implementations are discussed in detail. We also identify the MPI libraries that already support asynchronous operations. Finally, we show how our ideas can be extended to MPI-IO.