



























This paper presents a distributed memory method for anisotropic mesh adaptation that is designed to avoid the use of collective communication and global synchronization techniques. In the presented method, meshing functionality is separated from performance aspects by utilizing a separate entity for each - a multicore cc-NUMA-based (shared memory) mesh generation software and a parallel runtime system that is designed to help applications leverage the concurrency offered by emerging high-performance computing (HPC) architectures. First, an initial mesh is decomposed and its interface elements (subdomain boundaries) are adapted on a single multicore node (shared memory). Subdomains are then distributed among the nodes of an HPC cluster so that their interior elements are adapted while interface elements (already adapted) remain frozen to maintain mesh conformity. Lessons are presented regarding some re-designs of the shared memory software and how its speculative execution model is utilized by the distributed memory method to achieve good performance. The presented method is shown to generate meshes (of up to approximately 1 billion elements) with comparable quality and performance to existing state-of-the-art HPC meshing software.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。