



















The recent multi-modality models have achieved great performance in many vision tasks because the extracted features contain the multi-modality knowledge. However, most of the current registration descriptors have only concentrated on local geometric structures. This paper proposes a method to boost point cloud registration accuracy by transferring the multi-modality knowledge of pre-trained multi-modality model to a new descriptor neural network. Different to the previous multi-modality methods that requires both modalities, the proposed method only requires point clouds during inference. Specifically, we propose an ensemble descriptor neural network combining pre-trained sparse convolution branch and a new point-based convolution branch. By fine-tuning on a single modality data, the proposed method achieves new state-of-the-art results on 3DMatch and competitive accuracy on 3DLoMatch and KITTI.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。