





















Abstract: Machine learning deployments in real-world wireless communication tasks face significant generalization challenges due to location- and environment-specific signal structure, high diversity in data across deployments, and limited availability of real-world data. Current approaches for assessing data similarity between training and inference (deployment) distributions, and for evaluating model transferability, suffer from high computational costs and inconsistent performance, leaving critical decisions about model deployment and model life cycle management without a principled foundation. To address this, we introduce a dataset similarity framework built upon the feature space of a pretrained wireless foundation model. Our method, LWM-CDE (Contrastive learning of Dataset Embedding), fine-tunes the dataset embeddings of the foundation model using a combination of contrastive and geometry-shaping losses, creating a structured manifold where distance reliably indicates transferability. Extensive experiments on wireless benchmarks show that LWM-CDE achieves stronger correlation with empirical transfer performance than existing metrics while being more computationally efficient. The learned representation space supports more effective and data-efficient decision-making for tasks such as source dataset selection, label-aware augmentation, and budgeted pretraining, demonstrating its broader utility across wireless communication applications.
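To illustrate the idea that distance in a dataset-embedding space can guide source dataset selection, the following is a minimal sketch, not the LWM-CDE implementation. It assumes each dataset is already represented by per-sample feature vectors from some pretrained model, and it uses mean pooling and cosine distance purely as placeholders; the abstract does not specify the pooling or the distance function. All names (`dataset_embedding`, `rank_sources`, `site_a`, `site_b`) are hypothetical.

```python
import math


def dataset_embedding(sample_features):
    """Mean-pool per-sample feature vectors into one dataset-level vector.

    Mean pooling is an assumption made for this sketch only.
    """
    n = len(sample_features)
    dim = len(sample_features[0])
    return [sum(vec[i] for vec in sample_features) / n for i in range(dim)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_sources(target_features, candidate_features):
    """Sort candidate source datasets by distance to the target, closest first."""
    target = dataset_embedding(target_features)
    scored = [
        (name, cosine_distance(target, dataset_embedding(feats)))
        for name, feats in candidate_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Toy, made-up feature vectors (not real wireless data).
target = [[1.0, 0.0], [0.8, 0.2]]
candidates = {
    "site_a": [[0.9, 0.1], [1.0, 0.1]],
    "site_b": [[0.0, 1.0], [0.1, 0.9]],
}
ranking = rank_sources(target, candidates)
for name, dist in ranking:
    print(f"{name}: {dist:.4f}")
```

Under these assumptions, the candidate with the smallest distance would be the first choice as a source dataset. Whether that ordering tracks actual transfer performance depends on how the embedding space was trained, which is the problem the abstract says the contrastive and geometry-shaping losses address.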
| Comments: | The model and relevant scripts are available on the WILab Hugging Face page: this https URL |
| Subjects: | Signal Processing (eess.SP); Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.24077 [eess.SP] |
| (or arXiv:2605.24077v1 [eess.SP] for this version) | |
| https://doi.org/10.48550/arXiv.2605.24077 arXiv-issued DOI via DataCite (pending registration) |
From: Sadjad Alikhani
[v1]
Fri, 22 May 2026 17:09:06 UTC (4,187 KB)