

















Abstract:Real-world weather, illumination, and imaging variations often induce severe domain shifts, degrading single-source detectors in unseen environments. Existing single-domain generalized object detection (SDGOD) methods mainly rely on data augmentation or domain-invariant learning, while largely overlooking how domain shift disrupts detector prediction stability. Through analytical experiments, we find that performance degradation is mainly dominated by increasing missed detections. Further analysis shows that this phenomenon stems from reduced cross-domain stability in DETR-style detectors: domain shift disrupts encoder-side object-background and inter-instance relations, and further weakens the semantic-spatial binding between decoder queries and real objects. Motivated by this, we find that vision foundation models (VFMs) still preserve stable relational structures and object responses under severe shifts, making them suitable cross-domain stability priors to compensate for detector degradation. To this end, we propose VFM$^{4}$SDG, a dual-prior learning framework for SDGOD, which introduces a frozen VFM into encoder representation learning and decoder query modeling. Specifically, we propose Cross-domain Stable Relational Prior Distillation to distill stable object-background and inter-instance relations from the VFM into the encoder, compensating for relational degradation. Meanwhile, we propose Semantic-Contextual Prior-based Query Enhancement, which injects category semantic prototypes and global object context into queries before they enter the decoder layer, enhancing semantic-spatial query-object binding stability. Extensive experiments show that VFM$^{4}$SDG significantly outperforms existing advanced methods on standard SDGOD benchmarks and two mainstream DETR-based detection frameworks, demonstrating its effectiveness, robustness, and generality.
| Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
| Cite as: | arXiv:2604.21502 [cs.CV] |
| (or arXiv:2604.21502v2 [cs.CV] for this version) | |
| https://doi.org/10.48550/arXiv.2604.21502 arXiv-issued DOI via DataCite |
From: Yupeng Zhang [view email]
[v1]
Thu, 23 Apr 2026 10:04:36 UTC (761 KB)
[v2]
Fri, 22 May 2026 15:42:19 UTC (1,535 KB)
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。