





















Abstract: Optimizing the trade-off between predictive performance and computational cost is a central concern in deploying Large Language Models (LLMs). Current routing methods primarily map queries directly to models based on surface-level features, which makes them susceptible to the memorization trap and leads to poor generalization on out-of-distribution (OOD) data. In this paper, we propose DecoR, a novel routing framework that recasts routing as a matching process that retrieves similar queries from historical logs, effectively mitigating the memorization trap. To enhance matching accuracy, we introduce a query capability deconstruction method that decouples linguistic surface forms from task-intrinsic requirements, directing matching toward capability dimensions so that decisions are grounded in essential task attributes. Furthermore, we develop CodaSet, a comprehensive benchmark for assessing routing generalization. Experimental results on CodaSet demonstrate that DecoR maintains superior accuracy while substantially lowering inference costs across both in-distribution and OOD settings. All code and data are available at this https URL.
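To make the idea of routing by matching against historical logs concrete, the following is a minimal sketch and not the authors' implementation. It assumes each logged query has already been reduced to a capability profile (the dimension names, the `LogEntry` type, the `cosine` and `route` helpers, the cost values, and the `k` and `min_accuracy` parameters are all hypothetical), and it swaps in plain cosine-similarity nearest-neighbour retrieval for whatever matching procedure DecoR actually uses.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List


@dataclass
class LogEntry:
    """One historical query (hypothetical schema, not from the paper)."""
    capabilities: Dict[str, float]  # assumed capability profile of the query
    correct: Dict[str, bool]        # model name -> answered correctly?


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse capability profiles."""
    dot = sum(a.get(key, 0.0) * b.get(key, 0.0) for key in set(a) | set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query_caps: Dict[str, float], logs: List[LogEntry],
          costs: Dict[str, float], k: int = 3,
          min_accuracy: float = 0.6) -> str:
    """Pick the cheapest model that did well on the k most similar past queries."""
    neighbours = sorted(
        logs, key=lambda e: cosine(query_caps, e.capabilities), reverse=True
    )[:k]
    if not neighbours:
        # No history at all: fall back to the most expensive model.
        return max(costs, key=costs.get)
    accuracy = {
        model: sum(e.correct.get(model, False) for e in neighbours) / len(neighbours)
        for model in costs
    }
    good_enough = [m for m in costs if accuracy[m] >= min_accuracy]
    if good_enough:
        return min(good_enough, key=costs.get)
    # Nothing clears the bar: take the most accurate model, cheaper one on ties.
    return max(costs, key=lambda m: (accuracy[m], -costs[m]))


# Toy history with made-up capability dimensions and two made-up models.
logs = [
    LogEntry({"math": 0.9, "reasoning": 0.8}, {"small": False, "large": True}),
    LogEntry({"math": 0.8, "reasoning": 0.9}, {"small": False, "large": True}),
    LogEntry({"recall": 0.9}, {"small": True, "large": True}),
    LogEntry({"recall": 0.8, "summarization": 0.3}, {"small": True, "large": True}),
]
costs = {"small": 1.0, "large": 10.0}

print(route({"recall": 1.0}, logs, costs, k=2))
print(route({"math": 1.0, "reasoning": 1.0}, logs, costs, k=2))
```

On this toy history, the recall-style query is routed to `small` and the math-and-reasoning query to `large`, because the decision depends on how each model performed on past queries with a similar capability profile rather than on the surface wording of the new query.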
| Subjects: | Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.25558 [cs.AI] |
| | (or arXiv:2605.25558v1 [cs.AI] for this version) |
| | https://doi.org/10.48550/arXiv.2605.25558 (arXiv-issued DOI via DataCite, pending registration) |
From: Bo Lv
[v1] Mon, 25 May 2026 08:12:58 UTC (797 KB)