惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

N
News | PayPal Newsroom
Cloudbric
Cloudbric
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
C
Comments on: Blog
腾讯CDC
www.infosecurity-magazine.com
www.infosecurity-magazine.com
NISL@THU
NISL@THU
阮一峰的网络日志
阮一峰的网络日志
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
PCI Perspectives
PCI Perspectives
The Hacker News
The Hacker News
A
Arctic Wolf
Blog — PlanetScale
Blog — PlanetScale
宝玉的分享
宝玉的分享
P
Privacy International News Feed
博客园 - 叶小钗
量子位
博客园 - Franky
Apple Machine Learning Research
Apple Machine Learning Research
有赞技术团队
有赞技术团队
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
V
Vulnerabilities – Threatpost
Security Latest
Security Latest
T
Tailwind CSS Blog
Help Net Security
Help Net Security
P
Privacy & Cybersecurity Law Blog
人人都是产品经理
人人都是产品经理
酷 壳 – CoolShell
酷 壳 – CoolShell
雷峰网
雷峰网
S
SegmentFault 最新的问题
C
CERT Recently Published Vulnerability Notes
O
OpenAI News
T
Tenable Blog
T
Troy Hunt's Blog
T
Threat Research - Cisco Blogs
Forbes - Security
Forbes - Security
C
Cybersecurity and Infrastructure Security Agency CISA
V
Visual Studio Blog
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
Schneier on Security
Schneier on Security
IT之家
IT之家
Last Week in AI
Last Week in AI
月光博客
月光博客
博客园 - 司徒正美
Application and Cybersecurity Blog
Application and Cybersecurity Blog
小众软件
小众软件
爱范儿
爱范儿
Simon Willison's Weblog
Simon Willison's Weblog
S
Security Archives - TechRepublic
C
Cisco Blogs

cs.CL updates on arXiv.org

Indexing Multimodal Language Models for Large-scale Image Retrieval UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments Caption First, VQA Second: Knowledge Density, Not Task Format, Drives Multimodal Scaling PersonaVLM: Long-Term Personalized Multimodal LLMs MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments MedRCube: A Multidimensional Framework for Fine-Grained and In-Depth Evaluation of MLLMs in Medical Imaging Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking Reward Design for Physical Reasoning in Vision-Language Models When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning? Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions Addressing Overthinking in Large Vision-Language Models via Gated Perception-Reasoning Optimization ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning (How) Learning Rates Regulate Catastrophic Overtraining Parameter Importance is Not Static: Evolving Parameter Isolation for Supervised Fine-Tuning $\pi$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space A Domain-Specific Language for LLM-Driven Trigger Generation in Multimodal Data Collection The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context Dental-TriageBench: Benchmarking Multimodal Reasoning for Hierarchical Dental Triage Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data Detection Without Correction: A Robust Asymmetry in Activation-Based Hallucination Probing LiveClawBench: Benchmarking LLM Agents on Complex, Real-World Assistant Tasks Better and Worse with Scale: How Contextual Entrainment Diverges with Model Size Functional Emotions or Situational Contexts? A Discriminating Test from the Mythos Preview System Card C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain Text-as-Signal: Quantitative Semantic Scoring with Embeddings, Logprobs, and Noise Reduction A Multi-Model Approach to English-Bangla Sentiment Classification of Government Mobile Banking App Reviews A Proactive EMR Assistant for Doctor-Patient Dialogue: Streaming ASR, Belief Stabilization, and Preliminary Controlled Evaluation Token Statistics Reveal Conversational Drift in Multi-turn LLM Interaction Mathematical Reasoning Enhanced LLM for Formula Derivation: A Case Study on Fiber NLI Modellin Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub Correct Chains, Wrong Answers: Dissociating Reasoning from Output in LLM Logic Curation of a Palaeohispanic Dataset for Machine Learning EVE: A Domain-Specific LLM Framework for Earth Intelligence OmniTrace: A Unified Framework for Generation-Time Attribution in Omni-Modal LLMs DeEscalWild: A Real-World Benchmark for Automated De-Escalation Training with SLMs Document-tuning for robust alignment to animals Can Large Language Models Reliably Extract Physiology Index Values from Coronary Angiography Reports? IWLV-Ramayana: A Sarga-Aligned Parallel Corpus of Valmiki's Ramayana Across Indian Languages Unleashing Implicit Rewards: Prefix-Value Learning for Distribution-Level Optimization InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis Evaluating the Evaluator: Problems with SemEval-2020 Task 1 for Lexical Semantic Change Detection Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs L2D-Clinical: Learning to Defer for Adaptive Model Selection in Clinical Text Classification English is Not All You Need: Systematically Exploring the Role of Multilinguality in LLM Post-Training Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus AgentSPEX: An Agent SPecification and EXecution Language Peer-Predictive Self-Training for Language Model Reasoning TLoRA+: A Low-Rank Parameter-Efficient Fine-Tuning Method for Large Language Models Empirical Evidence of Complexity-Induced Limits in Large Language Models on Finite Discrete State-Space Problems with Explicit Validity Constraints From Prediction to Justification: Aligning Sentiment Reasoning with Human Rationale via Reinforcement Learning CANVAS: Continuity-Aware Narratives via Visual Agentic Storyboarding Using reasoning LLMs to extract SDOH events from clinical notes ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding Synthesizing Instruction-Tuning Datasets with Contrastive Decoding Debate to Align: Reliable Entity Alignment through Two-Stage Multi-Agent Debate Training-Free Test-Time Contrastive Learning for Large Language Models YOCO++: Enhancing YOCO with KV Residual Connections for Efficient LLM Inference MM-Doc-R1: Training Agents for Long Document Visual Question Answering through Multi-turn Reinforcement Learning BenGER Platform: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks Foresight Optimization for Strategic Reasoning in Large Language Models Syn-TurnTurk: A Synthetic Dataset for Turn-Taking Prediction in Turkish Dialogues IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection Beyond Arrow's Impossibility: Fairness as an Emergent Property of Multi-Agent Collaboration Co-FactChecker: A Framework for Human-AI Collaborative Claim Verification Using Large Reasoning Models Learning the Cue or Learning the Word? Analyzing Generalization in Metaphor Detection for Verbs An Empirical Investigation of Practical LLM-as-a-Judge Improvement Techniques on RewardBench 2 Toward Generalized Cross-Lingual Hateful Language Detection with Web-Scale Data and Ensemble LLM Annotations Self-Calibrating Language Models via Test-Time Discriminative Distillation HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation Generating High Quality Synthetic Data for Dutch Medical Conversations GIANTS: Generative Insight Anticipation from Scientific Literature Claim2Vec: Embedding Fact-Check Claims for Multilingual Similarity and Clustering Spoiler Alert: Narrative Forecasting as a Metric for Tension in LLM Storytelling Should We be Pedantic About Reasoning Errors in Machine Translation? Human vs. Machine Deception: Distinguishing AI-Generated and Human-Written Fake News Using Ensemble Learning Weird Generalization is Weirdly Brittle Computational Implementation of a Model of Category-Theoretic Metaphor Comprehension CoSToM:Causal-oriented Steering for Intrinsic Theory-of-Mind Alignment in Large Language Models Mirroring Minds: Asymmetric Linguistic Accommodation and Diagnostic Identity in ADHD and Autism Reddit Communities ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models Why Supervised Fine-Tuning Fails to Learn: A Systematic Study of Incomplete Learning in Large Language Models Reason Only When Needed: Efficient Generative Reward Modeling via Model-Internal Uncertainty SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models Who Wrote This Line? Evaluating the Detection of LLM-Generated Classical Chinese Poetry CircuitSynth: Reliable Synthetic Data Generation Training-Free Cross-Lingual Dysarthria Severity Assessment via Phonological Subspace Analysis in Self-Supervised Speech Representations Simulating Organized Group Behavior: New Framework, Benchmark, and Analysis Think in Sentences: Explicit Sentence Boundaries Enhance Language Model's Capabilities ODUTQA-MDC: A Task for Open-Domain Underspecified Tabular QA with Multi-turn Dialogue-based Clarification Nationality encoding in language model hidden states: Probing culturally differentiated representations in persona-conditioned academic text
Efficient Rationale-based Retrieval: On-policy Distillation from Generative Rerankers based on JEPA
Teng Chen, S · 2026-04-28 · via cs.CL updates on arXiv.org

View PDF HTML (experimental)

Abstract:Unlike traditional fact-based retrieval, rationale-based retrieval typically necessitates cross-encoding of query-document pairs using large language models, incurring substantial computational costs. To address this limitation, we propose Rabtriever, which independently encodes queries and documents, while providing comparable cross query-document comprehension capabilities to rerankers. We start from training a LLM-based generative reranker, which puts the document prior to the query and prompts the LLM to generate the relevance score by log probabilities. We then employ it as the teacher of an on-policy distillation framework, with Rabtriever as the student to reconstruct the teacher's contextual-aware query embedding. To achieve this effect, Rabtriever is first initialized from the teacher, with parameters frozen. The Joint-Embedding Predictive Architecture (JEPA) paradigm is then adopted, which integrates a lightweight, trainable predictor between LLM layers and heads, projecting the query embedding into a new hidden space, with the document embedding as the latent vector. JEPA then minimizes the distribution difference between this projected embedding and the teacher embedding. To strengthen the sampling efficiency of on-policy distillation, we also add an auxiliary loss on the reverse KL of LLM logits, to reshape the student's logit distribution. Rabtriever optimizes the teacher's quadratic complexity on the document length to linear, verified both theoretically and empirically. Experiments show that Rabtriever outperforms different retriever baselines across diverse rationale-based tasks, including empathetic conversations and robotic manipulations, with minor accuracy degradation from the reranker. Rabtriever also generalizes well on traditional retrieval benchmarks such as MS MARCO and BEIR, with comparable performance to the best retriever baseline.
Comments: 11 pages, 8 figures. ICMR 2026
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2604.23336 [cs.IR]
  (or arXiv:2604.23336v1 [cs.IR] for this version)
  https://doi.org/10.48550/arXiv.2604.23336

arXiv-issued DOI via DataCite (pending registration)

Related DOI: https://doi.org/10.1145/3805622.3810780

DOI(s) linking to related resources

Submission history

From: Luo Ji [view email]
[v1] Sat, 25 Apr 2026 14:45:33 UTC (317 KB)