

miniReranker: Efficient Multimodal Reranking through Visual Cache Reuse and Interaction Sparsity
[Submitted on 9 Jun 2026 (v1), last revised 16 Jun 2026 (this ve · 2026-06-17 · via cs.IR updates on arXiv.org


Abstract: Multimodal large language models (MLLMs) have recently shown strong potential as point-wise rerankers by directly modeling query--document relevance through next-token prediction. However, point-wise reranking suffers from substantial repeated computation across query--document pairs, while the causal structure of transformers allows only prefix segments to be reused via pre-caching. To address the misalignment of existing query-first and document-first formats with both VQA-style prompting and computation-aware reuse, we propose a $\textit{vision-first}$ formulation that improves both cache reuse efficiency and reranking performance. Even with this reuse, the remaining cost is considerable and stems from three main sources: (1) $\textit{model depth}$, for which we reduce active parameters via early exit; (2) $\textit{cross-segment attention}$, which we restrict to a narrow interaction band across a few layers; and (3) $\textit{visual tokens}$, whose number we reduce via embedder-guided pruning. Together, these designs form miniReranker, which reduces reranking runtime to <1% of the dense implementation under high-reuse settings for a single query, while preserving >96% of the dense model performance.
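The abstract's point about prefix reuse can be illustrated with a toy token-count model. This is a minimal sketch and not the authors' implementation: the function names, the 576-token visual segment, and the 20-token text segments are all hypothetical, and the model counts only tokens that need a fresh forward pass, ignoring the attention cost that suffix tokens still pay against the cached prefix.

```python
def dense_cost(prefix_tokens, suffix_lengths):
    """Tokens encoded when every prompt is processed from scratch."""
    return sum(prefix_tokens + s for s in suffix_lengths)


def cached_cost(prefix_tokens, suffix_lengths):
    """Tokens encoded when the shared prefix is computed once and reused.

    Under causal attention, prefix states do not depend on later tokens,
    so only the segment placed first can be cached this way.
    """
    return prefix_tokens + sum(suffix_lengths)


# Hypothetical sizes: one 576-token visual segment placed first,
# followed by 100 different 20-token text segments.
visual_tokens = 576
text_segments = [20] * 100

print(dense_cost(visual_tokens, text_segments))   # 59600
print(cached_cost(visual_tokens, text_segments))  # 2576
```

In this toy setting the large segment dominates the cost, so placing it in the prefix position is what makes caching pay off; the further reductions the abstract lists (early exit, banded cross-segment attention, visual token pruning) are not modeled here.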

Submission history

From: Yingqi Fan
[v1] Tue, 9 Jun 2026 12:11:02 UTC (691 KB)
[v2] Tue, 16 Jun 2026 02:36:19 UTC (691 KB)