RARR: Researching and Revising What Language Models Say, Using Language Models
Luyu Gao, Zhuyun Dai, Panupong Pasupat, Anthony Chen, Arun Tejas · 2022-10-17 · via cs.IR updates on arXiv.org

Language models (LMs) now excel at many tasks such as few-shot learning, question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.
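
The two stages named in the abstract (find attribution for a generated passage, then post-edit unsupported content while changing as little as possible) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract says RARR relies on a large language model and standard web search. Here those components are replaced by hypothetical toy stand-ins (`toy_queries`, `toy_search`, `toy_disagrees`, `toy_edit`) so the control flow runs offline. All function and class names are assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

YEAR = re.compile(r"\b\d{4}\b")  # toy notion of a "checkable fact": a 4-digit year


@dataclass
class Revision:
    text: str                                           # possibly edited passage
    evidence: List[str] = field(default_factory=list)   # snippets kept as attribution


def research_and_revise(
    passage: str,
    make_queries: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
    disagrees: Callable[[str, str, str], bool],
    edit: Callable[[str, str, str], str],
) -> Revision:
    """Research: collect evidence per query. Revise: edit only on disagreement."""
    text = passage
    report: List[str] = []
    for query in make_queries(passage):
        for snippet in search(query):
            if disagrees(text, query, snippet):
                text = edit(text, query, snippet)  # minimal edit, rest preserved
            report.append(snippet)
    return Revision(text, report)


# --- Hypothetical stand-ins for the LLM and web search (assumptions) ---
TOY_INDEX = {
    "When was the Eiffel Tower completed?": [
        "The Eiffel Tower was completed in 1889."
    ],
}


def toy_queries(passage: str) -> List[str]:
    # A real system would generate questions from the passage with an LLM.
    return ["When was the Eiffel Tower completed?"]


def toy_search(query: str) -> List[str]:
    # A real system would call a web search engine here.
    return TOY_INDEX.get(query, [])


def toy_disagrees(text: str, query: str, snippet: str) -> bool:
    # Disagreement = both mention years, and none of them match.
    text_years = set(YEAR.findall(text))
    evidence_years = set(YEAR.findall(snippet))
    return bool(text_years) and bool(evidence_years) and text_years.isdisjoint(evidence_years)


def toy_edit(text: str, query: str, snippet: str) -> str:
    # Replace only the first year in the passage with the year from the evidence.
    return YEAR.sub(YEAR.findall(snippet)[0], text, count=1)


result = research_and_revise(
    "The Eiffel Tower was completed in 1899.",
    toy_queries, toy_search, toy_disagrees, toy_edit,
)
print(result.text)
print(result.evidence)
```

In this sketch, the edit function is called only when evidence contradicts the passage, so text that the evidence already supports passes through unchanged. That mirrors the abstract's goal of preserving the original output as much as possible.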