

cs.IR updates on arXiv.org

ULTRA: An Unbiased Learning To Rank Algorithm Toolbox
Anh Tran, Tao Yang, Qingyao Ai · 2021-08-11 · via cs.IR updates on arXiv.org

Learning-to-rank systems have become an important part of our daily lives. However, the implicit user feedback used to train many learning-to-rank models is usually noisy and suffers from user bias (i.e., position bias). Obtaining an unbiased model from biased feedback has therefore become an important research field in IR. Existing studies on unbiased learning to rank (ULTR) can be grouped into two families: algorithms that attain unbiasedness with logged data (offline learning), and algorithms that achieve unbiasedness by estimating unbiased parameters from real-time user interactions (online learning). While many algorithms exist in both families, there is no unified way to compare and benchmark them. As a result, it can be challenging for researchers to choose the right technique for their problems, and for newcomers to the field to learn and understand existing algorithms. To solve this problem, we introduce ULTRA, a flexible, extensible, and easily configurable ULTR toolbox. Its key features include support for multiple ULTR algorithms with configurable hyperparameters, a variety of built-in click models that can be used separately to simulate clicks, different ranking model architectures and evaluation metrics, and simple creation of learning-to-rank pipelines. In this paper, we discuss the general framework of ULTR, briefly describe the algorithms in ULTRA, and detail the structure and pipeline of the toolbox. We experimented on all the algorithms supported by ULTRA and showed that the toolbox's performance is reasonable. Our toolbox is an important resource for researchers who want to run experiments on ULTR algorithms with different configurations and to test their own algorithms with the supported features.
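
To make the idea of position bias and click simulation concrete, here is a minimal sketch in plain Python. It is not ULTRA's API: the function names (`simulate_clicks`, `propensity`, `ipw_estimate`), the position-based click model with a `1/rank` examination probability, and the `eta` and `noise` parameters are all assumptions chosen for illustration. The sketch simulates clicks on a fixed ranking and then shows how inverse propensity weighting, one common offline debiasing technique, recovers click rates that no longer depend on rank.

```python
import random


def propensity(rank, eta=1.0):
    """Assumed examination probability at a 1-based rank (hypothetical model)."""
    return (1.0 / rank) ** eta


def simulate_clicks(relevances, rng, eta=1.0, noise=0.1):
    """Position-based click model: a click needs examination AND attraction."""
    clicks = []
    for rank, rel in enumerate(relevances, start=1):
        p_exam = propensity(rank, eta)          # decays with rank: position bias
        p_attr = noise + (1.0 - noise) * rel    # irrelevant items still get noisy clicks
        clicks.append(1 if rng.random() < p_exam * p_attr else 0)
    return clicks


def ipw_estimate(click_logs, eta=1.0):
    """Per-rank click rate, reweighted by the inverse examination propensity."""
    n_ranks = len(click_logs[0])
    totals = [0.0] * n_ranks
    for clicks in click_logs:
        for i, c in enumerate(clicks):
            totals[i] += c / propensity(i + 1, eta)
    return [t / len(click_logs) for t in totals]


rng = random.Random(0)                 # fixed seed for repeatability
relevances = [1, 0, 1, 0, 1]           # binary relevance of the items at ranks 1..5
logs = [simulate_clicks(relevances, rng) for _ in range(20000)]

# Naive click-through rate per rank: relevant items at low ranks look bad.
naive = [sum(col) / len(logs) for col in zip(*logs)]
# Debiased estimate: relevant items score similarly regardless of rank.
debiased = ipw_estimate(logs)

print("naive   :", [round(x, 3) for x in naive])
print("debiased:", [round(x, 3) for x in debiased])
```

In this sketch the debiasing step uses the same propensities that generated the clicks, which is the idealized case. In practice the propensities are unknown and must be estimated, which is one of the things that distinguishes ULTR algorithms from one another.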