惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

H
Hacker News: Front Page
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
博客园 - 聂微东
WordPress大学
WordPress大学
Google DeepMind News
Google DeepMind News
T
The Blog of Author Tim Ferriss
IT之家
IT之家
Recent Announcements
Recent Announcements
S
SegmentFault 最新的问题
博客园 - 三生石上(FineUI控件)
G
Google Developers Blog
Recorded Future
Recorded Future
Apple Machine Learning Research
Apple Machine Learning Research
F
Full Disclosure
D
DataBreaches.Net
酷 壳 – CoolShell
酷 壳 – CoolShell
T
The Exploit Database - CXSecurity.com
T
Threat Research - Cisco Blogs
C
Cisco Blogs
D
Darknet – Hacking Tools, Hacker News & Cyber Security
Google Online Security Blog
Google Online Security Blog
爱范儿
爱范儿
V
V2EX
N
News and Events Feed by Topic
U
Unit 42
P
Privacy International News Feed
M
MIT News - Artificial intelligence
S
Security Affairs
T
Tenable Blog
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
TaoSecurity Blog
TaoSecurity Blog
人人都是产品经理
人人都是产品经理
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
美团技术团队
H
Hackread – Cybersecurity News, Data Breaches, AI and More
Hacker News: Ask HN
Hacker News: Ask HN
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
V
V2EX - 技术
aimingoo的专栏
aimingoo的专栏
L
LINUX DO - 热门话题
MongoDB | Blog
MongoDB | Blog
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
K
Kaspersky official blog
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
B
Blog
V
Vulnerabilities – Threatpost
博客园_首页
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
C
CERT Recently Published Vulnerability Notes
SecWiki News
SecWiki News

cs.AI updates on arXiv.org

Detecting Safety Violations Across Many Agent Traces C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks Discourse Diversity in Multi-Turn Empathic Dialogue Evaluating Cooperation in LLM Social Groups through Elected Leadership SWE-AGILE: A Software Agent Framework for Efficiently Managing Dynamic Reasoning Context Agentic Driving Coach: Robustness and Determinism of Agentic AI-Powered Human-in-the-Loop Cyber-Physical Systems Legal2LogicICL: Improving Generalization in Transforming Legal Cases to Logical Formulas via Diverse Few-Shot Learning Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind RPA-Check: A Multi-Stage Automated Framework for Evaluating Dynamic LLM-based Role-Playing Agents A Triadic Suffix Tokenization Scheme for Numerical Reasoning Synthius-Mem: Brain-Inspired Hallucination-Resistant Persona Memory Achieving 94.4% Memory Accuracy and 99.6% Adversarial Robustness on LoCoMo Time is Not a Label: Continuous Phase Rotation for Temporal Knowledge Graphs and Agentic Memory NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization METER: Evaluating Multi-Level Contextual Causal Reasoning in Large Language Models Quantization Dominates Rank Reduction for KV-Cache Compression Anthropogenic Regional Adaptation in Multimodal Vision-Language Model Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration Think Before you Write: QA-Guided Reasoning for Character Descriptions in Books METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning Learning from Contrasts: Synthesizing Reasoning Paths from Diverse Search Trajectories Do LLMs Know Tool Irrelevance? Demystifying Structural Alignment Bias in Tool Invocations The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems Enhancing Multimodal Large Language Models for Ancient Chinese Character Evolution Analysis via Glyph-Driven Fine-Tuning The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping RECIPER: A Dual-View Retrieval Pipeline for Procedure-Oriented Materials Question Answering Exploring Knowledge Conflicts for Faithful LLM Reasoning: Benchmark and Method CocoaBench: Evaluating Unified Digital Agents in the Wild MathAgent: Adversarial Evolution of Constraint Graphs for Mathematical Reasoning Data Synthesis Use of AI Tools: Guidelines to Maintain Academic Integrity in Computing Colleges Efficient Training for Cross-lingual Speech Language Models Guardrails Beat Guidance: A Large-Scale Study of Rules, Skills, and Persistent Configuration for Coding Agents Towards Proactive Information Probing: Customer Service Chatbots Harvesting Value from Conversation Shared Emotion Geometry Across Small Language Models: A Cross-Architecture Study of Representation, Behavior, and Methodological Confounds A Systematic Analysis of the Impact of Persona Steering on LLM Capabilities Uncertainty-Aware Web-Conditioned Scientific Fact-Checking Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics When Valid Signals Fail: Regime Boundaries Between LLM Features and RL Trading Policies When Verification Fails: How Compositionally Infeasible Claims Escape Rejection Back to the Barn with LLAMAs: Evolving Pretrained LLM Backbones in Finetuning Vision Language Models CFMS: A Coarse-to-Fine Multimodal Synthesis Framework for Enhanced Tabular Reasoning A molecular clock for writing systems reveals the quantitative impact of imperial power on cultural evolution Mem$^2$Evolve: Towards Self-Evolving Agents via Co-Evolutionary Capability Expansion and Experience Distillation Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music ZoomR: Memory Efficient Reasoning through Multi-Granularity Key Value Retrieval AOP-Smart: A RAG-Enhanced Large Language Model Framework for Adverse Outcome Pathway Analysis Speaking to No One: Ontological Dissonance and the Double Bind of Conversational AI Advancing Polish Language Modeling through Tokenizer Optimization in the Bielik v3 7B and 11B Series TInR: Exploring Tool-Internalized Reasoning in Large Language Models Do BERT Embeddings Encode Narrative Dimensions? A Token-Level Probing Analysis of Time, Space, Causality, and Character in Fiction Generating Multiple-Choice Knowledge Questions with Interpretable Difficulty Estimation using Knowledge Graphs and Large Language Models Deep-Reporter: Deep Research for Grounded Multimodal Long-Form Generation Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models Teaching Language Models How to Code Like Learners: Conversational Serialization for Student Simulation Detecting RAG Extraction Attack via Dual-Path Runtime Integrity Game Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents Learning and Enforcing Context-Sensitive Control for LLMs Efficient Process Reward Modeling via Contrastive Mutual Information Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment NSFL: A Post-Training Neuro-Symbolic Fuzzy Logic Framework for Boolean Operators in Neural Embeddings Bridging Linguistic Gaps: Cross-Lingual Mapping in Pre-Training and Dataset for Enhanced Multilingual LLM Performance Calibration Collapse Under Sycophancy Fine-Tuning: How Reward Hacking Breaks Uncertainty Quantification in LLMs Early Decisions Matter: Proximity Bias and Initial Trajectory Shaping in Non-Autoregressive Diffusion Language Models LLMs Should Incorporate Explicit Mechanisms for Human Empathy AI Patents in the United States and China: Measurement, Organization, and Knowledge Flows ReFEree: Reference-Free and Fine-Grained Method for Evaluating Factual Consistency in Real-World Code Summarization Thinking Fast, Thinking Wrong: Intuitiveness Modulates LLM Counterfactual Reasoning in Policy Evaluation From Query to Counsel: Structured Reasoning with a Multi-Agent Framework and Dataset for Legal Consultation CodaRAG: Connecting the Dots with Associativity Inspired by Complementary Learning The Amazing Agent Race: Strong Tool Users, Weak Navigators Learning from Emptiness: De-biasing Listwise Rerankers with Content-Agnostic Probability Calibration Think in Sentences: Explicit Sentence Boundaries Enhance Language Model's Capabilities CircuitSynth: Reliable Synthetic Data Generation ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models Computational Implementation of a Model of Category-Theoretic Metaphor Comprehension CoSToM:Causal-oriented Steering for Intrinsic Theory-of-Mind Alignment in Large Language Models FinTrace: Holistic Trajectory-Level Evaluation of LLM Tool Calling for Long-Horizon Financial Tasks Demographic and Linguistic Bias Evaluation in Omnimodal Language Models Cross-Cultural Value Awareness in Large Vision-Language Models From UAV Imagery to Agronomic Reasoning: A Multimodal LLM Benchmark for Plant Phenotyping Should We be Pedantic About Reasoning Errors in Machine Translation? Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards COMPOSITE-Stem GIANTS: Generative Insight Anticipation from Scientific Literature Pioneer Agent: Continual Improvement of Small Language Models in Production SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning Many-Tier Instruction Hierarchy in LLM Agents Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories PhysInOne: Visual Physics Learning and Reasoning in One Suite Neural Distribution Prior for LiDAR Out-of-Distribution Detection Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA
Projection and Quantisation: A Unifying View of Learning to Hash, from Random Projections to the RAG Era
Sean Moran · 2025-10-05 · via cs.AI updates on arXiv.org

Approximate nearest neighbour (ANN) search underpins large-scale retrieval, increasingly within the retrieval-augmented generation pipelines that ground large language models, yet the methods that address it have multiplied across communities until they are seldom read as a single field. We argue they form one field with three design choices, and develop the projection-quantisation-organisation (PQO) lens, under which locality-sensitive hashing, learned binary hashing, deep end-to-end hashing, product quantisation, graph-based indexes, and the binary embeddings of modern vector databases are all settings of three coupled questions: where to place the projections, where to place the quantisation thresholds, and how to organise the resulting codes. The projection-then-quantisation reading is established; our contribution is the third, co-equal organisation stage, a demonstration that the three run unbroken from the field's origins to the deep, product-quantisation, graph, and retrieval-augmented eras, and a reproducible measurement that turns the lens from classifying methods to predicting them. The measurement yields three findings. First, memory is won on the quantisation axis: a one-bit code is a thirty-second the size of the float, and a single full-precision re-ranking pass over a short candidate list recovers uncompressed quality in full. Second, the trade-off orderings the lens anticipates recur unchanged as the embedding grows. Third, where supervision is available, an eight-byte code more than doubles the quality of the two-kilobyte float it replaces. We release these measurements as BitBudget, an extensible benchmark with a live leaderboard, recast generative retrieval's "semantic identifiers" as quantisation codes, and identify the open problems that follow as compact codes return to the centre of large-scale retrieval.