Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention
Dvir Samuel, Issar Tzachor, Matan Levy, Michael Green, Gal Chech · 2026-02-02 · via cs.AI updates on arXiv.org

Autoregressive video diffusion models enable streaming generation, opening the door to long-form synthesis, video world models, and interactive neural game engines. However, their core attention layers become a major bottleneck at inference time: as generation progresses, the KV cache grows, increasing both latency and GPU memory usage, which in turn restricts the usable temporal context and harms long-range consistency. In this work, we study redundancy in autoregressive video diffusion and identify three persistent sources: near-duplicate cached keys across frames; slowly evolving (largely semantic) queries and keys that make many attention computations redundant; and cross-attention over long prompts, where only a small subset of tokens matters per frame. Building on these observations, we propose a unified, training-free attention framework for autoregressive diffusion: TempCache compresses the KV cache via temporal correspondence to bound cache growth; AnnCA accelerates cross-attention by selecting frame-relevant prompt tokens using fast approximate nearest neighbor (ANN) matching; and AnnSA sparsifies self-attention by restricting each query to semantically matched keys, also using a lightweight ANN. Together, these modules reduce attention compute and memory usage and are compatible with existing autoregressive diffusion backbones and world models. Experiments demonstrate end-to-end speedups of up to 5x to 10x while preserving near-identical visual quality and, crucially, maintaining stable throughput and nearly constant peak GPU memory usage over long rollouts, where prior methods progressively slow down and consume increasing memory.
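To make two of these ideas concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes that near-duplicate keys can be detected with a cosine-similarity threshold and merged by a running mean (a stand-in for the temporal correspondence used by TempCache), and it substitutes an exact brute-force top-k search for the ANN matching used by AnnSA. The function names `append_with_merge` and `sparse_attention`, the `threshold` of 0.95, and the `top_k` parameter are all hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def append_with_merge(cache, key, value, threshold=0.95):
    """Add (key, value) to the cache, merging into a near-duplicate entry if any.

    `cache` is a list of [key, value, count] entries. Merging keeps a running
    mean, so the cache grows only when a sufficiently new key arrives.
    (Assumption: the threshold and the mean-merge rule are illustrative.)
    """
    best_i, best_sim = -1, threshold
    for i, (cached_key, _, _) in enumerate(cache):
        sim = cosine(cached_key, key)
        if sim >= best_sim:
            best_i, best_sim = i, sim
    if best_i < 0:
        cache.append([list(key), list(value), 1])
        return
    k, v, n = cache[best_i]
    cache[best_i] = [
        [(ki * n + x) / (n + 1) for ki, x in zip(k, key)],
        [(vi * n + x) / (n + 1) for vi, x in zip(v, value)],
        n + 1,
    ]


def sparse_attention(query, cache, top_k=2):
    """Softmax attention restricted to the top_k highest-scoring cached keys.

    (Assumption: exact search is used here in place of an ANN index.)
    """
    if not cache:
        raise ValueError("cache is empty")
    scale = 1.0 / math.sqrt(len(query))
    scored = [
        (sum(q * k for q, k in zip(query, key)) * scale, value)
        for key, value, _ in cache
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    kept = scored[:top_k]
    max_score = max(score for score, _ in kept)  # for numerical stability
    weights = [math.exp(score - max_score) for score, _ in kept]
    total = sum(weights)
    dim = len(kept[0][1])
    return [
        sum(w * value[d] for w, (_, value) in zip(weights, kept)) / total
        for d in range(dim)
    ]


# Three "frames"; the second key is a near-duplicate of the first.
cache = []
for k, v in [([1.0, 0.0], [1.0, 0.0]),
             ([0.99, 0.01], [0.9, 0.1]),
             ([0.0, 1.0], [0.0, 1.0])]:
    append_with_merge(cache, k, v)

output = sparse_attention([1.0, 0.0], cache, top_k=1)
```

In this toy run the cache ends up with two entries rather than three, because the second key is folded into the first. A real system would operate on batched tensors per attention head and would need an index structure to keep the search sublinear; the sketch only illustrates why merging bounds cache growth and why restricting each query to a few matched keys reduces attention work.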