Human-like Working Memory Interference in Large Language Models
2026-04-14 · via cs.AI updates on arXiv.org

Authors: Hua-Dong Xiong (1), Li Ji-An (2), Jiaqi Huang (3 and 4), Robert C. Wilson (1 and 5), Kwonjoon Lee (4), Xue-Xin Wei (6) ((1) School of Psychological and Brain Sciences, Georgia Tech, (2) Department of Psychology, New York University, (3) Department of Cognitive Science, Indiana University Bloomington, (4) Honda Research Institute, (5) Center of Excellence for Computational Cognition, Georgia Tech, (6) Departments of Neuroscience and Psychology, The University of Texas at Austin)

Abstract: Intelligent systems must maintain and manipulate task-relevant information online to adapt to dynamic environments and changing goals. This capacity, known as working memory, is fundamental to human reasoning and intelligence. Despite having on the order of 100 billion neurons, both biological and artificial systems exhibit limitations in working memory. This raises a key question: why do large language models (LLMs) show such limitations, given that transformers have full access to prior context through attention? We find that although a two-layer transformer can be trained to solve working memory tasks perfectly, a diverse set of pretrained LLMs continues to show working memory limitations. Notably, LLMs reproduce interference signatures observed in humans: performance degrades with increasing memory load and is biased by recency and stimulus statistics. Across models, stronger working memory capacity correlates with broader competence on standard benchmarks, mirroring its link to general intelligence in humans. Yet despite substantial variability in working memory performance, LLMs surprisingly converge on a common computational mechanism. Rather than directly copying the relevant memory item from context, models encode multiple memory items in entangled representations, such that successful recall depends on interference control -- actively suppressing task-irrelevant content to isolate the target for readout. Moreover, a targeted intervention that suppresses stimulus content information improves performance, providing causal support for representational interference. Together, these findings identify representational interference as a core constraint on working memory in pretrained LLMs, suggesting that working-memory limits in biological and artificial systems may reflect a shared computational challenge: selecting task-relevant information under interference.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2604.09670 [cs.LG]
  (or arXiv:2604.09670v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2604.09670

arXiv-issued DOI via DataCite

Submission history

From: Hua-Dong Xiong
[v1] Wed, 1 Apr 2026 17:19:46 UTC (1,582 KB)
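
The load and recency effects named in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' task, models, or analysis: it assumes a hypothetical linear associative memory in which every key-item pair is superposed into one shared trace, so that reading out one item picks up crosstalk from the others. The function name `recall_by_position` and all parameters (`dim`, `vocab`, `decay`) are illustrative assumptions.

```python
import numpy as np

def recall_by_position(load, dim=16, vocab=32, trials=800, decay=1.0, seed=0):
    """Toy recall accuracy per serial position (0 = oldest item).

    Assumption: items are stored as a weighted sum of key/value outer
    products in a single trace; decay < 1 down-weights older items.
    Use trials >= load so that every position is probed at least once.
    """
    rng = np.random.default_rng(seed)
    # Fixed vocabulary of unit-norm item vectors.
    values = rng.standard_normal((vocab, dim))
    values /= np.linalg.norm(values, axis=1, keepdims=True)

    hits = np.zeros(load)
    counts = np.zeros(load)
    for t in range(trials):
        # Fresh random unit-norm keys and random items for this trial.
        keys = rng.standard_normal((load, dim))
        keys /= np.linalg.norm(keys, axis=1, keepdims=True)
        items = rng.integers(0, vocab, size=load)

        # Most recent item gets weight 1, older items decay geometrically.
        weights = decay ** np.arange(load - 1, -1, -1)

        # Entangled trace: sum_i w_i * outer(key_i, value_i), shape (dim, dim).
        trace = np.einsum("i,ij,ik->jk", weights, keys, values[items])

        # Cue with one key; the readout mixes the target with crosstalk.
        target = t % load
        readout = keys[target] @ trace
        guess = int(np.argmax(values @ readout))

        hits[target] += guess == items[target]
        counts[target] += 1
    return hits / counts

# Mean accuracy as memory load grows (no decay).
for load in (1, 4, 16, 32):
    print(load, recall_by_position(load).mean())

# Accuracy by serial position when older items are down-weighted.
print(recall_by_position(8, decay=0.7))
```

In this toy setting accuracy falls as more items share the trace, and with `decay < 1` recent positions are recalled more reliably than early ones. It only conveys the intuition of interference among superposed items; it makes no claim about how pretrained LLMs represent memory items internally.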