Back to Parsimonious Latents: Learning Task-Centric World Models from Visual Foundations
Minghao Fu, · 2026-05-26 · via cs.AI updates on arXiv.org

Abstract: World models enable agents to predict future dynamics conditioned on actions, making the choice of latent representation central to planning and control. Such representations are often either learned directly from pixels with limited semantic structure or inherited from frozen visual foundation models with excessive task-irrelevant detail, yielding state spaces that are poorly matched to downstream planning and control. This is especially challenging in reward-free offline settings, where the model must learn from fixed trajectories without reward supervision or online interaction. To address this, we propose TC-WM, a framework for turning foundation-model embeddings into compact, task-sufficient world representations. The key design is to treat the pretrained embedding space as a semantic scaffold rather than as the final state space: TC-WM linearly projects high-dimensional visual embeddings into a compact latent as the dynamic space, aligns a subspace with the agent's physical state via contrastive learning, and reconstructs embeddings to preserve useful visual structure. This combines the generality of foundation features with the controllability of task-centric dynamics. Theoretically, we show that TC-WM suffices to identify the underlying task-centric latent factors up to a simple transformation. Empirically, TC-WM enables test-time planning across diverse environments (e.g., Robomimic and D4RL), achieving better world-modeling quality and more precise control than state-of-the-art approaches.
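The three ingredients named in the abstract (a linear projection into a compact latent, contrastive alignment of a latent subspace with the agent's physical state, and reconstruction of the original embedding) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the contrastive objective, the decoder, or which latent dimensions form the aligned subspace, so the sketch assumes an InfoNCE-style loss, a linear decoder, and the leading `k_state` latent dimensions. All names (`tcwm_style_losses`, `W_enc`, `W_dec`, `W_state`, `k_state`) are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix W (list of rows) by a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def info_nce(queries, keys, temperature=0.1):
    """InfoNCE-style loss: queries[i] should score highest against keys[i].

    Assumption: the abstract only says "contrastive learning"; InfoNCE is
    one common choice, used here purely for illustration.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)


def tcwm_style_losses(embeddings, states, W_enc, W_dec, W_state, k_state):
    """Compute the compact latents plus the two auxiliary losses.

    embeddings: batch of frozen foundation-model embeddings (dimension D)
    states:     batch of agent physical states (dimension S)
    W_enc:      d x D linear projection into the compact latent
    W_dec:      D x d decoder (assumed linear) that reconstructs the embedding
    W_state:    k_state x S map of the physical state into the aligned subspace
    k_state:    number of leading latent dimensions aligned with the state (assumed)
    """
    latents = [matvec(W_enc, e) for e in embeddings]

    # Reconstruction term: keep useful visual structure from the embedding.
    recons = [matvec(W_dec, z) for z in latents]
    recon_loss = sum(mse(r, e) for r, e in zip(recons, embeddings)) / len(embeddings)

    # Alignment term: contrast a latent subspace against the physical state.
    state_part = [z[:k_state] for z in latents]
    state_keys = [matvec(W_state, s) for s in states]
    align_loss = info_nce(state_part, state_keys)

    return latents, recon_loss, align_loss


def random_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian-initialized matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Toy usage with made-up sizes: D=16 embedding, d=4 latent, S=3 state dims.
rng = random.Random(0)
D, d, S, k, B = 16, 4, 3, 2, 8
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(B)]
states = [[rng.gauss(0.0, 1.0) for _ in range(S)] for _ in range(B)]
latents, recon_loss, align_loss = tcwm_style_losses(
    embeddings, states,
    random_matrix(d, D, rng), random_matrix(D, d, rng), random_matrix(k, S, rng), k,
)
print(len(latents[0]), recon_loss, align_loss)
```

In a trainable version the two losses would be combined with a dynamics-prediction loss on the latents and minimized by gradient descent; the weighting between them is not given in the abstract and is left out here.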
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2605.25620 [cs.AI]
  (or arXiv:2605.25620v1 [cs.AI] for this version)
  https://doi.org/10.48550/arXiv.2605.25620

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Minghao Fu
[v1] Mon, 25 May 2026 09:21:43 UTC (21,327 KB)