MASt3R-Nav: WayPixel Navigation in Relative 3D Maps
Vansh Garg, · 2026-05-26 · via cs.AI updates on arXiv.org

Abstract: Visual navigation ability is strongly tied to the underlying representation of the world. Unlike classical 3D maps, which require globally consistent geometry, image- or object-relative topological graphs almost entirely do away with geometric understanding. But this comes at the cost of navigation capability, often limiting it to mere teach-and-repeat. In this work, we propose a novel map representation in the form of pixel-relative connectivity, which is geometrically accurate but does not require global geometric consistency. Inspired by recent progress in 3D-grounded image matching, we construct a map from an image sequence through inter-image connectivity based on pixel correspondences in the relative 3D coordinate systems of individual image pairs. We then use this pixel-level graph to perform global path planning by approximating and sparsifying intra-image pixel connectivity. Through this, we derive a "WayPixel Costmap" representation and train a controller conditioned on it to predict a trajectory rollout. We show that this dense pixel-level costmap based on relative geometry is a more accurate conditioning variable for control prediction than its image- and object-level counterparts. This enables a highly capable navigation system, as validated on four types of navigation tasks in simulation and through real-world demonstrations.
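The planning idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that nodes are (image, pixel) pairs, that inter-image edges come from given pixel correspondences, that intra-image connectivity is sparsified by linking each pixel to its nearest neighbors in that image's own relative 3D frame, and that the costmap is the shortest-path cost to a goal pixel computed with Dijkstra's algorithm. The names `build_graph`, `cost_to_goal`, `waypixel_costmap`, and `k_intra` are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def build_graph(points_3d, correspondences, k_intra=2):
    """Build a pixel-level graph (illustrative assumption, not the paper's code).

    points_3d: {image_id: {pixel: (x, y, z)}}, each image in its own relative frame.
    correspondences: list of ((img_a, pix_a), (img_b, pix_b), cost) tuples.
    """
    graph = defaultdict(list)
    # Inter-image edges: matched pixels across two images.
    for a, b, cost in correspondences:
        graph[a].append((b, cost))
        graph[b].append((a, cost))
    # Intra-image edges: sparsified by keeping only the k nearest pixels in 3D.
    for img, pts in points_3d.items():
        for p in pts:
            nearest = sorted(
                (math.dist(pts[p], pts[q]), q) for q in pts if q != p
            )[:k_intra]
            for d, q in nearest:
                graph[(img, p)].append(((img, q), d))
                graph[(img, q)].append(((img, p), d))
    return graph


def cost_to_goal(graph, goal):
    """Dijkstra from the goal node; returns the cost-to-goal of every reachable node."""
    dist = {goal: 0.0}
    heap = [(0.0, goal)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nbr, w in graph.get(node, ()):
            nd = d + w
            if nd < dist.get(nbr, math.inf):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def waypixel_costmap(dist, image_id, pixels):
    """Per-pixel cost-to-goal for one image; unreachable pixels get infinity."""
    return {p: dist.get((image_id, p), math.inf) for p in pixels}


# Toy data: two images with two pixels each, one cross-image correspondence.
points_3d = {
    0: {(0, 0): (0.0, 0.0, 1.0), (0, 1): (1.0, 0.0, 1.0)},
    1: {(0, 0): (0.0, 0.0, 1.0), (0, 1): (2.0, 0.0, 1.0)},
}
correspondences = [((0, (0, 1)), (1, (0, 0)), 0.0)]

graph = build_graph(points_3d, correspondences, k_intra=1)
dist = cost_to_goal(graph, goal=(1, (0, 1)))
costmap = waypixel_costmap(dist, 0, points_3d[0])
print(costmap)
```

In this toy setup, no global coordinate frame is ever built: distances are only measured inside a single image's frame or across a matched pixel pair, which mirrors the abstract's claim of geometric accuracy without global consistency. A learned controller conditioned on such a costmap is outside the scope of this sketch.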
Comments: 2026 IEEE International Conference on Robotics & Automation (ICRA)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
Cite as: arXiv:2605.24111 [cs.RO]
  (or arXiv:2605.24111v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2605.24111

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Vansh Garg
[v1] Fri, 22 May 2026 18:18:07 UTC (13,630 KB)