SynAD: Enhancing Real-World End-to-End Autonomous Driving Models through Synthetic Data Integration
Jongsuk Kim, Jaeyoung Lee, Gyojin Han, Dongjae Lee, Minki Jeong, · 2025-10-28 · via cs.RO updates on arXiv.org

Recent advancements in deep learning and the availability of high-quality real-world driving datasets have propelled end-to-end autonomous driving (E2E AD). Despite this progress, relying solely on real-world data limits the variety of driving scenarios available for training. Synthetic scenario generation has emerged as a promising way to enrich the diversity of training data; however, its application within E2E AD models remains largely unexplored, primarily because synthetic scenarios lack a designated ego vehicle and the associated sensor inputs, such as camera or LiDAR, that real-world scenarios typically provide. To address this gap, we introduce SynAD, the first framework designed to enhance real-world E2E AD models using synthetic data. Our method designates the agent with the most comprehensive driving information as the ego vehicle in a multi-agent synthetic scenario. We further project path-level scenarios onto maps and employ a newly developed Map-to-BEV Network to derive bird's-eye-view (BEV) features without relying on sensor inputs. Finally, we devise a training strategy that effectively integrates these map-based synthetic data with real driving data. Experimental results demonstrate that SynAD effectively integrates all components and notably enhances safety performance. By bridging synthetic scenario generation and E2E AD, SynAD paves the way for more comprehensive and robust autonomous driving models.
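
The ego-designation and map-projection steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define "most comprehensive driving information", so the code assumes a hypothetical criterion (the agent with the most valid waypoints), and it rasterizes paths into a plain occupancy grid as a stand-in for the map projection. The learned Map-to-BEV Network itself is not modeled. All names (`AgentTrack`, `select_ego`, `rasterize_paths`) and the grid parameters are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class AgentTrack:
    agent_id: str
    # One (x, y) waypoint per timestep in map coordinates (metres);
    # None marks a timestep with no information for this agent.
    waypoints: list


def count_valid(track):
    """Number of timesteps for which the agent has a waypoint."""
    return sum(1 for p in track.waypoints if p is not None)


def select_ego(tracks):
    """ASSUMED criterion: pick the agent with the most valid waypoints."""
    return max(tracks, key=count_valid)


def rasterize_paths(tracks, ego, grid_size=8, cell_m=2.0):
    """Mark every grid cell touched by any agent path.

    The grid is centred on the ego's first valid waypoint; rows index y,
    columns index x. Waypoints outside the grid are skipped.
    """
    origin = next(p for p in ego.waypoints if p is not None)
    half = grid_size * cell_m / 2.0
    grid = [[0] * grid_size for _ in range(grid_size)]
    for track in tracks:
        for p in track.waypoints:
            if p is None:
                continue
            col = math.floor((p[0] - origin[0] + half) / cell_m)
            row = math.floor((p[1] - origin[1] + half) / cell_m)
            if 0 <= row < grid_size and 0 <= col < grid_size:
                grid[row][col] = 1
    return grid


# Toy scenario: agent "b" has a complete track, agent "a" is missing two steps.
tracks = [
    AgentTrack("a", [(0.0, 0.0), (2.0, 0.0), None, None]),
    AgentTrack("b", [(0.0, 4.0), (2.0, 4.0), (4.0, 4.0), (6.0, 4.0)]),
]
ego = select_ego(tracks)
bev = rasterize_paths(tracks, ego)
```

In this toy scenario agent `b` is selected as ego because its track has no missing steps, and the resulting grid is a sensor-free, map-style input of the kind a network could consume in place of camera or LiDAR features.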