惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

宝玉的分享
宝玉的分享
The GitHub Blog
The GitHub Blog
Vercel News
Vercel News
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
酷 壳 – CoolShell
酷 壳 – CoolShell
Last Week in AI
Last Week in AI
F
Fortinet All Blogs
Jina AI
Jina AI
I
InfoQ
T
The Blog of Author Tim Ferriss
P
Proofpoint News Feed
博客园 - 三生石上(FineUI控件)
G
Google Developers Blog
V
Visual Studio Blog
L
LangChain Blog
WordPress大学
WordPress大学
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
T
Tor Project blog
GbyAI
GbyAI
MongoDB | Blog
MongoDB | Blog
V
V2EX
Stack Overflow Blog
Stack Overflow Blog
H
Help Net Security
Recorded Future
Recorded Future
N
News and Events Feed by Topic
云风的 BLOG
云风的 BLOG
Martin Fowler
Martin Fowler
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
罗磊的独立博客
O
OpenAI News
Google DeepMind News
Google DeepMind News
S
Schneier on Security
C
Check Point Blog
N
Netflix TechBlog - Medium
The Register - Security
The Register - Security
aimingoo的专栏
aimingoo的专栏
TaoSecurity Blog
TaoSecurity Blog
T
Tenable Blog
H
Hackread – Cybersecurity News, Data Breaches, AI and More
Hugging Face - Blog
Hugging Face - Blog
Cyberwarzone
Cyberwarzone
月光博客
月光博客
The Last Watchdog
The Last Watchdog
B
Blog
有赞技术团队
有赞技术团队
Blog — PlanetScale
Blog — PlanetScale
T
Tailwind CSS Blog
Hacker News: Ask HN
Hacker News: Ask HN
H
Heimdal Security Blog
美团技术团队

cs.RO updates on arXiv.org

FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs SENSE: Stereo OpEN Vocabulary SEmantic Segmentation Continual Hand-Eye Calibration for Open-world Robotic Manipulation PLAF: Pixel-wise Language-Aligned Feature Extraction for Efficient 3D Scene Understanding GaussianFlow SLAM: Monocular Gaussian Splatting SLAM Guided by GaussianFlow GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology $π_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities R3D: Revisiting 3D Policy Learning Vision-Based Safe Human-Robot Collaboration with Uncertainty Guarantees Benchmarking Classical Coverage Path Planning Heuristics on Irregular Hexagonal Grids for Maritime Coverage Scenarios NEAT-NC: NEAT guided Navigation Cells for Robot Path Planning HRDexDB: A Large-Scale Dataset of Dexterous Human and Robotic Hand Grasps ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints An Intelligent Robotic and Bio-Digestor Framework for Smart Waste Management Efficient closed-form approaches for pose estimation using Sylvester forms World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems A Nonasymptotic Theory of Gain-Dependent Error Dynamics in Behavior Cloning CooperDrive: Enhancing Driving Decisions Through Cooperative Perception SpaceMind: A Modular and Self-Evolving Embodied Vision-Language Agent Framework for Autonomous On-orbit Servicing HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System UMI-3D: Extending Universal Manipulation Interface from Vision-Limited to 3D Spatial Perception Towards Multi-Object-Tracking with Radar on a Fast Moving Vehicle: On the Potential of Processing Radar in the Frequency Domain Beyond Conservative Automated Driving in Multi-Agent Scenarios via Coupled Model Predictive Control and Deep Reinforcement Learning Failure Identification in Imitation Learning Via Statistical and Semantic Filtering A Dynamic-Growing Fuzzy-Neuro Controller, Application to a 3PSP Parallel Robot Vision-Language-Action Jump-Starting for Reinforcement Learning Robotic Agents A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies ESCAPE: Episodic Spatial Memory and Adaptive Execution Policy for Long-Horizon Mobile Manipulation Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning RadarSplat-RIO: Indoor Radar-Inertial Odometry with Gaussian Splatting-Based Radar Bundle Adjustment RobotPan: A 360$^\circ$ Surround-View Robotic Vision System for Embodied Perception Diffusion Sequence Models for Generative In-Context Meta-Learning of Robot Dynamics GeoVision-Enabled Digital Twin for Hybrid Autonomous-Teleoperated Medical Responses 4th Workshop on Maritime Computer Vision (MaCVi): Challenge Overview Multi-modal panoramic 3D outdoor datasets for place categorization Learning Probabilistic Responsibility Allocations for Multi-Agent Interactions Solving Physics Olympiad via Reinforcement Learning on Physics Simulators StarVLA-$α$: Reducing Complexity in Vision-Language-Action Systems Grounded World Model for Semantically Generalizable Planning SCORP: Scene-Consistent Multi-agent Diffusion Planning with Stable Online Reinforcement Post-Training for Cooperative Driving Agentic Driving Coach: Robustness and Determinism of Agentic AI-Powered Human-in-the-Loop Cyber-Physical Systems AffordSim: A Scalable Data Generator and Benchmark for Affordance-Aware Robotic Manipulation Efficient Emotion-Aware Iconic Gesture Prediction for Robot Co-Speech Minimal Embodiment Enables Efficient Learning of Number Concepts in Robot Learning to Forget -- Hierarchical Episodic Memory for Lifelong Robot Deployment 3D-Anchored Lookahead Planning for Persistent Robotic Scene Memory via World-Model-Based MCTS EmbodiedGovBench: A Benchmark for Governance, Recovery, and Upgrade Safety in Embodied Agent Systems Federated Single-Agent Robotics: Multi-Robot Coordination Without Intra-Robot Multi-Agent Fragmentation Robust Adversarial Policy Optimization Under Dynamics Uncertainty BridgeSim: Unveiling the OL-CL Gap in End-to-End Autonomous Driving AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Afford Correspondence Genie 4D: Semantic-Prior-Guided 4D Dynamic Scene Reconstruction RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies ProGAL-VLA: Grounded Alignment through Prospective Reasoning in Vision-Language-Action Models PhysInOne: Visual Physics Learning and Reasoning in One Suite C$^2$T: Captioning-Structure and LLM-Aligned Common-Sense Reward Learning for Traffic--Vehicle Coordination WOMBET: World Model-Based Experience Transfer for Robust and Sample-efficient Reinforcement Learning Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring 3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding Action Images: End-to-End Policy Learning via Multiview Video Generation Towards Generalizable Robotic Manipulation in Dynamic Environments General-purpose LLMs as Models of Human Driver Behavior: The Case of Simplified Merging Uncertainty, Vagueness, and Ambiguity in Human-Robot Interaction: Why Conceptualization Matters IROSA: Interactive Robot Skill Adaptation using Natural Language Online Navigation Planning for Long-term Autonomous Operation of Underwater Gliders Optimized Human-Robot Co-Dispatch Planning for Petro-Site Surveillance under Varying Criticalities MerNav: A Highly Generalizable Memory-Execute-Review Framework for Zero-Shot Object Goal Navigation From Instruction to Event: Sound-Triggered Mobile Manipulation Self-Organizing Dual-Buffer Adaptive Clustering Experience Replay (SODACER) for Safe Reinforcement Learning in Optimal Control Enhanced-FQL($λ$), an Efficient and Interpretable RL with novel Fuzzy Eligibility Traces and Segmented Experience Replay LEAD: Minimizing Learner-Expert Asymmetry in End-to-End Driving Learning to Plan, Planning to Learn: Adaptive Hierarchical RL-MPC for Sample-Efficient Decision Making Target-Bench: Can Video World Models Achieve Mapless Path Planning with Semantic Targets? Robust Verification of Controllers under State Uncertainty via Hamilton-Jacobi Reachability Analysis Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion Volumetric Ergodic Control RoboTAG: End-to-end Robot Configuration Estimation via Topological Alignment Graph TwinOR: Photorealistic Digital Twins of Dynamic Operating Rooms for Embodied AI Research Multimodal Diffusion Forcing for Forceful Manipulation X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models Flow with the Force Field: Learning 3D Compliant Flow Matching Policies from Force and Demonstration-Guided Simulation Data AFFORD2ACT: Affordance-Guided Automatic Keypoint Selection for Generalizable and Lightweight Robotic Manipulation HAMLET: Switch your Vision-Language-Action Model into a History-Aware Policy TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance Multi-Modal Manipulation via Multi-Modal Policy Consensus AutoDrive-R$^2$: Incentivizing Reasoning and Self-Reflection Capacity for VLA Model in Autonomous Driving Constrained Decoding for Safe Robot Navigation Foundation Models FCBV-Net: Category-Level Robotic Garment Smoothing via Feature-Conditioned Bimanual Value Prediction PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving LLM-based Realistic Safety-Critical Driving Video Generation Scalable Multi-Task Learning through Spiking Neural Networks with Adaptive Task-Switching Policy for Intelligent Autonomous Agents Learning to Play Piano in the Real World Scalable Unseen Objects 6-DoF Absolute Pose Estimation with Robotic Integration Sixth-Sense: Self-Supervised Learning of Spatial Awareness of Humans from a Planar Lidar Curriculum-based Sample Efficient Reinforcement Learning for Robust Stabilization of a Quadrotor Generative Models and Connected and Automated Vehicles: A Survey in Exploring the Intersection of Transportation and AI Convex Hulls of Reachable Sets
LLM-Driven 3D Scene Generation of Agricultural Simulation Environments
Arafa Yoncalik, Wouter Jansen, Nico Huebel, Mohammad Hasan Rahma · 2026-02-12 · via cs.RO updates on arXiv.org

Procedural generation techniques in 3D rendering engines have revolutionized the creation of complex environments, reducing reliance on manual design. Recent approaches using Large Language Models (LLMs) for 3D scene generation show promise but often lack domain-specific reasoning, verification mechanisms, and modular design. These limitations lead to reduced control and poor scalability. This paper investigates the use of LLMs to generate agricultural synthetic simulation environments from natural language prompts, specifically to address the limitations of lacking domain-specific reasoning, verification mechanisms, and modular design. A modular multi-LLM pipeline was developed, integrating 3D asset retrieval, domain knowledge injection, and code generation for the Unreal rendering engine using its API. This results in a 3D environment with realistic planting layouts and environmental context, all based on the input prompt and the domain knowledge. To enhance accuracy and scalability, the system employs a hybrid strategy combining LLM optimization techniques such as few-shot prompting, Retrieval-Augmented Generation (RAG), finetuning, and validation. Unlike monolithic models, the modular architecture enables structured data handling, intermediate verification, and flexible expansion. The system was evaluated using structured prompts and semantic accuracy metrics. A user study assessed realism and familiarity against real-world images, while an expert comparison demonstrated significant time savings over manual scene design. The results confirm the effectiveness of multi-LLM pipelines in automating domain-specific 3D scene generation with improved reliability and precision. Future work will explore expanding the asset hierarchy, incorporating real-time generation, and adapting the pipeline to other simulation domains beyond agriculture.