





















Abstract: Although quadcopters boast impressive traversal capabilities enabled by their omnidirectional maneuverability, the need for continuous pilot control in complex environments impedes their application in GNSS- and telemetry-denied scenarios. To this end, we propose a novel sensorimotor policy that uses stereo-vision depth and visual-inertial odometry (VIO) to autonomously navigate through obstacles in an unknown environment to reach a goal point. The policy comprises a pre-trained autoencoder as the perception head, followed by a planning and control LSTM network that outputs velocity commands that can be followed by an off-the-shelf commercial drone. We leverage reinforcement and privileged learning paradigms to train the policy in simulation through a two-stage process: (1) initial training with optimal trajectories generated by a global motion planner acting as a supervisory backbone, and (2) further fine-tuning in a curriculum environment. To bridge the sim-to-real gap, we employ domain randomization and reward shaping to create a policy that is robust to both noise and domain shift. In outdoor experiments, our approach achieves successful zero-shot transfer to both obstacle environments and a drone platform that were never encountered during training.
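
To make the data flow described in the abstract concrete (depth encoding, then a recurrent planner, then velocity commands), here is a minimal sketch in plain Python. It is not the authors' implementation. Every name, dimension, and the layout of the state vector is a hypothetical chosen for illustration. The weights are random and untrained, and a single linear layer with `tanh` stands in for the encoder half of the pre-trained autoencoder.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained policy would load learned values."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class SketchPolicy:
    """Hypothetical encoder -> LSTM cell -> velocity head (untrained)."""

    def __init__(self, depth_pixels=64, latent=8, state_dim=6,
                 hidden=16, max_speed=2.0, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        self.max_speed = max_speed  # assumed velocity limit in m/s
        self.W_enc = rand_matrix(latent, depth_pixels, rng)
        # One stacked matrix for the input, forget, output, and candidate gates.
        self.W_lstm = rand_matrix(4 * hidden, latent + state_dim + hidden, rng)
        self.W_out = rand_matrix(3, hidden, rng)
        self.reset()

    def reset(self):
        """Clear the recurrent memory, e.g. at the start of a flight."""
        self.h = [0.0] * self.hidden
        self.c = [0.0] * self.hidden

    def step(self, depth, state):
        """Map one flattened depth frame and one state vector to (vx, vy, vz)."""
        z = [math.tanh(v) for v in matvec(self.W_enc, depth)]  # perception head
        gates = matvec(self.W_lstm, z + state + self.h)
        n = self.hidden
        i = [sigmoid(g) for g in gates[:n]]
        f = [sigmoid(g) for g in gates[n:2 * n]]
        o = [sigmoid(g) for g in gates[2 * n:3 * n]]
        g = [math.tanh(v) for v in gates[3 * n:]]
        self.c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, self.c, i, g)]
        self.h = [ok * math.tanh(ck) for ok, ck in zip(o, self.c)]
        # tanh bounds each velocity component to [-max_speed, max_speed].
        return [self.max_speed * math.tanh(v) for v in matvec(self.W_out, self.h)]


policy = SketchPolicy()
depth = [0.5] * 64                        # placeholder flattened depth image
state = [0.0, 0.0, 0.0, 5.0, 0.0, 1.0]    # assumed: VIO velocity + goal offset
cmd = policy.step(depth, state)
```

Because the LSTM cell carries `h` and `c` between calls, successive commands depend on the history of observations rather than on the current frame alone. That is the usual motivation for a recurrent planner in partially observed environments.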
| Comments: | Published in IEEE Robotics and Automation Letters, vol 11, no 2. Presented at the IEEE International Conference on Robotics and Automation 2026 |
| Subjects: | Robotics (cs.RO); Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.24449 [cs.RO] |
| | (or arXiv:2605.24449v1 [cs.RO] for this version) |
| | https://doi.org/10.48550/arXiv.2605.24449 (arXiv-issued DOI via DataCite, pending registration) |
| Related DOI: | https://doi.org/10.1109/LRA.2025.3641120 |
From: Shiladitya Dutta
[v1] Sat, 23 May 2026 07:41:13 UTC (6,350 KB)