Vision-Guided Outdoor Flight and Obstacle Evasion via Reinforcement Learning
Shiladitya D · 2026-05-26 · via cs.LG updates on arXiv.org

Abstract: Although quadcopters offer impressive traversal capabilities thanks to their omnidirectional maneuverability, the need for continuous pilot control in complex environments impedes their use in GNSS- and telemetry-denied scenarios. To this end, we propose a novel sensorimotor policy that uses stereo-vision depth and visual-inertial odometry (VIO) to autonomously navigate through obstacles in an unknown environment and reach a goal point. The policy consists of a pre-trained autoencoder as the perception head, followed by a planning and control LSTM network that outputs velocity commands an off-the-shelf commercial drone can follow. We leverage reinforcement and privileged learning paradigms to train the policy in simulation through a two-stage process: 1) initial training with optimal trajectories generated by a global motion planner acting as a supervisory backbone, and 2) further fine-tuning in a curriculum environment. To bridge the sim-to-real gap, we employ domain randomization and reward shaping to create a policy that is robust to both noise and domain shift. In outdoor experiments, our approach achieves successful zero-shot transfer to both obstacle environments and a drone platform that were never encountered during training.
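The abstract names domain randomization and reward shaping without giving details. The following is a minimal sketch of what those two ideas can look like in a navigation simulator, not the authors' implementation. Every name, parameter range, and reward weight here (`EpisodeParams`, `sample_episode_params`, `noisy_depth`, `shaped_reward`, the noise bounds, the penalty and bonus values) is a hypothetical choice made for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    """Hypothetical per-episode simulator settings (not taken from the paper)."""
    depth_noise_std: float  # metres, std of Gaussian noise added to depth values
    odometry_drift: float   # metres per step, bias added to the VIO position
    max_speed: float        # m/s, velocity limit of the simulated drone


def sample_episode_params(rng: random.Random) -> EpisodeParams:
    """Domain randomization: draw fresh simulator settings for each episode."""
    return EpisodeParams(
        depth_noise_std=rng.uniform(0.0, 0.3),
        odometry_drift=rng.uniform(0.0, 0.05),
        max_speed=rng.uniform(1.0, 3.0),
    )


def noisy_depth(depth, params: EpisodeParams, rng: random.Random):
    """Perturb a flat list of depth values, clamping results to be non-negative."""
    return [max(0.0, d + rng.gauss(0.0, params.depth_noise_std)) for d in depth]


def shaped_reward(prev_pos, pos, goal, collided, reached,
                  progress_weight=1.0, collision_penalty=10.0, goal_bonus=10.0):
    """Shaped reward: a dense progress term plus sparse terminal terms.

    The progress term is the reduction in Euclidean distance to the goal
    over one step, so moving away from the goal yields a negative reward.
    """
    progress = math.dist(prev_pos, goal) - math.dist(pos, goal)
    reward = progress_weight * progress
    if collided:
        reward -= collision_penalty
    if reached:
        reward += goal_bonus
    return reward


if __name__ == "__main__":
    rng = random.Random(0)
    params = sample_episode_params(rng)
    print(params)
    print(noisy_depth([1.0, 2.5, 4.0], params, rng))
    print(shaped_reward((0, 0, 0), (1, 0, 0), (5, 0, 0), collided=False, reached=False))
```

In a training loop under these assumptions, `sample_episode_params` would be called once per episode reset, so the policy never sees the same sensor noise and dynamics limits twice, while `shaped_reward` gives a learning signal at every step rather than only on reaching the goal.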
Comments: Published in IEEE Robotics and Automation Letters, vol. 11, no. 2. Presented at the IEEE International Conference on Robotics and Automation 2026.
Subjects: Robotics (cs.RO); Machine Learning (cs.LG)
Cite as: arXiv:2605.24449 [cs.RO]
  (or arXiv:2605.24449v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2605.24449

Related DOI: https://doi.org/10.1109/LRA.2025.3641120

Submission history

From: Shiladitya Dutta
[v1] Sat, 23 May 2026 07:41:13 UTC (6,350 KB)