Combining RL and IL using a dynamic, performance-based modulation over learning signals and its application to local planning
Francisco Leiva, Javier Ruiz-del-Solar · 2024-05-16 · via cs.RO updates on arXiv.org

This paper proposes a method to combine reinforcement learning (RL) and imitation learning (IL) using a dynamic, performance-based modulation over learning signals. The proposed method combines RL with behavioral cloning (IL), or with corrective feedback in the action space (interactive IL, IIL), by dynamically weighting the losses to be optimized, taking into account the backpropagated gradients used to update the policy and the agent's estimated performance. In this manner, the RL and IL/IIL losses are combined by equalizing their impact on the policy's updates, and that impact is then modulated so that IL signals are prioritized at the beginning of the learning process. As the agent's performance improves, the RL signals become progressively more relevant, allowing for a smooth transition from pure IL/IIL to pure RL. The proposed method is used to learn local planning policies for mobile robots, synthesizing IL/IIL signals online by means of a scripted policy. An extensive evaluation of the proposed method on this task is performed in simulation. It is empirically shown that the method outperforms pure RL in terms of sample efficiency (reaching the same level of performance in the training environment with approximately 4 times fewer experiences), while consistently producing local planning policies with better performance metrics (achieving an average success rate of 0.959 in an evaluation environment, outperforming pure RL by 12.5% and pure IL by 13.9%). Furthermore, the obtained local planning policies are successfully deployed in the real world without any major fine-tuning. The proposed method can extend existing RL algorithms, and is applicable to other problems for which generating IL/IIL signals online is feasible. A video summarizing some of the real-world experiments that were conducted can be found at https://youtu.be/mZlaXn9WGzw.
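
The weighting idea described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual formulas, so everything below is an assumption made for illustration: the function names (`grad_norm`, `combine_gradients`), the choice to rescale both gradients to their average norm as a way of "equalizing their impact", and the linear blend driven by a `performance` estimate in [0, 1]. It is one plausible reading, not the authors' implementation.

```python
import math


def grad_norm(grad):
    """Euclidean norm of a gradient given as a flat list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def combine_gradients(rl_grad, il_grad, performance, eps=1e-8):
    """Blend RL and IL gradients (hypothetical scheme, not from the paper).

    Step 1 (equalize): rescale both gradients to a common norm, here the
    average of the two norms, so neither signal dominates by magnitude alone.
    Step 2 (modulate): weight IL by (1 - p) and RL by p, where p is the
    agent's estimated performance clamped to [0, 1]. At p = 0 the update is
    pure IL; at p = 1 it is pure RL.
    """
    p = min(max(performance, 0.0), 1.0)

    rl_norm = grad_norm(rl_grad)
    il_norm = grad_norm(il_grad)
    target = 0.5 * (rl_norm + il_norm)

    # eps guards against division by zero when a gradient vanishes.
    rl_scale = target / (rl_norm + eps)
    il_scale = target / (il_norm + eps)

    return [
        p * rl_scale * g_rl + (1.0 - p) * il_scale * g_il
        for g_rl, g_il in zip(rl_grad, il_grad)
    ]


if __name__ == "__main__":
    rl = [3.0, 4.0]  # toy RL gradient
    il = [0.0, 1.0]  # toy IL gradient
    for perf in (0.0, 0.5, 1.0):
        print(perf, combine_gradients(rl, il, perf))
```

In a real training loop the two gradients would come from backpropagating the RL and IL/IIL losses through the policy network, and `performance` would be an online estimate such as a running success rate. Both are left abstract here because the abstract does not specify them.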