
























Abstract: Behavioral cloning becomes difficult when the same observation admits several valid actions. We study this problem for action-chunking policies and show that different multimodal parameterizations fail in different ways. For latent-variable policies, posterior-prior regularization makes deployment-time sampling more reliable, but excessive regularization removes the action-conditioned information needed to distinguish demonstrated modes. Reducing this regularization can preserve mode information, but then success depends on whether the prior covers the relevant latent regions. For action-space generative policies, multimodality is constrained by the smoothness of the base-to-action transport: a map with small Lipschitz constant cannot assign substantial probability to many well-separated modes. Covering many modes therefore requires either sharp transitions in base space or off-support bridge regions in action space. Experiments on synthetic multimodal tasks and robotic simulation benchmarks support these mechanisms.
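The trade-off between sharp transitions and bridge regions can be illustrated with a one-dimensional toy example. The sketch below is not the paper's construction: the clipped-linear map, the two modes at -1 and +1, the tolerance `tol`, and the function names are all assumptions chosen for illustration. It pushes a standard normal base variable through a map with Lipschitz constant `lipschitz` and measures how much probability lands between the two modes.

```python
import math

import numpy as np


def clipped_linear_map(z, lipschitz):
    """Hypothetical transport map: slope `lipschitz` near zero, saturating at -1 and +1."""
    return np.clip(lipschitz * z, -1.0, 1.0)


def bridge_mass(lipschitz, tol=0.1, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the probability that an action misses both modes."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    a = clipped_linear_map(z, lipschitz)
    # A sample counts as on a mode if it lies within `tol` of -1 or +1.
    off_mode = np.abs(np.abs(a) - 1.0) > tol
    return float(off_mode.mean())


def bridge_mass_exact(lipschitz, tol=0.1):
    """Closed form: |a| < 1 - tol exactly when |z| < (1 - tol) / lipschitz."""
    return math.erf((1.0 - tol) / (lipschitz * math.sqrt(2.0)))


if __name__ == "__main__":
    for lipschitz in (1.0, 5.0, 25.0):
        estimate = bridge_mass(lipschitz)
        exact = bridge_mass_exact(lipschitz)
        print(f"L={lipschitz:5.1f}  bridge mass: sampled={estimate:.4f}  exact={exact:.4f}")
```

In this toy setting the off-mode mass shrinks only as the Lipschitz constant grows, so a smooth map necessarily leaves a noticeable share of samples in the region between the modes, where no demonstration lies.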
| Subjects: | Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO) |
| Cite as: | arXiv:2605.22493 [cs.LG] |
| | (or arXiv:2605.22493v1 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.22493 (arXiv-issued DOI via DataCite, pending registration) |
From: Lorenzo Mazza
[v1] Thu, 21 May 2026 13:45:28 UTC (3,466 KB)