





















Abstract: Human motion diffusion models can synthesize action sequences from text, but controlling motion intensity remains challenging. Existing approaches rely on effort-related adverbs, which are ambiguous and fail to capture quantitative aspects such as pacing, often resulting in flat and monotonous dynamics. We propose an intensity-control framework based on Effort Metric Attention (EMA), a cross-attention module that conditions diffusion on numerical effort signals. Inspired by Laban Movement Analysis (LMA), the framework focuses on the Time and Weight effort factors. We approximate these factors using two kinematic metrics: peak joint positional change for pacing and collective joint positional change for motion amount. EMA enables fine-grained, region-wise control without costly post-hoc optimization. We introduce two evaluation tasks, metric-to-motion consistency and body-part-level effort modulation, to assess numerical fidelity and localized control. Experiments and a user study show near-monotonic alignment between specified effort levels, generated motion dynamics, and established LMA descriptors. These results indicate effective and interpretable control of effort dynamics in practice.
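The two kinematic metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the formulas here are assumptions: "peak joint positional change" is taken as the largest single-frame displacement of any joint, and "collective joint positional change" as the sum of all per-frame joint displacements. The function names and the optional `joints` argument (a hypothetical way to restrict the metric to one body region) are illustrative only and are not taken from the paper.

```python
import math


def frame_displacements(motion):
    """Per-frame, per-joint Euclidean displacement.

    motion: list of frames; each frame is a list of (x, y, z) joint positions.
    Returns len(motion) - 1 rows, each holding one distance per joint.
    """
    return [
        [math.dist(p_next, p_prev) for p_prev, p_next in zip(prev, nxt)]
        for prev, nxt in zip(motion, motion[1:])
    ]


def peak_joint_change(motion, joints=None):
    """Assumed pacing proxy: largest single-frame displacement of any joint."""
    disp = frame_displacements(motion)
    idx = range(len(motion[0])) if joints is None else joints
    return max(row[j] for row in disp for j in idx)


def collective_joint_change(motion, joints=None):
    """Assumed motion-amount proxy: total displacement over frames and joints."""
    disp = frame_displacements(motion)
    idx = range(len(motion[0])) if joints is None else joints
    return sum(row[j] for row in disp for j in idx)


# Toy clip: 3 frames, 2 joints (values are made up for illustration).
clip = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(3.0, 4.0, 0.0), (0.0, 0.0, 1.0)],
    [(3.0, 4.0, 0.0), (0.0, 0.0, 3.0)],
]

whole_body_peak = peak_joint_change(clip)                # over both joints
whole_body_total = collective_joint_change(clip)         # over both joints
region_peak = peak_joint_change(clip, joints=[1])        # joint 1 only
region_total = collective_joint_change(clip, joints=[1]) # joint 1 only
```

Restricting the joint set in this way is one plausible reading of the "region-wise" control the abstract mentions. How the paper actually groups joints into body parts, or normalizes the metrics, is not stated here.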
| Comments: | Accepted at IEEE International Conference on Automatic Face and Gesture Recognition (FG 2026) |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.24566 [cs.CV] (or arXiv:2605.24566v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.24566 (arXiv-issued DOI via DataCite, pending registration) |
From: Joshua Siy
[v1] Sat, 23 May 2026 13:00:36 UTC (1,712 KB)