
























Abstract: Audio effects (FX) such as reverberation, distortion, modulation, and dynamic range processing play a pivotal role in shaping emotional responses during music listening. While prior studies have examined links between low-level audio features and affective perception, the systematic impact of audio FX on emotion remains underexplored. This work investigates how foundation models (large-scale neural architectures pretrained on multimodal data) can be leveraged to analyze these effects. Such models encode rich associations between musical structure, timbre, and affective meaning, offering a powerful framework for probing the emotional consequences of sound design techniques. By applying various probing methods to embeddings from deep learning models, we examine the complex, nonlinear relationships between audio FX and estimated emotion, uncovering patterns tied to specific effects and evaluating the robustness of foundation audio models. Our findings aim to advance understanding of the perceptual impact of audio production practices, with implications for music cognition, performance, and affective computing.
| Subjects: | Sound (cs.SD); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2509.15151 [cs.SD] |
| (or arXiv:2509.15151v4 [cs.SD] for this version) | |
| https://doi.org/10.48550/arXiv.2509.15151 arXiv-issued DOI via DataCite |
From: Vassilis Lyberatos
[v1] Thu, 18 Sep 2025 16:57:08 UTC (13,059 KB)
[v2] Sat, 20 Sep 2025 08:36:11 UTC (13,059 KB)
[v3] Tue, 6 Jan 2026 15:40:56 UTC (3,770 KB)
[v4] Thu, 21 May 2026 10:15:02 UTC (3,770 KB)