





















Abstract: Deep reinforcement learning agents progressively lose representational capacity during training: neurons become dormant, removing active capacity from the network, and effective rank collapses, leaving the surviving neurons redundant. Existing remedies, such as periodic resets and specialized network architectures, are largely algorithm- or domain-specific. We propose a simple architectural fix, the Hadamard Representation (HR), which replaces a standard hidden layer with the element-wise product of two independently parameterized layers. HR operates through two complementary mechanisms. First, it reduces the probability that a neuron becomes dormant. This is particularly valuable for continuously differentiable activations such as tanh: unlike dormant ReLU neurons, which are effectively pruned, saturated tanh neurons silently corrupt downstream layers by turning their outgoing weights into fixed biases. Second, independently of dormancy, the multiplicative structure captures richer feature interactions and increases effective rank without widening the layer. We evaluate HR across five algorithms and three domains: DQN, PPO, and PQN on pixel-based discrete-action Atari, SimbaV2 on state-based continuous control, and MR.Q on visual continuous control. HR consistently improves performance over strong baselines without any hyperparameter tuning, and the gains persist against parameter-matched wider variants, ruling out parameter count as an alternative explanation.
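The core construction described in the abstract, a hidden layer formed as the element-wise product of two independently parameterized layers, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the helper names (`init_linear`, `standard_layer`, `hadamard_layer`), the choice to apply tanh inside each branch before multiplying, the initialization scheme, and the layer sizes are all assumptions made for the example.

```python
import math
import random


def init_linear(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer (hypothetical init scheme)."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(params, x):
    """Affine map W x + b for a single input vector x."""
    weights, biases = params
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def standard_layer(params, x):
    """Ordinary hidden layer: tanh(W x + b)."""
    return [math.tanh(z) for z in linear(params, x)]


def hadamard_layer(params_a, params_b, x):
    """Element-wise product of two independently parameterized tanh layers.

    Assumption: the activation is applied in each branch before the product.
    """
    a = standard_layer(params_a, x)
    b = standard_layer(params_b, x)
    return [ai * bi for ai, bi in zip(a, b)]


rng = random.Random(0)
n_in, n_hidden = 4, 8  # illustrative sizes only
branch_a = init_linear(n_in, n_hidden, rng)
branch_b = init_linear(n_in, n_hidden, rng)

x = [0.5, -1.0, 2.0, 0.1]
h = hadamard_layer(branch_a, branch_b, x)  # same width as one standard layer
```

Under these assumptions, a unit whose first branch is saturated still varies with the input as long as its second branch does not saturate, which is one way to read the abstract's claim about reduced dormancy. The output width equals that of a single standard layer, while the parameter count of the layer doubles, which is why the abstract compares against parameter-matched wider baselines.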
| Comments: | 26 pages, 17 figures |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2406.09079 [cs.LG] (or arXiv:2406.09079v5 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2406.09079 (arXiv-issued DOI via DataCite) |
From: Jacob Eeuwe Kooi
[v1] Thu, 13 Jun 2024 13:03:37 UTC (2,749 KB)
[v2] Wed, 23 Oct 2024 08:05:57 UTC (8,876 KB)
[v3] Fri, 29 Nov 2024 13:24:58 UTC (12,845 KB)
[v4] Mon, 19 May 2025 12:12:54 UTC (14,572 KB)
[v5] Mon, 25 May 2026 08:30:30 UTC (16,843 KB)