



























Abstract: Activation functions play a central role in neural networks by shaping internal representations. Recently, learning binary activation representations has attracted significant attention because of their computational and memory efficiency and their interpretability. However, training neural networks with Heaviside activations remains challenging, because the Heaviside function is non-differentiable at zero and has zero gradient everywhere else, which obstructs standard gradient-based optimization. In this paper, we propose the Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function that enables stable training with gradient-based optimization. We construct HTAF as a composite of the sigmoid and hyperbolic tangent functions and show theoretically that it maintains a large gradient mass around zero inputs while exhibiting slower gradient decay in the tail regions. We show that Spiking Neural Networks, Binary Neural Networks, and Deep Heaviside Neural Networks can be trained stably using HTAF with gradient-based optimization. Finally, we introduce Implicit Concept Bottleneck Models (ICBMs), interpretable image models that leverage HTAF to induce discrete feature representations. Extensive experiments across various architectures and image datasets demonstrate that ICBMs enable stable discretization while achieving prediction performance comparable to or better than that of standard models.
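The abstract does not give the closed form of HTAF, so the sketch below does not attempt to reproduce it. It only illustrates the tail-gradient issue described above, using two well-known smooth surrogates for the Heaviside step: a logistic sigmoid, whose gradient decays exponentially away from zero, and an arctan-based step, whose gradient decays polynomially. The arctan surrogate, the function names, and the sharpness parameter `k` are all assumptions chosen for illustration and are not taken from the paper.

```python
import math


def heaviside(x: float) -> float:
    """Hard step: 1 for x >= 0, else 0. Its gradient is zero wherever it exists."""
    return 1.0 if x >= 0.0 else 0.0


def sigmoid_step(x: float, k: float = 5.0) -> float:
    """Logistic surrogate sigma(k * x), evaluated in a numerically stable way."""
    z = k * x
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_step_grad(x: float, k: float = 5.0) -> float:
    """d/dx sigma(k * x) = k * s * (1 - s); decays exponentially in |x|."""
    s = sigmoid_step(x, k)
    return k * s * (1.0 - s)


def arctan_step(x: float, k: float = 5.0) -> float:
    """Illustrative heavier-tailed surrogate (NOT the paper's HTAF)."""
    return 0.5 + math.atan(k * x) / math.pi


def arctan_step_grad(x: float, k: float = 5.0) -> float:
    """d/dx arctan_step = k / (pi * (1 + (k * x)^2)); decays polynomially in |x|."""
    return k / (math.pi * (1.0 + (k * x) ** 2))


if __name__ == "__main__":
    # Compare the gradient each surrogate passes back at and away from zero.
    for x in (0.0, 1.0, 4.0):
        print(
            f"x={x:4.1f}  step={heaviside(x):.0f}  "
            f"sigmoid_grad={sigmoid_step_grad(x):.3e}  "
            f"arctan_grad={arctan_step_grad(x):.3e}"
        )
```

Both surrogates pass through 0.5 at the origin and have a sizeable gradient there, but far from zero the sigmoid's gradient is orders of magnitude smaller than the arctan-based one. That gap is the kind of vanishing tail gradient a heavy-tailed surrogate is meant to avoid.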
| Comments: | 32 pages |
| Subjects: | Machine Learning (cs.LG); Machine Learning (stat.ML) |
| Cite as: | arXiv:2605.11558 [cs.LG] (or arXiv:2605.11558v1 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.11558 (arXiv-issued DOI via DataCite, pending registration) |
From: Seokhun Park
[v1] Tue, 12 May 2026 05:41:36 UTC (5,699 KB)