

















Abstract: We observe that existing model interpretation methods generally ignore the baseline, and that this neglect often results in imprecise or even incorrect interpretations. In this paper, we reformulate the task of model interpretation and the principles for interpretation results to demonstrate the importance of the baseline. We further unify gradient-based methods, Integrated Gradients (IG) methods, and Taylor expansion, clarifying the connections among them and explicitly identifying the baseline for each method. On this basis, we analyze the flaws and errors in related model interpretation methods (IG, LayerCAM, ODAM, Difference Map). We advocate evaluating the quality of model interpretation results precisely through the attribution error between the attribution result and the attribution target, rather than adopting flawed evaluation methods, such as those based on marginal effects or the assumption of perfect model performance. We revise IG and develop a model interpretation method with a clear and reasonable baseline, achieving better results. Our method supports model interpretation based on features from any layer. Interpretations based on features from different layers are all reasonable, and the differences among them reflect the varying degrees of feature extraction at different stages.
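To make the role of the baseline concrete, the following is a minimal sketch of standard Integrated Gradients, not the revised method proposed in the paper. The toy function `f`, its analytic gradient `grad_f`, and the two baselines are hypothetical choices made purely for illustration. The sketch shows that the attributions sum to `f(x) - f(baseline)`, so a different baseline means a different attribution target.

```python
import numpy as np

def f(x):
    # Hypothetical toy "model": a smooth scalar function, not taken from the paper.
    return float(np.sum(x ** 2) + x[0] * x[1])

def grad_f(x):
    # Analytic gradient of the toy function above.
    g = 2.0 * x
    g[0] += x[1]
    g[1] += x[0]
    return g

def integrated_gradients(x, baseline, steps=200):
    # Standard IG: (x - baseline) times the path-averaged gradient along the
    # straight line from baseline to x, approximated with a midpoint Riemann sum.
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.zeros_like(x)
    for a in alphas:
        avg_grad += grad_f(baseline + a * (x - baseline))
    avg_grad /= steps
    return (x - baseline) * avg_grad

x = np.array([1.0, 2.0, -1.0])
zero_baseline = np.zeros_like(x)            # assumed baseline 1: all zeros
mean_baseline = np.full_like(x, x.mean())   # assumed baseline 2: constant mean value

attr_zero = integrated_gradients(x, zero_baseline)
attr_mean = integrated_gradients(x, mean_baseline)

# Each attribution explains the output difference relative to its own baseline.
print("zero baseline:", attr_zero, "sum =", attr_zero.sum(), "target =", f(x) - f(zero_baseline))
print("mean baseline:", attr_mean, "sum =", attr_mean.sum(), "target =", f(x) - f(mean_baseline))
```

Comparing the two printed attribution vectors shows that the same input receives different per-feature scores under different baselines, which is why an attribution without a stated baseline is ambiguous.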
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE) |
| Cite as: | arXiv:2605.22417 [cs.CV] |
| | (or arXiv:2605.22417v3 [cs.CV] for this version) |
| | https://doi.org/10.48550/arXiv.2605.22417 (arXiv-issued DOI via DataCite) |
From: Yongjin Cui
[v1] Thu, 21 May 2026 12:40:11 UTC (34,942 KB)
[v2] Mon, 25 May 2026 13:56:01 UTC (34,942 KB)
[v3] Tue, 26 May 2026 15:47:46 UTC (34,942 KB)