
























Authors: Jianzhou Yao (1 and 2), Anxiong Song (1 and 2), Katja Baerenfaller (1 and 3), Damir Zhakparov (1 and 3) ((1) Swiss Institute of Allergy and Asthma Research, Davos, Switzerland, (2) ETH Zurich, Zurich, Switzerland, (3) Swiss Institute of Bioinformatics, Lausanne, Switzerland)
Abstract: Deep allergenicity classifiers are increasingly used in safety screening of novel foods, and recent protein language models have substantially improved protein-level allergenicity prediction. However, it remains unclear whether their explanations capture biologically meaningful information. We introduce an epitope-grounded residue-level benchmark for quantitatively evaluating attribution faithfulness in protein allergenicity models. Across frozen ESM-2, multi-task ESM-2, and DeepPlantAllergy, protein-level classification was robust, yet classification-head explanation signals did not align with annotated epitopes significantly better than random at the residue level, as measured by AUROC, AUPRC, and Precision@k. Integrated Gradients identified residues that were functionally important to the model, but these residues did not overlap annotated epitopes. Saturation mutagenesis further suggested that the classifiers may rely on physicochemical and compositional sequence features rather than epitope-specific mechanisms. Residue-level importance signals should therefore not be interpreted as immunological explanations for safety screening or hypoallergen design without quantitative validation. Code available: this https URL
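To make the residue-level comparison concrete, the following is a minimal sketch of one of the metrics named in the abstract, Precision@k, together with a shuffled-score baseline. It is not the authors' code: the function names `precision_at_k` and `permutation_p_value`, the toy scores, and the use of a permutation test as the "random" reference are all assumptions made for illustration.

```python
import random


def precision_at_k(scores, epitope_mask, k):
    """Fraction of the k highest-scoring residues that lie in an annotated epitope.

    scores       -- one attribution score per residue (hypothetical input)
    epitope_mask -- 1 if the residue is in an annotated epitope, else 0
    """
    if len(scores) != len(epitope_mask):
        raise ValueError("scores and epitope_mask must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the sequence length")
    # Indices of the k residues with the largest attribution scores.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(epitope_mask[i] for i in top) / k


def permutation_p_value(scores, epitope_mask, k, n_perm=1000, seed=0):
    """Compare the observed Precision@k with scores shuffled across residues.

    Assumption: shuffling scores within one protein stands in for the
    random baseline; the paper may define its baseline differently.
    """
    rng = random.Random(seed)
    observed = precision_at_k(scores, epitope_mask, k)
    shuffled = list(scores)
    at_least_as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if precision_at_k(shuffled, epitope_mask, k) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (at_least_as_good + 1) / (n_perm + 1)


# Toy example: six residues, residues 1 and 2 annotated as an epitope.
toy_scores = [0.1, 0.9, 0.8, 0.2, 0.05, 0.7]
toy_mask = [0, 1, 1, 0, 0, 0]
observed, p_value = permutation_p_value(toy_scores, toy_mask, k=2)
```

A large p-value from such a test would correspond to the abstract's finding that attribution scores do not align with epitopes better than random; AUROC and AUPRC could be checked against the same shuffled baseline.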
From: Jianzhou Yao
[v1] Sat, 20 Jun 2026 18:25:02 UTC (621 KB)