
























Abstract: Continual learning (CL) methods based on pre-trained models (PTMs), which adapt to successive downstream tasks without catastrophic forgetting, have recently gained attention. These methods typically refrain from updating the pre-trained parameters and instead employ additional adapters, prompts, and classifiers. In this paper, we investigate the benefit of sparse orthogonal parameters for continual learning from a novel perspective. We find that merging the sparse orthogonal parameters of models learned from multiple streaming tasks has great potential for addressing catastrophic forgetting. Leveraging this insight, we propose a novel yet effective method called SoTU (Sparse Orthogonal Parameters TUning). We hypothesize that the effectiveness of SoTU lies in transforming the knowledge learned from multiple domains into a fusion of orthogonal delta parameters. Experimental evaluations on diverse CL benchmarks demonstrate the effectiveness of the proposed approach. Notably, SoTU achieves optimal feature representation for streaming data without requiring complex classifier designs, making it a plug-and-play solution.
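As an illustration of the idea stated in the abstract, the following is a minimal sketch of merging sparse delta parameters. It is not the authors' implementation: the abstract does not say how sparsity is obtained, so random masking with a keep probability is an assumption here, and every function name (`delta`, `sparsify`, `cosine`, `merge`, `demo`) is hypothetical. Parameters are modeled as flat Python lists rather than real network weights.

```python
import random


def delta(finetuned, pretrained):
    """Delta parameters: what fine-tuning changed relative to the pre-trained weights."""
    return [f - p for f, p in zip(finetuned, pretrained)]


def sparsify(d, keep_prob, rng):
    """Keep each entry with probability keep_prob and zero the rest.

    Random masking is an assumption of this sketch, not something the abstract specifies.
    """
    return [x if rng.random() < keep_prob else 0.0 for x in d]


def cosine(a, b):
    """Cosine similarity; values near 0 mean the two vectors are close to orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge(pretrained, sparse_deltas):
    """Fuse the tasks by adding every sparse delta onto the pre-trained weights."""
    merged = list(pretrained)
    for d in sparse_deltas:
        merged = [m + x for m, x in zip(merged, d)]
    return merged


def demo(n=10000, keep_prob=0.1, seed=0):
    """Compare the overlap of two task deltas before and after sparsification."""
    rng = random.Random(seed)
    pretrained = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Two synthetic "tasks" whose dense deltas share a common direction.
    shared = [rng.gauss(0.0, 1.0) for _ in range(n)]
    task_a = [p + s + rng.gauss(0.0, 0.5) for p, s in zip(pretrained, shared)]
    task_b = [p + s + rng.gauss(0.0, 0.5) for p, s in zip(pretrained, shared)]
    d_a, d_b = delta(task_a, pretrained), delta(task_b, pretrained)
    s_a = sparsify(d_a, keep_prob, rng)
    s_b = sparsify(d_b, keep_prob, rng)
    merged = merge(pretrained, [s_a, s_b])
    return cosine(d_a, d_b), cosine(s_a, s_b), merged


if __name__ == "__main__":
    dense_cos, sparse_cos, _ = demo()
    print(f"dense cosine:  {dense_cos:.3f}")
    print(f"sparse cosine: {sparse_cos:.3f}")
```

In this toy setting the two dense deltas overlap heavily, while their independently masked versions touch mostly different coordinates and so have a much smaller cosine similarity. That is the sense in which highly sparse deltas can be summed with little interference between tasks.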
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2411.02813 [cs.LG] (or arXiv:2411.02813v3 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2411.02813 (arXiv-issued DOI via DataCite) |
From: Kun-Peng Ning
[v1] Tue, 5 Nov 2024 05:19:09 UTC (12,553 KB)
[v2] Wed, 16 Jul 2025 15:39:50 UTC (2,536 KB)
[v3] Thu, 21 May 2026 06:49:45 UTC (2,534 KB)