





















Abstract: Large neural networks achieve state-of-the-art performance on many tasks, yet their sheer size hinders deployment on resource-constrained devices. Among existing compression approaches, cross-layer parameter sharing remains relatively unexplored for transformer models. In this paper, we introduce Fine-grained Parameter Sharing (FiPS), a unified framework for compressing transformer Multi-Layer Perceptrons (MLPs) that combines cross-block parameter sharing, low-rank factorization, and sparsity in a single optimization. FiPS concatenates MLP weight matrices across a group of transformer blocks and factorizes them into a shared basis and sparse, layer-specific projection matrices. Both factors are initialized via singular value decomposition (SVD) and jointly optimized by block-wise reconstruction error minimization. FiPS compresses Vision Transformers (ViTs) by up to 33% with less than 1% top-1 accuracy loss on ImageNet-1k, and by up to 57% when combined with fine-tuning. It also compresses Large Language Models (LLMs) by up to 20% while outperforming existing SVD-based methods in perplexity and downstream benchmarks at matched compression. Combined with Quantization-Aware Training (QAT), 3-bit FiPS on Gemma-2-2B achieves lower perplexity than 2-bit QAT alone at the same 8x compression. These results establish fine-grained parameter sharing as a practical and effective approach for transformer MLP compression.
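The initialization step described in the abstract (concatenate MLP weights across blocks, factorize by SVD into a shared basis and per-block projections, then sparsify the projections) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `shared_basis_init`, the choice to concatenate along the hidden dimension, and the use of global magnitude pruning to obtain sparsity are all assumptions made for illustration, and the joint reconstruction-error optimization that follows initialization is omitted.

```python
import numpy as np


def shared_basis_init(weights, rank, keep_fraction):
    """Hypothetical sketch of an SVD-based shared-basis initialization.

    weights: list of (d_model, d_hidden) arrays, one per transformer block.
             All blocks are assumed to have the same shape.
    rank: number of shared basis vectors to keep.
    keep_fraction: fraction of projection entries kept non-zero (assumption:
                   sparsity is obtained by global magnitude pruning).
    """
    # Concatenate along the hidden dimension: (d_model, n_blocks * d_hidden).
    stacked = np.concatenate(weights, axis=1)

    # Truncated SVD: stacked ~= (U_r * S_r) @ Vt_r.
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    basis = u[:, :rank] * s[:rank]   # shared basis, shape (d_model, rank)
    proj = vt[:rank, :]              # projections, shape (rank, n_blocks * d_hidden)

    # Keep the k largest-magnitude projection entries and zero out the rest.
    k = max(1, int(round(keep_fraction * proj.size)))
    cut = proj.size - k
    threshold = np.partition(np.abs(proj).ravel(), cut)[cut]
    sparse_proj = proj * (np.abs(proj) >= threshold)

    # Split back into one projection matrix per block.
    per_block = np.split(sparse_proj, len(weights), axis=1)
    return basis, per_block


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blocks = [rng.standard_normal((8, 16)) for _ in range(4)]
    basis, projections = shared_basis_init(blocks, rank=4, keep_fraction=0.5)
    # Each block's weight is approximated by the shared basis times its own projection.
    error = np.linalg.norm(blocks[0] - basis @ projections[0])
    print(basis.shape, projections[0].shape, float(error))
```

In this sketch every block reuses the same `basis`, so only the sparse per-block projections grow with the number of blocks in the group; that is the intuition behind sharing parameters across blocks.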
| Comments: | Accepted as is to Transactions on Machine Learning Research (TMLR), 2026. OpenReview: this https URL |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2411.09816 [cs.LG] |
| | (or arXiv:2411.09816v4 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2411.09816 (arXiv-issued DOI via DataCite) |
From: Cem Üyük
[v1] Thu, 14 Nov 2024 21:29:58 UTC (1,159 KB)
[v2] Sun, 15 Dec 2024 12:12:48 UTC (1,197 KB)
[v3] Sun, 23 Feb 2025 20:02:33 UTC (2,020 KB)
[v4] Sat, 23 May 2026 13:05:26 UTC (1,455 KB)