





















Abstract: Post-training model compression is essential for enhancing the portability of Large Language Models (LLMs) while preserving their performance. While several compression approaches have been proposed, less emphasis has been placed on selecting the most suitable set of data (the so-called calibration data) for finding the compressed model configuration. The choice of calibration data is critical for preserving model capabilities both within and across tasks. In this work, we address the challenge of identifying high-performance calibration sets for both pruning and quantization by analyzing intrinsic data properties rather than model-specific signals. We introduce ZipCal, a model-agnostic data curation strategy that maximizes lexical diversity based on Zipfian power laws. Experiments demonstrate that our method consistently outperforms standard uniform random sampling across various pruning benchmarks. Notably, it also performs on par, in terms of downstream performance, with a state-of-the-art method that relies on model perplexity. The latter becomes prohibitively expensive for large models and datasets, while ZipCal is on average ~240× faster due to its tractable linear complexity. (We make the code and the experiments available at this https URL.)
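The abstract does not spell out the selection procedure, so the following is only a minimal illustrative sketch of what "maximizing lexical diversity" in a calibration set could look like. It is not the ZipCal algorithm: the whitespace tokenizer, the greedy coverage criterion, and the function names `tokenize` and `select_calibration` are all assumptions made for illustration. This greedy loop also costs roughly O(n·k) set operations, so it does not reproduce the linear complexity claimed for ZipCal.

```python
def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for a real tokenizer.
    return text.lower().split()


def select_calibration(samples, k):
    """Greedily pick up to k sample indices, each adding the most unseen word types.

    Hypothetical helper for illustration only; not the paper's method.
    """
    token_sets = [set(tokenize(s)) for s in samples]
    seen = set()                       # word types covered so far
    remaining = set(range(len(samples)))
    chosen = []
    for _ in range(min(k, len(samples))):
        # Iterate in index order so ties resolve to the lowest index.
        best = max(sorted(remaining), key=lambda i: len(token_sets[i] - seen))
        chosen.append(best)
        seen |= token_sets[best]
        remaining.remove(best)
    return chosen


corpus = ["the cat sat", "the cat sat", "a dog ran fast", "the the the"]
picked = select_calibration(corpus, 2)
```

On this toy corpus the duplicate sentence and the repetitive one are passed over in favor of samples that contribute new vocabulary, which is the intuition behind preferring lexically diverse calibration data over uniform random sampling.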
| Comments: | Added statistical analysis, mechanistic analysis and a comparison with a generative baseline. 22 pages |
| Subjects: | Computation and Language (cs.CL); Artificial Intelligence (cs.AI) |
| ACM classes: | I.2.7 |
| Cite as: | arXiv:2603.16105 [cs.CL] (or arXiv:2603.16105v3 [cs.CL] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2603.16105 (arXiv-issued DOI via DataCite) |
From: Francesco Pio Monaco
[v1] Tue, 17 Mar 2026 04:12:08 UTC (157 KB)
[v2] Tue, 7 Apr 2026 13:53:12 UTC (164 KB)
[v3] Mon, 25 May 2026 08:42:21 UTC (174 KB)