


























Jiaming Liu, University of California, Santa Barbara
Vikas Kalagi, University of California, Santa Barbara
Divyakant Agrawal, University of California, Santa Barbara
Amr El Abbadi, University of California, Santa Barbara
A promising direction for enabling private queries to large language models (LLMs) is homomorphic encryption (HE), but performing token sampling under HE remains an open problem. In this paper, we introduce Hyperion, an efficient HE algorithm for inverse transform sampling that enables private token sampling with a comparison depth of 1, $O(1)$ amortized comparisons, and $O(\log n)$ rotations. We implement our approach and demonstrate that it samples tokens in 0.14 seconds for 32k tokens ($\approx 4.4\, \mu\mathrm{s}$ per token) on GPU, achieving a $100\times$ latency improvement over prior work.
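For readers unfamiliar with inverse transform sampling, the following is a minimal plaintext sketch in Python. It is an illustration only, not the Hyperion algorithm: nothing here is encrypted, and the function names `inverse_transform_sample` and `sample_token` are hypothetical. The sampled index is written as a count of independent comparisons between a uniform draw and the cumulative distribution, which is one plaintext way to see why no comparison needs to wait on the result of another; whether Hyperion structures its computation this way is an assumption the abstract does not confirm.

```python
import random
from itertools import accumulate


def inverse_transform_sample(probs, u):
    """Map a uniform draw u in [0, 1) to a token index.

    probs: token probabilities that sum to 1 (plaintext, illustrative).
    Returns the smallest index i whose cumulative probability exceeds u.
    """
    # Running sum of the probabilities, i.e. the discrete CDF.
    cdf = list(accumulate(probs))
    # Every comparison (c <= u) is independent of the others, so the index
    # is simply the number of CDF entries that do not exceed u.
    count = sum(1 for c in cdf if c <= u)
    # Clamp to guard against floating-point rounding in the last CDF entry.
    return min(count, len(probs) - 1)


def sample_token(probs, rng=random):
    """Draw one token index using a fresh uniform sample from rng."""
    return inverse_transform_sample(probs, rng.random())
```

With `probs = [0.5, 0.25, 0.25]`, a draw of `u = 0.6` falls in the second interval of the CDF `[0.5, 0.75, 1.0]`, so the sketch selects index 1.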
BibTeX
@misc{cryptoeprint:2025/2318,
author = {Lawrence Lim and Jiaming Liu and Vikas Kalagi and Divyakant Agrawal and Amr El Abbadi},
title = {Hyperion: Private Token Sampling with Homomorphic Encryption},
howpublished = {Cryptology {ePrint} Archive, Paper 2025/2318},
year = {2025},
url = {https://eprint.iacr.org/2025/2318}
}