Fractured Chain-of-Thought Reasoning
Baohao Liao, Hanze Dong, Yuhui Xu, Doyen Sahoo, Christof Monz, J · 2025-05-19 · via cs.AI updates on arXiv.org

Inference-time scaling techniques have significantly bolstered the reasoning capabilities of large language models (LLMs) by harnessing additional computational effort at inference without retraining. Similarly, Chain-of-Thought (CoT) prompting and its extension, Long CoT, improve accuracy by generating rich intermediate reasoning trajectories, but these approaches incur substantial token costs that impede their deployment in latency-sensitive settings. In this work, we first show that truncated CoT, which stops reasoning before completion and directly generates the final answer, often matches the accuracy of full CoT sampling while using dramatically fewer tokens. Building on this insight, we introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling along three orthogonal axes: (1) the number of reasoning trajectories, (2) the number of final solutions per trajectory, and (3) the depth at which reasoning traces are truncated. Through extensive experiments on five diverse reasoning benchmarks and several model scales, we demonstrate that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget. Our analysis reveals how to allocate computation across these dimensions to maximize performance, paving the way for more efficient and scalable LLM reasoning. Code is available at https://github.com/BaohaoLiao/frac-cot.
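
The three sampling axes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it only shows how candidates might be enumerated over trajectories, truncation depths, and solutions per truncated trace. The names `fractured_sampling`, `think`, `answer`, `toy_think`, and `toy_answer` are hypothetical, and the toy functions stand in for real model calls so that the example runs without an LLM.

```python
import random
from typing import Callable, List


def fractured_sampling(
    prompt: str,
    think: Callable[[str, random.Random], List[str]],
    answer: Callable[[str, List[str], random.Random], str],
    n_trajectories: int,
    n_depths: int,
    m_solutions: int,
    seed: int = 0,
) -> List[str]:
    """Enumerate candidate answers over the three axes (illustrative only).

    think  : hypothetical stand-in for sampling one reasoning trace (list of steps)
    answer : hypothetical stand-in for sampling a final answer from a trace prefix
    """
    rng = random.Random(seed)
    candidates: List[str] = []
    for _ in range(n_trajectories):              # axis 1: reasoning trajectories
        steps = think(prompt, rng)
        for h in range(1, n_depths + 1):         # axis 3: truncation depth
            # Evenly spaced cut points; the last one keeps the full trace.
            cut = max(1, len(steps) * h // n_depths)
            prefix = steps[:cut]
            for _ in range(m_solutions):         # axis 2: solutions per trajectory
                candidates.append(answer(prompt, prefix, rng))
    return candidates


def pass_at_k(candidates: List[str], is_correct: Callable[[str], bool]) -> bool:
    """True if at least one candidate is correct."""
    return any(is_correct(c) for c in candidates)


# Toy stand-ins (assumptions, not a real model): the "reasoning" is a list of
# running sums, and the "answer" is exact only when the trace is complete.
def toy_think(prompt: str, rng: random.Random) -> List[str]:
    total, steps = 0, []
    for term in prompt.split("+"):
        total += int(term)
        steps.append(str(total))
    return steps


def toy_answer(prompt: str, prefix: List[str], rng: random.Random) -> str:
    if len(prefix) == len(prompt.split("+")):
        return prefix[-1]                        # full trace: exact result
    return str(int(prefix[-1]) + rng.randint(0, 9))  # truncated trace: a guess


if __name__ == "__main__":
    cands = fractured_sampling("1+2+3+4", toy_think, toy_answer,
                               n_trajectories=3, n_depths=2, m_solutions=2)
    print(len(cands), pass_at_k(cands, lambda c: c == "10"))
```

In this sketch the number of candidates is the product of the three settings, so a fixed token budget can be spent on more trajectories, more truncation points, or more answers per truncated trace. How to split that budget is the allocation question the abstract refers to.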