Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
Tianyu Fu, Yichen You, Zekai Chen, Guohao Dai, Huazhong Yang, Yu · 2025-11-12 · via cs.LG updates on arXiv.org

Improving the reasoning abilities of Large Language Models (LLMs), especially under parameter constraints, is crucial for real-world applications. Looped transformers address this by performing multiple latent iterations to refine each token beyond a single forward pass. However, we identify a latent overthinking phenomenon: most token predictions are already correct after the first pass, but are sometimes revised into errors in later iterations. We ask whether selectively skipping latent iterations can improve accuracy, and reveal significant potential with an oracle iteration policy that boosts performance by up to 7.3%. Motivated by this, we propose Think-at-Hard (TaH), a looped transformer optimized for selective iteration. TaH employs a lightweight neural decider to trigger latent iterations only at tokens likely to be incorrect after the standard forward pass. During latent iterations, depth-aware Low-Rank Adaptation (LoRA) modules shift the objective from general next-token prediction to focused hard-token refinement. A duo-causal attention mechanism extends attention from the token sequence dimension to an additional iteration depth dimension, enabling cross-iteration information flow with full sequential parallelism. Experiments on nine benchmarks show consistent gains across math, QA, and coding tasks. With identical parameter counts, TaH outperforms always-iterate baselines by 3.8-4.4% while skipping iterations on 93% of tokens, and exceeds single-iteration Qwen3 baselines by 3.0-3.8%. When allowing <3% additional parameters for the LoRA modules and the decider, the gains increase further to 5.3-6.2% and 6.1-6.8% over the same two baselines, respectively. Our code is available at https://github.com/thu-nics/TaH.
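
To make the selective-iteration idea concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation (see the repository linked above for that). Scalar "states" stand in for hidden vectors, and `decider`, `refine`, `threshold`, and `max_depth` are hypothetical names introduced only for illustration. The `duo_causal_visible` predicate encodes one plausible reading of duo-causal attention (a query sees keys at earlier-or-equal positions and shallower-or-equal iteration depths); the abstract does not spell out the exact masking rule, so treat it as an assumption.

```python
from typing import Callable, List, Sequence, Tuple


def selective_iterate(
    hidden: Sequence[float],
    decider: Callable[[float], float],
    refine: Callable[[float, int], float],
    threshold: float = 0.5,
    max_depth: int = 2,
) -> Tuple[List[float], List[int]]:
    """Refine only the positions that a decider flags as hard.

    hidden  : toy per-token states after the standard forward pass
    decider : hypothetical scorer; higher means "likely wrong, iterate again"
    refine  : hypothetical latent iteration taking (state, depth)
    Returns the final states and the iteration depth used per position.
    """
    states = list(hidden)
    depths = [1] * len(states)  # every token gets the standard first pass
    for i in range(len(states)):
        depth = 1
        # Iterate again only while the decider score exceeds the threshold.
        while depth < max_depth and decider(states[i]) > threshold:
            depth += 1
            states[i] = refine(states[i], depth)
        depths[i] = depth
    return states, depths


def duo_causal_visible(q_pos: int, q_depth: int, k_pos: int, k_depth: int) -> bool:
    """Assumed duo-causal rule: causal in both position and iteration depth."""
    return k_pos <= q_pos and k_depth <= q_depth


if __name__ == "__main__":
    # Toy confidences: low values mimic tokens that are probably wrong.
    hidden = [0.9, 0.2, 0.8, 0.95]
    states, depths = selective_iterate(
        hidden,
        decider=lambda h: 1.0 - h,               # hypothetical: low confidence -> high score
        refine=lambda h, d: min(1.0, h + 0.5),   # hypothetical refinement step
    )
    print(depths)  # [1, 2, 1, 1]: only the second token is iterated
```

In this toy run only one of four tokens receives a second iteration, which mirrors the abstract's point that most tokens can skip the extra latent pass.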