Abstract
Agentic search requires large language models (LLMs) to perform multi-step search to solve complex information-seeking tasks, imposing unique challenges on their reasoning capabilities. However, what constitutes effective reasoning for agentic search and how it can be learned remain unclear. In this work, we first investigate the reasoning behaviors that enable success in agentic search. By comparing successful and failed trajectories via an LLM-based analysis pipeline, we identify four beneficial behaviors: Information Verification, Authority Evaluation, Adaptive Search, and Error Recovery. Building on this, we propose Behavior Priming, a training approach that equips agentic search models with these reasoning behaviors before reinforcement learning (RL). Specifically, it collects trajectories that exhibit the identified behaviors for supervised fine-tuning (SFT), and then applies standard RL to further improve task performance. Experiments on Qwen3-1.7B and Llama3.2-3B-Instruct show that Behavior Priming yields relative improvements of 37.2% on three web benchmarks and 6.2% on seven multi-hop QA benchmarks over direct RL, and outperforms an SFT-then-RL baseline that uses outcome-correct trajectories for fine-tuning. Crucially, we show that these reasoning behaviors matter more than outcome correctness in the priming stage prior to RL. Further analysis reveals that Behavior Priming enhances exploration (pass@8) and test-time scaling (number of search steps), providing a robust foundation for RL.
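The contrast the abstract draws between behavior-based and outcome-based selection of SFT data can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the paper: the `Trajectory` type, the label strings, and the rule that a trajectory must show all four behaviors are assumptions made only for illustration. The behavior labels are assumed to come from some upstream annotator, such as an LLM-based judge.

```python
from dataclasses import dataclass, field

# Hypothetical label strings for the four behaviors named in the abstract.
BEHAVIORS = frozenset({
    "information_verification",
    "authority_evaluation",
    "adaptive_search",
    "error_recovery",
})


@dataclass
class Trajectory:
    """One agentic-search rollout (illustrative structure, not the paper's)."""
    question: str
    steps: list[str]
    # Labels assumed to be assigned upstream, e.g. by an LLM-based judge.
    behaviors: set[str] = field(default_factory=set)
    is_correct: bool = False


def select_by_behavior(trajectories, required=BEHAVIORS):
    """Keep trajectories showing every required behavior, ignoring correctness.

    Requiring all four behaviors is an assumption of this sketch.
    """
    return [t for t in trajectories if required <= t.behaviors]


def select_by_outcome(trajectories):
    """Baseline selection: keep trajectories whose final answer is correct."""
    return [t for t in trajectories if t.is_correct]


pool = [
    Trajectory("q1", ["search", "answer"], set(BEHAVIORS), is_correct=False),
    Trajectory("q2", ["search", "answer"], {"adaptive_search"}, is_correct=True),
    Trajectory("q3", ["search", "verify", "answer"], set(BEHAVIORS), is_correct=True),
]

behavior_sft_set = select_by_behavior(pool)  # q1 and q3
outcome_sft_set = select_by_outcome(pool)    # q2 and q3
```

In this toy pool, the behavior-based set keeps `q1` even though its answer is wrong, while the outcome-based set keeps `q2` even though it shows only one behavior. Either set would then be used for SFT before RL.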
- Anthology ID: 2026.findings-acl.1400
- Volume: Findings of the Association for Computational Linguistics: ACL 2026
- Month: July
- Year: 2026
- Address: San Diego, California, United States
- Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 28080–28097
- URL: https://aclanthology.org/2026.findings-acl.1400/
- Cite (ACL): Jiahe Jin, Abhijay Sai Paladugu, and Chenyan Xiong. 2026. Beneficial Reasoning Behaviors in Agentic Search and Effective Training Methods to Obtain Them. In Findings of the Association for Computational Linguistics: ACL 2026, pages 28080–28097, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal): Beneficial Reasoning Behaviors in Agentic Search and Effective Training Methods to Obtain Them (Jin et al., Findings 2026)
- PDF: https://aclanthology.org/2026.findings-acl.1400.pdf
- Checklist: 2026.findings-acl.1400.checklist.pdf





















