SmartSearch: Process Reward-Guided Query Refinement for Search Agents

ruc.edu.cn | January 28, 2026

Search agents based on large language models (LLMs) have become an important research direction for knowledge-intensive tasks, which they solve by combining reasoning with information retrieval. Recent studies have improved these agents' reasoning abilities through prompt engineering and fine-tuning, but the quality of the intermediate search queries generated during multi-step reasoning has received limited attention. In practice, low-quality queries often return irrelevant or misleading results, causing the search process to drift from its intended goal and significantly limiting overall performance.

To address this issue, researchers propose SmartSearch, a framework that explicitly optimizes intermediate query quality through process reward guidance, thereby enhancing the information acquisition capability of search agents.

Overview of the two key mechanisms in SmartSearch: (a) process rewards and (b) query optimization.

SmartSearch consists of two key mechanisms. First, it introduces process rewards with dual evaluation, providing fine-grained supervision signals for assessing the quality of each intermediate query. This enables the model to identify whether a query is informative, precise, and aligned with the search goal. Second, SmartSearch adopts selective query optimization, which refines only the low-quality queries and regenerates the subsequent search steps from the optimized queries, effectively steering the search process back on track.
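
To make the two mechanisms concrete, the following is a minimal Python sketch and not the SmartSearch implementation. It assumes that dual evaluation can be modeled as two independent scorers combined with `min`, and that selective optimization refines the first query scoring below a threshold and then regenerates the remaining steps. All names here (`dual_score`, `refine_trajectory`, the toy scorers, the threshold of 0.5) are hypothetical; in a real system the scorers, `refine`, and `regenerate` would be model calls.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]  # (query, goal) -> score in [0, 1]


def dual_score(query: str, goal: str, a: Scorer, b: Scorer) -> float:
    # Assumption: a query must pass both checks, so take the weaker score.
    return min(a(query, goal), b(query, goal))


def refine_trajectory(
    goal: str,
    queries: List[str],
    a: Scorer,
    b: Scorer,
    refine: Callable[[str, str], str],
    regenerate: Callable[[str, List[str]], List[str]],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    kept: List[Tuple[str, float]] = []
    for query in queries:
        score = dual_score(query, goal, a, b)
        if score >= threshold:
            kept.append((query, score))  # good query: keep unchanged
            continue
        # Low-quality query: refine it, then rebuild the later steps
        # from the optimized query instead of reusing the old ones.
        improved = refine(query, goal)
        kept.append((improved, dual_score(improved, goal, a, b)))
        history = [q for q, _ in kept]
        for new_query in regenerate(goal, history):
            kept.append((new_query, dual_score(new_query, goal, a, b)))
        break
    return kept


# Toy stand-ins for what would be model calls (illustrative only).
def overlap(query: str, goal: str) -> float:
    q, g = set(query.lower().split()), set(goal.lower().split())
    return len(q & g) / len(q) if q else 0.0


def specificity(query: str, goal: str) -> float:
    return min(len(query.split()) / 3, 1.0)


def toy_refine(query: str, goal: str) -> str:
    return f"{query} {goal}"


def toy_regenerate(goal: str, history: List[str]) -> List[str]:
    return ["Marie Curie birthplace city"]


result = refine_trajectory(
    "birthplace of Marie Curie",
    ["Marie Curie birthplace", "famous scientists", "Marie Curie awards"],
    overlap,
    specificity,
    toy_refine,
    toy_regenerate,
)
for query, score in result:
    print(f"{score:.2f}  {query}")
```

In this toy run the first query is kept as is, the off-topic second query is refined, and the original third query is discarded in favor of a regenerated one, which mirrors the idea of leaving good steps alone and repairing only the step where the search drifts.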

Building on these mechanisms, SmartSearch further proposes a three-stage, query-oriented curriculum learning framework, guiding the search agent through imitation, alignment, and generalization stages. Through this progressive training strategy, the agent gradually internalizes the ability to generate high-quality queries under the guidance of process rewards and applies this capability to increasingly complex and unseen scenarios.

Extensive experiments on six challenging benchmarks, including four knowledge-intensive tasks and two web exploration tasks, demonstrate that SmartSearch consistently outperforms existing baseline methods. Additional ablation studies and quantitative analyses further confirm its advantages in search efficiency, query quality, and robustness. Notably, SmartSearch shows strong generalization performance in open web environments.

Copyright © Renmin University of China. All rights reserved. Presented by China Daily.
