Introducing ZeroSearch: an open-source method for training search-augmented LLMs without a live search engine

Alibaba researchers have introduced ZeroSearch, a method for training LLMs to answer questions with search without calling a real search API during training. It replaces live search with an LLM-based document simulator, and the policy model learns multi-step question answering against that simulator. A 14B model trained this way outperforms baselines powered by Google Search on standard QA benchmarks.

Training without external tools

ZeroSearch removes the need for search APIs or browser-based tools during model training.

Trains policy models using simulated documents instead of real search results

Avoids API costs and latency by removing search engine dependencies

Simulation uses prompted LLMs or fine-tuned 3B, 7B, or 14B models

Policy learns to reason through multi-step QA using only the simulated documents placed in its context, with no live retrieval
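
To make the training setup concrete, here is a minimal sketch of a rollout in which the policy's search calls go to a simulator instead of a real search engine. The tag format (`<search>`, `<information>`, `<answer>`), the `rollout` function, and the `policy` and `simulator` callables are assumptions for illustration; the article does not describe the actual interface.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical tag format: the article does not say how the policy
# marks its search queries or its final answer.
SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def rollout(
    question: str,
    policy: Callable[[str], str],
    simulator: Callable[[str], List[str]],
    max_turns: int = 4,
) -> Tuple[Optional[str], str]:
    """Run one multi-step QA episode against a simulated search engine.

    `policy` maps the current context to the next generation step.
    `simulator` maps a query to a list of generated documents.
    Returns (final answer or None, full interaction trace).
    """
    context = f"Question: {question}\n"
    for _ in range(max_turns):
        step = policy(context)
        context += step + "\n"

        # Stop as soon as the policy commits to a final answer.
        answer = ANSWER_RE.search(step)
        if answer:
            return answer.group(1).strip(), context

        # Otherwise, answer any search request with simulated documents.
        query = SEARCH_RE.search(step)
        if query:
            docs = simulator(query.group(1).strip())
            context += "<information>\n" + "\n".join(docs) + "\n</information>\n"

    # No answer within the turn budget.
    return None, context
```

Because the simulator is just a function from query to documents, swapping a prompted LLM for a fine-tuned 3B, 7B, or 14B simulator would not change the loop itself.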

Simulation engine setup

The simulation engine generates noisy retrieval documents from intermediate queries.

Generates up to 20 documents per query using language models

Varies document quality to simulate realistic and imperfect retrieval

Uses reward signals to rerank documents based on final answer quality

Fine-tunes separate 3B, 7B, and 14B models to serve as simulators
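
The prompted-LLM variant of the simulator can be sketched as follows. This is an illustration under stated assumptions, not the project's actual prompt: the prompt wording, the function names, the `llm` callable, and the one-document-per-line output format are all hypothetical. Only the 20-document cap and the idea of varying document quality come from the article.

```python
import random
from typing import Callable, List

MAX_DOCS = 20  # upper bound on documents per query, as stated in the article


def build_simulator_prompt(query: str, answer: str, useful: bool, num_docs: int) -> str:
    """Build a prompt asking an LLM to act as a search engine (wording is assumed)."""
    if useful:
        instruction = (
            "Write documents that are useful: they should contain the "
            "information needed to answer the query."
        )
    else:
        instruction = (
            "Write documents that are noisy: they should look relevant "
            "but omit or contradict the answer."
        )
    return (
        f"You are a search engine. Return {num_docs} short documents, one per line.\n"
        f"{instruction}\n"
        f"Query: {query}\n"
        f"Reference answer: {answer}\n"
    )


def simulate_search(
    query: str,
    answer: str,
    llm: Callable[[str], str],
    noise_prob: float,
    rng: random.Random,
    num_docs: int = 5,
) -> List[str]:
    """Generate simulated documents; with probability `noise_prob` they are noisy."""
    if not 1 <= num_docs <= MAX_DOCS:
        raise ValueError(f"num_docs must be between 1 and {MAX_DOCS}")
    useful = rng.random() >= noise_prob
    raw = llm(build_simulator_prompt(query, answer, useful, num_docs))
    # Assumes the LLM returns one document per non-empty line.
    docs = [line.strip() for line in raw.splitlines() if line.strip()]
    return docs[:num_docs]
```

Passing `noise_prob` in from the outside keeps document quality controllable per call, which is what lets a training loop adjust it over time.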

Performance across benchmarks

ZeroSearch models outperform retrieval-based and search-free baselines on QA tasks.

14B model beats Search-R1 (Google Search + policy) on Natural Questions

7B model matches Search-R1 while remaining fully search-free

3B model outperforms prior non-search baselines

Shows stronger generalization on TriviaQA and PopQA (out-of-domain benchmarks)

Learning with curriculum rollout

Training uses a curriculum that gradually increases retrieval noise.

Early stages use high-quality, relevant documents to stabilize learning

Later stages inject irrelevant or conflicting data to improve reasoning

Policy adapts to ambiguity and incomplete information

Reinforcement learning (PPO) improves final answer accuracy using delayed rewards
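
The curriculum and the delayed reward can be sketched together. The article does not give the noise schedule or the reward function, so everything here is an assumption: the exponential shape of the schedule, the default values of `p_start`, `p_end`, and `base`, and the use of token-level F1 as the final-answer reward.

```python
from collections import Counter
from typing import Optional


def noise_probability(
    step: int,
    total_steps: int,
    p_start: float = 0.0,
    p_end: float = 0.5,
    base: float = 4.0,
) -> float:
    """Probability of serving noisy documents at a given training step.

    Rises from p_start to p_end. With base > 1 the increase is slow
    early on and faster later, so early training sees mostly clean documents.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    frac = min(max(step / total_steps, 0.0), 1.0)
    if base == 1.0:
        growth = frac  # the exponential form degenerates to linear at base 1
    else:
        growth = (base ** frac - 1.0) / (base - 1.0)
    return p_start + growth * (p_end - p_start)


def final_answer_reward(prediction: Optional[str], gold: str) -> float:
    """Delayed reward: token-level F1 between the final answer and the gold answer.

    Intermediate search steps get no reward of their own; only the
    answer at the end of the episode is scored.
    """
    if not prediction:
        return 0.0
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

In a training loop, `noise_probability(step, total_steps)` would set how often the simulator is asked for noisy documents at that step, and `final_answer_reward` would supply the scalar that an RL algorithm such as PPO optimizes.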

Access and availability

GitHub repo includes training code, simulator prompts, and evaluation scripts

Models trained with ZeroSearch are not yet publicly released
