
Baidu Launches Multi-Agent AI Search to Master Complex Queries

DATE: 7/2/2025 · STATUS: LIVE

Search engines today fumble through conflicting facts and shifting context; a modular, multi-agent design points the way to smarter results…


As user queries become more complex and require layered reasoning, search systems have begun shifting away from simple keyword matching and ranking. Instead, engineers seek to build platforms that replicate human cognitive strategies for gathering and processing information. This shift reflects a new design philosophy, one that emphasizes collaboration among intelligent modules able to adapt to context and handle ambiguity. These next-generation frameworks aim to understand intent, maintain context across multi-turn queries, and adjust strategies based on intermediate results.

Despite progress, many Retrieval-Augmented Generation (RAG) pipelines remain brittle. They may answer direct questions but falter when sources conflict or when a task demands contextual inference. A seemingly simple query comparing the ages of historical figures, for instance, requires extracting birthdates from multiple records, performing a calculation, and synthesizing a coherent response. Standard pipelines often stop after one retrieval pass, leading to shallow or incomplete answers. They also lack mechanisms to resolve contradictory statements across documents, and may deliver fragmented or misleading responses when a question touches on diverse domains.
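For illustration only, here is a minimal sketch of that single-pass pattern over a toy in-memory corpus with naive keyword retrieval; the function and corpus names are invented for the example and do not reflect Baidu's pipeline. Because the answer is generated from one retrieval pass, the two extractions and the subtraction the question needs never happen.

```python
# A minimal, illustrative single-pass RAG loop over a toy corpus.
# retrieve() and answer() are hypothetical stand-ins, not Baidu's code.

CORPUS = [
    "Emperor Wu of Han (156 BC - 87 BC) ruled the Han dynasty for 54 years.",
    "Julius Caesar (100 BC - 44 BC) was a Roman general and statesman.",
]

def retrieve(query: str, top_k: int = 1) -> list[str]:
    """One keyword-overlap retrieval pass; no follow-up queries."""
    return sorted(
        CORPUS,
        key=lambda doc: sum(word.lower() in doc.lower() for word in query.split()),
        reverse=True,
    )[:top_k]

def answer(query: str) -> str:
    """Stand-in for generation: only the single-pass context is available,
    so the extract-both-lifespans-and-subtract step never happens."""
    return f"Based on: {retrieve(query)[0]}"

print(answer("How much longer did Emperor Wu of Han live than Julius Caesar?"))
```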

Tools such as Learning-to-Rank frameworks and advanced retrieval modules powered by Large Language Models (LLMs) have improved semantic search by integrating user behavior data and heuristic rules; these models draw on features like click logs, dwell time, and content embeddings to improve ranking accuracy in straightforward searches. Yet even enhanced RAG variants such as ReAct and RQ-RAG adhere to fixed logic flows. Their single-agent, one-shot document access model limits both recovery from tool failures and dynamic reconfiguration of subtasks.
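As a rough sketch of how those behavioral and semantic signals feed a ranker, the snippet below combines them with a hand-picked linear scorer; the field names and weights are invented for illustration and are not taken from any Baidu system.

```python
# Illustrative learning-to-rank feature vector; the weights are made up
# for the sketch, whereas a real LTR model would learn them from labeled data.

from dataclasses import dataclass

@dataclass
class RankingFeatures:
    click_through_rate: float    # from click logs
    avg_dwell_seconds: float     # from dwell-time logs
    embedding_similarity: float  # query-document cosine similarity

def ltr_score(f: RankingFeatures) -> float:
    return (
        0.5 * f.click_through_rate
        + 0.002 * f.avg_dwell_seconds
        + 0.4 * f.embedding_similarity
    )

print(ltr_score(RankingFeatures(0.12, 45.0, 0.83)))  # 0.482
```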

To address these constraints, researchers at Baidu have rolled out a multi-agent architecture dubbed the AI Search Paradigm. It breaks the search process into four specialized roles: the Master, which oversees the entire workflow; the Planner, which decomposes complex requests into sub-queries; the Executor, which runs retrieval or external tool calls; and the Writer, which merges intermediate outputs into a final answer. Each agent communicates and adapts to changes in real time.
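The division of labor can be pictured with a small structural sketch. The interfaces below are assumptions made for illustration, not Baidu's published API; they only mirror the four roles as described.

```python
# Structural sketch of the four roles; interfaces are illustrative
# assumptions, not Baidu's published API.

from typing import Any, Protocol

class Planner(Protocol):
    def plan(self, query: str) -> list[dict[str, Any]]:
        """Decompose a complex request into sub-query nodes."""

class Executor(Protocol):
    def execute(self, sub_query: dict[str, Any]) -> Any:
        """Run retrieval or an external tool call for one node."""

class Writer(Protocol):
    def write(self, query: str, results: list[Any]) -> str:
        """Merge intermediate outputs into a final answer."""

class Master:
    """Oversees the workflow, routing work among the other three agents."""

    def __init__(self, planner: Planner, executor: Executor, writer: Writer):
        self.planner, self.executor, self.writer = planner, executor, writer

    def answer(self, query: str) -> str:
        sub_queries = self.planner.plan(query)
        results = [self.executor.execute(sq) for sq in sub_queries]
        return self.writer.write(query, results)
```

A sequential loop like this glosses over the real-time adaptation the article describes, where the Master can re-route work as intermediate results change.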

Under this framework, a Directed Acyclic Graph (DAG) captures dependencies among subtasks. The Planner selects appropriate tools from Baidu’s MCP servers for each node. When a retrieval or computation fails, the Executor reassigns queries or switches strategies on the fly. The Writer then filters out inconsistent data and crafts a structured narrative that addresses every branch of the DAG.
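That control flow can be sketched with a simple node schema of (primary tool, fallback tool) plus a dependency map; the schema is assumed for illustration and the actual MCP tool interface is not reproduced here.

```python
# DAG-driven execution with a fallback tool per node. The node and
# dependency schema is assumed for illustration only.

from graphlib import TopologicalSorter
from typing import Any, Callable

Tool = Callable[[dict[str, Any]], Any]

def run_dag(nodes: dict[str, tuple[Tool, Tool]], deps: dict[str, set[str]]) -> dict[str, Any]:
    """nodes maps node id -> (primary tool, fallback tool);
    deps maps node id -> the ids it depends on."""
    results: dict[str, Any] = {}
    for node_id in TopologicalSorter(deps).static_order():
        primary, fallback = nodes[node_id]
        inputs = {d: results[d] for d in deps.get(node_id, set())}
        try:
            results[node_id] = primary(inputs)
        except Exception:
            # Executor-style recovery: reassign the sub-query to another tool.
            results[node_id] = fallback(inputs)
    return results

# Tiny usage: two independent lookups feeding a combining step.
fetch_a: Tool = lambda _: 3
fetch_b: Tool = lambda _: 4
combine: Tool = lambda inputs: inputs["a"] + inputs["b"]
print(run_dag(
    {"a": (fetch_a, fetch_a), "b": (fetch_b, fetch_b), "c": (combine, combine)},
    {"a": set(), "b": set(), "c": {"a", "b"}},
))  # {'a': 3, 'b': 4, 'c': 7}
```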

In a test comparing the lifespans of Emperor Wu of Han and Julius Caesar, the Planner split the request into tool calls for birthdate extraction and a subsequent age calculation step. The system retrieved that Emperor Wu of Han lived for 69 years and Julius Caesar for 56 years, yielding a 13-year gap. This end-to-end coordination across multiple agents produced a precise, traceable response.
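The arithmetic at the end of that chain is trivial once the sub-queries resolve; the sketch below hard-codes the retrieved lifespans in place of the actual extraction tool calls to show only the final step.

```python
# Final synthesis step of the lifespan example, with the two retrieved
# values hard-coded in place of the real extraction tool calls.

lifespans = {
    "Emperor Wu of Han": 69,  # result of sub-query 1 (birth/death extraction)
    "Julius Caesar": 56,      # result of sub-query 2 (birth/death extraction)
}
gap = lifespans["Emperor Wu of Han"] - lifespans["Julius Caesar"]
print(f"Emperor Wu of Han lived {gap} years longer than Julius Caesar.")  # gap == 13
```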

Performance evaluations contrasted this paradigm with traditional RAG workflows. Three configurations were examined: Writer-Only, which limits agents to synthesis tasks; Executor-Inclusive, which adds dynamic tool invocation; and Planner-Enhanced, which offers full DAG planning. Qualitative feedback pointed to higher user satisfaction and greater robustness when all four agents collaborated in the Planner-Enhanced mode.

The modular nature of the AI Search Paradigm allows ongoing expansion. Teams can integrate new tools, refine planning heuristics, or adjust execution policies without overhauling the core architecture. This structured, agent-based setup offers a clear pathway for building scalable, reliable search solutions that mirror human reasoning processes. Users reported clearer answer paths and simpler error tracing thanks to the explicit DAG structure underlying each query resolution.
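One way to picture that extensibility, under assumed names (the article does not specify a registration mechanism), is a tool registry that new capabilities plug into without touching the planner or executor code.

```python
# Hypothetical tool registry: new tools register themselves by name,
# so a planner can select them without changes to the core loop.

from typing import Any, Callable

TOOL_REGISTRY: dict[str, Callable[..., Any]] = {}

def register_tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator

@register_tool("web_search")
def web_search(query: str) -> list[str]:
    """Placeholder retrieval tool; a real deployment would call a search backend."""
    return [f"top result for: {query}"]

@register_tool("age_gap")
def age_gap(years_a: int, years_b: int) -> int:
    """Placeholder computation tool."""
    return years_a - years_b

print(TOOL_REGISTRY["age_gap"](69, 56))  # 13
```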

In another development, Baidu has open-sourced the ERNIE 4.5 series, a lineup of foundation models crafted for advanced language understanding, multi-step reasoning, and text generation tasks. The code release includes pretrained weights and sample scripts to facilitate research adoption and benchmarking.

A recent piece titled “Introduction to Generalization in Mathematical Reasoning” explores how large-scale language models equipped with long chain-of-thought reasoning, such as DeepSeek-R1, have achieved strong results on Olympiad-level math problems. The article outlines challenges in transferring symbolic logic capabilities across different problem sets.

A hands-on tutorial demonstrates how to integrate AutoGen with Microsoft’s Semantic Kernel alongside Google’s Gemini Flash model. The write-up begins with environment setup, covers authentication, and walks through building a conversational AI that leverages both frameworks for enhanced response quality.

An overview on tabular machine learning discusses key benchmarking strategies for structured datasets. It reviews techniques for feature engineering, data splitting, and model comparison, stressing the need for reproducible pipelines and common evaluation metrics in enterprise settings.

Another report addresses the challenge of generating ultra-long text sequences that span thousands of words. It examines memory constraints in transformer architectures, progressive decoding techniques, and trade-offs between coherence and computational demand for narrative and legal document generation.

An introduction to Masked Diffusion Models (MDMs) surveys their use in discrete data generation, from text to symbolic sequences. The write-up highlights inefficiencies such as slow convergence and proposes learning-based acceleration methods to reduce sampling time without sacrificing quality.

A deep dive on learning-based robotics contrasts these systems with traditional rule-driven controllers. It outlines how neural policies trained on sensor data can adapt to dynamic environments and complex manipulation tasks, referencing both simulation studies and real-world trials.

An analysis of recent work on LLM safety highlights the need for robust code control frameworks. The article argues for versioning, formal verification, and runtime monitoring when deploying agentic language models in scientific and industrial contexts.

A guide to the Lilac library showcases a modular data analysis pipeline that operates without classic signal-processing steps. By chaining reusable components, users can perform feature extraction, statistical analysis, and visualization in a single coherent workflow.

A study on dexterous hand manipulation data collection highlights the persistent challenge of generating large, diverse datasets. Human-like hands offer rich degrees of freedom, yet capturing accurate motion and force data at scale requires specialized sensors and automated annotation tools.

Keep building