AGENTIC RETRIEVAL-AUGMENTED GENERATION: A SURVEY ON AGENTIC RAG
Abul Ehtesham The Davey Tree Expert Company Kent, OH, USA abul.ehtesham@davey.com
ABSTRACT
Large Language Models (LLMs) have revolutionized artificial intelligence (AI) by enabling human-like text generation and natural language understanding. However, their reliance on static training data limits their ability to respond to dynamic, real-time queries, resulting in outdated or inaccurate outputs. Retrieval-Augmented Generation (RAG) has emerged as a solution, enhancing LLMs by integrating real-time data retrieval to provide contextually relevant and up-to-date responses. Despite its promise, traditional RAG systems are constrained by static workflows and lack the adaptability required for multi-step reasoning and complex task management.
Agentic Retrieval-Augmented Generation (Agentic RAG) transcends these limitations by embedding autonomous AI agents into the RAG pipeline. These agents leverage agentic design patterns (reflection, planning, tool use, and multi-agent collaboration) to dynamically manage retrieval strategies, iteratively refine contextual understanding, and adapt workflows through clearly defined operational structures ranging from sequential steps to adaptive collaboration. This integration enables Agentic RAG systems to deliver greater flexibility, scalability, and context-awareness across diverse applications.
This survey provides a comprehensive exploration of Agentic RAG, beginning with its foundational principles and the evolution of RAG paradigms. It presents a detailed taxonomy of Agentic RAG architectures, highlights key applications in industries such as healthcare, finance, and education, and examines practical implementation strategies. Additionally, it addresses challenges in scaling these systems, ensuring ethical decision-making, and optimizing performance for real-world applications, while providing detailed insights into frameworks and tools for implementing Agentic RAG. The GitHub repository for this survey is available at: https://github.com/asinghcsu/AgenticRAG-Survey.
Keywords Large Language Models (LLMs) · Artificial Intelligence (AI) · Natural Language Understanding · Retrieval-Augmented Generation (RAG) · Agentic RAG · Autonomous AI Agents · Reflection · Planning · Tool Use · Multi-Agent Collaboration · Agentic Patterns · Contextual Understanding · Dynamic Adaptability · Scalability · Real-Time Data Retrieval · Taxonomy of Agentic RAG · Healthcare Applications · Finance Applications · Educational Applications · Ethical AI Decision-Making · Performance Optimization · Multi-Step Reasoning
1 Introduction
Large Language Models (LLMs) [1, 2, 3], such as OpenAI’s GPT-4, Google’s PaLM, and Meta’s LLaMA, have significantly transformed artificial intelligence (AI) with their ability to generate human-like text and perform complex natural language processing tasks. These models have driven innovation across diverse domains, including conversational agents [4], automated content creation, and real-time translation. Recent advancements have extended their capabilities to multimodal tasks, such as text-to-image and text-to-video generation [5], enabling the creation and editing of videos and images from detailed prompts [6], which broadens the potential applications of generative AI.
Despite these advancements, LLMs face significant limitations due to their reliance on static pre-training data. This reliance often results in outdated information, hallucinated responses [7], and an inability to adapt to dynamic, real-world scenarios. These challenges emphasize the need for systems that can integrate real-time data and dynamically refine responses to maintain contextual relevance and accuracy.
Retrieval-Augmented Generation (RAG) [8, 9] emerged as a promising solution to these challenges. By combining the generative capabilities of LLMs with external retrieval mechanisms [10], RAG systems enhance the relevance and timeliness of responses. These systems retrieve real-time information from sources such as knowledge bases [11], APIs, or the web, effectively bridging the gap between static training data and the demands of dynamic applications. However, traditional RAG workflows remain limited by their linear and static design, which restricts their ability to perform complex multi-step reasoning, integrate deep contextual understanding, and iteratively refine responses.
The evolution of agents [12] has significantly enhanced the capabilities of AI systems. Modern agents, including LLM-powered and mobile agents [13], are intelligent entities capable of perceiving, reasoning, and autonomously executing tasks. These agents leverage agentic patterns, such as reflection [14], planning [15], tool use, and multi-agent collaboration [16], to enhance decision-making and adaptability.
Furthermore, these agents employ agentic workflow patterns [12, 13], such as prompt chaining, routing, parallelization, orchestrator-worker models, and evaluator-optimizer, to structure and optimize task execution. By integrating these patterns, Agentic RAG systems can efficiently manage dynamic workflows and address complex problem-solving scenarios. The convergence of RAG and agentic intelligence has given rise to Agentic Retrieval-Augmented Generation (Agentic RAG) [14], a paradigm that integrates agents into the RAG pipeline. Agentic RAG enables dynamic retrieval strategies, contextual understanding, and iterative refinement [15], allowing for adaptive and efficient information processing. Unlike traditional RAG, Agentic RAG employs autonomous agents to orchestrate retrieval, filter relevant information, and refine responses, excelling in scenarios requiring precision and adaptability. Figure 1 gives an overview of Agentic RAG.
This survey explores the foundational principles, taxonomy, and applications of Agentic RAG. It provides a comprehensive overview of RAG paradigms, such as Naïve RAG, Modular RAG, and Graph RAG [16], alongside their evolution into Agentic RAG systems. Key contributions include a detailed taxonomy of Agentic RAG frameworks, applications across domains such as healthcare [17, 18], finance, and education [19], and insights into implementation strategies, benchmarks, and ethical considerations.
The structure of this paper is as follows: Section 2 introduces RAG and its evolution, highlighting the limitations of traditional approaches. Section 3 elaborates on the principles of agentic intelligence and agentic patterns. Section 4 elaborates on agentic workflow patterns. Section 5 provides a taxonomy of Agentic RAG systems, including single-agent, multi-agent, and graph-based frameworks. Section 6 examines applications of Agentic RAG, while Section 7 discusses implementation tools and frameworks. Section 8 focuses on benchmarks and datasets, and Section 9 concludes with future directions for Agentic RAG systems.
2 Foundations of Retrieval-Augmented Generation
2.1 Overview of Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) represents a significant advancement in the field of artificial intelligence, combining the generative capabilities of Large Language Models (LLMs) with real-time data retrieval. While LLMs have demonstrated remarkable capabilities in natural language processing, their reliance on static pre-trained data often results in outdated or incomplete responses. RAG addresses this limitation by dynamically retrieving relevant information from external sources and incorporating it into the generative process, enabling contextually accurate and up-to-date outputs.

Figure 1: An Overview of Agentic RAG
2.2 Core Components of RAG
The architecture of RAG systems integrates three primary components (Figure 2):
2.3 Evolution of RAG Paradigms
The field of Retrieval-Augmented Generation (RAG) has evolved significantly to address the increasing complexity of real-world applications, where contextual accuracy, scalability, and multi-step reasoning are critical. What began as simple keyword-based retrieval has transitioned into sophisticated, modular, and adaptive systems capable of integrating diverse data sources and autonomous decision-making processes. This evolution underscores the growing need for RAG systems to handle complex queries efficiently and effectively.

Figure 2: Core Components of RAG
This section examines the progression of RAG paradigms, presenting key stages of development (Naïve RAG, Advanced RAG, Modular RAG, Graph RAG, and Agentic RAG) alongside their defining characteristics, strengths, and limitations. By understanding the evolution of these paradigms, readers can appreciate the advancements made in retrieval and generative capabilities and their application in various domains.
2.3.1 Naïve RAG
Naïve RAG [20] represents the foundational implementation of retrieval-augmented generation. Figure 3 illustrates the simple retrieve-read workflow of Naïve RAG, focusing on keyword-based retrieval and static datasets. These systems rely on simple keyword-based retrieval techniques, such as TF-IDF and BM25, to fetch documents from static datasets. The retrieved documents are then used to augment the language model’s generative capabilities.

Figure 3: An Overview of Naive RAG.
Naïve RAG is characterized by its simplicity and ease of implementation, making it suitable for tasks involving fact-based queries with minimal contextual complexity. However, it suffers from several limitations:
Despite these limitations, Naïve RAG systems provided a critical proof-of-concept for integrating retrieval with generation, laying the foundation for more sophisticated paradigms.
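The retrieve-read workflow described for Naïve RAG can be illustrated with a minimal sketch. It scores a static corpus with the Okapi BM25 formula and places the top documents into a prompt. The corpus `DOCS`, the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the prompt layout are assumptions chosen for illustration, not details taken from the surveyed systems.

```python
import math
from collections import Counter

def tokenize(text):
    # Whitespace tokenizer; real systems usually add stemming and stop-word removal.
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def retrieve(query, docs, top_k=2):
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_k]]

def build_prompt(query, docs):
    # "Read" step: the retrieved text is simply prepended to the question.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Toy static corpus (illustrative only).
DOCS = [
    "BM25 is a ranking function used by search engines",
    "Dense retrieval encodes passages as vectors",
    "Tree care requires regular pruning",
]
```

Because matching is purely lexical, a query that paraphrases a document without sharing its terms scores zero, which is the kind of limited contextual awareness this subsection attributes to Naïve RAG.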
2.3.2 Advanced RAG
Advanced RAG [20] systems address the limitations of Naïve RAG by incorporating semantic understanding and enhanced retrieval techniques. Figure 4 highlights the semantic enhancements in retrieval and the iterative, context-aware pipeline of Advanced RAG. These systems leverage dense retrieval models, such as Dense Passage Retrieval (DPR), and neural ranking algorithms to improve retrieval precision.

Figure 4: Overview of Advanced RAG
Key features of Advanced RAG include:
These advancements make Advanced RAG suitable for applications requiring high precision and nuanced understanding, such as research synthesis and personalized recommendations. However, challenges such as computational overhead and limited scalability persist, particularly when dealing with large datasets or multi-step queries.
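The similarity search at the core of dense retrieval can be sketched as follows. Queries and documents are represented as vectors, and documents are ranked by cosine similarity. The three-dimensional vectors in `INDEX` and `QUERY_VEC` are hand-written stand-ins for the outputs of a trained encoder such as DPR; no encoder is run here, and the document identifiers are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def dense_retrieve(query_vec, index, top_k=2):
    """Return the top_k (doc_id, score) pairs ranked by cosine similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hand-written vectors standing in for encoder outputs (illustrative only).
INDEX = {
    "doc_heart": [0.9, 0.1, 0.0],
    "doc_stocks": [0.0, 0.2, 0.9],
    "doc_cardio": [0.8, 0.3, 0.1],
}
QUERY_VEC = [1.0, 0.2, 0.0]
```

The exhaustive scan over `INDEX` grows linearly with corpus size, which relates to the computational overhead noted above; production systems typically replace it with an approximate nearest-neighbor index.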
2.3.3 Modular RAG
Modular RAG [20] represents the latest evolution in RAG paradigms, emphasizing flexibility and customization. These systems decompose the retrieval and generation pipeline into independent, reusable components, enabling domain-specific optimization and task adaptability. Figure 5 demonstrates the modular architecture, showcasing hybrid retrieval strategies, composable pipelines, and external tool integration.
Key innovations in Modular RAG include:
For instance, a Modular RAG system designed for financial analytics might retrieve live stock prices via APIs, analyze historical trends using dense retrieval, and generate actionable investment insights through a tailored language model. This modularity and customization make Modular RAG ideal for complex, multi-domain tasks, offering both scalability and precision.
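The financial analytics example above can be sketched as a composable pipeline in which each stage is an independent, swappable function over a shared state. Every stage here is a hypothetical stub: `fetch_prices` stands in for a live market-data API, `retrieve_history` for dense retrieval over historical data, and `generate_insight` for a tailored language model. The ticker `ACME` and all numbers are invented for illustration.

```python
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]

def fetch_prices(state: Dict) -> Dict:
    # Stand-in for a live market-data API call (hypothetical values).
    state["price"] = {"ACME": 102.5}[state["ticker"]]
    return state

def retrieve_history(state: Dict) -> Dict:
    # Stand-in for dense retrieval over historical price data.
    state["history"] = [98.0, 100.0, 101.0]
    return state

def generate_insight(state: Dict) -> Dict:
    # Stand-in for the tailored language model.
    average = sum(state["history"]) / len(state["history"])
    trend = "above" if state["price"] > average else "at or below"
    state["insight"] = f"{state['ticker']} trades {trend} its recent average."
    return state

def run_pipeline(stages: List[Stage], state: Dict) -> Dict:
    """Apply each stage in order; stages can be reordered or replaced freely."""
    for stage in stages:
        state = stage(state)
    return state

PIPELINE = [fetch_prices, retrieve_history, generate_insight]
```

Swapping `retrieve_history` for a different retriever, or appending a further stage, requires no change to the other components, which is the property the modular design aims for.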

Figure 5: Overview of Modular RAG
2.3.4 Graph RAG
Graph RAG [16] extends traditional Retrieval-Augmented Generation systems by integrating graph-based data structures, as illustrated in Figure 6. These systems leverage the relationships and hierarchies within graph data to enhance multi-hop reasoning and contextual enrichment. By incorporating graph-based retrieval, Graph RAG enables richer and more accurate generative outputs, particularly for tasks requiring relational understanding.
Graph RAG is characterized by its ability to:
However, Graph RAG has some limitations:
Graph RAG is well-suited for applications such as healthcare diagnostics, legal research, and other domains where reasoning over structured relationships is crucial.
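Multi-hop retrieval over a graph can be sketched as a bounded breadth-first traversal that collects relation triples for use as generation context. The adjacency list `GRAPH` is a toy knowledge graph written for this example, and the hop limit `max_hops` is an assumed parameter; real Graph RAG systems typically query a graph database.

```python
from collections import deque

# Toy knowledge graph: node -> list of (relation, neighbor) edges (illustrative only).
GRAPH = {
    "metformin": [("treats", "type 2 diabetes")],
    "type 2 diabetes": [("risk_factor_for", "kidney disease")],
    "kidney disease": [("monitored_by", "eGFR test")],
    "eGFR test": [],
}

def multi_hop(graph, start, max_hops=2):
    """Collect (subject, relation, object) triples reachable within max_hops."""
    facts, seen = [], {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes at the hop limit
        for relation, neighbor in graph.get(node, []):
            facts.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts
```

With `max_hops=2`, a question about metformin also surfaces the link from diabetes to kidney disease, a connection that retrieval over isolated passages could miss.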
2.3.5 Agentic RAG
Agentic RAG represents a paradigm shift by introducing autonomous agents capable of dynamic decision-making and workflow optimization. Unlike static systems, Agentic RAG employs iterative refinement and adaptive retrieval strategies to address complex, real-time, and multi-domain queries. This paradigm leverages the modularity of retrieval and generation processes while introducing agent-based autonomy.

Figure 6: Overview of Graph RAG
Key characteristics of Agentic RAG include:
• Autonomous Decision-Making: Agents independently evaluate and manage retrieval strategies based on query complexity.
• Iterative Refinement: Incorporates feedback loops to improve retrieval accuracy and response relevance.
• Workflow Optimization: Dynamically orchestrates tasks, enabling efficiency in real-time applications.
Despite its advancements, Agentic RAG faces some challenges:
Agentic RAG excels in domains like customer support, financial analytics, and adaptive learning platforms, where dynamic adaptability and contextual precision are paramount.
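The characteristics listed above can be sketched as a small control loop. The agent first chooses a retrieval strategy from the query, then retries failed sub-queries in relaxed form until evidence is found or a round limit is reached. All logic here is a hypothetical stand-in: `choose_strategy` is a keyword heuristic rather than an LLM decision, `retrieve` requires every query term to appear in a document, and refinement simply drops the last term of a failed sub-query.

```python
def choose_strategy(query):
    # Heuristic stand-in for an agent's decision about query complexity.
    compound = " and " in query or len(query.split()) > 12
    return "multi_step" if compound else "single_step"

def retrieve(query, corpus):
    # Strict lexical match: every query term must occur in the document.
    terms = set(query.lower().split())
    return [doc for doc in corpus if terms <= set(doc.lower().split())]

def agentic_retrieve(query, corpus, max_rounds=3):
    strategy = choose_strategy(query)
    if strategy == "multi_step":
        pending = query.lower().split(" and ")
    else:
        pending = [query.lower()]
    evidence, rounds = [], 0
    while pending and rounds < max_rounds:
        rounds += 1
        failed = []
        for sub_query in pending:
            hits = retrieve(sub_query, corpus)
            evidence.extend(h for h in hits if h not in evidence)
            if not hits:
                failed.append(sub_query)
        # Feedback loop: relax each failed sub-query by dropping its last term.
        pending = [" ".join(q.split()[:-1]) for q in failed if len(q.split()) > 1]
    return {"strategy": strategy, "rounds": rounds, "evidence": evidence}

# Toy corpus (illustrative only).
CORPUS = ["solar subsidies europe", "economic impact of subsidies"]
```

For the query "solar subsidies policy and economic impact", the first round finds evidence only for the second sub-query, and the relaxed retry "solar subsidies" succeeds in the second round.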
2.4 Challenges and Limitations of Traditional RAG Systems
Traditional Retrieval-Augmented Generation (RAG) systems have significantly expanded the capabilities of Large Language Models (LLMs) by integrating real-time data retrieval. However, these systems still face critical challenges that hinder their effectiveness in complex, real-world applications. The most notable limitations revolve around contextual integration, multi-step reasoning, and scalability and latency issues.
2.4.1 Contextual Integration
Even when RAG systems successfully retrieve relevant information, they often struggle to seamlessly incorporate it into generated responses. The static nature of retrieval pipelines and limited contextual awareness lead to fragmented, inconsistent, or overly generic outputs.
Example: A query such as, "What are the latest advancements in Alzheimer’s research and their implications for early-stage treatment?" might yield relevant research papers and medical guidelines. However, traditional RAG systems often fail to synthesize these findings into a coherent explanation that connects the new treatments to specific patient scenarios. Similarly, for a query like, "What are the best sustainable practices for small-scale agriculture in arid regions?", traditional systems might retrieve documents on general agricultural methods but overlook critical sustainability practices tailored to arid environments.
Table 1: Comparative Analysis of RAG Paradigms
| Paradigm | Key Features | Strengths |
|---|---|---|
| Naïve RAG | · Keyword-based retrieval (e.g., TF-IDF, BM25) | · Simple and easy to implement · Suitable for fact-based queries |
| Advanced RAG | · Dense retrieval models (e.g., DPR) · Neural ranking and re-ranking · Multi-hop retrieval | · High precision retrieval · Improved contextual relevance |
| Modular RAG | · Hybrid retrieval (sparse and dense) · Tool and API integration · Composable, domain-specific pipelines | · High flexibility and customization · Suitable for diverse applications · Scalable |
| Graph RAG | · Integration of graph-based structures · Multi-hop reasoning · Contextual enrichment via nodes | · Relational reasoning capabilities · Mitigates hallucinations · Ideal for structured data tasks |
| Agentic RAG | · Autonomous agents · Dynamic decision-making · Iterative refinement and workflow optimization | · Adaptable to real-time changes · Scalable for multi-domain tasks · High accuracy |
2.4.2 Multi-Step Reasoning
Many real-world queries require iterative or multi-hop reasoning—retrieving and synthesizing information across multiple steps. Traditional RAG systems are often ill-equipped to refine retrieval based on intermediate insights or user feedback, resulting in incomplete or disjointed responses.
Example: A complex query like, "What lessons from renewable energy policies in Europe can be applied to developing nations, and what are the potential economic impacts?" demands the orchestration of multiple types of information, including policy data, contextualization for developing regions, and economic analysis. Traditional RAG systems typically fail to connect these disparate elements into a cohesive response.
2.4.3 Scalability and Latency Issues
As the volume of external data sources grows, querying and ranking large datasets becomes increasingly computationally intensive. This results in significant latency, which undermines the system’s ability to provide timely responses in real-time applications.
Example: In time-sensitive settings such as financial analytics or live customer support, delays caused by querying multiple databases or processing large document sets can hinder the system’s overall utility. For example, a delay in retrieving market trends during high-frequency trading could result in missed opportunities.
2.5 Agentic RAG: A Paradigm Shift
Traditional RAG systems, with their static workflows and limited adaptability, often struggle to handle dynamic, multi-step reasoning and complex real-world tasks. These limitations have spurred the integration of agentic intelligence, resulting in Agentic RAG. By incorporating autonomous agents capable of dynamic decision-making, iterative reasoning, and adaptive retrieval strategies, Agentic RAG builds on the modularity of earlier paradigms while overcoming their inherent constraints. This evolution enables more complex, multi-domain tasks to be addressed with enhanced precision and contextual understanding, positioning Agentic RAG as a cornerstone for next-generation AI applications. In particular, Agentic RAG systems reduce latency through optimized workflows and refine outputs iteratively, tackling the very challenges that have historically hindered traditional RAG’s scalability and effectiveness.
3 Core Principles and Background of Agentic Intelligence
Agentic Intelligence forms the foundation of Agentic Retrieval-Augmented Generation (RAG) systems, enabling them to transcend the static and reactive nature of traditional RAG. By integrating autonomous agents capable of dynamic decision-making, iterative reasoning, and collaborative workflows, Agentic RAG systems exhibit enhanced adaptability and precision. This section explores the core principles underpinning agentic intelligence.
Components of an AI Agent. In essence, an AI agent comprises (Figure 7):

Figure 7: An Overview of AI Agents
Agentic Patterns [25, 26] provide structured methodologies that guide the behavior of agents in Agentic Retrieval-Augmented Generation (RAG) systems. These patterns enable agents to dynamically adapt, plan, and collaborate, ensuring that the system can handle complex, real-world tasks with precision and scalability. Four key patterns underpin agentic workflows:
3.0.1 Reflection
Reflection is a foundational design pattern in agentic workflows, enabling agents to iteratively evaluate and refine their outputs. By incorporating self-feedback mechanisms, agents can identify and address errors, inconsistencies, and areas for improvement, enhancing performance across tasks like code generation, text production, and question answering (as shown in Figure 8). In practical use, Reflection involves prompting an agent to critique its outputs for correctness, style, and efficiency, then incorporating this feedback into subsequent iterations. External tools, such as unit tests or web searches, can further enhance this process by validating results and highlighting gaps.
In multi-agent systems, Reflection can involve distinct roles, such as one agent generating outputs while another critiques them, fostering collaborative improvement. For instance, in legal research, agents can iteratively refine responses by re-evaluating retrieved case law, ensuring accuracy and comprehensiveness. Reflection has demonstrated significant performance improvements in studies like Self-Refine [27], Reflexion [28], and CRITIC [23].
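The generate, critique, and revise cycle described above can be sketched with unit tests acting as the external critic. The `generate` function is a hypothetical stand-in for a model call: it returns a deliberately buggy first draft and a corrected version once it receives feedback. The test cases inside `critique` are likewise invented for illustration.

```python
def critique(func):
    """External check standing in for unit tests; returns failure messages."""
    failures = []
    for args, expected in [((2, 3), 5), ((-1, 1), 0)]:
        got = func(*args)
        if got != expected:
            failures.append(f"add{args} returned {got}, expected {expected}")
    return failures

def generate(feedback):
    # Hypothetical stand-in for an LLM call: the first draft is buggy,
    # and any feedback triggers the corrected version.
    if not feedback:
        return lambda a, b: a - b
    return lambda a, b: a + b

def reflect(max_iters=3):
    """Alternate generation and critique until no failures remain."""
    feedback, history = [], []
    candidate = None
    for _ in range(max_iters):
        candidate = generate(feedback)
        feedback = critique(candidate)
        history.append(feedback)
        if not feedback:
            break
    return candidate, history
```

In a multi-agent variant, `generate` and `critique` would be played by two separate agents, matching the generator and critic roles described above.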

Figure 8: An Overview of Agentic Self-Reflection
3.0.2 Planning
Planning [24] is a key design pattern in agentic workflows that enables agents to autonomously decompose complex tasks into smaller, manageable subtasks. This capability is essential for multi-hop reasoning and iterative problem-solving in dynamic and uncertain scenarios as shown in Figure 9a.
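Task decomposition can be sketched as a plan-then-execute loop. The `plan` function below is a hypothetical stand-in for an LLM planner: it only splits a request on the phrase ", then ", whereas a real planner would infer subtasks from the task's meaning. The `execute` function is also a stub that records which earlier results were available to each step.

```python
def plan(task):
    # Stand-in for an LLM planner: split a compound request into ordered subtasks.
    parts = task.replace(", then ", ";").split(";")
    return [p.strip() for p in parts if p.strip()]

def execute(subtask, previous_results):
    # Stub executor; a real agent would call retrieval or tools here.
    return f"done: {subtask} (using {len(previous_results)} earlier results)"

def plan_and_execute(task):
    """Run subtasks in order, passing earlier results to later steps."""
    results = []
    for subtask in plan(task):
        results.append(execute(subtask, results))
    return results
```

Passing `results` forward is what lets a later subtask build on intermediate findings, which is the property multi-hop reasoning depends on.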
3.0.3 Tool Use
Tool Use enables agents to extend their capabilities by interacting with external tools, APIs, or computational resources, as illustrated in Figure 9b. This pattern allows agents to gather information, perform computations, and manipulate data beyond their pre-trained knowledge. By dynamically integrating tools into workflows, agents can adapt to complex tasks and provide more accurate and contextually relevant outputs.
Modern agentic workflows incorporate tool use for a variety of applications, including information retrieval, computational reasoning, and interfacing with external systems. The implementation of this pattern has evolved significantly with advancements like GPT-4’s function calling capabilities and systems capable of managing access to numerous tools. These developments facilitate sophisticated workflows where agents autonomously select and execute the most relevant tools for a given task.
While tool use significantly enhances agentic workflows, challenges remain in optimizing the selection of tools, particularly in contexts with a large number of available options. Techniques inspired by retrieval-augmented generation (RAG), such as heuristic-based selection, have been proposed to address this issue.
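Heuristic-based tool selection can be sketched with a small registry in which each tool carries a natural-language description. The agent picks the tool whose description shares the most words with the request. The registry `TOOLS`, both tools, and the word-overlap heuristic are assumptions made for this example; function-calling APIs let the model itself make this choice.

```python
import math

def square_root(text):
    return math.sqrt(float(text))

def word_count(text):
    return len(text.split())

# Hypothetical registry: tool name -> (description, callable).
TOOLS = {
    "square_root": ("compute the square root of a number", square_root),
    "word_count": ("count the words in a text", word_count),
}

def select_tool(request):
    """Pick the tool whose description overlaps most with the request."""
    words = set(request.lower().split())

    def overlap(name):
        return len(words & set(TOOLS[name][0].split()))

    return max(TOOLS, key=overlap)

def use_tool(request, argument):
    name = select_tool(request)
    return name, TOOLS[name][1](argument)
```

Scoring descriptions against the request is itself a retrieval step, which suggests why RAG-style techniques carry over to tool selection when the number of tools is large.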
3.0.4 Multi-Agent
Multi-agent collaboration [29] is a key design pattern in agentic workflows that enables task specialization and parallel processing. Agents communicate and share intermediate results, ensuring the overall workflow remains efficient and coherent. By distributing subtasks among specialized agents, this pattern improves the scalability and adaptability of complex workflows. Multi-agent systems allow developers to decompose intricate tasks into smaller, manageable subtasks assigned to different agents. This approach not only enhances task performance but also provides a robust framework for managing complex interactions. Each agent operates with its own memory and workflow, which can include the use of tools, reflection, or planning, enabling dynamic and collaborative problem-solving (see Figure 10).

Figure 9: Overview of Agentic Planning and Tool Use
While multi-agent collaboration offers significant potential, it is a less predictable design pattern compared to more mature workflows like Reflection and Tool Use. Nevertheless, emerging frameworks such as AutoGen, CrewAI, and LangGraph are providing new avenues for implementing effective multi-agent solutions.
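Task specialization with shared intermediate results can be sketched without any framework. Each `Agent` keeps its own memory and applies one skill, and `run_team` passes each agent's output to the next while recording it in a shared dictionary. The class, the agent names, and the lambda skills are hypothetical stand-ins for LLM-backed agents. The sketch runs agents sequentially and does not show parallel processing.

```python
class Agent:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill      # callable that performs this agent's specialty
        self.memory = []        # each agent keeps its own message history

    def handle(self, message):
        self.memory.append(message)
        return self.skill(message)

def run_team(task, agents):
    """Pass the task through specialized agents, sharing intermediate results."""
    shared = {"task": task}
    message = task
    for agent in agents:
        message = agent.handle(message)
        shared[agent.name] = message
    return shared

def make_team():
    # Hypothetical skills standing in for model-backed behavior.
    researcher = Agent("researcher", lambda query: [f"fact about {query}"])
    writer = Agent("writer", lambda facts: "Summary: " + "; ".join(facts))
    return [researcher, writer]
```

Frameworks such as those named above add the pieces this sketch omits, including message routing, concurrency, and termination conditions.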

Figure 10: An Overview of MultiAgent
These design patterns form the foundation for the success of Agentic RAG systems. By structuring workflows—from simple, sequential steps to more adaptive, collaborative processes—these patterns enable systems to dynamically adapt their retrieval and generative strategies to the diverse and ever-changing demands of real-world environments. Leveraging these patterns, agents are capable of handling iterative, context-aware tasks that significantly exceed the capabilities of traditional RAG systems.
4 Agentic Workflow Patterns: Adaptive Strategies for Dynamic Collaboration
4.1 Prompt Chaining: Enhancing Accuracy Through Sequential Processing
Prompt chaining [12, 13] decomposes a complex task into multiple steps, where each step builds upon the previous one. This structured approach improves accuracy by simplifying each subtask before moving forward. However, it may increase latency due to sequential processing.
提示链 (Prompt chaining) [12, 13] 将一个复杂任务分解为多个步骤,其中每个步骤都建立在前一个步骤的基础上。这种结构化方法通过简化每个子任务来提高准确性,但由于顺序处理,可能会增加延迟。

Figure 11: Illustration of Prompt Chaining Workflow
图 11: 提示链工作流程示意图
When to Use: This workflow is most effective when a task can be broken down into fixed subtasks, each contributing to the final output. It is particularly useful in scenarios where step-by-step reasoning enhances accuracy.
使用时机:当任务可以分解为固定的子任务,每个子任务都对最终输出有贡献时,此工作流程最为有效。在逐步推理能提高准确性的场景中尤为有用。
Example Applications:
示例应用:
• Generating marketing content in one language and then translating it into another while preserving nuances.
• Structuring document creation by first generating an outline, verifying its completeness, and then developing the full text.
• 以一种语言生成营销内容,然后将其翻译成另一种语言,同时保留细微差别。 • 通过首先生成大纲、验证其完整性,然后开发全文来结构化文档创建。
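The outline, verify, expand example above can be sketched as a loop over prompt templates. This is a minimal illustration, not an implementation from the cited sources: `fake_llm` is a hypothetical stand-in for a real model call, and the step templates are assumptions chosen to mirror the document-creation example.

```python
from typing import Callable, List

# Hypothetical stand-in for an LLM call; a real system would call a model API here.
def fake_llm(prompt: str) -> str:
    return f"[output for: {prompt}]"

def run_chain(task: str, steps: List[str], llm: Callable[[str], str]) -> List[str]:
    """Run prompt templates in order, feeding each step's output into the next."""
    outputs = []
    current = task
    for template in steps:
        current = llm(template.format(input=current))
        outputs.append(current)
    return outputs

# Assumed templates mirroring the outline -> verify -> expand example.
steps = [
    "Write an outline for: {input}",
    "Check this outline for completeness and fix gaps: {input}",
    "Expand this outline into full text: {input}",
]
results = run_chain("a guide to Agentic RAG", steps, fake_llm)
print(results[-1])
```

Because every step waits for the previous one, total latency is the sum of the individual calls, which is the trade-off noted above.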
4.2 Routing: Directing Inputs to Specialized Processes
4.2 路由:将输入定向到专用处理流程
Routing [12, 13] involves classifying an input and directing it to an appropriate specialized prompt or process. This method ensures distinct queries or tasks are handled separately, improving efficiency and response quality.
路由 (Routing) [12, 13] 涉及对输入进行分类并将其引导至适当的专用提示或处理过程。该方法确保不同的查询或任务被分开处理,从而提高效率和响应质量。

Figure 12: Illustration of Routing Workflow
图 12: 路由工作流程示意图
When to Use: Ideal for scenarios where different types of input require distinct handling strategies, ensuring optimized performance for each category.
使用场景:适用于需要根据不同输入类型采取不同处理策略的场景,以确保每种类别的最佳性能。
Example Applications:
示例应用:
• Directing customer service queries into categories such as technical support, refund requests, or general inquiries.
• Assigning simple queries to smaller models for cost efficiency, while complex requests go to advanced models.
• 将客户服务查询分类为技术支持、退款请求或一般咨询。• 将简单查询分配给较小的模型以节省成本,而复杂请求则交给高级模型处理。
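A routing step can be sketched as a classifier followed by a dispatch table. The keyword classifier and the three handlers below are hypothetical placeholders; in practice the classification would typically be done by an LLM or a trained model, and each handler would be a specialized prompt or model.

```python
from typing import Callable, Dict

# Hypothetical keyword classifier; an LLM or trained classifier would label the input in practice.
def classify(query: str) -> str:
    text = query.lower()
    if "refund" in text:
        return "refund"
    if any(word in text for word in ("error", "crash", "install")):
        return "technical"
    return "general"

# Hypothetical handlers standing in for specialized prompts or models.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "refund": lambda q: "refund workflow: " + q,
    "technical": lambda q: "technical support: " + q,
    "general": lambda q: "general inquiry: " + q,
}

def route(query: str) -> str:
    """Classify the input, then hand it to the matching specialized handler."""
    return HANDLERS[classify(query)](query)

print(route("The app shows an error after install"))
```

Keeping the classifier separate from the handlers means a new category only needs a new label and one new table entry.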
4.3 Parallelization: Speeding Up Processing Through Concurrent Execution
4.3 并行化:通过并发执行加速处理
Parallelization [12, 13] divides a task into independent processes that run simultaneously, reducing latency and improving throughput. It can be categorized into sectioning (independent subtasks) and voting (multiple outputs for accuracy).
并行化 [12, 13] 将任务划分为同时运行的独立进程,从而减少延迟并提高吞吐量。它可以分为分段(独立的子任务)和投票(多个输出以提高准确性)。

Figure 13: Illustration of Parallelization Workflow
图 13: 并行化工作流程示意图
When to Use: Useful when tasks can be executed independently to enhance speed or when multiple outputs improve confidence.
使用场景:适用于任务可以独立执行以提高速度,或多个输出可以提升置信度的情况。
Example Applications:
示例应用:
• Sectioning: Splitting tasks like content moderation, where one model screens input while another generates a response.
• Voting: Using multiple models to cross-check code for vulnerabilities or analyze content moderation decisions.
• 分段处理:将任务拆分,例如内容审核,一个模型筛选输入,另一个生成响应。
• 投票机制:使用多个模型交叉检查代码漏洞或分析内容审核决策。
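Both variants can be sketched with a thread pool from the Python standard library. The three reviewer functions are hypothetical stand-ins for separate model calls, and majority voting is only one possible way to aggregate their outputs.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

def run_sections(text: str, workers: List[Callable[[str], str]]) -> List[str]:
    """Sectioning: independent subtasks run concurrently on the same input."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(worker, text) for worker in workers]
        return [future.result() for future in futures]

def majority_vote(text: str, voters: List[Callable[[str], str]]) -> str:
    """Voting: several reviewers label the same input; the most common label wins."""
    labels = run_sections(text, voters)
    return Counter(labels).most_common(1)[0][0]

# Hypothetical reviewers standing in for separate model calls.
def strict_reviewer(code: str) -> str:
    return "unsafe" if "eval(" in code else "safe"

def lenient_reviewer(code: str) -> str:
    return "unsafe" if "eval(input" in code else "safe"

def pattern_reviewer(code: str) -> str:
    return "unsafe" if "exec(" in code or "eval(" in code else "safe"

verdict = majority_vote(
    "x = eval(user_text)",
    [strict_reviewer, lenient_reviewer, pattern_reviewer],
)
print(verdict)
```

Threads suit this sketch because real model calls are I/O-bound; the results come back in submission order, so sectioned outputs can be matched to their workers.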
4.4 Orchestrator-Workers: Dynamic Task Delegation
4.4 协调器-工作者:动态任务委派
This workflow [12, 13] features a central orchestrator model that dynamically breaks tasks into subtasks, assigns them to specialized worker models, and compiles the results. Unlike parallelization, it adapts to varying input complexity.
该工作流 [12, 13] 的核心是一个中央协调器模型,它动态地将任务分解为子任务,分配给专门的工人模型,并汇总结果。与并行化不同,它能适应不同的输入复杂度。

Figure 14: Illustration of Orchestrator-Workers Workflow
图 14: Orchestrator-Workers 工作流程示意图
When to Use: Best suited for tasks requiring dynamic decomposition and real-time adaptation, where subtasks are not predefined.
适用场景:最适合需要动态分解和实时适应的任务,其中子任务未预定义。
Example Applications:
示例应用:
• Automatically modifying multiple files in a codebase based on the nature of requested changes.
• 根据请求更改的性质自动修改代码库中的多个文件。
• Conducting real-time research by gathering and synthesizing relevant information from multiple sources.
• 通过从多个来源收集和综合相关信息进行实时研究。
4.5 Evaluator-Optimizer: Refining Output Through Iteration
4.5 评估器-优化器:通过迭代优化输出
The evaluator-optimizer [12, 13] workflow iteratively improves content by generating an initial output and refining it based on feedback from an evaluation model.
评估器-优化器 [12, 13] 工作流程通过生成初始输出并根据评估模型的反馈进行迭代改进内容。

Figure 15: Illustration of Evaluator-Optimizer Workflow
图 15: 评估-优化工作流程示意图
When to Use: Effective when iterative refinement significantly enhances response quality, especially when clear evaluation criteria exist.
何时使用:当迭代优化显著提升响应质量时,尤其是在存在明确评估标准的情况下。
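The loop can be sketched as generate, evaluate, and improve until the evaluator accepts or a round budget runs out. The `evaluate` and `improve` functions here are hypothetical rule-based stand-ins for an evaluation model and a generator model, and the acceptance criteria are invented purely for illustration.

```python
from typing import Callable, Tuple

def refine(
    draft: str,
    evaluate: Callable[[str], Tuple[bool, str]],
    improve: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Loop until the evaluator accepts or the budget is spent; return (text, rounds used)."""
    for round_no in range(max_rounds):
        accepted, feedback = evaluate(draft)
        if accepted:
            return draft, round_no
        draft = improve(draft, feedback)
    return draft, max_rounds

# Hypothetical evaluator: requires a citation marker and a final period.
def evaluate(text: str) -> Tuple[bool, str]:
    if "[source]" not in text:
        return False, "add a citation"
    if not text.endswith("."):
        return False, "end with a period"
    return True, ""

# Hypothetical optimizer: applies the feedback literally.
def improve(text: str, feedback: str) -> str:
    if feedback == "add a citation":
        return text + " [source]"
    return text + "."

final, rounds = refine("RAG grounds answers in retrieved text", evaluate, improve)
print(final, rounds)
```

The explicit `max_rounds` cap matters in practice: without clear evaluation criteria the loop may never converge, which is why the pattern works best when such criteria exist.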
Example Applications:
示例应用:
4.6 Taxonomy of Agentic RAG Systems
4.6 AI智能体RAG系统分类
Agentic Retrieval-Augmented Generation (RAG) systems can be categorized into distinct architectural frameworks based on their complexity and design principles. These include single-agent architectures, multi-agent systems, and hierarchical agentic architectures. Each framework is tailored to address specific challenges and optimize performance for diverse applications. This section provides a detailed taxonomy of these architectures, highlighting their characteristics, strengths, and limitations.
基于代理的检索增强生成 (RAG) 系统可以根据其复杂性和设计原则分为不同的架构框架。这些框架包括单智能体架构、多智能体系统和分层代理架构。每个框架都是为了应对特定挑战并优化不同应用的性能而设计的。本节详细介绍了这些架构的分类,重点介绍了它们的特征、优势和局限性。
4.6.1 Single-Agent Agentic RAG: Router
4.6.1 单智能体 AI智能体 RAG: Router
A Single-Agent Agentic RAG system [30] serves as a centralized decision-making system in which a single agent manages the retrieval, routing, and integration of information (as shown in Figure 16). This architecture simplifies the system by consolidating these tasks into one unified agent, making it particularly effective for setups with a limited number of tools or data sources.
单智能体代理 RAG:[30] 作为一个集中决策系统,其中单个智能体管理信息的检索、路由和整合(如图 16 所示)。这种架构通过将这些任务整合到一个统一的智能体中,简化了系统,特别适用于工具或数据源有限的设置。
Workflow
工作流
- Data Integration and LLM Synthesis: Once the relevant data is retrieved from the chosen sources, it is passed to a Large Language Model (LLM). The LLM synthesizes the gathered information, integrating insights from multiple sources into a coherent and contextually relevant response.
数据集成与大语言模型合成:从选定来源检索到相关数据后,将其传递给大语言模型 (LLM)。LLM 对收集到的信息进行合成,将来自多个来源的见解整合成一个连贯且上下文相关的响应。
- Output Generation: Finally, the system delivers a comprehensive, user-facing answer that addresses the original query. This response is presented in an actionable, concise format and may optionally include references or citations to the sources used.
输出生成:最后,系统生成一个全面的、面向用户的答案,以解决原始查询。该响应以可操作、简洁的格式呈现,并可选择包含对所使用来源的引用或引文。

Figure 16: An Overview of Single Agentic RAG
图 16: 单智能体 RAG 概述
Key Features and Advantages.
主要特性与优势
• Centralized Simplicity: A single agent handles all retrieval and routing tasks, making the architecture straightforward to design, implement, and maintain.
• 集中化简洁性:单个智能体处理所有检索和路由任务,使架构设计、实现和维护变得简单明了。
• Efficiency & Resource Optimization: With fewer agents and simpler coordination, the system demands fewer computational resources and can handle queries more quickly.
• 效率与资源优化:更少的智能体和更简单的协调意味着系统需要更少的计算资源,并且能够更快地处理查询。
• Dynamic Routing: The agent evaluates each query in real-time, selecting the most appropriate knowledge source (e.g., structured DB, semantic search, web search).
• 动态路由 (Dynamic Routing):AI智能体实时评估每个查询,选择最合适的知识源(例如,结构化数据库、语义搜索、网络搜索)。
• Versatility Across Tools: Supports a variety of data sources and external APIs, enabling both structured and unstructured workflows.
跨工具多用途性:支持多种数据源和外部 API,能够处理结构化和非结构化工作流。
• Ideal for Simpler Systems: Suited for applications with well-defined tasks or limited integration requirements (e.g., document retrieval, SQL-based workflows).
• 适用于简单系统:适合具有明确任务或有限集成需求的应用(例如,文档检索、基于SQL的工作流)。
Prompt: Can you tell me the delivery status of my order?
提示:您能告诉我我的订单配送状态吗?
System Process (Single-Agent Workflow):
系统进程(单智能体工作流):
1. Query Submission and Evaluation:
1. 查询提交与评估
• The user submits the query, which is received by the coordinating agent.
• The coordinating agent analyzes the query and determines the most appropriate sources of information.
• 用户提交查询,协调AI智能体接收该查询。
• 协调AI智能体分析查询并确定最合适的信息来源。
2. Knowledge Source Selection:
- 知识源选择:
3. Data Integration and LLM Synthesis:
- 数据集成与大语言模型合成:
• The relevant data is passed to the LLM, which synthesizes the information into a coherent response.
• 相关数据被传递给大语言模型,由其将信息合成为连贯的响应。
4. Output Generation:
4. 输出生成:
• The system generates an actionable and concise response, providing live tracking updates and potential alternatives.
• 系统生成一个可操作且简洁的响应,提供实时跟踪更新和潜在的替代方案。
Response:
响应:
Integrated Response: “Your package is currently in transit and expected to arrive tomorrow evening. The live tracking from UPS indicates it is at the regional distribution center.”
集成响应:“您的包裹目前正在运输中,预计明晚到达。UPS的实时跟踪显示它已到达区域配送中心。”
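The single-agent flow (evaluate the query, select knowledge sources, retrieve, synthesize) can be sketched as follows. Everything concrete here is an assumption for illustration: the three source functions return canned strings, the selection rules are keyword-based rather than LLM-driven, and the default `synthesize` simply joins the retrieved snippets.

```python
from typing import Callable, Dict, List

# Hypothetical knowledge sources returning canned snippets.
def order_db(query: str) -> List[str]:
    return ["order record: in transit, expected tomorrow evening"]

def semantic_search(query: str) -> List[str]:
    return ["shipping policy: standard delivery takes 3-5 days"]

def web_search(query: str) -> List[str]:
    return ["carrier page: package at regional distribution center"]

SOURCES: Dict[str, Callable[[str], List[str]]] = {
    "structured_db": order_db,
    "semantic_search": semantic_search,
    "web_search": web_search,
}

def select_sources(query: str) -> List[str]:
    """Hypothetical rule-based selection; an LLM agent would make this decision in practice."""
    text = query.lower()
    chosen = []
    if "order" in text or "delivery" in text:
        chosen.append("structured_db")
    if "policy" in text:
        chosen.append("semantic_search")
    if "status" in text or "tracking" in text:
        chosen.append("web_search")
    return chosen or ["semantic_search"]

def answer(query: str, synthesize=lambda q, docs: " ; ".join(docs)) -> str:
    """Single agent: pick sources, retrieve from each, and pass the results to synthesis."""
    docs = [doc for name in select_sources(query) for doc in SOURCES[name](query)]
    return synthesize(query, docs)

print(answer("Can you tell me the delivery status of my order?"))
```

All decisions live in one function, which reflects the centralized simplicity described above and also shows why this design becomes harder to maintain as the number of sources grows.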
4.7 Multi-Agent Agentic RAG Systems
4.7 多智能体 AI智能体 RAG 系统:
Multi-Agent RAG [30] represents a modular and scalable evolution of single-agent architectures, designed to handle complex workflows and diverse query types by leveraging multiple specialized agents (as shown in Figure 17). Instead of relying on a single agent to manage all tasks—reasoning, retrieval, and response generation—this system distributes responsibilities across multiple agents, each optimized for a specific role or data source.
多智能体RAG [30] 代表了单智能体架构的模块化和可扩展演进,旨在通过利用多个专门化的智能体(如图 17 所示)来处理复杂的工作流和多样化的查询类型。该系统不再依赖单一智能体来管理所有任务(如推理、检索和响应生成),而是将职责分配给多个智能体,每个智能体都针对特定角色或数据源进行了优化。
Workflow
工作流
- Query Submission: The process begins with a user query, which is received by a coordinator agent or master retrieval agent. This agent acts as the central orchestrator, delegating the query to specialized retrieval agents based on the query’s requirements.
查询提交:该过程从用户查询开始,协调代理或主检索代理接收查询。该代理作为中央协调者,根据查询需求将查询委派给专门的检索代理。
- Specialized Retrieval Agents: The query is distributed among multiple retrieval agents, each focusing on a specific type of data source or task. Examples include:
专用检索智能体:查询被分配到多个检索智能体中,每个智能体专注于特定类型的数据源或任务。例如:
- Tool Access and Data Retrieval: Each agent routes the query to the appropriate tools or data sources within its domain, such as:
工具访问与数据检索:每个AI智能体将查询路由到其领域内的适当工具或数据源,例如:
• Vector Search: For semantic relevance.
• Text-to-SQL: For structured data.
• Web Search: For real-time public information.
• APIs: For accessing external services or proprietary systems.
• 向量搜索 (Vector Search): 用于语义相关性。
• 文本到 SQL (Text-to-SQL): 用于结构化数据。
• 网络搜索 (Web Search): 用于实时公共信息。
• API (APIs): 用于访问外部服务或专有系统
The retrieval process is executed in parallel, allowing for efficient processing of diverse query types.
检索过程并行执行,支持高效处理多种查询类型。

Figure 17: An Overview of Multi-Agent Agentic RAG Systems
图 17: 多智能体代理 RAG 系统概览
- Data Integration and LLM Synthesis: Once retrieval is complete, the data from all agents is passed to a Large Language Model (LLM). The LLM synthesizes the retrieved information into a coherent and contextually relevant response, integrating insights from multiple sources seamlessly.
数据整合与大语言模型合成:检索完成后,来自所有智能体的数据将传递给大语言模型。大语言模型将检索到的信息合成为一个连贯且上下文相关的响应,无缝整合来自多个来源的见解。
- Output Generation: The system generates a comprehensive response, which is delivered back to the user in an actionable and concise format.
输出生成:系统生成一个全面的响应,并以可操作且简洁的格式返回给用户。
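The coordinator's fan-out and the parallel retrieval step can be sketched with `asyncio.gather` from the standard library. The three agents below are hypothetical coroutines that return canned strings; real agents would call a vector store, a Text-to-SQL tool, or a search API.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical retrieval agents; each would wrap a real tool in practice.
async def vector_agent(query: str) -> str:
    await asyncio.sleep(0)  # placeholder for I/O
    return f"vector hits for '{query}'"

async def sql_agent(query: str) -> str:
    await asyncio.sleep(0)
    return f"sql rows for '{query}'"

async def web_agent(query: str) -> str:
    await asyncio.sleep(0)
    return f"web results for '{query}'"

AGENTS: Dict[str, Callable[[str], Awaitable[str]]] = {
    "vector": vector_agent,
    "sql": sql_agent,
    "web": web_agent,
}

async def coordinate(query: str, agent_names: List[str]) -> Dict[str, str]:
    """Coordinator: fan the query out to the selected agents and collect results by name."""
    results = await asyncio.gather(*(AGENTS[name](query) for name in agent_names))
    return dict(zip(agent_names, results))

def synthesize(results: Dict[str, str]) -> str:
    """Stand-in for LLM synthesis: concatenate the labeled results."""
    return "\n".join(f"{name}: {text}" for name, text in results.items())

results = asyncio.run(coordinate("renewable energy in Europe", ["vector", "web"]))
print(synthesize(results))
```

`asyncio.gather` returns results in the order the coroutines were passed, so zipping them with the agent names keeps each result attributable to its source during synthesis.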
Key Features and Advantages.
关键特性和优势
Challenges
挑战
Prompt: What are the economic and environmental impacts of renewable energy adoption in Europe?
提示:欧洲采用可再生能源的经济和环境影响是什么?
System Process (Multi-Agent Workflow):
系统进程(多智能体工作流):
Response:
响应:
Integrated Response: “Adopting renewable energy in Europe has led to a 20% reduction in greenhouse gas emissions over the past decade, according to EU policy reports. Economically, renewable energy investments have generated approximately 1.2 million jobs, with significant growth in solar and wind sectors. Recent academic studies also highlight potential trade-offs in grid stability and energy storage costs.”
综合响应:"根据欧盟政策报告,欧洲采用可再生能源在过去十年中使温室气体排放减少了 20%。从经济角度来看,可再生能源投资创造了约 120 万个就业岗位,其中太阳能和风能领域增长显著。最近的学术研究还强调了电网稳定性和储能成本方面的潜在权衡。"
4.8 Hierarchical Agentic RAG Systems
4.8 分层AI智能体RAG系统
Hierarchical Agentic RAG systems [14] employ a structured, multi-tiered approach to information retrieval and processing, enhancing both efficiency and strategic decision-making, as shown in Figure 18. Agents are organized in a hierarchy, with higher-level agents overseeing and directing lower-level agents. This structure enables multi-level decision-making, ensuring that queries are handled by the most appropriate resources.
分层AI智能体RAG: [14] 系统采用结构化、多层次的信息检索和处理方法,提高了效率和战略决策能力,如图 18 所示。AI智能体按层次组织,高层AI智能体监督和指导低层AI智能体。这种结构实现了多层次决策,确保查询由最合适的资源处理。

Figure 18: An illustration of Hierarchical Agentic RAG
图 18: 分层代理 RAG 示意图
Workflow
工作流程
Key Features and Advantages.
关键特性与优势
Challenges
挑战
• Coordination Complexity: Maintaining robust inter-agent communication across multiple levels can increase orchestration overhead.
• Resource Allocation: Efficiently distributing tasks among tiers to avoid bottlenecks is non-trivial.
• 协调复杂性:在多个层级间维持强大的智能体间通信会增加编排开销。
• 资源分配:高效地在各层级间分配任务以避免瓶颈并非易事。
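The tiered delegation described in this section can be sketched as a tree of agents in which higher-level agents only decide which children should handle a query and leaf agents perform retrieval. The `Agent` class, the acceptance rules, and the canned retrieval results are all hypothetical and chosen only to make the structure visible.

```python
from typing import Callable, List, Optional

class Agent:
    """Hypothetical agent node: leaves retrieve, inner nodes delegate to accepting children."""

    def __init__(
        self,
        name: str,
        accepts: Callable[[str], bool],
        retrieve: Optional[Callable[[str], List[str]]] = None,
        children: Optional[List["Agent"]] = None,
    ) -> None:
        self.name = name
        self.accepts = accepts
        self.retrieve = retrieve
        self.children = children or []

    def run(self, query: str) -> List[str]:
        if not self.children:
            return self.retrieve(query) if self.retrieve else []
        results: List[str] = []
        for child in self.children:
            if child.accepts(query):
                results.extend(child.run(query))
        return results

# Assumed three-tier layout: top agent -> domain agents -> leaf retrievers.
market = Agent("market-data", lambda q: "market" in q, retrieve=lambda q: ["market data: sector prices"])
news = Agent("news", lambda q: "trend" in q, retrieve=lambda q: ["news: policy updates"])
finance = Agent("finance", lambda q: "invest" in q, children=[market, news])
support = Agent("support", lambda q: "refund" in q, retrieve=lambda q: ["support: refund policy"])
top = Agent("top", lambda q: True, children=[finance, support])

query = "best investment options given current market trends in renewable energy"
evidence = top.run(query)
print(evidence)
```

Each extra tier adds one more delegation decision per query, which is a small-scale view of the coordination overhead listed under Challenges.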
Use Case: Financial Analysis System
用例:金融分析系统
Prompt: What are the best investment options given the current market trends in renewable energy?
提示:在当前可再生能源市场趋势下,最佳投资选择是什么?
System Process (Hierarchical Agentic Workflow):
系统流程(分层智能体工作流):
Response:
响应:
Integrated Response: “Based on current market data, renewable energy stocks have shown a 15% growth over the past quarter, driven by supportive government policies and heightened investor interest. Analysts suggest that wind and solar sectors, in particular, may experience continued momentum, while emerging technologies like green hydrogen present moderate risk but potentially high returns.”
综合回应:
“根据当前市场数据,受政府支持政策和投资者兴趣增加的推动,可再生能源股票在过去一个季度增长了15%。分析师指出,尤其是风能和太阳能行业可能会继续保持增长势头,而绿色氢等新兴技术虽然存在一定风险,但可能带来高回报。”
4.9 Agentic Corrective RAG
4.9 自主修正 RAG
Corrective RAG introduces mechanisms to self-correct retrieval results, enhancing document utilization and improving response generation quality, as demonstrated in Figure 19. By embedding intelligent agents into the workflow, Corrective RAG [31] [32] iteratively refines context documents and responses, reducing errors and improving relevance.
Corrective RAG:引入自我纠正检索结果的机制,提高文档利用率和响应生成质量,如图 19 所示。通过将 AI智能体嵌入工作流,Corrective RAG [31] [32] 确保上下文文档和响应的迭代优化,最大限度地减少错误并提高相关性。
Key Idea of Corrective RAG: The core principle of Corrective RAG lies in its ability to evaluate retrieved documents dynamically, perform corrective actions, and refine queries to enhance the quality of generated responses. Corrective RAG adjusts its approach as follows:
Corrective RAG 的核心思想:Corrective RAG 的核心原则在于其能够动态评估检索到的文档,执行纠正操作,并优化查询以提高生成响应的质量。Corrective RAG 的调整方式如下:

Figure 19: Overview of Agentic Corrective RAG
图 19: Agentic Corrective RAG 概述
• Dynamic Retrieval from External Sources: When context is insufficient, the External Knowledge Retrieval Agent performs web searches or accesses alternative data sources to supplement the retrieved documents.
• 动态检索外部资源:当上下文信息不足时,外部知识检索智能体 (External Knowledge Retrieval Agent) 会执行网络搜索或访问其他数据源来补充检索到的文档。
• Response Synthesis: All validated and refined information is passed to the Response Synthesis Agent for final response generation.
• 响应合成:所有经过验证和精炼的信息都会传递给响应合成智能体,以生成最终响应。
Workflow: The Corrective RAG system is built on five key agents:
工作流程:纠正性 RAG 系统建立在五个关键 AI 智能体之上:
Key Features and Advantages:
关键特性与优势:
Prompt: What are the latest findings in generative AI research?
提示:生成式 AI (Generative AI) 研究的最新发现有哪些?
System Process (Corrective RAG Workflow):
系统流程(修正RAG工作流):
1. Query Submission: A user submits the query to the system.
1. 查询提交:用户向系统提交查询。
2. Context Retrieval:
- 上下文检索:
3. Relevance Evaluation:
3. 相关性评估
• The Relevance Evaluation Agent assesses the documents for alignment with the query.
• Documents are classified into relevant, ambiguous, or irrelevant categories. Irrelevant documents are flagged for corrective actions.
• 相关性评估智能体评估文档与查询的匹配度。• 文档被分类为相关、模糊或不相关类别。不相关文档被标记以采取纠正措施。
4. Corrective Actions (if needed):
- 纠正措施(如需要):
• The Query Refinement Agent rewrites the query to improve specificity and relevance.
• The External Knowledge Retrieval Agent performs web searches to fetch additional papers and reports from external sources.
• 查询优化代理重写查询以提高其针对性和相关性。 • 外部知识检索代理执行网络搜索,以从外部来源获取额外的论文和报告。
5. Response Synthesis:
- 响应合成:
• The Response Synthesis Agent integrates validated documents into a coherent and comprehensive summary.
• 响应合成智能体将验证过的文档整合为连贯且全面的摘要。
Response:
响应:
Integrated Response: “Recent findings in generative AI highlight advancements in diffusion models, reinforcement learning for text-to-video tasks, and optimization techniques for large-scale model training. For more details, refer to studies published in NeurIPS 2024 and AAAI 2025.”
综合回应:“生成式 AI (Generative AI) 的最新研究进展突显了扩散模型、用于文本到视频任务的强化学习以及大规模模型训练的优化技术。更多详情请参考 NeurIPS 2024 和 AAAI 2025 上发表的研究。”
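The retrieve, grade, correct, and synthesize steps above can be sketched as one function. The word-overlap `grade` function is a hypothetical substitute for the Relevance Evaluation Agent, and the retriever, query rewriter, and web search are canned stubs; none of this reflects the scoring used in the cited Corrective RAG work.

```python
from typing import Callable, List

def grade(query: str, doc: str) -> str:
    """Hypothetical relevance grader based on word overlap; an LLM evaluator would replace this."""
    overlap = len(set(query.lower().split()) & set(doc.lower().split()))
    if overlap >= 2:
        return "relevant"
    return "ambiguous" if overlap == 1 else "irrelevant"

def corrective_rag(
    query: str,
    retrieve: Callable[[str], List[str]],
    rewrite: Callable[[str], str],
    web_search: Callable[[str], List[str]],
) -> List[str]:
    """Keep relevant documents; if any are not relevant, refine the query and search externally."""
    docs = retrieve(query)
    labels = [grade(query, doc) for doc in docs]
    validated = [doc for doc, label in zip(docs, labels) if label == "relevant"]
    if not docs or any(label != "relevant" for label in labels):
        refined = rewrite(query)
        validated += [doc for doc in web_search(refined) if grade(refined, doc) == "relevant"]
    return validated

# Canned stubs for illustration only.
def retrieve(query: str) -> List[str]:
    return ["diffusion models advance generative ai research", "cooking recipes for pasta"]

def rewrite(query: str) -> str:
    return query + " 2024 papers"

def web_search(query: str) -> List[str]:
    return ["text-to-video generative research papers"]

validated = corrective_rag("generative ai research findings", retrieve, rewrite, web_search)
print(" | ".join(validated))
```

The validated list is what would be handed to the Response Synthesis Agent; the irrelevant document never reaches the generator.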
4.10 Adaptive Agentic RAG
4.10 自适应智能体 RAG
Adaptive Retrieval-Augmented Generation (Adaptive RAG) [33] enhances the flexibility and efficiency of large language models (LLMs) by dynamically adjusting query handling strategies based on the complexity of the incoming query. Unlike static retrieval workflows, Adaptive RAG [34] employs a classifier to assess query complexity and determine the most appropriate approach, ranging from single-step retrieval to multi-step reasoning, or even bypassing retrieval altogether for straightforward queries as illustrated in Figure 20.
自适应检索增强生成 (Adaptive Retrieval-Augmented Generation, Adaptive RAG) [33] 通过动态调整查询处理策略,基于输入查询的复杂性,提升了大语言模型 (LLMs) 的灵活性和效率。与静态检索工作流不同,自适应检索增强生成 [34] 使用分类器评估查询复杂性,并确定最合适的方法,从单步检索到多步推理,甚至对于简单查询完全跳过检索,如图 20 所示。

Figure 20: An Overview of Adaptive Agentic RAG
图 20: 自适应 AI智能体 RAG 概览
Key Idea of Adaptive RAG: The core principle of Adaptive RAG lies in its ability to dynamically tailor retrieval strategies based on the complexity of the query. Adaptive RAG adjusts its approach as follows:
自适应 RAG 的核心思想
• Straightforward Queries: For fact-based questions that require no additional retrieval (e.g., "What is the boiling point of water?"), the system directly generates an answer using pre-existing knowledge.
• 简单查询:对于不需要额外检索的基于事实的问题(例如“水的沸点是多少?”),系统直接利用已有知识生成答案。
• Simple Queries: For moderately complex tasks requiring minimal context (e.g., "What is the status of my latest electricity bill?"), the system performs a single-step retrieval to fetch the relevant details.
• 简单查询:对于需要较少上下文的中等复杂度任务(例如,“我最近的电费账单状态如何?”),系统执行单步检索以获取相关详细信息。
• Complex Queries: For multi-layered queries requiring iterative reasoning (e.g., "How has the population of City X changed over the past decade, and what are the contributing factors?"), the system employs multi-step retrieval, progressively refining intermediate results to provide a comprehensive answer.
• 复杂查询:对于需要迭代推理的多层次查询(例如,“X市过去十年的人口变化如何,影响因素有哪些?”),系统采用多步检索,逐步优化中间结果以提供全面答案。
Workflow: The Adaptive RAG system is built on three primary components:
工作流:自适应RAG系统建立在三个主要组件之上:
1. Classifier Role:
- 分类器角色:
2. Dynamic Strategy Selection:
2. 动态策略选择:
• For straightforward queries, the system avoids unnecessary retrieval, directly leveraging the LLM for response generation.
• For simple queries, it employs a single-step retrieval process to fetch relevant context.
• For complex queries, it activates multi-step retrieval to ensure iterative refinement and enhanced reasoning.
• 对于简单的查询,系统避免不必要的检索,直接利用大语言模型生成响应。
• 对于简单的查询,它采用单步检索过程来获取相关上下文。
• 对于复杂的查询,它启动多步检索以确保迭代优化和增强推理。
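The three strategies can be sketched as a dispatch on the classifier's label. The keyword heuristic in `classify_complexity` is a hypothetical placeholder for the classifier described above, and `stub_llm` and `stub_retrieve` are stand-ins that only make the control flow observable.

```python
from typing import Callable, List

def classify_complexity(query: str) -> str:
    """Hypothetical heuristic; the section describes a classifier, not these keyword rules."""
    text = query.lower()
    if " and " in text or "why" in text or "how has" in text:
        return "complex"
    if "my " in text:
        return "simple"
    return "straightforward"

def adaptive_answer(
    query: str,
    llm: Callable[[str, List[str]], str],
    retrieve: Callable[[str], List[str]],
    max_hops: int = 2,
) -> str:
    level = classify_complexity(query)
    if level == "straightforward":
        return llm(query, [])               # no retrieval
    if level == "simple":
        return llm(query, retrieve(query))  # single-step retrieval
    context: List[str] = []
    follow_up = query
    for _ in range(max_hops):               # multi-step retrieval
        new_docs = retrieve(follow_up)
        context.extend(new_docs)
        follow_up = query + " " + " ".join(new_docs)
    return llm(query, context)

# Stubs that expose how much context each path used.
def stub_llm(query: str, context: List[str]) -> str:
    return f"{len(context)} docs used"

def stub_retrieve(query: str) -> List[str]:
    return [f"doc for: {query[:20]}"]

print(adaptive_answer("What is the boiling point of water?", stub_llm, stub_retrieve))
```

With these stubs, the three example queries from this section take the zero-document, one-document, and two-document paths respectively, showing that retrieval cost grows only when the query warrants it.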
3. LLM Integration:
3. 大语言模型集成:
Key Features and Advantages
关键特性与优势
• Dynamic Adaptability: Adjusts retrieval strategies based on query complexity, optimizing both computational efficiency and response accuracy.
• 动态适应性:根据查询复杂度调整检索策略,优化计算效率和响应准确性。
• Resource Efficiency: Minimizes unnecessary overhead for simple queries while ensuring thorough processing for complex ones.
• 资源效率:在确保复杂查询得到彻底处理的同时,最大限度地减少简单查询的不必要开销。
• Enhanced Accuracy: Iterative refinement ensures that complex queries are resolved with high precision.
• 提升准确性:迭代优化确保复杂查询得到高精度解决。
• Flexibility: Can be extended to incorporate additional pathways, such as domain-specific tools or external APIs.
• 灵活性:可以扩展以整合额外的路径,例如特定领域的工具或外部 API。
Prompt: Why is my package delayed, and what alternatives do I have?
提示:为什么我的包裹延迟了,我有哪些替代方案?
System Process (Adaptive RAG Workflow):
系统进程(自适应RAG工作流):
3. Multi-Step Retrieval:
3. 多步检索:
4. Response Synthesis:
- 响应合成:
• The LLM integrates all retrieved information, synthesizing a comprehensive and actionable response.
• 大语言模型整合所有检索到的信息,生成全面且可操作的响应。
Response:
响应:
Integrated Response: “Your package is delayed due to severe weather conditions in your region. It is currently at the local distribution center and will be delivered in 2 days. Alternatively, you may opt for a local pickup from the facility.”
综合回复:“由于您所在地区的恶劣天气条件,您的包裹被延迟。目前包裹在本地配送中心,将在2天内送达。您也可以选择从该设施自取。”
4.11 Graph-Based Agentic RAG
4.11 基于图的智能体RAG
4.11.1 Agent-G: Agentic Framework for Graph RAG
4.11.1 Agent-G: 图 RAG 的智能体框架
Agent-G [8] introduces a novel agentic architecture that integrates graph knowledge bases with unstructured document retrieval. By combining structured and unstructured data sources, this framework enhances retrieval-augmented generation (RAG) systems with improved reasoning and retrieval accuracy. It employs modular retriever banks, dynamic agent interaction, and feedback loops to ensure high-quality outputs, as shown in Figure 21.
Agent-G [8]: 引入了一种新颖的智能体架构,将图知识库与非结构化文档检索相结合。通过整合结构化和非结构化数据源,该框架增强了检索增强生成 (RAG) 系统的推理和检索准确性。它采用了模块化检索器库、动态智能体交互和反馈循环,以确保高质量的输出,如图 21 所示。

Figure 21: An Overview of Agent-G: Agentic Framework for Graph RAG [8]
图 21: Agent-G 概览:用于图 RAG 的智能体框架 [8]
Key Idea of Agent-G: The core principle of Agent-G lies in its ability to dynamically assign retrieval tasks to specialized agents, leveraging both graph knowledge bases and textual documents. Agent-G adjusts its retrieval strategy as follows:
Agent-G 的核心思想
Workflow: The Agent-G system is built on four primary components:
工作流程:Agent-G 系统建立在四个主要组件上:
1. Retriever Bank:
- Retriever Bank:
2. Critic Module:
2. 批判模块:
3. Dynamic Agent Interaction:
3. 动态智能体交互:
4. LLM Integration:
4. 大语言模型集成:
Key Features and Advantages
关键特性和优势
Prompt: What are the common symptoms of Type 2 Diabetes, and how are they related to heart disease?
提示:2型糖尿病的常见症状有哪些?它们与心脏病有何关联?
System Process (Agent-G Workflow):
系统进程(Agent-G 工作流):
1. Query Reception and Assignment: The system receives the query and identifies the need for both graph-structured and unstructured data to answer the question comprehensively.
1. 查询接收与分配:系统接收查询,并识别出需要图结构数据和非结构化数据来全面回答问题。
2. Graph Retriever:
2. Graph Retriever:
3. Document Retriever:
- 文档检索器:
4. Critic Module:
4. 评论模块:
5. Response Synthesis: The LLM integrates validated data from the Graph Retriever and Document Retriever into a coherent response, ensuring alignment with the query’s intent.
响应合成:大语言模型将从图检索器和文档检索器中验证的数据整合成一个连贯的响应,确保与查询意图一致。
Response:
响应:
Integrated Response: “Type 2 Diabetes symptoms include increased thirst, frequent urination, and fatigue. Studies show a 50% correlation between diabetes and heart disease, primarily through shared risk factors such as obesity and high blood pressure.”
综合回应:“2型糖尿病 (Type 2 Diabetes) 的症状包括口渴、尿频和疲劳。研究表明,糖尿病与心脏病之间存在50%的相关性,主要是通过共同的危险因素,如肥胖和高血压。”
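The interplay of a graph retriever, a document retriever, and a critic can be sketched on toy data. This is not Agent-G's implementation: the tiny graph, the two documents, and the term-coverage score used by `critic` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge graph: (subject, relation) -> objects.
GRAPH: Dict[Tuple[str, str], List[str]] = {
    ("type 2 diabetes", "has_symptom"): ["increased thirst", "frequent urination", "fatigue"],
    ("type 2 diabetes", "shares_risk_factor_with"): ["heart disease"],
}

# Hypothetical unstructured documents.
DOCS: List[str] = [
    "Obesity and high blood pressure are risk factors for both diabetes and heart disease.",
    "Seasonal allergies cause sneezing.",
]

def graph_retriever(entity: str) -> List[str]:
    """Return graph facts about the entity as plain-text statements."""
    return [
        f"{subj} {rel} {obj}"
        for (subj, rel), objs in GRAPH.items()
        if subj == entity
        for obj in objs
    ]

def document_retriever() -> List[str]:
    """Naive retriever that returns every document and leaves filtering to the critic."""
    return list(DOCS)

def critic(items: List[str], query_terms: List[str], threshold: float = 0.5) -> List[str]:
    """Keep items that mention at least `threshold` of the query terms."""
    kept = []
    for item in items:
        text = item.lower()
        score = sum(term in text for term in query_terms) / len(query_terms)
        if score >= threshold:
            kept.append(item)
    return kept

terms = ["diabetes", "heart disease"]
candidates = graph_retriever("type 2 diabetes") + document_retriever()
validated = critic(candidates, terms)
print(len(validated))
```

Only the validated items would be passed to the LLM for synthesis; in a fuller system the critic's rejections could also trigger another retrieval round, which corresponds to the feedback loop mentioned above.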
4.11.2 GeAR: Graph-Enhanced Agent for Retrieval-Augmented Generation
4.11.2 GeAR: 图增强的检索增强生成智能体
GeAR [35] introduces an agentic framework that enhances traditional Retrieval-Augmented Generation (RAG) systems by incorporating graph-based retrieval mechanisms. By leveraging graph expansion techniques and an agent-based architecture, GeAR addresses challenges in multi-hop retrieval scenarios, improving the system’s ability to handle complex queries, as shown in Figure 22.
GeAR [35]: 引入了一个基于图检索机制的智能体框架,增强了传统的检索增强生成 (Retrieval-Augmented Generation, RAG) 系统。通过利用图扩展技术和基于智能体的架构,GeAR 解决了多跳检索场景中的挑战,提升了系统处理复杂查询的能力,如图 22 所示。
Key Idea of GeAR: GeAR advances RAG performance through two primary innovations:
GeAR的关键思想 GeAR通过两项主要创新提升RAG性能:
• Graph Expansion: Enhances conventional base retrievers (e.g., BM25) by expanding the retrieval process to include graph-structured data, enabling the system to capture complex relationships and dependencies between entities.
• Agent Framework: Incorporates an agent-based architecture that utilizes graph expansion to manage retrieval tasks more effectively, allowing for dynamic and autonomous decision-making in the retrieval process.
• 图扩展:通过将检索过程扩展到图结构数据,增强传统的基础检索器(例如 BM25),使系统能够捕捉实体之间的复杂关系和依赖。
• 智能体框架:采用基于智能体的架构,利用图扩展更有效地管理检索任务,实现检索过程中的动态和自主决策。
Workflow: The GeAR system operates through the following components:
工作流程:GeAR 系统通过以下组件运行:
1. Graph Expansion Module:
1. 图扩展模块:
2. Agent-Based Retrieval:
基于智能体的检索:
3. LLM Integration:
3. 大语言模型 (LLM) 集成:
• Combines the retrieved information, enriched by graph expansion, with the capabilities of a Large Language Model (LLM) to generate coherent and contextually relevant responses.
• The integration ensures that the generative process is informed by both unstructured documents and structured graph data.
• 结合通过图扩展增强的检索信息与大语言模型 (LLM) 的能力,生成连贯且上下文相关的响应。• 这种集成确保生成过程同时受到非结构化文档和结构化图数据的支持。

Figure 22: An Overview of GeAR: Graph-Enhanced Agent for Retrieval-Augmented Generation [35]
图 22: GeAR 概览:图增强的检索增强生成 AI 智能体 [35]
Key Features and Advantages
关键特性与优势
Prompt: Which author influenced the mentor of J.K. Rowling?
提示:哪位作者影响了J.K. Rowling的导师?
System Process (GeAR Workflow):
系统进程 (GeAR 工作流):
1. Top-Tier Agent: Evaluates the query’s multi-hop nature and determines that a combination of graph expansion and document retrieval is necessary to answer the question.
顶级智能体:评估查询的多跳性质,并确定需要结合图扩展和文档检索来回答问题。
2. Graph Expansion Module:
2. 图扩展模块:
3. Agent-Based Retrieval:
3. 基于智能体的检索:
4. Response Synthesis: Combines insights from the graph and document retrieval processes using the LLM to generate a response that accurately reflects the complex relationships in the query.
4. 响应合成:结合图检索和文档检索过程中的洞察,利用大语言模型生成准确反映查询中复杂关系的响应。
Response:
响应:
Integrated Response: “J.K. Rowling’s mentor, [Mentor Name], was heavily influenced by [Author Name], known for their [notable works or genre]. This connection highlights the layered relationships in literary history, where influential ideas often pass through multiple generations of authors.”
综合回应:“J.K. Rowling 的导师 [Mentor Name] 深受 [Author Name] 的影响,后者以其 [著名作品或流派] 而闻名。这种联系凸显了文学历史中多层次的关系,其中具有影响力的思想往往通过多代作家传承。”
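A multi-hop question of this shape ("who influenced the mentor of X") can be sketched as expansion over a set of triples followed by a relation-path lookup. The breadth-first expansion below is a simplification and should not be read as GeAR's actual graph expansion procedure; the entities are deliberately anonymous placeholders, mirroring the placeholders in the response above.

```python
from collections import deque
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples with placeholder entities.
TRIPLES: List[Triple] = [
    ("person_a", "mentored_by", "person_b"),
    ("person_b", "influenced_by", "person_c"),
    ("person_c", "wrote", "book_x"),
    ("person_d", "wrote", "book_y"),
]

def expand(seeds: List[str], triples: List[Triple], max_hops: int = 2) -> List[Triple]:
    """Breadth-first expansion: collect triples reachable from the seeds within max_hops."""
    frontier = deque((seed, 0) for seed in seeds)
    seen = set(seeds)
    collected: List[Triple] = []
    while frontier:
        entity, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for subj, rel, obj in triples:
            if subj == entity:
                collected.append((subj, rel, obj))
                if obj not in seen:
                    seen.add(obj)
                    frontier.append((obj, depth + 1))
    return collected

def follow_path(start: str, relations: List[str], triples: List[Triple]) -> Optional[str]:
    """Follow a chain of relations from the start entity; return None if a hop is missing."""
    current = start
    for rel in relations:
        nxt = next((o for s, r, o in triples if s == current and r == rel), None)
        if nxt is None:
            return None
        current = nxt
    return current

subgraph = expand(["person_a"], TRIPLES, max_hops=2)
answer = follow_path("person_a", ["mentored_by", "influenced_by"], subgraph)
print(answer)
```

A base retriever such as BM25 would supply the seed entities; the expansion then brings in the second-hop fact that a purely lexical match on the query would be unlikely to surface.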
4.12 Agentic Document Workflows in Agentic RAG
4.12 Agentic RAG 中的智能文档工作流
Agentic Document Workflows (ADW) [36] extend traditional Retrieval-Augmented Generation (RAG) paradigms by enabling end-to-end knowledge work automation. These workflows orchestrate complex document-centric processes, integrating document parsing, retrieval, reasoning, and structured outputs with intelligent agents (see Figure 23). ADW systems address limitations of Intelligent Document Processing (IDP) and RAG by maintaining state, coordinating multi-step workflows, and applying domain-specific logic to documents.
Agentic Document Workflows (ADW) [36] 通过实现端到端的知识工作自动化,扩展了传统的检索增强生成 (Retrieval-Augmented Generation, RAG) 范式。这些工作流程协调了以文档为中心的复杂过程,将文档解析、检索、推理和结构化输出与智能体集成在一起(见图 23)。ADW 系统通过保持状态、协调多步骤工作流程以及将特定领域的逻辑应用于文档,解决了智能文档处理 (Intelligent Document Processing, IDP) 和 RAG 的局限性。
Workflow
工作流
1. Document Parsing and Information Structuring:
1. 文档解析与信息结构化
• Documents are parsed using enterprise-grade tools (e.g., LlamaParse) to extract relevant data fields such as invoice numbers, dates, vendor information, line items, and payment terms.
• Structured data is organized for downstream processing.
• 使用企业级工具(如 LlamaParse)解析文档,提取相关数据字段,如发票编号、日期、供应商信息、行项目和付款条款。• 结构化数据被组织起来用于下游处理。
2. State Maintenance Across Processes:
2. 跨进程的状态维护:
3. Knowledge Retrieval:
- 知识检索:
4. Agentic Orchestration:
4. Agentic Orchestration:
• Intelligent agents apply business rules, perform multi-hop reasoning, and generate actionable recommendations.
• Agents orchestrate components such as parsers, retrievers, and external APIs for seamless integration.
• 智能体应用业务规则,执行多跳推理,并生成可操作的建议。• 协调解析器、检索器和外部API等组件,实现无缝集成。
5. Actionable Output Generation:
5. 可操作输出生成

Figure 23: An Overview of Agentic Document Workflows (ADW) [36]
图 23: AI智能体文档工作流 (Agentic Document Workflows, ADW) 概述 [36]
Use Case: Invoice Payments Workflow
用例:发票支付工作流
Prompt: Generate a payment recommendation report based on the submitted invoice and associated vendor contract terms.
提示:根据提交的发票和相关的供应商合同条款生成付款建议报告。
System Process (ADW Workflow):
系统进程 (ADW 工作流):
Response: Integrated Response: "Invoice INV-2025-045 for $15,000.00 has been processed. An early payment discount of 2% is available if paid by 2025-04-10, reducing the amount due to $14,700.00. A bulk order discount of 5% was applied as the subtotal exceeded $10,000.00. It is recommended to approve early payment to save 2% and ensure timely fund allocation for upcoming project phases."
响应:综合响应:"发票 INV-2025-045 的金额为 $15,000.00 已处理。如果在 2025-04-10 之前支付,可享受 2% 的早付折扣,应付金额将减少至 $14,700.00。由于小计超过 $10,000.00,已应用 5% 的批量订单折扣。建议批准早付以节省 2%,并确保为即将到来的项目阶段及时分配资金。"
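The early-payment figures in the response above follow from a simple rule: 2% off $15,000.00 is $300.00, leaving $14,700.00. A minimal sketch of that business rule, using `Decimal` to avoid floating-point rounding on currency, is shown below; the function name and the example payment date are assumptions, not part of the ADW description.

```python
from datetime import date
from decimal import Decimal
from typing import Dict

def payment_recommendation(
    total: Decimal, discount_rate: Decimal, deadline: date, pay_date: date
) -> Dict[str, Decimal]:
    """Apply the early-payment discount only if payment happens on or before the deadline."""
    cent = Decimal("0.01")
    if pay_date <= deadline:
        due = (total * (Decimal("1") - discount_rate)).quantize(cent)
    else:
        due = total.quantize(cent)
    return {"amount_due": due, "savings": total - due}

# Invoice values taken from the example response; the payment date is hypothetical.
rec = payment_recommendation(
    total=Decimal("15000.00"),
    discount_rate=Decimal("0.02"),
    deadline=date(2025, 4, 10),
    pay_date=date(2025, 4, 8),
)
print(rec["amount_due"], rec["savings"])
```

In an ADW pipeline, the total, rate, and deadline would come from the parsed invoice and the retrieved contract terms rather than being hard-coded.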
Key Features and Advantages
关键特性与优势
4.13 Comparative Analysis of Agentic RAG Frameworks
4.13 AI智能体 RAG 框架对比分析
Table 2 provides a comprehensive comparative analysis of the three architectural frameworks: Traditional RAG, Agentic RAG, and Agentic Document Workflows (ADW). This analysis highlights their respective strengths, weaknesses, and best-fit scenarios, offering valuable insights into their applicability across diverse use cases.
表 2: 对三种架构框架(传统 RAG、Agentic RAG 和 Agentic Document Workflows (ADW))进行了全面的比较分析。该分析突出了它们各自的优势、劣势以及最适合的场景,为它们在不同用例中的适用性提供了宝贵的见解。
Table 2: Comparative Analysis: Traditional RAG vs Agentic RAG vs Agentic Document Workflows (ADW)
| Feature | Traditional RAG | Agentic RAG | Agentic Document Workflows (ADW) |
|---|---|---|---|
| Focus | Isolated retrieval and generation tasks | Multi-agent collaboration and reasoning | Document-centric end-to-end workflows |
| Context Maintenance | Limited | Enabled through memory modules | Maintains state across multi-step workflows |
| Dynamic Adaptability | Minimal | High | Tailored to document workflows |
| Workflow Orchestration | Absent | Orchestrates multi-agent tasks | Integrates multi-step document processing |
| Use of External Tools/APIs | Basic integration (e.g., retrieval tools) | Extends via tools like APIs and knowledge bases | Deeply integrates business rules and domain-specific tools |
| Scalability | Limited to small datasets or queries | Scalable for multi-agent systems | Scales for multi-domain enterprise workflows |
| Complex Reasoning | Basic (e.g., simple Q&A) | Multi-step reasoning with agents | Structured reasoning across documents |
| Primary Applications | QA systems, knowledge retrieval | Multi-domain knowledge and reasoning | Contract review, invoice processing, claims analysis |
| Strengths | Simplicity, quick setup | High accuracy, collaborative reasoning | End-to-end automation, domain-specific intelligence |
| Challenges | Poor contextual understanding | Coordination complexity | Resource overhead, domain standardization |
表 2: 对比分析:传统 RAG vs 智能体 RAG vs 智能体文档工作流 (ADW)
| 特性 | 传统 RAG | 智能体 RAG | 智能体文档工作流 (ADW) |
|---|---|---|---|
| 焦点 | 孤立的检索与生成任务 | 多智能体协作与推理 | 以文档为中心的端到端工作流 |
| 上下文维护 | 有限 | 通过记忆模块实现 | 在多步工作流中保持状态 |
| 动态适应性 | 最小 | 高 | 针对文档工作流定制 |
| 工作流编排 | 无 | 编排多智能体任务 | 集成多步文档处理 |
| 外部工具/API 的使用 | 基本集成(如检索工具) | 通过 API 和知识库等工具扩展 | 深度集成业务规则和领域特定工具 |
| 可扩展性 | 限于小型数据集或查询 | 适用于多智能体系统 | 可扩展至多领域企业工作流 |
| 复杂推理 | 基础(如简单问答) | 多步推理与智能体 | 跨文档的结构化推理 |
| 主要应用 | 问答系统、知识检索 | 多领域知识与推理 | 合同审查、发票处理、索赔分析 |
| 优势 | 简单性、快速设置 | 高准确性、协作推理 | 端到端自动化、领域特定智能 |
| 挑战 | 上下文理解差 | 协调复杂性 | 资源开销、领域标准化 |
The comparative analysis underscores the evolutionary trajectory from Traditional RAG to Agentic RAG and further to Agentic Document Workflows (ADW). While Traditional RAG offers simplicity and ease of deployment for basic tasks, Agentic RAG introduces enhanced reasoning and scalability through multi-agent collaboration. ADW builds upon these advancements by providing robust, document-centric workflows that facilitate end-to-end automation and integration with domain-specific processes. Understanding the strengths and limitations of each framework is crucial for selecting the most appropriate architecture to meet specific application requirements and operational demands.
对比分析强调了从传统 RAG 到 Agentic RAG 再到 Agentic Document Workflows (ADW) 的演进轨迹。传统 RAG 为基本任务提供了简单易部署的方案,而 Agentic RAG 通过多智能体协作引入了更强的推理和扩展能力。ADW 在这些进步的基础上,提供了以文档为中心的稳健工作流,促进了端到端的自动化以及与领域特定流程的集成。了解每个框架的优势和局限性对于选择最适合的架构以满足特定应用需求和操作要求至关重要。
5 Applications of Agentic RAG
Agentic Retrieval-Augmented Generation (RAG) systems have demonstrated transformative potential across a variety of domains. By combining real-time data retrieval, generative capabilities, and autonomous decision-making, these systems address complex, dynamic, and multi-modal challenges. This section explores the key applications of Agentic RAG, providing detailed insights into how these systems are shaping industries such as customer support, healthcare, finance, education, legal workflows, and creative industries.
5.1 Customer Support and Virtual Assistants
Agentic RAG systems are revolutionizing customer support by enabling real-time, context-aware query resolution. Traditional chatbots and virtual assistants often rely on static knowledge bases, leading to generic or outdated responses. By contrast, Agentic RAG systems dynamically retrieve the most relevant information, adapt to the user’s context, and generate personalized responses.
Use Case: Twitch Ad Sales Enhancement [37]
For instance, Twitch leveraged an agentic workflow with RAG on Amazon Bedrock to streamline ad sales. The system dynamically retrieved advertiser data, historical campaign performance, and audience demographics to generate detailed ad proposals, significantly boosting operational efficiency.
Key Benefits:
• Improved Response Quality: Personalized and context-aware replies enhance user engagement.
• Operational Efficiency: Reduces the workload on human support agents by automating complex queries.
• Real-Time Adaptability: Dynamically integrates evolving data, such as live service outages or pricing updates.
5.2 Healthcare and Personalized Medicine
In healthcare, the integration of patient-specific data with the latest medical research is critical for informed decision-making. Agentic RAG systems enable this by retrieving real-time clinical guidelines, medical literature, and patient history to assist clinicians in diagnostics and treatment planning.
Use Case: Patient Case Summary [38]
Agentic RAG systems have been applied in generating patient case summaries. For example, by integrating electronic health records (EHR) and up-to-date medical literature, the system generates comprehensive summaries for clinicians to make faster and more informed decisions.
Key Benefits:
5.3 Legal and Contract Analysis
Agentic RAG systems are redefining how legal workflows are conducted, offering tools for rapid document analysis and decision-making.
Use Case: Contract Review [39]
A legal agentic RAG system can analyze contracts, extract critical clauses, and identify potential risks. By combining semantic search capabilities with legal knowledge graphs, it automates the tedious process of contract review, ensuring compliance and mitigating risks.
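The contract-review flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the clause prototypes, the toy risk graph, and the `review_contract` helper are hypothetical, and a bag-of-words cosine similarity stands in for the embedding-based semantic search a real system would likely use.

```python
import math
import re
from collections import Counter

# Hypothetical toy "legal knowledge graph": clause type -> linked risk notes.
RISK_GRAPH = {
    "termination": ["short notice period", "termination without cause"],
    "liability": ["uncapped liability"],
    "confidentiality": ["no expiry on confidentiality duty"],
}

# Hypothetical prototype texts used to label clauses by similarity.
CLAUSE_PROTOTYPES = {
    "termination": "either party may terminate this agreement with notice",
    "liability": "liability for damages shall be limited or unlimited",
    "confidentiality": "the parties shall keep information confidential",
}


def bow(text):
    """Bag-of-words vector; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def review_contract(clauses, threshold=0.2):
    """Label each clause with its closest prototype and attach graph risks."""
    findings = []
    for clause in clauses:
        vec = bow(clause)
        label, score = max(
            ((name, cosine(vec, bow(proto))) for name, proto in CLAUSE_PROTOTYPES.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:  # skip clauses that match no known type
            findings.append({"clause": clause, "type": label, "risks": RISK_GRAPH[label]})
    return findings


contract = [
    "Either party may terminate this agreement with thirty days notice.",
    "Payment is due within 45 days of invoice.",
]
findings = review_contract(contract)
```

In this sketch only the first clause is flagged, because the payment clause shares no vocabulary with any prototype. An agentic system would typically pass the flagged clauses and their linked risks to a generator for a reviewer-facing summary.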
Key Benefits:
5.4 Finance and Risk Analysis
Agentic RAG systems are transforming the finance industry by providing real-time insights for investment decisions, market analysis, and risk management. These systems integrate live data streams, historical trends, and predictive modeling to generate actionable outputs.
Use Case: Auto Insurance Claims Processing [40]
In auto insurance, Agentic RAG can automate claim processing. For example, by retrieving policy details and combining them with accident data, it generates claim recommendations while ensuring compliance with regulatory requirements.
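As a rough illustration of the claim flow above, the sketch below retrieves a policy, combines it with accident data, and produces a recommendation. The `Policy` fields, the in-memory `POLICIES` store, and the `MANUAL_REVIEW_THRESHOLD` compliance rule are all hypothetical, and simple rules stand in for the retrieval backend and the generative step.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    policy_id: str
    coverage_limit: float
    deductible: float
    covers_collision: bool


# Hypothetical in-memory store standing in for a policy retrieval backend.
POLICIES = {"P-100": Policy("P-100", 20000.0, 500.0, True)}

# Assumed compliance rule: large payouts must be routed to a human adjuster.
MANUAL_REVIEW_THRESHOLD = 10000.0


def recommend_claim(policy_id, accident):
    """Combine retrieved policy details with accident data into a recommendation."""
    policy = POLICIES.get(policy_id)  # retrieval step
    if policy is None:
        return {"decision": "reject", "reason": "policy not found", "payout": 0.0}
    if accident["type"] == "collision" and not policy.covers_collision:
        return {"decision": "reject", "reason": "collision not covered", "payout": 0.0}

    # Payout is capped at the coverage limit, then reduced by the deductible.
    covered = min(accident["damage_estimate"], policy.coverage_limit)
    payout = max(0.0, covered - policy.deductible)

    # Compliance check before any automatic approval.
    if payout > MANUAL_REVIEW_THRESHOLD:
        return {"decision": "manual_review", "reason": "payout above threshold", "payout": payout}
    return {"decision": "approve", "reason": "within policy terms", "payout": payout}


small = recommend_claim("P-100", {"type": "collision", "damage_estimate": 3000.0})
large = recommend_claim("P-100", {"type": "collision", "damage_estimate": 50000.0})
```

Placing the compliance check before the approval branch means the automated path can never approve a payout that the assumed rule reserves for human review.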
Key Benefits:
5.5 Education and Personalized Learning
Education is another domain where Agentic RAG systems are making significant strides. These systems enable adaptive learning by generating explanations, study materials, and feedback tailored to the learner’s progress and preferences.
Use Case: Research Paper Generation [41]
In higher education, Agentic RAG has been used to assist researchers by synthesizing key findings from multiple sources. For instance, a researcher querying, “What are the latest advancements in quantum computing?” receives a concise summary enriched with references, enhancing the quality and efficiency of their work.
Key Benefits:
5.6 Graph-Enhanced Applications in Multimodal Workflows
Graph-Enhanced Agentic RAG (GEAR) combines graph structures with retrieval mechanisms, making it particularly effective in multimodal workflows where interconnected data sources are essential.
Use Case: Market Survey Generation
GEAR enables the synthesis of text, images, and videos for marketing campaigns. For example, querying, “What are the emerging trends in eco-friendly products?” generates a detailed report enriched with customer preferences, competitor analysis, and multimedia content.
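One way to picture how graph structure can widen retrieval is the sketch below: a keyword match selects seed documents, and a breadth-first expansion pulls in linked neighbors. This is not the GEAR algorithm itself. The documents, the edge list, and both helper functions are hypothetical, and keyword overlap stands in for a real retriever.

```python
# Hypothetical corpus: document id -> text (could describe text, image, or video assets).
DOCS = {
    "d1": "eco-friendly packaging trends among consumers",
    "d2": "competitor analysis for biodegradable packaging",
    "d3": "video campaign assets for reusable bottles",
    "d4": "quarterly payroll report",
}

# Hypothetical graph linking related documents.
EDGES = {"d1": ["d2"], "d2": ["d1", "d3"], "d3": ["d2"], "d4": []}


def seed_hits(query, docs):
    """Return ids of documents sharing at least one term with the query."""
    terms = set(query.lower().split())
    return [doc_id for doc_id, text in docs.items() if terms & set(text.lower().split())]


def graph_expand(seeds, edges, hops=1):
    """Breadth-first expansion of the seed set, preserving discovery order."""
    seen = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in edges.get(node, []):
                if neighbor not in seen:
                    seen.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


seeds = seed_hits("eco-friendly trends", DOCS)
one_hop = graph_expand(seeds, EDGES, hops=1)
two_hop = graph_expand(seeds, EDGES, hops=2)
```

Here the query matches only `d1` directly, yet the expansion reaches the competitor analysis and, at two hops, the video assets, while the unrelated payroll report is never retrieved.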
Key Benefits:
The applications of Agentic RAG systems span a wide range of industries, showcasing their versatility and transformative potential. From personalized customer support to adaptive education and graph-enhanced multimodal workflows, these systems address complex, dynamic, and knowledge-intensive challenges. By integrating retrieval, generation, and agentic intelligence, Agentic RAG systems are paving the way for next-generation AI applications.
6 Tools and Frameworks for Agentic RAG
Agentic Retrieval-Augmented Generation (RAG) systems represent a significant evolution in combining retrieval, generation, and agentic intelligence. These systems extend the capabilities of traditional RAG by integrating decision-making, query reformulation, and adaptive workflows. The following tools and frameworks provide robust support for developing Agentic RAG systems, addressing the complex requirements of real-world applications.
Key Tools and Frameworks:
7 Benchmarks and Datasets
Current benchmarks and datasets provide valuable insights into evaluating Retrieval-Augmented Generation (RAG) systems, including those with agentic and graph-based enhancements. While some are explicitly designed for RAG, others are adapted to test retrieval, reasoning, and generation capabilities in diverse scenarios. Datasets are crucial for testing the retrieval, reasoning, and generation components of RAG systems. Table 3 lists key datasets, grouped by downstream task, for RAG evaluation.
Benchmarks play a critical role in standardizing the evaluation of RAG systems by providing structured tasks and metrics. The following benchmarks are particularly relevant:
• FlashRAG Toolkit: Implements 12 RAG methods and includes 32 benchmark datasets to support efficient and standardized RAG evaluation [63].
• GNN-RAG: This benchmark evaluates graph-based RAG systems on tasks like node-level and edge-level predictions, focusing on retrieval quality and reasoning performance in Knowledge Graph Question Answering (KGQA) [64].
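To make the notion of standardized metrics concrete, the following sketch shows three measures commonly reported on QA-style RAG datasets: exact match, token-level F1, and retrieval recall@k. It is not taken from FlashRAG or GNN-RAG. The function names are hypothetical, and the normalization (lowercasing, stripping punctuation and English articles) follows a widely used SQuAD-style convention that individual benchmarks may vary.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    if not relevant_ids:
        return 0.0
    return len(set(retrieved_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


em = exact_match("The Eiffel Tower.", "eiffel tower")
f1 = token_f1("Paris France", "Paris")
r2 = recall_at_k(["a", "b", "c"], ["b", "z"], k=2)
```

Exact match and F1 score the generation side, while recall@k isolates the retriever, which is useful when a wrong answer needs to be attributed to one component or the other.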
Table 3: Downstream Tasks and Datasets for RAG Evaluation (Adapted from [20])
| Category | Task Type | Datasets and References |
|---|---|---|
| QA | Single-hop QA | Natural Questions (NQ) [65], TriviaQA [66], SQuAD [67], Web Questions (WebQ) [68], PopQA [69], MS MARCO [56] |
| QA | Multi-hop QA | HotpotQA [60], 2WikiMultiHopQA [59], MuSiQue [58] |
| QA | Long-form QA | ELI5 [70], NarrativeQA (NQA) [71], ASQA [72], QMSum [73] |
| QA | Domain-specific QA | Qasper [74], COVID-QA [75], CMB/MMCU Medical [76] |
| QA | Multi-choice QA | QuALITY [77], ARC (No reference available), CommonsenseQA [78] |
| Graph-based QA | Graph QA | GraphQA [79] |
| Graph-based QA | Event Argument Extraction | WikiEvent [80], RAMS [81] |
| Dialog | Open-domain Dialog | Wizard of Wikipedia (WoW) [82] |
| Dialog | Personalized Dialog | KBP [83], DuleMon [84] |
| Dialog | Task-oriented Dialog | CamRest [85] |
| Recommendation | Personalized Content | Amazon Datasets (Toys, Sports, Beauty) [86] |
| Reasoning | Commonsense Reasoning | HellaSwag [87], CommonsenseQA [78] |
| Reasoning | CoT Reasoning | CoT Reasoning [88] |
| Reasoning | Complex Reasoning | CSQA [89] |
| Others | Language Understanding | MMLU (No reference available), WikiText-103 [65] |
| Others | Fact Checking/Verification | FEVER [90], PubHealth [91] |
| Others | Strategy QA | StrategyQA [92] |
| Summarization | Text Summarization | WikiASP [93], XSum [94] |
| Summarization | Long-form Summarization | NarrativeQA (NQA) [71], QMSum [73] |
| Text Generation | Biography | Biography Dataset (No reference available) |
| Text Classification | Sentiment Analysis | SST-2 [95] |
| Text Classification | General Classification | VioLens [96], TREC [57] |
| Code Search | Programming Search | CodeSearchNet [97] |
| Robustness | Retrieval Robustness | NoMIRACL [98] |
| Robustness | Language Modeling Robustness | WikiText-103 [99] |
| Math | Math Reasoning | GSM8K [100] |
| Machine Translation | Translation Tasks | JRC-Acquis [101] |
8 Conclusion
Agentic Retrieval-Augmented Generation (RAG) represents a transformative advancement in artificial intelligence, addressing the limitations of traditional RAG systems through the integration of autonomous agents. By leveraging agentic intelligence, these systems introduce capabilities such as dynamic decision-making, iterative reasoning, and collaborative workflows, enabling them to tackle complex, real-world tasks with enhanced precision and adaptability.
This survey explored the evolution of RAG systems, from their initial implementations to advanced paradigms like Modular RAG, highlighting the contributions and limitations of each. The integration of agents into the RAG pipeline has emerged as a pivotal development, resulting in Agentic RAG systems that overcome static workflows and limited contextual adaptability. Applications across healthcare, finance, education, and creative industries demonstrate the transformative potential of these systems, showcasing their ability to deliver personalized, real-time, and context-aware solutions.
Despite their promise, Agentic RAG systems face challenges that require further research and innovation. Coordination complexity in multi-agent architectures, scalability and latency issues, and ethical considerations must all be addressed to ensure robust and responsible deployment. Additionally, the lack of specialized benchmarks and datasets tailored to evaluating agentic capabilities poses a significant hurdle. Developing evaluation methodologies that capture the unique aspects of Agentic RAG, such as multi-agent collaboration and dynamic adaptability, will be crucial for advancing the field.
Looking ahead, the convergence of retrieval-augmented generation and agentic intelligence has the potential to redefine AI’s role in dynamic and complex environments. By addressing these challenges and exploring future directions, researchers and practitioners can unlock the full potential of Agentic RAG systems, paving the way for transformative applications across industries and domains. As AI systems continue to evolve, Agentic RAG stands as a cornerstone for creating adaptive, context-aware, and impactful solutions that meet the demands of a rapidly changing world.
References
