What is Agentic RAG?


Agentic RAG is a design in which retrieval, reading, and verification are not handled by a fixed pipeline but instead planned and executed by an AI agent that adapts to the situation. Rather than simply searching documents, the agent decides what to investigate, whether the retrieved evidence is sufficient, and which source to consult next.[1]

Differences from conventional RAG

In conventional RAG, the developer fixes the retrieval procedure.

graph LR
    Q["Question"] --> R["Retrieve"]
    R --> C["Evidence"]
    C --> G["Generate"]
    G --> A["Answer"]

In Agentic RAG, the agent treats retrieval not as a single operation but as a sequence of actions that unfolds over multiple steps.

graph TD
    Q["Question"] --> Plan["Investigation plan"]
    Plan --> Tool["Select retrieval tool"]
    Tool --> Observe["Observe results"]
    Observe --> Judge["Is evidence sufficient?"]
    Judge -->|Insufficient| Refine["Refine query / try another source"]
    Refine --> Tool
    Judge -->|Sufficient| Verify["Verify evidence"]
    Verify --> Answer["Generate answer"]

The critical difference is that an LLM participates in the retrieval decision-making process.

Dimension	Conventional RAG	Agentic RAG
Number of retrievals	Fixed	Adjusted as needed
Sources	Fixed in advance	Selected per question
Query	Typically a single rewrite	Refined based on observed results
Verification	Light check after generation	Evidence gaps and contradictions checked mid-process
Scope	FAQ, single-document search	Investigation, analysis, multi-source integration
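
The multi-step flow in the second diagram can be sketched as a bounded loop. This is a minimal illustration, not a reference implementation: the `Tool` signature, `is_sufficient`, `refine`, and the round-robin tool choice are all assumptions made for the example, and a real agent would delegate those decisions to an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool signature: takes a query string, returns text snippets.
Tool = Callable[[str], list[str]]


@dataclass
class LoopResult:
    answer: str
    evidence: list[str] = field(default_factory=list)
    steps: int = 0


def agentic_retrieve(
    question: str,
    tools: dict[str, Tool],
    is_sufficient: Callable[[str, list[str]], bool],
    refine: Callable[[str, list[str]], str],
    max_steps: int = 4,
) -> LoopResult:
    """Retrieve, judge, and refine until evidence suffices or steps run out."""
    query = question
    evidence: list[str] = []
    tool_names = list(tools)
    for step in range(max_steps):
        # Round-robin tool choice stands in for an LLM-driven tool router.
        tool = tools[tool_names[step % len(tool_names)]]
        evidence.extend(tool(query))
        if is_sufficient(question, evidence):
            answer = f"Answer based on {len(evidence)} snippet(s)."
            return LoopResult(answer, evidence, step + 1)
        query = refine(query, evidence)
    return LoopResult("Insufficient evidence.", evidence, max_steps)


# Toy stand-ins for real retrieval tools; the snippets are invented.
def docs(query: str) -> list[str]:
    return ["reset link is on the login page"] if "password" in query else []


def tickets(query: str) -> list[str]:
    return ["ticket 12: reset email delayed"] if "reset" in query else []


result = agentic_retrieve(
    "password reset problems",
    {"docs": docs, "tickets": tickets},
    is_sufficient=lambda q, ev: len(ev) >= 2,
    refine=lambda q, ev: q + " reset",
)
print(result.steps, result.answer)
```

The `max_steps` bound matters as much as the loop itself: without it, an agent whose sufficiency check never passes would keep retrieving indefinitely.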

Why Agentic RAG became necessary

Once RAG is deployed in production, the main challenge shifts from simple questions to complex ones.

Consider the following question.

From the customer inquiries of the past three months, classify the product issues
likely to lead to churn, and organise the related existing tickets with suggested
response priorities.

This question contains multiple subtasks.

Find only the inquiries from “the past three months.”
Extract inquiries related to churn risk.
Classify by product issue.
Cross-reference with existing tickets.
Assess response priority.
Summarise with supporting evidence.

A single vector search is not enough. An investigation plan, multiple retrievals, cross-referencing, and verification are all required.
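
One way to represent such an investigation plan is as a dependency graph of subtasks. The sketch below assumes the six subtasks listed above form a simple chain; the subtask names are illustrative, and a planner could just as well emit a graph with parallel branches.

```python
from graphlib import TopologicalSorter

# Each subtask maps to the set of subtasks it depends on.
# The names are illustrative labels for the subtasks listed above.
plan = {
    "filter_last_three_months": set(),
    "extract_churn_risk": {"filter_last_three_months"},
    "classify_by_product_issue": {"extract_churn_risk"},
    "match_existing_tickets": {"classify_by_product_issue"},
    "assess_priority": {"match_existing_tickets"},
    "summarise_with_evidence": {"assess_priority"},
}

# static_order() yields every subtask after the subtasks it depends on.
order = list(TopologicalSorter(plan).static_order())
for number, subtask in enumerate(order, start=1):
    print(number, subtask)
```

Keeping the plan as data rather than as free text makes it possible to log it, validate it, and cap its size before any retrieval runs.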

Core components of Agentic RAG

Agentic RAG is composed of the following components.

Component	Role
Planner	Decomposes the question and determines the investigation steps
Tool Router	Selects which retrieval tools and data sources to use
Retriever Tools	Keyword search, semantic search, database queries, web search, and so on
Reader	Reads retrieved documents and extracts the relevant portions
Critic	Evaluates sufficiency of evidence, contradictions, and citation accuracy
Memory	Retains discoveries made during investigation, reasons for excluding sources, and already-read documents
Generator	Produces the final answer from verified evidence

This structure mirrors how a human researcher works: search first, read what is found, adjust the search if the results are insufficient, then organise the evidence before answering.
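
Of these components, Memory is perhaps the easiest to make concrete. The following is a hypothetical sketch of the three things the table says it retains; the class and method names are assumptions, not part of any particular framework.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InvestigationMemory:
    findings: list[str] = field(default_factory=list)
    excluded: dict[str, str] = field(default_factory=dict)  # source id -> reason
    read_ids: set[str] = field(default_factory=set)

    def should_read(self, doc_id: str) -> bool:
        """Skip documents that were already read or explicitly excluded."""
        return doc_id not in self.read_ids and doc_id not in self.excluded

    def record_read(self, doc_id: str, finding: Optional[str] = None) -> None:
        """Mark a document as read and keep any finding extracted from it."""
        self.read_ids.add(doc_id)
        if finding:
            self.findings.append(finding)

    def exclude(self, doc_id: str, reason: str) -> None:
        """Remember why a source was ruled out so it is not retried."""
        self.excluded[doc_id] = reason


memory = InvestigationMemory()
memory.record_read("doc-1", "Churn complaints mention slow exports.")
memory.exclude("doc-2", "outdated version")
print(memory.should_read("doc-1"), memory.should_read("doc-3"))
```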

Representative design patterns

1. Corrective Agentic RAG

Retrieval quality is assessed; if it is poor, corrections are applied. This is closely related to the CRAG design.[2]

graph LR
    Q["Question"] --> R["Retrieve"]
    R --> E["Evaluate retrieval quality"]
    E -->|Good| G["Generate"]
    E -->|Ambiguous| RR["Re-retrieve"]
    E -->|Poor| WS["Search alternative source"]
    RR --> E
    WS --> E

This pattern suits FAQ, medical, legal, and internal policy search, where a retrieval failure directly degrades answer quality.
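
The three-way branch in the diagram can be sketched as a threshold check on relevance scores. The thresholds `0.7` and `0.3` and the function name are assumptions chosen for illustration; in practice the evaluator is often a trained model or an LLM judge rather than a fixed cut-off.

```python
def route_by_quality(scores: list[float], good: float = 0.7, poor: float = 0.3) -> str:
    """Map retrieval relevance scores to one of the three branches."""
    best = max(scores, default=0.0)  # no results at all counts as poor
    if best >= good:
        return "generate"
    if best <= poor:
        return "search_alternative_source"
    return "re_retrieve"


print(route_by_quality([0.9, 0.2]))
print(route_by_quality([0.5]))
print(route_by_quality([]))
```

Because both corrective branches feed back into evaluation, the caller should cap the number of passes through this function.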

2. Multi-hop RAG

This design follows a chain of evidence step by step, with each retrieval informing the next.

Question: How did the pricing policy for Product B change after Company A's acquisition?

1. Find when Company A was acquired.
2. Search for Product B documents published after the acquisition.
3. Extract changes to the pricing policy.
4. Compare with pre-acquisition documents.

The first retrieval result becomes the condition for the second retrieval. Fixed top-k search alone tends to fail on questions of this type, because the second query cannot be written until the first result is known.
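
The dependency between hops can be shown with a toy corpus. Everything in this sketch is invented for illustration, including the dates, the document texts, and the function names; the point is only that the output of the first hop parameterises the second.

```python
from datetime import date

# Toy data; all dates and texts are invented for illustration.
events = {"Company A acquisition": date(2024, 3, 1)}
product_b_docs = [
    (date(2023, 11, 10), "Product B pricing: flat monthly fee."),
    (date(2024, 5, 20), "Product B pricing: usage-based tiers."),
]


def hop1_acquisition_date() -> date:
    """First hop: find when the acquisition happened."""
    return events["Company A acquisition"]


def hop2_split_docs(cutoff: date) -> tuple[list[str], list[str]]:
    """Second hop: split Product B documents around the date from hop 1."""
    before = [text for published, text in product_b_docs if published < cutoff]
    after = [text for published, text in product_b_docs if published >= cutoff]
    return before, after


cutoff = hop1_acquisition_date()
before, after = hop2_split_docs(cutoff)
print("Before:", before)
print("After:", after)
```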

3. Tool-using RAG

This pattern uses not only document retrieval but also APIs, SQL, calculations, and code execution.

Tool	Example
SQL	Aggregate sales figures, usage statistics, or logs
Web search	Verify public or recent information
File search	Find documents in internal repositories
Code execution	Perform numerical analysis or run tests
Browser control	Access information on web interfaces

RAG expands from “document retrieval” to “tool use for obtaining evidence.”
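
As a small illustration of a non-document tool, the sketch below wraps an in-memory SQLite database as a read-only SQL tool and registers it under a name the agent could select. The table, its rows, and the `make_sql_tool` helper are assumptions made for the example; the `SELECT` prefix check is a deliberately crude guard, not a substitute for a read-only database role.

```python
import sqlite3
from typing import Callable


def make_sql_tool(conn: sqlite3.Connection) -> Callable[[str], list[tuple]]:
    """Wrap a connection as a tool that accepts SELECT statements only."""

    def run(query: str) -> list[tuple]:
        if not query.lstrip().lower().startswith("select"):
            raise ValueError("only SELECT statements are allowed")
        return conn.execute(query).fetchall()

    return run


# Toy sales table; the figures are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("A", 100), ("A", 50), ("B", 70)]
)

tools = {"sql": make_sql_tool(conn)}
rows = tools["sql"](
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
)
print(rows)
```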

4. Hierarchical Retrieval Interface

Research such as A-RAG (2026) points toward having LLMs interact directly with hierarchical retrieval interfaces.[4]

For example, an agent might be given the following tools.

Keyword search
Semantic search
Reading a chunk’s content
Reading an entire section
Expanding related documents

The agent can search broadly first, then read deeply only where needed, much as a human scans a table of contents before turning to the relevant page.
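
A minimal sketch of such an interface, covering three of the tools listed above, might look like the following. It does not reproduce the A-RAG design; the document store, its contents, and the function names are all invented to show the broad-then-deep access pattern.

```python
# Toy document store: section title -> list of chunks. Contents are invented.
SECTIONS = {
    "Billing": ["Invoices are issued monthly.", "Refunds take five business days."],
    "Security": ["Passwords must be rotated yearly.", "Two-factor login is optional."],
}


def keyword_search(term: str) -> list[tuple[str, int]]:
    """Return (section, chunk index) pairs for chunks that mention the term."""
    needle = term.lower()
    return [
        (section, index)
        for section, chunks in SECTIONS.items()
        for index, chunk in enumerate(chunks)
        if needle in chunk.lower()
    ]


def read_chunk(section: str, index: int) -> str:
    """Read one chunk: the cheap, narrow view."""
    return SECTIONS[section][index]


def read_section(section: str) -> str:
    """Read a whole section: the expensive, wide view."""
    return " ".join(SECTIONS[section])


hits = keyword_search("refund")
section, index = hits[0]
print(read_chunk(section, index))
print(read_section(section))
```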

Strengths of Agentic RAG

Better handling of complex questions

A question can be decomposed and investigated in stages. Where conventional RAG tries to find an answer in one shot, Agentic RAG explores toward the answer over several steps.

Flexible source selection

The agent can choose sources appropriate to the question: official documentation for product specifications, tickets for incident information, a database for usage data, and the web for recent public information.

Better recovery from retrieval failure

When retrieved evidence is weak, the agent can adjust the query or try a different source, which makes recovery easier than with a fixed pipeline.

Easier audit logging

The agent can record every retrieval action taken, which evidence was accepted, which was rejected, and why. In business contexts, “why this answer was produced” is sometimes more important than the answer itself.

Risks of Agentic RAG

Agentic RAG is powerful, but a poorly designed implementation can be costly, slow, or unsafe.

Risk	Description	Mitigation
Cost overrun	Multiple rounds of retrieval, generation, and verification	Set maximum step counts, budget limits, and timeouts
Latency	Multi-step processing slows responses	Use parallel retrieval, caching, and early termination
Runaway loops	Agent continues unnecessary retrieval or actions	Define explicit action rules and stopping conditions
Permission leakage	Agent accesses sources it should not	Enforce user permissions at the retrieval layer
Evidence conflation	Contradictions across multiple sources go undetected	Verify per citation and surface contradictions explicitly
Prompt injection	Agent follows instructions embedded in retrieved documents	Treat retrieved documents as reference material, not commands
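
The step-count and timeout mitigations in the table can be enforced by a small guard that every agent action must pass through. The `Budget` class below is a hypothetical sketch; the limits are placeholders, and a production system would likely track token or monetary cost as well.

```python
import time


class BudgetExceeded(Exception):
    """Raised when the agent runs out of steps or time."""


class Budget:
    def __init__(self, max_steps: int, max_seconds: float) -> None:
        self.max_steps = max_steps
        self.deadline = time.monotonic() + max_seconds
        self.steps = 0

    def spend(self) -> None:
        """Call once before each retrieval, generation, or verification step."""
        if self.steps >= self.max_steps:
            raise BudgetExceeded("step limit reached")
        if time.monotonic() > self.deadline:
            raise BudgetExceeded("time limit reached")
        self.steps += 1


budget = Budget(max_steps=2, max_seconds=60.0)
budget.spend()
budget.spend()
try:
    budget.spend()
except BudgetExceeded as error:
    print("Stopped:", error)
```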

Practical design principles

1. Do not give the agent too much freedom

More autonomy does not always mean better outcomes. In business contexts, the tools available, sources accessible, step count, and response format should all be explicitly constrained.

Permitted:
- Product documentation search
- Ticket search
- FAQ search

Prohibited:
- Searching customer data outside the user's permission scope
- Sending confidential information to external web endpoints
- Making assertions without evidence
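
Constraints like these are more reliable when enforced in code than when stated only in a prompt. The sketch below assumes a tool allowlist and a per-user set of permission scopes; the tool names and scope labels are hypothetical.

```python
# Hypothetical tool names matching the permitted list above.
ALLOWED_TOOLS = {"product_docs_search", "ticket_search", "faq_search"}


def authorize(tool: str, user_scopes: set[str], resource_scope: str) -> bool:
    """Permit a call only if the tool is allowlisted and the user may see the resource."""
    return tool in ALLOWED_TOOLS and resource_scope in user_scopes


print(authorize("ticket_search", {"team-a"}, "team-a"))
print(authorize("web_search", {"team-a"}, "team-a"))
print(authorize("ticket_search", {"team-a"}, "team-b"))
```

Running the check at the retrieval layer means that a manipulated or mistaken agent still cannot reach data outside the user's scope.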

2. Log retrieval actions

Record not only the final answer but also the intermediate retrieval steps.

Generated sub-queries
Sources used
Evidence accepted
Evidence rejected
Reasons for re-retrieval
Citations in the final answer

Logs make it substantially easier to diagnose incorrect answers and drive improvements.
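
The items above map naturally onto a single structured record per question. This is one possible shape, with field names chosen for the example rather than taken from any logging standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RetrievalTrace:
    question: str
    sub_queries: list[str] = field(default_factory=list)
    sources_used: list[str] = field(default_factory=list)
    evidence_accepted: list[str] = field(default_factory=list)
    evidence_rejected: list[dict] = field(default_factory=list)  # {"id": ..., "reason": ...}
    re_retrieval_reasons: list[str] = field(default_factory=list)
    final_citations: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the whole trace as one JSON line for the log store."""
        return json.dumps(asdict(self), ensure_ascii=False)


# Invented example values.
trace = RetrievalTrace(question="Which issues drive churn?")
trace.sub_queries.append("churn complaints last three months")
trace.sources_used.append("ticket_search")
trace.evidence_rejected.append({"id": "doc-7", "reason": "outside date range"})
print(trace.to_json())
```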

3. Build in the ability to decline

Even with Agentic RAG, the system must be able to decline to answer when evidence is insufficient.[3]

The available evidence is not sufficient to draw a conclusion.
The missing information is: contract plan and target version.

An agent that always attempts an answer, regardless of evidence, is dangerous no matter how capable its retrieval is.
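
A refusal like the one above can be produced by checking required facts before generation. The sketch assumes the planner has already named the facts an answer needs; `answer_or_decline` and its inputs are hypothetical.

```python
def answer_or_decline(required: list[str], found: dict[str, str]) -> str:
    """Decline and name the gaps when any required fact is missing."""
    missing = [name for name in required if name not in found]
    if missing:
        return (
            "The available evidence is not sufficient to draw a conclusion.\n"
            "The missing information is: " + ", ".join(missing) + "."
        )
    return "Answer based on: " + ", ".join(f"{k}={v}" for k, v in found.items())


message = answer_or_decline(["contract plan", "target version"], {})
print(message)
```

Naming the missing information turns a refusal into a useful next step for the user.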

4. Evaluate at the task level

Evaluation of Agentic RAG must cover the agent’s actions, not only the final answer.

Evaluation dimension	What to check
Retrieval Success	Were the necessary documents found?
Tool Choice	Was the right source selected?
Step Efficiency	Did the agent over-retrieve unnecessarily?
Evidence Faithfulness	Is the answer faithful to the evidence?
Citation Accuracy	Do the citations actually support the answer?
Refusal Quality	Did the agent decline appropriately when evidence was insufficient?
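
Three of these dimensions reduce to simple ratios once the run has been labelled. The definitions below are one plausible formulation, not a standard metric suite; in particular, the `minimal_steps` baseline and the per-citation support labels are assumed to come from human or model annotation.

```python
def retrieval_success(retrieved: set[str], needed: set[str]) -> float:
    """Fraction of the needed documents that were retrieved (recall)."""
    return len(retrieved & needed) / len(needed) if needed else 1.0


def step_efficiency(steps_taken: int, minimal_steps: int) -> float:
    """1.0 means no wasted steps; lower values indicate over-retrieval."""
    return min(1.0, minimal_steps / steps_taken) if steps_taken else 0.0


def citation_accuracy(citations: list[tuple[str, bool]]) -> float:
    """Share of citations labelled as supporting the claim they are attached to."""
    return sum(ok for _, ok in citations) / len(citations) if citations else 0.0


print(retrieval_success({"a", "b"}, {"a", "c"}))
print(step_efficiency(steps_taken=4, minimal_steps=2))
print(citation_accuracy([("doc-1", True), ("doc-2", False)]))
```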

Does Agentic RAG replace conventional RAG?

Agentic RAG does not replace all forms of RAG.

For simple FAQ, short internal policy lookup, and finding the relevant section of a product manual, Advanced RAG is often sufficient. Agentic RAG becomes necessary when the retrieval process itself is complex: investigation, comparison, multi-source integration, and code understanding.

Question type	Recommended approach
“How do I change my password?”	Advanced RAG
“What causes this error code?”	Hybrid search + reranking
“Classify the inquiry trends from the past month.”	Agentic RAG
“How do I modify the authentication logic in this repository?”	Code RAG + Agentic RAG
“What are the main themes across all internal documents?”	Graph RAG or Agentic RAG

Summary

Agentic RAG is a form of RAG in which an agent plans the retrieval, reading, and verification process.
It is suited to complex investigation, multi-source tasks, code understanding, and situations requiring re-retrieval.
The greater flexibility requires careful attention to permissions, logging, stopping conditions, and refusal design.
In practice, a realistic approach is to introduce Agentic RAG only where Advanced RAG falls short.

References

Code RAG and Coding Agents

RAG Architecture Patterns