Code RAG and Coding Agents

About 10 minutes

Prerequisites: What is RAG and Harness Engineering

Code RAG is a form of RAG whose retrieval targets are source code, type definitions, tests, configuration files, issues, pull-request history, and design notes rather than natural-language documents. The retrieved context is then used for code generation or modification. With the rise of coding agents, RAG has expanded beyond “internal document search” into “context acquisition technology for understanding and modifying a repository.”[1][4]

Why conventional RAG does not transfer directly to code

In natural-language RAG, documents are commonly split by character count or paragraph boundaries. That approach alone is insufficient for code.

Code has a structure that differs fundamentally from natural language.

  • Functions
  • Classes
  • Types
  • Imports and exports
  • Call relationships
  • Tests
  • Configuration files
  • Generated files
  • Build scripts
  • Modification history

When a chunk boundary falls in the middle of a function, the return value, exception handling, types, and dependencies are lost. Conversely, grouping unrelated functions into one chunk causes the LLM to be misled by irrelevant code.

The retrieval targets in Code RAG extend well beyond source files.

| Information | Use |
| --- | --- |
| Source code | Understanding the implementation, identifying where to change |
| Type definitions | Understanding API inputs, outputs, and constraints |
| Tests | Expected behaviour, regression confirmation |
| README / docs | Design intent, usage instructions |
| Configuration files | Build settings, linting, routing, environment differences |
| Issues / PRs | Past discussions, reasons for changes |
| Commit history | Why a particular implementation decision was made |
| Execution logs | Failure causes, environment-specific issues |
| Agent instruction files | Repository-specific work rules |

Coding agents combine these sources to decide what to read, what to change, and how to verify changes.

graph TD
    Task["Natural-language development task"] --> Plan["Investigation plan"]
    Plan --> Search["Code & document search"]
    Search --> Read["Read relevant files"]
    Read --> Edit["Edit code"]
    Edit --> Test["Run tests & linting"]
    Test --> Observe["Review results"]
    Observe -->|Failure| Search
    Observe -->|Success| Summary["Describe changes"]

Where conventional RAG ends at generating an answer, Code RAG extends through editing, execution, and verification.
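
The loop in the diagram can be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular agent is implemented: `run_task` and the injected `search`, `edit`, and `run_tests` callables are hypothetical names, and the `max_rounds` stopping rule is an assumption added to keep the loop bounded.

```python
from typing import Callable

def run_task(
    task: str,
    search: Callable[[str], list[str]],       # hypothetical: query -> relevant paths
    edit: Callable[[list[str]], None],        # hypothetical: edits the given files
    run_tests: Callable[[], tuple[bool, str]],  # hypothetical: (passed, log)
    max_rounds: int = 3,                      # assumed safety limit
) -> dict:
    query = task
    for round_no in range(1, max_rounds + 1):
        paths = search(query)       # Search, then read the relevant files
        edit(paths)                 # Edit code
        passed, log = run_tests()   # Run tests and linting
        if passed:                  # Review results: success ends the loop
            return {"status": "success", "rounds": round_no}
        query = log                 # Failure: the log drives the next search
    return {"status": "gave_up", "rounds": max_rounds}
```

Passing the steps in as callables keeps the control flow visible and lets each step be replaced by a stub when testing the loop itself.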

Chunk design: cut by structure, not by line count

The quality of Code RAG depends heavily on chunk design.

A poor approach is simple fixed-length splitting.

Lines 1–80
Lines 81–160
Lines 161–240

This risks cutting in the middle of a function or class.

A better approach uses syntactic structure.

| Chunk unit | Suited for |
| --- | --- |
| Function | Understanding a specific operation |
| Class | Understanding state and responsibilities |
| Type definition | Understanding an API contract |
| File | Understanding an entire module |
| Directory | Understanding a subsystem |
| Call graph | Analysing the scope of impact |

Research such as cAST (2025) demonstrates the importance of using ASTs to create semantically coherent chunks. In code, what matters is not character count but structural integrity.[2]
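
As a rough illustration of structure-aware chunking, the sketch below uses Python's standard `ast` module to emit one chunk per top-level function or class. It is not the cAST algorithm, only the simplest form of the idea; the function name `chunk_by_definition` and the chunk dictionary layout are assumptions made for this example.

```python
import ast

def chunk_by_definition(source: str) -> list[dict]:
    """Return one chunk per top-level function or class definition."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator, if any, so the chunk stays complete.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": start,
                "end_line": node.end_lineno,
                "text": "\n".join(lines[start - 1:node.end_lineno]),
            })
    return chunks

SAMPLE = '''import os

def load(path):
    with open(path) as f:
        return f.read()

class Cache:
    def get(self, key):
        return None
'''

for chunk in chunk_by_definition(SAMPLE):
    print(chunk["kind"], chunk["name"], chunk["start_line"], chunk["end_line"])
```

Each chunk boundary coincides with a definition boundary, so a function body is never split. Module-level statements such as imports are skipped here; a fuller chunker would attach them to a file-level chunk.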

Search methods: vector search alone is not enough

Code RAG combines multiple retrieval methods.

| Search method | Example | Strength |
| --- | --- | --- |
| Text search | rg "functionName" | Exact matches for names, strings, and errors |
| Symbol search | Definition and reference lookup | Tracing relationships between functions and types |
| Vector search | Searching for similar logic | Intent-based and paraphrase-tolerant queries |
| AST search | Syntactic pattern matching | Discovering specific structural forms |
| Execution result search | Test logs, error messages | Identifying failure causes |
| History search | Git logs, PRs | Understanding reasons for changes |

In practice, coding agents typically start with fast string search to locate initial leads, then move to semantic search or file reading as needed.
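
The "exact search first, fuzzier search second" order can be sketched as below. This is an assumption-laden toy: the repository is an in-memory dictionary, and token overlap (Jaccard similarity) stands in for a real vector search, which would use an embedding model instead. All function names are hypothetical.

```python
import re

def exact_search(files: dict[str, str], needle: str) -> list[tuple[str, int]]:
    """Return (path, line number) for every line containing the needle."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path, lineno))
    return hits

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def fuzzy_search(files: dict[str, str], query: str, top_k: int = 3) -> list[str]:
    """Rank files by token overlap; a stand-in for embedding similarity."""
    q = tokens(query)
    scored = []
    for path, text in files.items():
        t = tokens(text)
        union = q | t
        score = len(q & t) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, path))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:top_k]]

def search(files: dict[str, str], query: str) -> dict:
    hits = exact_search(files, query)
    if hits:  # an exact lead exists, so use it
        return {"method": "exact", "results": hits}
    return {"method": "fuzzy", "results": fuzzy_search(files, query)}

REPO = {
    "config.py": "def parse_config(path):\n    return {}\n",
    "retry.py": "def retry_request(url, attempts):\n    return None\n",
}
print(search(REPO, "parse_config"))
print(search(REPO, "retry a failed request"))
```

An identifier such as `parse_config` is resolved by the exact pass, while a natural-language description falls through to the fuzzy pass.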

The relationship between coding agents and RAG

Coding agents use RAG as an internal component, but its role goes beyond simple retrieval.

| Agent action | Relationship to RAG |
| --- | --- |
| Understand the task | Read relevant documentation, issues, and existing implementations |
| Locate the change site | Code search, symbol search, dependency search |
| Decide the implementation approach | Reference similar implementations, design patterns, and tests |
| Edit | Modify code based on retrieved context |
| Test | Use execution results as the next piece of context |
| Fix | Search and read error logs to inform re-editing |
| Explain | Cite evidence files and verification results |

In other words, RAG is at the core of a coding agent’s ability to read.

Generating a small function and modifying a repository are entirely different problems.

Repository-level tasks present the following challenges.

  • Changes span multiple files.
  • Modifications must conform to the existing design.
  • Tests, linting, and type checking are all involved.
  • The work requires editing existing code rather than generating new code.
  • Misjudging the scope of impact causes regressions.
  • Project-specific rules must be followed.

Benchmarks such as SWE-PolyBench have emerged because the evaluation of coding agents has shifted from “solve an isolated code problem” to “make a change in a real repository.”[3]

For recent coding agents, context files such as AGENTS.md have become important. These files convey repository-specific work rules, prohibited actions, verification commands, and design principles to the agent.[5][6]

From a Code RAG perspective, such a file is a high-priority document that should always be retrieved.

Typical content includes the following.

  • Do not run the production build command without approval.
  • Treat Japanese as the source of truth.
  • Do not modify the existing UI.
  • Which commands to use for verification.
  • The boundary between generated files and hand-edited files.

If an agent searches only the code and ignores this file, it may produce changes that are technically correct but violate project conventions.
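
One way to make sure the instruction file is never dropped is to pin it ahead of ranked search results when assembling context. The sketch below assumes a character budget and an in-memory file map; `build_context`, `PINNED_NAMES`, and the budget policy are hypothetical choices for illustration.

```python
# File names treated as always-retrieved context (assumption for this sketch).
PINNED_NAMES = ("AGENTS.md",)

def build_context(files: dict[str, str], retrieved: list[str], budget_chars: int) -> list[str]:
    """Select paths for the prompt: pinned files first, then search results."""
    pinned = sorted(p for p in files if p.rsplit("/", 1)[-1] in PINNED_NAMES)
    ordered = pinned + [p for p in retrieved if p in files and p not in pinned]
    selected, used = [], 0
    for path in ordered:
        size = len(files[path])
        # Pinned files are included even if they exceed the budget.
        if path in pinned or used + size <= budget_chars:
            selected.append(path)
            used += size
    return selected

FILES = {
    "AGENTS.md": "Verification: run the test suite before finishing.\n",
    "src/app.py": "x" * 50,
    "src/big.py": "y" * 500,
}
print(build_context(FILES, ["src/big.py", "src/app.py"], budget_chars=200))
```

Here the large file is skipped because it does not fit, while the instruction file is selected even though the search never returned it.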

Code RAG cannot be evaluated on answer quality alone. Whether the code actually works is what matters.

| Evaluation dimension | What to check |
| --- | --- |
| Retrieval Recall | Were the necessary files, functions, and tests found? |
| Context Precision | Did the agent read excessive amounts of unneeded code? |
| Edit Correctness | Does the change satisfy the requirement? |
| Regression Safety | Were any existing behaviours broken? |
| Test Success | Do tests, linting, and type checking pass? |
| Style Consistency | Does the edit match the conventions of the existing codebase? |
| Minimality | Were unrelated changes avoided? |
| Traceability | Can the reason for the change and its supporting evidence be explained? |

In natural-language RAG the goal is a “correct answer.” In Code RAG the goal is a “correct, verifiable change.”
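
The first two dimensions can be computed at file level if a task comes with a known set of files that were actually needed. The sketch below assumes such a reference set exists (for example, the files touched by the accepted fix); the function names and the empty-set conventions are assumptions of this example.

```python
def retrieval_recall(read: list[str], needed: set[str]) -> float:
    """Share of the needed files that the agent actually read."""
    if not needed:
        return 1.0  # nothing was required, so nothing was missed
    return len(set(read) & needed) / len(needed)

def context_precision(read: list[str], needed: set[str]) -> float:
    """Share of the files read that were actually needed."""
    if not read:
        return 0.0
    return len(set(read) & needed) / len(set(read))

needed = {"billing.py", "tests/test_billing.py"}
read = ["billing.py", "users.py", "utils.py", "tests/test_billing.py"]
print(retrieval_recall(read, needed), context_precision(read, needed))
```

In this example the agent found everything it needed but half of what it read was irrelevant. The remaining dimensions, such as Edit Correctness and Regression Safety, need test execution rather than set arithmetic.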

In code, function names, type names, error messages, and test names are powerful leads. Exact-text search often outperforms semantic search in the first retrieval pass.

Storing the following metadata alongside file contents improves retrieval precision.

  • Symbol names
  • Definition locations
  • Reference locations
  • Import and export relationships
  • Test targets
  • Module dependencies
  • Owning directory
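
A minimal sketch of collecting part of this metadata for a Python file with the standard `ast` module is shown below. The record layout and the name `index_module` are assumptions; references are limited to direct calls by bare name, so attribute calls such as `json.loads` are not tracked.

```python
import ast
import posixpath

def index_module(path: str, source: str) -> dict:
    """Collect symbol definitions, imports, and call references for one file."""
    definitions: dict[str, int] = {}
    references: dict[str, list[int]] = {}
    imports: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            definitions[node.name] = node.lineno          # definition location
        elif isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.add(node.module)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            references.setdefault(node.func.id, []).append(node.lineno)
    return {
        "path": path,
        "directory": posixpath.dirname(path),             # owning directory
        "definitions": definitions,
        "imports": sorted(imports),
        "references": {name: sorted(ls) for name, ls in references.items()},
    }

SOURCE = '''import json
from pathlib import Path

def load(path):
    return json.loads(Path(path).read_text())

def main():
    return load("config.json")
'''
print(index_module("app/config.py", SOURCE))
```

Stored next to the file contents, a record like this lets a query for `load` jump to its definition line and its call sites without rescanning the file.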

Tests are part of the specification. To confirm expected behaviour, retrieve the tests related to the implementation under consideration, not just the implementation itself.
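
One simple heuristic for finding related tests is to look for test files that import the module being changed. The sketch below assumes Python sources and a `test_` file-name prefix; both the convention and the name `related_tests` are assumptions, and real projects may need coverage data or naming rules instead.

```python
import ast

def imported_modules(source: str) -> set[str]:
    """Return the module names imported by a Python source file."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules

def related_tests(files: dict[str, str], module: str) -> list[str]:
    """Test files (assumed to be named test_*.py) that import the module."""
    return sorted(
        path
        for path, source in files.items()
        if path.rsplit("/", 1)[-1].startswith("test_")
        and module in imported_modules(source)
    )

PROJECT = {
    "billing.py": "def total(xs):\n    return sum(xs)\n",
    "tests/test_billing.py": (
        "from billing import total\n\n"
        "def test_total():\n    assert total([1, 2]) == 3\n"
    ),
    "tests/test_users.py": "import users\n",
}
print(related_tests(PROJECT, "billing"))
```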

4. Use execution results as the input for the next retrieval

A test failure log becomes the next retrieval query.

Failure log → search error name → read related test → fix → re-run tests

This loop pairs naturally with an Agentic RAG design.
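
Turning a failure log into retrieval queries can be as simple as pulling out error names and failing test identifiers. The sketch below assumes a pytest-style log shape (`E   SomeError: ...` lines and `FAILED path::test_name` summary lines); the patterns and the name `next_queries` are illustrative, and other test runners would need different patterns.

```python
import re

# Assumed log shape: pytest-style error lines and FAILED summary lines.
ERROR_LINE = re.compile(r"^(?:E\s+)?(?P<name>\w*(?:Error|Exception)):", re.MULTILINE)
FAILED_TEST = re.compile(r"^FAILED (?P<file>[^\s:]+)::(?P<test>\w+)", re.MULTILINE)

def next_queries(log: str) -> list[str]:
    """Extract search terms from a failure log, most specific clue first."""
    queries = [m["name"] for m in ERROR_LINE.finditer(log)]
    for m in FAILED_TEST.finditer(log):
        queries += [m["test"], m["file"]]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(queries))

LOG = """E   KeyError: 'port'
FAILED tests/test_config.py::test_missing_key - KeyError: 'port'
"""
print(next_queries(LOG))
```

The resulting terms are exact strings, which suits the fast text search that usually forms the first retrieval pass.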

5. Separate read permissions from write permissions

The scope the agent can read and the scope it can write should be treated as distinct concerns.

| Operation | Example |
| --- | --- |
| Read | Source, tests, documentation, configuration |
| Write | Designated implementation files, tests |
| Requires approval | Production builds, destructive operations, external communication |
| Prohibited | Secrets, credentials, large-scale reformatting of unrelated code |
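
A policy like the one in the table can be expressed as separate pattern lists for each scope. The sketch below is a hypothetical policy, not a recommended one: every path pattern and command name is an assumption chosen for illustration, and it uses `fnmatchcase` from the standard library, whose `*` also matches across `/`.

```python
from fnmatch import fnmatchcase

# Hypothetical policy; patterns and command names are illustrative assumptions.
PROHIBITED = [".env", "*.pem", "secrets/*"]
READABLE = ["src/*", "tests/*", "docs/*", "*.md", "*.toml"]
WRITABLE = ["src/billing/*.py", "tests/*.py"]
NEEDS_APPROVAL = {"deploy", "publish"}

def _matches(path: str, patterns: list[str]) -> bool:
    return any(fnmatchcase(path, pattern) for pattern in patterns)

def check_path(action: str, path: str) -> str:
    """Return 'allowed', 'denied', or 'prohibited' for a read or write."""
    if _matches(path, PROHIBITED):   # prohibited wins over every other rule
        return "prohibited"
    if action == "read":
        allowed = READABLE
    elif action == "write":
        allowed = WRITABLE
    else:
        raise ValueError(f"unknown action: {action}")
    return "allowed" if _matches(path, allowed) else "denied"

def check_command(command: str) -> str:
    """Flag commands whose first word is on the approval list."""
    words = command.split()
    if words and words[0] in NEEDS_APPROVAL:
        return "needs_approval"
    return "allowed"

print(check_path("read", "src/users/model.py"))   # readable
print(check_path("write", "src/users/model.py"))  # outside the write scope
print(check_command("deploy production"))
```

Keeping the write list narrower than the read list lets the agent gather context widely while limiting where it can make changes.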

With the emergence of coding agents, RAG is expanding from “evidence retrieval for answering questions” to “context management for performing work.”

The following directions are becoming important for Code RAG going forward.

  • Structural search using ASTs and type information
  • Integration with tests, execution logs, and coverage
  • Persistent retrieval of repository-specific rules
  • Long-lived memory built from past modification records
  • Agents that evaluate before-and-after diffs
  • Workflows that retrieve and incorporate code review comments

Summary

  • Code RAG retrieves code, tests, configuration, history, and agent instructions to support development work.
  • In code, structured chunking by function, class, AST node, or symbol is more important than fixed-length chunking.
  • Coding agents use RAG in a loop of retrieval, reading, editing, testing, and re-fixing.
  • Evaluation must cover not only retrieval precision but also whether the resulting change works and whether it breaks existing behaviour.

References

  1. CodeRAG-Bench: Can Retrieval Augment Code Generation?
  2. cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
  3. SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents
  4. Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches
  5. Agent READMEs: An Empirical Study of Context Files for Agentic Coding
  6. Introducing Codex