Codex Levels 0-10: Definitions for All 11 Stages

Reading time: About 10 minutes

Audience: Developers and teams that want to assess their current Codex maturity and identify the next capability, rule, or verification practice to add

Prerequisite: You have read How to Use Codex Levels

Codex maturity cannot be measured by the number of features used. The important question is how much work can be delegated and how well context, verification, permissions, and review support that delegation.

This page defines Codex maturity from Level 0 through Level 10. Because Level 0 is included, the model has 11 stages. Not every project needs Level 10. Choose a level that matches the risk of the changes, team size, and frequency of operation.

The concrete examples share one theme: building and operating a personal portfolio site in Next.js. Following a single project makes it easier to compare how delegation expands, from a small project-card edit to adding projects, updating the skills page, and running publication workflows.

The 11-Stage Model

Level	Title	Core capability	Typical deliverable
Level 0	Chat Advisor	Coding advice	Answer or sample code
Level 1	Repository Reader	Repository understanding	Related-file map and investigation results
Level 2	Focused Editor	Bounded editing	Single-responsibility diff
Level 3	Verified Implementer	Verified implementation	Small feature and test results
Level 4	Context Engineer	Persistent context	AGENTS.md and shared rules
Level 5	GitHub Collaborator	Issue and PR collaboration	Branch, PR, and review record
Level 6	Harness Builder	Safe working environment	Permissions, approvals, checks, and constraints
Level 7	Tool Operator	External tool operation	Browser checks and MCP or connector actions
Level 8	Parallel Orchestrator	Parallel task management	Task plan, worktrees, and subagent results
Level 9	Workflow Operator	Recurring operations	CI, scheduled runs, and triage workflows
Level 10	Agent Platform Architect	Organization-scale platform design	Agent roles, evaluations, audit, and improvement loops

Level 0: Chat Advisor

State: You ask coding or design questions and apply the answer manually. Codex does not inspect the target repository, edit files, or run commands.

Typical work: Explain an error, generate a function example, or compare design options.

Limitation: The answer may not match the actual dependencies, conventions, or implementation.

Move forward when: Codex can inspect the repository and explain the code using real files as evidence.

Level 1: Repository Reader

State: Codex reads the working tree and explains code structure, related files, and likely causes. Investigation is the main activity; editing is optional.

Typical work: Locate the authentication entry point, trace a bug, or identify tests affected by a proposed change.

Completion standard: The result includes file paths and code evidence that a human can verify.

Move forward when: You can define in-scope and out-of-scope areas and delegate a small edit.

Level 2: Focused Editor

State: Codex makes a change limited to one file or one responsibility. A human reads the diff for unintended changes.

Typical work: Add one validation rule, correct copy, or perform a small refactor using an existing pattern.

Completion standard: The request defines what may change, what must not change, and the expected result.
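
The scope in this completion standard can be checked mechanically once it is written down. The following is a minimal sketch, not a Codex feature: the `EditRequest` shape, the `outOfScopePaths` helper, and the portfolio file paths are all hypothetical names introduced for illustration. It assumes the list of changed paths is taken from the diff by hand or by a separate script.

```typescript
// Hypothetical shape for a Level 2 request; none of these names come from Codex.
interface EditRequest {
  mayChange: string[];     // path prefixes the edit is allowed to touch
  mustNotChange: string[]; // path prefixes that must stay untouched
  expectedResult: string;  // plain-language description of the outcome
}

// Returns the changed paths that fall outside the agreed scope:
// either explicitly forbidden, or not covered by any allowed prefix.
function outOfScopePaths(request: EditRequest, changedPaths: string[]): string[] {
  return changedPaths.filter((path) => {
    const forbidden = request.mustNotChange.some((prefix) => path.startsWith(prefix));
    const allowed = request.mayChange.some((prefix) => path.startsWith(prefix));
    return forbidden || !allowed;
  });
}

// Example: a project-card copy fix on the portfolio site (illustrative paths).
const editRequest: EditRequest = {
  mayChange: ["components/ProjectCard.tsx"],
  mustNotChange: ["app/layout.tsx", "package.json"],
  expectedResult: "The card shows the corrected project title.",
};

const scopeViolations = outOfScopePaths(editRequest, [
  "components/ProjectCard.tsx",
  "package.json",
]);
console.log(scopeViolations); // paths a reviewer should question
```

A non-empty result does not prove the edit is wrong. It marks the parts of the diff that the human reviewer should read first.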

Move forward when: Tests or linting become part of the same definition of done as the edit.

Level 3: Verified Implementer

State: Codex updates several related files and completes a small feature or bug fix, including verification.

Typical work: Update a form, validator, and tests, then run the targeted test command.

Completion standard: The final report includes changed files, commands run, success or failure, and anything unverified.
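
One way to keep final reports comparable is to give them a fixed shape. The sketch below assumes a hypothetical `FinalReport` structure that mirrors the four items in the completion standard; the `reviewItems` helper, the file names, and the commands are illustrative and should be replaced with the project's own.

```typescript
// Hypothetical report shape mirroring the Level 3 completion standard.
interface CommandResult {
  command: string;
  succeeded: boolean;
}

interface FinalReport {
  changedFiles: string[];
  commandsRun: CommandResult[];
  unverified: string[]; // anything that was not checked, stated explicitly
}

// Lists what a human still has to look at before accepting the work:
// missing report sections, failed commands, and unverified items.
function reviewItems(report: FinalReport): string[] {
  const items: string[] = [];
  if (report.changedFiles.length === 0) items.push("no changed files listed");
  if (report.commandsRun.length === 0) items.push("no verification command recorded");
  for (const run of report.commandsRun) {
    if (!run.succeeded) items.push(`command failed: ${run.command}`);
  }
  for (const note of report.unverified) {
    items.push(`unverified: ${note}`);
  }
  return items;
}

// Example: a form, validator, and test update (illustrative names and commands).
const finalReport: FinalReport = {
  changedFiles: [
    "components/ContactForm.tsx",
    "lib/validateContact.ts",
    "lib/validateContact.test.ts",
  ],
  commandsRun: [
    { command: "npm test", succeeded: true },
    { command: "npm run lint", succeeded: false },
  ],
  unverified: ["Email delivery was not exercised"],
};

const openItems = reviewItems(finalReport);
console.log(openItems);
```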

Move forward when: Repeated conventions and commands are moved into repository instructions.

Level 4: Context Engineer

State: Files such as AGENTS.md continuously provide the technology stack, editing rules, verification commands, and prohibited actions.

Typical work: Use a root AGENTS.md as an entry point to domain-specific rules and skills.

Completion standard: New threads receive the same working agreement without repeating it in every prompt.

Move forward when: Work expands from the local tree into issues, branches, pull requests, and reviews.

Level 5: GitHub Collaborator

State: Codex reads GitHub issues and pull requests and assists with branch creation, implementation, PR creation, and review feedback.

Typical work: Extract acceptance criteria from an issue, draft a PR summary, or address review comments.

Completion standard: Commit scope, branch policy, review ownership, and merge permission are explicit.

Move forward when: Permissions, approvals, and safety checks are systematized instead of specified case by case.

Level 6: Harness Builder

State: You design a harness that gives Codex repeatable rules, skills, checks, permissions, approval conditions, prohibited actions, and failure procedures.

Typical work: Require approval for production builds, protect specific folders from edits, run standard checks after every edit, and validate shared policies automatically.

Completion standard: Dangerous actions stop for approval, routine changes follow reproducible procedures, and policy violations are detected.
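
The idea of stopping dangerous actions while letting routine ones through can be expressed as a small decision function. This is a sketch under stated assumptions, not how Codex implements approvals: `HarnessPolicy`, `ProposedAction`, and `decide` are hypothetical names, and the folder prefixes and commands are examples only. The sketch treats any command that is not on the routine list as needing approval.

```typescript
type Decision = "allow" | "needs-approval" | "deny";

// Hypothetical policy shape; adjust the fields to the project's own rules.
interface HarnessPolicy {
  protectedPrefixes: string[]; // folders that must never be edited
  routineCommands: string[];   // exact commands that may run without approval
}

interface ProposedAction {
  kind: "edit" | "command";
  target: string; // file path for edits, full command line for commands
}

function decide(policy: HarnessPolicy, action: ProposedAction): Decision {
  if (action.kind === "edit") {
    const isProtected = policy.protectedPrefixes.some((prefix) =>
      action.target.startsWith(prefix),
    );
    return isProtected ? "deny" : "allow";
  }
  // Commands must match a routine entry exactly; everything else pauses for a human.
  return policy.routineCommands.includes(action.target) ? "allow" : "needs-approval";
}

// Illustrative policy for the portfolio project.
const harnessPolicy: HarnessPolicy = {
  protectedPrefixes: [".github/", "content/private/"],
  routineCommands: ["npm run lint", "npm test"],
};

console.log(decide(harnessPolicy, { kind: "edit", target: "components/ProjectCard.tsx" }));
console.log(decide(harnessPolicy, { kind: "edit", target: ".github/workflows/deploy.yml" }));
console.log(decide(harnessPolicy, { kind: "command", target: "npm run build" }));
```

Exact matching on commands is deliberate. A prefix match would let a chained command that merely starts with a routine entry slip past the gate.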

Move forward when: Tools beyond files and the shell are connected with limited purpose and permissions.

Level 7: Tool Operator

State: Codex uses browsers, MCP, connectors, images, or document tools and verifies evidence outside the codebase.

Typical work: Inspect an article list in a browser, read GitHub or CMS post data through a connector, or compare a screenshot with an implementation.

Completion standard: Read and write permissions, sensitive-data boundaries, and consequential actions requiring confirmation are defined for every tool.
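
"Defined for every tool" is easy to state and easy to let drift as tools are added. The following sketch shows one possible completeness check. The `ToolPolicy` fields and the tool names are assumptions made for illustration; they do not correspond to any real MCP or connector configuration format.

```typescript
// Hypothetical per-tool policy covering the items in the completion standard.
interface ToolPolicy {
  canRead: boolean;
  canWrite: boolean;
  sensitiveDataAllowed: boolean;
  confirmBefore: string[]; // consequential actions that need human confirmation
}

// Returns the connected tools that have no policy entry at all.
function toolsWithoutPolicy(
  connectedTools: string[],
  policies: Map<string, ToolPolicy>,
): string[] {
  return connectedTools.filter((tool) => !policies.has(tool));
}

// Illustrative policies: the browser is read-only, the CMS connector can write
// but must pause before publishing.
const toolPolicies = new Map<string, ToolPolicy>([
  ["browser", { canRead: true, canWrite: false, sensitiveDataAllowed: false, confirmBefore: [] }],
  ["cms", { canRead: true, canWrite: true, sensitiveDataAllowed: false, confirmBefore: ["publish"] }],
]);

const undefinedTools = toolsWithoutPolicy(["browser", "cms", "screenshots"], toolPolicies);
console.log(undefinedTools); // tools to define a policy for, or disconnect
```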

Move forward when: Independent work can be split into non-conflicting parallel tasks.

Level 8: Parallel Orchestrator

State: Multiple threads, worktrees, cloud tasks, or subagents execute independent work in parallel.

Typical work: Separate implementation, test additions, and documentation updates, then perform integration verification.

Completion standard: File ownership, dependencies, integration order, and conflict ownership are explicit. Multiple tasks do not edit the same files simultaneously.
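
The rule that tasks must not edit the same files can be checked before any task starts. The sketch below assumes each task declares the files it owns up front; `ParallelTask` and `findOwnershipConflicts` are hypothetical names, and the task split is only an example.

```typescript
// Hypothetical task declaration: each parallel task lists the files it owns.
interface ParallelTask {
  id: string;
  ownedFiles: string[];
}

// Returns each file claimed by more than one task, with the claiming task ids.
function findOwnershipConflicts(tasks: ParallelTask[]): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const task of tasks) {
    for (const file of task.ownedFiles) {
      const claimedBy = owners.get(file) ?? [];
      claimedBy.push(task.id);
      owners.set(file, claimedBy);
    }
  }
  const conflicts = new Map<string, string[]>();
  owners.forEach((ids, file) => {
    if (ids.length > 1) conflicts.set(file, ids);
  });
  return conflicts;
}

// Illustrative plan: implementation, tests, and documentation run in parallel.
const taskPlan: ParallelTask[] = [
  { id: "implementation", ownedFiles: ["app/projects/page.tsx", "lib/projects.ts"] },
  { id: "tests", ownedFiles: ["lib/projects.test.ts"] },
  { id: "docs", ownedFiles: ["README.md", "lib/projects.ts"] },
];

const ownershipConflicts = findOwnershipConflicts(taskPlan);
ownershipConflicts.forEach((ids, file) => {
  console.log(`${file} is claimed by: ${ids.join(", ")}`);
});
```

An empty result means the plan has no overlapping ownership on paper. It does not cover dependencies or integration order, which still need to be written down separately.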

Move forward when: Parallel work becomes part of recurring or event-driven operations.

Level 9: Workflow Operator

State: Codex participates in recurring workflows such as CI investigation, issue triage, dependency analysis, and documentation synchronization.

Typical work: Perform first-pass CI failure analysis, identify low-coverage areas, or run scheduled consistency checks.

Completion standard: Triggers, stop conditions, timeouts, notifications, retries, audit logs, and human handoff conditions are defined.
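
Three of these items (retries, audit logs, and human handoff) fit in a short sketch. The code below is a simplified, synchronous illustration with hypothetical names (`WorkflowLimits`, `runWithHandoff`). It leaves out triggers, timeouts, and notifications, which depend on the scheduler or CI system in use.

```typescript
interface WorkflowLimits {
  maxAttempts: number; // retry budget before a human takes over
}

interface WorkflowOutcome {
  status: "succeeded" | "handed-off";
  attempts: number;
  auditLog: string[];
}

// Runs a step until it succeeds or the retry budget is spent,
// recording every attempt so the run can be audited afterwards.
function runWithHandoff(step: () => boolean, limits: WorkflowLimits): WorkflowOutcome {
  const auditLog: string[] = [];
  for (let attempt = 1; attempt <= limits.maxAttempts; attempt++) {
    let ok = false;
    try {
      ok = step();
    } catch (error) {
      auditLog.push(`attempt ${attempt}: threw ${String(error)}`);
      continue;
    }
    auditLog.push(`attempt ${attempt}: ${ok ? "succeeded" : "failed"}`);
    if (ok) {
      return { status: "succeeded", attempts: attempt, auditLog };
    }
  }
  auditLog.push("retry budget exhausted: handing off to a human");
  return { status: "handed-off", attempts: limits.maxAttempts, auditLog };
}

// Example: a check that would only pass on its third run, with a budget of two.
let checkRuns = 0;
const workflowOutcome = runWithHandoff(
  () => {
    checkRuns += 1;
    return checkRuns >= 3;
  },
  { maxAttempts: 2 },
);
console.log(workflowOutcome.status, workflowOutcome.auditLog);
```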

Move forward when: Individual workflows are managed as a shared platform and continuously evaluated.

Level 10: Agent Platform Architect

State: Multiple agents, tools, harnesses, evaluations, and audit controls form a reusable platform across projects or an organization.

Typical work: Separate planning, implementation, review, and security roles, then improve rules and skills using evaluation results.

Completion standard: The platform has role separation, least privilege, quality metrics, cost limits, audit trails, failure shutdown procedures, and improvement loops.

Level 10 does not remove humans. Humans still define specifications, permissions, quality standards, and exception handling while governing the work of the agent system.

Assess Your Current Level

Use the highest stage you can reproduce routinely, not the highest stage that succeeded once.

Question	Level
Do you use answers manually without repository inspection?	Level 0
Can Codex investigate using real code evidence?	Level 1
Can you review a tightly scoped diff?	Level 2
Can Codex complete a small feature with tests?	Level 3
Are working rules applied persistently through `AGENTS.md` or similar files?	Level 4
Can issues, PRs, and reviews be handled consistently?	Level 5
Are permissions, approvals, and checks managed as a harness?	Level 6
Can external tools be used within defined permission boundaries?	Level 7
Can multiple tasks run in parallel without conflicts?	Level 8
Can recurring workflows be monitored and recovered?	Level 9
Can multiple workflows be evaluated, audited, and improved?	Level 10
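
The assessment can be sketched as a small function. This version assumes the levels are cumulative, so it stops at the first question that is not answered "yes, routinely". That reading is an interpretation of the table rather than something the page states, and the `assessLevel` name is hypothetical.

```typescript
// routinely[0] answers the Level 1 question, routinely[9] the Level 10 question.
// A value is true only when the practice is reproducible, not a one-off success.
function assessLevel(routinely: boolean[]): number {
  let level = 0; // Level 0 is the baseline: answers applied manually
  for (let i = 0; i < routinely.length && i < 10; i++) {
    if (!routinely[i]) break; // assumption: a gap caps the level
    level = i + 1;
  }
  return level;
}

// Example: Levels 1-4 are routine, Level 5 is not, Level 6 worked once or twice.
const currentLevel = assessLevel([true, true, true, true, false, true]);
console.log(`Current level: ${currentLevel}`);
```

Under this assumption, the example team is at Level 4 even though some Level 6 practices exist, because GitHub collaboration is not yet consistent.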

See the references for the external specifications and background sources used on this page.[1][2][3]

References

Summary

Codex Levels 0-10 measure the maturity of context, verification, permissions, and operations supporting delegation.
Levels 0-3 cover advice through verified implementation; Levels 4-6 cover instructions, GitHub, and harnesses; Levels 7-10 cover tools, parallelism, recurring operations, and platform design.
The goal is not the highest level. The goal is a reproducible level appropriate for the project’s risk and scale.
