How to Use Codex Levels
About 10 minutes
When you first use Codex, asking it to fix a piece of code can already feel useful. But to use it reliably in real work, you need more than one-off code generation. You need to design how work is split, how results are verified, and how changes are reviewed.
Codex Levels are a map for that maturity curve. Level 0 is advice. Level 1 is repository understanding. Levels 2-3 are local editing and verification. Levels 4-6 cover context, GitHub collaboration, and harness design. Levels 7 and above move into external tools, parallel work, recurring operations, and platform design.
What Codex Levels Mean
Codex Levels describe how much work you can safely delegate to Codex. A higher level does not mean humans disappear. It means the human role moves from individual implementation details toward specification, verification, permissions, and review design.
| Stage | How Codex is used | What the human designs |
|---|---|---|
| Level 0 | Coding advice outside a repository | Assumptions, questions, usage decisions |
| Level 1 | File reading and code investigation | Scope and investigation criteria |
| Level 2-3 | Bounded edits, tests, and small feature completion | Change scope, acceptance criteria, verification commands |
| Level 4-6 | AGENTS.md, GitHub collaboration, and harnesses | Working agreements, review ownership, constraints, approvals |
| Level 7-8 | Browser, MCP, external tools, parallel work | Tool permissions, task ownership, conflict avoidance |
| Level 9-10 | CI, recurring tasks, team operations | Operating rules, auditability, quality standards |
Use the table as a delegation checklist, not as a personal skill ranking.
Core Principles From the Official Docs
OpenAI’s Codex documentation describes Codex as an agent that receives prompts, reads files, edits files, and calls tools while working through a task. In other words, Codex is not just a chat interface that returns an answer. It acts inside a working environment.
That means your prompt should include:
- the target files or feature
- steps to reproduce the problem
- the expected behavior
- verification commands to run
- areas that must not change
- evidence you want in the final report
For example:
Fix the project detail page in src/app/projects/[slug]/page.tsx.
Conditions:
- Do not change the existing ProjectCard component.
- Show projects with featured: false on the detail page.
- Add a failing-case test in tests/projects/slug.test.ts.
- Run npm test -- tests/projects/slug.test.ts after the change.
- Report anything that remains unverified.
If you only say “fix this,” Codex has to guess what completion means. Clear completion criteria align Codex’s work with your review.
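One way to keep the prompt elements listed above from being forgotten is to assemble the request from a structured record. The following is a minimal sketch, not part of Codex or any OpenAI tooling: `TaskRequest`, `missing_fields`, and `render_prompt` are hypothetical names, and the output is plain text that you would paste into a prompt yourself.

```python
from dataclasses import dataclass, field


@dataclass
class TaskRequest:
    """Hypothetical container for the prompt elements listed above."""
    target: str
    expected_behavior: str
    repro_steps: list[str] = field(default_factory=list)
    verification_commands: list[str] = field(default_factory=list)
    must_not_change: list[str] = field(default_factory=list)
    report_evidence: list[str] = field(default_factory=list)


def missing_fields(req: TaskRequest) -> list[str]:
    """Return the names of elements that are still empty, in field order."""
    return [name for name, value in vars(req).items() if not value]


def render_prompt(req: TaskRequest) -> str:
    """Render the request as text, one labeled section per non-empty element."""
    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items)

    sections = [
        ("Target", req.target),
        ("Steps to reproduce", bullets(req.repro_steps)),
        ("Expected behavior", req.expected_behavior),
        ("Verification commands", bullets(req.verification_commands)),
        ("Do not change", bullets(req.must_not_change)),
        ("Include in the final report", bullets(req.report_evidence)),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections if body)
```

Calling `missing_fields` before sending the request gives a quick reminder of what Codex would otherwise have to guess.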
Patterns Seen in Public Workflows
Public posts on LinkedIn and X from experienced Codex users tend to emphasize working conditions over long “magic prompts.” The recurring patterns are:
- Write requests like GitHub issues.
- Keep AGENTS.md as a short table of contents and move detailed rules into separate files.
- Keep each task to about one hour of work or a few hundred changed lines.
- Ask for a final report that includes tests, diffs, and unverified items.
This matches the official guidance: Codex is more reliable with verifiable work, small focused steps, and concrete context.
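The "few hundred lines" guideline can be checked mechanically after a task finishes. The sketch below parses the text produced by `git diff --numstat` (tab-separated added count, deleted count, and path, with `-` in place of counts for binary files). The function names and the default budget of 300 lines are assumptions chosen for illustration, not values taken from the Codex documentation.

```python
def changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        total += int(added) + int(deleted)
    return total


def within_budget(numstat: str, max_lines: int = 300) -> bool:
    """Return True if the diff fits the (assumed) per-task line budget."""
    return changed_lines(numstat) <= max_lines
```

A task that exceeds the budget is not necessarily wrong, but it is a signal to ask whether the request should have been split.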
Level 0-1: Move From Advice to Repository Understanding
At the first stage, do not start by delegating a large implementation. Use Codex to understand the codebase.
Explain where login is implemented in this repository.
List the related files, major functions, and test files in a table.
Do not make changes yet.
This is close to the Ask mode workflow OpenAI recommends for larger changes. Before editing, ask Codex to explain the structure and draft a plan. Once the understanding is correct, move into Code mode or CLI-based edits.
Level 2-3: Delegate a Small Feature With Verification
At the next stage, delegate both implementation and verification. Keep the scope small.
Good request:
Add technology stack badge display to the project card.
Scope:
- components/ProjectCard.tsx
- app/projects/page.tsx
- tests/project-card.test.tsx
Completion criteria:
- Display the first three tags from the tags array as badges.
- Do not change the existing card size or layout.
- npm test -- tests/project-card.test.tsx passes.
At this level, do not accept Codex’s report blindly. Read the diff, confirm the commands it ran, and check whether the test actually covers the changed behavior.
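Part of reading the diff is confirming that the change stayed inside the scope you declared. As a rough sketch, the hypothetical helper below compares a list of changed paths (for example, the lines printed by `git diff --name-only`) against the scope from the request. It uses Python's `fnmatch.fnmatchcase`, in which `*` also matches `/`, so a pattern such as `tests/*` covers nested files.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_files: list[str], allowed_patterns: list[str]) -> list[str]:
    """Return changed paths that match none of the allowed patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


# Scope taken from the request above; the changed list is made-up sample data.
scope = [
    "components/ProjectCard.tsx",
    "app/projects/page.tsx",
    "tests/project-card.test.tsx",
]
changed = ["components/ProjectCard.tsx", "package.json"]
unexpected = out_of_scope(changed, scope)  # paths that need a closer look
```

This only flags files to review; it says nothing about whether the in-scope edits are correct or whether the test covers the new behavior.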
Level 4-6: Add AGENTS.md and Harness Rules
If you use Codex repeatedly, writing the same context in every prompt is inefficient. Put the repository working agreement in AGENTS.md.
# AGENTS.md
## Repository Rules
- Inspect related files and existing patterns before implementation.
- Keep changes minimal; avoid unrelated refactors.
- Run npx tsc --noEmit after TypeScript changes.
- Check the browser with next dev after UI changes.
- Do not write secrets, API keys, or personal data into files.
- In the final report, include changes, verification results, and unverified items.
Longer is not automatically better. A very long instructions file becomes hard for both humans and Codex to use. In practice, the root AGENTS.md works best as a table of contents, with detailed rules split into files for testing, security, UI, and GitHub workflows.
At Level 6, add explicit verification commands, forbidden commands, approval requirements, and CI checks. That is harness design.
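To make the idea of allowed commands, forbidden commands, and approval requirements concrete, here is a small sketch of a command policy. Everything in it is an assumption for illustration: the prefix lists, the `Decision` enum, and `classify` are hypothetical and do not describe how Codex itself enforces approvals. A prefix match is also easy to bypass, so treat this as a way to write the policy down, not as a security boundary.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    FORBID = "forbid"


# Hypothetical policy; a real one would come from the team's working agreement.
FORBIDDEN_PREFIXES = [["git", "push", "--force"], ["rm", "-rf"]]
APPROVAL_PREFIXES = [["git", "push"], ["npm", "publish"]]
ALLOWED_PREFIXES = [["npx", "tsc", "--noEmit"], ["npm", "test"]]


def _starts_with(tokens: list[str], prefix: list[str]) -> bool:
    return tokens[: len(prefix)] == prefix


def classify(command: str) -> Decision:
    """Classify a shell command; forbidden rules are checked first."""
    tokens = shlex.split(command)
    if any(_starts_with(tokens, p) for p in FORBIDDEN_PREFIXES):
        return Decision.FORBID
    if any(_starts_with(tokens, p) for p in APPROVAL_PREFIXES):
        return Decision.NEEDS_APPROVAL
    if any(_starts_with(tokens, p) for p in ALLOWED_PREFIXES):
        return Decision.ALLOW
    # Unknown commands default to human approval rather than silent execution.
    return Decision.NEEDS_APPROVAL
```

Checking forbidden rules before the others matters here because `git push --force` also matches the broader `git push` approval rule.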
Level 7-8: Use Tool Integrations and Parallel Work
Codex is available through CLI, IDE, web, and app surfaces. Local threads read and edit your working tree. Cloud threads work in an isolated environment connected to a GitHub repository. When you start parallel work, avoid having two threads edit the same files.
Work that parallelizes well:
- Add API tests on branch A.
- Update documentation on branch B.
- Investigate dependency impact on branch C.
Work that does not parallelize well:
- Multiple UI edits to the same component.
- Multiple edits to the same migration file.
- Implementing several competing designs before the spec is settled.
Codex app and cloud threads are useful for background work. Review and integration still belong to the human workflow.
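The rule about not letting two threads edit the same files can be checked before any work starts. The sketch below assumes you can list, per planned task, the files it is expected to touch; `find_conflicts` and the task names are hypothetical.

```python
from itertools import combinations


def find_conflicts(tasks: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of tasks to the files both of them plan to edit."""
    conflicts = {}
    for a, b in combinations(sorted(tasks), 2):
        shared = tasks[a] & tasks[b]
        if shared:
            conflicts[(a, b)] = shared
    return conflicts


# Made-up plan: two tasks both want to edit the same component.
plan = {
    "api-tests": {"tests/api.test.ts"},
    "docs": {"README.md"},
    "card-a": {"components/ProjectCard.tsx"},
    "card-b": {"components/ProjectCard.tsx", "README.md"},
}
conflicts = find_conflicts(plan)
```

Any pair that appears in the result is a candidate for running sequentially instead of in parallel, or for redrawing task boundaries.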
Level 9-10: Move Into Operations
At higher levels, Codex becomes part of routine development. Examples include adding tests for low-coverage areas, checking dependency update impact, triaging issues, drafting pull request reviews, and updating documentation.
The more you automate, the more permission design matters.
| Area | What to check |
|---|---|
| Permissions | Repositories Codex can read, branches it can write, commands it can run |
| Auditability | Which instruction produced which diff and which checks were run |
| Review | Human review boundaries and files that always require review |
| Failure handling | CI failures, conflicts, and unverified items |
Level 10 is not “let Codex do everything.” It is the stage where you design the environment, rules, verification, and review process that let Codex work safely.
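For the auditability and failure-handling rows in the table above, one possible shape for a record is sketched here. It ties an instruction to a hash of the resulting diff and to the checks that were run. The `AuditRecord` fields and helper names are assumptions for illustration, not a format defined by Codex or GitHub.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AuditRecord:
    """Hypothetical record linking an instruction to its diff and checks."""
    instruction: str
    diff_sha256: str
    checks: dict[str, bool] = field(default_factory=dict)
    unverified: list[str] = field(default_factory=list)


def make_record(instruction, diff_text, checks, unverified=()):
    """Build a record; the diff is stored as a SHA-256 digest, not as text."""
    digest = hashlib.sha256(diff_text.encode("utf-8")).hexdigest()
    return AuditRecord(instruction, digest, dict(checks), list(unverified))


def to_json_line(record: AuditRecord) -> str:
    """Serialize the record as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


def needs_attention(record: AuditRecord) -> bool:
    """Flag records with no checks, a failed check, or unverified items."""
    if not record.checks:
        return True
    return bool(record.unverified) or not all(record.checks.values())
```

Treating "no checks recorded" as needing attention is a deliberate choice in this sketch: an automated change with no evidence should not pass silently.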
A Practical Level-Up Sequence
- Ask Codex to explain the structure of an existing repository.
- Pick one small change and write target files plus completion criteria.
- Specify the verification command to run after implementation.
- Ask for a final report with diff summary, test results, and unverified items.
- Move repeated rules into AGENTS.md.
This sequence changes Codex from a useful code generator into a verifiable development workflow.
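The final-report step in the sequence above is easier to enforce if the expected sections are named. As a minimal sketch, the hypothetical `missing_sections` helper below checks a report for three headings. The heading names are assumptions based on the items this page asks for (changes, verification results, unverified items), not a format that Codex produces on its own.

```python
REQUIRED_SECTIONS = ("Changes", "Verification", "Unverified")


def missing_sections(report: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return required headings that do not appear on a line of their own.

    A heading may be written as "Changes", "Changes:", or "## Changes".
    """
    headings = {
        line.strip().lstrip("#").strip().rstrip(":").lower()
        for line in report.splitlines()
    }
    return [name for name in required if name.lower() not in headings]


# Made-up report text with the "Unverified" section left out.
report = "## Changes\n- Added badges\n\nVerification:\n- npm test passed\n"
gaps = missing_sections(report)
```

If the list is not empty, ask Codex to complete the report before you start reviewing the diff.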
See the references for the external specifications and background sources used on this page.[1][2][3][4]
References
Summary
- Codex Levels are a way to expand Codex delegation gradually.
- Leveling up is about completion criteria, verification, AGENTS.md, and review design, not longer prompts.
- Small tasks, concrete file references, and verification commands make Codex more reliable.
- As you move into parallel work and automation, define permissions, auditability, and failure handling first.