OpenAI Codex Beginner's Guide — Getting Started with Cloud-Based Coding Agent [2026]
OpenAI Codex is a cloud-based autonomous coding agent launched in May 2025. It connects to your GitHub repository via ChatGPT or a desktop app to automate code writing, testing, and PR creation. This guide covers core concepts, a 5-step quickstart, and a model comparison table.
What Is OpenAI Codex? — How It Differs from the Original Codex
OpenAI Codex is a cloud-based autonomous coding agent announced in May 2025. You give it a task in plain text, and it writes code, runs tests, and returns a diff, all inside an isolated cloud sandbox.
Important: This is completely different from the original Codex (a GPT-3-based code completion model released in 2021). The current Codex is an autonomous agent built on o3/GPT-5 series models. Its purpose is task delegation, not line-by-line code completion.
In February 2026, native desktop apps for macOS and Windows and IDE extensions were also released.
Architecture Diagram — How Does Codex Work?
Here is a visual breakdown of how Codex processes a task.
```
[ChatGPT / Desktop App / CLI]
  ↓
[App Server]
  ↓
[Cloud Sandbox (isolated)]
  ↓
[Repository Clone]
  ↓
[Setup: Install dependencies (network ON)]
  ↓
[Agent: Edit code, run tests (network OFF)]
  ↓
[Diff + Logs + Test Results]
  ↓
[User Review → GitHub PR]
```
During the setup phase, internet access is allowed so that npm, pip, or other package managers can install dependencies. The agent execution phase runs offline to prevent unintended external access.
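The network rule for the two sandbox phases can be modeled as a small lookup. This is only an illustrative sketch: `Phase`, `PIPELINE`, and `can_reach_network` are hypothetical names chosen for this example and are not part of any Codex API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One stage of the sandbox run (hypothetical model, not a Codex type)."""
    name: str
    network_allowed: bool


# The two phases described in the article, in execution order.
PIPELINE = (
    Phase("setup", network_allowed=True),   # package managers install dependencies
    Phase("agent", network_allowed=False),  # code edits and test runs happen offline
)


def can_reach_network(phase_name: str) -> bool:
    """Return True if the named phase is allowed outbound network access."""
    for phase in PIPELINE:
        if phase.name == phase_name:
            return phase.network_allowed
    raise ValueError(f"unknown phase: {phase_name}")
```

One practical consequence of this split is that anything the tests need from the network has to be fetched during setup, because the agent phase cannot download it later.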
Available Interfaces and Model Comparison
Available interfaces
| Interface | Highlights | Best For |
|---|---|---|
| Web (chatgpt.com) | Browser-based, no install | Beginners / quick trials |
| Desktop App (macOS/Windows) | Native, push notifications | Daily use |
| VS Code Extension | In-editor Codex panel | Frontend developers |
| JetBrains Extension | IntelliJ / PyCharm support | Backend developers |
| Xcode Extension | Swift / iOS workflow | iOS engineers |
| CLI | Scriptable, automation-ready | DevOps / advanced users |
Model comparison as of April 2026
| Model | SWE-bench | Highlights | Released |
|---|---|---|---|
| codex-1 | — | First-gen, o3-based | May 2025 |
| GPT-5.3-Codex | 78.2% | Production standard, balanced | Feb 2026 |
| GPT-5.3-Codex-Spark | — | 15x faster via Cerebras chips | Feb 2026 |
| GPT-5.4 | 57.7% | Latest, 1M context window | Mar 2026 |
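SWE-bench measures the percentage of real GitHub issues a model can autonomously resolve. A dash in the SWE-bench column means no score is listed here. Scores reported on different SWE-bench variants are not directly comparable, so treat the column as a rough guide rather than a strict ranking.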
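As a quick illustration of how such a percentage is computed, the sketch below divides resolved issues by attempted issues. The function name and the numbers in the usage note are made up for the example and are not real benchmark data.

```python
def resolve_rate(resolved: int, attempted: int) -> float:
    """Return the share of attempted issues that were resolved, in percent."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    if not 0 <= resolved <= attempted:
        raise ValueError("resolved must be between 0 and attempted")
    return 100.0 * resolved / attempted
```

For instance, a hypothetical run that resolves 150 of 200 issues gives `resolve_rate(150, 200)`, which is 75.0.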
5-Step Quickstart — How to Use Codex Right Now
Follow these steps to get started with Codex immediately.
```
① Subscribe to ChatGPT Plus or higher
  ↓
② Open chatgpt.com or the desktop app
  ↓
③ Open the Codex panel and connect your GitHub account (OAuth)
  ↓
④ Select a repository and enter your task in plain text
  ↓
⑤ Review the diff and test results, then create and merge a PR
```
Step ③ note: Codex requests read/write access to your repositories. During the first setup you can limit the scope to specific repositories for added security.
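For step ④, it can help to write the task in a consistent shape before pasting it in. The helper below is a hypothetical convenience, not a Codex feature: `build_task` and its three fields are assumptions about what a focused plain-text task might contain.

```python
def build_task(goal: str, files: list[str], done_when: str) -> str:
    """Assemble a plain-text task description with a goal, scope, and exit condition."""
    lines = [f"Goal: {goal}", "Files in scope:"]
    lines += [f"- {path}" for path in files]
    lines.append(f"Done when: {done_when}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_task(
        goal="Fix the off-by-one error in pagination",
        files=["app/pagination.py", "tests/test_pagination.py"],
        done_when="pytest passes",
    ))
```

Naming the files and the exit condition keeps the task narrow, which makes the resulting diff easier to review.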
"Code" vs "Ask" — When to Use Each Mode
Codex offers two distinct modes.
| Mode | Behavior | Use Case | Time |
|---|---|---|---|
| Code | Runs task in background sandbox | Write, fix, and test code | 1–30 min |
| Ask | Real-time conversational answers | Explain code, discuss architecture | Seconds–1 min |
Code mode spins up a full sandbox and actually writes code, so it takes longer — but you can work on something else while it runs. Ask mode reads the repository and answers in real time, ideal for quick clarifications. Recommended flow: Use Ask to design → Use Code to implement → Use Ask to clarify the diff → Merge the PR.
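The choice between the two modes, and the recommended flow, can be written down as a tiny decision rule. This is a sketch of the reasoning only: `Mode`, `pick_mode`, and `RECOMMENDED_FLOW` are hypothetical names, and the real mode is selected in the Codex interface.

```python
from enum import Enum


class Mode(Enum):
    CODE = "code"  # background sandbox task that writes and tests code
    ASK = "ask"    # real-time answers about the repository


def pick_mode(changes_files: bool) -> Mode:
    """Use Code when the task must modify the repository, otherwise Ask."""
    return Mode.CODE if changes_files else Mode.ASK


# The recommended flow from the article, ending with a human merging the PR.
RECOMMENDED_FLOW = [
    ("design", Mode.ASK),
    ("implement", Mode.CODE),
    ("clarify the diff", Mode.ASK),
]
```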
Supported Languages and Sandbox Modes
Supported languages (any language whose test runner runs in a Linux container)
| Language | Test Frameworks | Notes |
|---|---|---|
| Python | pytest, unittest | Most battle-tested |
| TypeScript / JavaScript | Jest, Vitest, Mocha | Node.js and Deno supported |
| Go | go test | Works with standard toolchain |
| Rust | cargo test | Auto-installs Cargo dependencies |
| Java | JUnit (run via Maven or Gradle) | Full JVM ecosystem |
| C / C++ | GoogleTest (built with CMake) | Reads Makefiles automatically |
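To check locally which test command a repository is likely to need, a marker-file lookup along these lines may help. It is an illustrative sketch and does not describe how Codex itself detects a toolchain: the `TEST_COMMANDS` mapping and `guess_test_command` are assumptions, and the commands are only common conventions (a project with `pyproject.toml` may not use pytest, for example). C and C++ are left out because they usually need a build step first.

```python
from pathlib import Path

# Marker file -> conventional test command (checked in this order).
TEST_COMMANDS = {
    "pyproject.toml": ["pytest"],
    "package.json": ["npm", "test"],
    "go.mod": ["go", "test", "./..."],
    "Cargo.toml": ["cargo", "test"],
    "pom.xml": ["mvn", "test"],
}


def guess_test_command(repo_root):
    """Return the first matching test command, or None if no marker file is found."""
    root = Path(repo_root)
    for marker, command in TEST_COMMANDS.items():
        if (root / marker).is_file():
            return command
    return None
```

If the function returns `None`, the repository may have no recognizable test setup, in which case the output will need manual review.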
The 3 sandbox permission modes
| Mode | Description | Recommended For |
|---|---|---|
| workspace-write | Write access within the repo only (default) | Covers most use cases |
| read-only | File reading only, no modifications | Code review and analysis |
| danger-full-access | Allows writes outside the repo | Advanced users / CI integration |
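The three modes amount to a write-permission rule, which the sketch below expresses as a path check. It is a simplified model for intuition only: `may_write` is a hypothetical function, and the real sandbox enforces its rules at a lower level than this.

```python
from pathlib import Path


def may_write(mode: str, repo_root: str, target: str) -> bool:
    """Return True if `target` may be written under the given sandbox mode."""
    if mode == "read-only":
        return False  # no modifications anywhere
    if mode == "danger-full-access":
        return True   # writes outside the repo are allowed
    if mode == "workspace-write":
        root = Path(repo_root).resolve()
        # Relative targets are taken relative to the repo; ".." is collapsed.
        path = (root / target).resolve()
        return path == root or root in path.parents
    raise ValueError(f"unknown sandbox mode: {mode}")
```

Under this model, `may_write("workspace-write", "/repo", "../etc/passwd")` is rejected because the resolved path falls outside the repository root.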
Having automated tests dramatically improves output quality since Codex can self-verify its changes.
Pricing and Plans
Using Codex requires a ChatGPT Plus plan or higher.
| Plan | Monthly Price | Codex Access | Parallel Tasks |
|---|---|---|---|
| Free | $0 | Not available | — |
| Plus | $20/mo | Limited | 1–3 |
| Pro | $200/mo | Unlimited | 10+ |
| Team / Enterprise | Custom | Unlimited + admin controls | Custom |
For large-scale enterprise use, Team or Enterprise plans are recommended. Direct API access is available on a pay-per-use basis, starting at approximately $1.50 USD per 1M input tokens depending on the model.
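A rough input-cost estimate follows directly from the per-million-token rate. The sketch below uses the approximate $1.50 figure quoted above as a default. Actual rates vary by model, and output tokens are not covered because this guide does not give a price for them.

```python
def input_cost_usd(input_tokens: int, usd_per_million: float = 1.50) -> float:
    """Estimate the input-token cost in USD at a given per-million-token rate."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * usd_per_million
```

For example, `input_cost_usd(2_000_000)` evaluates to 3.0 at the default rate.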
Frequently Asked Questions
Q1. Does Codex work with private repositories?
Yes. After connecting via GitHub OAuth, you can select any private repository. Code is deleted from the sandbox after the task completes.
Q2. Is it compatible with the old Codex API?
No. The original Codex models (e.g., davinci-codex) were deprecated in March 2023. The current Codex agent is accessed through the ChatGPT UI or a separate API endpoint.
Q3. Does Codex automatically merge pull requests?
No. Codex creates a diff and a draft PR, but merging always requires your approval.
Q4. Can I use Codex without an internet connection?
Codex is a cloud service, so an internet connection is required. There is currently no on-premises version.
Q5. How large a codebase can Codex handle in one task?
GPT-5.4 supports a 1M-token context window. Limiting each task to a focused set of files yields the most reliable results.
Q6. Can Codex be used on a repository with no tests?
Yes, but without automated tests Codex cannot self-verify its changes, so you will need to review the output manually.
Q7. How is Codex different from GitHub Copilot and Cursor?
Copilot provides real-time inline suggestions as you type. Cursor is an in-editor local agent. Codex autonomously executes complete tasks in a cloud sandbox and opens pull requests. Think of it as a delegated agent, not a completion tool.
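To get a feel for the context-window limit mentioned in Q5, you can estimate token counts from file sizes. The sketch below assumes roughly four characters per token, which is only a rule of thumb and not the model's actual tokenizer. Both function names are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def fits_context(texts: list[str], limit: int = 1_000_000) -> bool:
    """Return True if the combined estimate stays within the token limit."""
    return sum(estimate_tokens(text) for text in texts) <= limit
```

Even when a set of files fits, a smaller and more focused selection tends to be easier for both the agent and the reviewer to reason about.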