What Is Grok Build? xAI's Official CLI Coding Agent
[Grok Build](https://x.ai/cli) is xAI's official CLI coding agent, released as early beta on May 14, 2026 (sources: the official page, the changelog, and npaka's Japanese write-up on note).
Alongside Claude Code, Codex CLI, Gemini CLI, and agmsg, Grok Build is xAI's entry in the CLI coding-agent space. Its key differentiators are up to 8 parallel sub-agents isolated in separate Git worktrees, a Plan-Review-Approve workflow, ACP (Agent Client Protocol) + MCP servers, and local execution (air-gap-compatible).
Model & pricing:
- Internal model: grok-build-0.1 (256K context, API rates $1.00/M input, $2.00/M output); Composer 2.5 for SuperGrok / X Premium+
- SWE-Bench Verified 70.8% (vendor-reported; ~17 pts behind Claude Opus 4.7 87.6% and GPT-5.5 88.7%)
- Access gated to SuperGrok Heavy ($300/mo; $99/mo introductory promo), SuperGrok, and X Premium+
Install: `curl -fsSL https://x.ai/cli/install.sh | bash`.
Integrations: AGENTS.md / plugins / hooks / Skills / Plugin Marketplace (MongoDB / Vercel / Sentry / Chrome DevTools / Cloudflare / Superpowers) / Agent Dashboard / `/goal` long-running autonomous mode / headless `-P` flag for CI/CD.
Strategic context: the February 2026 SpaceX × xAI merger, SpaceX's disclosed $60B acquisition option on Cursor, and Composer 2.5 trained on Colossus, xAI's in-house supercomputer. Grok Build sits inside an unusually vertically integrated stack.
Caveats: roughly 5–15× the price of the $20/mo Claude Code / Codex CLI bundles (15× at the $300/mo list price, about 5× at the $99/mo promo); ~17 pts behind on SWE-Bench Verified; no IDE integration (terminal only); flagship demos such as Arena Mode remain unshipped. Where it wins: Git-worktree parallelism, xAI ecosystem integration, and local execution.
TL;DR — What Is Grok Build?
[Grok Build](https://x.ai/cli) is xAI's official CLI coding agent, released as early beta on May 14, 2026.
Three takeaways:
1. 8 parallel sub-agents isolated in Git worktrees: a real architectural difference from Claude Code (shared workspace) and Codex CLI (subprocess)
2. xAI ecosystem integration: Composer 2.5 trained on Colossus, grok-build-0.1 (256K context), Plugin Marketplace (MongoDB / Vercel / Sentry / Cloudflare etc.)
3. Premium pricing: gated to SuperGrok Heavy ($300/mo, $99/mo promo), about 5–15× the price of the $20/mo Claude Code / Codex CLI bundles
This column sits alongside our Claude Code Agent View, agmsg, cmux, and Ornith-1.0 coverage as part of the June 2026 CLI coding-agent cluster.
Release Timeline
| Date | Event |
|---|---|
| Feb 2026 | SpaceX × xAI merger; SpaceX holds a $60B acquisition option on Cursor |
| May 14, 2026 | Grok Build early beta released, initially limited to SuperGrok Heavy |
| Jun 1, 2026 | xAI launches Composer 2.5 inside the Grok Build CLI (3 days after grok-build-0.1 entered public API beta) |
| Week of Jun 3, 2026 | Kilo Code integration, V9-Medium training completes |
| Jun 15, 2026 | Grok Build v0.2.52 (tool auto-approval state, ER diagrams, Respect manual folds, Ctrl+X, etc.) |
| Week of Jun 17, 2026 | Grok Build 0.1 enters xAI's public API beta |
| Late Jun 2026 | `/goal` long-running autonomous mode added |
Japanese coverage by npaka on note summarizes the early-beta announcement.
Architecture — Parallel Sub-agents in Isolated Git Worktrees
Grok Build's headline differentiator is up to 8 sub-agents running in independent Git worktrees:
- Claude Code: multiple agents in a shared workspace
- Codex CLI: multiple agents via subprocesses
- Grok Build: isolated per Git worktree; each sub-agent operates on its own branch, so collisions and overwrites can't happen structurally
This makes it a natural fit for "try several solutions in parallel and merge the best one later" — the same shape as the Loop Engineering Maker-Checker pattern and agent ensembles.
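The isolation idea can be illustrated with plain Git, independent of Grok Build itself. The following is a minimal sketch, assuming `git` is on the PATH; the `agent-N` branch names and the helper functions are hypothetical, and nothing here reflects how Grok Build creates its worktrees internally.

```python
import subprocess
import tempfile
from pathlib import Path


def run_git(args, cwd):
    """Run a git command in `cwd` and return its stdout as text."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def create_isolated_worktrees(repo: Path, count: int) -> dict[str, Path]:
    """Create `count` worktrees next to `repo`, each on its own new branch."""
    worktrees = {}
    for i in range(1, count + 1):
        branch = f"agent-{i}"  # hypothetical naming scheme, not Grok Build's
        path = repo.parent / f"{repo.name}-{branch}"
        # `git worktree add -b <branch> <path>` creates the branch and checks
        # it out in a separate directory that shares the same object store.
        run_git(["worktree", "add", "-b", branch, str(path)], cwd=repo)
        worktrees[branch] = path
    return worktrees


def demo(count: int = 3) -> dict[str, str]:
    """Build a throwaway repo, add worktrees, report each checked-out branch."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "repo"
        repo.mkdir()
        run_git(["init", "-q"], cwd=repo)
        # A worktree needs at least one commit to branch from.
        run_git(
            ["-c", "user.name=demo", "-c", "user.email=demo@example.com",
             "commit", "--allow-empty", "-q", "-m", "initial commit"],
            cwd=repo,
        )
        worktrees = create_isolated_worktrees(repo, count)
        return {
            branch: run_git(["rev-parse", "--abbrev-ref", "HEAD"], cwd=path).strip()
            for branch, path in worktrees.items()
        }


if __name__ == "__main__":
    print(demo(3))
```

Each directory has its own working files and its own checked-out branch, so parallel edits in one worktree do not touch the files in another; only the merge step brings them back together.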
Plan-Review-Approve Workflow
Like Claude Code, planning happens before any code is touched:
1. User submits a task
2. Grok Build writes an execution plan
3. User can approve, reject, comment on, or fully rewrite the plan
4. Once approved, implementation runs and changes show as diffs
5. Comments work at the per-step level too
By design it structurally reduces uncontrolled-execution risk — no code change without a reviewable plan.
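The gating logic described above can be sketched as a small state machine. This is an illustrative model only: the `Plan`, `PlanStatus`, and `execute` names are assumptions made for this example and are not taken from Grok Build's code or API.

```python
from dataclasses import dataclass, field
from enum import Enum


class PlanStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Plan:
    """A reviewable execution plan; all names here are illustrative."""
    steps: list[str]
    status: PlanStatus = PlanStatus.PENDING
    comments: dict[int, list[str]] = field(default_factory=dict)

    def comment(self, step_index: int, text: str) -> None:
        """Attach a reviewer comment to a single step."""
        if not 0 <= step_index < len(self.steps):
            raise IndexError(f"no step {step_index}")
        self.comments.setdefault(step_index, []).append(text)

    def rewrite(self, new_steps: list[str]) -> None:
        """Replace the plan; a rewritten plan must be reviewed again."""
        self.steps = list(new_steps)
        self.comments.clear()
        self.status = PlanStatus.PENDING

    def approve(self) -> None:
        self.status = PlanStatus.APPROVED

    def reject(self) -> None:
        self.status = PlanStatus.REJECTED


def execute(plan: Plan) -> list[str]:
    """Refuse to touch code unless the plan has been approved."""
    if plan.status is not PlanStatus.APPROVED:
        raise PermissionError("plan must be approved before execution")
    # Stand-in for real edits: report each step as a pending diff.
    return [f"diff for: {step}" for step in plan.steps]
```

The point of the sketch is the single guard in `execute`: a pending, rejected, or freshly rewritten plan cannot reach the implementation step.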
Models, Context, Pricing
| Item | Value |
|---|---|
| Internal model | grok-build-0.1 (text + image, coding-specialized) |
| Context | 256K tokens (about a quarter of Claude Code's 1M) |
| API price (grok-build-0.1) | $1.00 / 1M input, $2.00 / 1M output |
| Upper-tier model | Composer 2.5 (SuperGrok / X Premium+, Colossus-trained, long-task focus) |
| Related internal | Grok V9-Medium (1.5T parameters, coding-focused) finished training in the same window |
| Access | SuperGrok Heavy ($300/mo; $99/mo introductory promo for 6 months), expanded to SuperGrok and X Premium+ |
| Benchmark | SWE-Bench Verified 70.8% (vendor-reported; about 17 pts behind Opus 4.7 at 87.6% and GPT-5.5 at 88.7%) |
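Two rows of the table reduce to simple arithmetic: the per-token API price and the benchmark gap. The sketch below uses only the figures quoted in the table; the function names and the example token counts are illustrative assumptions, and real billing may differ from this straight-line estimate.

```python
# Figures copied from the table above; they are vendor-reported and may change.
INPUT_USD_PER_MTOK = 1.00
OUTPUT_USD_PER_MTOK = 2.00
SWE_BENCH_VERIFIED = {"grok-build-0.1": 70.8, "Opus 4.7": 87.6, "GPT-5.5": 88.7}


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate grok-build-0.1 API cost for one request or session."""
    cost = (
        input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    )
    return round(cost, 6)


def benchmark_gap(model: str, baseline: str = "grok-build-0.1") -> float:
    """Percentage-point lead of `model` over `baseline` on SWE-Bench Verified."""
    return round(SWE_BENCH_VERIFIED[model] - SWE_BENCH_VERIFIED[baseline], 1)


if __name__ == "__main__":
    print(api_cost_usd(200_000, 50_000))  # hypothetical session size
    print(benchmark_gap("Opus 4.7"), benchmark_gap("GPT-5.5"))
```

For a hypothetical session of 200K input and 50K output tokens the estimate is $0.30, and the gaps work out to 16.8 and 17.9 points, which is where the "about 17 pts" shorthand comes from.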
Pricing comparison:
- Claude Code (Anthropic): $20/mo bundle
- Codex CLI (OpenAI): $20/mo bundle
- Grok Build: SuperGrok Heavy $300/mo ($99/mo promo), roughly 5–15× more expensive
Ry Walker's independent review calls it "the steepest entry price in the category." Cost-effectiveness is currently against Grok Build.
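The 5–15× range follows directly from the prices quoted above. As a rough sketch, the snippet below also projects a first-year bill; it assumes the subscription reverts to the $300/mo list price once the 6-month promo ends, which this column does not confirm, and the function names are illustrative.

```python
BUNDLE_USD = 20        # Claude Code / Codex CLI bundle, per month
HEAVY_LIST_USD = 300   # SuperGrok Heavy list price, per month
HEAVY_PROMO_USD = 99   # introductory promo price, per month
PROMO_MONTHS = 6


def price_multiple(price: float, baseline: float = BUNDLE_USD) -> float:
    """How many times more expensive `price` is than `baseline`."""
    return round(price / baseline, 2)


def first_year_cost(list_price: int, promo_price: int,
                    promo_months: int, months: int = 12) -> int:
    """Total cost over `months`, ASSUMING list price applies after the promo."""
    promo = min(promo_months, months)
    return promo * promo_price + (months - promo) * list_price


if __name__ == "__main__":
    print(price_multiple(HEAVY_PROMO_USD), price_multiple(HEAVY_LIST_USD))
    print(first_year_cost(HEAVY_LIST_USD, HEAVY_PROMO_USD, PROMO_MONTHS))
    print(12 * BUNDLE_USD)
```

Under that assumption the multiples are 4.95× and 15×, and the first year costs $2,394 against $240 for a $20/mo bundle.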
Feature Set
Core:
- Plan Mode: pre-execution plan review, step-level comments and rewrites
- Parallel sub-agents: up to 8, isolated per Git worktree
- Headless mode `-P`: CI/CD integration with streaming JSON
- `/goal` long-running autonomous mode: hand off larger tasks to autonomous execution; status / pause / resume / clear
- Agent Dashboard: single-pane management of multiple sessions and dispatches
- `grok inspect`: configuration introspection
Extension surfaces:
- AGENTS.md: auto-detection of project conventions
- plugins / hooks / skills: same shape as the Claude Code ecosystem
- MCP servers: standard support
- ACP (Agent Client Protocol): for building custom bots and orchestration
- Plugin Marketplace: MongoDB / Vercel / Sentry / Chrome DevTools / Cloudflare / Superpowers, plus publishable user plugins
Approval modes: `--always-approve` for full automation; default is `ask`.
Install
Install:
`curl -fsSL https://x.ai/cli/install.sh | bash`
Prerequisite: an active SuperGrok Heavy / SuperGrok / X Premium+ subscription and login.
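Local execution: code runs on the user's machine, which xAI's public messaging presents as air-gap-compatible and data-sovereignty-friendly. The execution model is the same shape as Claude Code / Codex CLI, and model inference still calls xAI's cloud (see Q4 in the FAQ), so a fully offline deployment is not possible.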
Strategic Context — xAI × SpaceX × Colossus × Cursor
Read as a standalone product, Grok Build looks like "a more expensive, less benchmarked Claude Code." In the context of xAI's vertical stack, the positioning shifts:
- SpaceX × xAI merger (Feb 2026): AI and space resources consolidated under Elon Musk
- SpaceX holds a $60B acquisition option on Cursor: Cursor could be folded into the xAI stack
- Composer 2.5 is trained on Colossus, xAI's own supercluster with hundreds of thousands of GPUs
- Grok Build is the coding frontend of this vertical stack
Short-term it lags Claude Code / Codex; medium-term, the Colossus compute and the Cursor option could change the picture.
Competitive Positioning (CLI Coding Agents, June 2026)
| Aspect | Grok Build | Claude Code | Codex CLI | agmsg |
|---|---|---|---|---|
| Parallelism | 8 (Git worktree) | Unlimited (shared) | Multiple (subprocess) | Peer messaging only |
| Isolation | Git worktree | Shared | Subprocess | N/A |
| Benchmark (SWE-Bench Verified) | 70.8% (vendor) | 87.6% | 88.7% | N/A (messaging layer) |
| Pricing | $99–300/mo | $20/mo | $20/mo | Free (MIT) |
| Maturity | Early beta | GA | GA | GA |
| IDE integration | None (terminal only) | Yes | Yes | N/A |
| MCP / ACP | Both | Mostly MCP | MCP | N/A |
| Local execution | Yes | Yes | Yes | Yes |
Grok Build's strengths:
1. Git-worktree parallelism: ideal for parallel alternative solutions
2. Out-of-the-box xAI Plugin Marketplace: MongoDB / Vercel / Sentry / Cloudflare integrations from day one
3. `/goal` long-running autonomous mode
4. Forward-looking integration with the xAI stack: Composer 2.5 / Cursor / Colossus
Weaknesses:
1. Price: 5–15× Claude Code / Codex CLI
2. Benchmarks: 17 pts behind on SWE-Bench Verified
3. No IDE integration: terminal only
4. Unshipped features: Arena Mode, local-first privacy, etc. were promoted but haven't shipped
Use Cases
- Trying multiple solutions in parallel (A/B/C variants in separate Git worktrees → compare, merge winner)
- Existing xAI ecosystem subscribers (SuperGrok Heavy / X Premium+)
- Need first-class MongoDB / Vercel / Sentry / Cloudflare CLI integrations
- Hand off large refactors to `/goal` overnight
- Air-gap / data-sovereignty engagements (local execution)
Avoid for:
- Cost-sensitive teams: Claude Code / Codex CLI at $20/mo are very hard to beat on dollars per task
- Benchmark-driven adoption: currently 17 pts behind Opus 4.7 / GPT-5.5 on SWE-Bench Verified
- Teams that need IDE integration: Cursor / Windsurf / Zed series are likely better fits
Oflight's View — How to Use It
What we recommend in our AI consulting and software development engagements:
Pattern 1: One leg of a multi-vendor CLI strategy
Use Claude Code (implementation) + Codex CLI (review) + Grok Build (parallel experiments), wired together with agmsg. Grok Build's Git-worktree parallelism has unique value for branching exploration — three competing bug fixes side-by-side, or a TypeScript and a Rust implementation of the same feature.
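Once several variants exist side by side, something has to decide which branch to merge. One possible selection rule is sketched below; the `VariantResult` fields and the ranking policy (fully passing suites first, then the smallest diff) are assumptions chosen for illustration, not a feature of Grok Build or agmsg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantResult:
    """Outcome of one parallel experiment; field names are illustrative."""
    branch: str
    tests_passed: int
    tests_total: int
    lines_changed: int


def pick_winner(results: list[VariantResult]) -> VariantResult:
    """Prefer variants whose whole suite passes, then the smallest diff."""
    if not results:
        raise ValueError("no variants to compare")
    fully_passing = [r for r in results if r.tests_passed == r.tests_total]
    # Fall back to all variants if none passes every test.
    pool = fully_passing or results
    return max(
        pool,
        key=lambda r: (r.tests_passed / max(r.tests_total, 1), -r.lines_changed),
    )


if __name__ == "__main__":
    candidates = [
        VariantResult("fix-a", 10, 10, 120),
        VariantResult("fix-b", 10, 10, 45),
        VariantResult("fix-c", 9, 10, 10),
    ]
    print(pick_winner(candidates).branch)
```

In practice the ranking would likely include human review as well; the sketch only shows that the comparison step is easy to make explicit once each variant lives on its own branch.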
Pattern 2: Leverage existing SuperGrok Heavy subscriptions
Customers already on SuperGrok Heavy / X Premium+ can use Grok Build at no extra cost. Keep Claude Code as the primary, use Grok Build for specific tasks (parallel experiments, xAI Plugin Marketplace integrations).
Pattern 3: Forward-looking xAI stack readiness
The Cursor acquisition option / Composer 2.5 / Colossus trajectory is uncertain, but getting hands on the xAI stack now is strategically worth it — as a diversification candidate against vendor-concentration risks like the Claude Fable 5 export-control suspension.
Caveats
- Expensive: $300/mo (or $99/mo promo) is heavy for SMBs
- Lags on benchmarks: vendor-reported 70.8%, 17 pts behind Opus 4.7 / GPT-5.5
- Early beta: frequent breaking changes; the v0.1 → v0.2.x cadence is fast, so be careful in production
- No IDE integration: terminal only
- Some demoed features aren't live, Arena Mode in particular
- Auto-renewal surprises: community reports of unexpected SuperGrok Heavy renewals; worth checking before subscribing
- No independent benchmark verification beyond vendor numbers
Talk to Us About AI Agent Environments — Three Inquiry Funnels
We help design, build, and operate multi-agent environments that include Grok Build.
(1) Evaluation & Requirements (from ¥198,000)
"Grok Build or Claude Code?" "Multi-vendor CLI strategy?" "Is SuperGrok Heavy worth the price for us?" — 1–2 weeks, written report deliverable.
(2) Custom Development & SI (from ¥498,000)
Build a multi-agent automation system combining Grok Build + Claude Code + Codex + agmsg, plus CI/CD integration and Plugin Marketplace usage.
(3) Ongoing Maintenance (¥9,800–¥80,000/month)
Ongoing model-update tracking, new-release evaluation, KPI monitoring for Grok Build / Claude Code / Codex. OpenClaw + Grok Build integration in scope.
- [OpenClaw maintenance](../services/openclaw-setup): Light ¥9,800 / Standard ¥19,800 / Premium ¥49,800 per month
- AI-consulting continuous support: Light ¥30,000/mo / Standard ¥80,000/mo / Premium on request
FAQ
Q1. Difference vs Claude Code?
A. Architecture is Git-worktree parallelism; price is roughly 5–15× higher; benchmark is 17 pts behind; no IDE integration. Differentiators are parallel experiments, xAI ecosystem integration, and `/goal` autonomous mode. It is not a full replacement for Claude Code, more a supplement for specific tasks.
Q2. Is $300/mo SuperGrok Heavy worth it?
A. For existing X Premium+ / SuperGrok subscribers, the marginal cost is small. New from-zero adopters paying just for Grok Build face a steep bill. The $99/mo 6-month promo is the realistic way to evaluate ROI.
Q3. grok-build-0.1 vs Composer 2.5?
A. grok-build-0.1 = the general-purpose API model at $1/$2 per M tokens. Composer 2.5 = the long-task / instruction-following model for SuperGrok / X Premium+. Use grok-build-0.1 for everyday coding and Composer 2.5 for big refactors and long-running work.
Q4. Is the local-execution / air-gap story real?
A. Code runs locally, but inference (grok-build-0.1 / Composer 2.5) still calls xAI's cloud, the same as Claude Code and similar tools. For true offline use, evaluate local LLMs instead.
Q5. Is the Cursor acquisition confirmed?
A. It's an option, not an executed deal. SpaceX has the right to acquire Cursor for $60B. If exercised, Cursor's IDE integration could land inside the xAI stack and Grok Build's boundary may shift. Treat it as medium-term uncertainty.
Q6. Can [agmsg](../columns/agmsg-cross-agent-messaging-cli-ai-2026-06) bridge Grok Build to Claude Code?
A. Likely. agmsg is designed as a cross-vendor CLI agent layer (covering OpenCode etc.), and Grok Build is the same Bash + terminal shape. However, agmsg's official support list does not yet include Grok Build, so integration is likely viable but a PoC is required.
Q7. Does the 17-pt benchmark gap actually matter in production?
A. Task-dependent. On real-world large-scale SI / refactors, the gap shows up; on routine business coding, documentation, and test generation, it often doesn't. A PoC on your own code is mandatory.
Q8. Japanese enterprise procurement?
A. xAI is a US company, so it's a foreign SaaS for Japanese buyers; apply the same cross-border / GDPR / APPI diligence as for Claude / GPT. The realistic split is a hybrid: sensitive workloads on [self-hosted local LLMs](../columns/local-llm-landscape-2026-june-update), general coding on Grok Build / Claude Code.
Bottom Line
Grok Build is xAI's CLI coding agent, released May 14, 2026. Same category as Claude Code / Codex CLI; the unique axes are 8-way Git-worktree parallel sub-agents and the xAI Plugin Marketplace / Composer 2.5 / `/goal` autonomous mode.
Short-term verdict: expensive ($300/mo or $99/mo promo), 17 pts behind on SWE-Bench Verified vs Opus 4.7 / GPT-5.5, terminal-only, with demoed features (Arena Mode) still unshipped — cost-effectiveness currently against xAI.
Medium-term strategic value: the SpaceX × xAI merger + Cursor acquisition option + Colossus-trained Composer 2.5 vertical stack is real, and Grok Build is its coding frontend. Touching the xAI stack now has option value.
Oflight's recommendation: keep Claude Code as the workhorse; use Grok Build for parallel-experiment tasks where Git-worktree isolation matters; SuperGrok Heavy holders should make use of what they're already paying for; tie multiple CLI agents together with agmsg — Grok Build is one leg of a multi-vendor strategy, not a replacement.
References
Primary:
- xAI: Grok Build (x.ai/cli)
- xAI: Grok Build Changelog
- xAI
- @xai on X
Japanese coverage:
- npaka on note: Grok Build write-up
Third-party:
- Big Hat Group: xAI Weekly 2026-06-17
- Big Hat Group: xAI Weekly 2026-06-03
- Digital Applied: Grok Build CLI Parallel Coding Agents
- TechJack: xAI Composer 2.5 in Grok Build CLI
- Ry Walker Research: Grok Build (independent review)
- Releasebot: xAI Release Notes June 2026
- Medium (Noor Mohamad): Best 12 Grok Setups 2026
Related Oflight columns:
- Claude Code Agent View: parallel orchestration
- agmsg: cross-vendor CLI agent messaging
- cmux (Manaflow)
- Ornith-1.0: DeepReinforce agentic-coding LLM
- Kimi K2.7-Code
- Loop Engineering
- Sakana Fugu: cloud-side orchestration
- Claude Fable 5 export-control suspension
- Local LLM June 2026 Update
- Cursor Automations
Oflight services:
- AI Consulting
- OpenClaw setup
- Software Development
Inquiries:
- AI consulting / PoC
- Custom development / SI
- OpenClaw + Grok Build integration
Note: the official x.ai/cli page returned 403 to automated fetching, so this column is grounded in npaka's Japanese write-up and multiple third-party sources. Specifications and pricing reflect June 27, 2026. Grok Build is early beta and details change frequently, so re-check the official changelog before any production decision.