Oflight Inc.
AI | 2026-06-27

What Is Grok Build? xAI's Official CLI Coding Agent

[Grok Build](https://x.ai/cli) is xAI's official CLI coding agent, released as an early beta on May 14, 2026 (sources: the official page, the changelog, and npaka's Japanese write-up on note).

Alongside Claude Code, Codex CLI, Gemini CLI, and agmsg, Grok Build is xAI's entry in the CLI coding-agent space. Its key differentiators are up to 8 parallel sub-agents isolated in separate Git worktrees, a Plan-Review-Approve workflow, ACP (Agent Client Protocol) + MCP servers, and local execution (air-gap-compatible).

Model & pricing:
- Internal model: grok-build-0.1 (256K context, API rates $1.00/M input, $2.00/M output); Composer 2.5 for SuperGrok / X Premium+
- SWE-Bench Verified 70.8% (vendor-reported; ~17 pts behind Claude Opus 4.7 87.6% and GPT-5.5 88.7%)
- Access gated to SuperGrok Heavy ($300/mo; $99/mo introductory promo), SuperGrok, and X Premium+

Install: `curl -fsSL https://x.ai/cli/install.sh | bash`.

Integrations: AGENTS.md / plugins / hooks / Skills / Plugin Marketplace (MongoDB / Vercel / Sentry / Chrome DevTools / Cloudflare / Superpowers) / Agent Dashboard / `/goal` long-running autonomous mode / headless `-P` flag for CI/CD.

Strategic context: the February 2026 SpaceX × xAI merger, SpaceX's disclosed $60B acquisition option on Cursor, and Composer 2.5 currently training on Colossus, xAI's in-house supercomputer — Grok Build sits inside an unusually vertically-integrated stack.

Caveats: at list price, roughly 15× the cost of the Claude Code / Codex CLI bundles ($300/mo vs. $20/mo; about 5× on the $99/mo promo); ~17 pts behind on SWE-Bench Verified; no IDE integration (terminal only); flagship demos such as Arena Mode remain unshipped. Where it wins: Git-worktree parallelism, xAI ecosystem integration, and local execution.


TL;DR — What Is Grok Build?

[Grok Build](https://x.ai/cli) is xAI's official CLI coding agent, released as early beta on May 14, 2026.

Three takeaways:

1. 8 parallel sub-agents isolated in Git worktrees: a real architectural difference from Claude Code (shared workspace) and Codex CLI (subprocess)
2. xAI ecosystem integration: Composer 2.5 trained on Colossus, grok-build-0.1 (256K context), Plugin Marketplace (MongoDB / Vercel / Sentry / Cloudflare etc.)
3. Premium pricing: gated to SuperGrok Heavy ($300/mo, $99/mo promo), about 5–15× the price of the $20/mo Claude Code / Codex CLI bundles

This column sits alongside our Claude Code Agent View, agmsg, cmux, and Ornith-1.0 coverage as part of the June 2026 CLI coding-agent cluster.

Release Timeline

| Date | Event |
| --- | --- |
| Feb 2026 | SpaceX × xAI merger; SpaceX holds a $60B acquisition option on Cursor |
| May 14, 2026 | Grok Build early beta released, initially limited to SuperGrok Heavy |
| Jun 1, 2026 | xAI launches Composer 2.5 inside the Grok Build CLI (3 days after grok-build-0.1 entered public API beta) |
| Week of Jun 3, 2026 | Kilo Code integration, V9-Medium training completes |
| Jun 15, 2026 | Grok Build v0.2.52 (tool auto-approval state, ER diagrams, Respect manual folds, Ctrl+X, etc.) |
| Week of Jun 17, 2026 | Grok Build 0.1 enters xAI's public API beta |
| Late Jun 2026 | `/goal` long-running autonomous mode added |

Japanese coverage by npaka on note summarizes the early-beta announcement.

Architecture — Parallel Sub-agents in Isolated Git Worktrees

Grok Build's headline differentiator is up to 8 sub-agents running in independent Git worktrees:

- Claude Code: multiple agents in a shared workspace
- Codex CLI: multiple agents via subprocesses
- Grok Build: isolated per Git worktree. Each sub-agent operates on its own branch, so collisions and overwrites can't happen structurally

This makes it a natural fit for "try several solutions in parallel and merge the best one later" — the same shape as the Loop Engineering Maker-Checker pattern and agent ensembles.
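To make the isolation model concrete, here is a minimal sketch of how one worktree per sub-agent could be set up with plain Git. It only builds the `git worktree add` command lines and does not run them. The branch naming scheme, the `../worktrees` directory, and the function name are illustrative assumptions; the source does not describe how Grok Build names or places its worktrees.

```python
from pathlib import PurePosixPath

MAX_SUBAGENTS = 8  # parallelism cap stated in the article


def worktree_commands(task_slug, variants, base="main", root="../worktrees"):
    """Build one `git worktree add` command per variant, each on its own branch.

    Naming and layout are hypothetical; only the git syntax itself is standard.
    """
    if not 1 <= len(variants) <= MAX_SUBAGENTS:
        raise ValueError(f"expected 1..{MAX_SUBAGENTS} variants, got {len(variants)}")
    if len(set(variants)) != len(variants):
        raise ValueError("variant names must be unique")

    commands = []
    for name in variants:
        branch = f"agent/{task_slug}/{name}"  # hypothetical branch scheme
        path = str(PurePosixPath(root) / f"{task_slug}-{name}")
        # Standard form: git worktree add -b <new-branch> <path> <commit-ish>
        commands.append(["git", "worktree", "add", "-b", branch, path, base])
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands("fix-login", ["a", "b", "c"]):
        print(" ".join(cmd))
```

Because each variant gets a separate directory and branch, two agents editing the same file never touch the same working copy; conflicts only surface later, when a branch is merged.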

Plan-Review-Approve Workflow

As in Claude Code, planning happens before any code is touched:

1. User submits a task
2. Grok Build writes an execution plan
3. User can approve, reject, comment on, or fully rewrite the plan
4. Once approved, implementation runs and changes show as diffs
5. Comments work at the per-step level too

By design it structurally reduces uncontrolled-execution risk — no code change without a reviewable plan.
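The workflow can be read as a small state machine in which execution is gated on approval. The sketch below is a conceptual model only: the class names, states, and the rule that a rewritten plan returns to draft are assumptions made for illustration, not Grok Build's internals.

```python
from dataclasses import dataclass, field
from enum import Enum


class PlanState(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Step:
    description: str
    comments: list = field(default_factory=list)


@dataclass
class Plan:
    task: str
    steps: list
    state: PlanState = PlanState.DRAFT

    def comment(self, index, text):
        """Attach a reviewer comment to a single step."""
        self.steps[index].comments.append(text)

    def rewrite(self, new_descriptions):
        """Replace all steps; assumed here to require a fresh review."""
        self.steps = [Step(d) for d in new_descriptions]
        self.state = PlanState.DRAFT

    def approve(self):
        self.state = PlanState.APPROVED

    def reject(self):
        self.state = PlanState.REJECTED


def execute(plan):
    """Refuse to produce any change unless the plan has been approved."""
    if plan.state is not PlanState.APPROVED:
        raise PermissionError("plan must be approved before any code changes")
    # Placeholder for real work: one diff label per approved step.
    return [f"diff for: {step.description}" for step in plan.steps]
```

The property that matters is the guard in `execute`: there is no path from a draft or rejected plan to a code change.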

Models, Context, Pricing

| Item | Value |
| --- | --- |
| Internal model | grok-build-0.1 (text + image, coding-specialized) |
| Context | 256K tokens (about a quarter of Claude Code's 1M) |
| API price (grok-build-0.1) | $1.00 / 1M input, $2.00 / 1M output |
| Upper-tier model | Composer 2.5 (SuperGrok / X Premium+, Colossus-trained, long-task focus) |
| Related internal | Grok V9-Medium (1.5T parameters, coding-focused) finished training in the same window |
| Access | SuperGrok Heavy ($300/mo; $99/mo introductory promo for 6 months), expanded to SuperGrok and X Premium+ |
| Benchmark | SWE-Bench Verified 70.8% (vendor-reported; Opus 4.7 at 87.6% and GPT-5.5 at 88.7% are about 17 pts ahead) |
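For teams using the API route rather than a subscription, the per-token rates above translate into a per-request cost as sketched below. The rates and context size come from the table; treating "256K" as 256,000 tokens and the example token counts are assumptions.

```python
INPUT_USD_PER_M = 1.00    # grok-build-0.1 input rate from the table
OUTPUT_USD_PER_M = 2.00   # grok-build-0.1 output rate from the table
CONTEXT_TOKENS = 256_000  # assumption: "256K" read as 256,000 tokens


def api_cost_usd(input_tokens, output_tokens):
    """Cost of one request in USD at the listed per-million-token rates."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def fits_context(prompt_tokens):
    """True if the prompt fits in the stated context window."""
    return prompt_tokens <= CONTEXT_TOKENS


if __name__ == "__main__":
    # Hypothetical request: 200k tokens of repository context, 20k tokens of output.
    print(f"${api_cost_usd(200_000, 20_000):.2f}")  # prints $0.24
    print(fits_context(200_000))                    # prints True
```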

Pricing comparison:

- Claude Code (Anthropic): $20/mo bundle
- Codex CLI (OpenAI): $20/mo bundle
- Grok Build: SuperGrok Heavy $300/mo ($99/mo promo), roughly 5–15× more expensive
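The 5–15× range and the roughly 17-point benchmark gap can be checked with a few lines of arithmetic. The first-year figure assumes the price reverts to $300/mo once the 6-month promo ends, which the source implies but does not state outright.

```python
BUNDLE_USD = 20        # Claude Code / Codex CLI bundle, per month
HEAVY_LIST_USD = 300   # SuperGrok Heavy list price, per month
HEAVY_PROMO_USD = 99   # introductory promo price, per month
PROMO_MONTHS = 6       # promo duration from the pricing table
GROK_BUILD_SWE = 70.8  # vendor-reported SWE-Bench Verified score


def price_multiple(price, baseline=BUNDLE_USD):
    """How many times more expensive `price` is than the baseline bundle."""
    return price / baseline


def first_year_cost(promo=HEAVY_PROMO_USD, list_price=HEAVY_LIST_USD,
                    promo_months=PROMO_MONTHS):
    """12-month cost, assuming list price applies after the promo ends."""
    return promo * promo_months + list_price * (12 - promo_months)


def benchmark_gap(other_score, ours=GROK_BUILD_SWE):
    """Gap in percentage points, rounded to one decimal place."""
    return round(other_score - ours, 1)


if __name__ == "__main__":
    print(price_multiple(HEAVY_LIST_USD))      # 15.0
    print(price_multiple(HEAVY_PROMO_USD))     # 4.95
    print(first_year_cost(), BUNDLE_USD * 12)  # 2394 240
    print(benchmark_gap(87.6), benchmark_gap(88.7))  # 16.8 17.9
```

Under that assumption, the first year costs about 10× the $20/mo bundles, which falls inside the 5–15× range.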

Ry Walker's independent review calls it "the steepest entry price in the category." Cost-effectiveness is currently against Grok Build.

Feature Set

Core:

- Plan Mode: pre-execution plan review, step-level comments and rewrites
- Parallel sub-agents: up to 8, isolated per Git worktree
- Headless mode `-P`: CI/CD integration with streaming JSON
- `/goal` long-running autonomous mode: hand off larger tasks to autonomous execution; status / pause / resume / clear
- Agent Dashboard: single-pane management of multiple sessions and dispatches
- `grok inspect`: configuration introspection
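For the headless mode, a CI job would typically consume the streamed output line by line. The sketch below assumes the stream is newline-delimited JSON with a `type` field on each event; neither the framing nor any field or event name is confirmed by the source, so treat all of them as hypothetical and check the real output before relying on this.

```python
import json
from collections import Counter


def summarize_events(lines):
    """Count events by their (hypothetical) `type` field.

    Assumes one JSON document per line; blank lines are skipped and
    malformed lines are counted instead of aborting the CI job.
    """
    counts = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            counts["<unparseable>"] += 1
            continue
        if isinstance(event, dict):
            counts[event.get("type", "<untyped>")] += 1
        else:
            counts["<untyped>"] += 1
    return counts


if __name__ == "__main__":
    # Invented sample events, for illustration only.
    sample = ['{"type": "plan"}', "", "not json", '{"type": "diff"}', '{"type": "diff"}']
    print(dict(summarize_events(sample)))
```

In a pipeline, the same function could be fed `sys.stdin` so that events are tallied as they arrive rather than after the run completes.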

Extension surfaces:

- AGENTS.md: auto-detection of project conventions
- plugins / hooks / skills: same shape as the Claude Code ecosystem
- MCP servers: standard support
- ACP (Agent Client Protocol): for building custom bots and orchestration
- Plugin Marketplace: MongoDB / Vercel / Sentry / Chrome DevTools / Cloudflare / Superpowers, plus publishable user plugins

Approval modes: `--always-approve` for full automation; default is `ask`.

Install

Run the official install script:

`curl -fsSL https://x.ai/cli/install.sh | bash`

Prerequisite: an active SuperGrok Heavy / SuperGrok / X Premium+ subscription and login.

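Local execution: the agent runs on the user's machine, which xAI positions as air-gap-compatible and data-sovereignty-friendly. Model inference still calls xAI's cloud (see Q4 in the FAQ), so this is the same shape as Claude Code / Codex CLI; xAI simply emphasizes it more in its public messaging.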

Strategic Context — xAI × SpaceX × Colossus × Cursor

Read as a standalone product, Grok Build looks like "a more expensive, less benchmarked Claude Code." In the context of xAI's vertical stack, the positioning shifts:

- SpaceX × xAI merger (Feb 2026): AI and space resources consolidated under Elon Musk
- SpaceX holds a $60B acquisition option on Cursor: Cursor could be folded into the xAI stack
- Composer 2.5 trains on Colossus: xAI's own hundreds-of-thousands-of-GPUs supercluster
- Grok Build = the coding frontend of this vertical stack

Short-term it lags Claude Code / Codex; medium-term, the Colossus compute and the Cursor option could change the picture.

Competitive Positioning (CLI Coding Agents, June 2026)

| Aspect | Grok Build | Claude Code | Codex CLI | agmsg |
| --- | --- | --- | --- | --- |
| Parallelism | 8 (Git worktree) | Unlimited (shared) | Multiple (subprocess) | Peer messaging only |
| Isolation | Git worktree | Shared | Subprocess | N/A |
| Benchmark (SWE-Bench Verified) | 70.8% (vendor) | 87.6% | 88.7% | N/A (messaging layer) |
| Pricing | $99–300/mo | $20/mo | $20/mo | Free (MIT) |
| Maturity | Early beta | GA | GA | GA |
| IDE integration | None (terminal only) | Yes | Yes | N/A |
| MCP / ACP | Both | Mostly MCP | MCP | N/A |
| Local execution | Yes | Yes | Yes | Yes |

Grok Build's strengths:

1. Git-worktree parallelism: ideal for parallel alternative solutions
2. Out-of-the-box xAI Plugin Marketplace: MongoDB / Vercel / Sentry / Cloudflare integrations from day one
3. `/goal` long-running autonomous mode
4. Forward-looking integration with the xAI stack: Composer 2.5 / Cursor / Colossus

Weaknesses:

1. Price: 5–15× Claude Code / Codex CLI
2. Benchmarks: 17 pts behind on SWE-Bench Verified
3. No IDE integration: terminal only
4. Unshipped features: Arena Mode, local-first privacy, etc. were promoted but haven't shipped

Use Cases

- Trying multiple solutions in parallel (A/B/C variants in separate Git worktrees, then compare and merge the winner)
- Existing xAI ecosystem subscribers (SuperGrok Heavy / X Premium+)
- Teams that need first-class MongoDB / Vercel / Sentry / Cloudflare CLI integrations
- Handing off large refactors to `/goal` overnight
- Air-gap / data-sovereignty engagements (local execution)

Avoid for:

- Cost-sensitive teams: Claude Code / Codex CLI at $20/mo are very hard to beat on dollars per task
- Benchmark-driven adoption: currently 17 pts behind Opus 4.7 / GPT-5.5 on SWE-Bench Verified
- Teams that need IDE integration: Cursor / Windsurf / Zed series are likely better fits

Oflight's View — How to Use It

What we recommend in our AI consulting and software development engagements:

Pattern 1: One leg of a multi-vendor CLI strategy

Use Claude Code (implementation) + Codex CLI (review) + Grok Build (parallel experiments), wired together with agmsg. Grok Build's Git-worktree parallelism has unique value for branching exploration — three competing bug fixes side-by-side, or a TypeScript and a Rust implementation of the same feature.
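Once several variants have run side by side, something still has to decide which branch to merge. The following sketch shows one possible selection rule: keep only variants whose tests all pass, then prefer the smallest diff. The record fields and the ranking criteria are illustrative assumptions, not a Grok Build feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantResult:
    branch: str         # branch produced by one sub-agent
    tests_passed: int
    tests_total: int
    lines_changed: int  # size of the diff against the base branch


def pick_winner(results):
    """Return the fully passing variant with the smallest diff, or None.

    Ties on diff size are broken by branch name so the choice is deterministic.
    """
    green = [r for r in results
             if r.tests_total > 0 and r.tests_passed == r.tests_total]
    if not green:
        return None
    return min(green, key=lambda r: (r.lines_changed, r.branch))


if __name__ == "__main__":
    candidates = [
        VariantResult("agent/fix-login/a", 42, 42, 120),
        VariantResult("agent/fix-login/b", 41, 42, 15),
        VariantResult("agent/fix-login/c", 42, 42, 60),
    ]
    print(pick_winner(candidates).branch)  # agent/fix-login/c
```

Variant b has the smallest diff but fails a test, so it is excluded before size is considered; a human reviewer would still read the winning diff before merging.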

Pattern 2: Leverage existing SuperGrok Heavy subscriptions

Customers already on SuperGrok Heavy / X Premium+ can use Grok Build at no extra cost. Keep Claude Code as the primary, use Grok Build for specific tasks (parallel experiments, xAI Plugin Marketplace integrations).

Pattern 3: Forward-looking xAI stack readiness

The Cursor acquisition option / Composer 2.5 / Colossus trajectory is uncertain, but getting hands on the xAI stack now is strategically worth it — as a diversification candidate against vendor-concentration risks like the Claude Fable 5 export-control suspension.

Caveats

- Expensive: $300/mo (or $99/mo promo) is heavy for SMBs
- Lags on benchmarks: vendor-reported 70.8%, 17 pts behind Opus 4.7 / GPT-5.5
- Early beta: frequent breaking changes; the v0.1 → v0.2.x cadence is fast, so be careful in production
- No IDE integration: terminal only
- Some demoed features aren't live: Arena Mode in particular
- Auto-renewal surprises: community reports of unexpected SuperGrok Heavy renewals; worth checking before subscribing
- No independent benchmark verification beyond vendor numbers

Talk to Us About AI Agent Environments — Three Inquiry Funnels

We help design, build, and operate multi-agent environments that include Grok Build.

(1) Evaluation & Requirements (from ¥198,000)

"Grok Build or Claude Code?" "Multi-vendor CLI strategy?" "Is SuperGrok Heavy worth the price for us?" — 1–2 weeks, written report deliverable.

(2) Custom Development & SI (from ¥498,000)

Build a multi-agent automation system combining Grok Build + Claude Code + Codex + agmsg, plus CI/CD integration and Plugin Marketplace usage.

(3) Ongoing Maintenance (¥9,800–¥80,000/month)

Ongoing model-update tracking, new-release evaluation, KPI monitoring for Grok Build / Claude Code / Codex. OpenClaw + Grok Build integration in scope.

- [OpenClaw maintenance](../services/openclaw-setup): Light ¥9,800 / Standard ¥19,800 / Premium ¥49,800 per month
- AI-consulting continuous support: Light ¥30,000/mo / Standard ¥80,000/mo / Premium on request

FAQ

Q1. Difference vs Claude Code?
A. Architecture is Git-worktree parallelism; price is ~15× higher; benchmark is 17 pts behind; no IDE integration. Differentiators are parallel experiments, xAI ecosystem integration, and `/goal` autonomous mode. Not a full replacement for Claude Code; more of a supplement for specific tasks.

Q2. Is $300/mo SuperGrok Heavy worth it?
A. For existing X Premium+ / SuperGrok subscribers, the marginal cost is small. New from-zero adopters paying just for Grok Build face a steep bill. The $99/mo 6-month promo is the realistic way to evaluate ROI.

Q3. grok-build-0.1 vs Composer 2.5?
A. grok-build-0.1 = the general-purpose API model at $1/$2 per M tokens. Composer 2.5 = the long-task / instruction-following model for SuperGrok / X Premium+. Use grok-build-0.1 for everyday coding and Composer 2.5 for big refactors and long-running work.

Q4. Is the local-execution / air-gap story real?
A. Code runs locally, but inference (grok-build-0.1 / Composer 2.5) still calls xAI's cloud. Same as Claude Code etc. For true offline, evaluate local LLMs instead.

Q5. Is the Cursor acquisition confirmed?
A. It's an option, not an executed deal. SpaceX has the right to acquire Cursor for $60B. If exercised, Cursor's IDE integration could land inside the xAI stack and Grok Build's boundary may shift. Treat it as medium-term uncertainty.

Q6. Can [agmsg](../columns/agmsg-cross-agent-messaging-cli-ai-2026-06) bridge Grok Build to Claude Code?
A. Likely. agmsg is designed as a cross-vendor CLI agent layer (covering OpenCode etc.), and Grok Build is the same Bash + terminal shape. But the agmsg-official-support list does not yet include Grok Build. Integration likely viable; PoC required.

Q7. Does the 17-pt benchmark gap actually matter in production?
A. Task-dependent. On real-world large-scale SI / refactors, the gap shows up; on routine business coding, documentation, and test generation, it often doesn't. PoC on your own code is mandatory.

Q8. Japanese enterprise procurement?
A. xAI is a US company, so it's a foreign SaaS for Japanese buyers; apply the same cross-border / GDPR / APPI diligence as Claude / GPT. The realistic split is sensitive workloads on [self-hosted local LLMs](../columns/local-llm-landscape-2026-june-update), general coding on Grok Build / Claude Code in hybrid.

Bottom Line

Grok Build is xAI's CLI coding agent, released May 14, 2026. Same category as Claude Code / Codex CLI; the unique axes are 8-way Git-worktree parallel sub-agents and the xAI Plugin Marketplace / Composer 2.5 / `/goal` autonomous mode.

Short-term verdict: expensive ($300/mo or $99/mo promo), 17 pts behind on SWE-Bench Verified vs Opus 4.7 / GPT-5.5, terminal-only, with demoed features (Arena Mode) still unshipped — cost-effectiveness currently against xAI.

Medium-term strategic value: the SpaceX × xAI merger + Cursor acquisition option + Colossus-trained Composer 2.5 vertical stack is real, and Grok Build is its coding frontend. Touching the xAI stack now has option value.

Oflight's recommendation: keep Claude Code as the workhorse; use Grok Build for parallel-experiment tasks where Git-worktree isolation matters; SuperGrok Heavy holders should make use of what they're already paying for; tie multiple CLI agents together with agmsg — Grok Build is one leg of a multi-vendor strategy, not a replacement.

References

Primary:
- xAI: Grok Build (x.ai/cli)
- xAI: Grok Build Changelog
- xAI
- @xai on X

Japanese coverage:
- npaka on note: Grok Build write-up

Third-party:
- Big Hat Group: xAI Weekly 2026-06-17
- Big Hat Group: xAI Weekly 2026-06-03
- Digital Applied: Grok Build CLI Parallel Coding Agents
- TechJack: xAI Composer 2.5 in Grok Build CLI
- Ry Walker Research: Grok Build (independent review)
- Releasebot: xAI Release Notes June 2026
- Medium (Noor Mohamad): Best 12 Grok Setups 2026

Related Oflight columns:
- Claude Code Agent View: parallel orchestration
- agmsg: cross-vendor CLI agent messaging
- cmux (Manaflow)
- Ornith-1.0: DeepReinforce agentic-coding LLM
- Kimi K2.7-Code
- Loop Engineering
- Sakana Fugu: cloud-side orchestration
- Claude Fable 5 export-control suspension
- Local LLM June 2026 Update
- Cursor Automations

Oflight services:
- AI Consulting
- OpenClaw setup
- Software Development

Inquiries:
- AI consulting / PoC
- Custom development / SI
- OpenClaw + Grok Build integration

Note: the official x.ai/cli page returned 403 to automated fetching, so this column is grounded in npaka's Japanese write-up and multiple third-party sources. Specifications and pricing reflect June 27, 2026; Grok Build is early beta and details change frequently, so re-check the official changelog before any production decision.
