Oflight Inc.
AI | 2026-05-08

OpenAI Symphony 2026 — The Open-Source Spec That Turns Linear Tickets Into Codex Agent Workspaces and Pull Requests

OpenAI's Symphony, released in 2026 as an open-source spec, defines a "ticket-driven AI development" pattern: each Linear issue gets its own Codex agent in its own workspace and runs until the issue reaches a terminal state. This article walks through the spec's core ideas, the Elixir reference implementation, the polling / supervisor / workspace-isolation architecture, and the realistic operational considerations for adopting it — based on publicly available information.


What Symphony is — "one ticket = one agent" as an orchestration spec

OpenAI Symphony is an open-source specification for keeping coding agents (Codex) running per task, autonomously, until the task is done. The core idea is simple: for every issue in Linear, allocate its own workspace and its own Codex agent, then keep that agent in its loop until the issue reaches a terminal state. OpenAI explicitly states Symphony will not be maintained as a standalone product — it ships as a reference implementation. The deliverables are SPEC.md (the protocol in Markdown) and an Elixir reference implementation, both at `openai/symphony` on GitHub.
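The "one ticket = one agent until terminal" contract can be sketched in a few lines. This is a Python illustration only (the reference implementation is Elixir), and the state names here are hypothetical; SPEC.md defines the actual state machine.

```python
from dataclasses import dataclass
from enum import Enum

class IssueState(Enum):
    # Hypothetical state names for illustration; SPEC.md defines the real ones.
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    IN_REVIEW = "in_review"
    DONE = "done"          # terminal
    CANCELED = "canceled"  # terminal

TERMINAL_STATES = {IssueState.DONE, IssueState.CANCELED}

@dataclass
class AgentAssignment:
    """One Linear issue bound to one workspace and one Codex agent."""
    issue_id: str
    workspace_dir: str
    state: IssueState

    def is_finished(self) -> bool:
        # The agent loop runs until the bound issue reaches a terminal state.
        return self.state in TERMINAL_STATES
```

The point of the sketch is the binding: the workspace and the agent live exactly as long as the issue is non-terminal.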

Why "ticket-driven" matters

AI coding tools have been moving from "chat with me" toward "point at this issue and propose." Symphony pushes one step further: after a human files the ticket, the agent takes it from there.

- Before: a developer asks Codex / Cursor / Copilot in the IDE and reviews each turn
- With Symphony: a PM or designer files the ticket, an agent runs against the spec, and a PR comes back

OpenAI reports that internal teams saw merged-PR throughput jump roughly 6× in the first three weeks after Symphony rolled out, a signal that the operating unit of dev work is shifting from "one person typing" to "tickets running in parallel."

Architecture — Poll → Dispatch → Execute

From the public material, the Elixir reference runs a three-step loop:

1. Poll Linear at a default cadence of 30 seconds for active issues.
2. Dispatch an isolated workspace plus a Codex process per issue, feeding in the workflow prompt and ticket context.
3. Execute until the issue hits a terminal state.

Crashed agents are restarted by an Erlang/OTP supervisor, and each agent owns its own workspace on the filesystem so changes between tickets don't collide. Running on an Erlang/OTP supervision tree imports the "telco-grade fault-tolerance" lineage of long-running parallelism, automatic restart on failure, and isolated partial faults onto AI agent operations. (See our Elixir guide for the language-level background.)
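One iteration of that poll → dispatch → supervise loop can be sketched as follows. This is a minimal Python illustration of the contract, not the Elixir/OTP reference; `spawn` and `is_alive` stand in for workspace allocation and process monitoring, and all names are ours.

```python
POLL_INTERVAL_S = 30  # default polling cadence described in the article

def run_once(active_issues, agents, spawn, is_alive):
    """One iteration of the poll -> dispatch -> supervise loop.

    active_issues: ids of non-terminal Linear issues from the latest poll
    agents:        dict of issue_id -> agent handle (mutated in place)
    spawn:         callback that starts one agent in a fresh workspace
    is_alive:      callback reporting whether an agent is still running
    """
    for issue_id in active_issues:
        if issue_id not in agents:
            agents[issue_id] = spawn(issue_id)   # dispatch new work
        elif not is_alive(agents[issue_id]):
            agents[issue_id] = spawn(issue_id)   # supervisor-style restart
    # retire agents whose issues reached a terminal state
    for issue_id in list(agents):
        if issue_id not in active_issues:
            del agents[issue_id]
```

A driver would call `run_once` every `POLL_INTERVAL_S` seconds. In the actual reference implementation the restart branch is handled by an OTP supervisor rather than hand-rolled checks.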

Why an Elixir reference implementation

Per the official material, OpenAI picked Elixir because the language ships excellent primitives for orchestrating and supervising concurrent processes. Put plainly: running hundreds-to-thousands of agents at once with isolation and per-agent restart on failure is what Erlang/OTP was literally built for. A second interesting choice: when polishing the spec, the team asked Codex itself to reimplement Symphony in TypeScript / Go / Rust / Java / Python, and used the resulting variations to surface ambiguities in the spec. AI generating multiple implementations as a way to debug a specification is a noteworthy new pattern in spec-design hygiene.

A standout side effect — agents file follow-up tickets

Two design choices stand out from the public coverage:

- Agents can file new Linear tickets: when an agent notices something adjacent ("this part needs refactoring," "this endpoint can be faster"), it can drop a fresh ticket into Linear instead of dropping the observation. Side-observations become queued work.
- PMs and designers can file feature tickets directly: without checking out the repo, a feature request goes in and a review packet, including a video walkthrough, comes back.

This is one of the cleanest concrete examples of the broader trend: AI moving from "IDE helper" to "primary actor in the ops loop."
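Mechanically, "agent files a follow-up ticket" is just an issue-creation call against the tracker. The sketch below builds the request body for Linear's public GraphQL API (`issueCreate` mutation); the function name is ours, and how Symphony's agents actually perform this step is not documented in detail, so treat it as an assumption-laden illustration.

```python
def build_followup_payload(team_id: str, title: str, description: str) -> dict:
    """Build the GraphQL request body an agent could POST to
    https://api.linear.app/graphql to file a follow-up issue.
    (Mutation and field names follow Linear's public API docs.)"""
    mutation = """
    mutation($input: IssueCreateInput!) {
      issueCreate(input: $input) { success issue { id identifier } }
    }"""
    return {
        "query": mutation,
        "variables": {"input": {
            "teamId": team_id,
            "title": title,
            "description": description,
        }},
    }
```

A real client would POST this as JSON with the workspace API key in the `Authorization` header. The interesting part is not the call itself but the policy question around it: which agent observations deserve to become queued work.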

When it fits, when it doesn't

Strong fit

- Dev orgs with many small or medium parallel tasks, where the Linear backlog flows straight to AI
- Orgs where PMs and designers outnumber developers, and devs are the throughput bottleneck
- Mature CI / test / build pipelines (the verification side has to keep up with PR volume)
- Clear coding conventions and architecture docs (great priors to feed agents)

Poor fit

- Workloads where transmitting code to the Codex API is not allowed (pair with DGX Spark and a local LLM instead)
- Exploratory or vague tickets where requirement extraction is the actual hard part
- Domains under contracts or regulations that require explicit human review for every merge

Operational watch-outs

- It's a reference, not a product: OpenAI doesn't plan to maintain Symphony standalone. Production adoption means forking and operating it yourself, and reading SPEC.md carefully.
- Can your org absorb a 6× PR rate?: that headline number assumed reviewers, QA, and deploy paths could keep up. Reviewer bottlenecks turn into a backlog of unreviewed AI PRs, a new flavor of tech debt.
- License and IP hygiene: with AI generating code at volume, license-of-output handling and third-party library checks belong in CI.
- Ticket size discipline: too small wastes API spend on overhead; too big never finishes. Set internal ticket-size standards.
- Codex API cost: parallel agents bill in parallel. Build budget caps and priority queues from day one.
- Beyond Linear: the spec abstracts "a board with a state machine," so porting to JIRA / GitHub Issues / Asana is feasible. The reference implementation, however, is Linear-specific.
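The "budget caps plus priority queues" point can be made concrete with a small sketch. Nothing here is part of the Symphony spec; it is one possible gatekeeper you could put in front of agent dispatch, with all names and the drop-on-over-budget policy invented for illustration.

```python
import heapq

class BudgetedQueue:
    """Dispatch tickets by priority while enforcing a monthly spend cap.
    Illustrative only; not part of the Symphony spec."""

    def __init__(self, monthly_budget_usd: float):
        self.budget = monthly_budget_usd
        self.spent = 0.0
        self._heap = []  # (priority, issue_id, estimated_cost); lower = sooner

    def add(self, priority: int, issue_id: str, est_cost_usd: float) -> None:
        heapq.heappush(self._heap, (priority, issue_id, est_cost_usd))

    def next_dispatchable(self):
        """Return the highest-priority issue that still fits in the budget."""
        while self._heap:
            priority, issue_id, cost = heapq.heappop(self._heap)
            if self.spent + cost <= self.budget:
                self.spent += cost
                return issue_id
            # Over budget: we simply drop it here; a real system would
            # re-queue it for the next budget window or escalate.
        return None
```

The design choice worth copying is the estimate-before-dispatch step: agents bill in parallel, so the cap has to be enforced before an agent starts, not after the invoice arrives.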

How Oflight uses it

We treat Symphony's SPEC as standard vocabulary for ticket-driven AI development and design a middleware layer that fits each client's tracker (Linear / JIRA / GitHub Projects) and confidentiality posture.

- Cloud-OK projects: fork the Symphony reference and run Codex underneath.
- Confidential projects: keep the orchestration ideas, swap the agent backend to a local model on DGX Spark or our OpenClaw agent platform.
- PM / designer-driven ticketing: pair this with DocDD templates so that the priors going into agents are high-quality.

We can phase Symphony-style operations in via Software Development or AI Consulting.

FAQ

Q1: Will OpenAI keep updating Symphony?
A: Officially, they will not maintain Symphony as a standalone product. The SPEC may evolve, but production users should expect to run their own fork.

Q2: Can it be adapted to a non-Linear tracker?
A: At the spec level, yes. The reference implementation is Linear-first; the porting work falls on adopters.

Q3: What if we can't keep up with the PR volume?
A: Tune ticket size, set priority queues, layer agent-on-agent code review, harden automated tests, and add pre-human-review filters. Strengthen CI and review before turning Symphony on.

Q4: Can a non-Elixir org adopt it?
A: Just running it doesn't require writing Elixir. Forking and customizing does, so staff at least one person who can read OTP-style code, or partner externally.

Q5: How does it differ from Codex / Cursor / Claude Code?
A: Existing tools assist a developer; Symphony orchestrates without one. They coexist: people plus AI on hard tasks, Symphony on routine ones.

Q6: Cost expectations?
A: Roughly tickets × average tokens × rate. Parallelism makes peak burn high, so build monthly caps, priority queues, and overnight batching from day one.
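The "tickets × average tokens × rate" estimate from Q6 is simple enough to write down. The numbers in the usage comment are purely hypothetical, not published Codex pricing.

```python
def monthly_cost_usd(tickets_per_month: int,
                     avg_tokens_per_ticket: int,
                     usd_per_million_tokens: float) -> float:
    """Back-of-envelope Symphony spend: tickets x avg tokens x rate."""
    return tickets_per_month * avg_tokens_per_ticket * usd_per_million_tokens / 1_000_000

# e.g. 400 tickets/month at ~2M tokens each, at a hypothetical $10 / 1M tokens:
# monthly_cost_usd(400, 2_000_000, 10.0) -> 8000.0
```

The useful exercise is running this per priority tier, since the cap-and-queue machinery only matters once the top tier alone approaches your budget.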

