株式会社オブライト
AI Model
2026-05-17

GPT-Realtime-2

Also known as: GPT-Realtime-2 / gpt-realtime-1.5 / GPT Realtime API

OpenAI's real-time voice AI model delivering ultra-low-latency full-duplex speech interaction — designed for voice agents and live translation use cases.


Overview

GPT-Realtime-2 succeeds gpt-realtime-1.5 and packages a reasoning voice agent, live translation, and streaming Whisper into a unified stack. See the OpenAI GPT-Realtime-2 explainer listed under Related Columns for details.

Use cases

GPT-Realtime-2 is used in contact-centre automation, real-time meeting interpretation, and voice-first UIs, all settings where text-only LLMs struggle to meet latency requirements.

Related Columns

AI
OpenAI GPT-Realtime-2 and the Three New Voice Models — A Practitioner's 2026 Look at Reasoning Voice Agents, Live Translation, and Streaming Whisper
On May 7, 2026, OpenAI released a trio of new voice models: GPT-Realtime-2 (the first voice model with GPT-5-class reasoning), GPT-Realtime-Translate (live translation across 70+ input / 13 output languages), and GPT-Realtime-Whisper (streaming speech-to-text). This article summarizes capabilities, benchmark deltas vs 1.5, pricing, when to pick which, and the upgrade decision from 1.5 — based on official information.
AI
OpenAI gpt-realtime-1.5 and the Official realtime-voice-component — A Practitioner's Look at the New Voice-Agent Stack [2026]
OpenAI released the gpt-realtime-1.5 audio model on February 26, 2026, and openai/realtime-voice-component on GitHub provides an official React reference for voice UIs. This article summarizes the documented gains (+5% audio reasoning, +10.23% transcription, +7% instruction following), pricing, the component's positioning as a reference implementation, and practical considerations for business adoption.
AI
NVIDIA PersonaPlex 7B Complete Guide — Real-Time Full-Duplex Voice AI Architecture & Use Cases [2026]
NVIDIA PersonaPlex 7B, released in January 2026, is an open-source voice AI that integrates the traditional ASR→LLM→TTS pipeline into a single end-to-end model, achieving true full-duplex voice interaction. This guide covers architecture, performance benchmarks, setup procedures, and practical use cases.
