OpenAI said Thursday that its API will now include a number of new voice intelligence features designed to help developers build apps that can speak, transcribe and translate conversations with users.
The company's new GPT-Realtime-2 is a voice model designed to produce realistic speech and hold conversations with users. Unlike its predecessor, GPT-Realtime-1.5, it is built with GPT-5-class reasoning, which, according to OpenAI, was added to help it handle more complex requests from users.
The company is also launching GPT-Realtime-Translate, which, as the name suggests, is designed to provide real-time translation that “walks” with the user, conversationally. The feature supports more than 70 input languages (i.e., the languages it can understand) and 13 output languages (the languages it relays to the speaker).
Finally, the company also released a new transcription feature, GPT-Realtime-Whisper, which provides live speech-to-text transcription as interactions happen.
“Together, the models we’re launching move real-time audio from simple call-and-response to voice interfaces that can actually work: listen, reason, translate, transcribe and take action as a conversation unfolds,” the company said.
Who will these updates be good for? Companies looking to expand their customer service capabilities are an obvious target. However, OpenAI also notes that its new features will help a wide range of sectors, including education, media, events, and creator platforms, among others.
As useful as these tools seem from a business perspective, it also seems plausible that they could be abused. The company said it has built in safeguards to prevent its new features from being misused to create spam, fraud or other forms of online abuse. Certain trigger rules have been built into the system so that “conversations can be terminated if they are found to be in violation of our harmful content guidelines,” OpenAI said.
All of the new voice models are included in OpenAI’s Realtime API. Translate and Whisper are billed per minute, while GPT-Realtime-2 is billed by token consumption.
