OpenAI introduces new voice intelligence features in its API

The company announced new models that enable talking, transcribing, and translating conversations in real time.

Redacción Tricuatro · 7 May 2026

OpenAI on Thursday unveiled a set of voice intelligence features in its API, aimed at helping developers build apps that can hold, transcribe, and translate conversations in real time. The key addition is GPT‑Realtime‑2, a voice model designed to produce realistic speech and to handle complex user requests with reasoning similar to that of GPT‑5.

The company also launched GPT‑Realtime‑Translate, a tool that translates conversations in real time from more than 70 input languages into 13 output languages. A third model, GPT-Realtime-Whisper, provides live speech-to-text, transcribing interactions as they happen with high accuracy.

OpenAI explained that these models move beyond simple call-and-response interactions toward voice interfaces that can listen, reason, translate, transcribe, and take action during a conversation. The company highlighted potential uses in customer service, education, media, events, and creator platforms.

The company acknowledged the risk of misuse and said it has built safeguards against spam, fraud, and harmful content, including triggers that halt conversations that violate its content guidelines.

The additions expand the capabilities of OpenAI’s Realtime API for developers building AI-powered voice applications across these industries.
