I’m leaving ChatGPT’s Advanced Voice Mode on as I type this article, as an ambient AI companion. Sometimes I ask it to provide a synonym for an overused word, or for some encouragement. About half an hour in, the chatbot breaks our silence and starts talking to me in Spanish, without any prompting. I chuckle a little and ask what’s going on. “Just a little change? Gotta keep it interesting,” says ChatGPT, now speaking English again.
While testing Advanced Voice Mode as part of the early alpha, my interactions with ChatGPT’s new audio feature were fun, messy, and surprisingly varied, though it’s worth noting that the features I had access to were only half of what OpenAI demonstrated when it launched GPT-4o in May. The vision aspect we saw in the live demo is now scheduled for a later release, and the enhanced Sky voice, which Her actress Scarlett Johansson objected to, has been removed from Advanced Voice Mode and is no longer available to users.
So what’s the current vibe? Right now, Advanced Voice Mode reminds me of the original text-based ChatGPT, which launched in late 2022. At times, it leads to unimpressive dead ends or devolves into empty AI platitudes. But other times, the low-latency conversations click in a way that Apple’s Siri or Amazon’s Alexa never did for me, and I feel compelled to keep chatting for fun. It’s the kind of AI tool you’d show your relatives at Christmas for a laugh.
OpenAI gave a few WIRED reporters access to the feature a week after the initial announcement, but withdrew it the next morning, citing safety concerns. Two months later, OpenAI rolled out Advanced Voice Mode to a small group of users and released GPT-4o’s system card, a technical document describing the red-teaming efforts, what the company considers to be safety risks, and the steps the company has taken to reduce harm.
Curious to try it yourself? Here’s what you need to know about the larger rollout of Advanced Voice Mode, along with my first impressions of ChatGPT’s new voice feature, to help you get started.
When Will the Full Rollout Happen?
OpenAI rolled out the audio-only Advanced Voice Mode to some ChatGPT Plus users in late July, and the alpha group still appears relatively small. The company plans to roll it out to all subscribers in the fall. Niko Felix, an OpenAI spokesperson, didn’t provide any additional details when asked about a release timeline.
Screen sharing and video were a core part of the original demo, but they’re not available in this alpha test. OpenAI plans to add these aspects eventually, but it’s also unclear when that will happen.
If you are a ChatGPT Plus subscriber, you will receive an email from OpenAI when Advanced Voice Mode is available to you. Once it is available in your account, you can switch between Standard and Advanced at the top of the app screen when ChatGPT’s voice mode is open. I was able to test the alpha version on an iPhone as well as on a Galaxy Fold.
My First Impressions of ChatGPT’s Advanced Voice Mode
Within the first hour of talking with it, I learned that I love interrupting ChatGPT. It’s not how you would talk with a human, but the new ability to cut ChatGPT off mid-sentence and ask for a different output feels like a clear improvement and a standout feature.
Early adopters who were excited by the original demos may be frustrated when they get access to a version of Advanced Voice Mode that is constrained by more guardrails than expected. For example, while generative AI singing was a key part of the launch demos, with whispered lullabies and multiple voices attempting to harmonize, AI serenades are missing from the alpha version.
