Sunday, December 22, 2024

ChatGPT now supports real-time video, seven months after OpenAI’s first demonstration


OpenAI has finally released the real-time video capabilities for ChatGPT that it first demonstrated nearly seven months ago.

During a livestream on Thursday, the company announced that Advanced Voice Mode, ChatGPT's human-like conversational feature, is gaining vision. Using the ChatGPT app, users subscribed to ChatGPT Plus, Team, or Pro can point their phones at objects and get a response from ChatGPT in near real time.

Advanced Voice Mode with vision can also understand what's on your device's screen via screen sharing. It can explain various settings menus, for example, or offer suggestions on a math problem.

To access Advanced Voice Mode with vision, tap the voice icon next to the ChatGPT chat bar, then tap the video icon in the lower-left corner to start video. To share your screen, tap the three-dot menu and select "Share Screen."

According to OpenAI, the rollout of Advanced Voice Mode with vision begins Thursday and will wrap up over the next week. Not all users will get access right away, however. OpenAI says ChatGPT Enterprise and Edu subscribers won't receive the feature until January, and that there is no timeline yet for users in the EU, Switzerland, Iceland, Norway, and Liechtenstein.

In a recent demo on CBS's "60 Minutes," OpenAI President Greg Brockman had Advanced Voice Mode with vision quiz Anderson Cooper on his anatomy skills. As Cooper drew body parts on a blackboard, ChatGPT could "understand" what he was drawing.

OpenAI employees demonstrate ChatGPT's Advanced Voice Mode with vision during a livestream. Image Credits: OpenAI

“The location is spot on,” ChatGPT said. “The brain is in the head. In terms of form, it’s a good start. The brain is more oval.”

However, in the same demonstration, Advanced Voice Mode with vision made a mistake on a geometry problem, suggesting that it's prone to hallucinating.

Advanced Voice Mode with vision has been delayed multiple times, reportedly in part because OpenAI announced the feature long before it was ready for production. In April, OpenAI promised that Advanced Voice Mode would be available to users "within weeks." A few months later, the company said it needed more time.

When Advanced Voice Mode finally launched for some ChatGPT users in early fall, it lacked the visual component. In the lead-up to Thursday's launch, OpenAI focused on bringing the voice-only Advanced Voice Mode experience to additional platforms and to users in the EU.

Rivals like Google and Meta are working on similar capabilities for their chatbot products. This week, Google made its real-time, video-analyzing conversational AI feature, Project Astra, available to a group of "trusted testers" on Android.

In addition to Advanced Voice Mode with vision, OpenAI on Thursday launched a holiday-themed "Santa Mode," which adds Santa's voice as a preset voice in ChatGPT. Users can find it by tapping or clicking the snowflake icon next to the prompt bar in the ChatGPT app.
