OpenAI unveils voice-activated ChatGPT AI model that can have ‘realistic conversations’

ChatGPT maker OpenAI said Monday it would release a new AI model called GPT-4o, capable of realistic voice conversation and able to interact across text and vision, its latest move to stay ahead in a race to dominate the emerging technology.

The new audio capabilities let users speak to ChatGPT and get real-time responses, as well as interrupt ChatGPT while it is speaking. Both are hallmarks of realistic conversation that AI voice chatbots have lacked until now, OpenAI researchers showed at a livestream event.

“It feels like AI from the movies … Talking to a computer has never felt really natural for me; now it does,” OpenAI CEO Sam Altman wrote in a blog post.

Joaquin Phoenix in the 2013 film “Her.” Courtesy of Warner Bros. Picture

Microsoft-backed OpenAI faces growing competition and pressure to expand the user base of ChatGPT, its popular chatbot product that wowed the world with its ability to produce human-like written content and top-notch software code.

At the livestream event, OpenAI researchers showed off ChatGPT’s new voice assistant capabilities. In one demo, ChatGPT used its vision and voice capabilities to talk a researcher through solving a math equation on a sheet of paper.

In another demo, researchers showed the GPT-4o model’s capability of real-time language translation.

OpenAI’s demonstrations verged on science fiction, with ChatGPT and its interlocutor at one point engaging in coquettish banter. The OpenAI researcher told the chatbot he was in a great mood because he was demonstrating “how useful and amazing you are.”

In one demonstration, the ChatGPT voice assistant was able to read out a bedtime story in different voices, emotions and tones. REUTERS

ChatGPT responded: “Oh stop it! You’re making me blush!”

After the demo, Altman posted the single word “her” on X, in an apparent reference to the 2013 film of that name by Spike Jonze about a man who falls in love with his AI assistant, voiced by Scarlett Johansson.

OpenAI’s chief technology officer, Mira Murati, said at the event that the new model would be offered for free because it is more cost-effective than the company’s previous models. Paid users of GPT-4o will have greater capacity limits than the company’s free users, she said. The GPT-4o model will be available in ChatGPT over the next few weeks, the company said.

Shortly after launching in late 2022, ChatGPT was called the fastest application to ever reach 100 million monthly active users. However, worldwide traffic to ChatGPT’s website has been on a roller-coaster ride in the past year and is only now returning to its May 2023 peak, according to analytics firm Similarweb.

OpenAI is under pressure to expand the user base of ChatGPT. CEO Sam Altman, above. AFP via Getty Images

OpenAI made the announcements a day before Alphabet is scheduled to hold its annual Google developers conference, where it is expected to show off its own new AI-related features. Reuters reported last week that OpenAI planned to announce an AI-powered search product, citing sources. But the company decided to delay the search product announcement, according to one source familiar with the matter.

Shares of Alphabet were down 0.4% on Monday, after falling nearly 3% earlier in the day. Microsoft shares were down 0.3%.