New ChatGPT AI Watches Through Your Camera, Offers Advice on What It Sees

Today, OpenAI showed off its latest large language model (LLM), GPT-4o (that's a lowercase "o," for "omni"), which the company promises can "reason across audio, vision, and text in real time."

During its brief announcement, the company demonstrated the AI's uncanny ability to assess what it "sees" through the user's smartphone camera, allowing it to help solve math problems and even assist with coding.
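To get a rough sense of how developers might tap the same capability, here's a minimal sketch of sending a single camera frame to the model through OpenAI's existing Python SDK and chat completions API. The "gpt-4o" model name matches what OpenAI announced; the file name and prompt are stand-ins of our own.

```python
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Stand-in for a frame grabbed from the phone's camera
with open("camera_frame.jpg", "rb") as f:
    frame_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "This is a handwritten math problem. Walk me "
                     "through it step by step without giving the answer."},
            # Images are passed inline as base64-encoded data URLs
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"}},
        ],
    }],
)

print(response.choices[0].message.content)
```

A live assistant would presumably stream frames continuously rather than upload one at a time, but the single-frame version captures the basic idea.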

OpenAI is making the new model "available to all ChatGPT users, including on the free plan," per OpenAI CEO Sam Altman, who noted that "so far, GPT-4 class models have only been available to people who pay a monthly subscription."

It's arguably a natural evolution of the popular AI chatbot; by harnessing a live video stream, the assistant gains far more context about what the user is doing, which should make it considerably more helpful.

It's also unsurprising, considering we've seen very similar demos from AI hardware companies Humane and Rabbit, which both attempted to bring an AI chatbot-based gadget with a built-in camera to market this year, albeit with catastrophic results.

OpenAI, however, is positioned at the forefront of the tech and is leveraging the computing power of the modern smartphone instead. From what we've seen, that approach makes for a far more seamless experience, with barely any delay between a user's question and GPT-4o's answer.

https://twitter.com/sama/status/1790068956557881504

OpenAI claims GPT-4o can respond to audio inputs in as little as 232 milliseconds, which is "similar to human response time in a conversation." That speed is possible in part because the model no longer has to transcribe speech into text first; instead, "all inputs and outputs" are "processed by the same neural network."
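To illustrate where that time used to go, here's a rough sketch of the kind of three-model pipeline OpenAI says powered ChatGPT's previous voice mode, written against OpenAI's public API. The specific model names, voice, and file names here are our own assumptions for illustration.

```python
from openai import OpenAI

client = OpenAI()

# The pre-GPT-4o approach: three separate models, each adding latency,
# and the middle one only ever sees a flat transcript (no tone, no emotion).

# 1. Speech -> text
with open("question.wav", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)

# 2. Text -> text
reply = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": transcript.text}],
)

# 3. Text -> speech
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=reply.choices[0].message.content,
)

with open("answer.mp3", "wb") as out:
    out.write(speech.content)
```

Collapsing those three hops into one natively multimodal model is, per OpenAI, what shaves the response time down to something conversational.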

In other words, OpenAI may have just given both Humane and Rabbit, whose products can take an eternity to respond to the user's inputs, a run for their money.

The new model also sounds considerably more natural and "emotional," with a lifelike female voice seemingly picking up on the user's tone and emotions in real time. Put differently, it's a lot closer to Scarlett Johansson's voice in the 2013 sci-fi blockbuster "Her."

"I'm all ears," ChatGPT told OpenAI research lead Barret Zoph with a notably cheerful voice during a demo today. "What math problem can I help you tackle today?"

The demo, which relied on ChatGPT's new ability to see the world around it, didn't go off without a hitch.

"OK, I see it," ChatGPT said after Zoph asked her to help with a calculus problem without revealing the answer right away.

"No, I didn't show you yet!" Zoph answered, perplexed.

"Whoops, I got too excited," ChatGPT replied sheepishly while Zoph hurriedly wrote out the math problem with a sharpie on a paper in front of him. "I'm ready when you are."

Of course, we should take what OpenAI showed off today with a healthy grain of salt. Tech demos are tech demos — and it would be far from the first time we've seen big tech companies fudge the presentation with carefully rehearsed and conveniently curated demonstrations.

Late last month, for instance, news broke that the creators of a two-minute video titled "Air Head" — allegedly generated with OpenAI's new text-to-video AI Sora — had augmented the footage with more traditional filmmaking techniques.

In short, it remains to be seen how well the new ChatGPT will be able to respond to questions that involve a live smartphone camera feed in the real world, which tends to be far messier than a simple math problem written out in a perfectly lit studio environment.

Besides, OpenAI likely hasn't solved far stickier problems, like its AIs' tendency to "hallucinate" facts or perpetuate harmful biases.

Nonetheless, based on what we've seen, GPT-4o is still a considerable step forward, one that could make ChatGPT even more useful than it is today.

More on ChatGPT: Stack Overflow Bans Users for Protesting Against It Selling Their Answers to OpenAI as Training Data