Google Veo, a serious swing at AI-generated video, debuts at Google I/O 2024

Kyle Wiggers

Updated May 14, 2024 at 5:10 PM

Google’s gunning for OpenAI’s Sora with Veo, an AI model that can create 1080p video clips around a minute long given a text prompt.

Unveiled on Tuesday at Google’s I/O 2024 developer conference, Veo can capture different visual and cinematic styles, including shots of landscapes and time lapses, and make edits and adjustments to already generated footage.

“We’re exploring features like storyboarding and generating longer scenes to see what Veo can do,” Demis Hassabis, head of Google’s AI R&D lab DeepMind, told reporters during a virtual roundtable. “We’ve made incredible progress on video.”

Veo builds on Google’s preliminary commercial work in video generation, previewed in April, which tapped the company’s Imagen 2 family of image-generating models to create looping video clips.

But unlike the Imagen 2-based tool, which could only create low-resolution, few-seconds-long videos, Veo appears to be competitive with today’s leading video generation models — not only Sora, but models from startups like Pika, Runway and Irreverent Labs.

In a briefing, Douglas Eck, who leads research efforts at DeepMind in generative media, showed me some cherry-picked examples of what Veo can do. One in particular — an aerial view of a bustling beach — demonstrated Veo’s strengths over rival video models, he said.

“The detail of all the swimmers on the beach has proven to be hard for both image and video generation models — having that many moving characters,” he said. “If you look closely, the surf looks pretty good. And the sense of the prompt word 'bustling,' I would argue, is captured with all the people — the lively beachfront filled with sunbathers.”

Veo was trained on lots of footage. That’s generally how it works with generative AI models: Fed example after example of some form of data, the models pick up on patterns in the data that enable them to generate new data — videos, in Veo’s case.

Where did the footage to train Veo come from? Eck wouldn’t say precisely, but he did admit that some might’ve been sourced from Google’s own YouTube.

“Google models may be trained on some YouTube content, but always in accordance with our agreement with YouTube creators,” he said.

The “agreement” part may technically be true. But it’s also true that, considering YouTube’s network effects, creators don’t have much choice but to play by Google’s rules if they hope to reach the widest possible audience.

Reporting by The New York Times in April revealed that Google broadened its terms of service last year in part to allow the company to tap more data to train its AI models. Under the old ToS, it wasn’t clear whether Google could use YouTube data to build products beyond the video platform. Not so under the new terms, which loosen the reins considerably.

Google’s far from the only tech giant leveraging vast amounts of user data to train in-house models. (See: Meta.) But what’s sure to disappoint some creators is Eck’s insistence that Google’s setting the “gold standard” here, ethics-wise.

“The solution to this [training data] challenge will be found with getting all of the stakeholders together to figure out what are the next steps,” he said. “Until we make those steps with the stakeholders — we’re talking about the film industry, the music industry, artists themselves — we won’t move fast.”

Yet Google’s already made Veo available to select creators, including Donald Glover (AKA Childish Gambino) and his creative agency Gilga. (Like OpenAI with Sora, Google's positioning Veo as a tool for creatives.)

https://www.youtube.com/watch?v=dKAVFLB75xs&ab_channel=Google

Eck noted that Google provides tools to allow webmasters to prevent the company’s bots from scraping training data from their websites. But the settings don’t apply to YouTube. And Google, unlike some of its rivals, doesn’t offer a mechanism to let creators remove their work from its training data sets post-scraping.

I asked Eck about regurgitation, as well, which in the generative AI context refers to when a model generates a mirror copy of a training example. Tools like Midjourney have been found to spit out exact stills from movies including “Dune,” “Avengers” and “Star Wars” provided a time stamp — laying a potential legal minefield for users. OpenAI has reportedly gone so far as to block trademarks and creators’ names in prompts for Sora to try to deflect copyright challenges.

So what steps did Google take to mitigate the risk of regurgitation with Veo? Eck didn’t have an answer, short of saying the research team implemented filters for violent and explicit content (so no porn) and is using DeepMind's SynthID tech to mark videos from Veo as AI-generated.

“We’re going to make a point of — for something as big as the Veo model — to gradually release it to a small set of stakeholders that we can work with very closely to understand the implications of the model, and only then fan out to a larger group,” he said.

Eck did have more to share on the model’s technical details.

Eck described Veo as “quite controllable” in the sense that the model understands camera movements and VFX reasonably well from prompts (think descriptors like “pan,” “zoom” and “explosion”). And, like Sora, Veo has somewhat of a grasp on physics — things like fluid dynamics and gravity — which contribute to the realism of the videos it generates.

Veo also supports masked editing for changes to specific areas of a video and can generate videos from a still image, a la generative models like Stability AI's Stable Video. Perhaps most intriguing, given a sequence of prompts that together tell a story, Veo can generate longer videos, running beyond a minute in length.

That’s not to suggest Veo’s perfect. Reflecting the limitations of today’s generative AI, objects in Veo’s videos disappear and reappear without much explanation or consistency. And Veo gets its physics wrong often — for example, cars will inexplicably, impossibly reverse on a dime.

That’s why Veo will remain behind a waitlist on Google Labs, the company’s portal for experimental tech, for the foreseeable future, inside a new front end for generative AI video creation and editing called VideoFX. As it improves, Google aims to bring some of the model’s capabilities to YouTube Shorts and other products.

“This is very much a work in progress, very much experimental … there’s much more left undone than done here,” Eck said. “But I think this is sort of the raw materials for doing something really great in the filmmaking space.”
