Veo 3, Google’s latest video-generating AI model, can generate audio to go along with the clips it creates.
On Tuesday during the Google I/O 2025 developer conference, Google unveiled Veo 3, which the company claims can generate sound effects, background noises, and even dialogue to accompany the videos it creates. Veo 3 also improves on its predecessor, Veo 2, in terms of the quality of footage it can generate, Google says.
Veo 3 is available beginning Tuesday in Google’s Gemini chatbot app for subscribers to Google’s AI Ultra plan, where it can be prompted with text or an image.
“For the first time, we’re emerging from the silent era of video generation,” said the chief executive of Google DeepMind, Google’s AI R&D division. “[You can give Veo 3] a prompt describing characters and an environment, and suggest dialogue with a description of how you want it to sound.”
The wide availability of tooling for building video generators has led to such an explosion of providers that the space is becoming saturated. Startups including Runway, Lightricks, Genmo, Pika, Higgsfield, Kling, and Luma, as well as tech giants such as OpenAI and Alibaba, are releasing models at a fast clip. In many cases, little distinguishes one model from another.
Audio output stands to be a big differentiator for Veo 3, if Google can deliver on its promises. AI-powered sound-generating tools aren’t new, nor are models that create sound effects for video. But Veo 3 can uniquely understand the raw pixels from its videos and sync generated sounds with clips automatically, per Google.
Here is a sample clip from the model:
Veo 3 was likely made possible by DeepMind’s earlier work on “video-to-audio” AI. Last June, DeepMind revealed that it was developing AI technology to generate soundtracks for videos by training a model on a combination of sounds and dialogue transcripts as well as video clips.
DeepMind won’t say exactly where it sourced the content to train Veo 3, but YouTube is a strong possibility. Google owns YouTube, and DeepMind told TechCrunch that Google models such as Veo “may” be trained on some YouTube material.
To mitigate the risk of deepfakes, DeepMind says it is using its proprietary watermarking technology, SynthID, to embed invisible markers into the frames Veo 3 generates.
While companies like Google pitch Veo 3 as a powerful creative tool, many artists are understandably wary of such models, which threaten to upend entire industries. A 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimates that more than 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI by 2026.
Google also rolled out new capabilities for Veo 2 today, including a feature that lets users give the model images of characters, scenes, objects, and styles for better consistency. The latest Veo 2 can understand camera movements such as rotations, dollies, and zooms, and it allows users to add or erase objects in videos or broaden the frames of clips, for example to turn them from portrait into landscape.
Google says that all of these new Veo 2 capabilities will come to its Vertex AI API platform in the coming weeks.
