Google’s Gemini Omni turns images, audio and text into video — and that’s just the beginning

By techtost.com | 19 May 2026 | 5 Mins Read

When Google launched Gemini three years ago, the goal was to create a multimodal large language model: a single neural network trained on text, image, audio, and video that could generate content in any of these formats.

Today at the Google I/O developer conference, the company took a concrete step toward that goal with Gemini Omni, a new family of multimodal models that Google CEO Sundar Pichai says will be able to “create anything from any input.”

Omni will launch with video. Users can now combine images, audio, video, and text, and instead of simply stitching these inputs together, Omni reasons across all of them to produce a consistent output. The result is high-quality video that reflects an understanding of physics, culture, history and science.

Omni also allows users to edit photos with simple text commands instead of complex editing software, similar to Google’s Nano Banana.

Google already has a dedicated video model, Veo, that lets users turn text and images into video, and even direct and customize avatars. But Google DeepMind director of product management Nicole Brichtova says today’s release is more than just a Veo update: “It’s the next step in the evolution of combining the intelligence of Gemini with the rendering capabilities of our media models.”

Koray Kavukcuoglu, DeepMind’s chief technologist, gave reporters an example during a media briefing on Monday: when Omni received a simple prompt, such as “an explanation of how protein folding starts,” it quickly rendered a stop-motion explainer video with a voice saying, “Proteins start out as chains of amino acids [...] beta sheets, forming a perfect three-dimensional shape.”

The long-term vision for Omni is broader, and includes using the model to do things like create images from audio or audio from video.

“When we first announced Gemini, it was our first AI model that was inherently multimodal,” Pichai said during the briefing. “We knew that training it on a combination of text, code, audio, images and video would give it a deeper understanding of the world. With world models, AI moves from predicting text to simulating reality. Gemini Omni is the next step in that direction.”

As part of the release, users will also be able to create videos of their own digital avatars — something OpenAI made popular in the now-defunct Sora app with Cameos. To avoid deepfakes, users will have to go through a dedicated product onboarding, which involves registering themselves and speaking a series of numbers, according to Brichtova. The avatar is then saved for future use.

Additionally, all videos created with Omni will include Google’s SynthID digital watermark, which allows users to verify whether videos were created through Gemini products.

The first model in the family is Gemini Omni Flash, which rolls out today in the Gemini app, YouTube Shorts and the AI creative studio Flow. Flash will be able to render 10-second videos, which Brichtova says isn’t a model limitation but a decision based on both the desire to get it into more hands and the expectation that most users won’t want to make much longer videos yet. Longer video durations are in the works for the near future.

Google seems to be pitching Omni Flash as more of a consumer tool. The examples Brichtova and Gabe Barth-Maron, a research engineer at DeepMind, gave on a call with TechCrunch about uses for digital avatars were all personal: making a video of yourself winning an award or going to the moon, or removing a bystander from the background of a video you took on vacation.

Barth-Maron put it more simply: “They are like personalized memes.”

“We definitely focused on making this easy for consumers to use,” Brichtova said. “There aren’t many video models that have bridged that gap with consumers, so this is our play to do that.”

The ease of use comes with a caveat: Brichtova and Barth-Maron noted that editing prompts should be very specific, otherwise Omni risks over-editing or inadvertently changing elements the user wanted to keep, a problem Nano Banana users also face.

Despite the short-term consumer focus, the business and creative implications of Omni are obvious, and Google will make Omni available via API in the coming weeks. The avatar creator — a feature available today in Shorts — is something Google is hoping content creators will pick up on. But more broadly, an integrated multimodal workflow could be transformative for advertisers and filmmakers.

Startup Luma AI is building something similar, a tool that can create an entire ad campaign based on a short brief and a product image, backed by its own “unified” model.

“We’re really, really proud of the text rendering capabilities of the model, which is really useful for things like advertising,” Brichtova said. “If you want a product somewhere, or even a slogan, it has to be accurate… We certainly expect filmmakers and other kinds of creators to use this model as well.”

More professional use cases may be better served by the Omni Pro model, which should perform better in all Omni tasks. Google hasn’t yet said when Pro will be released, but Brichtova said it will happen when “we feel like we’re at a point where we have an incremental change over Flash.”
