Google is making the second generation of Imagen, its AI model that can create and edit images at the prompt of text, more widely available — at least to Google Cloud customers using Vertex AI and approved for access.
But the company isn’t disclosing what data it used to train the new model — nor is it introducing a way for creators who may have accidentally contributed to the dataset to opt out or file for compensation.
Called Imagen 2, the improved model — which was quietly previewed at the tech giant’s I/O conference in May — was developed using technology from Google DeepMind, Google’s flagship AI lab. Compared with the first-generation Imagen, it’s “significantly” improved in image quality, Google claims (the company, surprisingly, declined to share sample images ahead of this morning), and introduces new capabilities, including the ability to render text and logos.
“If you want to create images with text overlays — for example, advertising — you can do that,” Google Cloud CEO Thomas Kurian said during a press briefing Tuesday.
Text and logo generation brings Imagen in line with other leading image generators such as OpenAI’s DALL-E 3 and Amazon’s recently released Titan Image Generator. In two potential points of differentiation, however, Imagen 2 can render text in multiple languages — namely Chinese, Hindi, Japanese, Korean, Portuguese, English and Spanish, with more coming sometime in 2024 — and overlay logos on existing images.
“Imagen 2 can create … emblems, lettering and abstract logos … [and] has the ability to overlay these logos on products, clothing, business cards and other surfaces,” explained Vishy Tirumalasetty, Google’s head of generative media products, in a blog post provided to TechCrunch ahead of today’s announcement.
Thanks to “innovative training and modeling techniques,” Imagen 2 can also understand longer, more descriptive prompts and provide “detailed answers” to questions about elements within an image. These techniques also enhance Imagen 2’s multilingual understanding, Google says — allowing the model to translate a prompt in one language into an output (e.g., a logo) in another language.
Imagen 2 leverages SynthID, an approach developed by DeepMind, to apply invisible watermarks to the images it generates. Of course, detecting these watermarks — which Google claims are resistant to image manipulations including compression, filters and color adjustments — requires a Google-provided tool that isn’t available to third parties. But as policymakers express concern about the growing volume of AI-generated disinformation on the web, it may at least allay some fears.
Google didn’t disclose the data it used to train Imagen 2, which — while disappointing — isn’t surprising. It’s an open legal question whether GenAI vendors like Google can train a model on publicly available — even copyrighted — data and then turn around and commercialize that model.

Related lawsuits are playing out in the courts, with vendors arguing that they’re protected by the fair use doctrine. But it will be a while before the dust settles.
Google, meanwhile, is playing it safe by keeping quiet on the matter — a reversal of the strategy it took with the first-generation Imagen, where it revealed it used a version of the public LAION dataset to train the model. LAION is known to contain problematic content including but not limited to private medical images, copyrighted artwork and photoshopped celebrity porn — which is obviously not the best look for Google.
Some companies developing AI-powered image generators, such as Stability AI and — as of a few months ago — OpenAI, allow creators to opt out of training datasets if they choose. Others, including Adobe and Getty Images, have instituted compensation schemes for creators — though these aren’t always generous or transparent.
Google — and, to be fair, several of its competitors, including Amazon — offer no such opt-out mechanism or creator compensation. That won’t be changing anytime soon, it seems.
Instead, Google offers an indemnification policy that protects eligible Vertex AI customers from copyright claims related to both Google’s use of training data and Imagen 2 outputs.
Regurgitation, or when a generative model spits out a near copy of a training example, is rightly a concern for enterprise customers and developers. An academic study showed that the first-generation Imagen wasn’t immune to this phenomenon, producing recognizable photos of real people, artists’ copyrighted works and more when prompted in particular ways.
Unsurprisingly, in a recent survey of Fortune 500 companies by Acrolinx, nearly a third said intellectual property was their biggest concern about using generative AI. Another poll found that nine out of 10 developers “heavily consider” IP protection when deciding whether to use generative AI.
It’s a concern Google hopes its recently expanded indemnification policy will address. (Google’s terms didn’t previously cover Imagen outputs.) As for creators’ concerns, well… they’re out of luck this time around.