After coming to Bard and the Pixel 8 Pro last week, Gemini, the recently announced flagship of Google’s GenAI model family, is rolling out to Google Cloud customers using Vertex AI.
Gemini Pro, a lightweight version of a more capable Gemini model, Gemini Ultra, which is currently in private preview for a “select set” of customers, is now accessible in public preview on Vertex AI, Google’s fully managed AI development platform, via the new Gemini Pro API. The API is free to use “within limits” for now (more on what that means later) and supports 38 languages and regions including Europe, as well as features like chat functionality and filtering.
“Gemini is a state-of-the-art inherently multimodal model that features sophisticated reasoning [and] advanced coding skills,” Google Cloud CEO Thomas Kurian said during a press conference Tuesday. “[Now,] developers will be able to build their own apps around it.”
Gemini Pro API
By default, the Gemini Pro API in Vertex accepts text as input and generates text as output, similar to text generation model APIs such as Anthropic’s, AI21’s and Cohere’s. An additional endpoint, Gemini Pro Vision, also launching in preview today, can process text and imagery, including photos and video, and output text, along the lines of OpenAI’s GPT-4 with Vision model.
Image processing addresses one of the major criticisms of Gemini since its unveiling last Wednesday: namely, that the version of Gemini powering Bard, an enhanced Gemini Pro model, cannot accept images despite being technically “multimodal” (i.e. trained on a range of data including text, images, video and audio). Questions remain about Gemini’s performance and image analysis skills, especially in light of a misleading product demo. But now, at least, users will be able to try out the model and its image understanding themselves.
In Vertex AI, developers can customize Gemini Pro for specific contexts and use cases by leveraging the same fine-tuning tools available for other Vertex-hosted models, such as Google’s PaLM 2. Gemini Pro can also be connected to external APIs to perform specific actions, or “grounded” to improve the accuracy and relevance of model responses, either with third-party data from an application or database, or with data from the web and Google Search.
Citation checking — another existing Vertex AI feature, now supported for Gemini Pro — serves as an additional measure of data control by highlighting the sources of information that Gemini Pro used to arrive at an answer.
“Grounding allows us to take a response generated by Gemini and compare it to a set of data that resides within a company’s systems … or web sources,” Kurian said. “[T]he comparison allows you to improve the quality of the model’s responses.”
Kurian spent quite a bit of time highlighting Gemini Pro’s control, moderation and governance options, seemingly pushing back against coverage implying that Gemini Pro isn’t the most capable model out there. Will the assurances be enough to convince developers? Maybe. But if they aren’t, Google is sweetening the pot with discounts.
Input for Gemini Pro on Vertex AI will cost $0.00025 per character, while output will cost $0.00005 per character. (Vertex customers pay per 1,000 characters and, in the case of models like Gemini Pro Vision, per image.) That is a reduction of 4x and 2x, respectively, from pricing for Gemini Pro’s predecessor. And for a limited time, until early next year, Gemini Pro is free to try for Vertex AI customers.
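To get a rough sense of what those rates mean for a single request, here is a minimal back-of-the-envelope sketch. It assumes the per-character figures exactly as quoted above and ignores image pricing and the free preview period; `estimate_cost` is a hypothetical helper written for illustration, not part of any Google SDK, and actual bills are calculated by Google per 1,000 characters.

```python
from decimal import Decimal

# Rates as quoted in the article, in USD per character.
# Assumption: these figures are taken at face value for illustration only.
INPUT_RATE_PER_CHAR = Decimal("0.00025")
OUTPUT_RATE_PER_CHAR = Decimal("0.00005")


def estimate_cost(input_chars: int, output_chars: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one text request."""
    if input_chars < 0 or output_chars < 0:
        raise ValueError("character counts must be non-negative")
    input_cost = input_chars * INPUT_RATE_PER_CHAR
    output_cost = output_chars * OUTPUT_RATE_PER_CHAR
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 2,000-character prompt that yields a 500-character reply.
    print(f"Estimated cost: ${estimate_cost(2000, 500)}")
```

Decimal arithmetic is used here instead of floats so that very small per-character rates do not pick up rounding noise when multiplied across large character counts.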
“Our goal is to attract developers with attractive prices,” Kurian said candidly.
Vertex Boost
Google is bringing other new features to Vertex AI in hopes of steering developers away from rival platforms like Bedrock.
Several pertain to Gemini Pro. Soon, Vertex customers will be able to tap Gemini Pro to power custom conversational and chat agents, delivering what Google describes as “dynamic interactions … supporting advanced logic.” Gemini Pro will also become an option for powering Vertex AI’s search summarization, recommendation and answer generation features, drawing on documents in various formats (e.g. PDFs, images) from different sources (e.g. OneDrive, Salesforce) to answer questions.
Kurian says he expects Gemini Pro-powered chat and search features to arrive “very early” in 2024.
Elsewhere in Vertex, there is now Auto Side by Side (Auto SxS). A response to AWS’ recently announced Model Evaluation on Bedrock, Auto SxS lets developers evaluate models in an “on-demand,” “automated” fashion. Google claims that Auto SxS is faster and more cost-effective than manual model evaluation (although the jury is out on this pending independent testing).
Google is also adding models to Vertex from third parties, including Mistral and Meta, and introducing step-by-step distillation, a technique that creates smaller, specialized, low-latency models from larger models. In addition, Google is expanding its indemnification policy to include results from PaLM 2 and Imagen models, which means the company will legally defend eligible customers involved in lawsuits over IP disputes involving the results of these models.
Generative AI models have a tendency to regurgitate training data, an obvious concern for enterprise customers. If one day it’s discovered that a vendor like Google used copyrighted data to train a model without first obtaining proper permission, that vendor’s customers could end up on the hook for incorporating IP-infringing work into their projects.
Some vendors claim fair use as a defense. But, with enterprise wariness in mind, a growing number are expanding their indemnity policies around GenAI offerings.
Google stops short of extending Vertex AI’s indemnification policy to cover customers using the Gemini Pro API. The company says, however, that it will do so once the Gemini Pro API becomes generally available.