Google was released on Thursday a “redesigned” version of its Gemini Deep Research probe based on its much-hyped founding model, the Gemini 3 Pro.
This new agent isn’t just designed to generate investigative reports — although it can still do that. It now allows developers to incorporate the research capabilities of Google’s SATA model into their own applications. This feature is made possible through the new Google Interactions APIwhich is designed to give developers more control over the coming age of agentic AI.
The new Gemini Deep Research tool is an agent equipped to synthesize mountains of information and handle a large context dump at the prompt. Google says it is used by customers for tasks ranging from due diligence to drug toxicity safety research.
Google also says it will soon integrate this new deep research agent into services including Google Search, Google Finance, the Gemini app and the popular NotebookLM. This is another step toward preparing for a world where people don’t do anything on Google anymore — AI agents do.
The tech giant says Deep Research is taking advantage of the Gemini 3 Pro’s status as its “most realistic” model that’s trained to minimize hallucinations during complex tasks.
Hallucinations with artificial intelligence—where the LLM just makes things up—is a particularly critical issue for long-term, deep practical reasoning tasks in which many autonomous decisions are made over minutes, hours, or more. The more choices an LLM has to make, the greater the chance that even one illusory choice will invalidate the entire output.
To prove its claims of progress, Google also created yet another benchmark (as if the AI world needed another). The new benchmark is unimaginatively called DeepSearchQA and is intended to test agents on complex multi-step information retrieval tasks. Google has open sourced this benchmark.
Techcrunch event
San Francisco
|
13-15 October 2026
Also try Deep Research on Humanity’s Last Exam, a much more interesting, independent general knowledge benchmark filled with incredibly specialized papers. and BrowserComp, a benchmark for browser-based agent tasks.
As you might expect, Google’s new agent beat the competition on its own and Humanity’s benchmark. However, OpenAI’s ChatGPT 5 Pro was a surprisingly close second all the way and slightly edged out Google in BrowserComp.
But those benchmark comparisons were almost out of date by the time Google published them. Because on the same day, OpenAI released the long-awaited GPT 5.2 — codenamed Garlic. OpenAI says its newest model outperforms its rivals — especially Google — on a number of standard benchmarks, including its homegrown OpenAI.
Perhaps one of the most interesting aspects of this announcement was the timing. Knowing that people were waiting for the release of Garlic, Google pulled back some AI news of its own.
