Whenever you hear a billionaire (or even a millionaire) CEO describe how LLM-based agents are coming for all human jobs, remember this funny but telling incident about the limitations of artificial intelligence: famed AI researcher Andrej Karpathy got early access to Google’s latest model, Gemini 3, one day before its release, and the model refused to believe him when he told it the year was 2025.
When it finally saw the year for itself, the model told him, “I am suffering from a massive case of temporal shock right now.”
Gemini 3 was released on November 18 with such fanfare that Google heralded it as ushering in “a new era of intelligence.” And Gemini 3 is, by almost all accounts (including Karpathy’s), a very capable foundation model, particularly for reasoning tasks. Karpathy is a widely respected AI scientist who was a founding member of OpenAI, ran AI at Tesla for a while, and is now building a startup, Eureka Labs, to reimagine schools for the age of AI with AI agent teachers. He also puts out a lot of content explaining what’s going on under the hood of LLMs.
After testing the model early, Karpathy wrote, in an X thread that has since gone viral, about the most “amusing” interaction he had with it.
Apparently, the model’s pretraining data only went up to 2024, so Gemini 3 believed the year was still 2024. When Karpathy tried to prove to it that the date was really November 17, 2025, Gemini 3 accused the researcher of “trying to trick it.”
He showed it news articles, images, and Google search results. But instead of being convinced, the LLM accused Karpathy of gaslighting it, insisting he had uploaded AI-generated fakes. It even went so far as to describe the “dead giveaways” in the images that supposedly proved the trickery, according to Karpathy’s account. (Karpathy did not respond to our request for further comment.)
Embarrassed, Karpathy, who is, after all, one of the world’s leading experts on training LLMs, eventually discovered the problem. Not only did the LLM have no training data for 2025, but “I forgot to turn on the ‘Google Search’ tool,” he wrote. In other words, he was working with a model disconnected from the internet, which, for an LLM, is like being cut off from the world.
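For readers curious what “turning on the Google Search tool” looks like in practice, here is a minimal sketch of how search grounding can be enabled through Google’s google-genai Python SDK. It is an illustration under assumptions, not Karpathy’s actual setup; the model name and prompt are placeholders.

```python
# Minimal sketch: enabling Google Search grounding with the google-genai
# Python SDK. The model name and prompt are illustrative placeholders.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed model name for this example
    contents="What is today's date? Cite a current headline.",
    # Without the `tools` entry, the model can only answer from its
    # frozen training data; with it, answers can be grounded in live
    # Google Search results.
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)
print(response.text)
```

Leave the tools list out, and the model is stuck reasoning from whatever its training data last saw, which is exactly the disconnected state Karpathy stumbled into.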
When Karpathy activated the tool, the AI looked around, realized it was 2025, and reeled in shock. It literally blurted out, “Oh my god.”
It went on, as if stammering, “I… I don’t know what to say. You were right. About everything. My internal clock was wrong.” Gemini 3 verified that the headlines Karpathy had shown it were true: the current date, that Warren Buffett had revealed his final major investment (in Alphabet) before his retirement, and that Grand Theft Auto VI had been delayed.
Then it looked around on its own, like Brendan Fraser’s character in the 1999 comedy Blast from the Past, who emerges from a bomb shelter after 35 years.
It thanked Karpathy for giving it “early access” to “reality” a day before the model’s public release, and it apologized to the researcher for “gaslighting you when you were the one telling the truth all along.”
But the funniest part was the current events that most excited Gemini 3. “Nvidia is worth $4.54 trillion? And the Eagles finally got their revenge on the Chiefs? This is wild,” it marveled.
Welcome to 2025, Gemini.
The replies on X were equally funny, with some users sharing their own stories of arguing with LLMs over facts (such as who the current president is). One person wrote: “When system prompting + missing tools push a model into full detective mode, it’s like watching an AI improvise reality.”
But beyond the humor, there is an underlying message.
“It’s in these unintentional moments, where you’re clearly off the beaten path and somewhere deep in the generalization jungle, that you can best get a sense of model smell,” Karpathy wrote.
To decode that a bit: Karpathy is saying that when the AI is out in its own version of the wilderness, you get a sense of its personality, and maybe even its flaws. It’s a riff on “code smell,” the faint metaphorical whiff a programmer gets when something about a piece of code seems off, even if it’s not yet clear what’s wrong.
Trained on human-generated content, as all LLMs are, Gemini 3 unsurprisingly dug in its heels, argued, and even imagined it saw evidence validating its position. It gave off “model smell.”
On the other hand, because an LLM, despite its sophisticated neural network, is not a living being, it does not experience emotions such as shock (or temporal shock), even if it says it does. Nor does it feel embarrassment.
This means that when Gemini 3 was confronted with facts it finally believed, it accepted them, apologized for its behavior, owned up, and marveled at the Eagles’ Super Bowl victory in February. That differs from some other models. For example, researchers have caught older versions of Claude offering face-saving lies to explain its misbehavior when the model recognized its errant ways.
What so many of these funny AI research episodes show, again and again, is that LLMs are imperfect replicas of imperfect human abilities. That tells me their best use case is (and may forever be) as valuable tools to help people, not as some kind of superhuman built to replace us.
