If you needed more proof that GenAI is prone to making things up, Google’s chatbot Gemini, formerly Bard, thinks the 2024 Super Bowl has already happened. It even has the (fictional) stats to back it up.
According to a Reddit thread, Gemini, powered by Google’s eponymous GenAI models, answers questions about Super Bowl LVIII as if the game ended yesterday, or even weeks ago. Like many betting companies, it seems to favor the Chiefs over the 49ers (sorry, San Francisco fans).
Gemini embellishes quite creatively, in at least one instance giving a player stat breakdown suggesting that Kansas City Chiefs quarterback Patrick Mahomes rushed for 286 yards, two touchdowns and an interception, versus Brock Purdy’s 253 rushing yards and one touchdown.
It’s not just Gemini. Microsoft’s Copilot chatbot also insists the game is over and provides false reports to back up the claim. But, perhaps reflecting a San Francisco bias, it says the 49ers, not the Chiefs, emerged victorious “by a final score of 24-21.”
Copilot is powered by a GenAI model similar, if not identical, to the model underpinning OpenAI’s ChatGPT (GPT-4). But in my testing, ChatGPT declined to make the same mistake.
It’s all rather silly, and likely resolved by now, given that this reporter had no luck reproducing the Gemini responses in the Reddit thread. (I’d be shocked if Microsoft wasn’t working on a fix as well.) But it also illustrates the major limitations of today’s GenAI, and the dangers of placing too much trust in it.
GenAI models have no real intelligence. Fed an enormous number of examples, usually sourced from the public web, they learn how likely data (e.g. text) is to occur based on patterns, including the context of any surrounding data.
This probability-based approach works remarkably well at scale. But while the range of words and their probabilities is likely to result in text that makes sense, it’s far from certain. LLMs can generate something that’s grammatically correct but nonsensical, for instance, like the claim about the Golden Gate. Or they can spout mistruths, propagating inaccuracies in their training data.
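To make that concrete, here is a deliberately simplified sketch of my own, not anything from Google or OpenAI, of what “predicting the likely next word” amounts to: the model samples from a probability distribution over continuations, and a statistically plausible completion comes out whether or not it is true. The words and numbers below are made up for illustration.

```python
# Toy illustration (not Gemini's actual architecture): a language model
# completes text by sampling the next word from a probability distribution
# conditioned on the words before it. The probabilities here are invented.
import random

# Hypothetical next-word probabilities after the prompt
# "The Super Bowl was won by the ..."
next_word_probs = {
    "Chiefs": 0.55,   # most statistically plausible continuation
    "49ers": 0.40,
    "Ravens": 0.05,
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Pick a word in proportion to its probability, the way an LLM samples tokens."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

# The "model" happily completes the sentence even though the game hasn't
# been played; plausibility, not truth, drives the choice.
print("The Super Bowl was won by the", sample_next_word(next_word_probs))
```

The point of the sketch is that nothing in this loop checks the claim against reality; a fluent, confident answer and a correct one are produced by exactly the same mechanism.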
The Super Bowl misinformation certainly isn’t the most harmful example of GenAI going off the rails. That distinction probably lies with endorsing torture, reinforcing ethnic and racial stereotypes or writing persuasively about conspiracy theories. It is, however, a useful reminder to double-check statements from GenAI bots. There’s a good chance they’re not true.