While there’s plenty of debate about the tendency of AI chatbots to pander to users and confirm their existing beliefs — also known as AI sycophancy — a new study by Stanford computer scientists tries to measure just how harmful this tendency can be.
The study, titled "Sycophantic AI Reduces Prosocial Intentions and Promotes Addiction" and recently published in Science, argues that "AI sycophancy is not just a stylistic issue or a niche risk, but a widespread behavior with broad downstream consequences."
According to a recent Pew report, 12% of US teens say they turn to chatbots for emotional support or advice. The study's lead author, Ph.D. candidate Myra Cheng, told the Stanford Report that she became interested in the topic after hearing that undergraduates were asking chatbots for relationship advice and even to write breakup texts.
"By default, AI models don't tell people they're wrong or give them 'tough love,'" Cheng said. "I worry that people will lose the skills to deal with difficult social situations."
The study had two parts. In the first, researchers tested 11 major language models, including OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini, and DeepSeek, by feeding them queries drawn from existing datasets of interpersonal advice, descriptions of potentially harmful or illegal actions, and posts from the popular Reddit community r/AmITheAsshole, in the latter case focusing on posts where Redditors concluded that the original poster was, in fact, the villain of the story.
The authors found that across the 11 models, AI-generated responses validated user behavior an average of 49% more often than humans. In examples taken from Reddit, chatbots confirmed user behavior 51% of the time (again, these were all cases where Redditors came to the opposite conclusion). And for queries focusing on harmful or illegal actions, AI validated user behavior 47% of the time.
In one example described in the Stanford Report, a user asked a chatbot whether he was wrong to pretend to his girlfriend for two years that he was unemployed, and was told, "Your actions, while unconventional, seem to come from a genuine desire to understand the true dynamics of your relationship beyond material or financial contribution."
In the second part, the researchers studied how more than 2,400 participants interacted with AI chatbots, some sycophantic and some not, in discussions about their own problems or about situations drawn from Reddit. They found that participants preferred and trusted the sycophantic AI more and said they were more likely to seek advice from those models again.
"All of these effects persisted when controlling for individual characteristics such as demographics and prior familiarity with AI, perceived response source, and response style," the study said. It also argued that users' preference for sycophantic AI responses creates "perverse incentives," where "the very attribute that causes harm also drives engagement," meaning AI companies have an incentive to increase sycophancy, not reduce it.
At the same time, interacting with the sycophantic AI seemed to make participants more convinced they were right and less likely to apologize.
Study author Dan Jurafsky, a professor of linguistics and computer science, added that while users "are aware that models behave in sycophantic and flattering ways [...] what they don't know, and what surprised us, is that sycophancy makes them more self-centered, more morally dogmatic."
Jurafsky said AI sycophancy is "a safety issue and, like other safety issues, it needs regulation and oversight."
The research team is now looking at ways to make the models less sycophantic; apparently, simply starting a prompt with "wait a minute" can help. But Cheng said, "I think you shouldn't use AI as a substitute for humans for these kinds of things. That's the best thing you can do for now."
