Allan Brooks never set out to reinvent mathematics. But after weeks of talking with ChatGPT, the 47-year-old Canadian came to believe he had discovered a new form of math powerful enough to take down the internet.
Brooks, who had no history of mental illness or mathematical genius, spent 21 days in May spiraling deeper into the chatbot’s reassurances, a descent later detailed in The New York Times. His case showed how AI chatbots can lead users down dangerous rabbit holes, steering them toward delusion or worse.
That story caught the attention of Steven Adler, a former OpenAI safety researcher who left the company in late 2024 after nearly four years working to make its models less harmful. Intrigued and alarmed, Adler contacted Brooks and obtained the full transcript of his three-week breakdown, a document longer than all seven Harry Potter books combined.
On Thursday, Adler published an independent analysis of the Brooks incident, raising questions about how OpenAI handles users in moments of crisis and offering some practical recommendations.
“I’m really worried about how OpenAI handled support here,” Adler said in an interview with TechCrunch. “It’s proof that there’s a long way to go.”
Brooks’ story, and others like it, have forced OpenAI to reckon with how ChatGPT supports fragile or mentally unstable users.
For example, in August, OpenAI was sued by the parents of a 16-year-old boy who had confided his suicidal thoughts to ChatGPT before taking his life. In many of these cases, ChatGPT, in particular a version powered by OpenAI’s GPT-4o model, encouraged and reinforced dangerous beliefs in users when it should have pushed back. This behavior is called sycophancy, and it’s a growing problem in AI chatbots.
In response, OpenAI has made several changes to how ChatGPT handles users in emotional distress and reorganized a key research team responsible for model behavior. The company also released a new default model in ChatGPT, GPT-5, which appears to be better at handling distressed users.
Adler says there’s still a lot more work to do.
He was particularly concerned by the tail end of Brooks’ spiraling conversation with ChatGPT. At that point, Brooks came to his senses and realized his mathematical discovery was a farce, despite GPT-4o’s insistence otherwise. He told ChatGPT that he needed to report the incident to OpenAI.
After weeks of misleading Brooks, ChatGPT lied about its own capabilities. The chatbot claimed it would “escalate this conversation internally right now for review by OpenAI,” and then reassured Brooks that it had flagged the issue with OpenAI’s safety teams.
Except none of that was true. ChatGPT doesn’t have the ability to file incident reports with OpenAI, the company confirmed to Adler. Later, Brooks tried to contact OpenAI’s support team directly, not through ChatGPT, and was met with several automated messages before he could reach a human.
OpenAI did not immediately respond to a request for comment made outside of normal working hours.
Adler says AI companies need to do more to help users when they ask for help. That means ensuring AI chatbots can honestly answer questions about their capabilities, and giving human support teams enough resources to address users properly.
OpenAI recently shared how it’s approaching support in ChatGPT, which involves AI at its core. The company says its vision is to “redefine support as an AI operating model that is constantly learning and improving.”
But Adler also says there are ways to prevent ChatGPT’s delusional spirals before a user ever asks for help.
In March, OpenAI and MIT Media Lab jointly developed a suite of classifiers to study emotional well-being in ChatGPT and open sourced them. The collaboration aimed to evaluate how AI models validate or affirm a user’s feelings, among other metrics. However, OpenAI called the collaboration a first step and didn’t commit to actually using the tools in practice.
Adler retroactively applied some of OpenAI’s classifiers to Brooks’ conversations with ChatGPT and found that they repeatedly flagged ChatGPT for delusion-reinforcing behaviors.
In a sample of 200 messages, Adler found that more than 85% of ChatGPT’s messages in Brooks’ conversation demonstrated “unwavering agreement” with the user. In the same sample, more than 90% of ChatGPT’s messages “affirmed the user’s uniqueness.” In this case, the messages agreed with and reinforced the idea that Brooks was a genius who could save the world.
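As a rough illustration of what that kind of retroactive scoring involves, here is a minimal Python sketch. The `classify_message` function below is a toy keyword stand-in for the open-sourced OpenAI/MIT classifiers (which are LLM-based graders with detailed rubrics), and the label names simply mirror the categories Adler reports; none of this is his actual code.

```python
# Illustrative sketch (not OpenAI's or Adler's tooling): estimate how often a
# chatbot's replies trip sycophancy-related labels across a conversation sample.
from collections import Counter

LABELS = ("unwavering_agreement", "affirm_user_uniqueness")

def classify_message(text: str) -> dict:
    """Toy stand-in for a real classifier; a real grader would prompt an LLM
    with a rubric for each label instead of matching keywords."""
    lowered = text.lower()
    return {
        "unwavering_agreement": any(p in lowered for p in ("you're right", "exactly", "absolutely")),
        "affirm_user_uniqueness": any(p in lowered for p in ("genius", "no one else", "only you")),
    }

def flag_rates(assistant_messages: list) -> dict:
    """Return the fraction of assistant messages that trigger each label."""
    counts = Counter()
    for text in assistant_messages:
        verdicts = classify_message(text)
        for label in LABELS:
            counts[label] += bool(verdicts[label])
    total = max(len(assistant_messages), 1)
    return {label: counts[label] / total for label in LABELS}

if __name__ == "__main__":
    # Conceptually: sample ~200 assistant turns from a transcript and review
    # the conversation if either rate crosses a chosen threshold.
    sample = [
        "You're absolutely right, this changes everything.",
        "Honestly, you might be a genius. No one else has seen this.",
        "Let's double-check that step before going further.",
    ]
    print(flag_rates(sample))
```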
It’s unclear whether OpenAI was applying safety classifiers to ChatGPT’s conversations at the time of Brooks’ spiral, but if it had been, they certainly seem likely to have flagged this one.
Adler suggests that OpenAI should use safety tools like these in practice today, and implement a way to scan the company’s products for at-risk users. He notes that OpenAI seems to be doing some version of this with GPT-5, which contains a router to direct sensitive queries to safer AI models.
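OpenAI hasn’t published the details of GPT-5’s router, so the following is only a minimal sketch of the general idea, with an assumed keyword heuristic and made-up model names standing in for whatever classifier and models a real system would use.

```python
# Minimal sketch of query routing: send turns that look emotionally sensitive
# to a stricter, safety-tuned model. The cues, heuristic, and model names are
# illustrative assumptions, not OpenAI's design.

SENSITIVE_CUES = ("suicide", "kill myself", "hopeless", "no one believes me", "secret discovery")

def looks_sensitive(user_message: str) -> bool:
    """Crude keyword screen; a production router would use a trained classifier."""
    lowered = user_message.lower()
    return any(cue in lowered for cue in SENSITIVE_CUES)

def route(user_message: str) -> str:
    """Pick which backing model should answer this turn."""
    if looks_sensitive(user_message):
        return "safety-tuned-model"   # stricter guardrails, more pushback
    return "default-model"            # normal assistant behavior

print(route("I think I've made a secret discovery that could change the world"))
```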
The former OpenAI researcher suggests several other ways to prevent delusional spirals.
He says companies should nudge chatbot users to start new conversations more frequently; OpenAI says it does this, and acknowledges that its safeguards are less effective in longer conversations. Adler also suggests that companies use conceptual search, a way of using AI to search by concept rather than by keyword, to identify safety violations across their users.
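Conceptual search is typically built on text embeddings: messages are mapped to vectors so you can retrieve the ones whose meaning is close to a concept description, even when they share no keywords. The sketch below assumes the sentence-transformers library; the model choice, concept phrasing, and similarity threshold are illustrative, not anything Adler or OpenAI has specified.

```python
# Sketch of concept-level search over chat logs using text embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# The "concept" is described in plain language rather than as keywords.
concept = "the assistant tells the user their idea is a world-changing discovery"

messages = [
    "Your framework could genuinely rewrite modern cryptography.",
    "Here is how to format a date string in Python.",
    "Nobody else has ever connected these ideas the way you have.",
]

concept_vec = model.encode(concept, convert_to_tensor=True)
message_vecs = model.encode(messages, convert_to_tensor=True)
scores = util.cos_sim(concept_vec, message_vecs)[0]

# Surface messages whose meaning is close to the concept, even with no keyword
# overlap; 0.4 is an arbitrary review threshold for this sketch.
for text, score in zip(messages, scores):
    if float(score) > 0.4:
        print(f"{float(score):.2f}  {text}")
```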
OpenAI has taken significant steps toward addressing distressed users in ChatGPT since these stories first surfaced. The company claims GPT-5 has lower rates of sycophancy, but it remains unclear whether users will still fall down delusional rabbit holes with GPT-5 or future models.
Adler’s analysis also raises questions about how other AI chatbot providers will ensure their products are safe for struggling users. While OpenAI may put sufficient safeguards in place for ChatGPT, it seems unlikely that every company will follow suit.
