AI detection startup GPTZero scanned all 4,841 papers accepted by the prestigious Conference on Neural Information Processing Systems (NeurIPS), which took place last month in San Diego. It found 100 hallucinated citations across 51 papers that it confirmed were fake, the company tells TechCrunch.
Having a paper accepted by NeurIPS is a noteworthy achievement in the world of artificial intelligence. And since these are the top minds in AI research, one might assume they would use LLMs for the excruciatingly tedious parts of writing up their papers.
To be sure, there are several caveats to this finding: 100 confirmed hallucinations across 51 papers is not statistically significant. Each paper has dozens of citations, so out of the tens of thousands of citations across all accepted papers, this is, statistically, close to zero.
It is also important to note that an inaccurate citation does not negate a paper’s research. As NeurIPS told Fortune, which was the first to report on GPTZero’s findings, “Even if 1.1% of papers have one or more incorrect citations due to the use of LLMs, the content of the papers themselves is not necessarily invalidated.”
But having said all that, a fake citation is not nothing, either. NeurIPS prides itself on its “rigorous scientific publishing in machine learning and artificial intelligence,” as the conference itself puts it. And each paper is peer-reviewed by multiple people who are instructed to flag hallucinations.
Citations are also a kind of currency for researchers. They are used as a career metric to show how influential a researcher’s work is among their peers. When an AI fabricates them, it devalues them.
No one can blame the peer reviewers for not catching a few AI-generated citations, given the sheer volume involved, and GPTZero is quick to point this out. The goal of the exercise, the startup says in its report, was to offer concrete data on how AI is slipping through a “tsunami of submissions” that has “stretched the review pipelines of these conferences to breaking point.” GPTZero even points to a May 2025 paper called “The AI Conference Peer Review Crisis,” which discussed the problem at premier conferences, including NeurIPS.
Still, why didn’t the researchers themselves check the accuracy of the LLM’s work? They, of all people, should know the actual list of papers they drew on for their research.
What the whole thing really points to is a big, ironic takeaway: if the world’s top AI experts, with their reputations at stake, can’t ensure that their LLM usage is accurate down to the details, what does that mean for the rest of us?
