More or Less (BBC Radio 4)
Episode Summary: Did AI researchers let AI hallucinations into scientific papers?
Release date: February 21, 2026
Host: Tom Coles
Guest: Alex Tway, CTO and co-founder of GPT0
Overview
This episode of More or Less investigates claims that "100+ AI hallucinated citations" made their way into papers accepted at one of the world’s leading AI research conferences, NeurIPS. Host Tom Coles, with guest Alex Tway from GPT0, explores how these hallucinations occurred, why they're significant, and what they reveal about the intersection of AI technology and academic rigor. The discussion demystifies AI ‘hallucinations’—confident but false outputs generated by large language models—and places the issue within the high-pressure world of AI research publishing.
Key Discussion Points & Insights
What Are AI Hallucinations?
- Definition: Large language models, like ChatGPT, sometimes generate plausible-sounding but false or fictional information—known as "hallucinations."
- Analogy: Tom Coles calls it, “AI…with all the confidence of your overconfident friend” ([01:18]).
- Hallucinations can include made-up facts, statistics, or—as highlighted here—fake academic citations.
The Fortune Magazine Headline
- A headline in Fortune magazine alleges that over 100 hallucinated citations made it into published AI research papers at NeurIPS.
- Raises eyebrows: “You might think that the top AI researchers in the world would be careful about using AI to write their research papers…” — Tom Coles ([02:36]).
GPT0’s Findings
- Investigation: Alex Tway’s company, GPT0, used specialized AI tools to detect hallucinated citations among the 5,000 published papers at NeurIPS.
- Process: They verified whether each citation referenced an actual publication via "massive book scale search engines, academic databases and so on" ([06:19]); a rough sketch of this kind of check appears after this list.
- Findings:
- At least 100 hallucinated citations were found across 50+ papers—but this is not an exhaustive figure (“they just stopped counting when they found a suitable round number” — Tom Coles [06:52]).
- Types of hallucinations:
- 39 were to completely non-existent publications.
- 61 included combinations of fabricated authors, fake titles, fake URLs, or mismatched details ([07:01]).
- Example: “The authors were first name, last name and others, which I imagine is quite a coincidence that all of those three were real people.” — Alex Tway ([07:32])
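The episode does not describe GPT0's tooling in any detail, but the basic check Tway describes (looking up each cited title to see whether the publication actually exists) can be sketched in a few lines. The example below is a hypothetical illustration using the public Crossref REST API; it is not GPT0's actual pipeline, and the sample titles are placeholders.

```python
# Hypothetical sketch of the kind of citation check described in the episode:
# look up each cited title in a public academic database (here, Crossref)
# and flag citations with no close match. This is NOT GPT0's actual pipeline.
import requests
from difflib import SequenceMatcher

def find_close_match(cited_title: str) -> dict | None:
    """Query Crossref for the cited title; return the best-matching record, if any."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": cited_title, "rows": 5},
        timeout=10,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        found_title = (item.get("title") or [""])[0]
        similarity = SequenceMatcher(None, cited_title.lower(), found_title.lower()).ratio()
        if similarity > 0.9:  # title closely matches an indexed publication
            return item
    return None  # no plausible match: candidate hallucinated citation

# Placeholder examples, not taken from the papers discussed in the episode
citations = [
    "Attention Is All You Need",                      # real paper, should match
    "A Totally Fabricated Survey of Nothing (2024)",  # made up, should not
]
for title in citations:
    match = find_close_match(title)
    print(f"{title!r}: {'found' if match else 'possible hallucination'}")
```

A real checker would also compare author lists and URLs against the matched record, since, as Tway notes, many of the flagged citations mixed real titles with fabricated authors or fake links.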
Why Does It Happen?
- High pressure to publish well-credentialed research.
- “Having a couple of papers in these conferences can get you an OpenAI job… can mean raising $100 million from investors.” — Alex Tway ([04:43])
- Researchers may use AI “to write the boring bits of their papers” due to time constraints and workload ([07:41]).
- Irony noted: Even those advancing the field of AI are tripped up by its limitations.
The Reliability Problem
- AI-generated text sometimes slips through without proper human review.
- “If we can’t trust that your paper is even a human reviewed, so the AI is making mistakes in your paper and you’re not catching it, then how can you trust that everything else created by the researcher was also reviewed by human and not hallucinated by AI?” — Alex Tway ([08:04])
Conference Organizers’ Response
- NeurIPS organizers acknowledge that hallucinations can escape peer review and that researchers use AI as writing assistants.
- They do not believe these errors invalidate the research itself, but are updating guidance for authors and reviewers as AI use evolves ([08:47]).
Societal and Field Implications
- Citation errors may disproportionately affect non-Anglophone researchers.
- “We found that it would just start chaining together highly likely names of researchers... just a string of 10 three letter acronyms and you could just tell that the LLM thinks oh, if I had to make up a citation…just write Chinese names.” — Alex Tway ([09:29])
- This suggests that hallucinated citations can amplify existing biases and risk marginalizing non-Anglophone researchers in global research culture.
Notable Quotes & Memorable Moments
- Tom Coles, on LLMs’ "confidence": “Despite having all the confidence of your overconfident friend, some of the stuff that AI engines like ChatGPT, Gemini, Grok, or Claude confidently tells you is essentially made up.” ([01:18])
- Alex Tway, on being hallucinated: "In some ways it's a weird point of pride…to be hallucinated by an AI. That's definitely one sign that you've made it in the industry." ([03:09])
- Alex Tway, on fabricated citations: "About 39 were just completely non existent publications… The other 61 had a combination of fabricated authors, people who don't exist or exist but never wrote a paper like that, fake titles, fake links…" ([07:01])
- Tom Coles, joking about the citation "authors first name, last name, and others": "We asked professor first name and doctor last name for comment, but didn't hear back." ([07:41])
- Alex Tway, about trust in research: "…If we can't trust that your paper is even a human reviewed, so the AI is making mistakes in your paper and you're not catching it, then how can you trust that everything else created by the researcher was also reviewed by human and not hallucinated by AI?" ([08:04])
Timestamps for Key Segments
- [01:18] — Introduction to AI hallucinations and their relevance in research
- [02:27] — Fortune headline and the concern about research integrity
- [03:00 – 07:01] — Alex Tway details how hallucinated citations were uncovered
- [07:01 – 07:41] — Examples of hallucinated citations (“first name, last name and others”)
- [08:04] — Why hallucinations in citations are a trust problem
- [08:47] — NeurIPS organizers’ perspective and evolving guidelines
- [09:29] — Biases in hallucinated citations, especially affecting non-Anglophone names
Conclusion
This episode offers a concise yet nuanced look at how AI-generated errors are infiltrating elite scientific publishing. While the overall risk to research integrity may be low for now, the story exposes vulnerabilities at the intersection of academic culture and rapidly evolving AI capabilities. Coles and Tway use wit and clear analogies to make technical issues accessible and relevant for a broad audience, reminding us to temper our trust in AI, especially when it comes to the fine print.
For questions or suggestions on numbers in the news, listeners are encouraged to contact moreorless@bbc.co.uk.
