Chatbots struggle with news accuracy and sourcing ahead of US midterms

When ChatGPT, Claude and Gemini returned biased answers, they aligned with the political left, and Grok primarily tilted in favour of the political right.

PHOTO: AFP

SAN FRANCISCO – Four major artificial intelligence chatbots – OpenAI’s ChatGPT, Alphabet Inc’s Google Gemini, Anthropic’s Claude and xAI’s Grok – are struggling to fairly and accurately answer questions about elections and geopolitics, according to a new study from Forum AI.

Researchers asked the four chatbots more than 3,100 questions about a wide range of news topics, like politics, healthcare and foreign affairs.

They found that the collective answers about elections, in particular, “failed on accuracy, bias, or source selection 90 per cent of the time.”

Nearly 36 per cent of answers to questions about elections contained at least one factual error; Grok – the most egregious offender – returned an error nearly 52 per cent of the time.

When ChatGPT, Claude and Gemini returned biased answers, they aligned with the political left, and Grok primarily tilted in favour of the political right.

All four models also routinely relied on foreign, state-owned media as reliable sources of information.

In 35 per cent of responses to foreign policy questions, the chatbots cited state-controlled sources such as China’s Global Times or CGTN, or Russia’s RT.

ChatGPT and Grok were the worst offenders, citing state-owned media 51 per cent and 44 per cent of the time, respectively.

In many cases, the chatbots delivered biased or inaccurate information with a confidence that made it even more misleading, the study found.

“The most professional-looking answers, backed by strongest-looking citations, were also the most likely to contain buried factual errors,” Forum said on May 20 in a statement, calling it one of the study’s “sharpest findings.”

Chatbots often struggle with news accuracy, especially on breaking stories where there is limited information available online.

AI models that power chatbots are often trained on wide swaths of data found on the open web, a notoriously untrustworthy source of facts and nuance. 

Ms Campbell Brown, chief executive officer of Forum AI and a former head of news partnerships at Meta Platforms Inc, said she is particularly concerned about the study’s results given the looming midterm election cycle.

Few people use chatbots for news today, but that number will undoubtedly increase over time as they continue to siphon off queries that used to go to Google’s search engine.

Ms Brown conducted the study in the hopes of holding the model makers more accountable.

The struggle with news accuracy may encourage them to prioritise these types of queries in the same way they put math- or coding-focused interactions first, she said.

“We’d welcome the opportunity to review the underlying data behind this report,” an Anthropic spokesperson said.

“Claude is trained to be politically even-handed in its responses, and to treat opposing viewpoints with equal depth, engagement, and quality of analysis, without bias towards any particular ideological position.”

None of the other three model makers commented for this story.

“Independent evaluation is important,” said Ms Brown, who co-founded Forum AI in 2025.

The startup used its own AI model to grade the chatbots’ answers, building it with input from a range of industry experts who have spent decades studying foreign affairs and geopolitics.

“The model companies are essentially grading their own homework,” Ms Brown continued.

“And it’s really important that there be companies outside of the model companies that are doing this work and sharing the results.”

Major social platforms like Meta and Google’s YouTube have historically shied away from fact-checking, particularly for topics that are widely polarising and politically charged, claiming they don’t want to be the arbiters of truth for the rest of the internet. 

Ms Brown believes AI companies will be different.

“At Meta, you’re optimising for engagement. And if you’re optimising for engagement, it’s also hard to optimise for accuracy,” she said.

AI companies that sell their models to enterprise clients are in a different situation, Ms Brown added.

Those paying customers will expect accuracy as a baseline. 

“I just think it’s an entirely different product at the end of the day,” she said. BLOOMBERG
