AI assistants fail basic fact-checking in BBC news study

Tanveer February 15, 2025

A systematic evaluation of leading AI chatbots reveals widespread problems with accuracy and reliability when handling news content.

The study, conducted by the BBC, tested ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity on their ability to accurately report current events.

In December 2024, 45 BBC journalists evaluated how these AI systems handled 100 current news questions. They assessed responses across seven key areas: accuracy, source attribution, impartiality, fact-opinion separation, commentary, context, and proper handling of BBC content. Each response was rated from “no issues” to “significant issues.”

Overall, 51 percent of AI responses contained significant issues, ranging from basic factual errors to completely fabricated information. When the systems specifically cited BBC content, 19 percent of responses contained errors, while 13 percent contained either fabricated or misattributed quotes.

[Figure: BBC analysis of AI assistants, showing quality issues by category and a comparison of ChatGPT, Copilot, Gemini and Perplexity. Google Gemini had the highest rate of problematic responses at more than 60 percent. Accuracy and source support have room for improvement in all systems tested. Image: via BBC]

From health advice to current events: AI systems struggle with accuracy

Some of the errors could have real-world consequences. Google Gemini incorrectly claimed that the UK’s National Health Service (NHS) advises against vaping, when in fact the health authority recommends e-cigarettes to help people quit smoking. Perplexity AI fabricated details about science journalist Michael Mosley’s death, while ChatGPT failed to acknowledge the death of a Hamas leader, describing him as an active leader months after his passing.

The AI assistants regularly cited outdated information as current news, failed to separate opinions from facts, and dropped crucial context from their reporting. Microsoft Copilot, for instance, presented a 2022 article about Scottish independence as if it were current news.

[Figure: Four bar charts comparing the AI assistants on impartiality, fact-opinion separation, editorialization, and context provision. Among all the AI tools tested, Perplexity performed most consistently across these challenges. Image: via BBC]

The BBC set a high bar in its evaluation – even small mistakes counted as “significant issues” if they might mislead someone reading the response. And while the standards were tough, the problems they found match what other researchers have already seen about how AI stumbles when handling news.

Take one of the more striking examples: Microsoft’s Bing chatbot got so confused reading court coverage that it accused a journalist of committing the very crimes he was reporting on.

The BBC says it will run this study again in the near future. Adding independent reviewers and comparing how often humans make similar mistakes could make future studies even more useful – it would help show just how big the gap is between human and AI performance.

Scale of AI news distortion remains unknown, BBC warns

The BBC acknowledges that its research, while revealing, only begins to uncover the full scope of the problem, and that tracking these errors is difficult. “The scale and scope of errors and the distortion of trusted content is unknown,” the BBC report states.

AI assistants can provide answers to an almost unlimited range of questions, and different users might receive entirely different responses when asking the same question. This inconsistency makes systematic evaluation extremely difficult.

The problem extends beyond just users and journalists. Media companies and regulators lack the tools to fully monitor or measure these distortions. Perhaps most concerning, the BBC suggests that even the AI companies themselves may not know the true extent of their systems’ errors.

“Regulation may have a key role to play in helping ensure a healthy information ecosystem in the AI age,” the BBC writes.
