
How Can We Stop AI Hallucinations? A Guide to Knowledge Overshadowing And Hallucination-Free LLMs

A new study finds that AI hallucinations stem from “knowledge overshadowing,” in which frequently encountered information suppresses rarer facts, following a predictable mathematical pattern that enables both prediction and prevention of AI-generated misinformation.

Researchers from the University of Illinois Urbana-Champaign, Columbia University, Northwestern University, and Stanford University identify knowledge overshadowing as the fundamental mechanism behind these errors, and they offer both predictive tools and a practical decoding method to combat false information generated by large language models.

What Are AI Hallucinations?

AI hallucinations arise when language models generate statements that sound plausible but are factually incorrect. Unlike simple misinformation, these errors are often subtle, mixing real knowledge with distortions. The study provides an example: when asked to name a famous singer in North Korea, an AI model incorrectly suggests “Kim Jong Un,” conflating widely known information (the North Korean leader’s name) with an unrelated category (famous singers).

Previous studies have linked hallucinations to data quality issues, inadequate fine-tuning, or inherent biases in how AI models weigh different pieces of information. However, the new research demonstrates that hallucinations persist even when the training data is strictly factual. This suggests that the problem lies not in what the models learn but in how they prioritize and retrieve information.

Understanding Knowledge Overshadowing

The study identifies knowledge overshadowing as a major driver of hallucinations. When one piece of information appears far more frequently than another in the training data, the more common knowledge suppresses the less common knowledge, and the model substitutes the dominant fact for the rarer one it was actually asked about.

The researchers discovered that the likelihood of a hallucination follows a predictable pattern. Their “log-linear law” shows that as the frequency of dominant knowledge increases, the probability of overshadowing—and thus hallucination—rises proportionally to the logarithm of that frequency. A similar effect occurs when knowledge length (the number of words in a fact) increases or when the model size grows.
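
The paper’s exact formulation is not reproduced in this article, so the relationship can only be sketched. In the rough form below, the notation is assumed for illustration: R is the overshadowing (hallucination) rate, P the relative frequency of the dominant knowledge, L the knowledge length, S the model size, and the Greek letters are fitted constants.

```latex
% Rough sketch of the reported log-linear relationship (notation assumed,
% not the paper's exact formula): the overshadowing rate grows linearly in
% the logarithms of relative frequency, knowledge length, and model size.
R \;\approx\; \alpha \log P \;+\; \beta \log L \;+\; \gamma \log S \;+\; c
```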

This insight has important implications for large AI models. As models scale up, their ability to generalize improves, but their tendency to hallucinate also increases because they compress and simplify knowledge representations. This compression causes less frequent facts to be absorbed into dominant knowledge structures, increasing the risk of factual distortions.

Can We Predict and Prevent Hallucinations?

A key contribution of the study is its ability to predict hallucinations before they occur. By applying the log-linear law, researchers can estimate when a model is likely to hallucinate based on the characteristics of its training data. This predictive capability provides AI developers with a tool to diagnose and address hallucination risks before deploying models in real-world settings.
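
As an illustration only, a fitted law of this shape could be turned into a simple pre-deployment risk check. The function below is hypothetical (the coefficients, argument names, and clamping are not from the study); it merely shows how a log-linear estimate would be used in practice.

```python
import math

def predicted_overshadowing_rate(rel_frequency: float,
                                 knowledge_length: int,
                                 model_size: float,
                                 alpha: float = 0.1,
                                 beta: float = 0.1,
                                 gamma: float = 0.05,
                                 bias: float = 0.0) -> float:
    """Toy estimate of hallucination risk from a fitted log-linear law.

    All coefficients are placeholders; in practice they would be fitted on
    held-out data where overshadowing errors can be counted directly.
    """
    score = (alpha * math.log(rel_frequency)      # how much more common the dominant fact is
             + beta * math.log(knowledge_length)  # number of words in the fact
             + gamma * math.log(model_size)       # parameter count
             + bias)
    # Clamp so the score can be read as a rate between 0 and 1.
    return min(max(score, 0.0), 1.0)

# Example: a fact 100x rarer than a competing, dominant fact.
print(predicted_overshadowing_rate(rel_frequency=100.0,
                                   knowledge_length=12,
                                   model_size=7e9))
```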

To mitigate hallucinations, the researchers propose a new method called “Contrastive Decoding to Amplify Overshadowed Knowledge” (CoDA). This technique works by identifying overshadowed knowledge and boosting its influence during text generation. Rather than retraining the model with new data, CoDA adjusts the model’s decoding process to balance dominant and less dominant knowledge sources.
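
CoDA’s exact procedure belongs to the paper, but it sits in the broader family of contrastive decoding, which can be sketched as follows. The toy PyTorch function below is an assumption-laden illustration, not the authors’ implementation: it presumes you can obtain next-token logits both for the original prompt and for a variant that isolates the dominant, overshadowing knowledge, and it then boosts tokens that the dominant knowledge alone would not predict.

```python
import torch

def contrastive_decode_step(logits_full: torch.Tensor,
                            logits_dominant: torch.Tensor,
                            alpha: float = 1.0) -> int:
    """One greedy decoding step of a generic contrastive scheme (illustrative).

    logits_full:     next-token logits for the original prompt.
    logits_dominant: next-token logits for a prompt variant that isolates
                     the dominant (overshadowing) knowledge.
    Boosting the difference favors tokens supported by the full prompt but
    not by the dominant knowledge alone, i.e. the overshadowed knowledge.
    """
    log_p_full = torch.log_softmax(logits_full, dim=-1)
    log_p_dominant = torch.log_softmax(logits_dominant, dim=-1)
    adjusted = log_p_full + alpha * (log_p_full - log_p_dominant)
    return int(torch.argmax(adjusted).item())
```

How the contrasting prompt is constructed and how strongly the difference is weighted determine whether such an adjustment surfaces the overshadowed fact or simply adds noise.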

Experiments with CoDA show significant improvements in factual accuracy. When tested on datasets designed to assess AI factuality, CoDA reduced hallucination rates by 27.9% on the Overshadow dataset, 13.1% on MemoTrap, and 18.3% on NQ-Swap—three benchmarks used to measure AI-generated misinformation.

Implications for AI Development

The findings suggest a fundamental shift in how AI developers should approach hallucinations. Instead of treating them as mere data-quality issues, developers should recognize that hallucinations stem from the structure of knowledge within AI models. Understanding knowledge overshadowing allows for more precise interventions, such as adjusting training data distributions or using methods like CoDA to counteract biases.

The study also challenges the assumption that bigger AI models are always better. While increasing model size generally improves performance, it also exacerbates hallucinations due to greater compression of information. This means that future AI development must balance model size with strategies to manage knowledge overshadowing.

Limitations and Future Work

While the study offers new insights, it also acknowledges limitations. The researchers were unable to analyze the training data of proprietary models like OpenAI’s GPT-4, making it difficult to directly validate their findings on state-of-the-art commercial AI systems. Additionally, quantifying real-world knowledge distributions remains a challenge, as natural language data is inherently noisy and imprecise.

It’s worth noting that the researchers published their findings on arXiv, a preprint server. Preprint servers help researchers get rapid feedback from colleagues in fast-moving fields such as artificial intelligence, but the work has not yet undergone formal peer review.

The Road Ahead

Future work could explore how knowledge overshadowing interacts with other AI mechanisms, such as reinforcement learning with human feedback (RLHF), which is commonly used to fine-tune models. Researchers also plan to refine methods like CoDA to work more effectively with larger models and real-world datasets.

As AI systems become more deeply integrated into industries that rely on accurate information, addressing hallucinations will be critical. The study’s identification of knowledge overshadowing as a primary cause — and its development of predictive and corrective measures — represents a step toward making AI-generated content more reliable.

Researchers and Affiliations

The study was conducted by Yuji Zhang, Sha Li, Cheng Qian, Jiateng Liu, Pengfei Yu, Chi Han, Yi R. Fung, Chengxiang Zhai, and Heng Ji of the University of Illinois Urbana-Champaign; Kathleen McKeown of Columbia University; and Manling Li, affiliated with both Northwestern University and Stanford University.
