Overview
As Large Language Models (LLMs) become increasingly integrated into critical systems, detecting hallucinations reliably has emerged as a crucial challenge for AI safety. While much attention has focused on detecting hallucinations at either the token or sentence level, our research suggests both approaches miss important nuances in how LLMs generate and manipulate information. This post examines a specific challenge we've encountered in entity-level hallucination detection: the persistent problem of false positives across multiple detection methods.
The Current Landscape and Our Approach
Hallucination detection has traditionally operated at either the token level (examining individual words) or the sentence level (evaluating entire statements). Our research suggests an intermediate approach: focusing on entities, the coherent semantic units such as names, dates, and quantities that carry a statement's factual claims.
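To make the unit of analysis concrete, here is a minimal sketch of entity-level checking. It assumes spaCy's pretrained `en_core_web_sm` NER model as the entity extractor and a naive verbatim-match grounding test against a reference context; both choices are illustrative stand-ins rather than the detection methods evaluated in this post, and the string-match heuristic is exactly the kind of shortcut that produces the false positives discussed here.

```python
import spacy

# Load a small English pipeline with a pretrained NER component.
# (Assumes `python -m spacy download en_core_web_sm` has been run.)
nlp = spacy.load("en_core_web_sm")


def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (surface form, label) pairs for entities found in `text`."""
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


def flag_ungrounded_entities(generated: str, context: str) -> list[tuple[str, str]]:
    """Flag entities in `generated` that never appear verbatim in `context`.

    The case-insensitive substring match is a deliberately naive grounding
    test: paraphrases, coreferent mentions, and correct world knowledge that
    is simply absent from the context all get flagged, which is exactly the
    false-positive behavior this post examines.
    """
    context_lower = context.lower()
    return [
        (surface, label)
        for surface, label in extract_entities(generated)
        if surface.lower() not in context_lower
    ]


if __name__ == "__main__":
    context = "Marie Curie won the Nobel Prize in Physics in 1903."
    generated = "Curie, born in Warsaw, shared the 1903 Nobel Prize in Physics."
    # Entities like "Warsaw" are factually correct but absent from the
    # context, so this naive checker flags them as potential hallucinations.
    print(flag_ungrounded_entities(generated, context))
```

Even this toy example shows why the choice of grounding test matters as much as the choice of unit: the extractor isolates the right spans, but a brittle comparison step still turns correct entities into false alarms.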