
HalluClean: A Unified Framework to Combat Hallucinations in LLMs

About

Large language models (LLMs) have achieved impressive performance across a wide range of natural language processing tasks, yet they often produce hallucinated content that undermines factual reliability. To address this challenge, we introduce HalluClean, a lightweight and task-agnostic framework for detecting and correcting hallucinations in LLM-generated text. HalluClean adopts a reasoning-enhanced paradigm, explicitly decomposing the process into planning, execution, and revision stages to identify and refine unsupported claims. It employs minimal task-routing prompts to enable zero-shot generalization across diverse domains, without relying on external knowledge sources or supervised detectors. We conduct extensive evaluations on five representative tasks: question answering, dialogue, summarization, math word problems, and contradiction detection. Experimental results show that HalluClean significantly improves factual consistency and outperforms competitive baselines, demonstrating its potential to enhance the trustworthiness of LLM outputs in real-world applications.
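The abstract describes a three-stage plan/execute/revise loop driven by prompts. The paper's actual prompts and routing logic are not given here, so the following is only a minimal sketch of that control flow; the prompt texts, the `halluclean` function, and the `stub_llm` callable are all hypothetical illustrations, with any text-completion function standing in for the underlying model.

```python
# Hypothetical sketch of a HalluClean-style plan -> execute -> revise loop.
# The prompt wording below is invented for illustration; `llm` is any
# text-completion callable (a stub is provided so the sketch runs end to end).

PLAN_PROMPT = "List the factual claims made in the text:\n{text}"
EXECUTE_PROMPT = "For each claim, judge whether it is supported:\n{claims}"
REVISE_PROMPT = (
    "Rewrite the text, correcting unsupported claims:\n{text}\n"
    "Verdicts:\n{verdicts}"
)

def halluclean(text: str, llm) -> str:
    """One detect-and-correct pass over `text` using three prompted stages."""
    # Planning: decompose the output into individual factual claims.
    claims = llm(PLAN_PROMPT.format(text=text))
    # Execution: check each claim and produce supported/unsupported verdicts.
    verdicts = llm(EXECUTE_PROMPT.format(claims=claims))
    # Revision: rewrite the text so unsupported claims are removed or fixed.
    return llm(REVISE_PROMPT.format(text=text, verdicts=verdicts))

def stub_llm(prompt: str) -> str:
    """Toy stand-in for a real model, keyed on the prompt prefix."""
    if prompt.startswith("List"):
        return "1. The Eiffel Tower is in Berlin."
    if prompt.startswith("For each"):
        return "1. UNSUPPORTED"
    return "The Eiffel Tower is in Paris."

print(halluclean("The Eiffel Tower is in Berlin.", stub_llm))
```

Because each stage is just a prompt over the previous stage's output, the same loop generalizes across tasks by swapping the routing prompt, which matches the zero-shot, detector-free design the abstract claims.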

Yaxin Zhao, Yu Zhang • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Hallucination Detection | HaluEval | F1 Score | 71.5 | 75 |
| Hallucination Detection | PubMedQA | F1 Score | 81.7 | 36 |
| Hallucination Detection | HaluEval Sum | F1 Score | 65.9 | 12 |
| Hallucination Detection (Dialogue) | HaluEval DA | F1 Score | 77.1 | 12 |
| Hallucination Detection (Math Word Problems) | UMWP | F1 Score | 89.1 | 12 |
| Hallucination Detection (Self-contradictory Hallucinations) | ChatProtect SC | F1 Score | 87.0 | 12 |
| Hallucination Detection | HalluQA | Accuracy | 55.0 | 12 |
| Dialogue Analysis | DA | R Metric | 92.5 | 10 |
| Scientific Claims | SC | R Score | 87.3 | 10 |
| Summarization | SUM | ROUGE Score (R) | 59.5 | 10 |

Showing 10 of 15 rows.
