
CausalAbstain: Enhancing Multilingual LLMs with Causal Reasoning for Trustworthy Abstention

About

Large Language Models (LLMs) often exhibit knowledge disparities across languages. Encouraging LLMs to abstain when faced with knowledge gaps is a promising strategy for reducing hallucinations in multilingual settings. Current abstention strategies for multilingual scenarios primarily rely on generating feedback in various languages using LLMs and performing self-reflection. However, these methods can be adversely affected by inaccuracies and biases in the generated feedback. To address this, we introduce CausalAbstain, a causally motivated method that helps LLMs determine whether to utilize multiple generated feedback responses and how to identify the most helpful ones. Extensive experiments demonstrate that CausalAbstain effectively selects helpful feedback and enhances abstention decisions with interpretability in both native-language (Causal-native) and multilingual (Causal-multi) settings, outperforming strong baselines on two benchmark datasets covering encyclopedic and commonsense knowledge QA tasks. Our code and data are open-sourced at https://github.com/peachch/CausalAbstain.
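The abstract describes deciding whether to trust multilingual feedback and abstaining when trustworthy support is lacking. The sketch below is a hypothetical illustration of that general idea only, not the authors' CausalAbstain algorithm: the `Feedback` fields, the helpfulness filter, and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    """One piece of LLM-generated feedback on a draft answer (illustrative)."""
    language: str
    agrees_with_answer: bool  # does this feedback endorse the draft answer?
    helpfulness: float        # estimated usefulness in [0, 1] (assumed score)

def should_abstain(feedbacks, min_helpfulness=0.5, support_threshold=0.6):
    """Abstain unless sufficiently helpful feedback supports the answer.

    Feedback below `min_helpfulness` is discarded; the rest cast
    helpfulness-weighted votes for or against answering.
    """
    useful = [f for f in feedbacks if f.helpfulness >= min_helpfulness]
    if not useful:
        return True  # no trustworthy feedback at all -> abstain
    support = sum(f.helpfulness for f in useful if f.agrees_with_answer)
    total = sum(f.helpfulness for f in useful)
    return (support / total) < support_threshold

feedbacks = [
    Feedback("en", agrees_with_answer=True, helpfulness=0.9),
    Feedback("zh", agrees_with_answer=True, helpfulness=0.7),
    Feedback("fr", agrees_with_answer=False, helpfulness=0.3),  # filtered out
]
print(should_abstain(feedbacks))  # -> False: strong support, so answer
```

The key design point this toy mirrors is that feedback quality is assessed before aggregation, so one noisy or biased feedback response cannot by itself force an abstention.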

Yuxi Sun, Aoqi Zuo, Wei Gao, Jing Ma • 2025

Related benchmarks

Task | Dataset | Result | Rank
Multilingual Commonsense Reasoning | M-Hellaswag | Accuracy (zh): 79.2 | 21
Multilingual Knowledge Evaluation | m-MMLU | Accuracy (zh): 80.5 | 21
Multilingual Language Understanding | M-MMLU (test) | Accuracy (zh): 61.2 | 14

Other info

Code: https://github.com/peachch/CausalAbstain
