Shared Doubt: Zero-Shot Cross-Lingual Confidence Estimation for Language Models

About

Confidence estimation (CE), i.e., quantifying the reliability of a model's prediction, has attracted great interest in the context of large language models (LLMs). However, most studies focus on English, ignoring the multilingual reality of LLM usage, while many CE methods degrade or require retraining across languages. To address this gap, we investigate whether multilingual LLMs encode shared, language-transferable confidence features in open-ended question answering. We use a lightweight linear probe that predicts answer correctness directly from intermediate representations. Trained monolingually, the probe generalizes zero-shot to unseen, typologically diverse languages without target-language supervision. Learned layer weights and multiple ablations reveal that confidence features concentrate in middle layers across languages, suggesting a shared confidence subspace. While zero-shot cross-lingual performance depends on similarity to the source language, the probe provides a strong baseline without any retraining and compares favorably to other popular confidence estimation methods.
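As a rough illustration of the idea described above, the sketch below trains a linear probe on per-layer hidden states with a learned softmax weighting over layers, then scores an unseen "target language" by AUROC without any retraining. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the probe's parametrization, so the softmax layer mixture, the logistic-regression head, the synthetic data, and all names (`train_probe`, `probe_scores`, `make_data`, `signal_layer`) are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def auroc(scores, labels):
    """Chance that a random correct answer outscores a random incorrect one."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def probe_scores(H, a, w, b):
    """H: (n, layers, dim) hidden states. Returns confidence in (0, 1)."""
    mixed = np.einsum("l,nld->nd", softmax(a), H)  # layer-weighted mixture
    return sigmoid(mixed @ w + b)

def train_probe(H, y, steps=300, lr=0.5):
    """Fit layer logits `a` and a linear head (w, b) by gradient descent
    on the binary cross-entropy between probe output and correctness y."""
    n, n_layers, dim = H.shape
    a, w, b = np.zeros(n_layers), np.zeros(dim), 0.0
    for _ in range(steps):
        alpha = softmax(a)
        mixed = np.einsum("l,nld->nd", alpha, H)
        g = (sigmoid(mixed @ w + b) - y) / n       # dLoss/dlogit per example
        grad_w = mixed.T @ g
        grad_b = g.sum()
        grad_alpha = (H @ w).T @ g                 # dLoss/dalpha, shape (layers,)
        grad_a = alpha * (grad_alpha - alpha @ grad_alpha)  # softmax Jacobian
        w -= lr * grad_w
        b -= lr * grad_b
        a -= lr * grad_a
    return a, w, b

def make_data(rng, n, direction, offset, signal_layer=3, n_layers=6):
    """Synthetic stand-in for LLM hidden states (an assumption for this demo):
    correctness is encoded along one shared direction in a middle layer,
    and each 'language' adds its own constant offset."""
    y = rng.integers(0, 2, size=n)
    H = rng.normal(size=(n, n_layers, direction.size)) + offset
    H[:, signal_layer, :] += 2.0 * (2 * y - 1)[:, None] * direction
    return H, y

rng = np.random.default_rng(0)
dim = 8
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

# "Source language": used for training. "Target language": never seen in training.
H_src, y_src = make_data(rng, 300, direction, offset=np.zeros(dim))
H_tgt, y_tgt = make_data(rng, 200, direction, offset=rng.normal(size=dim))

a, w, b = train_probe(H_src, y_src)
layer_weights = softmax(a)
target_auroc = auroc(probe_scores(H_tgt, a, w, b), y_tgt)

print("learned layer weights:", np.round(layer_weights, 3))
print("zero-shot target AUROC:", round(target_auroc, 3))
```

In this toy setup the transfer works only because both synthetic languages share the same correctness direction by construction; whether real multilingual LLMs behave this way is the question the paper investigates. The language-specific offset shifts every target logit by the same constant, which leaves the ranking, and hence AUROC, unchanged.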

Athina Kyriakou, Dennis Ulmer, Ivan Titov • 2026

Related benchmarks

Task	Dataset	Result
Confidence Estimation	MKQA (test)	AUROC 0.82	14
Confidence Estimation	Global MMLU (test)	AUROC 0.78	14
Confidence Estimation	MKQA Spanish es (test)	AUROC 82	5
Confidence Estimation	Global-MMLU Spanish es (test)	AUROC 74	5
Confidence Estimation	MKQA Russian / ru (test)	AUROC 80	5
Confidence Estimation	Global-MMLU Russian (test)	AUROC 72	5
Confidence Estimation	Global-MMLU Japanese ja (test)	AUROC 72	5
Confidence Estimation	MKQA Japanese ja (test)	AUROC 64	5
Confidence Estimation	MKQA French	AUROC 0.82	2
Confidence Estimation	MKQA English	AUROC 76	2

Showing 10 of 22 rows
