Shared Doubt: Zero-shot Cross-Lingual Confidence Estimation for Language Models

About

Confidence estimation (CE), i.e. quantifying the reliability of a model's prediction, has attracted great interest in the context of large language models (LLMs). However, most studies focus on English, ignoring the multilingual reality of LLM usage, while many CE methods degrade or require retraining across languages. To address this gap, we investigate whether multilingual LLMs encode shared, language-transferable confidence features. We use a lightweight linear probe that predicts answer correctness directly from intermediate representations. Trained monolingually, the probe generalizes zero-shot to unseen, typologically diverse languages without target-language supervision. Learned layer weights and multiple ablations reveal that confidence features concentrate in middle layers across languages, suggesting a shared confidence subspace. While zero-shot cross-lingual performance depends on similarity to the source language, the probe provides a strong baseline without any retraining and compares favorably to other popular confidence estimation methods.
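The probe described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the softmax parameterization of the layer weights, the plain gradient-descent training loop, the synthetic hidden states, and all names (`train_probe`, `predict_confidence`, `toy_hidden_states`) are assumptions made for illustration. It shows one way a linear probe with learned per-layer weights could be trained on a source language and then applied unchanged to a target language.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

def predict_confidence(H, layer_logits, theta, bias):
    """Map hidden states H (n_examples, n_layers, d_model) to P(answer correct)."""
    w = softmax(layer_logits)              # learned layer weights, sum to 1
    z = np.einsum("l,nld->nd", w, H)       # weighted mix of layer representations
    return sigmoid(z @ theta + bias)

def train_probe(H, y, steps=300, lr=0.5):
    """Fit layer weights and a linear probe by gradient descent on cross-entropy."""
    n, n_layers, d = H.shape
    layer_logits, theta, bias = np.zeros(n_layers), np.zeros(d), 0.0
    for _ in range(steps):
        w = softmax(layer_logits)
        z = np.einsum("l,nld->nd", w, H)
        p = sigmoid(z @ theta + bias)
        g = (p - y) / n                    # gradient of mean BCE w.r.t. the logit
        grad_w = (H @ theta).T @ g         # gradient w.r.t. each layer weight
        grad_logits = w * (grad_w - w @ grad_w)  # chain rule through the softmax
        grad_theta = z.T @ g
        grad_bias = g.sum()
        layer_logits -= lr * grad_logits
        theta -= lr * grad_theta
        bias -= lr * grad_bias
    return layer_logits, theta, bias

def toy_hidden_states(rng, n, n_layers=6, d=8, signal_layer=3, strength=2.0):
    """Synthetic stand-in for LLM activations: one layer carries the correctness signal."""
    y = rng.integers(0, 2, size=n).astype(float)
    H = rng.normal(size=(n, n_layers, d))
    direction = np.ones(d) / np.sqrt(d)
    H[:, signal_layer, :] += strength * (2 * y - 1)[:, None] * direction
    return H, y

rng = np.random.default_rng(0)
H_src, y_src = toy_hidden_states(rng, 400)                # "source language"
H_tgt, y_tgt = toy_hidden_states(rng, 400, strength=1.5)  # "target language", weaker signal

layer_logits, theta, bias = train_probe(H_src, y_src)

# Zero-shot transfer: the probe is applied to the target data without retraining.
conf_tgt = predict_confidence(H_tgt, layer_logits, theta, bias)
acc_tgt = float(((conf_tgt > 0.5) == (y_tgt == 1)).mean())
print("layer weights:", np.round(softmax(layer_logits), 3))
print("target accuracy:", acc_tgt)
```

In this toy setup the target data shares the source's signal direction by construction, which is the property the paper investigates in real multilingual models. Inspecting `softmax(layer_logits)` after training is the analogue of reading off which layers the probe relies on.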

Athina Kyriakou, Dennis Ulmer, Ivan Titov • 2026

Related benchmarks

Task                   Dataset                          Result      Rank
Confidence Estimation  MKQA (test)                      AUROC 0.82  14
Confidence Estimation  Global MMLU (test)               AUROC 0.78  14
Confidence Estimation  MKQA Spanish es (test)           AUROC 82    5
Confidence Estimation  Global-MMLU Spanish es (test)    AUROC 74    5
Confidence Estimation  MKQA Russian / ru (test)         AUROC 80    5
Confidence Estimation  Global-MMLU Russian (test)       AUROC 72    5
Confidence Estimation  Global-MMLU Japanese ja (test)   AUROC 72    5
Confidence Estimation  MKQA Japanese ja (test)          AUROC 64    5
Confidence Estimation  MKQA French                      AUROC 0.82  2
Confidence Estimation  MKQA English                     AUROC 76    2
Showing 10 of 22 rows
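
For readers unfamiliar with the metric in the rows above, AUROC is the probability that a randomly chosen correct answer receives a higher confidence score than a randomly chosen incorrect one, with ties counted as one half. The sketch below computes it directly from that definition; the function name `auroc` and the example scores are hypothetical, and it is not the evaluation code behind the listed results. The rows appear to mix a 0 to 1 scale with a 0 to 100 scale, so the last line shows the conversion.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (correct, incorrect) pairs ranked in the right order."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one correct and one incorrect example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical confidences and correctness labels (1 = answer was correct).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]

value = auroc(scores, labels)   # 3 of 4 pairs are ordered correctly -> 0.75
percent = 100 * value           # same result on the 0-100 scale -> 75.0
print(value, percent)
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based computation is the usual choice for large test sets.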
