Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation

About

LLM self-evaluation relies on the LLM's own ability to estimate response correctness, which can greatly improve its deployment reliability. In this work, we propose the Chain-of-Embedding (CoE) in the latent space to enable LLMs to perform output-free self-evaluation. CoE consists of all progressive hidden states produced during inference, which can be treated as the latent thinking path of LLMs. We find that the CoE features of an LLM differ between correct and incorrect responses, and these discrepancies help us estimate response correctness. Experiments across four diverse domains and seven LLMs demonstrate the effectiveness of our method. Because the method is label-free, requires no training, and adds only millisecond-level computational cost, it can provide real-time feedback in large-scale scenarios. More importantly, we provide insights into LLM response correctness from the perspective of hidden state changes inside LLMs.
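To make the idea of a latent "thinking path" concrete, here is a minimal sketch of how a layer-by-layer hidden-state trajectory could be summarized. It is an illustration only, not the paper's scoring formula: the abstract does not specify the CoE features, so the two quantities below (mean step length and mean turning angle between adjacent layers) and the function name `trajectory_features` are assumptions. The input is assumed to be one pooled vector per layer, for example obtained by requesting hidden states from a model and averaging over tokens.

```python
import math


def trajectory_features(hidden_states):
    """Summarize a layer-wise hidden-state trajectory.

    hidden_states: list of per-layer vectors (lists of floats), ordered
    from the first layer to the last. Hypothetical input format.
    Returns (mean step length, mean angle in radians) between adjacent layers.
    """
    if len(hidden_states) < 2:
        raise ValueError("need at least two layers to form a trajectory")

    step_lengths, angles = [], []
    for prev, curr in zip(hidden_states, hidden_states[1:]):
        # Euclidean distance moved between consecutive layers.
        diff = [c - p for p, c in zip(prev, curr)]
        step_lengths.append(math.sqrt(sum(d * d for d in diff)))

        # Angle between consecutive hidden states (0 if either is a zero vector).
        dot = sum(p * c for p, c in zip(prev, curr))
        norm = math.sqrt(sum(p * p for p in prev)) * math.sqrt(sum(c * c for c in curr))
        cos = max(-1.0, min(1.0, dot / norm)) if norm > 0 else 1.0
        angles.append(math.acos(cos))

    n = len(step_lengths)
    return sum(step_lengths) / n, sum(angles) / n


# Toy three-layer trajectory in two dimensions.
mean_step, mean_angle = trajectory_features([[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
print(mean_step, mean_angle)
```

Under this sketch, a response would be scored by comparing such trajectory summaries against those of other responses; how the summaries map to a correctness estimate is left open here, since the abstract only states that the features differ between correct and incorrect responses.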

Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Rui Wang • 2024

Related benchmarks

Task	Dataset	Result
Hallucination Detection	TriviaQA	--	621
Mathematical Reasoning	GSM8K	EM 33	123
Hallucination Detection	GSM8K	AUROC 75.5	115
Hallucination Detection	CSQA	AUROC 66.89	107
Mathematical Reasoning	GSM-Symbolic	GSM-Sym Accuracy 25.9	73
Hallucination Detection	MMLU	AUPRC 73.77	62
Hallucination Detection	CommonsenseQA	Mean AUROC 0.4779	62
Question Answering	MMLU	AUC 50.53	51
Question Answering	CommonsenseQA	AUC 61.38	51
Question Answering	MedMCQA	AUC 62.14	51

Showing 10 of 38 rows
