Small Language Model Helps Resolve Semantic Ambiguity of LLM Prompt

About

Large language models (LLMs) are increasingly used for complex reasoning tasks because of their strong instruction-following capability. However, a model's performance depends heavily on the open-ended nature of the user's input prompt. Natural prompts often do not follow proper syntactic rules, which produces ambiguous queries that admit multiple interpretations. Such ambiguous prompts make it harder for the model to choose the correct reasoning path when answering a question. Prior work addresses this challenge by editing the query during LLM inference, without explicitly resolving the root cause of the ambiguity. To address this limitation, we propose a pre-inference prompt optimization mechanism based on explicit prompt disambiguation. Specifically, we identify semantic risks in the prompt, check their multi-perspective consistency, and resolve any semantic conflicts that arise. Finally, we organize the resolved ambiguities in a logically structured manner as a clean input to the LLM. By explicitly resolving semantic ambiguity, our method produces an attention distribution that is more focused on the semantically essential tokens. We also use small language models (SLMs) as the main executor of prompt disambiguation to benefit from their efficient computation. Through comprehensive experiments on multiple benchmarks, we demonstrate that our method improves reasoning performance by 2.5 points at a cost of only $0.02. Our study promotes explicit prompt disambiguation as an effective prompt optimization method that does not disturb the internal mechanism of LLM inference.
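
The four stages named in the abstract (identify semantic risks, check multi-perspective consistency, resolve conflicts, organize a clean input) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the `SLM` callable type, every function name, the request wording, and the `toy_slm` stand-in are all hypothetical, and the consistency check here is simply "do several queries about the same phrase agree".

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a request string to a reply string.
SLM = Callable[[str], str]


def identify_risks(slm: SLM, prompt: str) -> List[str]:
    """Stage 1 (assumed form): ask the SLM for ambiguous phrases, one per line."""
    reply = slm("List ambiguous phrases, one per line:\n" + prompt)
    return [line.strip() for line in reply.splitlines() if line.strip()]


def is_consistent(slm: SLM, prompt: str, risk: str, n_views: int = 3) -> bool:
    """Stage 2 (assumed form): query several 'views'; consistent if all readings agree."""
    readings = {
        slm(f"View {i}: what does '{risk}' mean in: {prompt}").strip()
        for i in range(n_views)
    }
    return len(readings) == 1


def resolve(slm: SLM, prompt: str, risk: str) -> str:
    """Stage 3 (assumed form): ask the SLM to commit to a single reading."""
    return slm(f"Resolve '{risk}' to a single reading in: {prompt}").strip()


def disambiguate(slm: SLM, prompt: str) -> str:
    """Stage 4 (assumed form): append resolved ambiguities as structured notes."""
    notes = []
    for risk in identify_risks(slm, prompt):
        if not is_consistent(slm, prompt, risk):
            notes.append(f"- {risk}: {resolve(slm, prompt, risk)}")
    if not notes:
        return prompt  # nothing ambiguous was flagged; pass the prompt through
    return prompt + "\n\nClarifications:\n" + "\n".join(notes)


def toy_slm(request: str) -> str:
    """Deterministic stand-in for a real SLM, used only to make the sketch runnable."""
    if request.startswith("List"):
        return "it"
    if request.startswith("View 0"):
        return "the trophy"
    if request.startswith("View"):
        return "the suitcase"
    return "the trophy"


question = "The trophy does not fit in the suitcase because it is too big. What is too big?"
clean_prompt = disambiguate(toy_slm, question)
print(clean_prompt)
```

In this toy run the stand-in model gives conflicting readings of "it", so the pronoun is flagged and a single clarification line is appended before the prompt would be passed to the LLM. A real setup would replace `toy_slm` with a call to an actual small language model.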

Zhenzhen Huang, Chaoning Zhang, Fachrina Dewi Puspitasari, Jiaquan Zhang, Yitian Zhou, Shuxu Chen, Yang Yang • 2026

Related benchmarks

Task	Dataset	Result
Mathematical Reasoning	AGIEval MATH	Accuracy 46.7	99
Coreference Resolution	WSC	Accuracy@1 85.2	33
Fact Checking	LIAR	Accuracy@1 69	33
Reasoning	GPQA	Accuracy@1 (GPQA) 44	33
Spatial Reasoning	BBH Navigate	Accuracy@1 98	33
Coreference Resolution	WSC ambiguity-augmented	Accuracy 82.6	11
Fake News Detection	LIAR ambiguity-augmented	Accuracy 68.9	11
Knowledge-intensive reasoning	GPQA ambiguity-augmented	Accuracy 42.8	11
Reasoning	GPQA Ambiguity-Augmented (subset of 200 samples)	Accuracy@1 44	11
Reasoning	LIAR Ambiguity-Augmented (subset of 200 samples)	Accuracy@1 69	11

Showing 10 of 11 rows
