
Knowledge-intensive Reasoning on GPQA ambiguity-augmented

Accuracy: 42.8

DisambiguSLM

Updated 2mo ago

Evaluation Results

Method	Links	Accuracy
DisambiguSLM 2026.04		42.8
SPO 2026.04		41.1
OPRO 2026.04		40.3
Step-back 2026.04		39.7
PromptAgent 2026.04		39.2
CoT 2026.04		38.9
TextGrad 2026.04		38.9
APE 2026.04		38.8
PromptBreeder 2026.04		38.5
Rephrase 2026.04		37.1
Naïve prompting 2026.04		36.5