Multi-hop reasoning on 2WikiMultihopQA

48.44 Exact Match (EM)

Prompt-R1

Updated 4mo ago

Evaluation Results

Method	Links	Exact Match (EM)	(unlabeled)
Prompt-R1 2025.11		48.44	54.41
CoT Reasoning 2025.11		43.75	49.13
SFT 2025.11		41.41	42.62
GEPA 2025.11		41.41	46.27
GRPO 2025.11		34.38	35.05
Baseline 2025.11		33.59	36.57
Baseline 2025.11		28.13	29.32
OPRO 2025.11		25	35.96
CoT Reasoning 2025.11		21.88	24.17
TextGrad 2025.11		18.75	27.5