Standard Question Answering on SQuAD v2

19.53 EM

Prompt-R1

Updated 4mo ago

Evaluation Results

Method	Links	EM	F1
Prompt-R1 2025.11		19.53	29.28
CoT Reasoning 2025.11		14.06	25.73
Baseline 2025.11		13.28	25.61
GEPA 2025.11		13.28	25.52
OPRO 2025.11		10.94	26.67
GRPO 2025.11		10.16	23.10
Baseline 2025.11		6.25	16.09
CoT Reasoning 2025.11		6.25	16.25
TextGrad 2025.11		6.25	22.04
SFT 2025.11		5.47	16.18