General Question Answering on ExpertQA
[Chart: Reward. Top result: DAPO+START, 0.2385 (Mar 21, 2026).]
Evaluation Results
| Method | Setting | Date | Reward | Win Rate |
|---|---|---|---|---|
| DAPO+START | Configuration=DAPO+START | 2026.03 | 0.2385 | 65.29 |
| DAPO | Configuration=DAPO | 2026.03 | 0.2352 | - |
| GRPO+START | Evaluation Protocol=Po... | 2026.03 | 0.2201 | 59.24 |
| GRPO | Evaluation Protocol=Po... | 2026.03 | 0.2182 | - |
| GRPO-MA | Evaluation Protocol=Po... | 2026.03 | 0.214 | 34.08 |
| GRPO+START | Evaluation Protocol=T_... | 2026.03 | 0.1876 | 68.15 |
| GRPO-MA | Evaluation Protocol=T_... | 2026.03 | 0.1859 | 54.14 |
| GRPO | Evaluation Protocol=T_... | 2026.03 | 0.1849 | - |