Reasoning on GPQA D (Accuracy, Loss)
[Chart: Accuracy and Loss over time, Apr 10, 2026 to Jun 1, 2026. Highlighted point: TIES, Accuracy 38.72.]
Updated 1d ago
Evaluation Results
| Method | Configuration | Date | Accuracy | Loss |
|---|---|---|---|---|
| TIES | Backbone=Qwen2.5-7B-Base | 2026.06 | 38.72 | - |
| Reasoner | Backbone=Qwen2.5-7B-Base | 2026.06 | 35.35 | - |
| TSV-Merge | Backbone=Qwen2.5-7B-Base | 2026.06 | 35.02 | - |
| RAM | Backbone=Qwen2.5-7B-Base | 2026.06 | 34.51 | - |
| RESMERGE | Backbone=Qwen2.5-7B-Ba... | 2026.06 | 34.34 | - |
| TA | Backbone=Qwen2.5-7B-Base | 2026.06 | 33.67 | - |
| DARE + TIES | Backbone=Qwen2.5-7B-Base | 2026.06 | 32.15 | - |
| ISO-CTS | Backbone=Qwen2.5-7B-Base | 2026.06 | 31.48 | - |
| SimpleRL | Backbone=Qwen2.5-7B-Base | 2026.06 | 30.81 | - |
| ISO-C | Backbone=Qwen2.5-7B-Base | 2026.06 | 30.47 | - |
| Zero | Backbone=Qwen2.5-7B-Base | 2026.06 | 30.3 | - |
| Qwen2.5-7B | Backbone=Qwen2.5-7B-Base | 2026.06 | 30.13 | - |
| Muon | Optimizer=Muon, Model... | 2026.04 | 24.2 | 1.874 |
| Adam+Nexus | Optimizer=Adam+Nexus,... | 2026.04 | 23.4 | 1.881 |
| AdamW | Optimizer=AdamW, Model... | 2026.04 | 22.6 | 1.91 |