Mathematical Reasoning on Overall
[Chart: Length (tokens) by date (Jan 31, 2026); Segment Selective SFT shown at 7,709 tokens. Selectable metrics: Length (tokens), Accuracy, Pass@1, Pass@6.]
Updated 3d ago
Evaluation Results
Method                 Model Backbone  Date     Length (tokens)  Accuracy  Pass@1  Pass@6
Segment Selective SFT  R1-Dist...      2026.01  7,709            -         65.8    80
Segment Selective SFT  R1-Dist...      2026.01  8,499            64.5      -       -
Segment Selective SFT  Qwen2.5...      2026.01  9,195            -         45      66.5
Segment Selective SFT  R1-Dist...      2026.01  9,388            -         51.7    73.2
Segment Selective SFT  Qwen2.5...      2026.01  9,852            45.6      -       -
Segment Selective SFT  R1-Dist...      2026.01  13,506           46.9      -       -