Mathematical Reasoning on IMOBench
[Figure: Pass@1 chart. Labeled point: Composition-RL, 25.9 Pass@1, Feb 12, 2026.]
Updated 3mo ago
Evaluation Results
| Method | Configuration | Date | Pass@1 |
|---|---|---|---|
| Composition-RL | Backbone=Qwen3-14B-Base | 2026.02 | 25.9 |
| Composition-RL | Backbone=Qwen3-4B-Base... | 2026.02 | 22.9 |
| Composition-RL | Backbone=Qwen3-30B-A3B... | 2026.02 | 22.8 |
| Standard RLVR | Backbone=Qwen3-14B-Base | 2026.02 | 21.3 |
| Composition-RL | Backbone=Qwen3-4B-Base... | 2026.02 | 20.1 |
| Composition-RL | Backbone=Qwen3-8B-Base | 2026.02 | 18.4 |
| Standard RLVR | Backbone=Qwen3-8B-Base | 2026.02 | 16.2 |
| Standard RLVR | Backbone=Qwen3-4B-Base | 2026.02 | 14.4 |
| Composition-RL | Backbone=Qwen3-4B-Base | 2026.02 | 14.3 |
| Standard RLVR | Backbone=Qwen3-30B-A3B... | 2026.02 | 13.2 |