Mathematical Reasoning on APEX 2025
Top result: TRICE-30B, 16.7 accuracy (May 7, 2026)
Evaluation Results
| Method | Configuration (truncated in source) | Date | Accuracy |
|---|---|---|---|
| TRICE-30B | Tool Usage=true, Param... | 2026.05 | 16.7 |
| TRICE-4B | Tool Usage=true, Param... | 2026.05 | 13.9 |
| GLM-4.7-Flash w/ recipe | Tool Usage=true, Param... | 2026.05 | 13.9 |
| GLM-4.7-Flash | Tool Usage=true, Param... | 2026.05 | 11.1 |
| Nemotron-3-Nano-30B-A3B | Tool Usage=true, Param... | 2026.05 | 11.1 |
| GPT-OSS-20B | Tool Usage=true, Param... | 2026.05 | 8.3 |
| TRICE-4B | Tool Usage=false, Para... | 2026.05 | 5.6 |
| Qwen3-4B-Thinking-2507 | Tool Usage=false, Para... | 2026.05 | 2.8 |
| Qwen3-30B-A3B-Instruct-2507 | Tool Usage=false, Para... | 2026.05 | 2.8 |
| Qwen3.5-4B | Tool Usage=false, Para... | 2026.05 | 0 |
| Qwen3.5-9B | Tool Usage=false, Para... | 2026.05 | 0 |
| Qwen3-30B-A3B-Thinking-2507 | Tool Usage=false, Para... | 2026.05 | 0 |
| Qwen3.5-35B-A3B | Tool Usage=false, Para... | 2026.05 | 0 |
| TRICE-30B | Tool Usage=false, Para... | 2026.05 | 0 |