Math Reasoning on Gaokao En 2023
Top result: NPG-Muse-8B, 74.7 accuracy
[Chart: accuracy over time, Aug 28, 2025 to Feb 4, 2026]
Updated 4d ago
Evaluation Results
Method                  | Details                   | Date    | Accuracy
------------------------|---------------------------|---------|---------
NPG-Muse-8B             | Pass@k protocol=avg@8     | 2025.08 | 74.7
Full Attention          | Base Model=DeepSeek-R1... | 2026.02 | 74.2
LycheeDecode            | Base Model=DeepSeek-R1... | 2026.02 | 74.2
LycheeDecode            | Base Model=DeepSeek-R1... | 2026.02 | 72.7
Full Attention          | Base Model=DeepSeek-R1... | 2026.02 | 68.8
LycheeDecode            | Base Model=DeepSeek-R1... | 2026.02 | 68.8
LycheeDecode            | Base Model=DeepSeek-R1... | 2026.02 | 68.8
Qwen2.5-14B-Instruct-1M | Pass@k protocol=avg@8     | 2025.08 | 66
Qwen3-14B-Base          | Pass@k protocol=avg@8     | 2025.08 | 64.8
TidalDecode             | Base Model=DeepSeek-R1... | 2026.02 | 63.3
TidalDecode             | Base Model=DeepSeek-R1... | 2026.02 | 62.5
NPG-Muse-7B             | Pass@k protocol=avg@8     | 2025.08 | 61.1
Qwen2.5-7B-Ins-1M       | Pass@k protocol=avg@8     | 2025.08 | 58.3
TidalDecode             | Base Model=DeepSeek-R1... | 2026.02 | 57.8
TidalDecode             | Base Model=DeepSeek-R1... | 2026.02 | 57
Qwen3-8B-Base           | Pass@k protocol=avg@8     | 2025.08 | 56.9