Multilingual Reasoning and General Knowledge on Overall (18 languages)
[Chart: top result SP3F-7B, 61.91 Accuracy, Jan 26, 2026. Metrics shown: Accuracy, Language Fidelity.]
Evaluation Results
| Method | Training Stage | Date | Accuracy | Language Fidelity |
|---|---|---|---|---|
| SP3F-7B | Full Pi... | 2026.01 | 61.91 | 95.35 |
| Qwen2.5-7B + RLVR | SFT + RLVR | 2026.01 | 57.79 | 96.09 |
| Qwen2.5-7B-Instruct + Translate Test | Instruc... | 2026.01 | 57.01 | 85.98 |
| Qwen2.5-7B-Instruct | Instruct | 2026.01 | 55.87 | 89.21 |
| Qwen2.5-7B + SFT | SFT | 2026.01 | 21.7 | 82.11 |
| Qwen2.5-7B | Base | 2026.01 | 14.79 | 78.78 |