General QA on MMLU-Redux
[Chart: Exact Match over time, Dec 8, 2025 to Feb 11, 2026. Top result: SnapMLA, 90.89 Exact Match.]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Exact Match |
|---|---|---|---|
| SnapMLA | Backbone=DeepSeek-V3.1... | 2026.02 | 90.89 |
| FlashMLA | Backbone=DeepSeek-V3.1... | 2026.02 | 90.48 |
| FlashMLA | Backbone=LongCat-Flash... | 2026.02 | 89.3 |
| SnapMLA | Backbone=LongCat-Flash... | 2026.02 | 88.27 |
| DeepSeek-R1 | | 2025.12 | 88.17 |
| CompassMax-V3-Thinking | | 2025.12 | 87.23 |
| CompassMax-V3 | | 2025.12 | 42.87 |