Data Analysis on QRData Verified
Top result: Kimi K2 Instruct, 63.68 accuracy (Jan 22, 2026)
Evaluation Results
| Method | Settings | Date | Accuracy |
|---|---|---|---|
| Kimi K2 Instruct | Model Type=Open-sourced | 2026.01 | 63.68 |
| GPT-5 | Reasoning effort=mediu... | 2026.01 | 61.75 |
| Claude Sonnet 4.5 | Model Type=Proprietary | 2026.01 | 61.35 |
| GPT-4o | Model Type=Proprietary | 2026.01 | 60.24 |
| GPT-5.1 | Reasoning effort=high,... | 2026.01 | 60.16 |
| Claude Sonnet 4 | Model Type=Proprietary | 2026.01 | 59.06 |
| GPT-5.1 | Reasoning effort=none,... | 2026.01 | 58.96 |
| Deepseek-v3.1 | Model Type=Open-sourced | 2026.01 | 57.37 |
| Qwen3-Coder 480B | Model Type=Open-sourced | 2026.01 | 54.72 |
| Qwen3 235B Instruct | Model Type=Open-sourced | 2026.01 | 54.18 |
| GPT-OSS-120B | Model Type=Open-sourced | 2026.01 | 47.95 |
| Qwen3-4B-Instruct | Model Type=Open-sourced | 2026.01 | 45.27 |
| Qwen2.5-7B-Instruct | Model Type=Open-sourced | 2026.01 | 35.04 |