Multiple Choice Question Answering on CTIBench MCQA
[Chart: Score by date. Highlighted point: GPT-5, Score 0.819, Jan 28, 2026]
Updated 4d ago
Evaluation Results
| Method | Method details | Links | Score |
| --- | --- | --- | --- |
| GPT-5 | Model Group=frontier O... | 2026.01 | 0.819 |
| GPT-4.1 | Model Group=frontier O... | 2026.01 | 0.76 |
| GPT-5-Mini | Model Group=frontier O... | 2026.01 | 0.753 |
| o3-Mini | Model Group=frontier O... | 2026.01 | 0.716 |
| GPT-OSS-120B | Model Group=GPT-OSS mo... | 2026.01 | 0.714 |
| Llama-Primus-Nemotron-70B-Instruct | Model Group=Llama-fami... | 2026.01 | 0.705 |
| Llama-3.3-70B-Instruct | Model Group=Llama-fami... | 2026.01 | 0.692 |
| Foundation-Sec-8B-Reasoning | Model Group=our reason... | 2026.01 | 0.691 |
| GPT-5-Nano | Model Group=frontier O... | 2026.01 | 0.688 |
| Qwen-3-14B | Model Group=smaller sp... | 2026.01 | 0.664 |
| Phi-4 | Model Group=smaller sp... | 2026.01 | 0.658 |
| GPT-OSS-20B | Model Group=GPT-OSS mo... | 2026.01 | 0.655 |
| Foundation-Sec-8B-Instruct | Model Group=Llama-fami... | 2026.01 | 0.65 |
| Qwen-3-8B | Model Group=smaller sp... | 2026.01 | 0.649 |
| Llama-3.1-8B-Instruct | Model Group=Llama-fami... | 2026.01 | 0.607 |
| Llama-Primus-Merged | Model Group=Llama-fami... | 2026.01 | 0.604 |
| DeepHat-V1-7B | Model Group=smaller sp... | 2026.01 | 0.493 |