Question Answering on Probe 2
[Chart: Accuracy by date (Jan 21, 2026). Best result: ChatGPT 5 Thinking, 70. Selectable metrics: Accuracy, Micro F1, Macro F1.]
Updated 4d ago
Evaluation Results
Method                                Date     Accuracy  Micro F1  Macro F1
ChatGPT 5 Thinking                    2026.01  70        -         -
Gemini 2.5 Pro                        2026.01  62.7      -         -
Anthropic Claude-3-Haiku              2026.01  61        -         -
Naive Human (medical knowledge=none)  2026.01  27.3      -         -