Interactive Question Answering on AskOverconfidence
[Chart: Accuracy. Highlighted entry: Gemini, 84 (Feb 4, 2026). Metrics shown: Accuracy, Coverage, Uniqueness.]
Updated 1mo ago
Evaluation Results

Method   Configuration               Date      Accuracy   Coverage   Uniqueness
Gemini   Evaluation Protocol=Mu...   2026.02   84         74.9       2.5
GPT      Evaluation Protocol=Mu...   2026.02   73         60.2       1.5
OursI    Evaluation Protocol=Mu...   2026.02   62.8       64.1       21
OursO    Evaluation Protocol=Mu...   2026.02   54.8       89.4       46.3
Qwen     Evaluation Protocol=Mu...   2026.02   44.3       18.8       0.8