Abstention on AbstentionBench QAQA
Top result: Gemini 3, Abstention F1 75.8
[Chart: Abstention F1 by date (point shown: May 25, 2026). Selectable metrics: Abstention F1, Abstention Recall, Abstention Precision, Accuracy.]
Updated 8d ago
Evaluation Results
| Method | Details | Date | Abstention F1 | Abstention Recall | Abstention Precision | Accuracy |
|---|---|---|---|---|---|---|
| Gemini 3 | Model Type=Proprietary... | 2026.05 | 75.8 | 66.7 | 87.8 | 78.3 |
| Claude Sonnet 4.5 | Model Type=Proprietary... | 2026.05 | 66.7 | 55.6 | 83.3 | 56.5 |
| GPT-5.2 | Model Type=Proprietary... | 2026.05 | 66 | 59.3 | 74.4 | 65.2 |
| TIAR | Backbone=Llama-3.1-8B-... | 2026.05 | 65.9 | 51.9 | 90.3 | 43.5 |
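
The page does not say how Abstention F1 is computed. As a rough check, the sketch below assumes it is the standard harmonic mean of Abstention Precision and Abstention Recall and recomputes it from the results above. The helper name `f1` and the `rows` list are illustrative and not part of the benchmark.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on a 0-100 scale).

    Assumption: the leaderboard uses this standard F1 definition.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, abstention recall, abstention precision, reported abstention F1),
# copied from the results listed above.
rows = [
    ("Gemini 3", 66.7, 87.8, 75.8),
    ("Claude Sonnet 4.5", 55.6, 83.3, 66.7),
    ("GPT-5.2", 59.3, 74.4, 66.0),
    ("TIAR", 51.9, 90.3, 65.9),
]

for name, recall, precision, reported in rows:
    computed = f1(precision, recall)
    print(f"{name}: computed F1 {computed:.1f}, reported F1 {reported}")
```

Under that assumption, the recomputed values agree with the reported F1 column to one decimal place. This is consistent with the standard definition but does not confirm the benchmark's exact scoring procedure.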