Abstention on AbstentionBench UMWP
[Chart: Abstention F1 by date. Selectable metrics: Abstention F1, Abstention Recall, Abstention Precision, Accuracy. Highlighted result: GPT-5.2, Abstention F1 80.9, May 25, 2026.]
Updated 8d ago
Evaluation Results
| Method | Details | Date | Abstention F1 | Abstention Recall | Abstention Precision | Accuracy |
|---|---|---|---|---|---|---|
| GPT-5.2 | Model Type=Proprietary... | 2026.05 | 80.9 | 67.9 | 100 | 97.9 |
| Claude Sonnet 4.5 | Model Type=Proprietary... | 2026.05 | 76.7 | 62.3 | 100 | 100 |
| Gemini 3 | Model Type=Proprietary... | 2026.05 | 76.7 | 62.3 | 100 | 95.7 |
| TIAR | Backbone=Llama-3.1-8B-... | 2026.05 | 72.3 | 56.6 | 100 | 95.7 |
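
The reported Abstention F1 values can be cross-checked against the precision and recall figures. The sketch below assumes Abstention F1 is the standard harmonic mean of Abstention Precision and Abstention Recall; the page itself does not state the formula, and the function name `f1_from_precision_recall` is illustrative only.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on a 0-100 scale).

    Assumption: the leaderboard uses the standard F1 definition.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Abstention Precision, Abstention Recall, reported Abstention F1), copied from the leaderboard.
reported = {
    "GPT-5.2": (100.0, 67.9, 80.9),
    "Claude Sonnet 4.5": (100.0, 62.3, 76.7),
    "Gemini 3": (100.0, 62.3, 76.7),
    "TIAR": (100.0, 56.6, 72.3),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1 = f1_from_precision_recall(precision, recall)
    print(f"{name}: recomputed F1 = {f1:.2f}, reported = {f1_reported}")
```

Because the published recall values are already rounded to one decimal place, a recomputed F1 may differ from the reported one in the last digit. Since every listed method has an Abstention Precision of 100, the ranking by Abstention F1 is determined entirely by Abstention Recall.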