Selective Prediction on MNLI
[Chart: Coverage @ 0.9. Highlighted entry: UAT-LITE, 86.92 (Feb 3, 2026). Selectable metrics: Coverage @ 0.9, Coverage @ 0.8, Coverage @ 0.7, AURC.]
Updated 5d ago
Evaluation Results
Method     Details                     Date      Coverage @ 0.9   Coverage @ 0.8   Coverage @ 0.7   AURC
UAT-LITE   evaluation=mean over f...   2026.02   86.92            90.3             92.75            0.0529
Baseline   type=deterministic          2026.02   71.39            75.27            79.26            0.164
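
The page does not define its metrics, so the following is only a minimal sketch of one common way selective-prediction numbers of this kind are computed. It assumes that each prediction comes with a confidence score, that AURC is the mean selective risk over all coverage levels, and that a value such as "Coverage @ 0.9" is the accuracy on the 90% most confident examples. That last reading is an assumption and may not match the leaderboard's definition. The function names `risk_coverage_curve`, `aurc`, and `accuracy_at_coverage` are hypothetical and do not come from the benchmark.

```python
from typing import List, Sequence, Tuple


def risk_coverage_curve(
    confidences: Sequence[float], correct: Sequence[bool]
) -> Tuple[List[float], List[float]]:
    """Return (coverages, risks) when answering the k most confident examples.

    For k = 1..n, coverage is k / n and risk is the error rate on those k.
    """
    n = len(confidences)
    # Indices sorted from most to least confident (stable for ties).
    order = sorted(range(n), key=lambda i: -confidences[i])
    coverages, risks = [], []
    errors = 0
    for k, i in enumerate(order, start=1):
        errors += 0 if correct[i] else 1
        coverages.append(k / n)
        risks.append(errors / k)
    return coverages, risks


def aurc(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Area under the risk-coverage curve, as the mean risk over all k (assumed definition)."""
    _, risks = risk_coverage_curve(confidences, correct)
    return sum(risks) / len(risks)


def accuracy_at_coverage(
    confidences: Sequence[float], correct: Sequence[bool], coverage: float
) -> float:
    """Accuracy on the most confident `coverage` fraction of examples (assumed reading)."""
    n = len(confidences)
    k = max(1, int(round(coverage * n)))
    _, risks = risk_coverage_curve(confidences, correct)
    return 1.0 - risks[k - 1]


if __name__ == "__main__":
    # Toy data, invented for illustration only; not taken from MNLI.
    conf = [0.9, 0.8, 0.7, 0.6]
    ok = [True, True, False, True]
    print(aurc(conf, ok))
    print(accuracy_at_coverage(conf, ok, 0.5))
```

Under this reading, a lower AURC and a higher accuracy at a given coverage both indicate that the model's confidence ranks its correct predictions ahead of its errors, which matches the direction of the gap between UAT-LITE and the baseline in the table.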