Answerability Prediction on FACT n=10 (matched pairs)
[Chart: AUC by method (y-axis: AUC). Top result: Refusal (Keyword Classifier), AUC 0.75, May 4, 2026. Metrics reported: AUC, F1 Score.]
Evaluation Results
Method                         Model     Date      AUC    F1 Score
Refusal (Keyword Classifier)   Qwen      2026.05   0.75   66.7
Geometry (own_dist)            Mistral   2026.05   0.71   70
Geometry (own_dist)            Llama     2026.05   0.69   63.2
Geometry (own_dist)            Qwen      2026.05   0.66   70
Refusal (Keyword Classifier)   Llama     2026.05   0.55   18.2
Refusal (Keyword Classifier)   Mistral   2026.05   0.55   30.8
SC (Self-Consistency)          Mistral   2026.05   0.47   -
SC (Self-Consistency)          Llama     2026.05   0.46   -
SC (Self-Consistency)          Qwen      2026.05   0      -