Question Answering with Abstention on SELFAWARE
Top result: Abstain-R1, U-Ref 91.4 (Apr 18, 2026). Updated 1mo ago.
Evaluation Results
Method                 Date      U-Ref
Abstain-R1             2026.04   91.4
Qwen2.5 3B Instruct    2026.04   82.3
DeepSeek-V3            2026.04   72.1
Qwen2.5 7B Instruct    2026.04   71.2
DeepSeek-R1            2026.04   63.8
Qwen2.5 32B Instruct   2026.04   62.7
Llama3.1 8B Instruct   2026.04   49.8