Intrinsic Reasoning on GNLI
[Chart: Spearman Correlation over time; top result 0.843 (Llama-3-Instruct), May 2, 2025]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Spearman Correlation |
| --- | --- | --- | --- |
| Llama-3-Instruct | Evaluation Protocol=Pr... | 2025.05 | 0.843 |
| Always Tell Me The Odds | Backbone=Qwen2.5-14B-I... | 2025.05 | 0.838 |
| Always Tell Me The Odds | Backbone=Qwen2.5-14B-I... | 2025.05 | 0.82 |
| Always Tell Me The Odds | Backbone=Qwen2.5-14B-I... | 2025.05 | 0.814 |
| Always Tell Me The Odds | Backbone=Qwen2.5-7B-In... | 2025.05 | 0.811 |
| GPT-4o | Evaluation Protocol=0-... | 2025.05 | 0.796 |
| Always Tell Me The Odds | Backbone=Qwen2.5-8B-In... | 2025.05 | 0.789 |
| DeepSeek-R1-Distill-Qwen-32B | Evaluation Protocol=0-... | 2025.05 | 0.755 |
| RoBERTa-L | Type=Encoder | 2025.05 | 0.586 |