Logical Reasoning on CLUTRR rob train clean 23 all (test)
Top result: Llama-3.1-8B-it (w/ SLR), 35.6 Accuracy (Jun 18, 2025)
Updated 22d ago
Evaluation Results
Method                            Method details             Links    Accuracy
Llama-3.1-8B-it (w/ SLR)          Model=Llama-3.1-8B-it,...  2025.06  35.6
Llama-3.1-8B-it (Base)            Model=Llama-3.1-8B-it,...  2025.06  29.1
Llama-3.1-8B-it (w/ DeepSeek-R1)  Model=Llama-3.1-8B-it,...  2025.06  26