Logical Reasoning on CLUTRR gen_train23_test2to10
[Chart: Accuracy (y-axis) by date (x-axis). Plotted point: Llama-3.1-8B-it (w/ DeepSeek-R1), Accuracy 24, Jun 18, 2025.]
Updated 22d ago
Evaluation Results
Method                             Details                     Date      Accuracy
Llama-3.1-8B-it (w/ DeepSeek-R1)   Model=Llama-3.1-8B-it,...   2025.06   24
Llama-3.1-8B-it (w/ SLR)           Model=Llama-3.1-8B-it,...   2025.06   19.1
Llama-3.1-8B-it (Base)             Model=Llama-3.1-8B-it,...   2025.06   10.2