Logical Reasoning on CLUTRR gen_train234_test2to10
[Chart: Accuracy over time. Top entry: Llama-3.1-8B-it (w/ DeepSeek-R1), accuracy 25, Jun 18, 2025.]
Updated 22d ago
Evaluation Results
Method                            Details                    Date     Accuracy
Llama-3.1-8B-it (w/ DeepSeek-R1)  Model=Llama-3.1-8B-it,...  2025.06  25
Llama-3.1-8B-it (w/ SLR)          Model=Llama-3.1-8B-it,...  2025.06  16.4
Llama-3.1-8B-it (Base)            Model=Llama-3.1-8B-it,...  2025.06  9.5