Logical Reasoning on SLR-BENCH (test)

Top result: 11.3 LRL (llama-3.1-8b-SLR)

[Chart: results by date; x-axis label Jun 18, 2025]
Updated 22d ago

Evaluation Results

Date      LRL
2025.06   11.3
2025.06   10.5
2025.06   8.2
2025.06   7.5
2025.06   7.3
2025.06   7
2025.06   6.8
2025.06   6
2025.06   6
2025.06   5.9
2025.06   5.6
2025.06   5.5
2025.06   5.3
2025.06   4.5
2025.06   3.8
2025.06   3.8
2025.06   3.6
2025.06   3.4
2025.06   2.6
2025.06   1.8
2025.06   1
2025.06   0
2025.06   0
2025.06   0
2025.06   0
2025.06   0
2025.06   0