Logical Reasoning on PrOntoQA

91.4 Calibrated Accuracy

Llama 3.1 8B

Updated 2mo ago

Evaluation Results

Method (date)	Links	Calibrated Accuracy
Llama 3.1 8B 2026.05		91.4
Qwen2.5-7B 2026.05		80.7
Gemma 3 27B 2026.05		77.4
DS-R1-7B 2026.05		77.1
Qwen2.5-32B 2026.05		67.8
SC+IC (tune) 2024.05		63.8
Qwen 3.5 9B 2026.05		63.5
SC+IC (tune) 2024.05		60.4
SC+IC (tune) 2024.05		59.3
SC+IC (tune) 2024.05		56.6
SC+IC (tune) 2024.05		56.6
SC+IC (tune) 2024.05		55.7
SC+IC (tune) 2024.05		54.5
DS-R1-32B 2026.05		52.9
SC+IC (tune) 2024.05		50.8
Qwen2.5-14B 2026.05		49.6
DS-R1-14B 2026.05		49.5