
Comprehensive Reasoning on ScienceQA

Accuracy: 84.2

Unsilencing Latent Reasoning

[Chart: accuracy over time. Y-axis ticks: 68.288, 72.419, 76.55, 80.681. X-axis: May 4, 2026.]
Updated 29d ago

Evaluation Results

Date     Accuracy
2026.05  84.2
2026.05  83.9
2026.05  83.8
2026.05  83.8
2026.05  83.4
2026.05  83.1
2026.05  82.8
2026.05  82.3
2026.05  78.8
2026.05  78.4
2026.05  74.3
2026.05  74.1
2026.05  73.9
2026.05  73.8
2026.05  73.7
2026.05  73.5
2026.05  73.2
2026.05  68.9