
Aggregated Logical Reasoning on Overall Unsolvable

Accuracy: 0.945

GPT-5.1-Low

[Chart: accuracy over time; y-axis ticks 0.1494, 0.35595, 0.5625, 0.76905; x-axis tick Dec 1, 2025]
Updated 4d ago

Evaluation Results

Date      Accuracy
2025.12   0.945
2025.12   0.855
2025.12   0.85
2025.12   0.524
2025.12   0.345
2025.12   0.242
2025.12   0.18