
Logic

Benchmarks

Task Name                     | Dataset Name | SOTA Result           | Trend
Logical reasoning             | Logic        | Accuracy 68.07        | 16
Spatial and logical reasoning | Logic        | Score 66.89           | 6
Language Modeling             | Logic (val)  | Perplexity 131.95     | 2
Logic                         | Logic Hard   | Baseline Score 34.63  | 1
Logic                         | Logic Medium | Baseline Score 47.37  | 1
Logic                         | Logic Easy   | Baseline Score 0.425  | 1