
Logic

Benchmarks

Task Name          Dataset Name   SOTA Result            Trend
Logical reasoning  Logic          Accuracy 68.07         16
Language Modeling  Logic (val)    Perplexity 131.95      2
Logic              Logic Hard     Baseline Score 34.63   1
Logic              Logic Medium   Baseline Score 47.37   1
Logic              Logic Easy     Baseline Score 0.425   1