
LogicBench

Benchmarks

Task Name            | Dataset Name           | SOTA Result                  | Trend
Logical Reasoning    | LogicBench             | Accuracy 80.4                | 28
Skill retrieval      | LogicBench             | Recall@1 31.4                | 11
Skill retrieval      | LogicBench             | nDCG@1 31.4                  | 11
Constrained Decoding | LogicBench             | Constraint Satisfaction 98.5 | 7
Logical Reasoning    | LogicBench SEM variant | Accuracy 92.26               | 2
Logical Reasoning    | LogicBench Original    | Accuracy 92.81               | 2