
CLUTRR

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Logical Reasoning | CLUTRR | Accuracy 95.9 | 42 |
| Logical Reasoning | CLUTRR (test) | Accuracy 80.1 | 35 |
| Inductive Reasoning | Clutrr | Pass@1 95.5 | 18 |
| Binary Classification | CLUTRR | Accuracy 78 | 18 |