
CLUTRR

Benchmarks

Task Name              Dataset Name                          SOTA Result    Trend
Logical Reasoning      CLUTRR                                Accuracy 95.9  42
Logical Reasoning      CLUTRR (test)                         Accuracy 80.1  35
Hybrid Reasoning       CLUTRR (test)                         Accuracy 76.4  24
Inductive Reasoning    Clutrr                                Pass@1 95.5    18
Binary Classification  CLUTRR                                Accuracy 78    18
Logical Reasoning      CLUTRR rob_train_disc_23_all (test)   Accuracy 41.6  3
Logical Reasoning      CLUTRR rob train irr 23 all (test)    Accuracy 34.5  3
Logical Reasoning      CLUTRR rob_train_sup_23_all (test)    Accuracy 45.2  3
Logical Reasoning      CLUTRR rob train clean 23 all (test)  Accuracy 35.6  3
Logical Reasoning      CLUTRR gen_train234_test2to10         Accuracy 25    3
Logical Reasoning      CLUTRR gen_train23_test2to10          Accuracy 24    3
Showing 11 of 11 rows