ListOps

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Hierarchical Reasoning | ListOps Long Range Arena (test) | Accuracy 63.04 | 26 |
| Hierarchical reasoning on symbolic sequences | Long ListOps (test) | Accuracy 62.75 | 22 |
| Sequence Classification | ListOps | Accuracy (%) 43.2 | 13 |
| Logical Expression Evaluation | ListOps-O Argument Generalization (Arguments 15) | Accuracy 79 | 11 |
| Logical Expression Evaluation | ListOps-O Argument Generalization (Arguments 10) | Accuracy 0.8415 | 11 |
| Logical Expression Evaluation | ListOps-O Length Generalization (Lengths 900-1000) | Accuracy 99.5 | 11 |
| Logical Expression Evaluation | ListOps-O Length Generalization (Lengths 500-600) | Accuracy 99.4 | 11 |
| Logical Expression Evaluation | ListOps-O Length Generalization (Lengths 200-300) | Accuracy 99.9 | 11 |
| Logical Expression Evaluation | ListOps-O near-IID (Lengths < 1000, Arguments < 5) | Accuracy 99.9 | 11 |
| Mathematical Expression Evaluation | ListOps Long Range Arena (test) | Accuracy 41.4 | 7 |
| Long-range sequence modeling | ListOps Long Range Arena (LRA) 2K (test) | Accuracy 37.9 | 6 |
| Long-range sequence modeling | ListOpsMix (test) | Accuracy 70.43 | 5 |
| Unsupervised Parsing | ListOps (test) | Accuracy 68.07 | 5 |
| Unsupervised Parsing | ListOps (val) | Accuracy 67.65 | 5 |
| Unsupervised Parsing | ListOps simplified (test) | Accuracy (Max) 93.78 | 4 |