List operations evaluation on ListOps (5, 14) (test)
Top result: BBT-GRC, 53.1 Mean Accuracy (May 25, 2026)
[Chart: Mean Accuracy and Longest Bin]
Updated 8d ago
Evaluation Results
Method      Setting                     Links    Mean Accuracy  Longest Bin
BBT-GRC     -                           2026.05  53.1           41.4
RIR-GRC     -                           2026.05  51.6           38.8
MLP-LDRU    -                           2026.05  49             41.5
LSTM        -                           2026.05  48.1           38.9
TF (ALiBi)  Positional Encoding=ALiBi   2026.05  41.7           25.8
TF (NoPE)   Positional Encoding=None    2026.05  32.5           24.2
TF (Sin.)   Positional Encoding=Si...   2026.05  22.9           8.7