List operations evaluation on ListOps (5, 9) (test)

Mean Accuracy: 49.6

BBT-GRC

May 25, 2026
Updated 8d ago

Evaluation Results

Method    Links
2026.05   49.6   39.2
2026.05   48.1   36.4
2026.05   45.9   38.8
2026.05   44.9   36
2026.05   39     22.8
2026.05   29.9   22.7
2026.05   21.5   8.7