
Compositional Generalization on Evaluation Dataset Unseen (Fold 1)

Score: 0.4818

DeepSeek V3

Jan 29, 2026
Updated 4d ago

Evaluation Results

Date      Score
2026.01   0.4818
2026.01   0.4668
2026.01   0.4627
2026.01   0.4626
2026.01   0.451
2026.01   0.4298
2026.01   0.4109
2026.01   0.4079
2026.01   0.3974
2026.01   0.3944
2026.01   0.388
2026.01   0.3811
2026.01   0.3786
2026.01   0.344
2026.01   0.3082
2026.01   0.3016
2026.01   0.2953
2026.01   0.2524