
Compositional Generalization on Evaluation Dataset Unseen (Fold 3)

Score: 0.4022

Qwen2.5-7B

Jan 29, 2026
Updated 4d ago

Evaluation Results

Method    Links
2026.01    0.4022
2026.01    0.3977
2026.01    0.3948
2026.01    0.3902
2026.01    0.3864
2026.01    0.3822
2026.01    0.3665
2026.01    0.358
2026.01    0.3497
2026.01    0.3484
2026.01    0.3481
2026.01    0.3443
2026.01    0.3346
2026.01    0.3235
2026.01    0.3065
2026.01    0.3053
2026.01    0.2665
2026.01    0.1843