
Compositional Generalization on Evaluation Dataset (Fold 1 Seen)

Score: 0.6191

Mistral-7B

[Chart: y-axis ticks 0.162124, 0.280762, 0.3994, 0.518038; x-axis label Jan 29, 2026]
Updated 4d ago

Evaluation Results

Date      Score
2026.01   0.6191
2026.01   0.5794
2026.01   0.5747
2026.01   0.5678
2026.01   0.5626
2026.01   0.5465
2026.01   0.5203
2026.01   0.5095
2026.01   0.4425
2026.01   0.4062
2026.01   0.3908
2026.01   0.3885
2026.01   0.3806
2026.01   0.3671
2026.01   0.3261
2026.01   0.3161
2026.01   0.2503
2026.01   0.1797