Physical Commonsense Reasoning on PIQA (Mean per-step regret)

Mean Per-Step Regret: 0.152

LinFTPL

[Chart: mean per-step regret, y-axis from 0.14448 to 0.29676; Feb 23, 2026]
Updated 4d ago

Evaluation Results

Date      Mean Per-Step Regret
2026.02   0.152
2026.02   0.161
2026.02   0.172
2026.02   0.173
2026.02   0.174
2026.02   0.174
2026.02   0.186
2026.02   0.192
2026.02   0.193
2026.02   0.197
2026.02   0.213
2026.02   0.236
2026.02   0.252
2026.02   0.259
2026.02   0.340