
Delta Evaluation on PIQA (Physical Commonsense Reasoning)

[Chart: Delta Accuracy over time; series: AutoPrompt; y-axis ticks: 0, -0.0184, -0.04, -0.0616, -0.0832; x-axis: Dec 9, 2025]
Updated 4d ago

Evaluation Results

Method  Links  Date     Delta Accuracy  (unlabeled)
               2025.12   0               0.26
               2025.12   0               0.15
               2025.12   0              -1.24
               2025.12   0               0.11
               2025.12  -0.01            1.75
               2025.12  -0.01            0.31
               2025.12  -0.01           -1.23
               2025.12  -0.01            2.01
               2025.12  -0.01            0.48
               2025.12  -0.01           -1.22
               2025.12  -0.01            0.17
               2025.12  -0.01            0.72
               2025.12  -0.02            1.17
               2025.12  -0.02            0.62
               2025.12  -0.02           -0.35
               2025.12  -0.02           -0.62
               2025.12  -0.02            1.39
               2025.12  -0.03            1.34
               2025.12  -0.03           -0.07
               2025.12  -0.04            1.24
               2025.12  -0.04            1.78
               2025.12  -0.06            0.28
               2025.12  -0.07           -2.53
               2025.12  -0.08           -0.23