Physical Commonsense Reasoning on PIQA (val test)
[Chart: Accuracy over time, Feb 19, 2026 to May 20, 2026. Top result: LLaDA (Base), 79.42.]
Updated 12d ago
Evaluation Results
Method              Configuration               Date      Accuracy
LLaDA (Base)        Pruning Ratio=0.0           2026.02   79.42
Sink-Aware (Ours)   Pruning Ratio=0.3           2026.02   69.55
LLaDA-structure     Pruning Ratio=0.3           2026.02   68.34
Sink-Aware (Ours)   Pruning Ratio=0.5           2026.02   60.37
LLaDA-structure     Pruning Ratio=0.5           2026.02   58.98
GPT2-201M           mode=Zero-shot, parame...   2026.05   54.9
SNN-194M            mode=Zero-shot, parame...   2026.05   53.8
Random              mode=Zero-shot              2026.05   50
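
The accuracy figures listed above can be compared more directly by computing each pruned model's drop from the unpruned LLaDA (Base) result and the gap between the two pruning methods at the same ratio. The following is a minimal sketch of that arithmetic. The function names (`drop_from_baseline`, `retained_fraction`, `gap_at_ratio`) are hypothetical helpers introduced here for illustration and are not part of the leaderboard or of any published code.

```python
# Accuracy values (percent) copied from the results listed above.
BASELINE = 79.42  # LLaDA (Base), Pruning Ratio=0.0

# (method, pruning ratio) -> accuracy
PRUNED = {
    ("Sink-Aware (Ours)", 0.3): 69.55,
    ("LLaDA-structure", 0.3): 68.34,
    ("Sink-Aware (Ours)", 0.5): 60.37,
    ("LLaDA-structure", 0.5): 58.98,
}


def drop_from_baseline(acc, baseline=BASELINE):
    """Accuracy points lost relative to the unpruned model."""
    return round(baseline - acc, 2)


def retained_fraction(acc, baseline=BASELINE):
    """Fraction of the unpruned model's accuracy that is kept."""
    return acc / baseline


def gap_at_ratio(ratio):
    """Sink-Aware accuracy minus LLaDA-structure accuracy at one ratio."""
    sink = PRUNED[("Sink-Aware (Ours)", ratio)]
    struct = PRUNED[("LLaDA-structure", ratio)]
    return round(sink - struct, 2)


if __name__ == "__main__":
    for (method, ratio), acc in PRUNED.items():
        print(method, ratio, drop_from_baseline(acc),
              f"{retained_fraction(acc):.1%}")
    for ratio in (0.3, 0.5):
        print("gap at ratio", ratio, "=", gap_at_ratio(ratio))
```

With the listed values, the gap between Sink-Aware and LLaDA-structure is 1.21 points at a pruning ratio of 0.3 and 1.39 points at 0.5. These are simple differences between reported numbers and say nothing about statistical significance.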