Physical Commonsense Reasoning on PIQA (PIQA Score)
[Chart: PIQA Score over time, Nov 14, 2025 to May 7, 2026. Top result: LLAMA-4-SCOUT, 81.1]
Updated 5d ago
Evaluation Results
| Method | Configuration | Date | PIQA Score |
|---|---|---|---|
| LLAMA-4-SCOUT | PARAMS=109B, Variant=O... | 2025.11 | 81.1 |
| LLAMA-4-SCOUT | PARAMS=109B, Variant=FCSD | 2025.11 | 80.8 |
| LLAMA-4-SCOUT | PARAMS=109B, Variant=SFT | 2025.11 | 80.7 |
| QWEN-3-30B MOE | PARAMS=30B, Variant=OR... | 2025.11 | 80.5 |
| QWEN-3-30B MOE | PARAMS=30B, Variant=FCSD | 2025.11 | 80.4 |
| DEEPSEEK-V2-LITE | PARAMS=16B, Variant=OR... | 2025.11 | 80.1 |
| DEEPSEEK-V2-LITE | PARAMS=16B, Variant=FCSD | 2025.11 | 79.9 |
| DEEPSEEK-V2-LITE | PARAMS=16B, Variant=SFT | 2025.11 | 78.2 |
| QWEN-3-30B MOE | PARAMS=30B, Variant=SFT | 2025.11 | 77.8 |
| Echo-180M | Params=180M, Tokens=10... | 2026.05 | 70.9 |
| Transformer-180M | Params=180M, Tokens=10... | 2026.05 | 67.1 |
| Mamba-2-180M | Params=180M, Tokens=10... | 2026.05 | 66.8 |
| Mamba-3-MIMO-180M | Params=180M, Tokens=10... | 2026.05 | 66.7 |
| GDN-180M | Params=180M, Tokens=10... | 2026.05 | 66.3 |
| Mamba-3-SISO-180M | Params=180M, Tokens=10... | 2026.05 | 66.1 |
| Echo-50M | Params=50M, Tokens=3B,... | 2026.05 | 59.5 |