PIQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Commonsense Reasoning | PIQA | Accuracy 94.9 | 751 |
| Physical Commonsense Reasoning | PIQA | Accuracy 94.9 | 572 |
| Question Answering | PIQA | Accuracy 86.5 | 374 |
| Physical Interaction Question Answering | PIQA | Accuracy 94.9 | 333 |
| Reasoning | PIQA | Accuracy 96.5 | 145 |
| Physical Commonsense Reasoning | PIQA (val) | Accuracy 83 | 116 |
| Physical Commonsense Reasoning | PIQA | Accuracy 85.91 | 78 |
| Physical Reasoning | PIQA | Accuracy 81.34 | 74 |
| Common Sense Reasoning | PIQA | Accuracy 83 | 71 |
| Zero-shot Reasoning | PIQA | PIQA Zero-shot Accuracy 80.9 | 62 |
| Physical Commonsense Reasoning | PIQA | Accuracy 7,497 | 56 |
| Commonsense reasoning | PIQA 1.0 (test) | Accuracy 82.21 | 48 |
| Commonsense Reasoning | PIQA (test) | Accuracy 90.1 | 46 |
| Physical Commonsense Reasoning | PiQA | Accuracy 76.56 | 45 |
| Question Answering | PiQA | Accuracy 81.77 | 36 |
| Physical Reasoning | PIQA | Accuracy 91.3 | 34 |
| Zero-shot Accuracy | PIQA | Zero-shot PIQA Accuracy 81.5 | 30 |
| Inactive Attention Head Identification | PIQA | Percentage of Heads Zeroed 31.3 | 28 |
| Commonsense reasoning | PIQA (out-of-domain) | Accuracy 70.84 | 25 |
| Physical Commonsense Reasoning | PIQA | Delta Accuracy 0 | 24 |
| Physical Commonsense Reasoning | PIQA (test) | Accuracy 90.7 | 24 |
| Physical Reasoning | PIQA | Accuracy 82.21 | 20 |
| Common Sense Reasoning | PIQA (dev) | Accuracy 83.2 | 19 |
| Correctness Prediction | PIQA | Accuracy 79.64 | 18 |
| Physical Commonsense Reasoning | PIQA | Mean Per-Step Regret 0.152 | 15 |
Showing 25 of 58 rows