
PIQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Commonsense Reasoning | PIQA | Accuracy 94.9 | 647 |
| Physical Commonsense Reasoning | PIQA | Accuracy 92.93 | 329 |
| Physical Interaction Question Answering | PIQA | Accuracy 94.9 | 323 |
| Reasoning | PIQA | Accuracy 96.5 | 133 |
| Physical Commonsense Reasoning | PIQA (val) | Accuracy 83 | 113 |
| Question Answering | PIQA | Accuracy 81.8 | 83 |
| Commonsense reasoning | PIQA 1.0 (test) | Accuracy 82.21 | 48 |
| Commonsense Reasoning | PIQA (test) | Accuracy 90.1 | 46 |
| Physical Reasoning | PIQA | Accuracy 81.34 | 44 |
| Physical Commonsense Reasoning | PIQA | Accuracy 81.23 | 41 |
| Physical Reasoning | PIQA | Accuracy 91.3 | 34 |
| Zero-shot Reasoning | PIQA | PIQA Zero-shot Accuracy 80.9 | 31 |
| Zero-shot Accuracy | PIQA | Zero-shot PIQA Accuracy 81.5 | 30 |
| Commonsense reasoning | PIQA (out-of-domain) | Accuracy 70.84 | 25 |
| Physical Commonsense Reasoning | PIQA | Delta Accuracy 0 | 24 |
| Physical Commonsense Reasoning | PIQA (test) | Accuracy 90.7 | 24 |
| Physical Reasoning | PIQA | Accuracy 82.21 | 20 |
| Correctness Prediction | PIQA | Accuracy 79.64 | 18 |
| Physical Commonsense Reasoning | PIQA | Mean Per-Step Regret 0.152 | 15 |
| Question Answering | PiQA | Accuracy 81.77 | 15 |
| Question Answering | PIQA out-of-domain | ROUGE-L 19.1 | 14 |
| Physical Commonsense Reasoning | PIQA | Accuracy 91 | 12 |
| Reasoning | PIQA | Accuracy Improvement 2.05 | 12 |
| Question Answering | PIQA | Accuracy (Baseline) 77.31 | 11 |
| Common Sense Reasoning | PIQA (dev) | Accuracy 83.2 | 11 |
Showing 25 of 44 rows