Commonsense Question Answering on ECQA
[Chart: Performance Score (Finetune Baseline vs Predict Baseline) over time. T5-base: 57.2 (May 4, 2023)]
Updated 4d ago
Evaluation Results
| Method | Backbone | Date | Performance Score (Finetune Baseline vs Predict Baseline) | Performance Score (Finetune Baseline vs Predict Infusion) | Simulatability Score | Performance Score (Finetune Infusion vs Predict Infusion) | TREU Score | Accuracy |
|---|---|---|---|---|---|---|---|---|
| T5-base | T5-base | 2023.05 | 57.2 | 74.6 | 17.4 | 98.9 | 59.1 | - |
| BART-base | BART-base | 2023.05 | 42.8 | 43.8 | 1 | 90.1 | 48.3 | - |
| InfoRidge | Qwen-2.5-0.5B | 2025.07 | - | - | - | - | - | 65.45 |
| InfoRidge | LLaMA-3.1-8B | 2025.07 | - | - | - | - | - | 80.49 |
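For the two methods that report both baseline-finetuned performance scores (T5-base and BART-base), the reported Simulatability Score matches the gap between the "Predict Infusion" and "Predict Baseline" scores. This is an observation drawn from the reported numbers, not a definition given on this page. The short check below is a sketch only: the dictionary keys and the `score_gap` helper are hypothetical names introduced for illustration.

```python
# Values copied from the reported ECQA results.
# Key names ("base", "infusion", "simulatability") are illustrative assumptions.
results = {
    "T5-base": {"base": 57.2, "infusion": 74.6, "simulatability": 17.4},
    "BART-base": {"base": 42.8, "infusion": 43.8, "simulatability": 1.0},
}


def score_gap(row):
    """Difference between the Predict Infusion and Predict Baseline scores,
    rounded to one decimal place to match the reported precision."""
    return round(row["infusion"] - row["base"], 1)


for name, row in results.items():
    gap = score_gap(row)
    # Compare the computed gap with the reported Simulatability Score.
    print(name, gap, gap == row["simulatability"])
```

The InfoRidge rows report only Accuracy, so the same check cannot be applied to them.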