Question Answering on StrategyQA (accuracy)
[Chart: Accuracy over time. Highlighted point: Gt-Margin, Accuracy 84, Feb 10, 2026. Updated 4d ago.]
Evaluation Results
Method       Method details               Links     Accuracy
Gt-Margin    Model=LLaDA-8B, Comple...    2026.02   84
Gt-Margin    Model=Dream-7B, Comple...    2026.02   83.5
Top-Prob     Model=Dream-7B, Comple...    2026.02   78.5
Margin       Model=Dream-7B, Comple...    2026.02   76.5
Gt-Prob      Model=Dream-7B, Comple...    2026.02   70.5
Margin       Model=LLaDA-8B, Comple...    2026.02   65.5
Random       Model=Dream-7B, Comple...    2026.02   64
AR           Model=LLaDA-8B, Comple...    2026.02   63.5
Inverse-AR   Model=LLaDA-8B, Comple...    2026.02   63.5
Top-Prob     Model=LLaDA-8B, Comple...    2026.02   63.5
AR           Model=Dream-7B, Comple...    2026.02   60.9
Gt-Prob      Model=LLaDA-8B, Comple...    2026.02   50.5
Random       Model=LLaDA-8B, Comple...    2026.02   41.5
Inverse-AR   Model=Dream-7B, Comple...    2026.02   1.5