Short-form Question Answering on Short-form QA Aggregate (Avg.) (test)
Best result: 35.93 EM, VSPO + PRS_short (Dec 8, 2025)
[Chart: EM over time]
Evaluation Results
Method              Details                      Date       EM
VSPO + PRS_short    Backbone=Qwen2.5-3B-In...    2025.12    35.93
GRPO + PRS_short    Backbone=Qwen2.5-3B-In...    2025.12    34.83
PPO + PRS_short     Backbone=Qwen2.5-3B-In...    2025.12    33.33
SFT                 Backbone=Qwen2.5-3B-In...    2025.12    17.6
Untrained           Backbone=Qwen2.5-3B-In...    2025.12    9.6