Pronoun Resolution on XWinograd
[Chart: Accuracy over time. Best result: DMoE, 56.4 Accuracy (Jun 14, 2025).]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| DMoE | Model Scale=1.7B, Trai... | 2025.06 | 56.4 |
| DMoE | Model Scale=1.7B, Trai... | 2025.06 | 56.4 |
| BLOOM + Continued Pre-training | Model Scale=1.7B, Trai... | 2025.06 | 55.8 |
| Branch-Train-Mix | Model Scale=1.7B, Trai... | 2025.06 | 55.8 |
| Branch-Train-Mix | Model Scale=1.7B, Trai... | 2025.06 | 55.6 |
| BLOOM + Continued Pre-training | Model Scale=1.7B, Trai... | 2025.06 | 55.5 |
| BLOOM | Model Scale=1.7B, Trai... | 2025.06 | 55.5 |
| DMoE | Model Scale=560M, Trai... | 2025.06 | 55.1 |
| DMoE | Model Scale=560M, Trai... | 2025.06 | 55.1 |
| BLOOM | Model Scale=1.7B, Trai... | 2025.06 | 55.1 |
| BLOOM + Continued Pre-training | Model Scale=560M, Trai... | 2025.06 | 54.9 |
| Branch-Train-Mix | Model Scale=560M, Trai... | 2025.06 | 54.4 |
| Branch-Train-Mix | Model Scale=560M, Trai... | 2025.06 | 54.2 |
| BLOOM + Continued Pre-training | Model Scale=560M, Trai... | 2025.06 | 53.8 |
| BLOOM | Model Scale=560M, Trai... | 2025.06 | 53.7 |
| BLOOM | Model Scale=560M, Trai... | 2025.06 | 53.3 |