Pronoun Resolution on XWinograd
[Chart: Accuracy over time, Jun 14, 2025 to May 9, 2026. Top result: 57.7 (Transformer).]
Updated 22d ago
Evaluation Results
Method                          Details                    Date     Accuracy
Transformer                     Training=compute-match...  2026.05  57.7
DMoE                            Model Scale=1.7B, Trai...  2025.06  56.4
DMoE                            Model Scale=1.7B, Trai...  2025.06  56.4
BLOOM + Continued Pre-training  Model Scale=1.7B, Trai...  2025.06  55.8
Branch-Train-Mix                Model Scale=1.7B, Trai...  2025.06  55.8
Branch-Train-Mix                Model Scale=1.7B, Trai...  2025.06  55.6
BLOOM + Continued Pre-training  Model Scale=1.7B, Trai...  2025.06  55.5
BLOOM                           Model Scale=1.7B, Trai...  2025.06  55.5
DMoE                            Model Scale=560M, Trai...  2025.06  55.1
DMoE                            Model Scale=560M, Trai...  2025.06  55.1
BLOOM                           Model Scale=1.7B, Trai...  2025.06  55.1
BLOOM + Continued Pre-training  Model Scale=560M, Trai...  2025.06  54.9
Branch-Train-Mix                Model Scale=560M, Trai...  2025.06  54.4
Branch-Train-Mix                Model Scale=560M, Trai...  2025.06  54.2
BLOOM + Continued Pre-training  Model Scale=560M, Trai...  2025.06  53.8
BLOOM                           Model Scale=560M, Trai...  2025.06  53.7
SRM                             Training=compute-match...  2026.05  53.41
BLOOM                           Model Scale=560M, Trai...  2025.06  53.3
Mamba                           Training=compute-match...  2026.05  51.52