Coreference Resolution on Winograd
[Chart: Accuracy over time, May 17, 2023 to Jul 15, 2025. Top result: PaLM 2-M, 90.5 Accuracy.]
Updated 1mo ago
Evaluation Results
Model           Setting                    Date     Accuracy
PaLM 2-M        prompting=1-shot           2023.05  90.5
PaLM 2-L        prompting=1-shot           2023.05  89.5
PaLM            prompting=1-shot           2023.05  87.5
PaLM 2-S        prompting=1-shot           2023.05  87.5
Ettin-Dec-1B    Model Size Category=XL...  2025.07  79.1
OLMo-1B-0724    Model Size Category=XL...  2025.07  76.9
Llama-3.2-1B    Model Size Category=XL...  2025.07  74.7
Ettin-Dec-400m  Model Size Category=La...  2025.07  71.8
SmolLM2-360m    Model Size Category=La...  2025.07  70.3
Pythia-410m     Model Size Category=La...  2025.07  65.2
SmolLM2-135m    Model Size Category=Ba...  2025.07  59.7
Ettin-Dec-150m  Model Size Category=Ba...  2025.07  59
Pythia-160m     Model Size Category=Ba...  2025.07  58.2
Ettin-Dec-68m   Model Size Category=Sm...  2025.07  55.3
DistilGPT       Model Size Category=Sm...  2025.07  53.8
Pythia-14m      Model Size Category=XX...  2025.07  51.6
Ettin-Dec-32m   Model Size Category=XS...  2025.07  50.2
Ettin-Dec-17m   Model Size Category=XX...  2025.07  48