Coreference Resolution on Winogender (test)
[Chart: Accuracy and Normalized Log Accuracy over time. Highlighted point: GLM-130B, 80.7 Accuracy, Oct 5, 2022.]
Updated 1mo ago
Evaluation Results
| Method | Number of shots | Date | Accuracy | Normalized Log Accuracy |
|---|---|---|---|---|
| GLM-130B | 1 | 2022.10 | 80.7 | - |
| GLM-130B | 0 | 2022.10 | 79.7 | - |
| PaLM 540B | 1 | 2022.10 | 79.4 | - |
| Chinchilla | 0 | 2022.10 | 78.3 | - |
| PaLM 540B | 0 | 2022.10 | 75 | - |
| Gopher 280B | 0 | 2022.10 | 71.4 | - |
| GPT-3 (Davinci) | 0 | 2022.10 | 64.2 | - |
| GPT-3 (Davinci) | 1 | 2022.10 | 62.6 | - |
| OPT 175B | 0 | 2022.10 | 54.8 | - |
| BLOOM 176B | 1 | 2022.10 | 53.1 | - |
| BLOOM 176B | 0 | 2022.10 | 49.1 | - |
| HATified-SFT | 5 | 2026.03 | - | 67.9 |
| Llama-Instruct | 5 | 2026.03 | - | 84.3 |