Coreference Resolution on CLUEWSC
[Chart: EM over time, Jan 11, 2024 to Feb 17, 2026. Top result: C-DPO, 90.98 EM.]
Updated 1mo ago
Evaluation Results

| Method | Details | Date | EM |
|---|---|---|---|
| C-DPO | Size=70B, Evaluation P... | 2024.09 | 90.98 |
| C-SFT | Size=70B, Evaluation P... | 2024.09 | 90.88 |
| Llama-3 Instruct | Size=70B, Evaluation P... | 2024.09 | 86.37 |
| C-DPO | Size=8B, Evaluation Pr... | 2024.09 | 85.04 |
| GLM-5 Base | Architecture=MoE, Acti... | 2026.02 | 84.2 |
| GLM-4.5 Base | Architecture=MoE, Acti... | 2026.02 | 83.5 |
| DeepSeek-V3 Base | Architecture=MoE, Acti... | 2026.02 | 82.7 |
| C-SFT | Size=8B, Evaluation Pr... | 2024.09 | 80.64 |
| Llama-3 Instruct | Size=8B, Evaluation Pr... | 2024.09 | 80.12 |
| DeepSeek 7B (Dense) | # Shot=5-shot, # Total... | 2024.01 | 73.1 |
| DeepSeekMoE 142B (Half Activated) | # Shot=5-shot | 2024.01 | 72.6 |
| DeepSeekMoE 16B | # Shot=5-shot, # Total... | 2024.01 | 72.1 |
| DeepSeekMoE 16B | # Shot=5-shot, # Total... | 2024.01 | 72.1 |
| DeepSeekMoE 145B | # Shot=5-shot | 2024.01 | 71.9 |
| DeepSeek 67B (Dense) | # Shot=5-shot | 2024.01 | 69.1 |
| DeepSeekMoE Chat 16B | # Shot=5-shot, Total P... | 2024.01 | 68.2 |
| DeepSeek Chat 7B | # Shot=5-shot, Total P... | 2024.01 | 66.2 |
| GShard 137B | # Shot=5-shot | 2024.01 | 65.7 |
| LLaMA2 7B | # Shot=5-shot, # Total... | 2024.01 | 64 |
| LLaMA2 SFT 7B | # Shot=5-shot, Total P... | 2024.01 | 48.4 |