Coreference Resolution on XWinograd French
[Chart: Score over time, Dec 25, 2025 to Apr 22, 2026. Top entry: EuroLM, Score 69.9.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Score |
|---|---|---|---|
| EuroLM | Zero-shot=true, Chat T... | 2025.12 | 69.9 |
| Qwen3 | Zero-shot=true, Chat T... | 2025.12 | 69.9 |
| Gamayun | Zero-shot=true, Chat T... | 2025.12 | 67.5 |
| MKC-e | Filtering Strategy=MKC-e | 2026.04 | 67.47 |
| Q3 negatives | Filtering Strategy=Q3... | 2026.04 | 66.27 |
| Nordic | Filtering Strategy=Nordic | 2026.04 | 66.27 |
| Llama3.2 | Zero-shot=true, Chat T... | 2025.12 | 65.1 |
| HQ | Filtering Strategy=HQ | 2026.04 | 65.06 |
| No Filtering | Filtering Strategy=No... | 2026.04 | 63.86 |
| HQ all | Filtering Strategy=HQ all | 2026.04 | 63.86 |
| ML (15%) | Filtering Strategy=ML... | 2026.04 | 63.86 |
| HQ seed (LLM Training) | Filtering Strategy=HQ... | 2026.04 | 62.65 |
| Qwen2.5 | Zero-shot=true, Chat T... | 2025.12 | 61.5 |
| ML (Q3) | Filtering Strategy=ML... | 2026.04 | 61.45 |
| HQ Romance (No Fra) | Filtering Strategy=HQ... | 2026.04 | 60.24 |
| ML | Filtering Strategy=ML | 2026.04 | 60.24 |
| Gemma3 | Zero-shot=true, Chat T... | 2025.12 | 60.2 |
| HQ seed | Filtering Strategy=HQ... | 2026.04 | 59.04 |