Multilingual Story Completion on XStoryCloze
[Chart: Extract Match Accuracy by date. Highlighted point: LANG, 63.5, May 21, 2026]
Updated 12d ago
Evaluation Results
Method               Details                    Date     Extract Match Accuracy
LANG                 backbone=Qwen2.5-7B-In...  2026.05  63.5
Qwen2.5-7B-Instruct  -                          2026.05  60.9
LC-GRPO              backbone=Qwen2.5-7B-In...  2026.05  60.2
Vanilla GRPO         backbone=Qwen2.5-7B-In...  2026.05  57.5