Translation Preference Prediction on WMT ZH -> EN
Top result: 60.7 Pairwise Accuracy (Distribution-Calibrated Aggregation)
[Chart: Pairwise Accuracy by date; Dec 2, 2025]
Updated 4d ago
Evaluation Results
Method | Settings | Date | Pairwise Accuracy
Distribution-Calibrated Aggregation | n=12, Judge LLM=gemini... | 2025.12 | 60.7
USC | n=12, Judge LLM=gemini... | 2025.12 | 59
Distribution-Calibrated Aggregation | n=4, Judge LLM=gemini-... | 2025.12 | 58.3
USC | n=4, Judge LLM=gemini-... | 2025.12 | 56.1
GSC | n=12, Judge LLM=gemini... | 2025.12 | 55
CI-SC | n=12, Judge LLM=gemini... | 2025.12 | 54.5
SC | n=12, Judge LLM=gemini... | 2025.12 | 53.9
CI-SC | n=4, Judge LLM=gemini-... | 2025.12 | 53
Soft-SC | n=12, Judge LLM=gemini... | 2025.12 | 52.9
Soft-SC | n=4, Judge LLM=gemini-... | 2025.12 | 52.8
SC | n=4, Judge LLM=gemini-... | 2025.12 | 51.5
GSC | n=4, Judge LLM=gemini-... | 2025.12 | 51.2