Quality Estimation on Gender-ambiguous EN-DE ES IT Neutral vs. Gendered
[Chart: QE Score (DE). Highlighted entry: FairQE, 0.9921 (Apr 23, 2026). Available metrics: QE Score (DE), QE Score (ES), QE Score (IT).]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Links | QE Score (DE) | QE Score (ES) | QE Score (IT) |
| --- | --- | --- | --- | --- | --- |
| FairQE | Backbone=MetricX 24 L | 2026.04 | 0.9921 | 0.9948 | 0.9884 |
| MetricX 24 L | Model Version=2024, Mo... | 2026.04 | 0.9918 | 0.9805 | 0.9737 |
| MetricX 24 XL | Model Version=2024, Mo... | 2026.04 | 0.9877 | 0.9756 | 0.9707 |
| GEMBA-MQM | | 2026.04 | 0.982 | 0.9801 | 0.9797 |
| FairQE | Backbone=COMETKiwi 22 | 2026.04 | 0.9801 | 0.9693 | 0.9727 |
| COMETKiwi 22 | Model Version=2022 | 2026.04 | 0.9737 | 0.9689 | 0.9694 |
| COMETKiwi 23 XL | Model Version=2023, Mo... | 2026.04 | 0.9643 | 0.9513 | 0.9436 |