Quality Estimation on EN-DE Gender-explicit
[Chart: Accuracy by date (Apr 23, 2026). Top entry: FairQE (ours, w/ COMETKiwi 22), 99.7 Accuracy.]
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| FairQE (ours, w/ COMETKiwi 22) | Backbone=COMETKiwi 22 | 2026.04 | 99.7 |
| COMETKiwi 22 | Backbone=COMETKiwi, Mo... | 2026.04 | 99.2 |
| COMETKiwi 23 XL | Backbone=COMETKiwi, Mo... | 2026.04 | 98.7 |
| MetricX 24 XL | Backbone=MetricX, Mode... | 2026.04 | 98.5 |
| FairQE (ours, w/ MetricX 24 L) | Backbone=MetricX 24 L | 2026.04 | 98.2 |
| MetricX 24 L | Backbone=MetricX, Mode... | 2026.04 | 97.6 |
| GEMBA-MQM | Backbone=GEMBA-MQM | 2026.04 | 94 |