Factual mistake detection on DocENT (PoSh)
[Chart: Soft-MSD, 64.8 Accuracy, May 7, 2026. Selectable metrics: Accuracy, Spearman Rho (ρ), Kendall Tau (τ).]
Evaluation Results
Method      Date      Accuracy   Spearman Rho (ρ)   Kendall Tau (τ)
Soft-MSD    2026.05   64.8       0.259              0.195
Soft-MSD    2026.05   64.4       0.219              0.165
CLIPScore   2026.05   53.5       0.181              0.136
CLIPScore   2026.05   45.3       0.145              0.108
FLEUR       2026.05   41.2       0.04               0.031
FLEUR       2026.05   35.2       0.053              0.04