Region-level faithfulness analysis on DeFacto 1.5K (test)
[Chart: Mislocalized & Wrong Rate by date. Plotted point: DeFacto, 21.5 (Sep 25, 2025). Selectable metrics: Mislocalized & Wrong Rate, Spurious Correct Rate, Faithful but Wrong Rate, Masked Evidence Abstention Rate, Image Replacement Abstention Rate.]
Updated 13d ago
Evaluation Results
| Method | Date | Mislocalized & Wrong Rate | Spurious Correct Rate | Faithful but Wrong Rate | Masked Evidence Abstention Rate | Image Replacement Abstention Rate |
|---|---|---|---|---|---|---|
| DeFacto | 2025.09 | 21.5 | 5.6 | 1.3 | 64.1 | 61.4 |
| DeepEyes | 2025.09 | 23.5 | 11 | 1.8 | 33.1 | 40.3 |
| GRIT | 2025.09 | 43.5 | 11.6 | 30.1 | 46.1 | 46.3 |