Human Evaluation on CulturalVQA OOD (test)
[Chart: Faithfulness, Conciseness, Granularity, and Average Score by method. Highlighted point: MMBoundary, Faithfulness 7.66, May 29, 2025.]
Evaluation Results
Method        Date      Faithfulness  Conciseness  Granularity  Average Score
MMBoundary    2025.05   7.66          7.17         8.26         7.69
SaySelf       2025.05   7.12          6.81         6.58         6.83
RCE           2025.05   6.84          6.19         6.82         6.62
DRL           2025.05   6.55          5.63         6.07         6.08
Conf-CSR      2025.05   6.38          5.45         5.86         5.89
Multisample   2025.05   3.91          5.63         4.72         4.75
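
The page does not say how Average Score is derived. As a rough sanity check, the sketch below assumes it is the unweighted mean of Faithfulness, Conciseness, and Granularity, and compares that against the reported values. The names SCORES, mean_of_criteria, and max_abs_gap are illustrative only and do not come from the leaderboard.

```python
# Scores copied from the results table:
# (faithfulness, conciseness, granularity, reported average score)
SCORES = {
    "MMBoundary": (7.66, 7.17, 8.26, 7.69),
    "SaySelf": (7.12, 6.81, 6.58, 6.83),
    "RCE": (6.84, 6.19, 6.82, 6.62),
    "DRL": (6.55, 5.63, 6.07, 6.08),
    "Conf-CSR": (6.38, 5.45, 5.86, 5.89),
    "Multisample": (3.91, 5.63, 4.72, 4.75),
}


def mean_of_criteria(faithfulness, conciseness, granularity):
    """Unweighted mean of the three criteria (an assumption, not a documented formula)."""
    return (faithfulness + conciseness + granularity) / 3


def max_abs_gap(scores):
    """Largest absolute gap between the recomputed mean and the reported average."""
    return max(
        abs(mean_of_criteria(f, c, g) - reported)
        for f, c, g, reported in scores.values()
    )


if __name__ == "__main__":
    for name, (f, c, g, reported) in SCORES.items():
        recomputed = mean_of_criteria(f, c, g)
        print(f"{name:<12} recomputed={recomputed:.4f} reported={reported:.2f}")
    print(f"largest gap: {max_abs_gap(SCORES):.4f}")
```

Under this assumption, every recomputed mean lands within 0.01 of the reported average. The small residual gaps could come from averaging unrounded per-example scores, though the page does not confirm this.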