Human Consistency Evaluation on RichHF-18K
[Chart: Kendall's Tau. Top result: Gemini-2.5-Pro, 33.9, Jun 3, 2025]
Evaluation Results

Method           Model Category  Date     Kendall's Tau
Gemini-2.5-Pro   Closed-...      2025.06  33.9
Qwen3-VL         Open-so...      2025.06  33.7
UnifiedReward_Q  Open-so...      2025.06  33.6
UnifiedReward_L  Open-so...      2025.06  33.4
Minos            Open-so...      2025.06  31.5
LLaVA-Critic     Open-so...      2025.06  28.9
GPT-4o           Closed-...      2025.06  27.1
LLaVA-OV         Open-so...      2025.06  24.3
LLaVA-Critic     Open-so...      2025.06  16.8
Prometheus-V     Open-so...      2025.06  6.55
LLaVA-OV         Open-so...      2025.06  3.15
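
For readers unfamiliar with the metric, the sketch below shows one way a Kendall's Tau score between human ratings and model ratings could be computed. It is an illustration only: the page does not say which tau variant the benchmark uses, so the tau-b form (which handles tied ratings) is an assumption, as is the reading that the reported values are tau multiplied by 100. The function name `kendall_tau_b` and the sample ratings are hypothetical and are not taken from RichHF-18K.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length rating lists.

    Simple O(n^2) pairwise version, written for clarity rather than speed.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both lists: excluded from every count
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1  # both lists order this pair the same way
        else:
            discordant += 1  # the lists disagree on this pair

    # Pairs untied in x times pairs untied in y, under the square root.
    untied_x = concordant + discordant + ties_y_only
    untied_y = concordant + discordant + ties_x_only
    denom = sqrt(untied_x * untied_y)
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical ratings for five images (not real benchmark data).
human_scores = [1, 2, 3, 4, 5]
model_scores = [1, 3, 2, 4, 5]

tau = kendall_tau_b(human_scores, model_scores)
# Assumption: the leaderboard reports tau scaled by 100.
print(f"tau = {tau:.3f}, scaled = {tau * 100:.1f}")
```

In practice a library routine such as `scipy.stats.kendalltau` would be the usual choice; the hand-written version is shown only to make the pair counting explicit.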