
Human Consistency Evaluation on RichHF-18K

33.9 Kendall's Tau

Gemini-2.5-Pro

Updated 2mo ago

Evaluation Results

Method	Links	Kendall's Tau
Gemini-2.5-Pro 2025.06		33.9
Qwen3-VL 2025.06		33.7
UnifiedReward_Q 2025.06		33.6
UnifiedReward_L 2025.06		33.4
Minos 2025.06		31.5
LLaVA-Critic 2025.06		28.9
GPT-4o 2025.06		27.1
LLaVA-OV 2025.06		24.3
LLaVA-Critic 2025.06		16.8
Prometheus-V 2025.06		6.55
LLaVA-OV 2025.06		3.15
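
For readers unfamiliar with the metric, the following is a minimal sketch of how a Kendall's Tau agreement score between human ratings and model scores can be computed. It is not the benchmark's own evaluation code. The tau-b variant (which corrects for ties), the function name `kendall_tau_b`, and the toy ratings are all assumptions made for illustration. The leaderboard values appear to be tau multiplied by 100, but the page does not state this.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equal-length sequences of numbers.

    Hypothetical helper for illustration; the tau-b variant is an assumption.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    concordant = discordant = tied_x = tied_y = 0
    # Compare every pair of items and classify how the two rankings relate.
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: ignored by tau-b
        elif dx == 0:
            tied_x += 1  # tied only in xs
        elif dy == 0:
            tied_y += 1  # tied only in ys
        elif dx * dy > 0:
            concordant += 1  # both rankings order the pair the same way
        else:
            discordant += 1  # the rankings disagree on this pair
    untied = concordant + discordant
    denom = sqrt((untied + tied_x) * (untied + tied_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Toy data, not taken from RichHF-18K.
human_ratings = [1, 2, 3, 4, 5]
model_scores = [1, 3, 2, 4, 5]
tau = kendall_tau_b(human_ratings, model_scores)
print(f"tau = {tau:.3f}, scaled = {100 * tau:.1f}")
```

In this toy case one of the ten pairs is ordered differently by the model, so the score falls below the maximum of 1. A value near 0 would mean the model's ordering is essentially unrelated to the human ordering.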