Vision-Language Reward Model Evaluation on VLRewardBench
Top result: LWE, 74.5 Accuracy (Dec 7, 2025)
Metrics: Accuracy, Consistency, Pairwise Accuracy
Updated 4d ago
Evaluation Results
| Method | Links | Date | Accuracy | Consistency | Pairwise Accuracy |
|---|---|---|---|---|---|
| LWE | Relative Inference Cos... | 2025.12 | 74.5 | 80.5 | 64.6 |
| TextGrad* | Relative Inference Cos... | 2025.12 | 73 | 74.9 | 61.5 |
| Dynamic Cheatsheet | Relative Inference Cos... | 2025.12 | 69.8 | 86.8 | 62.9 |
| Selective LWE | Relative Inference Cos... | 2025.12 | 67.6 | 94 | 64.8 |
| Sample-Specific Prompt | Relative Inference Cos... | 2025.12 | 66.1 | 72.7 | 52.9 |
| CoT | Relative Inference Cos... | 2025.12 | 65.1 | 80.8 | 55.3 |
| Vanilla | Relative Inference Cos... | 2025.12 | 62.9 | 80.1 | 52.9 |
| Majority Voting | Relative Inference Cos... | 2025.12 | 62.7 | 81 | 53.7 |