Visual Equivalence Reward Modeling on Visual-ERM-Bench AVG

42.1 F1h Score

Visual-ERM

Mar 13, 2026
Updated 1mo ago

Evaluation Results

Method  Links

2026.03  42.1  44.7  58.4
2026.03  40.6  43.4  53.4
2026.03  37.8  40.9  59.1
2026.03  32.7  35    58.9
2026.03  29.5  32.4  56.2
2026.03  25    29.5  56.5
2026.03  6.7   9.6   32.5
2026.03  5.3   6.5   17.5
2026.03  2.8   5.1   15.2