Visual Equivalence Reward Modeling on Visual-ERM-Bench Table

56.4 F1 Score (h)

Visual-ERM

Mar 13, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.03   56.4   57.6   74.8
2026.03   48.1   50.1   45.6
2026.03   46.4   48     49.9
2026.03   39.3   40.6   54.6
2026.03   35.7   37.4   56.2
2026.03   32.9   35.7   49.5
2026.03   9.9    10.9   31.7
2026.03   77.821.4
2026.03   2.3    3.1    12.6