
Visual-ERM-Bench

Benchmarks

Task Name                           Dataset Name            SOTA Result  Trend
Visual Equivalence Reward Modeling  Visual-ERM-Bench AVG    F1h 42.1     9
Visual Equivalence Reward Modeling  Visual-ERM-Bench SVG    F1h 33.3     9
Visual Equivalence Reward Modeling  Visual-ERM-Bench Table  F1h 56.4     9
Visual Equivalence Reward Modeling  Visual-ERM-Bench Chart  F1h 39.9     9