
Multimodal Reward Modeling on PhyCritic-Bench

78.2 Overall Score

Gemini-2.5-Pro

Feb 11, 2026
Updated 4d ago

Evaluation Results

Method (date)  Overall                                   Links
2026.02        78.2     87.9   82.8   86.7   68.8   76.6   -
2026.02        68       78.8   65.5   86.7   65.6   57.4   -
2026.02        67.1     63.6   70.7   73.3   81.2   63.8   -
2026.02        64.7     57.6   70.2   46.7   75     72.3   -
2026.02        56       75.8   50     60     50     48.9   -
2026.02        54.7     51.5   65.5   43.3   46.9   55.3   -
2026.02        51.6     51.5   51.7   46.7   50     57.4   -
2026.02        51.1     36.4   53.4   43.3   68.8   51.1   -