
Multimodal Open-ended Utility Evaluation on MM-Vet v1 (test)

Utility Score: 68.1

No Steering

[Chart: Utility Score over time. Y-axis ticks: 60.612, 62.556, 64.5, 66.444. X-axis: Apr 10, 2026]
Updated 6d ago

Evaluation Results

Date      Utility Score
2026.04   68.1
2026.04   65
2026.04   64.2
2026.04   62.8
2026.04   60.9