
MMEvalPro

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multimodal Reasoning | MMEvalPro | Accuracy: 82.8 | 15
Hallucination Detection | MMEvalPro perception | F1 (Faithful): 98.6 | 5