
R-Bench

Benchmarks

Task Name                            Dataset Name    SOTA Result          Trend
Multimodal Relation Reasoning        R-Bench         Accuracy 85.05       20
Real-world Understanding             R-Bench         Distance Error 55.5  19
Visual Understanding                 R-Bench (test)  MCQ (low) 65.29      8
Relational Hallucination Evaluation  R-Bench         F1 Score 79.1        5