
CRIT

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Question Answering | CRIT Scientific Paper 1.0 (test) | Exact Match (EM): 15.9 | 11 |
| Multimodal Question Answering | CRIT (Video Frame) 1.0 (test) | Exact Match (EM): 38.8 | 11 |
| Multimodal Question Answering | CRIT Natural Image 1.0 (test) | Exact Match (EM): 58.6 | 11 |
| Visual Reasoning | CRIT Scientific Paper | Exact Match (EM): 15.9 | 6 |
| Visual Reasoning | CRIT Video Frame | Exact Match (EM): 38.8 | 6 |
| Visual Reasoning | CRIT Natural Image | Exact Match (EM): 58.6 | 6 |