
MMLongBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Long-context document understanding | MMLongBench-Doc | Accuracy 55.8 | 58 |
| Document Visual Question Answering | MMLongbench doc | Accuracy 45.6 | 48 |
| Visual Document Retrieval | MMLongBench | Doc Retrieval Rate 53.82 | 46 |
| Multimodal Document Question Answering | MMLongBench-Doc | Overall Accuracy 65.8 | 44 |
| Document Question Answering | MMLongBench-Doc | Accuracy (all) 69.6 | 23 |
| Long-document Visual Question Answering | MMLongBench Overall | Average Score 90.77 | 22 |
| Long-document Visual Question Answering | MMLongBench 128K context | MMLB-D 83.33 | 22 |
| Long-document Visual Question Answering | MMLongBench 64K context | MMLB-D 93.1 | 22 |
| Multimodal Document Question Answering | MMLongBench | Accuracy 43.2 | 19 |
| Retrieval | MMLongBench | Recall 75.86 | 18 |
| Long-context Multi-modal Understanding | MMLongBench | Text Accuracy 27.49 | 17 |
| Document Understanding, OCR & Charts | MMLongBench Doc | Score 57.5 | 14 |
| Reasoning over rich modalities | MMLongBench Doc | Accuracy 42.3 | 12 |
| Multimodal Document Question Answering | MMLongBench (test) | Chart Acc. 34.7 | 12 |
| Long-context Visual Question Answering | MMLongBench 32K | Accuracy 82.4 | 11 |
| Long-context Visual Question Answering | MMLongBench 128K | Accuracy 78.6 | 11 |
| Document Question Answering | MMLongBench | Exact Match 43.8 | 11 |
| Retrieval | MMLongBench Finreport | MRR@10 49.62 | 6 |
| Retrieval | MMLongBench Doc | MRR@10 47.64 | 6 |
| Dataset Description Extraction | MMLongBench-Doc | Accuracy 94.9 | 5 |
| Long-document Visual Question Answering | MMLongBench 512K context | MMLongBench-D Score 31.91 | 4 |
| Long-document Visual Question Answering | MMLongBench 256K context | MMLB-D Score 31.63 | 4 |