
Long-context document understanding on MMLongBench-Doc

55.8 Accuracy

Synthetic Reasoning

[Chart: scores over time, Oct 8, 2024 to Apr 15, 2026]
Updated 3d ago

Evaluation Results

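| Date    | Accuracy | Second score |
|---------|----------|--------------|
| 2026.03 | 55.8     | -            |
| 2026.03 | 54.9     | -            |
| 2026.03 | 54.8     | -            |
| 2026.03 | 53.6     | -            |
| 2026.03 | 52.2     | -            |
| 2026.03 | 51.8     | -            |
| 2026.03 | 47.5     | -            |
| 2026.03 | 46.6     | -            |
| 2026.04 | 45.6     | -            |
| 2026.03 | 45       | -            |
| 2026.03 | 43       | -            |
| 2024.10 | 42.9     | 44.9         |
| 2026.04 | 42.8     | -            |
| 2026.04 | 42.8     | -            |
| 2026.04 | 42.1     | -            |
| 2026.04 | 41       | -            |
| 2026.04 | 40.1     | -            |
| 2026.03 | 39.9     | -            |
| 2026.04 | 39.8     | -            |
| 2026.04 | 38.6     | -            |
| 2026.04 | 36.1     | -            |
| 2026.04 | 33.9     | -            |
| 2026.04 | 33.8     | -            |
| 2024.10 | 33.3     | 35.7         |
| 2026.04 | 33       | -            |
| 2026.04 | 32.4     | -            |
| 2024.10 | 29       | 28.6         |
| 2026.04 | 28.8     | -            |
| 2026.04 | 28.8     | -            |
| 2026.04 | 28.8     | -            |
| 2026.04 | 28.6     | -            |
| 2024.10 | 28.3     | 24.6         |
| 2024.10 | 28.2     | 20.6         |
| 2026.04 | 28.2     | -            |
| 2026.04 | 28.2     | -            |
| 2026.04 | 28       | -            |
| 2024.10 | 27       | 21.3         |
| 2026.04 | 26.6     | -            |
| 2026.04 | 25.4     | -            |
| 2026.04 | 24.1     | -            |
| 2026.04 | 24.1     | -            |
| 2026.04 | 23       | -            |
| 2026.04 | 23       | -            |
| 2024.10 | 21.3     | 22.7         |
| 2026.04 | 21       | -            |
| 2026.04 | 21       | -            |
| 2026.04 | 18.8     | -            |
| 2026.04 | 18.8     | -            |
| 2026.04 | 18.4     | -            |
| 2026.04 | 18.4     | -            |
| 2024.10 | 18.2     | 17.9         |
| 2024.10 | 14.6     | 13           |
| 2024.10 | 13.8     | 11.3         |
| 2026.04 | 13.4     | -            |
| 2026.04 | 13.4     | -            |
| 2024.10 | 11.5     | 11.6         |
| 2024.10 | 7        | 6.8          |
| 2024.10 | 6.4      | 6            |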